Accepted Article invertebrates freshwater of Metabarcoding title: Running [email protected] Email Spain 38206, Laguna dela Cristóbal San (IPNA-CSIC), Agrobiología Ins Islas, y Evolución en Ecología de Grupo Andujar, Carmelo Dr to: Correspondence NR47TJ, UK Academy of Sciences, Kunming, Yunnan 650223, China This by is protectedarticle reserved. Allrights copyright. doi: 10.1111/mec.14410 lead to differences between this version and the throughbeen the copyediting, pagination typesetting, which process, may and proofreading hasThis article been accepted publicationfor andundergone peerfull but review not has 6 5 4 3 2 1 Article :Original type Article PROF. ALFRIED VOGLER ID (Orcid :0000-0002-2462-3718) DR. ANDUJAR CARMELO (Orcid :0000-0001-9759-7402) ID San Cristóbal Laguna la de 38206, Spain Carmelo Andújar Douglas W. Yu W. Douglas

School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, Norfolk Norfolk Norwich, Park, Research Norwich Anglia, East of University Sciences, Biological of School Chinese Zoology, of Institute Kunming Evolution, and Resources Genetic of Key Laboratory State 9TY, TW20 UK Surrey, Egham, Lane, NatureMetricsBakeham CABI Ltd, Site, Ins Islas, en Evolución y Ecología de Grupo UK 7PY, SL5 Ascot, Campus, Park Silwood London, College Imperial Sciences, Life of Department Department of Life Sciences, Natural Histor Natural Sciences, of Life Department Metabarcoding of freshwater invertebrates to detect the effects of a pesticide apesticide spill of the detect effects to invertebrates of freshwater Metabarcoding 5,6 , Alfried P. Vogler P. , Alfried 1,2,3 , Paula Arribas 1,2 1,2,3

, Clare Gray y Museum, Cromwell Road, London, SW7 5BD, UK UK 5BD, SW7 London, Road, Cromwell Museum, y tuto de Productos Naturales y Agrobiología (IPNA-CSIC), (IPNA-CSIC), Agrobiología y Naturales Productos de tuto

Version of Record. Please Version as this cite of Record. article 2 , Katherine Bruce tuto de Productos Naturales y Naturales Productos de tuto 4 , Guy Woodward 2

, Accepted Article yet to be embraced (Pauls (Pauls be embraced to yet has ecology mainstream overtaking is that revolution molecular the and resolution, taxonomic of level coarse a to microscopy, via best at or eye, by macroinvertebrates identifying on relies still schemes to adapt toa (Friberg rapidly world changing failing for criticised roundly been has and years, intervening the in adjustments and tweaks, advances, some statistical techniques with albeit a over on ago, thatwerelargelycentury reliant developed ofstill However, applied is ecology 2000). this field States European(United Commission 1972; Water the EuropeanAct and the US of Clean Union Directive Framework (WFD) as theWater andto freshwaters of status ecological to assess the therefore is required Biomonitoring assemblages. localalter species that disturbances anthropogenic drivers and, increasingly, environmental isaffecteda bynatural of host biota The freshwater INTRODUCTION ecology. molecular and monitoring invertebrate biodiversity freshwater for newavenues opening molecular methods, non- thetime traditional of of cost and ata can taxonomicbe extracted resolved fraction data highly Thus, separately. samples macro-faunal and meio- and for group, taxonomic each major for gene marker, each with detectable samples, vs. "control" "impacted" between strong differences revealed Community analyses minute (meiofauna). of specimens extraction for protocol flotation OTUs); ofLinnaean splitting intomultiple binomials molecular OTUs; and ofa the use filtration- 55 named into ‘’ a single morphotaxon splitting (e.g. greater taxonomic resolution The due increase was to: monitoring. morphological recognised with species times of number the five OTUs, 207of these to matches produced reference metabarcoding Kennet OTUs. The River into clustered and BOLDdatabase the in Arthropoda for entries barcode available all from created by missed Nematoda, and Platyhelminthes The recovered a SSUmarker range parameters. broad of under were profiles obtained Similar taxonomic pair merging andclustering). filtering, (read denoising, parameters tested 72 under spill in the England). River Taxonomic Operational (Southern (OTU) recoveryKennet Unit was This by is protectedarticle reserved. Allrights copyright. the of fragments two with assemblages invertebrate of metabarcoding We used validation. further requires but detection, species for a provides approach new (metabarcoding) assemblages sequencing entire of Next-generation to obtain. difficult is resolution where species-level taxonomicidentification coarse uses (macro)invertebrates of surveys for methodology Current conservation. and management guides and ecosystems freshwater of assessment the environmental underpins Biomonitoring Abstract cox1

"barcode" and partial nuclear ribosomal ( nuclear ribosomal partial and "barcode"

et al. cox1 , while Rotifera were only amplified with with amplified Rotifera wereonly , while 2014; Bohan Bohan 2014; et al. et al. 2017). Because of the need for rapid and cost- 2017). theneed rapid of Because for 2011). The vast majority of biomonitoring biomonitoring of majority vast The 2011). enforce their protection through legislation, such legislation, through protection their enforce SSU ) genes, to assess the effects of a pesticide pesticide a the effects of assess genes, to ) cox1 . A referenceset was Accepted Article freshwater communities in a large lowland river as a test case. On July 1, 2013, a the of pulse On a 1, case. 2013, July as test river ain large lowland communities freshwater oninvertebrate spill an insecticide theconsequences tostudy of Here, metabarcoding we applied Hajibabaei Hajibabaei (Creer techniques microscopy and sorting using traditional identify see or to small too either are of organisms that largecapturing the proportion and also for ecosystems, entire of biomonitoring for desired species-resolution the elusive provide thus Metabarcoding can 2017). al. et al. et (Yang litter leaf and soil as such samples, environmental from obtained specimens minute of numbers large analysis of simultaneous the permits Metabarcoding 2012;Brandon-Mong & Harris (Taylor Sanger-sequencing with specimen single a for much thanrequired greater is not now DNAbarcoding for assemblage an entire required of technology, theeffort sequencing (HTS) community tonew and level, high-throughput thanks the of (a fragment DNA diagnostic throughbarcodes short species identification (Taberlet complex assemblages of identification taxonomic faster andmore provide highly-resolved specimencould mixtures, these i.e. metabarcoding, for Recent protocols groupings (Schmidt-Kloiber & Hering 2015; Leese & 2015; Hering groupings (Schmidt-Kloiber coarse such other or families, or genera of level the at is assessment the if lost are distinctions (Bohan in trophic networks and position membershipfeeding in groups and alsomaythe reveal conditions, environmental to responses which maydifferential reflect traits, can physiological and behavioural link the establish ecological, known to level identification stressors (Stubauer & Moog 2000; Chessman Chessman Moog 2000; & (Stubauer stressors environmental impact important of missmay the full coarse inventories so and taxonomically species, closely related even between candiffer quality water in changes to responses population However, for manyMarshall quality (Camargo decades 1993; habitat and water of assessment the for used successfully been has and in process, the jettisoned are inevitably that information environmental-status of huge amounts the water body, despite a of state ecological the of assessment a broad provide nonetheless can shortcuts labour-saving These 2008). Jones 2004; &Nijboer (Schmidt-Kloiber entity single a treated as or ignored either typically are midges) chironomid (e.g. resides groups whichmostof the aquatic biodiversity in difficult taxonomically This by is protectedarticle reserved. Allrights copyright. (Wright systems and AUSIVAS inRIVPACS as used family, by e.g. taxonomically, areinstead lumped many isroutine practice that it effective approaches, 2016), and freshwater sediments and the water column (Elbrecht & Leese 2015; BistaLeese & (Elbrecht 2015; column and the water sediments and freshwater 2016), (Esling sea thedeep 2016), et al. et et al. 2000), or, less frequently, into trait-based groupings, such as “riverflies”. The “riverflies”. as such groupings, trait-based into frequently, less or, 2000), 2016; Bohan Bohan 2016; et al. et al. et et al. 2017). 2017). 2012). This methodology applies Hebert et al.’s (2003) idea (2003) of etal.’s This methodology Hebert applies 2012). 2015; Guardiola 2015;Guardiola et al. quencing of PCR amplicons from environmental environmental from PCRamplicons of quencing 2002; Gutiérrez-Cánovas 2002;Gutiérrez-Cánovas et al. et et al. et taxa are to taxa identified individualnot species but et al. 2016). 2016). 2006; Sánchez-Montoya 2006; 2015; Leray & Knowlton 2015;Leray Lanzén2015; & Knowlton et al. 2014; Arribas Arribas 2014; et al. et al. 2017). These 2017). et al. 2010; Ji 2010; cox1 et al. et 2008). Species- 2008). et al. et al. gene) to the the to gene) 2015). 2015). 2016;Zinger 2007). 2007). et al. et et al. 2013; 2013;

Accepted Article This by is protectedarticle reserved. Allrights copyright. by the canmatching ways, intwo either be used from fully identified specimens available at public databases (NCBI or BOLD). These reference sets these many for of and canbe uses data, critical al. et clustered into of the concern PCR A major isthat amplification delimitation. and detection species of affect thesuccess biases could which primer choice, due to potential and breadth amplification a of provides use fragments barcode two communities. test of The parallel for id protocols diagnostic trial new environmental and to metabarcoding, of explore to theeffectiveness length km site river’s the overof spill several Thompson taxa many of (see invertebrate extinctions and localised crashes population England, to southern led theThames of River in tributary largest theRiver Kennet, the in chlorpyrifos organophosphate covering the entire entire the covering gene fragments two We metabarcoded bp). (658 have the of barcode amplicon region full-length suited is for sequencing not the platform on usedIllumina widely the readto the length, constraints The (turnover). elsewhere from a of dispersed species new or of set community (nestedness) that a itmightbe of of subset composed abundance, increased the despite or, impact community, thepre- from community unchanged be largely chironomid ofmight the post-impact composition into the mechanisms of ecological resilience and recoveryafter the spill. Specifically, the species gain insight further possible to species-level not is it resolution, with than taxonomic rather levels, at was higher measured only composition community because However, spill. the after abundance The ability asgreatly midges) agroup rapidly. Chironomidaerecolonize to (non-biting in increased were far less affected,and at a later sampling time returned to post-spill levels, possibly due to their adultlife stage, an aerial taxa,with especially those whereas other the spill, from number downstream theamphipod detritivore, asthedominant community such thelocal components of some For example, perturbation. communities toa environmental profound true species diversity (Tang in variation sequence lower However, taxa. of range phylogenetic larger across a bias by primer affected is less and thus segments variable bracketing more knowledg our to ecosystems never testedinfreshwater nuclear( 18SrRNA al. (Deagle assemblages species thedetectable bias in produce may region generally lowandthus this (Arribas 2014; Lobo 2014; cox1 cox1 2005), that can be directly used for downstream ecological analyses. Species identification is identification Species analyses. ecological downstream that can 2005), for be used directly et al. et gene is the obvious choice of a marker for metabarcoding of aquatic invertebrates, but due but aquaticinvertebrates, metabarcoding of for a marker of gene choice the obvious is et al. et 2016),which here werethe appliedfor first freshwater time to invertebrate de novo et al. et 2016 for details). We used samples collected upstream and downstream from the the from downstream and upstream collected samples Weused details). for 2016 cox1 cox1 SSU 2015; Creer 2015; generated species proxies, i.e. Operational Taxonomic Units (OTUs) (Blaxter (Blaxter (OTUs) Taxonomic Units Operational i.e. proxies, species generated ) gene, frequently used for sequencing marine meiofaunal communities but but communities meiofaunal marine sequencing gene, used ) for frequently barcode region using two primer pairs shown to have broad target ranges target have ranges broad to shown pairs twoprimer using region barcode et al. 2012). The resulting metabarcode sequences are typically first first typically are sequences metabarcode The 2012). resulting et al. et 2016). We therefore also conducted metabarcoding with the the with metabarcoding conducted also Wetherefore 2016). obtained against existing databases of DNA of sequences databases existing against obtained de novo de entifying resp differential e. This gene contains highly conserved regions regions conserved genecontains highly e. This generated OTUs against the external the against generated OTUs Gammarus pulex Gammarus cox1 cox1 SSU region in several is aquatic phyla in region generally underestimates the the underestimates generally onses of invertebrate of invertebrate onses , were greatly reduced in in reduced greatly , were et Accepted Article in the Surber sample by debris and no effort was made to remove small organisms at this stage. The at this small organisms removewas to andnoeffort made samplein the Surber debris by were size this retained below organisms many but mesh used aSurber theoriginal Note µm 335 that microorganisms and silt with copious amounts of water (Fonseca water (Fonseca of amounts with copious silt and microorganisms flushing (size out 1mm) mesh< while sieve fraction the wire meiofaunal capture to 45 µm passed a through sieve was retained by then not material this The smaller sieve macrofauna. retain to a 1mm the remainder through mesh wire and by filtered where hand subsequently wedebris removed the to laboratory, and transferred ethanol in absolute collection after immediately was preserved sample per One and time site Surber after spill. 2.5 the months community the recovery of short-term the and sites, upstream unaffected the to relative point spill the of downstream effect Invertebrates were collected using a Surber sampler (0.0625 m (0.0625 sampler a using Surber collected were Invertebrates samples used in Thompson samples usedThompson in as same time was the The Fig. latter S1). (Suppl. September 2013) 2(17th thespill; time days after 1 times:time at 11 two sampled July and 2013), (12th were surroundings, and riparian channel forms This by is protectedarticle reserved. Allrights copyright. mortality along stretch a 15-km downstream macroinvertebrate by widespread was affected that river The Kennetisa chalk River lowland Study site andsampling protocol METHODS MATERIALS AND to the use schemes may shift metabarcoding. of biomonitoring future recent technology, Illumina with achievable Given sequence data of the highquality whose species representation is increasingly complete at least for this ecosystem in Western Europe. databases, sequence taxonomic publicly available growing rapidly the of use made we community local the of For identification identification. species and numbers species of estimates accurate ribosomal RNA ( of the amplification for existingprimers of universal evaluation samples, the sediment bulk from macro-fauna and meio- of extraction for protocols of development the included al. et The spill, approachescharacterised River Kennetpesticide (Thompsonconventional with previously absence Shokralla (e.g. or presence testspecies to occasions has onvarious used 2011), been which Westcott & (Schloss approach OTU a prior dependent’ clustering ‘ in without set the referencedirectly to reads by sequence or approach, matching raw the ina independent’ sequences, reference ‘taxonomy (including the four sites sampled in Thompson Thompson in sampled sites the four (including m a long, ca. each stretch along 50 river impacted km 6 reaches, and downstream three control 2016), was used to trial the metabarcoding methodology for freshwater invertebrates. This invertebrates. freshwater for was trial methodology used metabarcoding 2016), to the SSU ) genes, and the calibration of bioinformatics tools and parameter settings for for and settings parameter tools bioinformatics of genes, ) calibration and the et al. et al. 2014; Arribas 2014;Arribas (2016). The sampling regime permitted to the explore to immediate permitted regime The sampling (2016).

et al. from an insecticide site spill (Thompson et al. 2016). 2016). Sites were ca. 1 km apart, with similar similar with apart, km 1 ca. were Sites 2016). 2 , 335 et al. μ m mesh) atthree upstream 2010; Arribas 2010; cox1 andnuclear 18S et al. et et al. 2016). 2016). 2016). 2016). Accepted Article CCIGAYATRGCITTYCCICG) (Shokralla (Shokralla CCIGAYATRGCITTYCCICG) ≈ combination for each sample. each for sample. combination a with and giventag R2) (R1 unique dual each, produce cell reads of on 1.5% flow the to paired end reads) bppaired (2x300 sequencer onan MiSeq sequenced were Illumina libraries metabarcoding The and the costs. resulting 24 ina combined reducing library single samplethus indexes tagging, for but thesameusing individually processed were gene three the fragments For eachsample, USA). CA, San Diego, Illumina, Kit; XT Index (Nextera adapters sequencing P7 Illumina and P5 the and to adddual-index barcodes amplification PCR secondary a limited-cycle for wereas used template amplicons primary these which after beads, XPmagnetic using were Ampure cleaned pools Amplicon Data S1. in was included conditions and PCR regarding information reagents All were pooled. amplicons PCR the and performed, were primers of each pair for reactions independent three sample, Library Protocol at http://support.illumina.com) Preparation (see Arribas (16S samples gene sequencing rRNA microbial 16S in analogythe protocol for Illumina tothe PCR, in nested subsequent sequence for an toinclude adapter overhang Primers were modified al. (Blaxter bp 400 to 300 of varying length of afragment producing GCCTGCTGCCTTCCTTGGA), (5'- SSU-R22 and (5'-GCTTGTCTCAAAGATTAAGCC) SSU-FO4 primers (Yu TANACYTCNGGRTGNCCRAARAAYCA) (Shokralla GGIGGRTAIACIGTTCAICC (Yu TCNACNAAYCAYAARRAYATYGG 5' (Fol-degen-for: awide variety of validated primersin using already and was amplified This by is protectedarticle reserved. Allrights copyright. LUDOX 40 using flotation by cleaned was step second this from filtrate DNA extraction andIllumina DNA sequencing extraction (Fig. 1). and sequencing DNAextraction 24 of for samples number used a final for and macro- meiofauna, the separately for sample Each wasprocessed thesampled meiofauna. represent DNA to extracted for layer The floating tendtosink. was which particles, inorganic from tofloat, tend which organisms, degeneracy. The et a with but 2003), al., higher animalbarcode of of degree amplification standard the (Hebert for mitochondrial a amplified: the fragment of were individually markers Three DNA (Qiagen). Kit Tissue Spin-Column and Blood a DNeasy lysate using sample of µl 200 from in a Falcon and extracted tube, homogenised and DNA was was sample Each dried 420 bp of 420 bpof the 3' the end of 1998). cox1 bc5' barcode region. The two fragments were bracketed by the “Folmer” primers used used primers The“Folmer” werethe region. by barcode bracketed two fragments fragment corresponds to to corresponds fragment cox1 barcode was amplified with with barcodeprimers wasamplified Ill_B_F (5'- et al.

et al. et al. et ≈ SSU 2015). 2015). the Similarly, 350 bp the 5'of the end of 2015) (5'- 2015) and Fol-degen-rev 2012) and Ill_C_R: 5' andIll_C_R: 2012) et al. gene, and two fragments gene, fragments ( and two 2012). The 2012). SSU bc3' TM marker wasamplified using etal. fragment corresponding to to corresponding fragment (Burgess 2001) to separate separate to 2001) (Burgess cox1 bc5' 2016). For each 2016). barcode fragment, fragment, barcode and and bc3' ) within the within ) et Accepted Article associated species name to the the to species name associated the available, where and, number aBIN attach to permitted which sequences, representative our of We the correspondence established Hebert 2013). & groups (Ratnasingham Number) Identification method (Barcode BIN thatproducesso-called the ona graph theory barcode based data of clustering provides already The BOLD sequences. database metabarcoding of the taxonomic identification for analyses subsequent in sequence” “representative the simplicity, For some cases. in sequences hundred toseveral just sequence a single from ranging inthe BOLD database, entries primary OTUs(referred as to This by is protectedarticle reserved. Allrights copyright. were Primers 18/07/2016). accessed (http://hannonlab.cshl.edu/fastx_toolkit/; FASTX-Toolkit the using the eachof three DNA fragments, datasets for independent get to de-multiplexed and subsequently 2013) Institute (Babraham Fastqc checked in and moleculesremoval ofhybrid al. of low-quality (Schirmer data reduce theproportion to applied were steps bioinformatics Various read processing Bioinformatic the tomultiple species attached same name was each general, In level to order only. identified was and which group namedBIN any on BOLD to attached wasnot that amatched single sequence reads metabarcoding where 18cases were there but study, were themetabarcoding matched by 2013) (Edgar v7 Usearch with clustered bp) (658 length available full- All accessed 2017). January on8th 2007; & Hebert (Ratnasingham Data Public Portal A custom reference set of OTUs for Arthropoda was created from sequences obtained from the BOLD database sequence reference a Creating diversity, identification problems in the reference database, or the existence of cryptic species (Table cryptic theexistence of or problemsdatabase, reference the in identification diversity, incomplete taxonomy complexthis of species (Williams BOLD-OTUs cox1 separate theisopod with associated For example, sequences species. ofLinnaeanTable 4). 1, Suppl. The equally affected bysplitting are BINs BOLDthe on database 2015). These steps included the trimming of 3’ ends, merging R1 and R2 reads, and the detection andthedetection and merging ends, of 3’ R2 reads, thetrimming R1 These included 2015). steps ) is a well-established observation in in ) isobservation a case the well-established of BOLD-OTUs wereassigned to cox1 BOLD-OTU sequences for Arthropoda were retrieved from BOLD and subsequently subsequently and BOLD from retrieved were Arthropoda for sequences BOLD-OTUs matching 8 different BINs. High intraspecific variation (>3% divergence in in divergence (>3% variation intraspecific High BINs. 8 different matching corresponded to a unique Linnaean species name, but inseveral cases name,but Linnaeanspecies toa corresponded unique BOLD-OTUs. Baetis rhodani

formed duringfrom PCRmixed templates. Raw reads were quality from hereon), whichwas of eachof variable on numbers based hereon), from

under a 3% similarity threshold. This resulted in thousands of thousands of in resulted This threshold. similarity 3% a under

centroid We obtained species names for most of the OTUs that that OTUs the of most for names species We obtained (Ephemeroptera), which is also reflected in the in isalsoreflected which (Ephemeroptera), BOLD-OTUs sequence of each each of sequence Asellus fastx_barcode_splitter.pl fastx_barcode_splitter.pl et al. BOLD-OTUs Asellus aquaticus Asellus (Sworobowicz (Sworobowicz , indicating high intraspecific genetic highintraspecific , indicating 2006; Bisconti 2006; with the BINs based on the the on theBINs based with BOLD-OTU et al. et were assigned to eight et al. 2015). Three 2015). option of the option 2016). 2016). was used the was used as et Accepted Article corresponded to the targeted invertebrates and to remove bacterial and other sequences. For this For this sequences. and and toremoveother bacterial invertebrates tothe targeted corresponded for the threshold the for thresholds dissimilarity and using 10% 3% algorithms clustering three the of each with clustering to subjected were sequences metabarcode Quality-filtered De novo ( conducted checking was sortbysize bc3’) Usearch (Edgar 2013) pipeline: reads were merged (option (option pipeline:reads merged were 2013) Usearch (Edgar the of steps several following and processed BFC using (Li 2015), reads denoised analyses: were Based on the results from these tests (below), we used the following parameters all for further OTUs of ineachthe 72 pair OTU sets were estimated from the OTU table using R. similarity threshold, using the command a 3% under sets OTU 72 the of each for obtained the OTUs map subsequently to as references used and eachOTU, represent to sequence as consensus arule majority exported 50% assemblies were sets with a minimum similarity threshold of 3% in Geneious v7.1.9 (Kearse Geneious of (Kearse 3% av7.1.9 in threshold setssimilarity minimum with This thatwe required possess OTUs alist all presentof inthe various sets. We assembled the 72OTU the of 72OTU obtained. pair sets each OTUs between and shared exclusive of estimated number the OTU sets for each of the three metabarcode fragments. For each DNA fragment (bc5’, bc3’, 72 of a yielded total methods and 3OTU 24 clustering of settings readThe processing combination (Mahé approach; This by is protectedarticle reserved. Allrights copyright. cluster_otus 2013) Edgar ( approach; heuristic (greedy Usearch algorithms: different three clustered with OTU were sequences without Processed Maxee). Maxee=1; (Maxee=0.5; Usearch parameters in parameters 0.16 (Staton Pairfq and the 2013) using paired were reads R2 and R1 R2). for 250 (CROP CROP:270 MINLEN:250 TRAILING:20 trimmed using procedures for pair merging using either v0.9.6 Pear merging usingeither pair procedures for (ii) two without de-noising); K=33; K=33; K=33;s=20 s=2 (s=0.35 2015) BFC (Liparameters using denoising Four different (i) These included: Fig. analyses S2). (Suppl. each DNAmarker tested for me pair read denoising, for proceduresAlternative , - fastq_maxdiffs OTU generation metabarcoding from generation OTU options). Sequences with only one read ( read one only with Sequences options). or option); CROP (Bayesian approach; Hao Hao approach; (Bayesian CROP option); Usearch v7 (Edgar v7(Edgar 2013) Usearch fastx_trimmer SSU et al. et gene fragment. Each set wasOTU to retain onlyfiltered OTUs, which

30 2014) (Suppl. Fig. S2). Fig.S2). (Suppl. 2014) ) -uchime_denovo , quality filtered (Maxee=1), dereplicated dereplicated (- (Maxee=1), filtered quality and makepairs makepairs

reads were processed in Trimmomatic inTrimmomatic (Bolger processed were reads

as above; and (iii) three alternative quality filtering threeasabove; and(iii) alternative filtering quality –usearch_global option). option). option. option.

rging, quality filtering and clustering method weremethod clustering filteringand quality rging, -minsize 2 -minsize

(Zhang (Zhang et al. et . The proportion of shared and of exclusive The proportion . mergepairs 2011); and Swarm (agglomerative (agglomerative and Swarm 2011); et al. ) were excluded, and a and excluded, were ) cox1 2014) with with 2014) bc5' and bc3' fragments,and 3% derep_fulllength

–fastq_minovlen et al. et –q 26 26 –q et al. 2012). The resulting 2012). and other default default and other de novo ) and sorted (- 2014) 2014) using

150 (130 for 150 (130 - chimera chimera SSU ) we Accepted Article metabarcoding reads only, after filtering, or after the after or filtering, after only, reads metabarcoding Processed reads of of reads Processed against conducted were identifications the Secondly, within Insecta and Crustacea. andorders Arthropoda, within Metazoa, classes phyla within for richness toestimate Megan wasused and 2016) (December intheNCBI ranks database Taxonomy taxonomic eachthe Weaccepted OTU. 5 -evalue 0.001 outfmt OTUs. For the identification of identification OTUs.For the The script python with same 3%threshold. the the NCBI against BLAST to searches subjected sequence was NCBI and OTUsthe against of Identification excluded. were frame reading the disturbing codons stop or deletions insertions, with sequences subsequently Align Geneious option, and Translation the MAFFT aligned using in and were sequences This by is protectedarticle reserved. Allrights copyright. the matching of sameprotocol v5 (LCA)implemented (Huson Megan in algorithm ancestor thelowestcommon with conducted were categories OTUs of high taxonomic to level Assignments the libraries using a R were custom with removed script gene fragment the notrepresenting targeted presumably sequences other all and dissimilarity wereretained, threshold a30% within UPGMA tree the cladeof generatean UPGMA tree based on Tamura-Nei distances. in Only the OTUsthe largest included to and 2013) &Standley (Katoh FFT-NS-2 MAFFT options with Usearch) to according the centroid i.e. sequence, representative align inbatchmode (the each OTU set to used was Geneious purpose, de novo ( reads processed by Illumina matching obtained were xtables) (OTU site tables composition Community sub-datasets. and Crustacea Insecta Metazoa, the Arthropoda, for conducted sites were sampling across 2010) components (Baselga andnestedness turnover the associated diversity beta total and for analyses statistical Ecological Community composition species indicative and analyses representative sequences from each OTU (at threshold).3% for SSU OTUs with Usearch obtained and (3% similarity10% thresholds for ) ; and (b) the ape (Paradis (Paradis bc5' BOLD-OTUs ). of affinity BLAST). compute Megan into taxonomic matchesthe were to fed and and et al. et bc3' 2004), 2004), BOLD-OTUs de novo were matched against against were matched reference dataset (for rncl blastn -outfmt 5 -outfmt -evalue 0.001 blastn OTUs from clustering with Usearch, Swarm andCROP Swarm the Usearch, with clustering OTUs from and in the reference database wasapplied to the stringr stringr uc2otutab.py BOLD-OTUs de novo BOLD-OTUs. (Wickham 2013). (Wickham 2013). Finally, the retained BOLD-OTUs bc5'

was used to generate a list of matched of a generate list wasused to OTU clustering step described earlier. earlier. described step OTUclustering et al. and and reference set set reference nt database (December 2016; 2016; (December database bc3' 2007). 2007). Each representative OTU Analyses were based either on based either were Analyses usingthe ) against (a) the selected sets of gene fragments, only for bc5' -usearch_global and bc3' , and3% blastn - blastn cox1 option

Accepted Article S3), i.e. different read abundances between samples did not affect the number of of number the affect OTUs not recovered. did samples read abundances between different i.e. S3), Fig. (Suppl. thethree DNA fragments asMetazoa for classified OTUs numbers of number and the Spearman’s 11samples, remaining Timeproduceto For 1:IS1-T1)the a failed theanalysis. and removed PCR from was product for 100,344 for - 177,653 41,306 (ranging extracts macro-fauna and meio- the both from sample sequence per bp) reads (2x300 Illumina of thousands of ontens was conducted OTU delimitation Sequencing, and readprocessing OTU clustering RESULTS versus sites. impacted ( species analysis indicator we Dufrene-Legendre Finally, used approach. (taxonomy-dependent) the with were plotted ScalingAnalyses (NMDS). analyses Multidimensional Nonparametric ( and sites impacted betweencontrol distribution differencestheir in significant species show (factor effects function using the Anova was conducted (Permanova) a Finally, Permutational spill. the of and downstream upstream and values richness on based clines to generate option library R the and 2012), & Orme (Baselga using R library the effect) richness pure index; - Simpson (Sorensen andnestedness richness) in variation the effect of without index; species replacement, (Simpson turnover index), (Sorensen time asasingle sample.Distance matrices by pairs of sites weregenerated for total betadiversity site and same the from bothsamples combining after separately or samples meio- and macro-fauna eitherthe for composition, community of analyses conduct to datawere of used Taxonomic subsets and Crustacea sub-datasets. This by is protectedarticle reserved. Allrights copyright. data abundance multivariate (Wang of the analysis for in the communities upstream and downstream ofthe spill. Weadditionally used the envfit the Likewise, an LCAvalue 90. of Crustacea parameter with and(d) Insecta Arthropoda, (c) (a) Metazoa, (b) assignments in obtained Megan (see above) were toextract used sub-datasets asOTUsof identified Whenusing sub-datasets). Crustacea and Insecta Arthropoda, indval anova.manyglm test of correlation. Jaccard distances were used to verify the existence of significant differences differences significant of existence toverify the were used distances Jaccard correlation. testof ) (Dufrêne & Legendre 1997) &using Legendre (Dufrêne Rthe 1997) ) package

bc5’ BOLD-OTUs spill ; and for 34,555 - 107,016 function; test=LR, nBoot=999) using the results from the reference database database reference the from results using the nBoot=999) test=LR, function; adonis )

on community composition ( oncommunity composition , and the significance of differences was assessed using a stress test and the andthe astress test using assessed was differences of , andthesignificance reference sequences were taxonomically assigned to Arthropoda, Insecta Arthropoda, to assigned taxonomically were sequences reference rho correlation tests revealed no correlation between raw read nocorrelation tests revealed correlation vegan SSU ) (Suppl. Table S1). A single sample (Impacted site 1- (Oksanen manyglm manyglm

ordispider et al. et labdsv function) andto function) identify whichindividual et al. 2012) testupstream anddownstream 2012) to de novo de novo (Roberts 2007) to compare control control to compare 2007) (Roberts 2013) was used to perform perform was 2013) used to to connect the samples from from samples the connect to generated OTUs, taxonomic OTUs, generated taxonomic mvabund bc3’ ; 5,487 - ; 5,487 ordisurf package package betapart

Accepted Article Taxonomic based on profiles using themerging step Usearch PEARfor or Finally, settings). parameters among OTUs shared of (~90% onOTU arecovery effect moderate of alsoproduced option inthe absence the OTUs.30%-50% resulted useof The Maxeefiltering the of except for OTUs of shared), obtained OTUs (~90% of effect onthe number moderate and withCROP.Swarm,not but The of implementation between were Usearch 20.3% shared an thethree and additional methods generated with OTUs were and Trichoptera. Inand the Trichoptera. Ephemeroptera by Coleoptera, followed by The 3). Diptera, dominated were (Fig. Crustacea Insecta the within subphylum classes andvarious by Arachnida followed group, was themost abundant Insecta Arthropoda, Within Annelida. and Rotifera by followed Arthropoda, of OTUs by dominated Next, for was observed Thesame trend other the methods. with OTUs obtained respectively, the majorwith to difference attributed This by is protectedarticle reserved. Allrights copyright. SSU representative parameter setting (BFC with asingle under obtained OTUset the assessed Weonly Methods). and Materials (see taxon the for cases the of 92.7% and 91.1% in shared were Swarm) and CROP (Usearch, methods clustering different with The obtained OTUs (Fig. 2). settings across parameter the results similar broadly eachgene) (for shared showed OTUs of and exclusive obtained The proportion bc5' 810 for the 479to from ranged parameter settings various these thenumber OTUs under of Methods), and (see Materials method usingatree-based groups taxonomic or gene fragments expected matching the OTUs excluding After not Fig. S2). (Suppl. methods clustering using different three OTU clustering subjected to reads then merged were The resulting applied. wereparameter settings different 24 under for quality and readfiltering of merging the basic protocol modifications Various Table S1). (Suppl. across samples was uniform broadly but reduction, ina further reads resulted merging R1 Theand R2 of (forward-reverse) paired-end Table S1). samples (Suppl. macro-fauna and between meio- or anddownstream in totalreadbetweenfiltering upstream after numbers quality reads. only of differences There 10-20% nosystematic were wasretaining sometimes lost, proportion the reads for the initial of The raw readssubjected tobasic were qualityfiltering, which reduced theusable aboutreads 90% to Swarm, CROP) under similarity thresholds of 3% and 10% for the and for 3% 10% of thresholds similarity under CROP) Swarm, mergepairs fragment, from 543 to 1150 for for to 1150 543 from fragment, fragment (Fig.3). In the de novo in Usearch for read merging), to which we applied the three clustering methods (Usearch, (Usearch, methods three clustering the we applied which to merging), read for Usearch in OTUs were identified against the Genbank database via LCA assignment to major to viaassignment LCA database Genbank the against identified were OTUs SSU SSU data, the OTU composition was also dominated by Arthropoda but the Arthropoda dominatedby wasalso data, the OTU composition bc5' marker, while for both de novo and and bc3' bc3' OTUs using Megan using OTUs , and from 193 to 193, andfrom for 590 to fragments, the LCA assignment showed that samples were were samples that showed assignment LCA the fragments, s= resulted in in shared 2). ~90% OTUs (Fig. resulted 0.5 for read0.5 denoising, the CROP clustering method lackedthat of 4.6% the CROP method clustering cox1 a denoising tool (BFC software) resulted in a markers in many samples a much greater a greater samples much many in markers

SSU cox1 Maxee=1 (Suppl. Table S2). Table(Suppl. S2). fragments, and 3% for the for 3% and fragments, bc5' SSU and , where a 65.6% of the the of a65.6% where , for read filtering, bc3' bc3' s fragments, =20 which which =20

and and Accepted Article This by is protectedarticle reserved. Allrights copyright. toapproximately 1/4 was reduced which the Diptera, and Insecta specifically total number of OTUs wassubstantially lower compared tothose from to a maximum of >20000 reads for some species of . Out of a total of 207of atotal of Out Baetidae. of species some for reads >20000 amaximum of to magnitude, orders of by varied four each OTU for BOLD. Read numbers with consistence maximise lumped, to were these for anddata BIN, a single from sequences of clusters separate produced of pair a cases, two In system. BIN their BOLD in included by not were but sequence, Table theseand S4for theof Eighteen Suppl. 207 complete dataset). The majority ofthese 207 to matches in resulted set reference the against reads sequence of mapping Direct the Taxonomicagainst assignment Table S3). Suppl. 4; asubsta with but above, described as trend the same the with detectable OTUs only were and Nematoda most Platyhelminthes for while Mollusca, OTUs of OTUs with dozen several of recovery the to compared in reduced were OTU numbers and Diplostraca. Cyclopoida Isopoda, Amphipoda, Podocopida, as such orders major the genes detected both Table Crustacea, In S3). Suppl. 4; OTUs (Fig. Next, we mapped the we Next, mapped were obtained. in mostof (<10) but these reads a only fragment, cases few read matches, 36OTUs exclusively were obtained with the The total numbers of identified OTUs from read matching and Table S4). Suppl. 4; read counts (Fig. ( marker or Swarm, CROP) (Usearch, methods onthethree depending clustering Species detection Table 4). reads produce not of thatdid Suppl. an OTU number 1, (Table sensitivelow cases in of less wasslightly the read in thiswaymethod mapping, from the those although closely matched lower, as in the case of several classes and orders within Crustacea (Fig. 4; Suppl. Table S3), whose Suppl.Table S3), (Fig. 4; Crustacea within and orders classes several thecaseas of lower, in the thresholds of thoseat 10% the closer to matchingread from were both Diptera for particular in in similar, bc5' SSU and marker (Suppl. Table showed Taxonomic marker ata threshold (Suppl. S3). profiles obtained similarity 10% bc3' ) was generally consistent, and observed differences tend to occur for species with low low with species for occur tendto differences and observed consistent, was) generally cox1 marker in Annelida, and Rotifera were not detected with with detected not and Rotifera were Annelida, in marker de novo BOLD-OTUs generated OTUs against the the OTUs against generated a matched BINunique the in BOLD (Tabledatabase 1forDiptera BOLD-OTU cox1 bc5' . In contrast,. In the and bc3' reference database reference ntial reduction in the total number of OTUs (Fig. OTUstotal number of the reduction in ntial . In other taxonomic groups, the OTU counts the taxonomic. groups, In other BOLD-OTUs bc5' SSU de novo fragment and 46 OTUs with the and the with OTUs fragment 46 dataset retrieved a similar number of of number asimilar datasetretrieved BOLD-OTUs de novo

clustering generally were clustering cox1 . Species detections obtained obtained detections . Species , in particular in the th OTU analysis or even even or analysis OTU of the of SSU included a single a single included BOLD-OTUs at all, despite the the despite all, at bc3’ BOLD-OTUs. BOLD-OTUs and and bc5’ SSU with with bc3'

Accepted Article SSU; SSU; BLAST+Megan for OTU classification and reads based approaches using using approaches based reads and classification OTU for BLAST+Megan for very were similar Results 5, S4-S9). beta (Fig. of component diversity nestedness the for not but species, of turnover high a by driven mainly were sitesthat control and 2). Significant differences in community compositio for and 10% 3% at OTUs (a) for: datasets Crustacea and Insecta the Arthropoda, Metazoa, for were conducted sites sampling across and turnover nestedness diversity, beta total site. for NMDS spill the analyses and below above assemblages of nestedness and turnover the establish used to was samples metabarcoding various in species absence of The or presence species of indicative the and on community spill Effect composition species. cryptic of existence possible the even or species diversityonsome intraspecific indicating high Tablethe 4), seeSuppl. cases; all in unequivocally several cases, other In analysis. e.g. into 5 OTUs, for the entry Dytiscidae 3OTUs, into and even genus-level entries weresplit further, identified to species. the Similarly, single entry Limnephilidaefor in Thompson (Fig. 7). patterns showed very similar also separately fractions onthemeio- The NMDS(Fig.conducted 6, plots Table 2). identification and macro-faunal in Thompson Tanypodinae) subfamily and (Chironomidae taxa two as listed were which Chironomidae, the Notably metabarcodes. In many cases, Thompson the by recovered easily thesewere taxa in species other while Arachnida and others, Isopoda, Coleoptera, Plecoptera, Ephemeroptera, taxa, were including a range from higher of The taxa missing even (not level). taxonomic at OTUs the higher representeddataset taxa in not were Thompson ( This by is protectedarticle reserved. Allrights copyright. al. analysis (Thompson by provided traditional of macroinvertebrates the taxon list with Comparisons the some of eight several also matched reads sequence cases, the these some for of at 3%; multipleinto OTUs weresplit that reference databases were lesscomplete. The reduction at the 10% threshold generally affected cases cox1 2016) were made for Arthropoda based on identifications by read mapping with the the with mapping read by identifications on based Arthropoda madefor were 2016) Baetis sp. ) fragments at the 3% threshold level. The identifications included many exact matches to the matches to exact many included level. The threshold identifications at 3% the fragments ) and BOLD-OTUs

(c) for the BOLD reference dataset and et al. et et al. corresponded to 9OTUs and to corresponded (2016) list, although the latter included only 38 arthropod 38arthropod only taxa 11 of latter the although which intotal, included (2016) list, BOLD-OTUs (2016) were split into a total of 55 OTUs in the current analysis and most could be could most and analysis current the 55 OTUsin of atotal into split were (2016) but based on taxonomic assignments were identified as the same species, e.g. e.g. species, as same the were identified assignments basedontaxonomic but identified as the isopod identified BOLD OTUs et al. et Simulium sp. Simulium were assigned the same binomial (although not (although not thesame binomial assigned were (2016) did not separate the entities at the species level. level. species the at entities the separate not did (2016) bc5' bc5' n were detected in all cases between impacted cases inall between n impacted were detected and Asellus aquaticus Asellus corresponded to the 4 OTUs to corresponded in bc3' bc3' gene fragments (summarized in Table in (summarized gene fragments bc5' bc5' and

bc3'; (Fig. Table 4; Suppl. S4). BOLD-OTUs de novo (b) OTUs at 3% for for 3% at OTUs (b) et al. et clustering with with clustering cox1 (2016) was split wassplit (2016) bc5' for for cox1

and and

bc3' et

Accepted Article bioinformatics tools and parameter settings. If linked with existing DNA databases, metabarcoding metabarcoding DNA existing with databases, If linked andparameter settings. tools bioinformatics appropriate andthechoice the most metabarcoding, of for primers geneand universal regions addr by species approach, level explicit this towards microscopy (Friberg via traditional are yet abundant verywhich are diverseroutinel and meiofauna, permanent and temporary the both including – sample the of components all virtually data atthespecies butwith for capturing biomonitoring, level the added of conventional advantage as of theidentity as effectively taxacan now determine databases, reference of completeness Gray Friberg anyway; useabundance ever approaches relative only most biomonitoring routine (note that as aresult abundancesacrificed data were of the costand absolute although fraction time, al. et (Thompson taxonomic assignments based conventional on study linethe with earlier in River Kennet, the sites in and pesticide-impacted control between turnover species high revealed Metabarcoding DISCUSSION 2. period sites atthe sampling bicinctus were considered in combination, two Chironomidae ( and the period sampling when the spill whereas factor spill), 2 months after the sampling period (2.5 to the twosampling periods the resulted in Ephemeroptera with the Additional tests conducted (p<0.05). Mvabund albimanus Paratendipes bicinctus, ( species Chironomidae This by is protectedarticle reserved. Allrights copyright. Gammarus pulex Indval of unique versus both OTUs had a 53) theimpacted but (65 proportion high compared sites to species more slightly siteshad control the fraction, macro-faunal meio- and of count species total the combining Second, to fraction). addition the (in macrofaunal fraction recovered inthe meiofauna also were large-bodied most but size, body small their reflecting former, the in found entirely almost were Arachnida some and Collembola Brachiopoda, that showing example for samples, faunal meio- and macro- of thedistribution in patterns clear There themeiofauna. were OTUsto unique were meiofauna sample, only OTUsagainst 126 from the macrofauna (filtration at fraction >1 mm), and 70 to reads the bymatching detected OTUs 207 the Of 2016), but with far more complete and taxonomically resolved data. This was also achieved at a also but data. Thiswas with more andtaxonomically complete 2016), far resolved et al. analyses for strong effects of the spill revealed a statistically significant indicative value of indicative significant a statistically revealed spill the of effects strong analyses for ) ( ) and one Branchiopoda analyses resulted in si in resulted analyses 2016). The high read depth and accuracy of Illumina sequences, in addition an increased to inaddition sequences, The Illumina 2016). of and high read accuracy depth

associated to control sites (indval p<0.05), while p<0.05), sites (indval control to associated Tanytarsus eminulus, T. brundini, T.pa T. brundini, eminulus, Tanytarsus gnificant differences for for differences gnificant Chydorus sphaericus ) were associated to impacted sites (indval p<0.05) (Table 3). (Table p<0.05) (indval sites impacted to were associated ) et al. 2011; Gray 2011; indval essing ofappropriate twokey issues: the selection BOLD-OTUs function to identify indicative species associatedtoidentify species indicative function y ignored due difficulties in to their identification et al. Eukiefferiella claripennis Eukiefferiella ) were found as indicative of theimpacted of as were) found indicative Asellus aquaticus Asellus 2016). Our study takes a further step step afurther takes study Our 2016). luteolum Centroptilum llidicornis, T. ejuncidus, Cricotopus ejuncidus, T. llidicornis, , 170 were obtained from the from 170were obtained , Asellus aquaticus Asellus and and and Tanytarsus eminulus eminulus Tanytarsus as indicative of of as indicative Cricotopus Cricotopus and several et al. et 2011; 2011; Accepted Article cox1 of for some groups amplification Even so, poor universally. species can separate that a locus single of all for amplification phyla or can truly primers be found taxa. universal that It isunlikely target the on depending need PCR breadth of amplification carefully be considered to the spectrum and in differences taxonomic resolution of issues Thesewell-known theworld. most around schemes underpin arthropods whereas biomonitoring, traditional in are ever if groups considered rarely former only with were Nematoda recovered of conservation the greater to due presumably strongly, differed also profiles taxonomic the overall close of lumping relatives, to addition the in However, Chironomidae. the as such species, congeneric multiple groups containing (Tang thanmitochondrial DNA andthus presumablyclosely relatedspecies are collapsed into singleOTUs the In contrast, numbers reads. by of low OTUs represented due mostly to are disagreements fact the that in as reflected species, biomass) (low on low-abundance PCR amongindividual variation to due is possibly but investigation, further requires which species, bo with detection OTU between Differences reliable. islargely from mixtures thatamplification indicating groups, major of and distribution taxonomic Theprimers usedhere both for metazoans. and other arthropods of a wide range taxonomically universa that demonstration the (iii) studies; and a different of comparison for allowing choice, of marker the standardization the accepted widely benefits from an alreadyavailable and growing database for this marker from standard barcoding; (ii) communities. the using The of metabarcoding studies animal of advantage This by is protectedarticle reserved. Allrights copyright. the of fragment barcode the criteria, these on data.Based reference available of completeness species, and the of separation of power groups, of taxonomic major amplification affecting: the importantmarker choice, effects of genes illustrate the mitochondrial of portions amplifying here used genetic markers two The PCR primers. choiceof and the samples environmental various from DNA extraction for the wetlab by of outcomeprocedures affected theearly steps The is final metabarcoding. stagesof all at settings parameter and processing data a for options rangewide of with confronted are Researchers and bioinformatics gene markers decisions: of choice Methodological functioning. andecosystem community structure of biomonitoring bridge to and, ultimately, sites and undisturbed disturbed of to the indicators identify of composition, community resolution level species provides the possibly can be alleviated with different primer sequences, at least for regions that regions are that partly at least for primer sequences, different with alleviated can be possibly et al. 2012). This was evident from the lower number of OTUs obtained, in particular for for in particular OTUsobtained, of number lower the from evident was This 2012). SSU cox1 primer binding sites across phyla. Specifically, Platyhelminthes and Platyhelminthes Specifically, across phyla. binding sites primer fragments produced rather similar results for OTU total for numbers similar results rather produced fragments SSU , whereas the efficiency was lower for Arthropoda. The Arthropoda. for lower was efficiency the whereas , cox1 gene remains the most powerful choice powerful for gene most the remains l primers recover complex communities composed of composed communities complex primers recover l th primers are mainly areth individual at thelevelof mainly primers SSU (rRNA) marker isless far variable cox1 and the nuclear 18S rRNA andthenuclear 18SrRNA

cox1 gene to(i) isthe due Accepted Article Species definitions in Species definitions consistently. and set be applied chosen should parameter a particular comparability, for study; each for necessary not is reassessment its so wide and spaceisfairly the appropriate parameter be but applied, should settings parameters stringent rule, ageneral analysis. As and turnover species counts with complications thuslater avoiding metazoans, target clade the major only andretaining for UPGMA rapid trees using approach phylogenetic viaa removed These were bacteria. taxa, non-target including to correspond or non-mitochondrial are that either Wemany that OTUs generated also are outcome the OTUfind the of delimitation. final clustering of3% threshold that here,was applied an OTUs the into areat the minimum subsumed among that variants reads the mainlyminor eliminates procedure assembly of the Presumably, denoising OTUs. <10% changed the outcome final for the of Maxee) only maximal error, frequency (the inthe reads expected final based filtering error the on andthe step The read thesoftware. of pairing applications most of published within the boundaries are that overbroadly a wasparameters denoising range robust step of of methods. first Equally, the and Swarm OTUs Usearch fewer the than slightly detected which algorithm, perhaps CROP the OTUs, except generated methodology uniform largely different employing fundamentally algorithms For the bioinformaticsparameter processing, different settings effect. had Different little clustering barcode the for databases theavailable sequence against comparisons of theadvantage misses this approach This by is protectedarticle reserved. Allrights copyright. in the and dependent onthethreshold. 3% to If set evident were species hypotheses inparticular differences OTUagreed of delineations, and onthe number extent algorithms broadly clustering various andnuc copies (heteroplasmy) variable mitochondrial from variants andtrue isoverlain by but variation errors, actual sequencing the read Paulay 2005), similarity thresholds, akin to the search for a barcoding gap standardin (Sanger) barcoding (Meyer & on based limits species the define algorithms Theclustering 2004). (Wheeler circumscription species data for onsolid based eachspecies for taxon concept a valid sample, requires which and the speciesthe of in identification recognition procedure is the themetabarcoding of A aspect key (Taberlet differentiation andspecies-level unbiased PCR broad, compromise between a present which studies, metabarcoding recent in genessuggested and 16S) (12S rRNA mitochondrial relatively little added effort. An alternative is the use of slightly more conservedmarkers, suchas the with libraries in Illumina be the included to ontheDNAextracts, PCRs usingmultiple overlapping, application of more refined algorithms to the OTU circumscriptions is desirable, although coalescence although isdesirable, the circumscriptions OTU to of morealgorithms refined application The lumpedmany species. threshold 10% conservative thehighly whereas Linnaean taxonomy, cox1 cox1 marker, OTU delineation largely reflected the species numbers expected from the the expected from numbers species the reflected largely OTU delineation marker, gene fragment. gene fragment. de novo clustering and read mapping read clustering and d thus their prior elimination has little impact on impact on has little elimination prior their d thus lear mitochondrial insertions (numts). While the While insertions (numts). lear mitochondrial

et al. et 2012), but but 2012), Accepted Article novo similar results (although with slightly fewer entities detected in the latter). When working with produced generally 2011) & Westcott (Schloss thedatabase reference against via mapping read on approach based independent’ the i.e. ‘taxonomy OTU community generating lists, for approaches here used The twoprincipal illustrated illustrated of by oneOTU retrievedclassification the with both independently stringency 90%)willresult(about in many OTUs similarity whereas higher (e.g. are70%), considered thresholds similarity when relaxed classifications oninaccurate can result databases reference of incompleteness conducted. Nevertheless, as here databases, reference against searches onBlast based tool asa proven useful Megan has been related thepresence to divergent ofmultiple copiesmitogenome in the focalspecies.For example, be may but unclear remains OTUs related closely these of relevance The information. additional with tobe need evaluated limits species whose real in some species variation intraspecific high reflecting and BINs multiple cases several in Linnaeannames, thebut a with show congruence general good Both compilations threshold. fr asthose OTU circumscriptions similar produce they that and weestablished of the reference data, compilation existing analternative database are sampleswereconsidered furtherof (a207 OTUs). total The BINgroups presented the BOLDin in target the encountered sequences tothe metabarcoding similarity with those only which after database (the areference theproduce data, inanalogy to to metabarcode sequences, these on andconducted OTUclustering sequences available all we downloaded For simplicity, in BOLD. good heavily representation species we onthe relatively draw could freshwater arthropods In a Linnaean theEuropean a binomial. and sequences to set of link variation intraspecific that define databases reference the to shifted is delimitation species of problem the approach, mapping read the In these best matches,and thus this sequence from divergent quite a lineage representing despite thematch of area database, a in reference certain match inthe database, hasclose no focal sequence at toall taxononomic levelidentification the of deepest corresponding thesematches. weakIf the weak matches with several other referencesequences from distantly related taxa, return to an Meganuse may suchasimilarly match, weak with For sequences reference. the align identified to to indels several 80% below sequenceand the requires similarity toa matches with corresponds few the notexpected identification in group target amarine This tobe sample. Echinodermata, present This by is protectedarticle reserved. Allrights copyright. procedure. comparativelyshort, which limits the resolving powerof these sequences in anyspecies delimitation are sequences metabarcoding addition, In reads. of number thevery large easily to applicable GMYC andPTP (Pons basedincluding procedures generated OTUs, attention should special be pa BOLD-OTUs de novo generated OTUs, and the ‘taxonomy dependent’ approach approach dependent’ ‘taxonomy the and OTUs, generated is necessarilyis not relatedtoEchinodermata. were associated to a single binomial, probably probably a single to binomial, associated were om the Usearch clustering at a 3% similarity similarity at a 3% om the clustering Usearch with notaxonomic Thiswith assignation. is here spurious spurious Blast similarities indicate might a best et al. id to the method used for classifying OTUs. classifyingOTUs. used for method the id to 2006; Zhang 2006; et al. 2013) currently are not not are currently 2013) cox1 fragments as BOLD-OTUs de ), Accepted Article Gammarus nekkensis to these cases may affect the referencesequence Woodward 2011; iscomplementedwith biomonitoring mapping direct wherefor approach, combined ause should protocols standardized Ideally, ecosystems. unstudied hitherto in biodiversity discovery in used be could and patchy, still is sets reference of coverage whose assemblages species for option the of a Linnaean name. The robustness link without to albeit reference in to yetembedded databases, particular in identify species useful not the from OTUs studies, different between comparability maximising OTU clustering, of step problematic the approachcircumvent could read Whereasmapping direct the European macroinvertebrates. aquatic for coverage an excellent holds already database barcode existing the cases, complicated taxonomically these beyond Yet, delineation. consensusspecies for biologist and molecular by taxonomists expert needassessment acareful of the highlights and complexity This taxonomic molecular to 13-06-2017). 10 BINsup forms (accessed be that noted should Williams mult of consisting complexes species established ephemeropteran the as such cases, between structural biodiversity and functional measures related to ecosystem processes (Friberg (Friberg processes ecosystem to measures related biodiversity and functional between structural (Tachet inEurope species of aquatic invertebrates a large for proportion available databases traits to connected can metabarcoding with the then be taxonomic In impact). through link future, direct identifications, an after happens turnover high while can be stable richness Chironomidae (e.g., practices management and improve conservation and subsequently to monitoring programs, by river reached conclusions improveto can which be used as Chironomidae, taxa, such level even mostproblematic the in species to to and resolution taxonomy link Thisprovides a reliable biomonitoring. for identification species stable and repeatable easily provides grouping DNA-based these against reads sequence mapping of straightforward the hand, set is in reference Oncea validated databases. local reference refine to contributing an process, iterative through database, the to gaps help reference identify in can This databases. reference OTUswith of comparison a and parameters defined under clustering (Doublet copies heteroplasmic as constitutively maintained are apparently that some isopods, including including some isopods, This by is protectedarticle reserved. Allrights copyright. et al. et et al. 2002; Schmidt-Kloiber & Hering 2015), to bridge the critical gap that still exists to & 2015), still gap that bridge theHering critical Schmidt-Kloiber 2002; 2006;Rutschmann et al. G. pulex with the with 2013). 2013). A. aquaticus is represented by 5 different BINs at BOLD, whereas whereas at BINsBOLD, by different 5 is represented bc5' bc5' et al. rhodani Baetis fragment, whereas , exhibit atypical mitogenomes composed of duplicated regions duplicated of regions composed atypical mitogenomes , exhibit 2014; Bisconti 2014;Bisconti de novo approach shown here therefore offers a defensible a defensible offers shown here therefore approach databases, such the detection of the oriental the oriental of detection the such databases, iple cryptic species (Karaman & Pinkster 1977; 1977; & Pinkster iple species(Karaman cryptic or the or G. pulex et al.

amphipod amphipod 2016). Taxonomic difficulties associated Taxonomic 2016). associated difficulties is exclusively found with with isexclusively found Gammarus pulex de novo clustering could still couldbe still clustering et al. et G. nekkensis G. 2012). Other 2012). , are already well well already , are de novo bc3'. bc3'. OTU OTU It It

et al.

Accepted Article This by is protectedarticle reserved. Allrights copyright. Gray waters (e.g. and running standing multiple and decades several nowspan data that species-level Network Waters Monitoring approach biomonitoring’ (Bohan this testbed for ‘next-generation reference streamsites used for the UK-wide RIVPACS biomonitoring scheme would offer an ideal pre-defined 600+ For instance, the quality. water assessing for and indexes schemes against existing these data matching after perhaps purposes, regulatory manner for inaapplicable consistent highly species show aclear responsethat environmental Ultimately to thesemethodsconditions. be should additional may reveal sampling denser and many to tobe developed samples, well moreapplied sufficiently now methodology impactzone. the Theis aboveand below three three sites each, from samples Surber independent two of consisted sample Our error. stochastic by affected is which itself among taxa, thevariation and the sampling runs, PCR particular efficiency theprimer low species, for Linnaean of splitting and lumping of problem the inventories: metabarcoding complete generating of challenges remaining some the highlights of also The study assemblage. freshwater the of sampling incomplete of and diversity, cryptic poor separation identification, namelyspecies the incomplete techniques, conventional of drawbacks the of some resolve thus factors These microscopy. light under even assessments, invisual overlooked are often which species, larger-bodied otherwise of instars early inpart representing meiofauna, thetemporary especially in those minute of detection specimens, species multiple names into molecular OTUs the 3%threshold (at in of split (ii) the binomial the for Chironomidae; especially levels, conducted onhigher taxonomic was the analysis that compared morphological to resolution taxonomic the greater (i) high numbers: to the contribute aspects Three parameters. detection 5 of - the OTU even under conservative most analyses (Thompson conventional using detected species invertebrate of number the far study exceeds metabarcoding our species identified in of Thediversity total studies ecological and biomonitoring Implications for biodiversity. Most of the taxa in the impact samples are orthoclads (grazers on stones and plant and onstones plant (grazers the are the taxa samples orthoclads of impact Most in biodiversity. butprovided onchironomid no information in the andpredators, absencecompetitors potential of abundance in increases revealed marked a singlethat entity taxa into andspeciose highly responsive lumped these simply ecology, freshwater asin common approach, is density. Thelower conventional at already present species of abundance in just not richness, species in increase an indeed is increase thecrashes tooccupyin following rapidly vacantniches and particularin shows it thatthe most resilient detailed complements species list now with information broad-scale this traditionalthe approaches, The each three with markers. the of established communities the for and meio- and macro-fauna, for classes, arthropod different for levels, taxonomic various at discerned be can This sites. impacted and control the from samples separated clearly ordinations the as samples, among individual variation the despite community composition, in shifts the observed previously confirmed analyses Our et al. et 2016). 2016). r -selected taxa, such as chironomids, recover most recover aschironomids, such taxa, -selected

K -selected taxa. Ourstudy confirms that this et al. cox1 et al. 2017), as would the as UKAcid 2017), would ); and (iii) the better better and(iii)the ); 2016) - by at least a factor afactor least at - by 2016) Accepted Article DATA ACCESSIBILITY DATA ACCESSIBILITY communities. freshwater in responses species about discussions for Ekrem Tjorborn and Price Benjamin Brooks, Steve to grateful We DYW, are APVandKB. to NERC by grant NE/M021955 This was funded study Acknowledgements stressors. environmental to responses its and ecosystems freshwater biodiversity in of understanding andmechanistic informative deliver a more promises far to is available, expertise taxonomic which for thana subset taxa,use all to narrow capacity rather freshwater species in Europe, and also to begin to elucidate relevant trait differences. In addition, the for now available thatare databases exploit extensivetime for the first ecological the to enabled us ability to detect the spill’s impact onthecommunity. The offull availability species-level inventories our threshold affected or method the or clustering theread marker, processing the neither community, assessment ofwater quality and ecological Atleaststatus. interms detectingof relevant changes in state-of-the-art current advance the biomonitoring to next-generation for potential its highlights case study specific andour generally, studies spacemetabarcoding of study the parameter Our refines Conclusions possible.. currently drivers than is a of range in and to wider and variables relation detection response both of power open to improving approach usedoor biomonitoring could the here we the sensitivity next-generation ar stressors other to responses whereas acidification, or, toaextent, lesser toorganic pollution responses on isfocused at present Biomonitoring schemes. turn is likely toreflect marked functional trait shifts thatare overlooked inroutine biomonitoring interms of both impactsand resilienceis masked (Thompson sites thepost-spill in observed species diatom large several of the with increase is consistent 2004), detritus feeders that have been shown to feed on diatoms in the early larval stages (Ingvason (Ingvason stages theearly tofeedlarval diatoms on in shown have been that feeders detritus This by is protectedarticle reserved. Allrights copyright. Raw metabarcode data deposited in Dryad: doi:10.5061/dryad.104kg of species several of The increase species. somepredatory to addition in sediments, soft in living detritivores are that others afew with surfaces), et al. et

These2016). that suggest preliminary datainferences response theecological about

Tanytarsus by limitedtaxonomic andresolution, this that in e still poorly characterised in natural systems: the systems: innatural poorly characterised e still , a group of dominant sediment-dwelling sediment-dwelling , a dominant of group et al.

Accepted Article This by is protectedarticle reserved. Allrights copyright. BOLD:ACP0608 BOLD:ACI4790 BOLD:AAJ7051 BIN thresholdof (i) processed and reads (ii)OTUs clustered3% at (details in text). 1. Table 3. Codemetabarcoding pipelines, readfor processing clustering and 3.tables Supplementary 2.figures Supplementary onPCR1. Information reagentsand conditions MATERIALS SUPPLEMENTARY manuscript the of writing the final and the results of discussion the to contributed author all and writing the led APV and CA designed the analyses; CAdid PA and the molecular worklab and performed the analyses; APV GWand conceived the study; CG and GW collected the samples; CA, PA, DWY and APV CONTRIBUTIONS AUTHORS'S BOLD:AAM9576 BOLD:AAD1527 BOLD:ADE2432 BOLD:AAW5799 BOLD:ACS1169 BOLD:AAC7552 BOLD:ACP8764 BOLD:ACR1089 BOLD:ACQ3496 BOLD:AAP5886 BOLD:ACD1957 BOLD:AAI1530 BOLD:ACP6740 BOLD:ACD1670 BOLD:ACR0263 BOLD:ACP2182 BOLD:ACT8698 BOLD:AAW0928 BOLD:AAW5785 BOLD:AAD8971 BOLD:ACU8677 BOLD:AAP5931 BOLD:AAF2345 BOLD:AAW5449 BOLD:AAM5389 BOLD:AAI6018 BOLD:ACX3335 BOLD:AAT9677 BOLD:ACT5340 BOLD:AAE4568 BOLD:AAU2576 BOLD:AAA5299 BOLD:AAM5377 BOLD:AAW4635 BOLD:ACT0982 BOLD:AAO1037 BOLD:ACM0242 BOLD:AAL3267 BOLD:AAD4167 BOLD:AAB8862 BOLD:AAX3566 BOLD:AAL0178 BOLD:AAU2481 BOLD:AAC7823 BINs of Diptera from BOLD identified based on on based identified BOLD from Diptera of BINs

Bibionidae Chironomidae Chironomidae aiy anseisi f I c5 c3 -c5 -c3 M M T T2 T1 MS MC I C r-bc-3' r-bc-5' bc-3' bc-5' BIN of species id Main Cecidomyiidae Cecidomyiidae Agromyzidae Family Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Ceratopogonidae Chironomidae Chironomidae Chironomidae R/U/C/S R/U/C/S 550 93 b5 b3 b5 b3 b5 b5 b3 b5 b3 b5 b3 b5 b3b5 b3 b5 b5b3 b5 b5b3 93 550 R/U/C/S R/U/C/S Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae hrnmdeChironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae febrilis [21] febrilis Dilophus A A Polypedilum albinodus [3] [3] albinodus Polypedilum Micropsectra lindrothi [18] Brillia bifida[6] Conchapelopia melanops [11] melanops Conchapelopia [3] hittmairorum Conchapelopia Palpomyia flavipes [6] flavipes Palpomyia irpetapliua[24] pallidula Micropsectra ocaeoi aldl [1] pallidula Conchapelopia Micropsectra sp. 5ES [33] [33] 5ES sp. Micropsectra Corynoneura sp. Corynoneura Microtendipes pedellus [8] [8] pedellus Microtendipes Corynoneura sp. [6] sp. Corynoneura aoldu etnri [6] rectinervis Nanocladius Cricotopus bicinctus [1] bicinctus [1] Cricotopus [19] annulator Cricotopus [139] albiforceps Cricotopus Orthocladius rubicundus [119] [119] rubicundus Orthocladius [330] oblidens Orthocladius Cricotopus bicinctus [1] bicinctus [1] Cricotopus Orthocladius rubicundus [25] [25] rubicundus Orthocladius Cricotopus bicinctus bicinctus [123] Cricotopus aaldu udioou [2] quadrinodosus Paracladius Cricotopus trifascia trifascia Cricotopus [5] sylvestris [64] Cricotopus [289] rufiventris Cricotopus bicinctus [27] Cricotopus Paracladopelma camptolabis [2] [2] camptolabis Paracladopelma Eukiefferiella claripennis [185] Paratanytarsus sp. Paratanytarsus [3] lauterborni Paratanytarsus [12] dissimilis Paratanytarsus eronmsernts[18] eurynotus Metriocnemus [13] nebulosa Macropelopia [2] 2SW sp. Heterotrissocladius oyeiu lion [78 albicorne Polypedilum [13] flavipes Phaenopsectra Paratendipes albimanus [127] Micropsectra atrofasciata [26] Micropsectra contracta[14] psectrotanypus trifascipennis [9] trifascipennis psectrotanypus [19] pseudoreptans gromyza [13] [60] [92 [3]

]

] R/U/C/S R/U/C/S 478 428 b5 b3 b5 b3 b5 b5 b5 b3 b5 b5 b5 b3 b5 b5b3 428 478 R/U/C/S R/U/C/S R/U/C/S R/U/C/S 227 25 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 b3 b5 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 25 104 227 145 R/U/C/S R/U/C/S R/U/C/S R/U/C/S b3 b5 b3 b5 b5b3 b3 b5 98 62 R/U/C/S R/U/C/S /// RUCS 5 3 b b b3 b3 b5 b3 b3 b5 b3 b5b3 b3 b5 b5b3 1446 81 5 R/U/C/S 0 R/U/C/S R/U/C/S -/-/-/- R/U/C/S R/U/C/S 33 26 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 26 33 R/U/C/S R/U/C/S R/U/C/S R/-/-/S 455 31 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 31 455 R/-/-/S R/U/C/S R/U/C/S R/-/-/- 5 4 b5 b3 b5 b3 b5 b3 b5 b5b3 b3 b5 4 5 R/-/-/- R/U/C/S /// /// 5 5 b b5 b5 b5 0 1 -/-/-/- R/-/-/- R/-/-/- R/-/-/- 1 4 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 4 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 b5 b3 331 1 1018 R/U/C/S R/-/-/- R/U/C/S R/-/-/- R/U/-/- R/U/C/S 20 41 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 b5 b3 41 20 R/U/C/S R/U/-/- R/U/C/S R/U/C/S 91 53 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 53 91 R/U/C/S R/U/C/S /// ---- 3 0 b 5 b b5 b5 b5 b5 0 130 -/-/-/- R/U/C/S /// /// 5 5 b5 b5 b5 0 3 -/-/-/- R/U/C/S R/U/C/S R/U/C/S 61 1641 b5 b3 b5 b3 b5 b3 b3 b5 b3 b5 b3 b3 b5 b5b3 b3 b5 1641 61 R/U/C/S R/U/C/S R/-/-/- R/U/C/S 11 9 b5 b3 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 b3 9 11 R/U/C/S R/-/-/- R/U/C/S R/U/C/S 1178 1263 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 b3 b5 b5b3 1263 1178 R/U/C/S R/U/C/S R/U/C/S R/U/C/S 12 19 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 19 12 R/U/C/S R/U/C/S R/U/C/S R/U/C/S 21 72 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 b5b3 72 21 R/U/C/S R/U/C/S R/U/C/S R/U/C/S 593 166 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 b5 b3 166 593 R/U/C/S R/U/C/S /// ---- 0 5 b 5 b5 b3 b5 b5 b3 b5 b5 b3 b5 b5 b5 b5b3 b3 b5 0 b5b3 60 4 14 2 30 -/-/-/- R/U/C/S R/-/-/- R/-/-/- R/U/C/S R/-/-/- /// RUCS 8 4 5 b 3b 3 5b b5b3 b5b3 b5 b3 b5 b3 b5 64 98 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 b5 b3 5138 R/U/C/S 11359 R/U/C/S R/U/C/S R/U/C/S /// ---- 0 b 5 b b5 b5 b5 b5 0 5 -/-/-/- R/-/-/- R/U/C/S R/U/C/S 39 145 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 b3 b5 145 39 R/U/C/S R/U/C/S /// /// 7145 5 5b 5b*b 3b b5b3 b5 b3 b5 b3* b5 b5 b3 b5 4759 5751 R/-/C/S R/U/C/- /// /// b b b3 b3 b3 5 0 R/-/-/- -/-/-/- R/U/C/S R/U/C/S 4 88 b5 b3 b5 b3 b5 b3 b5 b5b3 b3 b5 b5b3 b5 b3 b5 b3 b5b3 b3 b5 b3 b5b3 b3 b5 b5b3 126 88 b5b3 b5 b3 b5 b3 119 44 4 b3 R/U/C/S 1991 8 R/U/C/S 123 R/U/C/S R/U/C/- R/U/C/S R/U/C/S R/U/C/S R/U/-/S R/U/C/S R/U/C/S 13 15 b5 b3 b3 b5 b3 b3 b5 b3 b5 b3 b3 b5 b3 b5b3 15 13 R/U/C/S R/U/C/S R/U/C/- R/U/C/- 164 116 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 b3 b5 116 164 R/U/C/- R/U/C/- R/U/C/S R/U/C/S 47 4 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5 b3 b3 b5 b3 b5 b5b3 b3 b5 b5b3 b3 b5 4 47 239 90 R/U/C/S 157 35 R/U/C/S R/U/C/S R/U/C/S R/U/C/S R/U/C/S R/-/-/- R/U/C/S 3 18 b5 b3 b5 b3 b5 b3 b3 b5 b3 b5 b5b3 18 b3 b5 b3 b5b3 b5b3 b5 3 b5 b3 58 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 b5 b3 R/U/C/S 234 29 48 R/U/C/S R/U/C/S R/-/-/- R/U/C/S R/U/C/S R/U/-/S R/U/C/S 10 20 b5 b3 b3 b5 b3 b3 b5 b3 b5 b3 b3 b5 b3 b3 b5 b3b5 b3 b5 b5b3 b5 b3 b3b5 20 10 159 R/U/C/S 161 R/U/C/S R/U/-/S R/U/C/S R/U/C/S R/U/C/S 826 596 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 b3 b5 596 826 R/U/C/S R/U/C/S usearch_global searches under searches under a 3% similarity Accepted Article This by is protectedarticle reserved. Allrights copyright. bc-5': Notes: BOLD:AAD7458 BOLD:AAW4728 BIN BOLD:ACZ6583 BOLD:AAJ5023 BOLD:AAN6407 BOLD:AAB8624 BOLD:AAP9556 BOLD:AAA8323 BOLD:AAN3314 BOLD:AAL7819 BOLD:ABA7297 BOLD:ACY5064 BOLD:ACP1316 BOLD:ACQ1908 BOLD:ABV4656 BOLD:AAV2322 species are the most abundant within each BIN, in brackets the number of specimens identified to species level in the reference inthe level species to identified of specimens number the brackets in BIN, each within abundant most the are species fragments respectively on based the processed reads. In bold species identified with indval analyses as indicative for impacted bc- the OTUwith of the detection the b3indicates and b5 spill). the after months 2 (2.5 intime collected Samples T2: spill); afte days 1(11 intime collected Samples T1: subsamples; Meiofauna MS; subsamples; Macrofauna MC: sites; (downstream) Impacted and SWARM respectively. r-bc-5' and r-bc-3': Number of reads matched for bc-5' and bc-3' respectively. C: Control (upstream) si BOLD:AAE7386 BOLD:AAD0309 BOLD:AAG1011 BOLD:AAF6378 BOLD:ACQ8988 BOLD:ACM5335 BOLD:AAU2625 BOLD:AAB9119 BOLD:AAW1102 BOLD:AAU4439 BOLD:ACF7553 BOLD:AAV3526 BOLD:ACR3318 BOLD:ACD2995

cox1

barcode 5'fragment; bc-3': Sphaeroceridae Simuliidae Simuliidae Simuliidae Simuliidae Psychodidae Pediciidae Ephydridae aiy anseisi f I b-' c3 rb-'rb-'C M M T T2 T1 MS MC I C r-bc-3' r-bc-5' bc-3' bc-5' BIN of species id Main Chironomidae Chironomidae Family n.a Diptera R/U/C/S R/U/C/S 106 95 b5 b3 b5 b3 b5 b5 b3 b5 b3 b5 b3b5 b3 b5 b5 b3 b5b3 b5 b3 b5 b3 b5 b3 b5b3 260b5 b5b3 95 97 106 R/U/C/S R/U/C/S R/U/C/S R/U/C/S Tabanidae Diptera Diptera n.a n.a Chironomidae Chironomidae Tipulidae Chironomidae Tipulidae Tipulidae Chironomidae Chironomidae Chironomidae Chironomidae Tipulidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Empididae Empididae ferruginata [126] [126] ferruginata Coproica [27] vernum Simulium Simulium velutinum[11 [151 silvestre Simulium Simulium ornatum [41] sp.[8] Psychoda [6] bimaculata Dicranota [8] tenuicosta Scatella Prodiamesa olivacea [46] [2] pullum Polypedilum hyosceuin [8] caecutiens Chrysops Rheocricotopus chalybeatus [6] [6] chalybeatus Rheocricotopus Tipula benesignata [2] [2] benesignata Tipula horctpsfsie [20] fuscipes Rheocricotopus Tvetenia calvescens [14] [14] calvescens Tvetenia Tipula paludosa [355] Synorthocladius semivirens [7] [7] semivirens Synorthocladius edwardsi [10] Stempellinella [10] bausei Stempellina Tvetenia calvescens [186] ayassbudn [13] Tanytarsus brundini Tanytarsus brundini [5] Tanytarsus ejuncidus [24] ayasseiuu [124] Tanytarsus eminulus ayasshudni [5] heusdensis Tanytarsus Tanytarsus heusdensis [6] [6] heusdensis Tanytarsus Tanytarsus pallidicornis [10] Chelifera precatoria [6 precatoria Chelifera cox1 barcodefragment; 3' ] ] ] /// ---- 1 b 5 b5 b5 b5 b5 b5 b5 b3 b5 b3 b5 b3 b5 0 b3 b5 b3 b5 b5 b5 b5b3 b3 b5 0 0 11 b5b3 b3 162 191 b3 b5 1 -/-/-/- 191 b5 b3 b5 b3 b5b3 -/-/-/- b5 b3 b5 b3 3 b3 b5 b3 b5 b5 b3 b5 b3 R/U/C/S 178 -/-/-/- 140 R/U/C/S 524 60 b5b3 12 R/U/C/S R/U/C/S b3 b5 b3 b5 b3 b5 R/-/-/- R/U/C/S R/U/C/S R/-/-/- b5b3 13 b5b3 R/U/C/S 385 27 R/U/C/- 156 R/-/-/- R/U/C/S R/U/C/S R/U/C/S R/U/C/S /// ---- 0 b b b5 b5 b5 b3 b5 b3 b5 b5b3 b5 0 b5b3 16 4 79 R/U/C/S -/-/-/- R/U/C/S R/-/-/- R/U/C/S -/-/-/- 18 0 b5 b5 b5 b5 b5 b5 0 18 -/-/-/- R/U/C/S /// ---- 8 5 5 b5 b5 b5 0 58 -/-/-/- R/U/C/S R/U/C/S R/U/C/S 21 26 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 26 21 R/U/C/S R/U/C/S R/U/C/S R/U/C/S 11 31 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 31 11 R/U/C/S R/U/C/S /// R--- 4 3 3 b b3 b3 b3 b3 b5 b3 b5 4 0 b5b3 1516 R/-/-/- 238 R/U/C/S -/-/-/- R/U/C/S R/U/-/S R/U/C/S 7 9 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 b3 b5 b3 b5 b3 b5 b5 b3 b5 b5 b3 b5b3 b5 9 b3 b5 b3 b5 b3 3924 950 7 b5b3 438 R/U/C/S R/U/C/S 141 R/U/C/S R/U/C/S R/U/-/S R/U/C/S R/U/C/S R/U/C/S 14729 1930 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 b5 b3 1930 14729 R/U/C/S R/U/C/S R/U/-/S R/U/C/S 2716 3155 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3b5 b3 b5 b5 b3 b3b5 3155 2716 R/U/C/S R/U/-/S R/-/C/- R/-/-/- 1496 16 b5 b3 b5 b3 b5 b5 b5 b3 b5 b5 b5 b5 b3 b3b5 16 1496 R/-/-/- R/-/C/- R/U/C/- R/U/C/S 2398 3230 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3b5 b3 b5 b5 b3 b3b5 b5 b3 3230 2398 R/U/C/S R/U/C/- R/U/C/S R/U/C/S 4847 2023 b5 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3b5 b3 b5 b5 b3 b3b5 b5 2023 4847 R/U/C/S R/U/C/S R/-/-/- R/-/-/- 1 5 b5 b3 b5 b3 b5 b3 b3 b3 b5 b3 b5 b5b3 5 1 R/-/-/- R/-/-/- R/U/C/S R/U/C/S 2 34 b5 b3 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5b3 34 2 R/U/C/S R/U/C/S /// RUCS 4 6 3 5b 5b b 3 3 b5 b3 b3 b5 b3 b5 b3 b5 b3 b3 469 34 R/U/C/S R/U/C/S R/U/C/ S: Indicates detection based on Reads, USEARCH, CROP 5' and bc-3' sites.Named database. tes; I: tes; r the r the Accepted Article This by is protectedarticle reserved. Allrights copyright. generated from BOLD data. OTU clusters mapping against of approach reads the from ‘taxon dependent’ theresults indicates Notes: control and impacted sites. Table 2. Arthropoda Crustacea Crustacea Metazoa Insecta Taxa

de novo Betadiversity values, NMDS stress values and p-values for the comparison between between comparison the for p-values and values stress NMDS values, Betadiversity fragment fragment cox1-3' de cox1-3' novo cox1-5 cox1-5 cox1-5 cox1-3 cox1-5 cox1-3 cox1-3 Dataset DNA DNA SSU de SSU novo de SSU novo de SSU novo SSU de SSU novo de novo de novo de novo de novo de novo de novo 3% and and 3%

' ' ' ' ' ' ' de novo de novo de novo de novo de novo de novo de novo de novo de novo ODrf .5 00501 .0 .1 .7 .2 .0 007 .9 0.569 0.095 0.077 0.006 0.128 0.779 0.113 0.104 0.19 0.669 0.005 0.181 0.856 0.073 BOLD ref 0.001 0.077 0.753 0.001 0.089 0.536 0.29 0.096 0.004 0.099 0.826 0.033 BOLD ref 0.146 0.758 0.108 0.109 0.17 0.014 0.857 BOLD ref ODrf .9 006 .4 .9 004 .2 003009 .7 012 0.475 0.004 0.132 0.843 0.072 BOLD ref 0.009 0.073 0.725 0.004 0.091 0.196 0.34 0.058 0.006 0.087 0.797 0.041 BOLD ref 0.162 0.759 0.089 0.154 0.16 0.023 0.846 BOLD ref Method Method de novo 1% .7 00402 .7 .0 .7013006 .0 001 0.779 0.081 0.007 0.103 0.832 0.006 10% 0.133 0.67 0.009 0.078 0.28 0.37 0.004 0.066 0.773 0.088 10% 0.028 0.144 0.761 0.009 0.088 0.19 0.005 0.849 10% 0.011 0.835 10% 1% .2 0.002 0.82 10% 0.006 0.763 10% 0.107 0.035 0.09 0.019 0.144 0.118 0.818 10% 0.731 0.036 0.12 0.18 0.005 0.821 10% 3 .4 .0 01 .1 .1 07 .3 .7 .7 .6 0.26 0.179 0.075 0.061 0.77 0.136 0.190.011 0.113 0.005 3% 0.845 3 077 .0 .2005006075018004 .4 009 0.798 0.059 0.042 0.004 0.118 0.745 0.006 0.095 0.223 0.22 0.076 0.001 0.07 0.787 0.011 3% 0.103 0.675 0.004 0.057 0.62 0.32 0.059 0.005 0.066 0.746 0.057 3% 0.161 0.739 0.132 0.173 0.16 0.037 0.805 3% 3 089 .0 .8012002076013003 .8 004 0.457 0.064 0.083 0.003 0.143 0.746 0.002 0.142 0.18 0.102 0.047 0.005 0.104 0.829 0.003 3% 0.656 0.095 0.658 0.069 0.118 0.006 0.001 0.085 0.122 0.369 0.38 0.684 0.063 0.013 0.002 0.092 0.328 0.762 0.107 0.154 0.382 3% 0.23 0.749 0.35 0.07 0.002 0.153 0.032 0.085 0.801 0.049 0.14 0.035 0.158 3% 0.036 0.143 0.16 0.042 0.108 0.776 0.726 0.09 0.099 0.03 0.025 0.107 0.089 0.067 0.841 0.087 0.149 0.087 0 3% 0.18 0.743 0.179 0.144 0.016 0.004 0.21 0.763 0.862 0.122 0.007 0.023 3% 0.17 0.775 0.086 3% 0.004 0.18 0.833 0.007 3% 0.85 3% beta.so 10% refers to OTU clustering at these threshold values. BOLD ref ref values.BOLD threshold at OTU clustering to refers these 10% r Betadiversity (Sorensen index) p-value p-value Adonis Adonis Adonis Adonis 0.18 0.19 0.20 0.17 0.18 0.43 r 2 stress .7 .0 07101400 002009 0.242 0.072 0.059 0.02 0.771 0.171 0.006 0.164 .7 .1 0760130.054 0.099 0.099 0.053 0.736 0.075 0.018 0.143 .9 .1 0780150.015 0.317 0.074 0.043 0.758 0.092 0.017 0.135 .0 .6 0790150.226 0.314 0.079 0.058 0.739 0.108 0.066 0.115 .3 .1 07 .4 .1 00 0 0.236 0 0.07 0.013 0.75 0.142 0.139 0.017 0.196 0.079 0.1 0.005 0.663 0.098 0.004 0.073 envfit p- value

easmstress beta.sim Turnover (Simpson (Simpson Turnover index) p-value envfit

beta.sne stressbeta.sne Nestedness (Sorensen- Simpson index) envfit p- envfit value

Accepted Article This by is protectedarticle reserved. Allrights copyright. 2. period Impacted T2-I: spill); the after months (2.5 2 period collection T2: sites; I: impacted sites, Control C: T:Treatment; Notes: BOLD:AAF2659 Branchiopoda Chydoridae Diplostraca BOLD:AAF2659 Branchiopoda Chydoridae Diplostraca Class BIN Order GENE Family 3. Table Species T reference database. reference database-dependent approach for approach database-dependent the ODAP91Isca itr Chironomidae Diptera BOLD:AAP5931 Insecta BOLD:ACV6778 Malacostraca Isopoda Asellidae Asellidae Malacostraca BOLD:ACV6778 Isopoda ODAU67Isca itr Chironomidae Diptera BOLD:ACU8677 Insecta ODAH90Mlcsrc mhpd Gammaridae Amphipoda BOLD:ACH7960 Malacostraca BOLD:AAA1971 Malacostraca Isopoda Asellidae Asellidae Malacostraca BOLD:AAA1971 Isopoda ODAW12Isca itr Chironomidae Chironomidae Diptera Chironomidae Diptera BOLD:AAW1102 Insecta Diptera BOLD:AAB9119 Insecta BOLD:ACQ8988 Insecta ODAI08Isca itr Chironomidae Diptera BOLD:AAI6018 Insecta ODAG33Mlcsrc mhpd Gammaridae Amphipoda BOLD:ACG8343 Malacostraca ODAU49Isca itr Chironomidae Diptera BOLD:AAU4439 Insecta ODAI08Isca itr Chironomidae Diptera BOLD:AAI6018 Insecta ODAA91Mlcsrc spd Asellidae Isopoda Malacostraca BOLD:AAA1971 ODAU49Isca itr Chironomidae Diptera BOLD:AAU4439 Insecta ODAE58Isca itr Chironomidae Diptera BOLD:AAE4568 Insecta ODAR38Isca itr Chironomidae Diptera BOLD:ACR3318 Insecta ODAO07Isca itr Chironomidae Diptera BOLD:AAO1037 Insecta ODAU07Isca Ephemeroptera Baetidae BOLD:AAU1007 Insecta ODAO07Isca itr Chironomidae Diptera BOLD:AAO1037 Insecta ODAH82Mlcsrc mhpd Gammaridae Amphipoda BOLD:ACH6832 Malacostraca ODAQ98Isca itr Chironomidae Diptera BOLD:ACQ8988 Insecta

Named species are the most abundant within each BIN, in brackets the number of specimens identified to species level in the the in level species to identified specimens of number the brackets in BIN, each within abundant most the are species Named

Species with indicativevalue as identified by

cox1 gene fragments bc-5' and bc-3'. bc-3'. and bc-5' fragments gene indval Chydorus sphaericus Chydorus sphaericus Chydorus rctpsbcnts[1] bicinctus Cricotopus A Asellus aquaticus [7] [7] aquaticus Asellus rctpsbcnts[1] bicinctus Cricotopus amrsnkess[1] nekkensis Gammarus Asellus aquaticus [85] [85] aquaticus Asellus ayasseucds[24] ejuncidus Tanytarsus brundin Tanytarsus brundin Tanytarsus rctpsbcnts[123] Cricotopus bicinctus amrsfsau [37] Gammarus fossarum Tanytarsus eminulus [124] Cricotopus bicinctus [123] Tanytarsus eminulus [124] uifeilacaiens[185] Eukiefferiella claripennis ayasspliions[10] pallidicornis Tanytarsus aaedpsabmns[127] Paratendipes albimanus etotlmltou [17] luteolum Centroptilum aaedpsabmns[127] Paratendipes albimanus Gammarus pule Gammarus Tanytarsus brundin Tanytarsus elsautcs[85] aquaticus sellus analyses based on the results of the reference reference the of results the on based analyses x [11] i i i [5] [13] [13] c5 2I1 0.014 bc-5' T2-I 0.046 0.6 1 bc-5' I c3 1 0.004 1 bc-3' I c5 2I1 0.024 bc-5' T2-I 1 c3 0.83 0.016 bc-3' C c3 08 0.01 0.8 bc-3' I c5 08 0.017 0.8 bc-5' I 0.02 0.8 bc-5' I 0.02 0.8 bc-3' I c5 2I1 0.022 bc-5' T2-I 1 c5 1 0.004 1 bc-5' C c5 0.86 0.019 bc-5' I c3 0.75 0.049 bc-3' I c3 06 0.048 0.6 bc-3' I c5 0.83 0.025 bc-5' C c3 1 0.002 1 bc-3' I c3 2I1 0.023 bc-3' T2-I 1 c5 0.86 0.015 bc-5' I c5 2I1 0.019 bc-5' T2-I 1 c3 0.86 0.015 bc-3' I c3 08 0.024 0.8 bc-3' I c5 21 0.001 bc-5'1 T2 c5 08 0.014 0.8 bc-5' I sites at collection collection at sites value Ind. P Accepted Article This by is protectedarticle reserved. Allrights copyright. Figure 3. step. denoising the in BFC=20 in CROP the clustering the with except for settings, commontoall percentage large lists a indicates OTU of intersection analyses most the in that Note parameterof settings. pair any for a shared OTUs of list in proportion the show The diagrams Fig. in S2. steps four each the of Figure 2. PCR also produce products. might otherwise that microorganisms and bacteria other including smaller items, flush used to amplewas both water At the steps sieving (D). meiofauna sieve micron retains whereas a0.45 (C), a through macrofauna 1 mm the sievemetal mesh retains that (B) bypassage separated were fractions both (A) flotation after how illustrates figure The samples. Figure 1. FIGURES at 3% sequence similarity threshold; dark grey: Usearch at 10% similarity threshold; grey: light threshold; 10%similarity read at grey: dark Usearch threshold; similarity at sequence 3% on thematches the to Maxee=1 paired reads after quality filtering using the following parameters: BFC with from was started each OTUs of program grey:with Theclustering CROP; dark Swarm. grey: light at 3% and 10% for the for 10% and 3% at against mapping read by asobtained matrices community presence/absence Figure 6. at a 10%similarity thresholdfor the against mapping read by asobtained matrices community presence/absence Figure 5. mapping to BOLD dataset. reference considered independently. independently. considered “m”) with subsamples “M”) andmeiofauna with Macro(labelled with (labelled database approach Figure 7. 3% for SSU( two portions of Figure 4. for readfiltering, for Number of OTUs at 3% similarity thresholds 3%similarity at thresholds OTUs of Number of settings parameter using from alternative Shared OTUs Surber original the from macro-fauna and meio- extracting for method The flotation NMDS total betadiversity ordinations for Arthropoda, Insecta and Crustacea based on based and Crustacea Insecta Arthropoda, for ordinations betadiversity NMDS total on based andCrustacea Insecta Metazoa, Arthropoda, for NMDS ordinations Total number of OTUs the with NMDS ordinations for Arthropoda and the Arthropoda for NMDS ordinations SSU cox1 3%), and 3%), by against read mapping . The OTU count is based on BLAST+Megan for for BLAST+Megan on based is count OTU The . BOLD-OTU bc-3' bc-3'

and and mergepairs bc-5' bc-5' reference databasefor the read mapping approach. Black: Usearch bc-3' bc-3' gene fragments ( fragments gene gene fragment. de novo in Usearch for read merging. merging. read for Usearch in generation and read mapping approaches for the for and approaches mappingread generation BOLD-OTUS cox1-5’ bc-5' at various hierarchical levels. Black: Usearch; Black:Usearch; levels. hierarchical at various 3%; 3%; and cox1-3' cox1-3' bc-3' ( bc-3' de novo de novo 3%; BOLD and bc-5' BOLD) BOLD) bc-5' BOLDand datasets using the reference bc-5' de novo de novo de novo OTU generation, for for generation, OTU s= generated OTUsand generated SSU 0.5 for read denoising, 10%; 10%; dataset andthe generated OTUs generated OTUs generated bc-3' 10%), 10%), at Accepted Article Brandon-Mong G-J, Gan H-M, Sing K-W K-W Sing H-M, Gan G-J, Brandon-Mong Bolger AM, Lohse M,Usadel B (2014) Trimmomatic: Aflexible trimmer for Illumina sequencedata. Bohan DA, Vacher C, Tamaddoni-Nezhad A A Tamaddoni-Nezhad C, Vacher DA, Bohan T Chapman J, Mann M, Blaxter Blaxter ML, De Ley P, Garey JR JR Garey P, Ley ML,De Blaxter Bista I,CarvalhoGR,WalshK R Tenchini D, Canestrelli R, Bisconti diversity. beta of study the for package R an betapart: OrmeCDL(2012) A, Baselga Baselga A (2010) Partitioning the turnover and nestedness components of beta diversity. diversity. of beta components nestedness and turnover the Partitioning A(2010) Baselga data. sequence throughput high for tool control Aquality FastQC: (2013) Babraham Institute This by is protectedarticle reserved. Allrights copyright. K C, Hopkins Andújar P, Arribas References Bioinformatics 477–487. scale, automated reconstruction of ecological networks. 360 data. evaluation of primers and pipelines. pipelines. and primers of evaluation Mediterranean islands. islands. Mediterranean widespread speciesgroup Baetis rhodan Nematoda. 14087. biodiversity. ecosystem lake of dynamics relevant ecologically Evolution and Ecology Ecology and Biogeography http://www.bioinformatics.babraham.ac.uk/projects/fastqc. http://www.bioinformatics.babraham.ac.uk/projects/fastqc. 1071–1081. endogean arthropods to unveil the mesofauna of the soil. soil. the of mesofauna the unveil to arthropods endogean , 1935–1943. Philosophical transactions of the Royal Society of London. Series B, Biological Sciences Biological B, Series London. of Society Royal the of transactions Philosophical Nature , 30 , 2114–2120. , 392 , 3 , 71–75. Ecology and Evolution et al. , 808–812. et al. et al. et et al. et , 19 (2017) Annual time-series anal time-series Annual (2017) , 134–143. et al. (1998) A molecular(1998) evolutionary framework for the phylum (2005) Defining operational taxonomic units using DNA barcode DNA barcode using units taxonomic operational Defining (2005) (2016) Metabarcoding (2016) and mitochondrial metagenomics of et al. et (2016) Cryptic diversity multipleand origins of the Bulletin of Entomological Research Entomological of Bulletin et al. (2015) DNA metabarcoding (2015) (2017)Next-Generation global biomonitoring: large- , 6 i (Ephemeroptera: Baetidae) on northwestern on northwestern Baetidae) (Ephemeroptera: i , 7901–7910. Trends in Ecology and Evolution and Ecology in Trends Methods in Ecology and Evolution and Ecology in Methods ysis of aqueous eDNA reveals reveals eDNA aqueous ysis of Nature Communications of insectsof and allies: an , 105 , 717–727. Methods in Global Global , 8 , , 32 , , , 7 , Accepted Article Edgar UPARSE: RC (2013) highly accurate sequences OTU from microbial amplicon reads. F Grandjean R, Raimond V, Doublet Elbrecht V, Leese F (2015) Can DNA-based ecosystem assessments quantify species abundance? abundance? species quantify assessments ecosystem Can DNA-based (2015) F Leese V, Elbrecht Dufrêne M,Legendre SpeciesP (1997) assemblages Esling LejzerowiczP, F,Pawlowski Accurate J(2015) multiplexing and filtering high-throughput for E Coissac SN, Jarman BE, Deagle This by is protectedarticle reserved. Allrights copyright. Creer S, Fonseca VG, PorazinskaDL Burgess R(2001)An improved protocol for separating meiofauna from sediments using colloidal Creer S, Deiner K, Frey S Creer S, DeinerK, S Frey Chessman BC, TraylerDavis Family-JA KM, (2002) and species-level biotic forindices Camargo Macrobenthic (1993) J surveys as avaluabletool for assessing freshwater quality in the European Commission Directive (2000) ofthe2000/60/EC European Parliament and of the Council Genome cons to a related Peracarida) isopods (Crustacea, Methods protocol. protocol. rela biomass—sequence and bias primer Testing asymmetrical approach. match. perfect a not marker: I subunit amplicon-sequencing. amplicon-sequencing. Practice, pitfalls and promises. silica sols. establishing a Community framework for the of water the policy. field in action biodiversity. Research Freshwater macroinvertebrates of wetlands on the Coastal Swan Plain, Western Australia. Peninsula. Iberian (OJ L 327) (OJ L , , 55 Plos One Plos . 10 Marine Ecology Progress Series Progress Ecology Marine , 234–244. , 996–8. Methods in Ecology and Evolution and Ecology in Methods , Environ. Monit. Assess et al. 10 , , e0130324. , e0130324. 53 Nucleic AcidsNucleic Research (2016) (2016) ecologist’sThe field guide tosequence-based identificationof , 919–930. Ecological Monographs Ecological et al. et al. et al. Molecular Ecology Molecular (2014)DNA metabarcoding andcytochrome the c oxidase (2010) Ultrasequencing of the meiofaunal biosphere: biosphere: meiofaunal the of Ultrasequencing (2010) (2012) (2012) Widespread atypical mitochondrial DNAstructure in Biology LettersBiology , , 24 214 , 71–90. , 43 , 161–165. , and indicator species: The need for a flexible flexible fora Theneed species: indicator and 7 , tionships with innovative an metabarcoding titutive heteroplasmy in terrestrial species. , 2513–2524. , 1008–1018. 1008–1018. , 67 , 19 , 345–366. , 4–20. , 4–20. , 10 , 1789–1793. Marine and Official Journal Journal Official Nature Accepted Article Gutiérrez-Cánovas Velasco C, J,Millán A(2008) SALINDEX: A macroinvertebrate index for assessing P Taberlet MJ, Uriz M, Guardiola X Lu AG, Hildrew C, Gray HusonAuchDH, J, AF,Qi SC(2007) Schuster MEGAN analysis of metagenomic data. Hebert PDN, Cywinska A, Ball SL, DeWaard BiologicalJR (2003) identifications through DNA HaoJiang R, X, Chen Clustering(2011) T 16S forOTU prediction:rRNA amethod of unsupervised Darwin’s contemplate to way new A (2016) GB Golding R, Beiko NA, Fahner DJ, Baird M, Hajibabaei Friberg N, Bonada N, Bradley DC This by is protectedarticle reserved. Allrights copyright. Ingvason HR, ÓlafssonGardarsson JS, (2004)Food A selectionof Tanytarsus gracilentus larvae W Sung GR, Carvalho VG, Fonseca Karaman GS, Freshwater Pinkster (1977) S Gammarusspecies from Europe, AfricaNorth and of taxonomic resolution Jones influence on The freshwater sufficiency: FC(2008) Taxonomic SSM Pedley L, Ji Ashton Y, effects of acidification. Ecosystems. Good, the The Ugly. Bad the and the ecological status of saline “ramblas” from SE of the Iberian Peninsula. Peninsula. Iberian the of SE from “ramblas” saline of status ecological the canyons. marine of sediments from DNA extracellular Research barcodes. clustering. Bayesian Philosophical Transactions of the Royal Society ofLondon B: BiologicalSciences biomonitoring. and science biodiversity reconnecting are barcodes DNA how bank: tangled 299316. unmasks marine metazoan biodiversity. biodiversity. metazoan marine unmasks (Diptera: Chironomidae): An analysis instarsof andcohorts. bioassessments using benthic macroinvertebrates. macroinvertebrates. benthic using bioassessments metabarcoding. species. species. (Crustacea-Amphipoda Asia of regions adjacent Bijdragen tot de Dierkunde de tot Bijdragen , 17 Proceedings of the Royal Society of London B London of Society theRoyal of Proceedings , 377–386. Ecology Letters Ecology et al. Bioinformatics (Oxford, England) (Oxford, Bioinformatics et al. Advances in Ecological Research Ecological in Advances (2016) Recovery and nonrecovery of freshwater food webs from the webs Recovery fromthe nonrecovery food (2016) freshwater and of (2013) Reliable, (2013) verifiable and efficient monitoring of biodiversity via et al. et al. et al. et , (2011) Biomonitoring of Human Impacts in Freshwater Freshwater in Impacts Human of Biomonitoring (2011) (2015) Deep-sea, (2015) deep-sequencing: Metabarcoding 16 (2010) Second-generation environmental sequencing sequencing environmental Second-generation (2010) , 1245–57. , 47 , 1–97. , 1–97. Nature Communications Nature Advances in Ecological Research Ecological in Advances ). Part I. Gammarus I. Part ). pilex-group and related Environmental Reviews , PLoS ONE PLoS 27 , , 270 55 , 611–8. , 475–534. , 313–21. , 313–21. Aquatic Ecology Aquatic , , 10 1 , 98. . Limnetica , 16 , 38 , , 45–69. Genome Genome , 44 , 231–237. , 231–237. 371 , 1–68. , , 20150330. 27 , Accepted Article This by is protectedarticle reserved. Allrights copyright. Li BFC: H(2015) correctingIllumina sequencing errors. reveal samples of standardized metabarcoding and barcoding DNA (2015) N Knowlton M, Leray Leese F, Altermatt F, Bouchez A LanzénLekang A, K,Jonassen I, Thompson EM, Troedsson (2016)High-throughput C metabarcoding Improvements 7: version software alignment sequence multiple MAFFT (2013) DM Standley K, Katoh Lobo ShokrallaJ, S, Costa MH, Hajibabaei M, Costa FO (2015) Stepwise implementation ofhigh- Kearse M, Moir R, Wilson A Marshall JC, Steward AL, Harch (2006)Taxonomi BD clustering fast and Swarm: robust (2014) M C,Dunthorn de Vargas C, Quince T, Rognes F, Mahé Oksanen J, Blanchet FG, Kindt R R Kindt FG, Blanchet J, Oksanen Meyer CP, Paulay G(2005)DNAbarcoding: rates Error basedoncomprehensive sampling. Paradis E, Claude J, Strimmer K (2004) APE: Analyses of phylogenetics and evolution in R language. language. R in evolution and phylogenetics of APE: Analyses (2004) K Strimmer J, Claude E, Paradis of eukaryotic diversity for environmental moni environmental diversity for of eukaryotic 1647–1649. usability. and in performance 201424997. diversity. benthic marine of patterns 2 Europe. in ecosystems aquatic of monitoring and bioassessment Ecology species abundance data. data. abundance species using of costs and Thebenefits river: dryland Australian an from samples macroinvertebrate studies. amplicon-based for method 248. communities. macrobenthic estuarine to metabarcoding sequencing throughput data. sequence of analysis and organization the for platform software 2.0-10. http://cran.r-project.org/package=vegan. http://cran.r-project.org/package=vegan. 2.0-10. Biology Bioinformatics , e11321. , , 3 25 , e422. , 4392–4406. , 20 , 289–290. et al. Hydrobiologia et al. et et al. (2012) Geneious Basic: An integrated and extendable desktop desktop extendable and integrated An Basic: Geneious (2012) Molecular Biology and Evolution and Biology Molecular (2016) DNAqua-Net: Developing new genetic tools for for tools genetic new Developing DNAqua-Net: (2016) (2013) Vegan: Community Ecology Package. R package version version package R Package. Ecology Community Vegan: (2013) PeerJ Proceedings of the National Academy of Sciences of Academy National ofthe Proceedings , 572 , 2 , 171–194. , e593. c resolution and quantificationof freshwater toring of offshore oil-drilling activities. activities. oil-drilling offshore of toring Bioinformatics R package ver. 2.0–8 , 30 , 1–3. , 1–3. Research Ideas and Outcomes and Ideas Research , 772–780. Bioinformatics , 254. Genome , PLoS PLoS Molecular 28 , 2014 , , 58 , , , Accepted Article Schmidt-KloiberHering A, D Www.freshwaterecology.info (2015) An online- tool that unifies, Shokralla S,Gibson JF, Nikbakht H of assessment the on resolution taxonomic of effect The (2004) RC Nijboer A, Schmidt-Kloiber taxonomic operational in used methods improving and Assessing (2011) SL Westcott PD, Schloss SchirmerD’AmoreIjaz R UZ, M, ML T, Suárez Puntí MDM, Sánchez-Montoya RutschmannGattolliatJL, HughesSJ S, Roberts D Labdsv:(2007) Ordination and multivariateanalysis for ecology. Rpackage version 3-1.1, Index Barcode the species: animal all for registry A DNA-based (2013) PDN Hebert S, Ratnasingham BOLD BARCODING, (2007) PDN Hebert S, Ratnasingham Pons J, Barraclough T, Gomez-Zurita J Gomez-Zurita T, J, Barraclough Pons This by is protectedarticle reserved. Allrights copyright. Pauls BálintM SU, Alp M, ecological preferences. their and organisms freshwater European 20,000 more than codifies and standardises platform. MiSeq Illumina the with sequencing (www.barcodinglife.org). (www.barcodinglife.org). Molecular Ecology Resources specimens. single from capture barcode DNA and accelerate enhance to sequencing generation ecological water quality classes. Microbiology analysis. sequence gene rRNA 16S for approaches unit-based 2255. macroinvertebrate assemblages in Mediterranean streams. Madeira. and Islands Canary the on Baetidae) (Ephemeroptera, species Cloeon and Baetis cryptic morphologically Number (BIN) system. taxonomy of undescribed insects. insects. undescribed of taxonomy Developments and opportunities. , 77 , 3219–3226. et al. Freshwater Biology PLoS ONE PLoS Ecological Indicators Ecological (2014) Integrating (2014) molecular tools into freshwater ecology: Molecular Ecology Notes et al. , et al. 14 (2015) Insight into biases and sequencing errors for amplicon amplicon for errors sequencing and biases into Insight (2015) , Hydrobiologia et al. , 892–901. et al. et 8 (2014) Next-generation (2014) DNA barcoding: Using next- Systematic Biology Systematic Freshwater Biology Freshwater . (2006) Sequence-based species delimitation for the DNA the Sequence-based for (2006) species delimitation (2014) Evolution andisland endemism of et al. et , 59 , (2007) Concordance between ecotypes and and ecotypes between Concordance (2007) , 2516–2527. 53 Nucleic Acids Research , 516 , 271–282.

, : The Barcode of Life Data System , 269–283. 7 , 355–364. , , 55 59 , 595–609. , 1559–1576. Freshwater Biology Freshwater AppliedEnvironmental and , 43 , e37. , e37. , 52 , 2240– Accepted Article This by is protectedarticle reserved. Allrights copyright. package. R operations. string common for wrappers consistent simple, stringr: (2013) H Wickham on science emergent An (2012) WE Harris HR, Taylor U Obertegger F, Leasi CQ, Tang Tachet H, Richoux P, Bournaud M, Usseglio-Polatera(2002)P T Mamos M, Omasz Grabowski L, Sworobowicz Stubauer I, Moog (2000)Taxonomic O sufficiency ve reads. keep singleton and files FASTA/Q paired-end sync Pairfq: (2013) S Staton JF Gibson TM, Porter S, Shokralla Taberlet P, Coissac E, Pompanon F,Brochmann Willerslev C, E(2012)Towards next-generation Wheeler QD (2004) Taxonomic triage analysis model-based for package R an Mvabund- (2012) DI Warton ST, Wright U, Naumann Y, Wang United States (1972) Federal PollutionWater Act Control AmendmentsPub.L. of 1972. 92-500, Bell ThompsonT MSA,C, Bankier of the International Association of Theoretical and Applied Limnology. Vol. 27 water monitoring.quality In: biological in Austrian experience platform. MiSeq Illumina an using identification specimen years of DNA barcoding. Sciences of Academy National the meiofauna. of the surveys biodiversity in diversity true underestimates greatly Ecologie Biologie, Systematique, Biology Freshwater diversification. spatiotemporal and diversity cryptic into insights Europe: in aquaticus Royal Society of London. Series B, Biological sciences B,Biological Series London. of Society Royal data. abundance of multivariate October 18. 2037–2050. spill: testing amultilevel bioassessment approach in river ecosystem.a metabarcoding. DNA using assessment biodiversity , 60 , 1824–1840. Molecular Ecology Resources Ecology Molecular et al. et al. et al. (2012) The widely used small subunit 18S rDNA molecule molecule rDNA 18S subunit small used widely The (2012) and the poverty of phylogeny. (C Editions, Ed,). Paris. Ed,). (CEditions, Methods in Ecology and Evolution and Ecology in Methods (2015) Massively (2015) parallel multiplex DNA sequencing for (2016) Gene-to-ecosystem impacts of a catastrophic pesticide pesticide catastrophic a of impacts Gene-to-ecosystem (2016) , 109 , 16208–16212. et al. rsus need for information- comments based on (2015) Revisiting (2015) phylogeographythe of Asellus the brink of irrelevance: a review of the past 8 past the of review a irrelevance: of brink the Molecular Ecology Molecular , 359 , 12 Invertebres D’eau Douce. , 571–583. , 377–88. Scientific reports Proceedings of the 27th Congress Congress 27th the of Proceedings Philosophical transactions of the the of transactions Philosophical , 3 , 471–474. , 21 Freshwater Biology Freshwater , 2045–50. , 5 , p. 5. 5. , p. , 9687. Proceedings of of Proceedings , 61 , Accepted Article This by is protectedarticle reserved. Allrights copyright. BOLD:ACP0608 BOLD:ACI4790 BOLD:AAJ7051 BIN thresholdof (i) processed and reads (ii)OTUs clustered3% at (details in text). 1. Table Williams HC, Ormerod SJ, Bruford Molecular MW (2006) systematics and phylogeography of the BOLD:ADE2432 BOLD:ACP2182 BOLD:ACS1169 Wright JF, SutcliffeDW, (2000) Furse MT an in perspectives new Century: 21st the for Biomonitoring (2013) DJ Baird C, Gray G, Woodward BOLD:ACP8764 BOLD:ACR1089 BOLD:AAW5799 BOLD:ACD1957 BOLD:ACP6740 BOLD:ACQ3496 BOLD:AAP5886 Yang C, Wang X, Miller JA JA Miller X, Wang C, Yang BOLD:ACD1670 Yu D, Ji Y, Emerson B B Emerson Y, Ji Yu D, BOLD:ACT8698 BOLD:AAW5785 Zhang J, Kobert K, Flouri T, Stamatakis A (2014) PEAR: a fast and accurate Illumina Paired-End reAd reAd Paired-End Illumina accurate and a fast PEAR: A (2014) Stamatakis T, Flouri K, Kobert J, Zhang Zhang J, PavlidisKapliP, Stamatakis P, A (2013) A general species delimitation methodwith BOLD:ACU8677 BOLD:AAP5931 BOLD:AAF2345 BOLD:AAI6018 Zinger L, Chave J, Coissac E Coissac J, Zinger Chave L, BOLD:AAT9677

and Evolution cryptic speciescomplex Baetisrhodani (Ephemeroptera, Baetidae). age of globalisation and emerging environmental threats. threats. environmental emerging and globalisation of age Biological Association, Ambleside, UK. Cumbria, biodiversity assessment and biomonitoring. 389. indicator. biodiversity ageneral as beused can samples litter mergeR. mergeR. applications phylogenetic to placements. alternative for multi-taxa surveys based on soil DNA. BINs of Diptera from BOLD identified based on on based identified BOLD from Diptera of BINs

Bibionidae aiy anseisi f I b-' c3 rb-'rb-'C M M T T2 T1 MS MC I C r-bc-3' r-bc-5' bc-3' bc-5' BIN of species id Main Agromyzidae Family Cecidomyiidae Cecidomyiidae Chironomidae Chironomidae Chironomidae Chironomidae Ceratopogonidae Chironomidae Chironomidae R/U/C/S R/U/C/S 550 93 b5 b3 b5 b3 b5 b5 b3 b5 b3 b5 b3 b5 b3b5 b3 b5 b5b3 b5 b5b3 93 550 R/U/C/S R/U/C/S Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae hrnmdeChironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Bioinformatics (Oxford, England) (Oxford, Bioinformatics , 40 et al. et , 370–382. , 370–382. et al. (2012) Biodiversity soup: metabarcoding of arthropods for rapid rapid for arthropods of metabarcoding soup: Biodiversity (2012) Dilophus febrilis [21] febrilis Dilophus A A Brillia bifida[6] Palpomyia flavipes [6] flavipes Palpomyia Conchapelopia melanops [11] melanops Conchapelopia [3] hittmairorum Conchapelopia ocaeoi aldl [1] pallidula Conchapelopia Corynoneura sp. Corynoneura Corynoneura sp. [6] sp. Corynoneura Cricotopus bicinctus [1] bicinctus [1] Cricotopus [19] annulator Cricotopus [139] albiforceps Cricotopus Cricotopus bicinctus [1] bicinctus [1] Cricotopus Cricotopus bicinctus bicinctus [123] Cricotopus Cricotopus bicinctus bicinctus [27] Cricotopus et al. gromyza pseudoreptans [19] [19] pseudoreptans gromyza psectrotanypus trifascipennis [9] trifascipennis psectrotanypus (2014) Using (2014) metabarcoding to ask ifcollected easily soil andleaf- (2016) Extracellular (2016) extractionDNA fast,cheap is a and reliable [92] [13] [60 [3] ]

Assessing the biological quality of fresh waters fresh of quality biological the Assessing Bioinformatics (Oxford,England) , 30 R/U/C/S R/U/C/S 478 428 b5 b3 b5 b3 b5 b5 b5 b3 b5 b5 b5 b3 b5 b5b3 428 478 R/U/C/S R/U/C/S /// RUCS 5 3 b b b3 b3 b5 b3 b3 b5 b3 b5b3 b3 b5 b5b3 1446 81 5 R/U/C/S 0 R/U/C/S R/U/C/S -/-/-/- R/-/-/- R/U/C/S 11 9 b5 b3 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 b3 9 11 R/U/C/S R/-/-/- R/U/C/S R/-/-/S 455 31 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 31 455 R/-/-/S R/U/C/S /// /// 5 5 b b5 b5 b5 0 1 -/-/-/- R/-/-/- R/U/C/S R/U/C/S 33 26 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 26 33 R/U/C/S R/U/C/S R/U/-/- R/U/C/S 20 41 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 b5 b3 41 20 R/U/C/S R/U/-/- /// ---- 3 0 b 5 b b5 b5 b5 b5 0 130 -/-/-/- R/U/C/S R/-/-/- R/-/-/- 1 4 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 4 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 b5 b3 331 1 1018 R/U/C/S R/-/-/- R/U/C/S R/-/-/- /// /// 5 5 b5 b5 b5 0 3 -/-/-/- R/U/C/S R/U/C/S R/U/C/S 1178 1263 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 b3 b5 b5b3 1263 1178 R/U/C/S R/U/C/S R/U/C/S R/U/C/S 21 72 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 b5b3 72 21 R/U/C/S R/U/C/S /// ---- 0 5 b 5 b5 b3 b5 b5 b3 b5 b5 b3 b5 b5 b5 b5b3 b3 b5 0 b5b3 60 4 14 2 30 -/-/-/- R/U/C/S R/-/-/- R/-/-/- R/U/C/S R/-/-/- /// ---- 0 b 5 b b5 b5 b5 b5 0 5 -/-/-/- R/-/-/- /// /// 7145 5 5b 5b*b 3b b5b3 b5 b3 b5 b3* b5 b5 b3 b5 4759 5751 R/-/C/S R/U/C/- /// /// 2 91b 5b 5b b 3 b5b3 b5b3 b5 b3 b5 b3 b3 1991 123 R/U/C/- R/U/-/S Methods in Ecology and Evolution and Ecology in Methods , 614–20. usearch_global Soil Biology and Biochemistry and Soil Biology Limnetica searches under searches under a 3% similarity Ecological Indicators Ecological Molecular Phylogenetics , 32 , 159–174. , 29 , 2869–76. , 3 , 613–623. , . Freshwater Freshwater . , 46 96 , 379– , 16–19. , 16–19. Accepted Article This by is protectedarticle reserved. Allrights copyright. bc-5': Notes: BOLD:AAE4568 BOLD:AAU2576 BOLD:AAA5299 BOLD:AAM5377 BIN BOLD:ACR0263 BOLD:ACT0982 BOLD:AAJ5023 BOLD:AAN6407 BOLD:AAB8624 BOLD:AAP9556 BOLD:AAA8323 BOLD:AAN3314 BOLD:AAL7819 BOLD:ABA7297 BOLD:ACY5064 BOLD:ACP1316 BOLD:ACZ6583 BOLD:AAW4728 BOLD:AAM9576 BOLD:AAL0178 BOLD:AAU2481 BOLD:ABV4656 BOLD:AAD7458 BOLD:AAW0928 BOLD:AAD4167 BOLD:AAB8862 BOLD:AAX3566 species are the most abundant within each BIN, in brackets the number of specimens identified to species level in the reference inthe level species to identified of specimens number the brackets in BIN, each within abundant most the are species fragments respectively on based the processed reads. In bold species identified with indval analyses as indicative for impacted bc- the OTUwith of the detection the b3indicates and b5 spill). the after months 2 (2.5 intime collected Samples T2: spill); afte days 1(11 intime collected Samples T1: subsamples; Meiofauna MS; subsamples; Macrofauna MC: sites; (downstream) Impacted and SWARM respectively. r-bc-5' and r-bc-3': Number of reads matched for bc-5' and bc-3' respectively. C: Control (upstream) si BOLD:AAD8971 BOLD:AAC7823 BOLD:AAE7386 BOLD:ACQ1908

BOLD:AAG1011 BOLD:AAF6378 BOLD:AAV2322 BOLD:AAW5449 BOLD:AAM5389 BOLD:AAD1527 BOLD:AAD0309 BOLD:ACX3335 BOLD:AAC7552 BOLD:ACQ8988 BOLD:ACM5335 BOLD:AAU2625 BOLD:ACT5340 BOLD:AAI1530 BOLD:AAB9119 BOLD:AAW4635 BOLD:AAW1102 BOLD:AAO1037 BOLD:ACM0242 BOLD:AAL3267 BOLD:AAU4439 BOLD:ACF7553 BOLD:AAV3526 BOLD:ACR3318 BOLD:ACD2995

cox1

barcode 5'fragment; bc-3': Chironomidae Chironomidae Chironomidae Chironomidae Sphaeroceridae Sphaeroceridae Simuliidae Simuliidae Simuliidae Simuliidae Psychodidae Pediciidae Ephydridae Chironomidae Chironomidae Chironomidae aiy anseisi f I b-' c3 rb-'rb-'C M M T T2 T1 MS MC I C r-bc-3' r-bc-5' bc-3' bc-5' BIN of species id Main Family Chironomidae n.a Diptera R/U/C/S R/U/C/S 106 95 b5 b3 b5 b3 b5 b5 b3 b5 b3 b5 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 b5 b3 b3 b5 260 b5b3 b5 b5b3 97 95 R/U/C/S 106 R/U/C/S R/U/C/S R/U/C/S Tabanidae Diptera Diptera n.a n.a Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Tipulidae Chironomidae Chironomidae Chironomidae Tipulidae Tipulidae Chironomidae Chironomidae Chironomidae Chironomidae Tipulidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Chironomidae Empididae Empididae Cricotopus sylvestris sylvestris [64] Cricotopus [289] rufiventris Cricotopus Cricotopus trifascia trifascia Cricotopus [5] Coproica ferruginata [126] [126] ferruginata Coproica [27] vernum Simulium Simulium velutinum[11] [151 silvestre Simulium Simulium ornatum [41] sp.[8] Psychoda [6] bimaculata Dicranota [8] tenuicosta Scatella [3] albinodus Polypedilum [78] albicorne Polypedilum [13] flavipes Phaenopsectra uifeilacaiens[185 Eukiefferiella claripennis hyosceuin [8] caecutiens Chrysops [2] pullum Polypedilum [8] pedellus Microtendipes [18] eurynotus Metriocnemus [13] nebulosa Macropelopia [2] 2SW sp. Heterotrissocladius Tipula benesignata [2] [2] benesignata Tipula Prodiamesa olivacea [46] [6] rectinervis Nanocladius [26] Micropsectra atrofasciata Tvetenia calvescens [14] calvescens Tvetenia Tipula paludosa [355] [6] chalybeatus Rheocricotopus [119] rubicundus Orthocladius [330] oblidens Orthocladius Micropsectra contracta[14] Tvetenia calvescens [186] [20] fuscipes Rheocricotopus [25] rubicundus Orthocladius Micropsectra lindrothi [18] Synorthocladius semivirens [7] [7] semivirens Synorthocladius edwardsi [10] Stempellinella [10] bausei Stempellina [2] quadrinodosus Paracladius [24] pallidula Micropsectra Tanytarsus brundini [13] [2] camptolabis Paracladopelma [33 sp. 5ES Micropsectra Tanytarsus brundini [5] sp. Paratanytarsus [3] lauterborni Paratanytarsus [12] dissimilis Paratanytarsus Tanytarsus ejuncidus [24] Paratendipes [127 albimanus ayasseiuu [124] Tanytarsus eminulus Tanytarsus heusdensis [5] [5] heusdensis Tanytarsus Tanytarsus heusdensis [6] [6] heusdensis Tanytarsus Tanytarsus pallidicornis [10] Chelifera precatoria [6 precatoria Chelifera cox1 barcodefragment; 3' ] ] ] ] ] R/U/C/S R/U/C/S 4 88 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 b3b3 b5 b5 b5b3 b3 b5 b3 b5b3 b3 b5 88 126 44 119 4 R/U/C/S 8 R/U/C/S R/U/C/S R/U/C/S R/U/C/S R/U/C/S /// /// 2 5b 3b 3 5b b5b3 b5b3 b5 b3 b5 b3 b3 b5 b3 b3 b5 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 25 b5 b3 b5 b5 b3 b3 b5 b3 b5 b3 b3 b5b3 227 104 20 R/U/C/S 145 10 159 R/U/C/S R/U/C/S 161 R/U/C/S R/U/C/S R/U/C/S R/U/-/S R/U/C/S b5b3 b3 b5b3 b5 b3 b5 b3 1641 61 R/U/C/S b5b3 b5b3 b5 b3 b5 b3 R/U/C/S 116 164 R/U/C/- R/U/C/- /// ---- 1 b 5 b5 b5 b5 b5 b5 b5 b3 b5 b5 b3 b5 b3 0 b3 b5 b3 b5 b5 b5 0 0 11 b5b3 b3 b5 b5b3 162 b3 b5 b3 1 191 -/-/-/- b5 b3 b5 b3 b5b3 191 -/-/-/- b5 b3 b5 b3 3 b3 b5 b3 b5 b5 b3 b5 b3 178 R/U/C/S -/-/-/- 140 R/U/C/S 524 60 b5b3 12 R/U/C/S R/U/C/S b3 b5 b3 b5 b3 b5 R/-/-/- R/U/C/S R/U/C/S R/-/-/- b5b3 13 b5b3 R/U/C/S 385 27 R/U/C/- 156 R/-/-/- R/U/C/S R/U/C/S R/U/C/S R/U/C/S /// ---- 0 b b b5 b5 b5 0 4 -/-/-/- R/-/-/- b3 b5 b3 b5 b5b3 19 b3 b5 b3 b5 12 b5b3 R/U/C/S b3 b5 b3 b5b3 18 b5b3 R/U/C/S b5 b5 b3 3 58 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 b5 b3 R/U/C/S 234 29 48 R/U/C/S R/U/C/S R/-/-/- R/U/C/S R/U/C/S R/U/C/S -/-/-/- 18 0 b5 b5 b5 b5 b5 b5 0 18 -/-/-/- R/U/C/S R/U/C/S R/U/C/S 79 16 b5 b3 b5 b3 b5 b5 b3 b5 b3 b5 b3 b5 b5b3 b5 b5b3 16 79 R/U/C/S R/U/C/S b5 b3 b5 b3 b5b3 b5 b3 b5 b3 b5 b3 166 b5b3 593 R/U/C/S b5b3 b5 b3 b5 b3 R/U/C/S 596 826 R/U/C/S R/U/C/S R/U/C/S R/U/C/S 21 26 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 26 21 R/U/C/S R/U/C/S /// ---- 8 5 5 b5 b5 b5 0 58 -/-/-/- R/U/C/S b5b3 b5b3 b5 b3 b5 b3 b5 64 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 98 b5 b3 b3 b5 5138 R/U/C/S 11359 R/U/C/S b3 b5 b5b3 b3 b5 R/U/C/S R/U/C/S 98 62 R/U/C/S R/U/C/S /// /// b b 3 b3 b3 b3 b3 b5 b3 4 b5 0 b5b3 R/-/-/- 1516 238 R/U/C/S -/-/-/- R/U/C/S R/U/C/S R/U/C/S 11 31 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5 b3 31 11 R/U/C/S R/U/C/S b3 b5 b3 b5 b5b3 b3 b5 b3 b5 145 39 R/U/C/S b5b3 b3 b5 R/U/C/S 4 5 R/-/-/- R/U/C/S R/U/C/S R/U/C/S 14729 1930 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5b3 b5 b3 b5 b3 b5 b3 1930 14729 R/U/C/S R/U/C/S R/U/C/S R/U/C/S 141 438 b5 b3 b5 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5 b5b3 438 141 b3 R/U/C/S b3 R/U/C/S b3 5 0 R/-/-/- b5 b3 b5 b3 b5b3 b5 b3 b5 b3 -/-/-/- 53 91 R/U/C/S R/U/C/S R/U/-/S R/U/C/S 7 9 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5b3 b5 b3 b5 b3 b5b3 b5 9 b3 b5 b3 b5 b3 3924 950 7 R/U/C/S R/U/C/S R/U/C/S R/U/-/S R/U/C/S R/U/C/S 13 15 b5 b3 b3 b5 b3 b3 b5 b3 b5 b3 b3 b5 b3 b5b3 15 13 R/U/C/S R/U/C/S R/U/-/S R/U/C/S 2716 3155 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3b5 b3 b5 b5 b3 b3b5 3155 2716 R/U/C/S R/U/-/S R/U/C/S R/U/C/S 47 4 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b5 b3 b3 b5 b3 b5 b5b3 b3 b5 b5b3 b3 b5 4 47 239 90 R/U/C/S 157 35 R/U/C/S R/U/C/S R/U/C/S R/U/C/S R/U/C/S R/-/C/- R/-/-/- 1496 16 b5 b3 b5 b3 b5 b5 b5 b3 b5 b5 b5 b5 b3 b3b5 16 1496 R/-/-/- R/-/C/- R/U/C/- R/U/C/S 2398 3230 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3b5 b3 b5 b5 b3 b3b5 b5 b3 3230 2398 R/U/C/S R/U/C/- R/U/C/S R/U/C/S 4847 2023 b5 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5 b3b5 b3 b5 b5 b3 b3b5 b5 2023 4847 R/U/C/S R/U/C/S R/-/-/- R/-/-/- 1 5 b5 b3 b5 b3 b5 b3 b3 b3 b5 b3 b5 b5b3 5 1 R/-/-/- R/-/-/- R/U/C/S R/U/C/S 2 34 b5 b3 b3 b5 b3 b5 b3 b5 b3 b5 b3 b5b3 34 2 R/U/C/S R/U/C/S /// RUCS 4 6 3 5b 5b b 3 3 b5 b3 b3 b5 b3 b5 b3 b5 b3 b3 469 34 R/U/C/S R/U/C/S R/U/C/ S: Indicates detection based on Reads, USEARCH, CROP 5' and bc-3' sites.Named database. tes; I: tes; r the r the

Accepted Article This by is protectedarticle reserved. Allrights copyright. generated from BOLD data. OTU clusters mapping against of approach reads the from ‘taxon dependent’ theresults indicates Notes: control and impacted sites. Table 2. Arthropoda Crustacea Crustacea Metazoa Insecta Taxa

de novo Betadiversity values, NMDS stress values and p-values for the comparison between between comparison the for p-values and values stress NMDS values, Betadiversity fragment fragment cox1-3' de cox1-3' novo cox1-5 cox1-5 cox1-5 cox1-3 cox1-5 cox1-3 cox1-3 Dataset DNA DNA SSU de SSU novo de SSU novo de SSU novo SSU de SSU novo de novo de novo de novo de novo de novo de novo 3% and and 3%

' ' ' ' ' ' ' de novo de novo de novo de novo de novo de novo de novo de novo de novo ODrf .5 00501 .0 .1 .7 .2 .0 007 .9 0.569 0.095 0.077 0.006 0.128 0.779 0.113 0.104 0.19 0.669 0.005 0.181 0.856 0.073 BOLD ref 0.001 0.077 0.753 0.001 0.089 0.536 0.29 0.096 0.004 0.099 0.826 0.033 BOLD ref 0.146 0.758 0.108 0.109 0.17 0.014 0.857 BOLD ref ODrf .9 006 .4 .9 004 .2 003009 .7 012 0.475 0.004 0.132 0.843 0.072 BOLD ref 0.009 0.073 0.725 0.004 0.091 0.196 0.34 0.058 0.006 0.087 0.797 0.041 BOLD ref 0.162 0.759 0.089 0.154 0.16 0.023 0.846 BOLD ref Method Method de novo 1% .7 00402 .7 .0 .7013006 .0 001 0.779 0.081 0.007 0.103 0.832 0.006 10% 0.133 0.67 0.009 0.078 0.28 0.37 0.004 0.066 0.773 0.088 10% 0.028 0.144 0.761 0.009 0.088 0.19 0.005 0.849 10% 0.011 0.835 10% 1% .2 0.002 0.82 10% 0.006 0.763 10% 0.107 0.035 0.09 0.019 0.144 0.118 0.818 10% 0.731 0.036 0.12 0.18 0.005 0.821 10% 3 .4 .0 01 .1 .1 07 .3 .7 .7 .6 0.26 0.179 0.075 0.061 0.77 0.136 0.190.011 0.113 0.005 3% 0.845 3 077 .0 .2005006075018004 .4 009 0.798 0.059 0.042 0.004 0.118 0.745 0.006 0.095 0.223 0.22 0.076 0.001 0.07 0.787 0.011 3% 0.103 0.675 0.004 0.057 0.62 0.32 0.059 0.005 0.066 0.746 0.057 3% 0.161 0.739 0.132 0.173 0.16 0.037 0.805 3% 3 089 .0 .8012002076013003 .8 004 0.457 0.064 0.083 0.003 0.143 0.746 0.002 0.142 0.18 0.102 0.047 0.005 0.104 0.829 0.003 3% 0.656 0.095 0.658 0.069 0.118 0.006 0.001 0.085 0.122 0.369 0.38 0.684 0.063 0.013 0.002 0.092 0.328 0.762 0.107 0.154 0.382 3% 0.23 0.749 0.35 0.07 0.002 0.153 0.032 0.085 0.801 0.049 0.14 0.035 0.158 3% 0.036 0.143 0.16 0.042 0.108 0.776 0.726 0.09 0.099 0.03 0.025 0.107 0.089 0.067 0.841 0.087 0.149 0.087 0 3% 0.18 0.743 0.179 0.144 0.016 0.004 0.21 0.763 0.862 0.122 0.007 0.023 3% 0.17 0.775 0.086 3% 0.004 0.18 0.833 0.007 3% 0.85 3% beta.so 10% refers to OTU clustering at these threshold values. BOLD ref ref values.BOLD threshold at OTU clustering to refers these 10% r Betadiversity (Sorensen index) p-value p-value Adonis Adonis Adonis Adonis 0.18 0.19 0.20 0.17 0.18 0.43 r 2 stress .7 .0 07101400 002009 0.242 0.072 0.059 0.02 0.771 0.171 0.006 0.164 .7 .1 0760130.054 0.099 0.099 0.053 0.736 0.075 0.018 0.143 .9 .1 0780150.015 0.317 0.074 0.043 0.758 0.092 0.017 0.135 .0 .6 0790150.226 0.314 0.079 0.058 0.739 0.108 0.066 0.115 .3 .1 07 .4 .1 00 0 0.236 0 0.07 0.013 0.75 0.142 0.139 0.017 0.196 0.079 0.1 0.005 0.663 0.098 0.004 0.073 envfit p- value

easmstress beta.sim Turnover (Simpson (Simpson Turnover index) p-value envfit

beta.sne stressbeta.sne Nestedness (Sorensen- Simpson index) envfit p- envfit value

Accepted Article This by is protectedarticle reserved. Allrights copyright. 2. period Impacted T2-I: spill); the after months (2.5 2 period collection T2: sites; I: impacted sites, Control C: T:Treatment; Notes: BOLD:AAF2659 Branchiopoda Chydoridae Diplostraca BOLD:AAF2659 Branchiopoda Chydoridae Diplostraca Class BIN Order GENE Family 3. Table Species T reference database. reference

database-dependent approach for approach database-dependent the ODAP91Isca itr Chironomidae Diptera BOLD:AAP5931 Insecta BOLD:ACV6778 Malacostraca Isopoda Asellidae Asellidae Malacostraca BOLD:ACV6778 Isopoda ODAU67Isca itr Chironomidae Diptera BOLD:ACU8677 Insecta ODAH90Mlcsrc mhpd Gammaridae Amphipoda BOLD:ACH7960 Malacostraca BOLD:AAA1971 Malacostraca Isopoda Asellidae Asellidae Malacostraca BOLD:AAA1971 Isopoda ODAW12Isca itr Chironomidae Chironomidae Diptera Chironomidae Diptera BOLD:AAW1102 Insecta Diptera BOLD:AAB9119 Insecta BOLD:ACQ8988 Insecta ODAI08Isca itr Chironomidae Diptera BOLD:AAI6018 Insecta ODAG33Mlcsrc mhpd Gammaridae Amphipoda BOLD:ACG8343 Malacostraca ODAU49Isca itr Chironomidae Diptera BOLD:AAU4439 Insecta ODAI08Isca itr Chironomidae Diptera BOLD:AAI6018 Insecta ODAA91Mlcsrc spd Asellidae Isopoda Malacostraca BOLD:AAA1971 ODAU49Isca itr Chironomidae Diptera BOLD:AAU4439 Insecta ODAE58Isca itr Chironomidae Diptera BOLD:AAE4568 Insecta ODAR38Isca itr Chironomidae Diptera BOLD:ACR3318 Insecta ODAO07Isca itr Chironomidae Diptera BOLD:AAO1037 Insecta ODAU07Isca Ephemeroptera Baetidae BOLD:AAU1007 Insecta ODAO07Isca itr Chironomidae Diptera BOLD:AAO1037 Insecta ODAH82Mlcsrc mhpd Gammaridae Amphipoda BOLD:ACH6832 Malacostraca ODAQ98Isca itr Chironomidae Diptera BOLD:ACQ8988 Insecta

Named species are the most abundant within each BIN, in brackets the number of specimens identified to species level in the the in level to species identified specimens of number the brackets in BIN, each within abundant most the are species Named

Species with indicativevalue as identified by

cox1 gene fragments bc-5' and bc-3'. bc-3'. and bc-5' fragments gene indval Chydorus sphaericus Chydorus sphaericus Chydorus rctpsbcnts[1] bicinctus Cricotopus A Asellus aquaticus [7] [7] aquaticus Asellus rctpsbcnts[1] bicinctus Cricotopus amrsnkess[1] nekkensis Gammarus Asellus aquaticus [85] [85] aquaticus Asellus ayasseucds[24] ejuncidus Tanytarsus brundin Tanytarsus brundin Tanytarsus rctpsbcnts[123] Cricotopus bicinctus amrsfsau [37] Gammarus fossarum Tanytarsus eminulus [124] Cricotopus bicinctus [123] Tanytarsus eminulus [124] uifeilacaiens[185] Eukiefferiella claripennis ayasspliions[10] pallidicornis Tanytarsus aaedpsabmns[127] Paratendipes albimanus etotlmltou [17] luteolum Centroptilum aaedpsabmns[127] Paratendipes albimanus Gammarus pule Gammarus Tanytarsus brundin Tanytarsus elsautcs[85] aquaticus sellus analyses based on the results of the reference reference the of results the on based analyses x [11] i i i [5] [13] [13] c5 2I1 0.014 bc-5' T2-I 0.046 0.6 1 bc-5' I c3 1 0.004 1 bc-3' I c5 2I1 0.024 bc-5' T2-I 1 c3 0.83 0.016 bc-3' C c3 08 0.01 0.8 bc-3' I c5 08 0.017 0.8 bc-5' I 0.02 0.8 bc-5' I 0.02 0.8 bc-3' I c5 2I1 0.022 bc-5' T2-I 1 c5 1 0.004 1 bc-5' C c5 0.86 0.019 bc-5' I c3 0.75 0.049 bc-3' I c3 06 0.048 0.6 bc-3' I c5 0.83 0.025 bc-5' C c3 1 0.002 1 bc-3' I c3 2I1 0.023 bc-3' T2-I 1 c5 0.86 0.015 bc-5' I c5 2I1 0.019 bc-5' T2-I 1 c3 0.86 0.015 bc-3' I c3 08 0.024 0.8 bc-3' I c5 21 0.001 bc-5'1 T2 c5 08 0.014 0.8 bc-5' I sites at collection collection at sites value Ind. P Accepted Article This by is protectedarticle reserved. Allrights copyright.

Accepted Article This by is protectedarticle reserved. Allrights copyright.

Accepted Article This by is protectedarticle reserved. Allrights copyright.

Accepted Article This by is protectedarticle reserved. Allrights copyright.

Accepted Article This by is protectedarticle reserved. Allrights copyright.

Accepted Article This by is protectedarticle reserved. Allrights copyright.

Accepted Article This by is protectedarticle reserved. Allrights copyright.