Posted on Authorea 21 Sep 2020 | The copyright holder is the author/funder. All rights reserved. No reuse without permission. | https://doi.org/10.22541/au.160071155.58915559 | This a preprint and has not been peer reviewed. Data may be preliminary. unn title: Running [email protected] ; [email protected] authors: *Corresponding Norway. Oslo, 2 0316 Blindern, 1066 Box P.O. 1 fungal Estensmo eleven H˚avard of F. Kauserud study Lena case A Eva : DNA during variation sequence resolution. species intraspecific species-level approach of to influence needed as is The used The step be often clustering not region. haplotypes, extra should rDNA ASVs) an extra the (or but in These haplotypes analyses, variation community that intragenomic fungal suggest approach. or ITS-based ITS2 either errors in in variation using units intragenomic) and basic detected possibly sequencing PCR haplotypes (and Sanger to intraspecific by additional due of likely generated few presence were haplotypes a frequencies, between low of correspondence intraspecific in exception high of occurring level a the some observed fungal included with sequencing we one, eleven HTS, throughput except Overall, of high species, and region. focal and populations ITS2 All sequencing local the DADA2. Sanger software in analyzing both the variation we using by with include Here haplotypes correction metabarcoding, may error ITS2 with analyses. DNA region, of coupled diversity metabarcoding ITS in (HTS), allelic DNA fungal variation the during the investigated sequence are We like species intraspecific often – species. of 18S, of markers over-splitting and effects variable to 16S more the lead like but other assess can markers, species, samples, which DNA related variation, environmental conserved closely While intraspecific from among considerable communities potential. 
discriminate full complex to its analyzing able limiting for not challenges approach methodological powerful still are a there become has metabarcoding DNA Abstract 2020 21, September 4 3 2 1 Skrede species Estensmo fungal Lena eleven Eva of study DNA case during A variation metabarcoding: sequence intraspecific of influence The auai idvriyCne,Vnela 5 ..Bx91,20,R edn h Netherlands the Leiden, RA 2300, 9517, Box P.O. 55, Vondellaan Center, Biodiversity Naturalis nvriyi Oslo in University Oslo of Sciences University Natural and Center Biodiversity Mathematics Naturalis of Faculty Oslo of University eto o eeisadEouinr ilg Eoee,Dprmn fBocecs nvriyo Oslo, of University Biosciences, of Department (Evogene), Biology Evolutionary and for Section 1 n H˚avard and Kauserud , altp iest mat N metabarcoding DNA impacts diversity Haplotype 1 1 1 ud Maurice sundy , ud Maurice Sundy , 4 1 usMorgado Luis , 1 usMorgado Luis , 1 1,2 er .Martin-Sanchez M. Pedro , 2 er Martin-Sanchez Pedro , 1 ne Skrede Inger , 3 Inger , 1 , Posted on Authorea 21 Sep 2020 | The copyright holder is the author/funder. All rights reserved. No reuse without permission. | https://doi.org/10.22541/au.160071155.58915559 | This a preprint and has not been peer reviewed. Data may be preliminary. cls 05.Ltr oeeaoaeapoce eedvlpdi re obte itnus ewe PCR between distinguish better and Westcott to 2013; order Edgar in 2010; developed al. were et approaches cluster Caporaso, elaborate 2009; to more based al. was entities), et Later, approach (Schloss, taxonomic 2015). first threshold Schloss biological similarity One for data sequence approximations analyses. HTS fixed (OTUs; community the a units on delineate in taxonomic and downstream group operational used to into are developed sequences that been has entities approaches variability. 
biological bioinformatics that sequence into different variation intraspecific of sequence array occurring concern artificial wide diversity, naturally a introduce A species also from conserved errors is the disentangle variation sequencing more underestimates to sequence and it hard intraspecific of PCR as on be addition, case based problem can be taxa In common the of analyses. should a splitting community in is marker, sequences in taxa ITS example, the variable of For the that merging for imply analyses. 18S, while and markers metabarcoding 16S taxonomic like DNA these markers, also during of 2012). but al. differently peculiarities et resolution, the processed Schoch, interspecific ignored, 2008; al. higher often more et considerably provides Although (Nilsson, includes variability consequently variability marker sequence al. intraspecific and ITS of of fungal et 18S, degree the levels (Hadziavdic, to some general, contrasting micro-eukaryotes In compared include for region resolution. variability markers 16S of region sequence the levels these 18S various 2012), the rates, provide the al. include evolutionary thus, et and different and, microorganisms Schoch, 1994) to 2008; for al. Goebel Due markers et and 2014). barcoding (Nilsson, groups, (Stackebrandt fungi taxonomic rDNA some bacteria for broad used provide region for amplify can most (ITS) to which Spacer The in-between, used Transcribed variation (rD- resolution. be Internal intraspecific DNA can low taxonomic ribosomal that and of nuclear interspecific sites degree high the primer of conserved within areas offer lie a with region microorganisms combined also this for is of region and Parts barcoding NA). 2018), DNA al. used 2017). commonly biomonitoring et al. ecosystem The Bahram, et and Stat, 2019) 2014; 2012; al. function al. et al. and et (Barsoum, et structure biodiversity (Tedersoo, (Douglas, the the et habitats surveying Taberlet, of for 2016; different understanding approach al. 
our in well-established et improved communities Goodwin, considerably 2013; microbial al. has of et tool metabarcoding powerful Lindahl, DNA a 2012; become al. 2018). has et al. (Taberlet, metabarcoding communities DNA i.e. microbial markers study amplified to of (HTS) sequencing throughput High Introduction species-level approach to units needed basic is as step used words: clustering be Key extra not an should pos- ASVs) but PCR (and (or intraspecific to analyses, of haplotypes community due resolution. presence that likely fungal The suggest were ITS-based region. ITS2 frequencies, rDNA in detected in low the haplotypes variation in in additional intragenomic) variation few occurring intragenomic sibly a often or of haplotypes, errors some exception haplo- sequencing extra between included the and These correspondence with one, high approach. HTS, except a and either species, observed sequencing using we focal Overall, Sanger All region. by ITS2 dada2. generated the software types sequencing in the throughput variation high with intraspecific intraspecific and of correction inves- sequencing of level We Sanger error effects species. both with fungal the using eleven coupled haplotypes assess of ITS2 (HTS), lead va- we populations of can more diversity local Here which allelic other analyzing variation, the analyses. by species, tigated intraspecific metabarcoding, metabarcoding related considerable DNA DNA DNA closely in include during variation among conserved may sequence species discriminate region, While ITS to of potential. 
fungal over-splitting able full the not to its like are limiting – often markers challenges 18S, riable methodological environmen- and from still 16S communities are like complex markers, there analyzing for but approach samples, powerful a tal become has metabarcoding DNA Abstract omnt clg;DAmtbroig ug;IS haplotypes ITS, fungi; metabarcoding; DNA ecology; Community 2 Posted on Authorea 21 Sep 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.160071155.58915559 — This a preprint and has not been peer reviewed. Data may be preliminary. osse f14.6 of consisted of combination probability ( a (5’- technical using low 16 tags gITS7 amplified x with with with was the library together species primers with Each plates, fungal tagged controls. targeted PCR six PCR uniquely 96-well negative of 96 two x (composed and of 2 dataset), of communities into our range mock processed in concentration identical were occurrence a two samples to replicates, DNA Tris PCR body mM fruit 10 and 176 with reagent The standardized HTR and the Technologies) adding (Life by ng/ kit Biotek) 5-10 and Assay (Omega Gardes kit BR 1980; DNA DNA Thompson Soil ds and E.Z.N.A manufacturer (Murray the the protocol following with then extraction cleaned CTAB and modified 1993) a Bruns using extracted was example). genotypes DNA for heterozygous S1 and Fig. is hyphae (mm μ dikaryotic (see logs millimeter present of spruce square is up between five variation made Approximately dispersal ITS2 is intraspecific clonal fungi if no these Southeastern expected logs. of basidiospores, in spruce therefore tissue sexual distinct forest are body by on spruce fruit collected spread were old-growth The typically bodies an expected. fungi fruit in these individual 16 sampled that species, Given were each For 1) Kuhmo). (Table (Issakka, species Finland fungal wood-decay degree Eleven what to and resolution. 
Methods sequencing, species-level and Sanger the approach Material whether and to tested needed metabarcoding we is 2016), DNA clustering al. by et sequence identified (Callahan, further were dada2 sequence corresponding using haplotypes the intraspecific data ITS2 to performed sequence with compared same we the and deal topic, denoising species this basidiomycetes to By we address 11 To sequences. able of Here, taxa. Sanger specimens of is fungal communities. over-splitting this 176 marker, to fungi on how leads ITS2 metabarcoding variation on of DNA this fungal knowledge degree studies the what little to metabarcoding have using and we variation, DNA metabarcoding, fungi, in DNA in to how delineation variation lead assess OTU sequence might ITS into 2013). threshold intraspecific translates al. general of variation et reasonable level a (Blaalid, a high such others represents the However, of 97% Despite 2013). lumping that al. and indicated et taxa 2010; some have Blaalid, al. of at studies 2008; et debated splitting Several al. (Caporaso, been species-level et 2015). has the (Nilsson, Schloss it approximate region, approximation needed and to ITS be Westcott clustered the may 2013; and be step fungi should Edgar clustering For on sequences ASVs extra OTUs. based level species-level an the typically similarity into variation, is Hence, which ASVs) ITS ecology (or analyses. community intraspecific haplotypes while statistical the group region, downstream overestimated for to ITS tremendously in correct of To be entity variants analyses. will allelic biological diversity species-level different fungi a represent the for as best) since used (at differences marker haplotype problematic (Callahan, ITS will highly least, ITS analyses the be each at like downstream variation, can for treating reflect, intraspecific this input by of can 2008), as level ASVs difference al. high use with et pair For to markers (Nilsson, haplotypes. 
base for suggested However, underlying single been 2017). the has one al. for it et where approximations genera, 18S, are and a which and the species In analyses, between 16S in single- dada2 sequencing. like provide present of term and to markers output The haplotypes, able conserved PCR the region. is underlying during dada2 16S for software bacterial generated the coined the entire variability been that identify a the shown sequence was 2019), to analyzing the it when al. is 2019), all resolution al. et methods to et (Pauvert, (Callahan, recent rise study software gave recent more various that the in DNA, in developed template aim been Callahan, entities. have 2016; basic biological al. solutions common the et different approximating Boyer, better somewhat 2015; OTUs al. Although return et thus, (Mahe, and variation 2016), sequence al. biological et and artefacts sequencing and f2 TBad1 eamratehnluigaRtc M0 ie 4x4 t2 oscillations). 25 at s 45 x (4 mixer MM200 Retsch a using beta-mercaptoethanol 1% and CTAB 2% of l TCCTTGTT)(ht,e l 90 rmr.TePRmxuei 25 in mixture PCR The primers. 1990) al. et (White, CTCCGCTTATTGATATG) μ l. μ il- ae,2.5 water, Milli-Q l x ´ GTGA udlns N a ltdi 100 in eluted was DNA guidelines. s μ R 0 odbffr 0.2 buffer, Gold 10x l TCATCGA 2 x ftsu eectotfo ahfutbd n rne n800 in grinded and body fruit each from out cut were tissue of ) agn rm79bs ar.Tefna T2rgo was region ITS2 fungal The pairs. base 7-9 from ranging ) 3 R CTG Irak ta.21)adIS (5’- ITS4 and 2012) al. et (Ihrmark, TCTTTG) μ NPs(5n) 1.5 nM), (25 dNTP’s l mlcnsqec variants sequence amplicon μ lto ue,qatfidwt Qubit with quantified buffer, elution l μ ees n forward and reverse l μ nlvolume, final l AV)has (ASVs) Posted on Authorea 21 Sep 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.160071155.58915559 — This a preprint and has not been peer reviewed. Data may be preliminary. 
eune noeIlmn ie Ilmn,SnDeo A S)ln ih2 with lane USA) and CA, Assay PhiX Diego, 20% BR San with DNA spiked (Illumina, ds adapters, MiSeq Qubit Illumina with with Illumina barcoded measured one were was in magnetic libraries concentration two XP sequenced DNA The AMPure normalization the Agencourt Technologies). to using and (Life prior purified kit AS) gel 20 and Meszansky in concentrated agarose (Nerliens eluted pooled, 2% were beads and a library (Invitrogen) on each Kit electrophoresis within Plate by products Normalization controlled SequalPrep was the products using PCR of quality The 72 at extension s, 30 95 at activation ng/ 5-10 and nooet i T2hpoye e pce Tbe1,ietfigattl4 altpsfo h Sanger the from haplotypes 45 total population. local a of the identifying dephased in level 1), were present high (Table were sequences variation or species ITS2 except sequence templates) The per ITS2 species, sequences, multiple haplotypes all interpret. low-quality For fruit (generating ITS2 to in 16 six dataset. bodies hard resulted to to chromatograms or fruit 6 one amplify to from the not into leading ranging inside did bodies, indels, growing either fruit of fungi 176 bodies heterozygosity of fruit fungicolous out remaining to 151 The due for species. were sequences datasets per Sanger haplotype ITS2 bodies two quality the high obtained from We sequences the 2016). last, al. At et datasets 2019). (Rognes, two Results Team VSEARCH the Core by across identity haplotype R each 97% 3.6.2; with of with of (v abundance generated clustered calculation mutational R relative the were one in in as In species correspondence made scored ITS2. 11 were the was relative gaps in showing the multi-position variation their biplot for where intraspecific A obtained compared characters, networks of we as event. and Haplotype level scored species, HTS were the 2019). each indels displaying the network, Team for 2015), haplotype from Core Bryant and Hence, another R and strands, and 2017). 
(Leigh 3.6.2; sequence popart sequences (v al. homozygous Sanger R et two the in were into (Rozas, abundance tissue, from v.6 split dikaryotic dataset was dnasp the haplotype in sample in one each variability polymorphisms 2020.0.5 allelic of DNA to for sequence due prime analyzed consensus sites the heterozygous alignment geneious i.e sequence with single dephased many a Separate in to sequences, quality). merged Sanger then on The processed were (depending which possible sequences, further code, Sanger species. when nucleotide and each merged IUPAC ASVs for were the were from to reads generated according reverse sequences were characterized alignments and were forward Sanger sites were the Heterozygous sequences and poor-quality dataset. the and the curated the from manually and were excluded inside sequences table growing Sanger ASVs ASV fungi The 11 resulting fungicolous (https://www.geneious.com). the the of the to and because assigned , data sequences Both 2005) HTS 57 the only al. and in Taxonomy retained correction et we occurred bodies. variability. error (Koljalg, fruit others analyses, length for database numerous downstream preserve used unite For species, was to the target 2016) sequences. order by 3,647 in ASVs al. of sequences raw et consisting the the (Callahan, truncating to dada2 target, without assigned and bp). was reads, query 100 between the with overlap of bp of demultiplexed length 26 were merging least minimum amplicons (at sequences and removed resulting were The indel reads The Eurofins quality no reads. low by and 25,953,804 above. 2011) directions comprised (Martin as cutadapt dataset both program metabarcoding in and resulting was sequenced mix The Amplification and PCR Aldrich) same bodies. (Sigma the fruiting Germany) ExoProStar with 176 (Ebersberg, with 1990), the cleaned for al. 
were ITS2 et of (White, (5’-TCCTCCGCTTATTGATATGC) ITS4 primers sequences and (5’-GCATCGATGAAGAACGCAGC) Sanger ITS3 with generated performed we comparison, Germany). For Mainz, GmbH, (StarSEQ StarSEQ at reads (10 primers μ ) 2.5 M), μ fDAtmlt.Tefloigccigprmtr eeue o mlfiain enzyme amplification: for used were parameters cycling following The template. DNA of l ° o i,floe y3 ylso eauaina 95 at denaturation of cycles 32 by followed min, 5 for C μ ° o i,adafia xeso tpa 72 at step extension final a and min, 1 for C MgCl l 2 5 M,1.0 mM), (50 mlcsi lapponica Amylocystis μ S 2 gm) 0.2 mg/ml), (20 BSA l 4 ersne yhpoye oelvlo intraspecific of level some haplotype, by represented ° o 0min. 10 for C μ mlTqGl oyeae( U/ (5 polymerase Gold AmpliTaq l ° o 0s neln t55 at annealing s, 30 for C μ lto ue.Te9 PCR 96 The buffer. elution l × 0 aepi paired-end pair base 300 ° for C μ l) Posted on Authorea 21 Sep 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.160071155.58915559 — This a preprint and has not been peer reviewed. Data may be preliminary. nitapcfi T aiblt ntefna igo Sih ta.20;Nlsn ta.20) Nilsson 2008). al. ( et literature 3.33% Nilsson, previous of 2007; the variability al. with sequence et well intraspecific (Smith, corresponds detected, an kingdom was This region reported fungal forest). ITS2 single the (2008) the a in in al. variation variability (i.e. et intraspecific scale ITS of geographic intraspecific level fine some on one, the this except at dataset, to species, even sequences. Sanger and target due the Sanger models the in original either error all haplotypes the on were For additional of based the detected dephasing reconstruction to erroneous ASVs comes sequence from it additional probabilistic result When are some could uncertainty. is ASVs associated ASVs case, It that an obtained with (Vˇetrovsk´y this have 2013). 
mind variation the thus Baldrian sequenced after in in 16S and and keep even intragenomic bacteria also alternatively, templates to erroneous of However, or important original be errors, communities the can sequencing 2016). mock between or (ASVs), of detected PCR haplotypes al. was rRNA few et correspondence 16S a (Callahan, high that full-length a as exemplifies analyzing SCC, identify This By haplotypes PacBio to beginning chimeric the sequences. failed five either processing. dada2 the towards implemented, breakpoints that dada2 of chimeric filter and end with sequence cycles dataset, the chimeric metabarcoding PCR or DNA a initial filtered has the the in algorithm during occurred dataset dada2 introduced metabarcoding errors Although DNA the by PCR in artifacts. detected to abundance haplotypes low due extra in be appearing the haplotypes might of unique some the that of possibility are some Alternatively, the paralogs out ITS intragenomic rule since paralogs. Although cannot sequencing, ITS we Sanger templates. represent 2013), direct DNA HTS with multiple al. detect the et to homogenizing from (Lindner, hard evolution derived rare is concerted is variation of Such sequence lack to consensus 2011). due a Banik occurs and ITS (Lindner in paralogs variation the intragenomic species, fungal some methodological In to due relative be represent either in detect. may can correlation they approaches to base or high the failed single of steps, a by various one caused observed at by 57% were detected introduced also mismatches with haplotypes errors striking We additional species, most The the DNA fungal datasets. where indels. eleven versus two datasets, pair the the sequencing the across across across Sanger haplotypes marker shared of methods, ITS2 abundances haplotypes two the detected the in the between variation of correspondence allelic assessing good in a metabarcoding, observed we general, In Each species. 11 the Discussion for OTUs OTUs. 
or two two, clusters for with 13 rare exept represented obtained OTU, allele These were one we single S1). identity, by Table represented a 97% was alleles, from variation. with species of intragenomic sequences expected in alternatively, number the be were or by clustering would errors, divided haplotypes After number what sequencing Five read and than total PCR lower steps. represent folds (i.e. mutational likely ten population few haplotypes the dataset, a identified in HTS one by present haplotypes the from separated being varying in the related, species, abundance between closely low across relationship were variation very the some intraspecific haplotypes of illustrate With most level 2) that data. the datasets demonstrate HTS (Fig. two (in and the the networks haplotype datasets across to haplotype two haplotypes specific of the The abundance were S1), from relative (Table (30.8%) the 1). approaches in 20 two found the while (Fig. was between correspondence sequencing, shared high were Sanger a (57%) 57 from exceptions, 37 total only which in (12.3%) of 1), haplotypes, (Table eight 65 species additional detected each five we for removing were Overall, (ASVs) and sequences dada2 haplotypes ITS2 using 8 for 2,316,395 sequences and After species, the 1 of available. eleven between denoising haplotypes. total were identified the After a we sequences across species. fungi, sequences, Sanger target distributed chimeric fungicolous which 11 bodies to for the fruit corresponding to specimens 176 attributed sequences the of ITS2 on out all focused 163 removing only for we data purposes comparative HTS obtained we Although mlcsi lapponica Amylocystis oeee (in eleven to ) hloiu nigrolimitatus Phellopilus hloiu nigrolimitatus Phellopilus 5 efacto de eunevrainta n ftemethods the of one that variation sequence ± tnaddvaino .2 for 5.62) of deviation standard and .Tentok loindicate also networks The ). 
hei centrifuga Phlebia which , Posted on Authorea 21 Sep 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.160071155.58915559 — This a preprint and has not been peer reviewed. Data may be preliminary. ,Asa ,Ceh P aedH ta.21.Srcueadfnto ftegoa osi microbiome. topsoil global the of function and Structure 2018. al. et Bengtsson-Palme H, PM, Harend Bodegom LP, NA, Soudzilovskaia Coelho JL, S, Anderson Anslan SK, Forslund All J, F, manuscript. Hildebrand M, the Bahram wrote HK and SM ELE, References LM. and ELE manuscript. sequencing. SM, Sanger the HK, approved and processed preparation and from SM library edited contributions sampling. HTS authors performed performed with HK ELE data and and SM SM analyzed research. DNA. the extracted conceptualized and and samples designed ELE and SM HK, Repository Digital contributions Dryad Author the at available are codes and dataset HTS (http://dx.doi.org/issakka/its NCBI. in deposited are sequences Sanger Accessibility Data interest. competing no declare authors University The the at Biosciences of Interests Department the Competing and supported 254746) financially No was (grant research Norway Oslo. The of of organization. Council sampling with Research help the for by Kuhmo Mets in from Center Meriruoko Research Park Ari Helo, Teppo acknowledge We will classification, we taxonomic direction, reliable this more Acknowledgements towards tools. provide head bioinformatics to To suitable regions 2018). to different al. addition of coverage et in higher Tedersoo, promising with 2018; 16S, databases are al. for require Nanopore) et bp (Kennedy, Oxford samp- 500-1500 environmental resolution PacBio, (e.g. most taxonomic (e.g. barcodes from datasets longer technologies taxonomic multi-locus generate improve sequencing generate to to to generation position required third a often Alternatively, in when are not les. 
species markers still resolve related DNA to are closely approach we independent Yet, powerful separate multiple resolution. a to marker, is metabarcoding and DNA DNA da- populations single of with natural targeting limitations error-correction in variation and the allelic HTS Given the on communities. reflects based complex extent metabarcoding, large DNA a that different to conclude across da2, we distributions results, same our the to frequent show According most 2018). largely The different al. they data. against et 16S and robust and (Botnen, pattern ITS highly treatments community line for are the data In both drive patterns 99%, 2018). diversity ASVs) to Martiny results similarity (or beta and studies sequence comparable OTUs that 85% (Glassman In demonstrated that from clusters aims. ranging (2018) shown sequence levels, study al with clustering representing been et the OTUs species, Botnen previously on or this, 11 ASVs depends has with the using this it obtained representing of species- turnover), be approach OTUs importance (community results can to 13 The diversity Our correction haplotypes. obtained variation. error beta species two after we intraspecific emphasizing between by needed with haplotypes, separate represented is markers our step may species variable clustering clustering mutation two for a After pair approach marker, case reasonable resolution. base ITS the the variable a level single not as the is a however haplotypes) in this as when is that Indeed, to show 18S, This referred 2017). genera. and largely al. even 16S study et or our like (Callahan, (in markers, analyses ASVs conserved community term for microbial the previously in use been to units advocated has basic been region in recently ITS 2003), has Schumacher It the and Kauserud in 2002; results. variation Schumacher our and sequence with (Kauserud species, scales line spatial target broader the across reported of some For Basidiomycota. lseig n ilb aeaalbeuo publication. 
upon available made be will and clustering) 6 > hliu n egl ´roy rmteFriendship the V´arkonyi from Gergely and ¨ ahallitus 0 pfrISad60b o O)adimprove and COI) for bp 650 and ITS for bp 700 Posted on Authorea 21 Sep 2020 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.160071155.58915559 — This a preprint and has not been peer reviewed. Data may be preliminary. end G ln C ogZ 08 rbn rms esspromnei ogrra uglmetabar- fungal read longer in performance versus promise Probing 217:973-976. 2018. phytologist Z. New Song coding. LC, fungus Cline wood-decay PG, pioneer the Kennedy of Phellinus structure fungus population 95:416-425. local decay Mycologia and 80:597-606. abietinum. wood Regional Trichaptum botanique 2003. endangered de T. 454- Schumacher the canadien H, by Journal of Kauserud region–evaluation botany. structure of ITS2 journal Population fungal Canadian 2002. (Basidiomycota). the T. nigrolimitatus 82:666-677. Schumacher amplify ecology microbiology H, to FEMS Kauserud primers communities. natural New and 2012. artificial of KE. sequencing Clemmensen 9:e87624. M, the ONE of Durling PLOS Characterization primers. 2014. B specific C. K, eukaryote Troedsson Ihrmark universal EM, designing Thompson for I, sequencing Jonassen next-generation rRNA A, of 18S Lanzen years K, Lekang ten K, age: Hadziavdic of 17:333-351. Coming Genetics 2016. variants Reviews WR. sequence Nature McCombie exact technologies. JD, of McPherson use S, to Goodwin robust 3. are MSphere the patterns units. to ecological taxonomic basidiomycetes–application Broadscale operational 2018. for versus JB. specificity Martiny 2:113-118. enhanced SI, Ecology Glassman with Molecular primers rusts. and ITS mycorrhizae 1993. of TD. identification Bruns methods Nature M, reads. Gardes amplicon microbial from sequences OTU 3:613- accurate 10:996. highly Evolution UPARSE: of and 2013. metabarcoding Ecology RC. 
Tables

Table 1. Comparison of Sanger and HTS sequences. Only specimens for which sequences were obtained from both approaches are shown. Sequence length (base pair) for the ITS2 is from the Sanger sequence alignments. The number of dephased sequences (n+n) reflects that each individual is dikaryotic. ASVs stands for amplicon sequence variants. Total haplotypes (Hap.) include the haplotypes common to Sanger and HTS (the overlap between the two) and the additional haplotypes identified by either approach.

Columns: Species; Specimen; Sequence length (bp); Sanger sequences (Dephased sequences, Polymorphic sites, Hap.); ASVs (Reads, Polymorphic sites, Hap.); Total Haplotypes.

Species: Amylocystis lapponica, Antrodia serialis, Fomitopsis pinicola, Fomitopsis rosea, Gloeophyllum sepiarium, Phlebia centrifuga.

Table 1 (continued). Species: Phellinus ferrugineofuscus, Phellopilus nigrolimitatus, Phellinus viticola, Postia caesia, Trichaptum abietinum.

Figures

Figure 1. Biplot showing the correspondence of haplotypes in relative abundance across the two datasets (Sanger sequencing on x-axis and HTS on y-axis).

Figure 2. Haplotype networks displaying the level of intraspecific variation in ITS2 for the 11 fungal species. Each circle represents one haplotype and each dash represents one mutational step. Green color indicates haplotypes detected by both Sanger sequencing and HTS, yellow indicates haplotypes from the HTS dataset, and blue indicates haplotypes that were only detected by Sanger sequencing. Red arrows indicate haplotypes occurring with very low sequence abundance in the HTS data, i.e. 10 times below what would be expected from a single allele in the population. The naming of haplotypes (Hap 1 to Hap 65) follows Table S1.
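The low-abundance criterion in the Figure 2 caption (10 times below what would be expected from a single allele in the population) can be illustrated with a small worked example. The Python sketch below rests on an assumption that is not stated in the caption: the expected frequency of a single allele is taken to be 1 divided by the number of dephased sequences sampled for the species. The function names, the `fold` parameter and the read counts are hypothetical and only serve the illustration.

```python
def single_allele_frequency(n_dephased):
    """Expected proportion of one allele among n dephased sequences.

    Assumption for this sketch: every dephased sequence is one allele copy,
    so a single allele is expected at a frequency of 1 / n_dephased.
    """
    return 1.0 / n_dephased


def flag_low_abundance(read_counts, n_dephased, fold=10.0):
    """Return haplotypes whose read proportion is below expected / fold."""
    total_reads = sum(read_counts.values())
    threshold = single_allele_frequency(n_dephased) / fold
    return sorted(
        hap for hap, reads in read_counts.items()
        if reads / total_reads < threshold
    )


# Hypothetical species with 16 dikaryotic specimens, i.e. 32 dephased sequences.
n_dephased = 32
# Invented HTS read counts per haplotype (10,000 reads in total).
read_counts = {"Hap_A": 9000, "Hap_B": 980, "Hap_C": 20}

# Expected single-allele frequency is 1/32 = 0.03125, so the cut-off is 0.003125.
# Hap_C holds 20 / 10000 = 0.002 of the reads and falls below that cut-off.
flagged = flag_low_abundance(read_counts, n_dephased)
print(flagged)
```

Under this reading the cut-off scales with sampling depth: a species with fewer dephased sequences has a higher expected single-allele frequency and therefore a higher cut-off.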

[Figure 1: scatter plot of proportional haplotype abundance; x-axis: Prop. abundance Sanger sequencing; y-axis: Prop. abundance HTS; both axes run from 0.0 to 1.0; R² = 0.613, p = 7.6e−15.]

[Figure 2: haplotype networks with nodes labelled Hap_1 to Hap_65; panels: Amylocystis lapponica, Antrodia serialis, Fomitopsis pinicola, Fomitopsis rosea, Gloeophyllum sepiarium, Phellinus ferrugineofuscus, Phellinus viticola, Phellopilus nigrolimitatus, Phlebia centrifuga, Postia caesia, Trichaptum abietinum.]
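The agreement reported for Figure 1 (R² = 0.613) is between per-haplotype proportional abundances in the Sanger and HTS datasets. A minimal Python sketch of that kind of comparison is given here, assuming the haplotype counts are available as plain dictionaries. The haplotype names and counts are invented for illustration, and the squared Pearson correlation is used as one common way to obtain R²; the authors' exact procedure may differ.

```python
from math import fsum


def proportions(counts):
    """Convert raw counts per haplotype into proportions that sum to 1."""
    total = fsum(counts.values())
    return {hap: n / total for hap, n in counts.items()}


def paired_proportions(sanger_counts, hts_counts):
    """Align both datasets on the union of haplotypes (missing ones get 0)."""
    sanger = proportions(sanger_counts)
    hts = proportions(hts_counts)
    names = sorted(set(sanger) | set(hts))
    xs = [sanger.get(hap, 0.0) for hap in names]
    ys = [hts.get(hap, 0.0) for hap in names]
    return names, xs, ys


def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = fsum(xs) / n, fsum(ys) / n
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


# Hypothetical counts: dephased Sanger sequences and HTS reads per haplotype.
sanger_counts = {"Hap_A": 20, "Hap_B": 10, "Hap_C": 2}
hts_counts = {"Hap_A": 6000, "Hap_B": 3500, "Hap_D": 500}

names, xs, ys = paired_proportions(sanger_counts, hts_counts)
print(names)
print(round(r_squared(xs, ys), 3))
```

Haplotypes seen by only one method enter the comparison with a proportion of zero in the other dataset, so they pull R² down, in the same way that points lying on an axis of the biplot do.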