Posted on Authorea 19 Apr 2021 | The copyright holder is the author/funder. All rights reserved. No reuse without permission. | https://doi.org/10.22541/au.161882857.78207100/v1 | This a preprint and has not been peer reviewed. Data may be preliminary. h er o aa´s(iue1,EsenAao,i h otes fteBaiinsaeo Par´a, is of state Brazilian the of as southeast known the formation disturbing in a the by Amazon, of covered face Eastern outcrops al., in 1), ironstone 2016). et forests by al., (Figure Silva Caraj´as the formed et (see dos of (Levine endemism change resilience Serra of massive climate the The its and centres for extant impacts along different important anthropological addition, all several are direct In of of which of 2002; 2018). composed (Fearnside, quarter effects within), BFG, ecosystems is references 2015; one Amazon basin Lughadha, and the estimated & Amazon of 2005 an Morim portion the huge 2010; harbouring area, a al., the planet, geographic about et undoubtedly of knowledge the Milliken is of maintenance region in 2007; lack the the a Hopkins, ecosystems for Although is important important 2018). there al., species, exceptionally most et being the (Antonelli biome, time of 2017). over diverse Alroy, region one 2010; and Neotropical al., vast et the is Milliken a in this 2007; biodiversity is although Hopkins, interest, 2000; basin of al., area Amazon et the (Myers The in regions biodiversity megadiverse the for of available knowledge rarely detailed a on depend efforts Conservation we of relevance addition, the In Caraj´as. discuss dos time. we Introduction Serra Amazon, achieving first Brazilian the Upon the the in canga. for in measures the flora conservation barcoded in complete future diversity being a plant guide species to surveying to directed for 344 results effort samples of our barcoding bulk species, total DNA of 538 a comprehensive of metabarcoding first specimens with DNA 1,130 the , using for of vascular most barcodes DNA potential the DNA of as different the describe families we ITS2 assessed eight Here and 115 robust flora. testing rbcL and for regional chose fast after the genera need we a Thus, in ITS2), growing 323 application and as activities. the broad trnH-psbA appears a mining psbK-psbI, considering for barcoding atpF-atpH, industrial Caraj´as markers flora, rpoC1, DNA suitable with dos available rpoB, scenario, Serra area is rbcL, this (matK, the an information Caraj´as. markers In of such genetic barcode dos diversity no in Serra genetic to planning the the scarce conservation of assess biodiversity published, outcrops to recently ironstone approach been endemic the effective several has of harbouring and community, survey species plant flora open plant complete unique most a a for to Although home is species. Amazon, rare Eastern Caraj´as, in and dos Serra the of canga The Abstract 2021 19, April 5 4 3 2 1 Giulietti Viana Zappi Daniela Talvˆane Lima Vasconcelos Santelmo Amazonian barcoding the DNA of diversity plant the Unravelling nvriaeEtda eCmia apsCdd nvriai eeioVaz Zeferino Universitaria Kew Cidade Gardens Campus Botanic - Royal Campinas Pesquisa de de Estadual Campus Universidade Goeldi Emilio Paraense Museu Brasilia de Universidade Sustent´avel Tecnol´ogicoDesenvolvimento Vale Instituto 3 Andr´e Gil , 1 n ulem Oliveira Guilherme and , 1 2 drPires Eder , lc Hiura Alice , 3 nr Sim˜oes Andre , 1 ieeNunes Gisele , 1 aalValadares Rafael , 1 aaaPastore Mayara , 1 4 eaImperatriz-Fonseca Vera , 1 ain Dias Mariana , 1 1 1 oneAlves Ronnie , iin Vasconcelos Liziane , 1 aiyLorena Jamily , apsrupestres campos 1 1 amn Harley Raymond , Maur´ıcio Watanabe , 1 aaMota Nara , canga 1 eaoOliveira Renato , on canga through a ealdin detailed (as 5 3 Ana , Pedro , 1 , 1 , Posted on Authorea 19 Apr 2021 | The copyright holder is the author/funder. All rights reserved. No reuse without permission. | https://doi.org/10.22541/au.161882857.78207100/v1 | This a preprint and has not been peer reviewed. Data may be preliminary. fDAmtbroigfrmntrn idvriy(rs,21;Drot ta. 08 dmwc tal., et Adamowicz 2018; al., et DNA application Dormontt curated successful 2017; a the (Kress, other Thus, for biodiversity 2018). with basis al., surveying monitoring the comparison as et provide 2019). for for in Gous can such choice metabarcoding bp) 2015; procedures barcode, (~450 DNA of al., analytical et size of DNA stablished markers Richardson well amplicon main this and 2010; smaller the library al., using a et barcode of of and (Chen one advantages conditions regions been used methodological PCR frequently has standardizing the et of ITS2 samples considering been (Deiner ease plants, composite once, has identification For the approach multi-species of at 2019). This automated use species al., for metabarcoding. the et multiple approach DNA cost-effective Zinger enables as and 2017; known level fast, environment, al., species robust, given a the a as from at regarded barcodes species barcodes DNA of useful DNA detection yield of for 2011). gene, genome, al., generation rRNA nuclear et 35S the Hollingsworth the the Furthermore, 2010; of of ITS2) al., regions and et some (ITS1 (Chen markers, spacers plants organelle transcribed for Besides internal the 2017). as Kress, such 2009; Group, different Working using rates Plant as success the function variable of will reporting combination regions the been cpDNA have which (cpDNA) authors standardizing DNA several (e.g., in since markers chloroplast difficulty barcodes, the some a considerably as DNA is of such plant the there plastomes, universality reliable Also, with the the 2011). within arise rates and al., species substitution et et genomes nucleotide plant (Hollingsworth Hollingsworth higher organelle of 2016; with the al., plants barcoding those et of mainly for DNA markers, Hebert straightforward evolution with 2009; as of associated al., not pace et problems is (Fazekas slower main fungi barcodes and The DNA animals 2016). of as development al., such the eukaryotes, that other for fact respectively. well-known as (2018), a al. is et it Nunes However, and (2017) al. et Babiychuk by the diversity of the morning-glory species cangae measuring stands the plant and 2003) more identifying of as al., for the guiding et information populations such from thus cost-effective (Hebert natural voucher, and seen barcodes reliable of which DNA deposited were of of species, status a source plants application the efficient by The of an of area. backed as data identification FCC the and out for correct genomic the efforts specialists, the of and conservation taxonomist ensure initiative all genetic would floristic by effectively of measure the authenticated availability a of been the Such part had important. as as extremely board 2018), as on al., onset taken three et was in samples (Mota lycophytes DNA project 2017). 11 of Giulietti, families, collection & 22 systematic (Mota gymnosperm, in The Amazon 2016). single ferns al., Brazilian a et 175 the and (Viana in included species 2018), distributed FCC et 600 widely al., the around (Mota et in of species (Salino estimate detailed provided 900 initial families groups project the approximately This plant However, than 2018). comprising vascular higher (2016). al., families, considerably Other et al. number (Mota angiosperm et a Amazon 116 Brazilian 2018), Viana the al., for by of the treatments region detailed of a floristic as Flora for complete 1970s, Flora the complete 2020). the flora, first al., its in the et being Caraj´as publish started Giannini to dos 2019; effective project Serra al., scenarios through change a et the climate Miranda protection in the 2016; species surveys of al., view Plant ensure (Skirycz et in to (Levine activities Such especially region necessary mining 2019). activities, the industrial al., ore are for of et iron surveys predicted presence (Zappi for biodiversity the sites mainly in robust among efforts years heterogeneity and conservation the floristic 2014), throughout high al., explored a et with been 2019), have al., outcrops et de ironstone Giulietti FLONA 2016; as The Caraj´as, or al., such matrix. de et species, forest Nacional plant dense (Floresta and rare Forest a (Araceae) and Caraj´as by National endemic the surrounded several in 2019), Caraj´as), harbouring mostly al., found et Caraj´as is Zappi dos 2019; al., et Souza-Filho ...eer,Sln St¨utzel and & Salino J.B.S.Pereira, aaai cangae Carajasia rpo B, rpo rbc C1, and L pme cavalcantei Ipomoea atp ..aa,ELCba esi Rbaee Siyze l,21;Viana 2014; al., et (Skirycz (Rubiaceae) Dessein & E.L.Cabral R.M.Salas, F- mat atp eune aebe eomne stecr acdn oi(CBOL loci barcoding core the as recommended been have sequences K H, psb .serracarajensis I. K- canga psb ..utn(ovluaee,adtequillworts the and (Convolvulaceae), D.F.Austin and I canga 2 fCrja FC,to ati utudrfu years, four under just in part took Caraj´as (FCC), of ntmnodiflorum Gnetum trn srcnl eosrtdfredmcspecies endemic for demonstrated recently as , H- ...eer,Sln St¨utzel (Isoetaceae), & Salino J.B.S.Pereira, psb )(.. aea ta. 08,although 2008), al., et Fazekas (e.g., A) hldnrncarajasense Philodendron rnn Geaee,aliana a (Gnetaceae), Brongn. canga fteSerra the of mat E.G.Gon¸c. gene K Isoetes Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. io oictos(.-. fla isead1 Lo h xrcinbffr ihteadto f4 w/v analysis 4% fragment of with and addition conditions the (2005), with PCR buffer, al. (1994). extraction et al. the Weising of et by mL Michaels incubation described 10 by v/v and I an 0.2% tissue protocol including and CTAB and of PVP the g buffer using (0.5-1.0 VXL modifications performed the minor were HT regarding without extractions QIAcube 350 modifications out DNA minor adding a carried with the after in (Qiagen), was s performed Kit which DNA 30 was 96 step, for extraction QIAamp preparation automated the of sample an 300 V1’ Afterwards, the and protocol debris ‘Q plate. the eliminate 60 with U-bottom to (Qiagen) at deep-well ground rpm 600 the min 96 Then, 4,000 to at 40 a Hz. added min 30 were to for NaCl) at deep 1 incubated M min collection a for 1.4 1 were mL centrifuged EDTA, in for mM samples were 1.2 (Qiagen) frozen 20 racked II the Tris-HCl, were TissueLyser mM 96 and samples a 0.1 in CTAB, The in material w/v separated ground (Qiagen). (2% then were buffer beads and samples) extraction h carbide of dried 18-20 tungsten for silica mm degC) for (-80 3 mg freezer two ~10 considering with the (or materials, (Axygen) tissue in plant microtubes plant observed all fresh for groups of protocol taxonomic automated mg of efficient diversity an high established we the extractions, (~25 DNA temperature the For room at stored then Rogstad and extraction by gel DNA silica described in as dried buffer, were saturated 13%) processing. CTAB-NaCl ca. 2% (147, (~4 tissues in refrigeration collected collected under stored were MG then 87%) the and at (1992), ca. deposited ( (983, were markers samples specimens best Most sampled two all Bel´em, of the Par´a, Em´ılio vouchers Goeldi, ). of Paraense The (Museu selection S3). herbarium the and S1 after Tables different only (Supplementary seven below barcoded test were al., to used samples et were Salino 534 species) 2018; 370 Par´a, Brazil al., and et of genera of K- genes Mota 243 55% state (the Approximately families, 2016; regions 63324-1. 96 Amazon, al., and from cpDNA et Eastern 53990-1 specimens (Viana 48272-6, (645 in were 47856-2, project samples regions numbers species FCC those relevant 577 permit the and ICMBio/MMA of other genera under part 343 2018), and Caraj´as families, as 120 dos S1), from Table Serra plants Eriocaulaceae, (Supplementary vascular and the of Cactaceae or of specimens in vegetative species 1,179 other of collected of either case total the although A in extractions, instance. as DNA for needed, when the employed for were sampled structures reproductive were tissues leaf young proceduresPreferentially, barcode DNA the here. for developed materials library Plant surveying barcode for DNA ITS2 the methods with of and analyses advantage main Caraj´as, metabarcoding taking Materials two dos DNA followed Serra of the we the potential in Here the in groups diversity basin. test taxonomic plant to provide Amazon of aimed most to the diversity we order the Moreover, the of in chose analyses. considering flora universality, area, then the marker the and of as in possible regions canga data approach range highest diversity barcode barcoding the mountain genetic DNA DNA (1) this the assess premises: used of of to commonly composition application tools eight broader biodiversity robust of a the potential for mainly of markers the of species understanding suitable flora plant tested vascular complete We an for the whole. to barcodes to relevant a DNA directed are describe approach Par´a that we barcoding of Hence, DNA basin. other the Amazon no on the is focusing in there region knowledge, other our any of best the To psb and I n 2 esnbesadriainadatmto ftepooosfrsml rcsigand processing sample for protocols the of automation and standardization reasonable a (2) and ; trn canga H- psb β- ecpotao) olwdb h eetv rcptto fplschrdsdescribed polysaccharides of precipitation selective the by followed mercaptoethanol), fteSradsCrja,as nldn lnsohrfo ra nteBaiinstate Brazilian the in areas from other plants including Caraj´as, also dos Serra the of )adteIS negncrgo SplmnayTbe 1adS) h remaining The S2). and S1 Tables (Supplementary region intergenic ITS2 the and A) μ fbnigbffrAB iigfrsxtms lo o oedffiutsamples, difficult some for Also, times. six for mixing ACB, buffer binding of L mat K, rbc L, rpo ,and B, ° )utlteDAetato a are u.Teremaining The out. carried was extraction DNA the until C) 3 rpo canga 1 n h negncspacers intergenic the and C1, ° nawtrbt.Tecleto microtubes collection The bath. water a in C fteSradsCrja.Apoiaey20 Caraj´as. Approximately dos Serra the of μ ftespraatwr transferred were supernatant the of L rbc n T2,a detailed as ITS2), and L atp F- atp ° )until C) H, psb μ L Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. osdrn oae l 21) n P 21)(upeetr aaS;SplmnayFgrsS1-S6): modifications Figures Supplementary minor S1; with Data (2017), (Supplementary Neto (2016) (1) I Meira – PPG & matrices and Gastauer concatenated (2018), two on al. based et relationships Mota RAxML the in considering family on constructed based the were matrices for ML different constraint six specimen. bootstrap using sampled of above, one 70% specimens. than described least more sampled as at with with the those which species among only monophyletic regions, diversity considering counting intergenic by taxonomic support, resolution the high phylogenetic of the the tested to case we due Finally, the align in addi- to especially difficult for sequences, more we (http://blast.ncbi.nlm.nih.gov/Blast.cgi) problematic considerably Furthermore, avoid database were to S1-S12). GenBank control Figures model the quality (Supplementary substitution in tional replicates the searches using 1,000 BLASTn (http://phylo.org), 8.2 with RAxML portal performed bootstrapping with CIPRES rapid constructed the were and in se- (ML) GTR+G implemented the likelihood as maximum affiliations, 2014) on family based (Stamatakis, trees and algorithm phylogenetic the order Then, using contaminated cod- mainly 7.388 separately. or their MAFFT considering mislabelled with for either aligned groupings, frame were (from correct quences specimen sequences the the problematic unusual determine for ( for generating to initially genes 90% 11 check samples) coding to as of Afterwards, protein parameter similarity “–gcode” for region. minimum the ing 0 a of defined and threshold we and 1 genes, fastq a bp as as- protein-coding using to 15 parameter the filtered converted of “–coding” generate and were overlap and the to trimmed C1 minimum files defined then *.phd) We a trace were and considering step. the sequences (*.ab1 assembly also Initially, Converted files score, step. sequences. trace PHRED control reverse DNA all 20 quality to and process the prior forward to for min, the files 2018) 2 of al., least consensus et at sembled (Oliveira for PIPEBAR ice used on analyses We put phylogenetic and then Biosystems). control and (Applied quality Analyzer min assembly, DNA 3 Sequences 3730 for ABI denatured an were in samples analyses fragment the tubes, the to 4 at min 35 supernatant, the discarding 10 25 of sample, total consisted a reaction to μ Each water Milli-Q protocol. manufacturer’s Kit and Terminator Sequencing the Cycle of v3.1 version 2 Terminator modified BigDye of a the following using Biosystems), out carried (Applied 10 were at 20 reactions rpm (ca. sequencing 4,000 temperature Bidirectional at 10 room min at in 10 dried resuspended for were was centrifuged tubes DNA were the amplified tubes again, the discarded and was added, supernatant were ethanol 70% cold o 5mnadcnrfgdfr4 i t400rma 10 at rpm 4,000 at min 45 for centrifuged and min 15 for 72 at step conditions: extension following the using Biosystems), (Applied 54 Cycler 94 Thermal at 96-Well Veriti denaturation a initial in 12 run were of PCRs total The a to 0.25 water each), mM Milli-Q (2 mix dNTP of ohbroe;adfu arcsfo eaaeainet (3) – alignments separate from (2) (5) matrices and sampling, markers, complete four two and the barcodes; of one both of sequences missing with protocol CTAB the from DNA genomic of ng 2 ˜20 follows: (or as robot S4) HT Table QIAcube (Supplementary markers the 1.2 cpDNA in I), seven extracted the DNA for genomic conditions of PCR same the used We fa11(/)slto otiig3Msdu ctt p .)ad15m DA(H80 oeach to 8.0) (pH EDTA mM 125 and 5.2) (pH acetate sodium M 3 containing solution (v/v) 1:1 a of L ° ecp for (except C μ fteapie N,2 DNA, amplified the of L μ mat f10 of L ° .Te,tespraatwsdsadd 10 discarded, was supernatant the Then, C. μ )aditrei ein (ITS2, regions intergenic and K) faslt tao,bifl otxn n etiuiga ,0 p o 0mna 4 at min 40 for rpm 4,000 at centrifuging and vortexing briefly ethanol, absolute of L × ecinbffr(0 MTi-C,p .,ad50m C) 0.6 KCl), mM 500 and 8.3, pH Tris-HCl, mM (100 buffer reaction mat ° o i.Te,teapie N a rcpttdwt 100 with precipitated was DNA amplified the Then, min. 7 for C ,frwihteanaigtmeauews46 was temperature annealing the which for K, rbcL ° o i,floe y3 ylso mlfiainwt i t94 at min 1 with amplification of cycles 30 by followed min, 3 for C u n 6 ITS2 (6) and cut μ .Frteapicto fIS,w de 1 added we ITS2, of amplification the For L. μ μ fcl 0 tao eeadd n h ue eecnrfgdfr10 for centrifuged were tubes the and added, were ethanol 70% cold of L rbc fec rmra 10 at primer each of L μ f5 of L L+ITS2 μ fmliQwater. milli-Q of L × atp eunigbffr 0.5 buffer, sequencing u,bsdo h eue apigue ntescn matrix. second the in used sampling reduced the on based cut, ul osdrn h opeesmln,icuigaccessions including sampling, complete the considering full, μ .Tesqecn ecin eepeiiae yadn 2 adding by precipitated were reactions sequencing The L. F- atp 4 H, μ fH-ifraie(hroFse)wr added were Fisher) (Thermo formamide Hi-Di of L μ psb ,05Uo a oyeae(hroFse)and Fisher) (Thermo polymerase Taq of U 0.5 M, rbc ° rbc .Atrdsadn h uentn,125 supernatant, the discarding After C. Auto K- L+ITS2 n T2ainet,icuigatopology a including alignments, ITS2 and L psb rbc Kth&Sade,21)frec marker each for 2013) Standley, & (Katoh and I μ f10 of L L u,cnieigol pcmn with specimens only considering cut, ° ulad()ITS2 (4) and full )ad1mna 72 at min 1 and C) trn H- μ μ rmr 0.5 primer, M fDS ntereactions. the in DMSO of L psb μ f5 MMgCl mM 50 of L ) epciey o the For respectively. A), ° )fr3 i,adthe and min, 30 for C) μ f6%isopropanol 65% of L ul ae nthe on based full, rbc ° ° ,wt final a with C, .Fnly the Finally, C. μ L, fBigDye of L ° ,1mnat min 1 C, rpo ° .After C. 2 B, 1 , μ of L rpo μ μ L L Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. 12. MMri oms 03,wt nojc ul rmteAV uae eso,uigdt from data using version, curated ASVs the from Phyloseq Results built package of R plots. object the threshold sampling an with similarity and with performed assignments minimum 2013), were default Holmes, analyses & the downstream (McMurdie Additionally, and 2017). our v1.26.1 0.95 al., using of et BLASTn (Frøslev co-occurrence (-perc 84% with relative settings coverage minimum taxa were was and ASVs, to similarity reads model minimum erroneous assigned error -qcov and on were based FASTX-Toolkit The bases, and database, sequences 10. reference selected representative 95 = local removeBimeraDeNovo. filterAndTrim, a trimRight and randomly as the using 20, created, parameters: library million with = was ITS2 removed following 100 trimLeft table were 2, on the sequences ASVs processed = chimeric based An and Identified truncQ learnErrors derepFastq. 2 2, using function = were of dereplicated were maxEE the error 0, tags with = and expected calculated run maxN primers maximum filterAndTrim 90, the of = with using sequences trimmed minLen DADA2, and determination). sequencing and filtered OTU in were Splitter reads to function Barcode low-quality (equivalent Then, FASTX single-end (ASVs) Clipper. FASTA/Q with variants with removed demultiplexed sequence were amplicon the exact reads Firstly, infer and Fisher). errors from (Thermo sequencing composed platform was PGM plot) Ion Then, data the collection (http://hannonlab.cshl.edu/fastx trP1. using per ma- adapter sequenced following library the and Coulter), (one Raw with replicates (Beckman 1990), libraries PCR Beads al., different independent minor XP et four six AMPure with (White pooling the sample. Agencourt by ITS4 of before, per kit and Each as the mL) A, instructions. with conditions Ion (15 nufacturer’s purified adapters PCR buffer were the same extraction products with the PCR and 2010), followed g) al., et region (8 (Chen ITS2 tissue as 1 followed the leaf including polysaccharides of of modifications, of amounts amplification precipitation the selective the and for Likewise, CTAB previously except using as above, extraction stored mentioned then DNA and for buffer, procedures saturated cm CTAB-NaCl The 1 2% approximately the with of mL young 30 of described. containing pieces tube locality, Falcon sampled mL each 50 For Table 27 (September Supplementary 2016). season vegetation; al., dry rupestrian et the of open end and the groves near (forest S6), the types in vegetation diversity different plant monitoring markedly for two samples bulk with analysis metabarcoding canga using of potential the assess To accession the under database analysis referred S1. Metabarcoding Table the species in Supplementary database the deposited the BOLD were for in produced the published sequences listed in as previously all numbers performed and barcode such work, were any this 38.1%), searches was in Besides, there analysed spp.; whether (2011). (207 in check al. to barcode species, et (http://www.boldsystems.org) a Burgess other following the with of resolution, of specimen accessions species combination one with the whether than compared tested when more species higher with the or species for scabra sequence Prime similar of available Geneious either one cases in just were the of plugin identities cases presented BLASTn pairwise the sequence the in cific query as itself, using given such species 61.9%), a (2011), query the spp.; whether as with (336 assignment al. (both only correct identity et herein a pairwise Burgess considered obtained 100% markers we sequences in different Thus, the eight described (Biomatters). the with as 2019.2.3 of performed database), species) were assigned local searches correctly and BLAST of percentage all-to-all the barcodes, (as as resolution barcode the test To analysis Barcode esmldaldsenbepatseieswti napoiae1 aisi i lt,including plots, six in radius m 10 approximate an within specimens plant discernible all sampled we , Hffan.e om cut)KShm SplmnayTbe 1adS) diinly we Additionally, S3). and S1 Tables (Supplementary K.Schum. Schult.) & Roem. ex (Hoffmanns. s 0.Fnly eue h UUcrto loih ihdfutstig ocollapse to settings default with algorithm curation LULU the used we Finally, 70). hsp advlatenuifolia Mandevilla × B-A ue Smrko ta. 03 n sn h rmr ITS2-S2F primers the using and 2013) al., et (Samarakoon buffer TBT-PAR oli)adteRpcaeDD2(alhne l,21)t correct to 2016) al., et (Callahan DADA2 package R the and toolkit) rbc n T2broe ( barcodes ITS2 and L JCMkn odo Aoyaee,o hnteintraspe- the when or (Apocynaceae), Woodson (J.C.Mikan) th n 28 and 5 th 07 htlssfo a oOtbr(e Viana (see October to May from lasts that 2017) , rbc +T2 ol infiatyincrease significantly would L+ITS2) 2 eecletdi a in collected were Mandevilla identity Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. lcrpeorm ihmn uepsdpasdet h rsneo oouloierpas sobserved as repeats, mononucleotide was of generated commonly presence problem results the This sequencing to reads. of which in due both for reads amplicons peaks regions, or in and intergenic one superposed DNA cpDNA many either the quality with for of good electropherograms results cases yielded sequencing the in that poor evident presented species more several ranges size of expected samples their that however, noteworthy, is S1). It ( Table species (Supplementary four barcoded from this successfully samples from in were five specimens adopted sp.) of 19 total protocols of a “universal” out only the that, na, genera, Considering with eight to problematic. and process Samples strikingly species due to were S1). 13 challenging barcoded Table species (Supplementary more be sequence and markedly not barcode work, were valid could taxa a (6.43%) some with species (Sup- representatives from were levels without Trigoniaceae 37 quality and families required from Hydrocharitaceae minimally only Dilleniaceae, with (4.16%) the reads Begoniaceae, samples sequencing Balanophoraceae, generating 49 S1). in Table of or plementary out PCRs total 115 the with a for either sequences hand, problems obtained other we the families, On sampled S1). Table the (Supplementary Considering (95.83%) S5). 120 repre- and did of and S1 (10.22%) speciose Tables from here (Supplementary being barcoded ) work them species present of several the with with in genera database, time 323 the BOLD first da- in the the (95.84%) the BOLD families in of for specimens sentative available the out barcoded sequence 1,130 in 33 were any totalling record have addition, species not (93.56%), previous those In for species of S1). searching (63.94%) sampled Table After 344 (Supplementary 575 that S1). the Table observed of we (Supplementary plus tabase, out barcodes markers, eight 538 DNA with 2,729 for test and markers initial eight only the using the in barcoded used of samples specimens remaining 645 534 the the considering sampling, complete S3). our and barcoded From all S1 Tables Almost (Supplementary S3). and barcodes S1 either both Tables of presented (Supplementary sequences respectively least, of samples), at 425 had, sequences species and obtained (393 we species samples, 140 534 and additional the considering Afterwards, rajaensis (Alismataceae), Britton (Mart.) species: six only ericae for the markers barcode eight all – Fabaceae – Picramniaceae W.W.Thomas, ( eight F- the of Out S1). tenuiflora Table Supplementary (Table cangae 1; coverage (Table jasia species regions highest neither two with the last species these with especially regions, sequences, barcode generated hand, best other the the be On F- sampling to S1). the complete 286) Table species, the (490; Supplementary 370 to 1; ITS2 and similar and samples were with species) 645 respectively) only 304 using 93.24%, barcoded assessment and specimens markers (94.73% the eight including success the barcoding with of test proportions initial the only Considering barcodes of success sequencing and Amplification n oca ( and utcapotamogeton Justicia atp atp ioi heliotropoides 16 127), (176; H ;and H; krp Rtca)(upeetr alsS n S2). and S1 Tables (Supplementary (Rutaceae) Skorupa ..ornc .ua,Mraee– Myrtaceae E.Lucas, A.R.Louren¸co & ..neso (Marantaceae), C.L.Andersson mat atp uica – Rubiaceae , Actinocladum eintspalmata Hemionitis F- K, atp rpo rbc psb eune of sequences H idu cnhca – Acanthaceae Lindau, o T2broe,fiehdsqecso utoeo h eann akr ( markers remaining the of one just of sequences had five barcodes, ITS2 nor L canga K- and B psb Triana, rpo , Hildaea uha seaee( Asteraceae as such , 15 5 and 95) (125; I rpo B; rpo aqeotatamnifolia Jacquemontia inni minima Sinningia . trdca – Pteridaceae L., oeohl crassipes Noterophila 1 SplmnayTbeS) diinly eotie eune of sequences obtained we Additionally, S1). Table (Supplementary C1) and B pme cavalcantei Ipomoea , rbc Paratheria roalncinereum Eriocaulon atp rIS 57oto 3 p. 85%,fo hc 9 (75.71%) 399 which from 98.50%), spp.; 535 of out (527 ITS2 or L rbc trn eihl integrifolia Aegiphila mat F- rbc n T2 u eut lal showed clearly results Our ITS2. and L mat H- atp rpo , Cavalcantia psb n T2 eotie ai eune fa es one least at of sequences valid obtained we ITS2, and L K, ..ruo&Catm,Gseica – Gesneriaceae Chautems, & A.O.Araujo trn 6 ;and H; 14 126), (154; K rpo C1; 19 7 rsne osdrbylwrnmesof numbers lower considerably presented 77) (109; A Nui. ree ...oh and M.J.R.Rocha & Kriebel (Naudin.) H- o ntne(upeetr iueS13). Figure (Supplementary instance for , L)Gie.(ovluaee and (Convolvulaceae) Griseb. (L.) 1and C1 tcyapeaglabra Stachytarpheta psb en latifolia Senna .r (Eriocaulaceae), R.Br. ) hl he a oeta n barcode one than more had three while A), , , Monogereion Raddiella atp Jc. odne(Lamiaceae), Moldenke (Jacq.) rpo F- 16 119), (136; B atp elcagrossularioides Bellucia , GMy)HSIwn&Barneby, & H.S.Irwin (G.Mey.) Rhytachne H; rbc , Parapiqueria irmi ferrea Picramnia hm,Vreaee– Verbenaceae Cham., n T2frohr183 other for ITS2 and L rpo eatimtenellum Helanthium rbc and 1(5;126), (156; C1 53samples; (503 L Trichanthecium ioapsca- Pilocarpus and rpo Tibouchina B; Praxelis L)Tria- (L.) iai& Pirani Ctenan- Myrcia Cara- atp atp ) Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. mn ra a ut aibe ihms bevdseisbigascae ihasnl olcinpo,such plot, collection single taxa a of with distribution associated the endemics being Lamiales general, species the Gentianales, In observed most Fabales, S7). as Table with Malpighiales Asterales, variable, Supplementary S7). quite by 3; Table was (Figure followed identified Supplementary areas each among species, 3; 34 species nine (Figure three in with orders with resulting 14 order, respectively, and and representative coverage, families most and species in 21 the the observed similarity genera, was to were sequence 33 classified ASVs to of ASVs different 41 belonging 70% 508 into species, of and grouped total being 95% A then considering bp. filtering, level, 314 sequence after of high- analysis 2,269,135 length step, average metabarcoding control the an quality the Caraj´as. yielding of After dos remained, samples Serra reads the composite in quality the S6) from Table reads (Supplementary raw plots 4,465,309 six generated the sequencing amplicon in high-throughput as ITS2 observed, The also was resolution, barcode without analysis Metabarcoding but trees situation, all both opposite in of as the with monophyletic Correspondingly, case such were S3). the analyses Table species phylogenies, (Supplementary BLAST the trees the which the six in all in in as in DNA resolved polyphyletic recovered identified the as were not correctly appearing by analysis identified were was resolution correctly which resolution) species barcode (Fabaceae), the barcode the of (with in some identified Nevertheless, barcodes S3). correctly Table species (Supplementary monophyletic the of most 2). Additionally, (Table respectively) 76.12%, and 67.91% with ITS2 the rbc than (66.02%) species The 2). (Table phylogenetic resolution recovered higher species presented marker monophyletic ITS2 of rbc the ITS2 percentage matrixes, (Table for reduced the (75.16% 34.10-38.87% and in resolution between complete variation both ranging for wider Considering close, 62.30% S3). a were from Table observed ranging accession we matrix, one hand, S1-S6), each than Figures by other Supplementary more the 2; with (Figure On matrixes species 2). used of six the proportions from the obtained trees phylogenetic and the (67.05% Among species sampled one than for resolution more Phylogenetic 95.65% with and genera (91.83% the barcoded resolution single from both for a of species of by 86.67% levels with represented values comparison higher genera resolution in most considerably for respectively) the observed Likewise, also specimen, S3). Table we ( barcoded Supplementary available species, 2; single accessions (Table two spp.) a least 153 at with than for with more species with species or the the one for of only with case lower, species ( were the markers comparing levels In when resolution pattern accession. barcode resolution one the barcode different regions, a two observed these we with only sequencing barcoded best since samples the additional with 534 markers with the tested sampling two complete the the of Considering resolution combined the hand, ( other results the On of 1). (percentage (Table resolution for barcode 97.40% of with The levels regions, observed all we almost markers, for for eight 90% above the species) with identified test initial the Considering resolution Barcode mat L L+ITS2 rbc rbc u)(al ) neetnl,tecnaeae arcspeetdcnrsigpten fphylogenetic of patterns contrasting presented matrices concatenated the Interestingly, 2). (Table cut) akrpeetdtelws acd eouin ih8.2 fteseisscesul identified successfully species the of 83.22% with resolution, barcode lowest the presented marker L ,9.0 for 93.70% K, rbc ,IS and ITS2 L, rbc rbc idri brachyphylla Lindernia +T2 a ihr(25% hnterslto fbt ein ln Tbe1). (Table alone regions both of resolution the than (92.54%) higher was L+ITS2) u 7.5)wshge hnbt qiaetidpnetmtie ( matrices independent equivalent both than higher was (79.85%) cut ih7.0 o 1 p. n T2wt 36%fr23sp)wr osdrbyhge than higher considerably were spp.) 283 for 93.64% with ITS2 and spp.; 317 for 77.60% with L n T2 respectively). ITS2, and L uhacarajasensis Cuphea atp rbc rbc F- +T2peetd7.0,8.5 n 60% epciey(al ) Moreover, 2). (Table respectively 86.06%, and 89.45% 75.00%, presented L+ITS2 atp ulad7.2 o ITS2 for 76.12% and full L+ITS2 ,9.6 o T2 26%for 92.63% ITS2, for 92.66% H, ulmti 7.6)(al ) ovrey h hlgntcrslto of resolution phylogenetic the Conversely, 2). (Table (75.16%) matrix full enl Lnenaee SplmnayTbeS3). Table (Supplementary (Linderniaceae) Pennell ulmti rsne osdrbylwrpooto fmonophyletic of proportion lower considerably a presented matrix full oreg(Lythraceae) Lourteig rbc rbc L n T2 hticue h 4 pcmn rmtets and test the from specimens 645 the includes that ITS2, and L ulad7.7 for 79.87% and full 7 rbc u)than cut) ih6.5 o 8 p. n T2wt 78.21% with ITS2 and spp.; 183 for 68.85% with L aaiuracavalcantei Parapiqueria psb rbc rbc K- 6.0 for (62.30% L trn psb L+ITS2 H- n 04%for 90.48% and I psb u Tbe2 Supplementary 2; (Table cut ,9.6 for 94.96% A, rbc rbc rbc n T2 although ITS2, and L ltrafalcata Clitoria L ..ig&H.Rob. & R.M.King L ulad6.1 for 67.91% and full u n ITS2 and cut rpo rbc rpo 1(al 1). (Table C1 n ITS2, and L ,93.65% B, Lam. cut, Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. ein hc slreyepoe steoca acd einfrtegop(cohe l,21;Badotti 2012; al., et (Schoch group the ITS for the region especially barcode fungi, official for the barcodes as DNA employed as largely used is successfully (Hollingsworth which been species region, have given related sequences a closely for rDNA-based among library Nuclear levels barcode polymorphism comprehensive lower a 2011). build its al., to designed account options et primers into known many taking the of even among nature universal choice flora, fully safest of the almost the most of the marker with hand, portions this other sequence problems the and methodological On amplify of 2017). to for al., reports observed et frequently several the Ghorbani as with Throughout been 2015; far, have regions. have although so intergenic we amplified, there tested preferred cpDNA As successfully regions barcoding, the the 2014). the were of DNA al., of samples cases plant et the many one of Erickson in that history been mainly 2008; emphasize recovery, al., has data to et sequence region paramount Lahaye unsatisfactory is intergenic of (e.g., it this studies fourths above, several since three related in surprising approximately markers for is in barcode results which alternative problems worse sequences, even sequencing valid obtained and/or alongside We generating regions amplification specimens. barcode with effort tested core samples, barcoding the two DNA our the extensive in the of most poorly that one the as fact as defined work the date. Although this despite to characterizes up that, time) Amazon emphasize first Brazilian to the the essential collected for for 160 were is barcoded 91 and it being which Par´a. families Thus, 538 from of 11 obtained, state additional of were Brazilian an majority project for the FCC the sequences on the the of surrounding hand, focusing for lists forest being other lowland published described some the the the from, species orchid On in in DNA elusive the 2019). included extract the of al., not to as many et species tissue such (378 Giulietti that of collections, diversity amount (see note species satisfactory Ilk.-Borg. to their the minimally from important of a only third is obtain known one to It approximately difficult 36.21%). covered and/or spp.; 2018; barcodes rare DNA al., 1044 our et of 2018), Mota the out al.; 2016; for et al., listed Salino et plants 2018; (Viana vascular of all project plants FCC Considering vascular the of diversity by 2018). vast inventoried al., evaluation the recently et This considering Caraj´as, 2018). Salino as 2011), al., dos initially et al., Serra we Lima et Therefore, the 2008; (Hollingsworth al., 2018). scalability et al., (Fazekas and et particularities regions minimalism establish Lima the barcode to 2017; considering DNA important used ecosystems, Kress, different most was 2016; the from al., of groups et methodologies eight taxonomic (Hollingsworth practical tested Ne- of taxa and 2009). range universal several wide al., acknowledged establishing in a extensively of et observed Fazekas in been difficulties barcoding applied inherent (see has DNA diversity the be species of plant despite to animal surveying implementation decade, for in The last than barcodes the (2011). DNA complex during DeWalt of forma- by more its importance out and after the slower pointed mainstream vertheless, was as became plants and (2003), old for al. quite approaches et is sequences Hebert Amazonian DNA by the using lization species of identifying flora of the practice for The library barcode DNA reliable a Establishing Discussion areas. different two least at of and (Rubiaceae) Gomes (Convolvulaceae), (Myrtaceae), DC. hand, other and (Asteraceae) canga ysnm stipulacea Byrsonima eaacarajensis Perama pce tl akbroe,tettlnme fseisbroe een(ih34otof out 344 (with herein barcoded species of number total the barcodes, lack still species oseoi affinis Forsteronia oulaegleri Moquilea orlaliliastrum Sobralia rbc n T2a h etmrescvrn h rnilso standardization, of principles the covering markers best the as ITS2 and L .us (Malpighiaceae), A.Juss. Pac)Stes&Pac (Chrysobalanaceae), Prance & Sothers (Prance) ..ikr Rbaee Fgr ;SplmnayTbeS) nthe On S7). Table Supplementary 3; (Figure (Rubiaceae) J.H.Kirkbr. M canga l.r.(Apocynaceae), ull.Arg. ¨ rbc az.e id.(rhdca)cudb dnie rmsamples from identified be could (Orchidaceae) Lindl. ex Salzm. ee biul nldn h rmrpi eue ee makes here, used we pair primer the including obviously gene, L ucoso h er o aa´s n 9fo te localities, other from 69 Caraj´as, and dos Serra the of outcrops canga mat 8 yteFCpoet(in ta. 06 oae al., et Mota 2016; al., et (Viana project FCC the by eg,Fzkse l,20;CO,20;Lue al., et Liu 2009; CBOL, 2008; al., et Fazekas (e.g., K trn Croton H- pme marabaensis Ipomoea psb lirhslongipedicellata Uleiorchis rbc p (Euphorbiaceae), sp. ,wt esta 0 forsamples our of 20% than less with A, CO,2009), (CBOL, L ap rupestre campo ihri brasiliensis Richardia ..utn&Secco & D.F.Austin uei flavescens Eugenia mat on .ads & A.Cardoso performed K canga canga are Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. utemr,tedsrmnto eesotie o ohmres(eaaeyadcmie)wr naccor- in were combined) and (separately markers both for observed obtained were levels support discrimination the bootstrap Furthermore, low and groupings S1-S6). several non-monophyletic Figures of as (Supplementary of signals, populations cases phylogenetic the within Therefore, In conflicting or 2019). 2017). al., among Kress, et 2016; either Giulietti and (Fabaceae) by al., relationships described et (as evolutionary of taxa Miller discrimination complex other endemic 2015; several species of al., enables of indicatives et data levels observed Kress barcoding the 2014; DNA relationships assessing with al., evolutionary reflects besides reconstructions et that phylogenetic in (Erickson volume parameter of large inferences a identification) inclusion employing a analytical of species the handling importance as when consequently the obvious, especially hand, (and is other practical, assignment the and On straightforward sequence data. quite of correct is approaches a barcoding determine DNA to search) BLAST and 75.16%, tfis lne h acd eouinmyse oeatatv prah sntcal ihrvalues higher ( noticeably combined as and approach, individually attractive both more markers a best seem two may and the relationships resolution for phylogenetic barcode obtained (as the considers were drawbacks approaches. evaluation and glance, which advantages both first -based, with use At is to both resolution) preferred 2009), one we (barcode al., second Therefore, BLAST et al., below). the using Gonzalez et species discussed search-based and (e.g., Vere identifying is tree-based) 2011), correctly (de resolution, first of al., barcoded (phylogenetic capability The already et the markers. plants assess Burgess barcode seed to the (e.g., DNA approaches of than of main species larger two resolution) times 1,143 basically (species 152 of are area catalogue there an Also, whole with 2012). but the country, with small Caraj´as, and a of is fact in which Wales, fteAao rntn ed,frisac,myapa ob oelmtdi cp hnsuyn the studying 144 than ca. of scope area in original limited an more with However, be regions. to geographic appear broader first may or The instance, countries the km assume. whole as for DNA such of may coverage. fields, area, diversity sampling one ironstone geographic plant and area well-delimited as Amazon study a straightforward the the within as to of floras alt- related not local important, are specific is considerations undoubtedly important) barcoding is analyses most approaches different the barcoding perhaps from DNA (and in results discrimination comparing species hough of levels the Assessing 20 January barcoding. to DNA resolution of (up Species principles database three far, BOLD all of so covers the with inventory angiosperms basin in together effective Amazon for available regions ITS2 an sequences barcode implementing for 340,000 used of essential frequently ca. behind is most the only the repositories of 2021), of public 26.7% one in for been marker has accounting given ITS2 samples and the a diversity, of 81.33% of plant tested and second-best sequences species the the of as to of availability ITS2 81.04% close of Kuzmina for relatively usefulness 2010; barcodes barcoding, the valid performing al., plant showed with et recovery, and for data (Chen sequence our regions samples of Likewise, best authors. terms DNA 2018). the in quality the al., of marker lower by et one for Ramalho barcoded even as 2012; success sequencing successfully indicated ( al., sequencing been being et pseudogenes poor of has trees rate and obtained region Amazonian high ITS2 paralogs instance, a sampled smaller to presenting for the the rDNA related hand, (2009), of other ITS1, issues 41% al. the – only to et On Nevertheless, regions with due Gonzalez 2018). ITS, three mainly Rossell´o, 2007). resolving for its al., plants, & results (including et Feliner for for ITS Vasconcelos 2003; rare informative complete Wendel, 2017; not highly the & al., are of ITS as et ITS2) recovery the Saha and regarded sequence of 5.8S 2015; with frequently potential al., problems enormous also of et the are reports Liu emphasized which (e.g. have authors relationships barcoding, Several phylogenetic plant 2019). al., for et components Wurzbacher 2017; al., et 2 rbc SuaFloe l,21) the 2019), al., et (Souza-Filho +T2–8.6) hncmae ihtepyoeei eouin( resolution phylogenetic the with compared when 86.06%), – L+ITS2 rbc orraelaiosulcata Borreria +T2–6.2) oevr sn arieiett o te eae aaeeso a of parameters related other (or identity pairwise using Moreover, 66.02%). – L+ITS2 rbc and L mat ,wt 58 n 16,rsetvl.Tu,w eiv htorchoice our that believe we Thus, respectively. 31.6%, and 35.8% with K, rbc canga ..arl&LMMge Rbaee,frisac,tetespresented trees the instance, for (Rubiaceae), L.M.Miguel & E.L.Cabral rbc 9.5 fteseisad7.8 ftesmls.Ovosy the Obviously, samples). the of 79.38% and species the of (91.45% L spiaybroe o hshgl ies oafudi the in found flora diverse highly this for barcodes primary as L fCrja abu ogl smn pce fvsua lnsas plants vascular of species many as roughly Caraj´as harbour of 9 rbc n T2we acdn h C,w also we FCC, the barcoding when ITS2 and L ioaskinneri Mimosa rbc 50% T2–89.45%, – ITS2 75.00%, – L rbc ap rupestre campo ap rupestre campo var. 23% T2– ITS2 62.30%, – L carajarum on on Barneby Alvarez ´ canga canga th , Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. h eeec N acd irr Asse l,21) hs aems etknfrisuefridentification for use its for taken be must of care level thus, 2020). completeness 2018), al., the et al., on Bush et 2019; depending (Alsos al., affected library et considering greatly barcode Zinger be DNA especially 2017; biodiversity, reference can al., the metabarcoding monitoring et of (Deiner for effectiveness system metabarcoding analytical the Certainly, this DNA of pointed using efficiency have authors of and Several robustness advantages 2017). on al., many based et approaches the Deiner identification 2015; out al., multispecies et the of (Kress along development technologies ground the sequencing gained high-throughput with consistently importance has with further tools library achieving DNA-based barcode years, through DNA species inventorying a above, Amazonian protocols accomplishing mentioned the towards directed As of step more flora next developing our the for species be for need barcoded coverage will crucial four which full the the taxa, for acknowledge for problematic we than here at Thus, obtained better aiming used. we amplification slightly results protocols of only the universal were rates region, the problem ITS2 low a ITS such with reported the overcome the Melastomataceae 41 could using also authors that by of of these (2018) markers S˜ao Although mention total al. of Paulo. state a to the et Brazilian with with the important Lima of FCC, 2018). flora is the the al., in It in families et family species). sequences Mota angiosperm the generate 2017; for diverse of to success al., most being able et 30.77% fifth were frequently (Rocha (and the we taxa species performance as specimens which worst some recorded sampled for the was with Melastomataceae, of with Melastomataceae groups, far, group 26.32% taxonomic The by for was, protocols. of only adopted universe range would the sampling species. wide on procedures our 36 isolation depending a of within DNA others, samples from throughput other than samples high 47 problematic we for that with more barcode, observed occurred work as already DNA markers, always had a tested (2011) not without the al. species of et any of endemic Burgess of specimens eight Likewise, sequences two the DNA only From obtain of (2019). not samples al. could tissue et 2019; to Giulietti al., access by et had listed the Babiychuk species of 2018; plants 38 al., endemic the of et list (Lanes 2017; the analyses al., Considering ). et populational (Babiychuk review further time under first by al., the followed et for Dallapicola 2018), status diversity al., genetic et their investigating Nunes the data in barcoding distribution DNA geographic and/or on limited endemic very to morning-glory attention a present extra the the the pay species as of to vegetation of case especially essential Amazon “one is the are properly unique (and it immense being a in goal Hence, (2017), an such important basin. task, such of Kress Amazon in an easy species the planning by such rare conservation as an achieving proper area out been ensure known) in to pointed poorly not difficulties needed still and actions has The the decade”. before barcodes considering next by DNA mentioned evidenced the plant as et for with (Forest However, be challenges databases process can biggest 2021). decision-making public hotspots al., the the biodiversity et in populating of parameters Diniz services diversity 2007; ecological directing phylogenetic better al., maintaining including in of by role effectiveness enhanced important undeniably greatly the an as have data efforts, barcoding conservation DNA more by provided either indexes present Biodiversity that genera conservation Such and/or and 2). species barcodes Table from DNA 79.85%; taxonomy. specimens to problematic 66.02% of or (from phylogenetic exclusion histories sampling the evolutionary the complete of complex to the case due with overly the comparison occurred barco- were in in difference both here sampling strong with reduced used especially specimens the approaches was in only al. combination difference discrimination considering et the This species analyses Gonzalez of values. both the by resolution resolution that as reported higher noteworthy, fact as provided is floras, The des coverage diverse (2015). sampling other al. to for et sensitive observed Liu than and higher (2009) relatively although 2013), for al., results previous with dance rbc and L mat hnbroigte pce fMlsoaaee n ftems species-rich most the of one Melastomataceae, of species tree barcoding when K rbc rbc pme cavalcantei Ipomoea +T2 iha nraeo 09%i h rprino eovdspecies resolved of proportion the in 20.95% of increase an with L+ITS2, n T2(.. rs ta. 09 ugs ta. 01 amniret Parmentier 2011; al., et Burgess 2009; al., et Kress (e.g., ITS2 and L canga canga . fSradsCrja,w bandbroe o 0otof out 30 for barcodes obtained Caraj´as, we dos Serra of 10 lrm carajasense Pleroma n h quillwort the and ap rupestre campo canga rbc Guitie l,21) ihsuisbased studies with 2019), al., et (Giulietti oebroe pce) considering species), barcoded (one L on see cangae Isoetes canga Mlsoaaee,frwihwe which for (Melastomataceae), fteSradsCaraj´as, as dos Serra the of o ntne Both instance. for , Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. aot,F,Oier,F . aca .F,Vz .B . osc,P .C,Nhm .A,Oier,G,& G., Oliveira, A., identification L. the Nahum, for C., markers L. barcode P. Fonseca, DNA M., as B. sub-regions A. and Vaz, ITS F., of C. Effectiveness Garcia, F., (2017). canga S., J. G´oes-Neto, A. Amazon F. Santos, Oliveira, from Geography F., F., species (2019). E. S. Badotti, glory Silva, Kushnir, morning & A., diversified I., L. florally L. between Romeiro, V. Fonseca, isolation F., savannahs. O., reproductive J. T. for Siqueira, J. essential A., Guimaraes, Castilho, is F., L., D. Tyski, Silva, S., G., Vasconcelos, J. the Teixeira, F., of E., J. history Babiychuk, Natural Santos, (2017). L., G. G. Oliveira, Nunes, & L., N., 7 V. Carvalho-Filho, Imperatriz-Fonseca, C., A., endemics Castilho, M. narrow F., Dias, D. S., Silva, Vasconcelos, L., Tyski, S., Kushnir, (2018). E., L. 115 F. Babiychuk, America, Condamine, of & States D., biodiversity. United Silvestro, Neotropical C.D., the of Bacon, of source R., primary Scharn, the inference. A., is phylogenetic F. Amazonia Carvalho, plant A., and Zizka, sequences A., Antonelli, ITS Ribosomal vegetation. 29 (2003). Evolution, contemporary F. and J. the Wendel, represent & it I., does Alvarez, How Sj sediments: T., lake Jørgensen, 13 of G., ONE, N. metabarcoding Yoccoz, DNA den Y., Van Plant 114 Lammers, & America, biodiversity. G., X., of forest I. Zhou, States tropical Alsos, United S., on the Xu, disturbance of J., habitat Sciences Xu, of of Mwale, S., Academy Effects A., Wood, metabarcoding. D. (2017). J., and Liftmaer, J. J. F., barcoding Alroy, Wilson, Leese, DNA D., D., in I. Trends Steinke, Hogg, (2019). X., L., M. Pochon, B. Bank, M., Fisher, F., A. Chain, Naaum, S., M., J. Boatwright, the J., all S. and Adamowicz, Lopes Manoel technician laboratory the References initiative. thank PFC to the like in would involved supported authors taxonomists were The (381271/2018-8) fellowships. TGL and CNPq (443365/2015-6) by MP (380290/2016-2), (380502/2016-0 MCD LVV the 380517/2017-5), (307479/2016-1), and GO and (Consel- (305301/2018-7), the CNPq DZ and by (RCUK/BB/P027849/1). funded 307479/2016-1) project is 402756/2018-5, CABANA Cient´ıfico; GO 444227/2018-0, Desenvolvimento and de RBRS000603.85) Genomics, Nacional (Biodiversity ho Vale fast by as funded was well work as This region, be the will in dos region areas. sites Serra the native Acknowledgements mining of for untouched decommissioned plants libraries in in the surveys barcode vegetation reforestation for DNA robust scale of the and large optimization of development a Besides, the the on for 2018). implemented Hence, essential and al., taxa and 2019). (eDNA) et plant established al., DNA (Ramalho some environmental et being Caraj´as caves of (Oliveira with currently approaches Caraj´as ferruginous has importance are metabarcoding dos the sampling in using Serra identify bulk communities projects to the animal monitoring helping in for diversity by efforts plant providers context conservation nutrient ecological guide as the to acting in data Caraj´as was also barcoding dos demonstrated DNA Serra been library. in of barcode value species DNA plant the incomplete monitoring Furthermore, yet for a the ITS2 having of with despite (30.75%) metabarcoding third several demonstrated, one DNA dos the successfully with than among of Serra areas less and validity from of within in the diversity coverage samples Thus, taxonomic especially a bulk high with available, the relatively even is a for sites, observe library collection here could obtained we barcode as results complete promising, the very Caraj´as a Nevertheless, were until endemics. level distributed species narrowly at specimens of ´ 7493. , e0195403. , cetfi eot,9 Reports, Scientific pme cavalcantei Ipomoea 18052. , 417-434. , 6034-6039. , and .marabaensis I. 11 rmAao ag savannahs. Canga Amazon from 6056-6061. , rceig fteNtoa cdm fSciences of Academy National the of Proceedings ge,P,Gel,L,&Ewrs .E (2018). E. M. Edwards, & L., Gielly, P., ¨ ogren, eoe 62 Genome, canga v-vii. , rceig fteNational the of Proceedings pce ihIS barcodes. ITS2 with species cetfi Reports, Scientific Molecular PLoS Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. .M,Hjbbe,M,&Bret .C .(08.Mlil utlcsDAbroe rmteplastid the from barcodes DNA multilocus Multiple (2008). H. Percy, C., C. B. well. Husband, S. equally G., Barrett, S. species Newmaster, plant & W., Kress, discriminate M., S. & genome Graham, K., Hajibabaei, R., P. J. M., Kesanakurti, Zimmerman, S., D. K. D., Mi, Burgess, Xing, M., dynamics J., Meegaskumbura, W., A. forest Fazekas, K., Ye, multiple Ma, T., across Hao, A., structure A. approach. X.-J., J. phylogenetic Wolf, mega-phylogeny Ge, and Lutz, J., a J., diversity Y., S. plots: evolutionary S. K. Davies, Comparative Wright, S. W., (2014) I., Lum, Chen, J. Fang-Sun, J., A., W. D., N. A. Bourg, J. Larson, N., Parker, C.-L., DNA Pei, Huang, X., Advancing G., W., N. (2018). R. Swenson, J. Howe, A., Z., – F. A. Jones, collections Lowe, Encinas- L., herbarium & S., D. of Erickson, M., Caddy-Retalic, analysis Waycott, M., systematic M., Byrne, requires F., J. perspective. plant M. Young, Australian for Breed, an A., Tropical applications E., of Shapcott, metabarcoding Biffin, and G., dynamics L., P. barcoding Phylogenetic K. Nevill, Bell, (2021). K., F., A. Dijk, Viso, A. van J. E., E. Meira-Neto, Dormontt, & J., Thiele, M., Forests. Gastauer, Atlantic S., view. E. of Diniz, point taxonomic a metabarcoding: 30 barcoding: DNA Society, Environmental DNA logical (2017). (2011). L. E. communities. Bernatchez, R. plant I., & DeWalt, Bista, and E., S., animal M. Creer, survey Pfrender, F., we Altermatt, N., how A., Vere, Lacoursiere-Roussel, transforming de M., Davies, M., Seymour, D., E., D. Satterthwaite, Machler, DNA Lodge, W., M., (2012). H. C. Wales. J. Bik, M. of Moore, K., Wilkinson, conifers C., Deiner, & Long, and K., plants A., Walker, flowering H., S. native Trinder, Garbett, the R., T., Tatarinova, C. barcoding S., Ford, Conservation Amazon. Ronca, G., review) Eastern J., C. (under Allainguillaume, the G. T. H., from Calder´on, Oliveira, Rich, P., E. quillwort F., N., M. endemic Vere, Santos, C. narrowest A., de Caldeira, the Castilho, A., in G., F. structure M. Esteves, genetic F. N., Guimar˜aes, Santos, of R. J. S., D., Fonseca, implications B. Scherer, L., J. N., R. Pereira, Li, T. L., identifying Martins, Fernandes, Y., G. for N., C., Li, Nunes, barcode M. K., S., DNA Dias, E. Luo, novel F., Pires, X., a T. S., Pang, Vasconcelos, as Jaff´e, T., R., A., region Alves, Gao, ITS2 J., X., the Dalapicolla, Ma, of Validation Y., (2010). Zhu, C. L., species. Leon, Shi, plant & J., medicinal Y., Song, Lin, C., X., Liu, Jia, J., X., Han, 106 H., America, Yao, S., plants. of Chen, land States for United barcode the DADA2: DNA of (2016). P. A Sciences S. (2009). of Holmes, Group A., Working J. Plant A. data. Johnson, CBOL, amplicon W., G., Illumina A. Han, from T. J., inference M. M. sample Rosen, Wright, J., High-resolution P. S., McMurdie, Shokralla, J., B. M., Callahan, T. Porter, threatened a L., in D. dynamics metacommunity 117 Peters, reveals flora metabarcoding G., wilderness. DNA temperate wetland (2020). Z. local J. boreal D. Compson, a Baird, in & A., M., species W. Hajibabaei, plant Monk, Discriminating Percy, A., G., (2011). S. Bush, H. Newmaster, C. C., B. S. Husband, Barrett., W., the & S. using M., Graham, R., Hajibabaei, P. Target M., Kesanakurti, meet J., D. A. to Fazekas, (GSPC). collaboration S., Conservation K. and Plant Burgess, Innovation for 2020: Strategy Flora Global Brazilian the (2018) of Group 1 Flora Brazilian The BFG, (Fungi). Basidiomycota of 8539-8545. , rbc L+ mat vltoayEooy 35 Ecology, Evolutionary 174-181. , N barcode. DNA K LSOE 5 ONE, PLoS rnir nEooyadEouin 6 Evolution, and Ecology in Frontiers M irbooy 17 Microbiology, BMC rceig fteNtoa cdm fSine fteUie ttso America, of States United the of Sciences of Academy National the of Proceedings rnir ngntc,5 genetics, in Frontiers ehd nEooyadEouin 2 Evolution, and Ecology in Methods e8613. , 65-81. , LSOE 3 ONE, PLoS 12794-12797. , 42. , 12 Rodrigu´esia, 69 358. , LSOE 7 ONE, PLoS oeua clg,26 Ecology, Molecular e2802. , aueMtos 13 Methods, Nature 134. , ora fteNrhAeia Bentho- American North the of Journal rceig fteNtoa Academy National the of Proceedings 1513-1527. , e37945. , 333-340. , 5872-5895. , 581-583. , Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. okn,M .G 20) oeln h nw n nnw ln idvriyo h mznBasin. Amazon improvements 7: the version software of alignment sequence usability. biodiversity multiple and MAFFT plant performance (2013). unknown M. in D. Standley, and & known K., Katoh, the Modelling 34 Biogeography, (2007). of G. Journal J. with M. apart species Hopkins, plant Telling barcode. (2016). DNA D. A. plant 20150338. Twyford, a genomes. & , using M., to Bank, barcodes and der from van Choosing DNA: D.-Z., (2011). Li, P. M., P. D. Hollingsworth, Little, & encyclopedia W., 6 the S. ONE, reading PLoS Graham, to writing M., From P. (2016). Hollingsworth, M. Hajibabaei, & M., P. DNA life. through Hollingsworth, of identifications N., Biological D. (2003). P. R. Hebert, J. deWaard, & L., S. Ball, interactions A., collection. barcodes. Plant-pollinator Cywinska, historic (2019). N., a S. D. P. in Willows-Munro, Hebert, & bees D., from C. metabarcoding & Eardley, Th´ebaud, pollen C., H., A., time: Z. over barcodes. Roger, D. Ri´era, Siqueira, Caraj´as, DNA B., P´etronelli, Swanevelder, P., with of M., A., A., trees canga R. Gous, S. Amazonian Harley, the Mori, of M., of J., Identification R. plants Engel, (2009). vascular Brito, C., J. Amazon: Chave, U. S., Baraloto, Silva, the B. A., M., in J. M. Pastore, endemism Pereira, Gonzalez, L., Edaphic C., P. Viana, (2019). H. C., C. Lima, T. D. R., M. Brazil. Zappi, J. Watanabe, & Pirani, O., O., F., F. J. Imperatriz- N. M. occurrence-restricted Mota, & Siqueira, and C., M., S., T. A. crop-pollinator C. Saraiva, Giannini, Amazon: M., W., P. Eastern A. C. Giulietti, the Costa, affected. in L., more change tuberous Miranda, potentially Climate C., of are R. (2020). bees barcoding Borges, L. DNA F., V. W. (2017). Fonseca, Costa, H. C., Boer, T. Salep. Giannini, de in & used orchids S., of phylogenetic Zarre, identification analyzing 352. S., for for resource Selliah, tree a B., family Orchidoideae: Gravendeel, angiosperm A., Updated (2017). Ghorbani, A. A. structure. J. community Neto, and estimates. diversity (2017). Meira biodiversity J. & A. reliable M., Hansen, yields & Gastauer, C., data Pietroni, amplicon K., DNA A. of Brunbjerg, 8 R., Communications, curation Ejrnæs, post-clustering H., H. for the J. Bruun, Preserving Algorithm Manning, R., (2007). Kjøller, A., V. G., Balmford, Savolainen, T. P., hotspots. Frøslev, & D. biodiversity J., Faith, in A. M., floras T. R. of Hedderson, Cowling, potential G., J., evolutionary Reeves, utilization T. M., Davies, insightful Bank, M., Proche¸s, der for S¸., van C., Rouget, Guidelines R., Grenyer, know? F., you plants. Forest, in devil studies the evolutionary Better species-level (2007). 911-919. in A. ITS Rossell´o, J. nrDNA & valueof risks, N., forests: G. Amazonian than Feliner, Brazil’s discriminate in to service environmental harder an inherently conservation. as species and Biodiversity Newmaster, plant (2002). C., Are M. S. P. (2009). Barrett, Fearnside, markers? W., C. barcoding S. B. DNA Graham, Husband, using M., & species D. Percy, animal M., S., Hajibabaei, K. Burgess, G., R., S. P. Kesanakurti, J., A. Fazekas, hlspia rnatoso h oa oit :Booia cecs 371 Sciences, Biological B: Society Royal the of Transactions Philosophical h oaia eiw 85 Review, Botanical The rceig fteRylSceyo odnSre ilgclSine,270 Sciences, Biological B Series London of Society Royal the of Proceedings e19254. , niomna osrain 26 Conservation, Environmental 1188. , 1400-1411. , oeua ilg n vlto,30 Evolution, and Biology Molecular hlspia rnatoso h oa oit :Booia cecs 371 Sciences, Biological B: Society Royal the of Transactions Philosophical 357-383. , einlEvrnetlCag,20 Change, Environmental Regional caBtnc rslc,31 Brasilica, Botanica Acta oeua clg eore,9 Resources, Ecology Molecular 305-321. , 13 aue 445 Nature, oeua hlgntc n vlto,44 Evolution, and Phylogenetics Molecular 191-198. , 772-780. , vltoayApiain,12 Applications, Evolutionary oeua clg eore,17 Resources, Ecology Molecular 757-760. , 9. , LSOE 4 ONE, PLoS up 1 130-139. s1, Suppl 20150321. , 313-321. , e7483. , 187-197. , Nature 342- , , Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. oa .F . ilet,A .(06.Foao h agso h er o aa´s Par´a, Caraj´as, Brazil: dos Serra the of cangas the of Flora (2016). M. A. Giulietti, 2020 their & achieve O., botanists Gnetaceae. Brazil’s F. Can N. Online: Brazil Mota, of Flora ecosystem (2015). on N. impact M. change vision? E. Climate Lughadha, P., (2019). M. C. Amazonia. Morim, T. southeastern in Giannini, birds & by L., provided V. functions much Imperatriz-Fonseca, how matter? S., vegetation: it Amazon L. (2010). does Miranda, T. much R. how Pennington, and & M., know Hopkins, we D., don’t Sasaki, D., bioinventories by Zappi, and DNA W., taxonomy Advancing Milliken, plant (2016). from H. D. polysaccharides Janzen, barcodes. & of DNA W., Removal Hallwachs, with A., (1994). Hausmann, M. E., and S. R. Miller, analysis Amasino, interactive & reproducible C., for M. precipitation. package ethanol John, R D., S. an Michaels, Phyloseq: data. (2013). census S. microbiome Holmes, of & graphics China. barcoding J., DNA in P. of forests McMurdie, use subtropical The P., (2015). of G. X.-J. biogeography Frey, Ge, conservation & P., 188-199. the S., , Ragupathy, Dias, for N., tool G., Pei, G., a L. S. in as Newmaster, R. implemented H.-F., be Yan, Coelho, barcoding J., B., Liu, DNA Brazil. plant T. State, Can S˜ao Paulo Flores, (2018). from J. D., Chave perspective & 41 G. A C., Colletta, regions? V. tropical Souza, A., R., species-rich A. R. Oliveira, Rodrigues, A., F., A., Iribar, Prieto, A. N., R. P. Vargas, Lima, M., ecological the L. determines 113 A. change. heterogeneity America, Mendoza, climate Ecosystem of R., to (2016). States T. R. Amazon Andrade, Alvarez-D´avila, Feldpausch, P. E., the L., L., Moorcroft, S. of T. & Lewis, resilience Y., Erwin, L., Malhi, W., O. E., J. Phillips, J. R. A., Silva-Espejo, Brienen, Baccini, S., M., Vasconcelos, widespread C. Longo, W., a K., A. P. Zhang, Imperatriz- and Souza-Filho, M., narrow-endemic M., O., N. a A. Levine, J. of Giulietti, Siqueira, . assessment C., Amazonian R., conservation T. from genomic A. Giannini, glory Silva, Landscape M., morning G., (2018). N. Jaff´e, Oliveira, R. Carvalho-Filho, & S., W., R., S., Monteiro, Duthoit, Alves, 105 America, L., S., O., of V. N. Maurin, States Fonseca, Pope, G., hotspots. United C., biodiversity Gigot, the of E. F., of floras Lanes, Sciences the vascular Pupulin, of barcoding the J., Academy DNA Warner, of National (2008). Identification D., the V. Savolainen, (2012). of Bogarin, & N. M., G., D. Bank, T. P. Barraclough, der Hebert, library. van & barcode R., R., DNA Lahaye, a H. using Barron, Manitoba, L., Churchill, K. of plants Johnson, L., M. evolution, ecology, Kuzmina, for barcodes DNA (2015). L. D. . conservation. Erickson, & in and (2009). M., plot E. Garc´ıa-Robledo, Uriarte, dynamics J., C., Bermingham, W. forest & 106 Kress, O., Sciences, tropical Sanjur, a of R., Academy of Perez, phylogeny National G., community the N. a Swenson, of A., and F. barcodes Jones, of DNA L., Plant Journal D. future. Erickson, J., the W. in Kress, and today applications barcodes: DNA 55 Plant Evolution, (2017). J. W. Kress, 661-670. , Rodrigu´esia, 66 Rodrigu´esia, 67 291-307. , r nsi clg vlto,30 Evolution, & Ecology in Trends hlspia rnatoso h oa oit :Booia cecs 371 Sciences, Biological B: Society Royal the of Transactions Philosophical itcnqe,17 Biotechniques, 793-797. , 1115-1135. , 1191-1194. , LSOE 8 ONE, PLoS 274-276. , rnir nPatSine 9 Science, Plant in Frontiers 18621-18626. , rceig fteNtoa cdm fSine fteUnited the of Sciences of Academy National the of Proceedings e ultn 65 Bulletin, Kew e61217. , 14 25-35. , LSOE 14 ONE, PLoS M clg,12 Ecology, BMC 691-709. , e0215229. , 532. , eeisadMlclrBiology, Molecular and Genetics 2923-2928. , iest n itiuin,21 Distributions, and Diversity 25. , ytmtc and Proceedings Proceedings 20150339. , Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. iv,J .C,Rlns .B,&Fnea .A .(05.Teft fteAaoinaeso endemism. of areas Amazonian the of fate The (2005). B. A. G. Fonseca, 109 Fungi. & (2012). America, B., for 19 W. of A. Biology, marker Chen, States Rylands, Conservation barcode & United C., DNA A., the M. universal C. of J. Levesque, a Silva, Sciences L., as of J. Academy region Spouge, National (ITS) V., the spacer Robert, of Proceedings transcribed recalci- S., internal from Huhndorf, DNA A., ribosomal K. of Nuclear Seifert, amplification L., PCR C. Enhancing additive. Schoch, (2013). trehalose-based H. a M. using Alford, specimens & Eastern plant Caraj´as, struc- an Y., trant dos S. Serra secondary Wang, from T., lycophytes ITS2 and Samarakoon, Ferns and (2018). E. 5.8S T. range. Almeida, ITS1, subclades mountain & four Amazonian J., DNA A. of Arruda, Ribosomal characteristics A., distinctive (2017). Salino, reveal S. analyses Jha, phytochemical DNA and for & dos of content leaves Serra M., DNA of of nuclear preservation Sengupta, ture, field cangas S., of the P. means of a Saha, Flora as (2017). solution NaCl-CTAB L. Saturated P. analyses. (1992). Viana, H. & S. J., Rogstad, Meirelles, R., Goldenberg, J., Melastomataceae. Caraj´as, Par´a, C. Brazil: agroecosystem. an K. Applicati- in (2015). bees Rocha, M. honey by R. 3 collected Johnson, Sciences, pollen & of Plant K., provenance Goodell, in Jaff´e, the R., O., determine Applications C., J. to Quijia, light metabarcoding M. B., ITS2 sheds Dias, D. of barcoding conservation. Sponsler, on S., DNA cave C.-H., Vasconcelos, testing: for Lin, Blind T., subsidy C., R. (2018). a T. Richardson, M. as M. fragments A. plant Giulietti, Watanabe, of & L., identity G., G. the Oliveira, upon C., Nunes, T. C., Giannini, D. X., Prous, Zappi, J., ferns. A. and Ramalho, lycophytes B., extant for G. 54 classification Chuyong, Evolution, community-derived rainforest D., and African A of Kenfack, (2016). identification W., I the in PPG D. barcodes DNA Thomas, are effective M., How (2013). Philippe, J. trees? M., O. Hardy, Kuzmina, & C., J., Cruaud, assembly. Duminil, paired-end OverlapPER: I., and and Parmentier, PIPEBAR analysis (2018). barcoding R. Alves DNA & accurate G., Oliveira and L., 297. fast G. T. a Lima, in for L., genomics G. tools and Nunes, M., barcoding R. DNA R. (2019). Oliveira, S. Vasconcelos, fields. & altitude R., Amazon Alves, megadiverse Imperatriz- R., the Valadares, S., M., G., W. Vasconcelos, multidisciplinary Nunes, P. a G., C., Souza-Filho, Amazon: Oliveira, Silva, the A., Caldeira, from N., Castilho, Quillworts Carvalho-Filho, M., (2018). T., G. M., Fernandes, on Oliveira, A. M. & study G., R., Giulietti, H. M. populational Alves, C. O., F. F., J. Santos, Bandeira, Siqueira, Jaff´e, T. M., R., V., Fonseca, J., T. Guimar˜aes, J. Pereira, Rodrigues, M., M., F., Watanabe, R. E. M., R. Dias, E., Oliveira, Pires, L., hotspots Biodiversity G. (2000). J. Kent, Nunes, & plants. B., M., A. seed A. G. of Fonseca, Giulietti, priorities. G., list conservation S., C. the R. for Mittermeier, A., by Viveros, Caraj´as R. revealed J., Mittermeier, of Pallos, N., vegetation Myers, L., unique A. the Hiura, canga: C., D. Amazon Zappi, (2018). Rodrigu´esia, C., L. 69 P. T. M. Viana, Watanabe, & O., F. N. Mota, Protasparagus LSOE 8 ONE, PLoS ao,41 Taxon, 1435-1488. , . ora fSseaisadEouin 55 Evolution, and Systematics of Journal 563-603. , 701-708. , e54921. , see serracarajensis Isoetes 689-694. , aue 403 Nature, Rodrigu´esia, 69 apps.1400066. , 853-858. , Rodrigu´esia, 68 BLBroeBlei,9 Bulletin, Barcode iBOL 1417-1434. , and see cangae Isoetes 15 plctosi ln cecs 1 Sciences, Plant in Applications 997-1034. , 54-70. , . LSOE 13 ONE, PLoS 5498. , rnir nPatSine 9 Science, Plant in Frontiers M iifrais 19 Bioinformatics, BMC e0201417. , ora fSystematics of Journal 6241-6246. , apps.1200236. , 1052. , , Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary...... adGO;temnsrp a rte ySV n ...Tefia eso of version final The J.L., G.L.N. G.L.N., and authors. S.V., S.V. the by by all captions performed by written Figure approved was and and manuscript designed read the was were G.O.; manuscript S.V., analyses the by and bioinformatic executed R.A. were the specimens T.G.L.L., L.V.V., experiments M.P., E.S.P.; the R.R.M.O., laboratory A.L.H., and G.O.; the D.C.Z., A.M.G.; and M.T.C.W., J.L. and R.B.S.V., A.M.G. M.C.D., R.M.H J.L., V.L.I.F., A.O.S., M.C.D., P.L.V., A.S.B.G, S.V., G.L.N., P.L.V., by S.V., N.F.O.M., processed by or collected designed either and were conceived was data. study supplementary The he the in request. provided upon Contributions alignments available Author indicated sequence made numbers the be accession as will BOLD well data the as through other accessed S1, All Table be may Supplementary work the this in for generated sequences DNA experimental L., All robust Tedersoo, for J., need Pawlowski, – J., Accessibility metabarcoding E., Pansu, Data DNA Coissac, conclusions. L., Gilbert, (2019). ecological S., L., P. Orlando, sound Fumagalli, Creer, Taberlet, N., H., draw A., & Fierer, to Kauserud, F., E., A. designs A., G. Willerslev, Chariton, Ficetola, Jumpponen, F., J., F., P. S., A. & Boyer, Dumbrell, Thomsen, Jarman, C., H., A., context. P., I. T. Bik, Dickie, T. rupestre B´alint, M. M., M., M. G., campo Barba, Watanabe, De I. a F., E., Alsos, in B. N. Deagle, A., vegetation Mota, Bonin, canga L., L., Amazonian P. Zinger, for Viana, future T., a Meagher, 14 Plotting B., ONE, Walker, (2019). PLoS F., N. M. fungi. E. for Moro, Lughadha, barcoding C., (Eds.), repeat D. White tandem Zappi, E., ribosomal J. Kristiansson, Introducing T. S., (2019). Svantesson, & H. S., 19 Sninsky, Resources, R. Wyngaert, Ecology J. Nilsson, den Van & J. J., M., Gelfand, Bengtsson-Palme, Kagami, ribosomal H. E., fungal Larsson, D. of C., applications sequencing Innis, Wurzbacher, and direct A. methods and M. to Amplification guide (1990). In: J. a phylogenetics. Protocols: Taylor, for S., Lee, genes T., RNA Bruns, J., T. L., (2005). White, A. G. Kahl, Ilkiu-Borges, & M., K., methodology. applications R. Wolff, A. and and Harley, Giulietti, H., area C., & Nybom, study C., D. history, Caraj´as, K., Maurity, Par´a, Trov´o, Brazil: Weising, dos Zappi, M., M., Serra U. A., the J. Salino, of Santos, B., cangas C., the T. S. Rodrigu´esia, of M. 67 A. Watanabe, Flora Gil, E., (2016). T. O., M. Almeida, F. S., N. R. Secco, Mota, L., (2018). M. P. A. Viana, Benko-Iseppon, traditional & the G., Oliveira, among the B., relationships of T. phylogenetic groups Croat, phylo- the M., large on C. insights of Sakuragui, New post-analysis L., M. and Soares, analysis S., phylogenetic Vasconcelos, for tool a 8: version conservation. genies. RAxML biodiversity R., (2014). for W. A. challenge quanti- Jr., Stamatakis, a and Nascimento Amazon: Mapping C., Brazilian (2019). O. D. the 14 J. Santos, in ONE, Siqueira, savannas M., PLoS & outcrop A. L., ferruginous V. Giulietti, of Imperatriz-Fonseca, Jaff´e, R., F., fication C., M. Costa, T. biodiver- F., Giannini, Canga T. Guimar˜aes, M., (2014). J. J.O. W. Siqueira, P. & Souza-Filho, G., Tzotzos, mining. N., of Carvalho, matter C., a Chaparro, sity, A., Castilho, A., Skirycz, iifrais 30 Bioinformatics, Homalomena e0219753. , e0211095. , 1107-1124. , 2de..Bc ao,F:CCPress. CRC FL: Raton, Boca ed.). (2nd 118-127. , rnir nPatSine 5 Science, Plant in Frontiers 1312-1313. , ld (Araceae). oeua hlgntc n vlto,127 Evolution, and Phylogenetics Molecular oeua clg,28 Ecology, Molecular p.3532.SnDeo A cdmcPress. Academic CA: Diego, San 315-322). (pp. 16 653. , N nepitn npat:Picpe,methods, Principles, plants: in fingerprinting DNA 1857-1862. , Philodendron ugnr n h other the and subgenera 168-178. , Molecular PCR Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. otta aus(? 0 r hw bv h rnhso h cladogram. the of the with branches of analysis the likelihood B) above maximum shown in on are phylogram based 70) and regions ([?] A related values and in with Bootstrap Carajas (cladogram analysis cladogram. dos trees likelihood the Serra the Phylogenetic of of maximum of branches S10. on B) the Figure above in based Supplementary shown phylogram regions are and related 70) A and ([?] in values Carajas Bootstrap (cladogram dos trees Serra Phylogenetic cladogram. of the of S9. branches the Figure the of Supplementary above B) with shown analysis in are likelihood phylogram maximum 70) and on based ([?] with A regions values in related analysis and (cladogram cladogram. Carajas likelihood the trees the dos maximum Serra of Phylogenetic of branches on S8. B) the Figure based above in Supplementary shown phylogram regions are and related 70) A and ([?] in values Carajas Bootstrap (cladogram dos trees Serra Phylogenetic of S7. matrix cladogram. Figure concatenated the Supplementary the of with branches the analysis the of likelihood above B) maximum shown on in are phylogram based regions and related A of and in Carajas (cladogram dos trees Serra Phylogenetic matrix S6. cladogram. ITS2 Figure reduced the Supplementary the of branches with the the analysis of above likelihood B) shown maximum in are on phylogram based and regions A related (ITS2 and in (cladogram Carajas dos trees Serra Phylogenetic S5. cladogram. Figure reduced the Supplementary the of with branches the analysis the of likelihood above maximum B) shown on in are based phylogram regions and related A ( and in Carajas (cladogram dos trees Serra Phylogenetic S4. matrix Figure concatenated Supplementary the with cladogram. the analysis the of likelihood of B) maximum branches on in the phylogram based available regions and all related A with and in Carajas (cladogram dos trees cladogram. Serra Phylogenetic the Bootstrap of S3. branches the sequences. Figure the ITS2 of Supplementary with above B) analysis shown in likelihood are maximum phylogram 70) on and based ([?] A regions values related in and (cladogram Carajas trees cladogram. dos Serra Phylogenetic the of S2. branches the Figure the of Supplementary above B) shown with in are analysis phylogram 70) likelihood maximum and ([?] on A values based in regions related (cladogram Caraj´as and dos trees Serra Phylogenetic S1. Figure Supplementary samples bulk with captions analysis material metabarcoding Supplementary DNA the in different species six observed the in the to collected of correspond abundance branches Relative coloured 3. The Figure Amazon. Eastern the in orders. regions the listed related Caraj´as from and dos tree Serra likelihood the within Maximum evidenced 2. are Figure (CFNP) Park National Ferruginosos Forest. Campos Amazon the and (CNF) Forest Caraj´as National the the of Distribution 1. Figure rbc rbc L n T2( ITS2 and L u) nldn nysmlswt both with samples only including cut), u) nldn nysmlswt both with samples only including cut), rbc rbc n T2sqecs( sequences ITS2 and L L+ITS2 canga canga lt nteSradsCrja,a ealdi h upeetr al S6. Table Supplementary the in detailed Caraj´as, as dos Serra the in plots u) nldn nysmlswt ohsqecs otta aus(? 70) ([?] values Bootstrap sequences. both with samples only including cut), omto nteSradsCrja,Pra rzl h icmcitosof circumscriptions Caraj´as, The Par´a, dos Brazil. Serra the in formation rbc rbc rbc rbc n T2sqecsaalbe otta aus(? 70) ([?] values Bootstrap available. sequences ITS2 and L L+ITS2 n T2sqecsaalbe otta aus(? 70) ([?] values Bootstrap available. sequences ITS2 and L n T2cnaeae arxo the of matrix concatenated ITS2 and L 17 ul.Bosrpvle []7)aesonabove shown are 70) ([?] values Bootstrap full). rpo rbc eune.Bootstrap sequences. B atp eune.Bootstrap sequences. L F- rpo mat atp canga canga canga canga canga canga canga canga 1sequences. C1 sequences. K canga canga rbc canga sequences. H matrix L lnsof plants lnsof plants lnsof plants of plants of plants of plants of plants of plants plants plants plants Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. eune nec ftesxdffrn apigsos sdtie nSplmnayTbeS6. Table Supplementary in detailed as recovered spots, of of sampling number different samples and six bulk obtained the with parameters of analysis BLAST of each metabarcoding attribution, in samples DNA taxonomic sequences bulk the their in indicating spots. using leaves, ASVs sampling analysis plant Identified the metabarcoding of S7. coordinates DNA Table the Supplementary the and for characteristics vegetation collected indicating Samples leaves, collected work. indicating by present S6. database, referred the BOLD Table in and the Supplementary sampled in used species records as and barcode Amazon, DNA families previous Eastern respective without their Para, Genera Serra of S5. of state Table canga Supplementary Brazilian the of the species (2017). in plant al. the regions et for results related Babiychuk barcodes different and DNA the generate Carajas as to dos used well Primers ( as S4. shown, matrices Table is Supplementary alignment marker different per six species the per using L+ITS2 accessions analyses of number both total of (phylogenetic The resolution ITS2. phylogenetic and and the approach) of (BLAST approach) resolution reconstruction Barcode names S3. Table the barcode Supplementary with DNA eight ones We the the with approach. for identification marker. BLAST species except per for ( accessions, analysis sequences tested all resolution markers the Barcode for of S2. markers availability Table the Supplementary and of database, one BOLD red. least the published in at previously marked in of of presence species markers, sequences different numbers, the eight obtained the voucher for with the accessions, test barcode of the of BOLD Flora in DNA included barcoding the information, was of DNA sample taxonomic the list the whether bringing species in the regions, used in related samples presence other all qualities, of and lowest different List the Carajas in and coloured highest S1. the boxes Table being in Supplementary tones represented darkest are and lightest qualities the call with Base peaks, respectively. respectively. the C, below blue and of G shades T, an A, to of correspond electropherogram cladogram. Representative the cavalcantei of the S13. with branches of analysis the Figure B) above likelihood Supplementary shown maximum in on are phylogram based 70) and regions ([?] A related values and in Bootstrap Carajas (cladogram cladogram. dos trees the Serra Phylogenetic of of with the branches analysis S12. of the likelihood Figure B) above maximum Supplementary shown in on are phylogram based 70) and regions ([?] related A values and in Bootstrap Carajas (cladogram dos trees Serra Phylogenetic of S11. Figure Supplementary uland full apeeiecn eea ueipsdbs alpas e,gen elwadbu peaks blue and yellow green, Red, peaks. call base superimposed several evidencing sample mat rbc L+ITS2 K, rbc L, cut). rpo canga B, rpo lnso h er o aaa n te eae ein with regions related other and Carajas dos Serra the of plants C1, atp canga F- atp 18 fCrjs(C) sdsrbdi oae l (2018), al. et Mota in described as (FCC), Carajas of H, psb rbc K- psb L full, I, atp trn rbc F- H- atp L canga psb u,ITS2 cut, euneo an of sequence H n T2 hog the through ITS2) and A lnso h er dos Serra the of plants trn psb H- K- ul ITS2 full, psb psb canga canga sequences. A sequences. I Ipomoea cut, plants plants canga rbc rbc L Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. figures/Figure-2-color-tree/Figure-2-color-tree-eps-converted-to.pdf 19 Posted on Authorea 19 Apr 2021 — The copyright holder is the author/funder. All rights reserved. No reuse without permission. — https://doi.org/10.22541/au.161882857.78207100/v1 — This a preprint and has not been peer reviewed. Data may be preliminary. plant-diversity-of-the-amazonian-canga-through-dna-barcoding Table_2.pdf file Hosted plant-diversity-of-the-amazonian-canga-through-dna-barcoding Table_1.pdf file Hosted vial at available at available figures/Figure-3/Figure-3-eps-converted-to.pdf https://authorea.com/users/408673/articles/518601-unravelling-the- https://authorea.com/users/408673/articles/518601-unravelling-the- 20