Supplementary material

Recommended citation of this material: Couton M, Comtet T, Le Cam S, Corre E, Viard F (2019) Metabarcoding on planktonic larval stages: an efficient approach for detecting and investigating life cycle dynamics of benthic aliens. Management of Biological Invasions 10(4): 657–689, https://doi.org/10.3391/mbi.2019.10.4.06

Table S1. Details concerning the 42 target non-indigenous species.
Figure S1. Use of an in-silico mock community to determine the values for the -d and -r parameters used in the obiclean tool.
Figure S2. Frequency distribution of pairwise sequence identity between and within species for the 18S and COI markers.
Figure S3. Agarose gel illustrating the amplification success of 24 NIS with both 18S and COI primers.
Figure S4. Phylogenetic tree of the Styelidae family inferred from 18S sequences.
Figure S5. Phylogenetic tree of the Bugulidae family inferred from 18S sequences.
Figure S6. Phylogenetic tree of the Veneridae family inferred from 18S sequences.
Figure S7. Monthly variations in abundance of slipper limpet (Crepidula fornicata) larvae based on morphological identification, and of reads assigned to C. fornicata based on metabarcoding data using the COI marker.


Table S1 Details concerning the 42 target non-indigenous species. The total number of reference sequences, retrieved from public databases or produced locally (number in parentheses), is indicated for each marker. The identification of the species following our metabarcoding analysis is indicated in the columns “detected”, for each of the two markers used. Whenever possible, individual DNA was tested for amplification bias for COI. The results are reported in the last column, with “yes” meaning that amplification was visible for the COI marker, “weak” meaning that only weak amplification was visible, and “no” meaning that no amplification was visible (see Figure S3). No bias was observed for the 18S marker.

| Class | Species | Dispersal mode [a] | Present in the study bay | Native range [b] | 18S references | COI references | Detected with 18S | Detected with COI | Amplification with COI |
|---|---|---|---|---|---|---|---|---|---|
| Ascidiacea | Asterocarpa humilis | short disperser | yes | S Pacific | 2 (1) | 4 (2) | yes | no | no |
| Ascidiacea | Botrylloides diegensis | short disperser | yes | NE Pacific | 1 (1) | 2 (2) | no [c] | yes | yes |
| Ascidiacea | Botrylloides violaceus | short disperser | yes | NW Pacific | 1 (1) | 5 (1) | no [c] | yes | yes |
| Ascidiacea | Botrylloides sp X [d] | short disperser | no | Unknown | 1 (1) | 1 (1) | no | no | yes |
| Ascidiacea | Ciona robusta | short disperser | yes [e] | NW Pacific | 1 (1) | 55 (55) | no | no | no |
| Ascidiacea | Corella eumyota | short disperser | yes | cryptogenic (native from the southern hemisphere) | 1 (1) | 11 (11) | yes | no | no |
| Ascidiacea | Didemnum vexillum | short disperser | yes | cryptogenic (introduced in Europe) | 1 (-) | 21 (8) | no | no | yes |
| Ascidiacea | Molgula manhattensis | short disperser | no | NW Atlantic | 1 (-) | 19 (17) | no | no | not tested |
| Ascidiacea | Perophora japonica | short disperser | yes | NW Pacific | 1 (1) | 1 (1) | no | no | no [f] |
| Ascidiacea | Styela clava | short disperser | yes | NW Pacific | 1 (1) | 27 (1) | no | no | weak |
| Bivalvia | Anadara kagoshimensis | long disperser | no | NW Pacific | 2 (-) | 64 (-) | no | no | not tested |
| Bivalvia | Arcuatula senhousia | long disperser | no | NW Pacific | 1 (-) | 59 (-) | no | no | not tested |
| Bivalvia | Corbicula fluminea | short disperser | no | NW Pacific | 4 (-) | 15 (-) | no | no | not tested |
| Bivalvia | Ensis leei | long disperser | no | NW Atlantic | 3 (-) | 9 (-) | no [c] | no | not tested |
| Bivalvia | Crassostrea gigas [g] | long disperser | yes / Aquaculture | NW Pacific | 4 (1) | 70 (1) | yes | no | yes |
| Bivalvia | Mercenaria mercenaria | long disperser | no | NW Atlantic | 5 (-) | 8 (-) | yes | no | yes |
| Bivalvia | Mizuhopecten yessoensis | long disperser | no | NW Pacific | 1 (-) | 7 (-) | no | no | not tested |
| Bivalvia | Mya arenaria | long disperser | yes | NW Atlantic | 2 (1) | 12 (-) | yes | yes | yes |
| Bivalvia | Mytilopsis leucophaeata | long disperser | no | NW Atlantic & Gulf of Mexico | 4 (3) | 2 (1) | no | no | weak |
| Bivalvia | Petricolaria pholadiformis | long disperser | no | NW Atlantic | 1 (-) | 1 (-) | no | no | not tested |
| Bivalvia | Rangia cuneata | long disperser | no | Gulf of Mexico | 1 (1) | 5 (3) | no | no | yes |
| Bivalvia | Ruditapes philippinarum | long disperser | no [h] | NW Pacific | 2 (1) | 91 (-) | yes | yes | yes |
| Bivalvia | Xenostrobus securis | long disperser | no | South Pacific | 2 (-) | 23 (-) | no | no | not tested |
| Gastropoda | Corambe obscura | long disperser | no | NW Atlantic and Gulf of Mexico | 1 (-) | 2 (-) | no | no | not tested |
| Gastropoda | Crepidula fornicata | long disperser | yes | NW Atlantic | 2 (-) | 68 (50) | yes | yes | yes |
| Gastropoda | Crepipatella dilatata | direct developer | no | SE Pacific | 2 (2) | 35 (3) | yes | no | no |
| Gastropoda | Gracilipurpura rostrata | direct developer | yes | Mediterranean Sea | 1 (1) | 1 (1) | no | no | weak |
| Gastropoda | Haminoea japonica | short disperser | no | Indo NW Pacific | 0 [i] | 36 (-) | no | no | not tested |
| Gastropoda | Hexaplex trunculus | direct developer | no | Mediterranean Sea & Macaronesian islands | 0 | 37 (-) | no | no | not tested |
| Gastropoda | Ocinebrellus inornatus | direct developer | no | NW Pacific | 0 | 8 (-) | no | no | not tested |
| Gastropoda | Rapana venosa | long disperser | no | NW Pacific | 1 (-) | 45 (-) | no | no | not tested |
| Gastropoda | Tritia neritea | direct developer | yes | Mediterranean and Black Seas | 0 | 18 (-) | no [c] | no | not tested |
| Gastropoda | Tritia pellucida | - [j] | no | Mediterranean Sea | 0 | 1 (-) | no [c] | no | not tested |
| Gastropoda | Urosalpinx cinerea | direct developer | no | NW Atlantic | 0 | 11 (-) | no | no | not tested |
| Gymnolaemata | Bugula neritina | short disperser | yes | South Pacific | 1 (1) | 11 (1) | yes | yes | yes |
| Gymnolaemata | Bugulina fulva | short disperser | yes | NW Atlantic | 1 (-) | 2 (1) | no | no | yes |
| Gymnolaemata | Bugulina simplex | short disperser | no | cryptogenic (presumably from the Mediterranean Sea) | 1 (1) | 2 (1) | no | no | yes |
| Gymnolaemata | Bugulina stolonifera | short disperser | yes | NW Pacific | 1 (-) | 2 (1) | no | no | yes |
| Gymnolaemata | Celleporaria brunnea | short disperser | no | NE Pacific | 1 (1) | 1 (1) | no | no | yes |
| Gymnolaemata | Schizoporella japonica | short disperser | no | NW Pacific | 1 (1) | 1 (1) | no | no | yes |
| Gymnolaemata | Tricellaria inopinata | short disperser | yes | cryptogenic (presumably NE Pacific) | 0 | 1 (-) | no | no | not tested |
| Gymnolaemata | Watersipora subatra [k] | short disperser | yes | cryptogenic | 1 (1) | 3 (3) | yes | no | yes |

[a] Short and long dispersers are species with a bentho-pelagic life cycle whose larvae spend less or more than 2 days in the water column, respectively, based on literature data.
[b] Cryptogenic refers to species for which the native range is unknown.
[c] No unique variants were assigned to this species, but some were assigned to the genus.
[d] Botrylloides sp X is a cryptic species recently discovered within the genus Botrylloides (Viard et al. in prep.; Wood et al. 2015). Reference: Wood C, Bishop J, Yunnie A (2015) Comprehensive reassessment of NNS in Welsh marinas, online report available at: http://plymsea.ac.uk/id/eprint/7138/1/Comprehensive%20Reassessment%20of%20NNS%20in%20Welsh%20marinas.pdf
[e] This species was reported in 2012 but disappeared after that (Bouchemousse et al. 2017; Authors, personal observation). Reference: Bouchemousse S, Lévêque L, Viard F (2017) Do settlement dynamics influence competitive interactions between an alien tunicate and its native congener? Ecology and Evolution 7: 200-213
[f] The lack of PCR amplification for P. japonica is not shown in Figure S3 but has been observed in previous experiments (data not shown).
[g] Note that two species names are currently accepted according to the World Register of Marine Species: Crassostrea gigas and Magallana gigas.
[h] This species has not been officially reported in the bay of Morlaix, but some individuals have been collected by the authors and other researchers.
[i] No 18S sequence was available for H. japonica, but one attributed to Haminoea sp. was used in this study.
[j] Information not available for this species.
[k] A recent revision of the genus by Vieira et al. (2014) revealed that the bryozoan W. subtorquata, previously reported as an introduced species in Europe, was actually W. subatra. Reference: Vieira LM, Spencer Jones M, Taylor PD (2014) The identity of the invasive fouling bryozoan Watersipora subtorquata (d’Orbigny) and some other congeneric species. Zootaxa 3857(2): 151-182

Figure S1 Use of an in-silico mock community to determine the values for the -d and -r parameters used in the obiclean tool. To evaluate the most appropriate values for two parameters of the obiclean tool available in the OBITools suite v.1.2.11, an in-silico mock community was created. A set of 294 sequences for the 18S marker, representing 254 species across 41 families within our four classes of interest, was gathered from the SILVA public database. Some of these sequences were duplicated to mimic the variations in abundance that could be observed in a real dataset (between 1 and 130 sequences for each species). Sequencing was simulated using ART (Huang et al. 2012), and the quality profile of our real dataset was applied to the artificial one. The pipeline described above was applied to the 15,480 reads produced, and the obiclean tool was run several times with different values of -d (number of differences allowed for a read to be considered as an error produced from a variant) and -r (ratio between the abundance of two variants below which the less abundant one will be discarded). All variants passing the filtering process were then compared to the 294 starting sequences using BLAST®, and assigned to a species name when matching with more than 99% query cover and 100% identity. The number of species not retrieved (representing reads wrongly considered as errors) is shown in A, and the number of unassigned variants (representing undetected errors) is shown in B, for the three values of -d and the five values of -r tested.
Reference: Huang W, Li L, Myers JR, Marth GT (2012) ART: a next-generation sequencing read simulator. Bioinformatics 28(4): 593-594
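The two quantities plotted in panels A and B can be illustrated with a minimal Python sketch. It is not the authors' script. The function name `evaluate_filtering` and the toy sequences are hypothetical, and the BLAST criterion (more than 99% query cover and 100% identity) is replaced here by a simpler stand-in: a variant counts as assigned when it occurs verbatim inside a reference sequence.

```python
def evaluate_filtering(retained_variants, references):
    """Score one filtering run against a mock community.

    retained_variants: sequences that passed filtering for one (-d, -r) setting.
    references: dict mapping each starting reference sequence to its species name.
    Returns (species_not_retrieved, unassigned_variants).
    """
    retrieved = set()
    unassigned = 0
    for variant in retained_variants:
        # Stand-in for the BLAST criterion: exact substring match to a reference.
        hits = {species for ref, species in references.items() if variant in ref}
        if hits:
            retrieved |= hits
        else:
            # A retained variant matching no reference is an undetected error.
            unassigned += 1
    all_species = set(references.values())
    # Species absent from the output were wrongly discarded as errors.
    return len(all_species - retrieved), unassigned


# Hypothetical toy data, for illustration only.
toy_references = {
    "ACGTACGTAA": "sp_A",
    "TTGCATGCAT": "sp_B",
    "GGGCCCAAAT": "sp_C",
}
toy_retained = ["ACGTACGTAA", "TTGCATGCAT", "ACGTACGTAT"]
not_retrieved, unassigned = evaluate_filtering(toy_retained, toy_references)
```

Running such a function once per combination of -d and -r would give one pair of counts per setting, which is the kind of grid summarised in the figure.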


Figure S2 Frequency distribution of pairwise sequence identity (%) between (BS) and within (WS) species using data from our custom-designed reference database for 18S (A) and COI (B) markers. The thresholds chosen for taxonomic assignment (i.e., 99% for 18S and 92% for COI) are indicated by red lines.
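As an illustration of how such distributions could be tabulated, the sketch below splits pairwise identities into within-species and between-species sets. It assumes the sequences are already aligned and of equal length, which is a simplification. The function names and toy records are hypothetical and do not come from the study.

```python
from itertools import combinations


def pairwise_identity(seq_a, seq_b):
    """Percent of identical positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def split_identities(records):
    """records: list of (species, aligned_sequence) tuples.

    Returns two lists of percent identities: within species (WS)
    and between species (BS).
    """
    within, between = [], []
    for (sp_a, seq_a), (sp_b, seq_b) in combinations(records, 2):
        identity = pairwise_identity(seq_a, seq_b)
        (within if sp_a == sp_b else between).append(identity)
    return within, between


# Hypothetical toy records, for illustration only.
toy_records = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_A", "ACGTACGTAT"),
    ("sp_B", "ACGTTCGAAC"),
]
ws, bs = split_identities(toy_records)
```

A histogram of the two lists, with a vertical line at the chosen threshold (99% for 18S, 92% for COI), would reproduce the layout of the figure.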


Figure S3 Taxonomic amplification efficiency for 18S and COI. The picture displays some of the amplification results obtained using DNA extracted from 24 species to test for failure vs. success of amplification.


Figure S4 Phylogenetic tree of the Styelidae family inferred from 18S sequences by using the maximum likelihood method based on the Kimura-2-parameter model (log likelihood = -1563.8858). A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories). Bootstrap values (percentage over 1000 permutations) are shown for each visible node. This tree was computed using public references available for this family as well as all locally produced references (“Ref”), unique variants assigned to a Styelidae species (“UV”, bold), and two outgroups (“OG”).


Figure S5 Phylogenetic tree of the Bugulidae family inferred from 18S sequences by using the maximum likelihood method based on the Kimura-2-parameter model (log likelihood = -1160.7132). A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories). Bootstrap values (percentage over 1000 permutations) are shown for each visible node. This tree was computed using public references available for this family as well as locally produced references (“Ref”), unique variants assigned to a Bugulidae species (“UV”, bold), and two outgroups (“OG”).


Figure S6 Phylogenetic tree of the Veneridae family inferred from 18S sequences by using the maximum likelihood method based on the Kimura-2-parameter model (log likelihood = -1618.0341). A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories). Bootstrap values (percentage over 1000 permutations) are shown for each visible node. This tree was computed using public references available for this family as well as locally produced references (“Ref”), unique variants assigned to a Veneridae species (“UV”, bold), and two outgroups (“OG”).


Figure S7 Monthly variations in abundance of slipper limpet (Crepidula fornicata) larvae based on morphological identification (larvae m⁻³), averaged across seven years (red, right axis), and monthly variations in abundance of reads assigned to C. fornicata based on metabarcoding data using the COI marker, averaged across replicates and samples of both years (grey bars, left axis). Each point/bar represents the mean number of observations for a given month. Black error bars and the red area represent the standard error of the mean.
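The monthly summary shown in the figure (mean and standard error of the mean per month) can be sketched as follows. The function name `monthly_mean_sem` and the toy values are hypothetical, and the sketch assumes the standard error is the sample standard deviation divided by the square root of the number of observations.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def monthly_mean_sem(observations):
    """observations: iterable of (month, value) pairs, e.g. larvae m^-3 or read counts.

    Returns a dict mapping month -> (mean, standard error of the mean).
    The standard error is None when a month has fewer than two observations.
    """
    by_month = defaultdict(list)
    for month, value in observations:
        by_month[month].append(value)

    summary = {}
    for month, values in sorted(by_month.items()):
        # SEM = sample standard deviation / sqrt(n); undefined for n < 2.
        sem = stdev(values) / math.sqrt(len(values)) if len(values) > 1 else None
        summary[month] = (mean(values), sem)
    return summary


# Hypothetical toy data, for illustration only.
toy_observations = [(1, 2.0), (1, 4.0), (1, 6.0), (2, 10.0)]
toy_summary = monthly_mean_sem(toy_observations)
```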
