Supplementary Information for:

In situ relationships between microbiota and potential pathobiota in Arabidopsis thaliana

Claudia Bartoli1¶, Léa Frachon1¶, Matthieu Barret2, Mylène Rigal1, Carine Huard-Chauveau1, Baptiste Mayjonade1, Catherine Zanchetta3, Olivier Bouchez3, Dominique Roby1, Sébastien Carrère1, Fabrice Roux1*

Affiliations:

1 LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan,

2 IRHS, INRA, AGROCAMPUS-Ouest, Université d’Angers, SFR4207 QUASAV, 42, rue Georges Morel, 49071 Beaucouzé, France

3 INRA, GeT-PlaGe, Genotoul, Castanet-Tolosan, France

¶These authors contributed equally to this work.

*To whom correspondence should be addressed. E-mail: [email protected]

This file includes: Supplementary Text: Material and Methods, Results and References

Supplementary Figures 1-17 Supplementary Tables 1-14


SUPPLEMENTARY TEXT

MATERIAL AND METHODS

Plant material

In agreement with previous observations in natural populations of A. thaliana located in northeastern Spain (Montesinos et al., 2009), the 163 populations studied here differed strongly in the timing of their main germination flush in autumn 2014, so that plants at different life stages were observed among populations. We therefore defined three seasonal groups. The first group, hereafter named ‘autumn’, corresponded to 84 populations where most plants had reached the 5-leaf rosette stage during the 23-day sampling period in autumn (mid-November 2014 – early December 2014). The second group, hereafter named ‘spring with autumn’, corresponded to 80 populations already sampled in autumn and additionally sampled during a 29-day period in early spring (mid-February 2015 – mid-March 2015). Four populations sampled during autumn were not collected in early spring to avoid modifying their demographic dynamics. The third group, hereafter named ‘spring without autumn’, corresponded to 79 populations sampled only during the 29-day period in early spring. These populations were not sampled in autumn because most plants were then between the 2-cotyledon and 4-leaf stages. The ‘autumn’ and ‘spring with autumn’ groups allowed us to test whether the dynamics of diversity and composition of the bacterial communities across seasons depended on the population considered. The ‘spring with autumn’ and ‘spring without autumn’ groups, in turn, allowed us to test whether the diversity and composition of bacterial communities in spring 2015 were affected by germination timing in autumn 2014.


Validation of the gyrB marker used for characterization of bacterial communities

To characterize the bacterial fraction of A. thaliana, a portion of gyrB (encoding the β subunit of the bacterial gyrase) was used. This prokaryote-specific molecular marker has a deeper taxonomic resolution (species level) than molecular markers designed on the hypervariable regions of the 16S rRNA gene (Barret et al., 2015). Furthermore, because gyrB is a single-copy gene, it limits the overestimation of taxa carrying multiple copies of rrn operons. The prevalence of gyrB was investigated in 32,062 bacterial genomic sequences available in the IMG database v4 (Markowitz et al., 2014) at the time of analysis (10th December 2015). Coding sequences that belong exclusively to the protein family TIGR01059 and to K02470 were defined as GyrB orthologs and retrieved for further analysis (30,627 hits found in 30,175 genomic sequences). The corresponding nucleotide sequences were aligned against a reference gyrB alignment (Barret et al., 2015) with the align.seqs function in mothur (Schloss et al., 2009). Sequences that did not align (102) were discarded and only unique sequences were retained in the reference alignment (10,427 haplotypes). According to the genome-wide Average Nucleotide Identity (gANI; Varghese et al., 2015), the gyrB marker was highly precise (0.964) and sensitive (0.955) at a genetic distance of 0.02 (98% identity cutoff).

To assess the potential amplification bias of gyrB, we amplified a mock community containing 52 bacterial strains (Supplementary Data 1) with both gyrB and 16S rRNA V4 region primers (Caporaso et al., 2011). The 16S rRNA gene and gyrB sequences were clustered at identity thresholds of 97% and 98%, respectively. Based on this clustering, we obtained n = 19 OTUs with the 16S rRNA gene and n = 45 OTUs with gyrB. The 52 members of the mock community were all detected with the 16S rRNA gene marker, while three strains were not detected with the gyrB segment (Supplementary Data 1). Overall, our results suggest a better taxonomic resolution for gyrB, at a small cost in bacterial detection.
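The precision and sensitivity reported above can be illustrated with a minimal Python sketch. It assumes a pairwise evaluation, which the text does not spell out: a pair of genomes counts as a predicted match when its gyrB distance is at or below the cutoff, and as a true match when gANI places both genomes in the same species. The function name, the input layout and the toy values are hypothetical.

```python
def pairwise_precision_sensitivity(marker_distance, same_species, cutoff=0.02):
    """Compare marker-based grouping with a reference species assignment.

    marker_distance: dict mapping a frozenset({genome_a, genome_b}) to the
        gyrB genetic distance of that pair (hypothetical input layout).
    same_species: dict with the same keys, True when the reference
        (here assumed to be gANI) places both genomes in one species.
    """
    tp = fp = fn = 0
    for pair, distance in marker_distance.items():
        predicted = distance <= cutoff
        actual = same_species[pair]
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif actual:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, sensitivity


# Toy data (invented for illustration only).
toy_distance = {
    frozenset({"A", "B"}): 0.010,
    frozenset({"A", "C"}): 0.015,
    frozenset({"A", "D"}): 0.200,
    frozenset({"C", "D"}): 0.050,
}
toy_same_species = {
    frozenset({"A", "B"}): True,
    frozenset({"A", "C"}): False,
    frozenset({"A", "D"}): False,
    frozenset({"C", "D"}): True,
}
toy_precision, toy_sensitivity = pairwise_precision_sensitivity(
    toy_distance, toy_same_species
)
```

In this toy case one pair is correctly grouped, one is wrongly grouped and one true match is missed, so both metrics are driven by a single error each.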


Sampling, generation of the gyrB amplicons and sequences

Plants were excavated using flame-sterilized spoons and then manipulated with flame-sterilized forceps on a sterilized porcelain plate. Gloves and plate were sterilized with Surface'SafeAnios®. Roots and rosettes were rinsed in individual tubes of sterilized distilled water to remove all visible rhizosphere. Both leaves and roots were then placed into sterilized tubes and immediately stored on dry ice. Samples were stored at -80°C prior to DNA extraction.

For each plant, we recorded the sampling date. In addition, the age of each plant was approximated by measuring the maximum rosette diameter and by counting the number of leaves. Finally, each plant was visually inspected for the presence of disease symptoms on the rosette leaves. Following Roux et al. (2010), disease symptoms were defined as the presence of chlorosis, water soaking, or cell death. All plants were therefore classified according to a binary category, i.e. presence or absence of visible disease symptoms on the rosette leaves.

The DNA of each leaf and root sample was extracted as follows: i) leaves were placed in 96-well plates containing sterilized beads, homogenized for 1 min at 30 vibrations per second in a plate shaker and incubated for 30 min in 500 µl of buffer containing 200 mM Tris-HCl at pH 7.5, 250 mM NaCl, 25 mM EDTA and 0.5% SDS; ii) roots were placed in Eppendorf tubes, incubated for 10 min in a sonicator bath and then treated under the same conditions as described above for the leaves; iii) for both leaves and roots, phenol/chloroform 25:24:1 pH 8.0 (Sigma Aldrich®) was used for extraction and purification of the DNA; and iv) DNA was precipitated with isopropanol, washed with 70% EtOH and eluted in 100 µl of DNA-free water. DNA samples were stored at -20°C prior to PCR amplification.

Briefly, three tags were added to each of the forward (5') and reverse (3') original primers to allow the multiplexing of three plates. Primers including Illumina adapter sequences and without internal tags were: Fw (5'-CTTTCCCTACACGACGCTCTTCCGATCTMGNCCNGSNATGTAYATHGG-3') and Rv (5'-GGAGTTCAGACGTGTGCTCTTCCGATCTCCTCTTACNCCRTGNARDCCDCCNGA-3'). The internal tags for multiplexing consisted of: TAG1 (Fw - GACTAC, Rv - AAGGCC), TAG2 (Fw - CTGGTT, Rv - GTCAGG) and TAG3 (Fw - ACTCGA, Rv - CCTCTT). MTP Taq DNA Polymerase (Sigma Aldrich®) was used and the PCR mix was composed of 2.5 µl of Taq Buffer, 0.2 µl of 10 mM dNTPs, 1 µl of Fw primers (10p/mol), 1 µl of Rv primers (10p/mol), 0.3 µl of Taq polymerase and 1 µl of 10-fold diluted DNA, in a final volume of 25 µl. PCR amplifications were performed with an initial denaturation at 95°C for 5 min, followed by 40 cycles of 95°C for 30 sec, 52°C for 1 min 30 sec and 68°C for 1 min, and a final elongation at 68°C for 5 min. Negative controls were also included to investigate whether bacterial DNA was amplified from (i) the water used for washing both leaves and roots, (ii) the water used for DNA elution and (iii) the water used for the PCR mix.
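As an illustration of how the internal tags could be used to separate the three multiplexed plates, the following Python sketch assigns a read pair to a tag set. It assumes, hypothetically, that the forward read starts with the Fw tag and the reverse read starts with the Rv tag after adapter removal, and that only exact matches are accepted; the actual demultiplexing pipeline is not described in this text.

```python
# Internal tags as listed in the text: name -> (Fw tag, Rv tag).
TAGS = {
    "TAG1": ("GACTAC", "AAGGCC"),
    "TAG2": ("CTGGTT", "GTCAGG"),
    "TAG3": ("ACTCGA", "CCTCTT"),
}


def assign_plate(forward_read, reverse_read, tags=TAGS):
    """Return the tag name matching both reads, or None if no pair matches.

    Assumption (not from the source): tags sit at the very start of each
    read and must match exactly, with no mismatch tolerance.
    """
    for name, (fw_tag, rv_tag) in tags.items():
        if forward_read.startswith(fw_tag) and reverse_read.startswith(rv_tag):
            return name
    return None


# Toy reads (invented sequences after the tag).
example_plate = assign_plate("GACTAC" + "ATGCCGGGTATG", "AAGGCC" + "ACGCCGTGCAGG")
```

Requiring both tags to agree discards chimeric pairs in which the forward and reverse tags point to different plates.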

Statistical analyses

Natural variation of microbiota and potential pathobiota

Natural variation for the eight descriptors, i.e. four descriptors of the microbiota (richness, Shannon α-diversity index, PCo1 and PCo2) and four descriptors of the potential pathobiota (richness, Shannon α-diversity index, PCo1 and PCo2), was explored using the following mixed models:

1. To explore natural variation in populations collected both in autumn and spring, we used the following mixed model:

$$
\begin{aligned}
Y_{ijklmno} = {} & \mu_{\text{trait}} + \text{season}_i + \text{compartment}_j + \text{season}_i \times \text{compartment}_j + \text{population}_k \\
& + \text{season}_i \times \text{population}_k + \text{compartment}_j \times \text{population}_k + \text{season}_i \times \text{compartment}_j \times \text{population}_k \\
& + \text{sampling\_date}_l(\text{season}_i) + \text{diameter}_m(\text{season}_i) + \text{leaf\_number}_n(\text{season}_i) + \text{obs}_o + \varepsilon_{ijklmno}
\end{aligned}
\tag{1}
$$

where ‘Y’ is one of the eight descriptors; ‘µ’ is the overall phenotypic mean; ‘season’ accounts for differences between autumn and spring; ‘compartment’ accounts for differences between leaves and roots; ‘population’ measures the effect of populations; interaction terms involving the ‘population’ term account for variation among populations in reaction norms across the two seasons and/or the two plant compartments; ‘ε’ is the residual term. Four terms were added to control for noise that may affect the significance of the other model terms. First, ‘sampling_date’ accounts for the number of days since the first population was collected within each season. Second, because age can shape leaf and root microbiota (Wagner et al., 2016), the two traits ‘rosette diameter’ and ‘leaf number’ were used as proxies of plant age. Third, ‘obs’ corresponds to the total number of observations and accounts for technical noise attributable to sequencing depth.

2. To explore natural variation in populations collected in spring, we used the following mixed model:

$$
\begin{aligned}
Y_{ijklmno} = {} & \mu_{\text{trait}} + \text{w/wo\_Autumn}_i + \text{compartment}_j + \text{w/wo\_Autumn}_i \times \text{compartment}_j \\
& + \text{population}_k(\text{w/wo\_Autumn}_i) + \text{compartment}_j \times \text{population}_k(\text{w/wo\_Autumn}_i) \\
& + \text{sampling\_date}_l + \text{diameter}_m + \text{leaf\_number}_n + \text{obs}_o + \varepsilon_{ijklmno}
\end{aligned}
\tag{2}
$$

All the terms are described in model (1), with the exception of ‘w/wo_Autumn’, which accounts for differences between populations collected both in autumn and spring (i.e. ‘spring with autumn’ seasonal group) and populations collected only in spring (i.e. ‘spring without autumn’ seasonal group).


3. To explore natural variation in populations within each of the following three categories of populations (i.e. ‘autumn’ populations, ‘spring with autumn’ populations and ‘spring without autumn’ populations), we used the following mixed model:

$$
Y_{ijklmno} = \mu_{\text{trait}} + \text{compartment}_j + \text{population}_k + \text{compartment}_j \times \text{population}_k + \text{sampling\_date}_l + \text{diameter}_m + \text{leaf\_number}_n + \text{obs}_o + \varepsilon_{ijklmno}
\tag{3}
$$

4. To explore natural variation in populations within each ‘plant compartment × category of populations’ combination (the categories being ‘autumn’ populations, ‘spring’ populations, ‘spring with autumn’ populations and ‘spring without autumn’ populations), we used the following mixed model:

$$
Y_{ijklmno} = \mu_{\text{trait}} + \text{population}_k + \text{sampling\_date}_l + \text{diameter}_m + \text{leaf\_number}_n + \text{obs}_o + \varepsilon_{ijklmno}
\tag{4}
$$

In these four models, all factors were treated as fixed effects, except ‘population’, which was treated as a random effect. For fixed effects, terms were tested over their appropriate denominators for calculating F-values. Significance of the random effects was determined by likelihood ratio tests of models with and without these effects. Inference was performed using REML estimation, using the PROC MIXED procedure in SAS 9.3 (SAS Institute Inc., Cary, North Carolina, USA) for all traits. A correction for the number of tests was performed to control the FDR at a nominal level of 5%. For the purpose of drawing plots, Best Linear Unbiased Predictions (BLUPs) were obtained for each population by running model (4).
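The FDR correction mentioned above can be sketched as follows. The text does not name the procedure used, so this example assumes the Benjamini-Hochberg step-up procedure, a common choice for controlling the FDR; the function name and the toy p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the test is declared significant.

    Benjamini-Hochberg step-up procedure (an assumption; the source only
    states that the FDR was controlled at a nominal level of 5%).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Toy p-values (invented): the third-ranked value passes its threshold,
# which also rescues the two smaller p-values (step-up behaviour).
toy_rejected = benjamini_hochberg([0.02, 0.03, 0.035, 0.5])
```

The step-up rule is the reason a p-value that fails its own rank threshold can still be declared significant when a larger p-value passes.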

To estimate the percentage of phenotypic variance explained by each classification variable (i.e. ‘season’, ‘compartment’, ‘population’ and their interactions) in models (1), (3) and (4), noise was first taken into account by regressing the descriptors against the terms ‘sampling_date’, ‘diameter’, ‘leaf_number’ and ‘obs’ using the PROC MIXED procedure in SAS 9.3 (SAS Institute Inc., Cary, North Carolina, USA). Then, a second regression including the appropriate classification terms was run on the residuals of the first regression using the PROC VARCOMP procedure in SAS 9.3 (SAS Institute Inc., Cary, North Carolina, USA).

Relationships between disease symptoms observed in natura and relative abundance of the potential pathobiota

For each sample, we estimated the relative abundance of the potential pathobiota by dividing the number of reads belonging to the potential pathobiota by the total number of reads. To test whether the relative abundance of the potential pathobiota differs between plants with visible disease symptoms and asymptomatic plants when sampled in situ, we used the following general linear model (PROC GLM in SAS 9.3):

$$
Y_{imno} = \mu_{\text{trait}} + \text{symptom}_i + \text{diameter}_m + \text{leaf\_number}_n + \text{obs}_o + \varepsilon_{imno}
\tag{1}
$$

where ‘Y’ is the relative abundance of the potential pathobiota; ‘µ’ is the overall phenotypic mean; ‘symptom’ is a binary category that accounts for differences between plants with visible disease symptoms and asymptomatic plants; ‘ε’ is the residual term. The terms ‘diameter’, ‘leaf_number’ and ‘obs’ are described above. Raw relative abundances were arcsine transformed to satisfy the normality and equal variance assumptions of linear regression. Because disease symptoms were scored on the rosette leaves, the model was only run on the ‘leaf’ samples (n = 820).
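The relative abundance and its transformation can be sketched in a few lines of Python. The text only says "arcsine transformed"; this example assumes the usual arcsine square-root form, asin(sqrt(p)), and the function names and read counts are hypothetical.

```python
import math


def relative_abundance(pathobiota_reads, total_reads):
    """Reads assigned to the potential pathobiota divided by all reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return pathobiota_reads / total_reads


def arcsine_sqrt(proportion):
    """Arcsine square-root transform of a proportion in [0, 1].

    Assumption: the source does not state which arcsine variant was used.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(proportion))


# Toy sample (invented counts): 250 pathobiota reads out of 1,000 reads.
toy_abundance = relative_abundance(250, 1000)
toy_transformed = arcsine_sqrt(toy_abundance)
```

The transform stretches proportions near 0 and 1, which is why it is often applied before fitting a linear model to relative abundances.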


Microbiota – potential pathobiota relationships

In order to study the relationship between diversity estimates (species richness and Shannon α-diversity) of the microbiota and the potential pathobiota, we fitted a linear model (pathobiota’s diversity ~ intercept + a*microbiota’s diversity) and a non-linear model that includes a quadratic function (pathobiota’s diversity ~ k*microbiota’s diversity – q*microbiota’s diversity*microbiota’s diversity). Linear and non-linear regressions were fitted using the ‘lm’ and ‘nls’ functions implemented in the R environment, respectively.
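The two models can be illustrated with a dependency-free Python sketch. It is not the authors' R code: because the quadratic model is linear in ‘k’ and ‘q’, its least-squares solution has a closed form, which is used here in place of an iterative ‘nls’ fit. The function names and the toy data are hypothetical.

```python
def fit_linear(x, y):
    """Ordinary least squares for y = intercept + a * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return mean_y - a * mean_x, a


def fit_quadratic(x, y):
    """Least squares for y = k * x - q * x**2 (no intercept).

    Solves the 2x2 normal equations
        k * S2 - q * S3 = Sxy
        k * S3 - q * S4 = Sx2y
    by Cramer's rule, where Sp is the sum of x**p.
    """
    s2 = sum(xi ** 2 for xi in x)
    s3 = sum(xi ** 3 for xi in x)
    s4 = sum(xi ** 4 for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sx2y = sum(xi ** 2 * yi for xi, yi in zip(x, y))
    det = s3 * s3 - s2 * s4
    k = (s3 * sx2y - s4 * sxy) / det
    q = (s2 * sx2y - s3 * sxy) / det
    return k, q


def rss(x, y, predict):
    """Residual sum of squares of a fitted prediction function."""
    return sum((yi - predict(xi)) ** 2 for xi, yi in zip(x, y))


# Toy data (invented) following a hump-shaped curve.
microbiota_diversity = [1.0, 2.0, 3.0, 4.0]
pathobiota_diversity = [2.5, 4.0, 4.5, 4.0]

intercept, a = fit_linear(microbiota_diversity, pathobiota_diversity)
k, q = fit_quadratic(microbiota_diversity, pathobiota_diversity)
rss_linear = rss(microbiota_diversity, pathobiota_diversity,
                 lambda v: intercept + a * v)
rss_quadratic = rss(microbiota_diversity, pathobiota_diversity,
                    lambda v: k * v - q * v * v)
```

A positive ‘q’ together with a positive ‘k’ corresponds to a curve that rises and then declines, with its maximum at a microbiota diversity of k / (2q).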

Linear and non-linear models were fitted by considering (i) all samples, (ii) samples from each seasonal group and (iii) samples from each ‘plant compartment x seasonal group’ combination. Using a paired sample t-test, model selection was performed by comparing the goodness of fit between linear and non-linear models across the ‘diversity estimate x plant compartment x seasonal group’ combinations. In addition, in order to confirm the significance of the hump-shaped curve observed between diversity estimates (species richness and Shannon α-diversity) of the microbiota and the potential pathobiota, the parameters ‘k’ and ‘q’ of the non-linear model were compared to a null distribution of those parameters obtained by generating 100 random microbiota OTU matrices paired with 100 random pathobiota OTU matrices. To do so, 6,598 OTUs were randomly sampled among the 6,627 OTUs considered in this study, leading to a random microbiota matrix. The remaining 29 OTUs were considered as the paired random pathobiota matrix. Based on this pair of random microbiota and random pathobiota matrices, we fitted the non-linear model described above and obtained values for the parameters ‘k’ and ‘q’. This procedure was repeated 100 times.
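The randomization step can be sketched as below. The example only generates the paired random OTU sets; computing diversity estimates and refitting the non-linear model on each pair is left out. The function name, the fixed seed and the integer OTU identifiers are hypothetical.

```python
import random


def random_partitions(otu_ids, n_microbiota=6598, n_repeats=100, seed=0):
    """Split the OTU list into a random 'microbiota' and 'pathobiota' set.

    Each repeat samples n_microbiota OTUs without replacement; the
    remaining OTUs form the paired random pathobiota. The seed is an
    illustrative choice to make the sketch reproducible.
    """
    rng = random.Random(seed)
    partitions = []
    for _ in range(n_repeats):
        microbiota = set(rng.sample(otu_ids, n_microbiota))
        pathobiota = [otu for otu in otu_ids if otu not in microbiota]
        partitions.append((sorted(microbiota), pathobiota))
    return partitions


# 6,627 placeholder OTU identifiers, as many as in the study.
all_otus = list(range(6627))
null_partitions = random_partitions(all_otus)
```

Each of the 100 pairs would then yield one value of ‘k’ and one value of ‘q’, and the observed parameters are compared against these null values.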

In order to study the relationship between microbiota composition and potential pathobiota composition, a sparse Partial Least Square Regression (sPLSR) (Lê Cao et al., 2008; Carrascal, Galván and Gordo, 2009) was adopted to maximize the covariance between linear combinations of relative abundances of OTUs from the microbiota (matrix X) and linear combinations of relative abundances of species from the potential pathobiota (matrix Y). sPLSR was run using the mixOmics package implemented in the R environment (Lê Cao et al., 2008; Lê Cao, Meugnier and McLachlan, 2010). For the microbiota matrix, only OTUs present in at least 1% of the samples were considered (‘All samples’ n = 94 OTUs, ‘Autumn – Leaf’ n = 122, ‘Autumn – Root’ n = 87, ‘Spring w/ Autumn – Leaf’ n = 156, ‘Spring w/ Autumn – Root’ n = 135, ‘Spring w/o Autumn – Leaf’ n = 167, ‘Spring w/o Autumn – Root’ n = 122). For the potential pathobiota matrix, we considered the seven most abundant bacterial species. In addition to the lasso model, sPLSR results were validated by plotting the Root Mean Square Error of Prediction (Maestre, 2004; Lê Cao et al., 2008). For the microbiota, we calculated the final loadings for the ten OTUs with the highest initial loadings on the first component. Given the small number of OTUs in the potential pathobiota (n = 7), the initial loading of each OTU was kept on the first component. Following Carrascal et al. (2009), only OTUs (from the microbiota or the potential pathobiota) with a loading value above 0.2 were considered as significant. Significance of the OTUs included in the linear combinations was further estimated by a Jackknife resampling approach, leaving out 10% of the samples 1,000 times (Supplementary Text). Only OTUs with a loading value above 0.2 in more than 75% of the resampled matrices were considered as significant.
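The Jackknife selection rule can be sketched as follows. The sPLSR fit itself is not reimplemented: `loading_fn` is a hypothetical stand-in for any routine that returns one loading per OTU for a given subset of samples. Using the absolute value of the loading is also an assumption, since the text only says "above 0.2".

```python
import random


def jackknife_stable_otus(samples, loading_fn, n_resamples=1000,
                          leave_out=0.10, loading_cutoff=0.2,
                          min_fraction=0.75, seed=0):
    """Return OTUs whose loading exceeds the cutoff in most resamples.

    samples: list of sample identifiers.
    loading_fn: hypothetical callable taking a list of samples and
        returning a dict {otu: loading on the first component}.
    """
    rng = random.Random(seed)
    n_keep = len(samples) - round(leave_out * len(samples))
    counts = {}
    for _ in range(n_resamples):
        subset = rng.sample(samples, n_keep)  # leave out ~10% of samples
        for otu, loading in loading_fn(subset).items():
            if abs(loading) > loading_cutoff:
                counts[otu] = counts.get(otu, 0) + 1
    return {otu for otu, c in counts.items() if c / n_resamples > min_fraction}


def toy_loading_fn(subset):
    """Invented loadings: otu_c is only strong when sample 0 is retained."""
    return {
        "otu_a": 0.5,
        "otu_b": 0.1,
        "otu_c": 0.3 if 0 in subset else 0.0,
    }


stable = jackknife_stable_otus(list(range(50)), toy_loading_fn)
```

In the toy case, ‘otu_c’ passes the cutoff in roughly 90% of the resamples because sample 0 is retained about nine times out of ten, so it is kept alongside ‘otu_a’.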

Isolation and characterization of bacterial strains belonging to the potential pathobiota

Isolation and characterization of strains belonging to the P. syringae complex

In order to isolate and characterize strains belonging to the P. syringae complex, two to three plants per population were collected and strains were isolated from both leaves and roots. Plants were transferred into sterilized bags and placed at 4°C before processing. Roots were cut from the rosette and washed after transfer to the laboratory. Both leaves and roots were placed into sterilized Eppendorf tubes containing 500 µl of sterilized distilled water. Leaves were homogenized with a scalpel and roots were placed in a bath sonicator for 10 min to allow the release of the endophytic bacteria. Suspensions from leaves and roots were then diluted and plated on Trypticase Soy Agar (TSA) medium supplemented with 100 mg/L of cycloheximide as described previously (Bartoli et al., 2014). Plates were incubated at 24°C for two days and cytochrome C oxidase activity was assessed on individual bacterial colonies. All cytochrome C oxidase negative colonies were stored at -20°C in 30% glycerol and screened by PCR amplification for the P. syringae marker (Psy) as described in Guilbaud et al. (2016). Strains positive for the Psy PCR were sequenced for the housekeeping gene citrate synthase (cts). We then performed a phylogenetic analysis to place the isolated strains within the diversity of the P. syringae complex. The cts sequences were trimmed and concatenated with DAMBE version 5.1.1 (Xia, 2013) and MEGA6 was used to infer the phylogeny with a maximum likelihood model (Tamura et al., 2013). Sequences for the cts genes are available in Supplementary Data 3. Previously published reference P. syringae sequences (Berge et al., 2014) were also included in the phylogeny to identify the phylogroups to which the strains isolated from A. thaliana belong. Information about the strains is available in Supplementary Data 3. Strains are stored and maintained in the bacterial collection of the Laboratory of Plant-Microbe Interactions (LIPM) of INRA (Toulouse, France) and are available upon reasonable request to the corresponding author.


Isolation and characterization of Xanthomonas campestris strains

For the isolation of strains belonging to the phytopathogenic species Xanthomonas campestris, A. thaliana leaves and roots were processed as described above for the isolation of strains belonging to the P. syringae complex. The homogenized suspensions obtained from leaves and roots were plated on King’s B medium (King et al., 1954) and plates were incubated for three days at 28°C before observation. Bacterial colonies that both displayed a yellow and mucoid morphology and were negative for fluorescence production under UV light were purified on TSA medium without the addition of antibiotics. The purified colonies were tested for the amplification of the Xanthomonas molecular marker rpfB, a gene involved in the regulation of pathogenicity factors, by using the primer set and PCR conditions previously described in Simoes et al. (2007). Strains positive for the amplification of rpfB were stored at -20°C in 30% glycerol. Phylogenetic affiliation of the Xanthomonas strains was performed with a portion of gyrB, with the primer set and PCR conditions described in Parkinson et al. (2007). Sequence and phylogenetic analyses were performed as described above for the P. syringae strains. Finally, only the strains falling phylogenetically within the X. campestris clade were further analyzed for their pathogenic behavior on both A. thaliana and tobacco plants. Information about the X. campestris strains, as well as the gyrB sequences used for their classification, is available in Supplementary Data 3.

Isolation and characterization of a Pantoea agglomerans strain

In order to test the pathogenicity of the third most abundant bacterial species identified in our potential pathobiota, one strain of P. agglomerans (0001-Pag-MONTI-DL) was isolated by using TSA medium with the addition of 100 mg/L of cycloheximide. The affiliation of the strain to the P. agglomerans species was confirmed by sequencing a portion of the gyrB housekeeping gene with the primer set described in Barret et al. (2015). The sequence of the strain is available in Supplementary Data 3.

Assessing pathogenicity on A. thaliana and a non-host species

Pathogenicity tests for strains belonging to the P. syringae complex

In order to test the pathogenicity of natural strains belonging to the Pseudomonas syringae complex, we first evaluated in planta bacterial growth over seven days for two P. viridiflava strains (0114-Psy-NAUV-BL and 0124-Psy-SAUB-AL) and two P. syringae sensu stricto strains (0132-Psy-BAZI-AL and 0143-Psy-THOM-AL) in the four corresponding local natural populations of A. thaliana, each represented by two randomly selected accessions (At-NAUV-B-7, At-NAUV-B-14, At-SAUB-A-3, At-SAUB-A-7, At-BAZI-A-1, At-BAZI-A-2, At-THOM-A-3 and At-THOM-A-6). A growth chamber experiment with 768 plants was set up at the Toulouse Plant Microbe Phenotyping Platform (TPMP) using a randomized complete block design (RCBD) with six experimental blocks. Each block comprised 128 plants corresponding to the combination of four strains × eight accessions × four time points of scoring (0, 3, 5 and 7 days post-inoculation). After a 4-day stratification treatment, plants were grown at 22 °C under 90% humidity and artificial light providing a 9-hr photoperiod, as described in Huard-Chauveau et al. (2013). Bacterial infection was conducted on 22-day-old plants using a blunt-ended syringe (Terumo® SYRINGE 1mL, SS+01T1). Three leaves per plant were infiltrated with 50 µL of a 10³ CFU mL⁻¹ bacterial solution. Plants were scored for bacterial growth by taking a hole-punch disc (Ø 7 mm) from each infected leaf and grinding the leaf discs with glass beads in 100 µL of Milli-Q water in a 96-well plate (25 strokes per second, twice for 30 s) to release bacteria. Appropriate serial dilutions, plating and calculation of the number of CFUs per cm² were performed according to Bartoli et al. (2014). For each plant, the number of CFUs per cm² was obtained by calculating the median of the three leaf discs.
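The conversion from colony counts to CFUs per cm² can be illustrated as follows. The disc diameter (7 mm) and the homogenate volume (100 µL) come from the text; the plated volume and the dilution factor are hypothetical parameters, since the plating details are given only by reference to Bartoli et al. (2014).

```python
import math
from statistics import median

DISC_DIAMETER_CM = 0.7   # hole-punch of 7 mm, from the text
HOMOGENATE_UL = 100.0    # each disc ground in 100 µL, from the text
DISC_AREA_CM2 = math.pi * (DISC_DIAMETER_CM / 2) ** 2


def cfu_per_cm2(colonies, dilution_factor, plated_ul):
    """Convert a plate count into CFUs per cm² of leaf.

    dilution_factor and plated_ul are illustrative assumptions
    (e.g. 100 for a 1:100 dilution, 10 for a 10 µL spot).
    """
    cfu_per_ul = colonies * dilution_factor / plated_ul
    return cfu_per_ul * HOMOGENATE_UL / DISC_AREA_CM2


def plant_cfu_per_cm2(discs):
    """Median over the leaf discs of one plant.

    discs: iterable of (colonies, dilution_factor, plated_ul) tuples.
    """
    return median(cfu_per_cm2(*disc) for disc in discs)


# Toy plant (invented counts) with three leaf discs.
toy_plant = plant_cfu_per_cm2([(38, 100, 10), (12, 100, 10), (45, 100, 10)])
toy_log10 = math.log10(toy_plant)
```

Taking the median of the three discs limits the influence of a single poorly infiltrated leaf; the log10 value corresponds to the response variable ‘logCFU’.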

To explore natural variation of in planta bacterial growth, we used the following mixed model:

$$
\begin{aligned}
\log\text{CFU}_{ijklm} = {} & \mu_{\text{trait}} + \text{block}_i + \text{strain}_j + \text{population}_k + \text{time}_l + \text{strain}_j \times \text{population}_k + \text{strain}_j \times \text{time}_l \\
& + \text{population}_k \times \text{time}_l + \text{strain}_j \times \text{population}_k \times \text{time}_l + \text{accession}_m(\text{population}_k) \\
& + \text{strain}_j \times \text{time}_l \times \text{accession}_m(\text{population}_k) + \varepsilon_{ijklm}
\end{aligned}
\tag{5}
$$

where ‘µ’ is the overall phenotypic mean; ‘block’ accounts for differences in micro-environment among the six experimental blocks; ‘strain’ measures the effect of the four Pseudomonas strains; ‘population’ accounts for differences among the four A. thaliana populations; ‘accession’ measures the mean effect of accessions within each population; ‘time’ tests the evolution of bacterial growth over time; interaction terms involving the ‘time’ term account for variation among strains and populations for bacterial growth over time; ‘ε’ is the residual term. All factors were treated as fixed effects, except ‘accession’, which was treated as a random effect.

In a second growth chamber experiment, we evaluated the occurrence of disease and estimated the extent of genetic variation of A. thaliana in response to natural strains of the Pseudomonas syringae complex. For this purpose, we used four strains of P. viridiflava (0114-Psy-NAUV-BL, 0124-Psy-SAUB-AL, 0105-Psy-JACO-CL and 0106-Psy-RADE-AL), four strains of P. syringae sensu stricto (0111-Psy-RAYR-BL, 0117-Psy-NAZA-AL, 0099-Psy-SIMO-AL and 0132-Psy-BAZI-AL) and the eight corresponding local natural populations of A. thaliana, each represented by one randomly selected accession (At-NAUV-B-14, At-SAUB-A-3, At-JACO-C-5, At-RADE-A-6, At-RAYR-B-13, At-NAZA-A-2, At-SIMO-A-15 and At-BAZI-A-2) (Supplementary Table 1). A growth chamber experiment with 320 plants was set up at the Toulouse Plant Microbe Phenotyping Platform (TPMP) using a randomized complete block design (RCBD) with five experimental blocks. Each block comprised 64 plants corresponding to the combination of eight strains × eight accessions. Growth chamber conditions were similar to those of the in planta bacterial growth experiment. Bacterial infection was conducted on 28-day-old plants using a blunt-ended syringe (Terumo® SYRINGE 1mL, SS+01T1). Three leaves per plant were entirely infiltrated with a 5 × 10⁷ CFU mL⁻¹ bacterial solution. Disease symptoms were scored visually 1, 2 and 3 days after inoculation as described in Roux et al. (2010). Each infected leaf received a score from 0 to 1, with 0 corresponding to no symptoms and 0.5 and 1 corresponding to medium and severe symptoms, respectively. These scores categorize the percentage of leaf area infected, as determined by the presence of visible chlorosis, water soaking, or cell death. We averaged the scores for the three infected leaves per plant.
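The structure of the randomized complete block design described above (every strain × accession combination present once per block, in random order) can be sketched as follows. The placeholder labels, the seed and the notion of a numbered position within a block are hypothetical; the actual layout used at the platform is not described.

```python
import random
from itertools import product


def rcbd_layout(strains, accessions, n_blocks, seed=0):
    """Return (block, position, strain, accession) rows for an RCBD.

    Every strain x accession combination occurs exactly once per block,
    and the order within each block is shuffled independently.
    """
    rng = random.Random(seed)
    layout = []
    for block in range(1, n_blocks + 1):
        treatments = list(product(strains, accessions))
        rng.shuffle(treatments)
        for position, (strain, accession) in enumerate(treatments, start=1):
            layout.append((block, position, strain, accession))
    return layout


# Placeholder labels standing in for the eight strains and eight accessions.
toy_strains = [f"strain_{i}" for i in range(1, 9)]
toy_accessions = [f"accession_{i}" for i in range(1, 9)]
toy_layout = rcbd_layout(toy_strains, toy_accessions, n_blocks=5)
```

With eight strains, eight accessions and five blocks, the layout contains 5 × 64 = 320 rows, matching the number of plants in the experiment.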

To explore natural variation of disease symptoms, we used the following mixed model:

$$
\begin{aligned}
\text{disease symptom}_{ijklm} = {} & \mu_{\text{trait}} + \text{block}_i + \text{strain}_j + \text{accession}_m + \text{time}_l + \text{strain}_j \times \text{accession}_m + \text{strain}_j \times \text{time}_l \\
& + \text{accession}_m \times \text{time}_l + \text{strain}_j \times \text{accession}_m \times \text{time}_l + \varepsilon_{ijklm}
\end{aligned}
\tag{6}
$$

where ‘µ’ is the overall phenotypic mean; ‘block’ accounts for differences in micro-environment among the five experimental blocks; ‘strain’ measures the effect of the eight Pseudomonas strains; ‘accession’ accounts for differences among the eight A. thaliana accessions; ‘time’ tests the evolution of disease symptoms over time; interaction terms involving the ‘time’ term account for variation among strains and populations for disease evolution; ‘ε’ is the residual term. All factors were treated as fixed effects, except ‘accession’, which was treated as a random effect.

In models (5) and (6), the significance of terms of fixed and random effects was evaluated as described in the subsection ‘Natural variation of microbiota and potential pathobiota’. A Bonferroni correction for the number of tests was performed at a nominal level of 5%.

Hypersensitive Reaction (HR) on tobacco was also tested for all P. syringae sensu stricto and P. viridiflava strains by infiltrating 20 µL of 10⁷ CFU mL⁻¹ bacterial suspensions in 14-day-old Nicotiana tabacum plants. Inoculated tobacco plants were incubated at room temperature for 24 hours before HR scoring (presence/absence). HR results are shown in Supplementary Data 3.

Pathogenicity tests for Xanthomonas campestris strains

The 62 X. campestris strains isolated from both A. thaliana leaves and roots (Supplementary Data 3) were tested for pathogenicity on the A. thaliana Kashmir accession (Kas-1), which is susceptible to most X. campestris strains isolated from crops (Huard-Chauveau et al., 2013). We set up a growth chamber experiment with 248 Kas-1 plants, corresponding to four plants per X. campestris strain. To control for micro-environmental effects among flats, one plant of each of the accessions Col-0 and Kas-1 and one plant of the mutant rks1-1 were added per flat and inoculated with the X. campestris control strain Xcc568 (Huard-Chauveau et al., 2013). Growth chamber conditions and pathogenicity tests (inoculation and disease scoring protocols) were similar to those described in Huard-Chauveau et al. (2013). Disease symptoms were scored on all plants 10 days post-inoculation according to a scale ranging from 0 (resistance) to 4 (susceptibility), as previously described in Meyer et al. (2005).

To explore natural variation among the 62 strains, we used the following mixed model (PROC MIXED procedure in SAS 9.3):

$$
\text{disease symptom}_{jklm} = \mu_{\text{trait}} + \text{Xcc strain}_j + \text{covCol-0}_k + \text{covKas-1}_l + \text{covrks1-1}_m + \varepsilon_{jklm}
\tag{7}
$$

where ‘µ’ is the overall phenotypic mean; ‘strain’ accounts for differences among the 62 X. campestris strains; ‘covCol-0’, ‘covKas-1’ and ‘covrks1-1’ are covariates accounting for micro-environmental differences among flats; ‘ε’ is the residual term. The term ‘strain’ was treated as a random effect whose significance was evaluated as described in the subsection ‘Natural variation of microbiota and potential pathobiota’.

Hypersensitive Reaction (HR) on tobacco was also tested for all the X. campestris strains as described above for the strains belonging to the P. syringae complex.

Pathogenicity tests for the Pantoea agglomerans strain

In order to test the pathogenicity of P. agglomerans on A. thaliana, the strain 0001-Pag-MONTI-DL was first inoculated on 4-week-old plants of the reference accession Col-0. We then estimated the extent of the genetic variation of A. thaliana in response to the strain 0001-Pag-MONTI-DL. For this purpose, we used 23 A. thaliana accessions collected in the Midi-Pyrénées region (At-BAZI-A-2, At-CLAR-A-2, At-FERR-A-14, At-JACO-C-5, At-LABA-B-17, At-MERE-A-2, At-MONTI-A-17, At-MONTI-A-5, At-MONTI-A-7, At-MONTI-B-10, At-MONTI-B-12, At-MONTI-B-16, At-MONTI-D-12, At-MONTI-D-14, At-MONTI-D-16, At-NAUV-B-14, At-NAZA-A-2, At-RADE-A-6, At-RAYR-B-13, At-SAUR-A-1, At-SIMO-A-15, At-VIEL-A-9 and At-VILLEM-A-12). A growth chamber experiment with 92 plants was set up based on a randomized complete block design (RCBD) with four experimental blocks. Each block contained one plant per accession. Plants were grown as described above for the X. campestris strains. Four weeks after sowing, three leaves of each plant were infiltrated in their entirety with a bacterial solution of 10⁸ CFU mL⁻¹ by pressing a blunt-ended syringe (Terumo® SYRINGE 1mL, SS+01T1) against their lower surface. Disease symptoms were scored at 2, 3, 4 and 5 days post-inoculation, according to a scale ranging from 0 (no visible symptoms) to 2 (both sides of the leaf presenting symptoms) per leaf, leading to a disease score per plant ranging from 0 to 6.

To explore natural variation of disease symptoms, we used the following mixed model at each of the four time points after inoculation (PROC MIXED procedure in SAS 9.3):

$$
\text{disease symptom}_{ij} = \mu_{\text{trait}} + \text{block}_i + \text{accession}_j + \varepsilon_{ij}
\tag{8}
$$

where ‘µ’ is the overall phenotypic mean; ‘block’ accounts for differences in micro-environment among the four experimental blocks; ‘accession’ accounts for differences among the 23 A. thaliana accessions; ‘ε’ is the residual term. The term ‘block’ was treated as a fixed effect while the term ‘accession’ was treated as a random effect. The significance of the terms was evaluated as described in the subsection ‘Natural variation of microbiota and potential pathobiota’. A Bonferroni correction for the number of tests was performed at a nominal level of 5%. At each time point after inoculation, broad-sense heritability (H²) was estimated by running model (8) with the PROC VARCOMP procedure in SAS 9.3.
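For reference, broad-sense heritability is commonly computed from the variance components of a model such as (8) as the share of the among-accession variance in the total phenotypic variance. The expression below is this standard definition, given here as an assumption because the text does not spell out the formula; the hat symbols denote the variance component estimates.

```latex
H^2 = \frac{\hat{\sigma}^2_{\text{accession}}}{\hat{\sigma}^2_{\text{accession}} + \hat{\sigma}^2_{\varepsilon}}
```

Under this definition, H² is expressed on an individual-plant basis and ranges from 0 (no detectable genetic variation among accessions) to 1 (all phenotypic variation attributable to accessions).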

RESULTS

Isolation and in planta tests for strains belonging to the A. thaliana pathobiota

Strains belonging to the P. syringae complex

We succeeded in isolating a total of 97 strains (Supplementary Figure S4), all from the leaf compartment. In agreement with the results obtained by the community profiling approach (Figure 2b), most of the strains (n = 74) clustered with P. viridiflava, and in particular with phylogroup 7 (Supplementary Figure S4). All the remaining strains belonged to P. syringae sensu stricto. Among these strains, 10, 6 and 2 strains were placed into phylogroups 2, 13 and 9, respectively (Supplementary Figure S4, Supplementary Data 3). A single strain was found for each of phylogroups 1 and 11. Three strains (0108-Psy-GAIL-BL, 0117-Psy-NAZA-AL and 0097-Psy-BAGNB-BL) were not affiliated with any P. syringae phylogroup and might represent new phylogroups (Supplementary Figure S4, Supplementary Data 3). We found more P. syringae strains in spring (n = 74) than in autumn (n = 23) (Supplementary Figure S4, Supplementary Data 3). These results are in accordance with those obtained by metagenomics, which showed a burst of strains belonging to the P. syringae complex during spring in the leaf compartment (Figure 2).

Hypersensitive Response (HR) tests on tobacco showed that 84 strains displayed positive reactions. Among the 13 HR-negative strains, eight belonged to the P. viridiflava phylogroup 7, four clustered in phylogroup 13 and one was not affiliated (Supplementary Data 3). Four strains (0114-Psy-NAUV-BL and 0124-Psy-SAUB-AL from the P. viridiflava phylogroup 7; 0132-Psy-BAZI-AL from phylogroup 2 and 0143-Psy-THOM-AL from phylogroup 13) were tested for in planta bacterial growth in the four corresponding local natural populations of A. thaliana, each represented by two randomly selected accessions. Highly significant bacterial growth was observed for each strain (Supplementary Table S2), with bacterial concentrations reaching 10⁶ CFU.cm⁻² 7 days post inoculation (Supplementary Figure S5a). In addition, eight strains were tested for pathogenicity on each of the eight corresponding local natural populations of A. thaliana. Disease symptoms were observed for strains belonging to either P. viridiflava (Supplementary Figure S6) or P. syringae sensu stricto (Supplementary Figures S5, S6 and S7). For example, the 0111-Psy-RAYR-BL strain (phylogroup 1) was highly aggressive on all the accessions tested (Supplementary Figure S5b). Interestingly, the appearance of symptoms over time was highly dependent on the interactions between the eight strains and the eight corresponding local accessions (Supplementary Table S3), suggesting G × G interactions. Notably, while almost no genetic variation in disease symptoms was observed among A. thaliana accessions for each P. syringae sensu stricto strain, disease symptoms largely differed among A. thaliana accessions for each P. viridiflava strain (Supplementary Figures S5, S6 and S7).

Taken together, our results demonstrated that most of the strains belonging to the P. syringae complex are able to induce disease on the host of origin as well as on a non-host species (i.e. tobacco), thereby supporting the pathogenic behavior of strains identified in the potential pathobiota.

Strains belonging to Xanthomonas campestris

By using the isolation and detection methods described above, we succeeded in isolating 62 Xanthomonas campestris strains. In agreement with the results obtained by the community profiling approach (Figure 2b), most of the strains were isolated during autumn and only seven of the 62 strains were isolated during spring. As shown in Supplementary Figure S8, all the strains were phylogenetically related to the pathovars campestris and raphani. However, most of the strains isolated from A. thaliana leaves and roots clustered in clades distinct from those of strains isolated from crops (Supplementary Figure S8).

HR tests on tobacco showed that 58 of the 62 X. campestris strains were able to induce a positive reaction 24 hours post infiltration. When infiltrated on the Kas-1 accession, all 62 strains were able to induce symptoms, with a degree of disease severity ranging from 1.4 to 2.6 (Supplementary Data 3, Supplementary Figure S9). Highly significant genetic variation was observed among the 62 strains (LRT = 16.7, P = 0.00004). In addition, disease symptoms were similar to those obtained when inoculating strains isolated from crops (data not shown). Taken together, these results confirmed that the 62 X. campestris strains isolated from the natural A. thaliana populations were pathogenic on their host (A. thaliana) and that 58 of them also induced a reaction on a non-host plant (tobacco), thereby reinforcing the pathogenic potential of the strains here included in the pathobiota of A. thaliana.

Pantoea agglomerans strain

The P. agglomerans strain (0001-Pag-MONTI-DL) isolated from the rosette of an A. thaliana plant collected in the MONTI-D population clustered with a P. agglomerans strain isolated from switchgrass and was phylogenetically related to other P. agglomerans strains isolated from other plant species or from environmental habitats (Supplementary Figure S10). When inoculated on the reference accession Col-0 as well as on 23 A. thaliana accessions of the Midi-Pyrénées region, the 0001-Pag-MONTI-DL strain was able to induce yellow necrosis on the infiltrated leaves of all the accessions (Supplementary Figure S11). Disease severity on the 23 accessions gradually increased after inoculation, and the most severe necroses were observed 5 days post inoculation (Supplementary Figure S12). We observed significant genetic variation among the 23 accessions for disease symptoms, with the highest value for broad-sense heritability observed 3 days post inoculation (H² = 0.86) (Supplementary Table S4). Taken together, these results confirmed the pathogenic behavior of the P. agglomerans strain isolated from A. thaliana leaves, thereby reinforcing the validation of the A. thaliana pathobiota identified here by metagenomics.

REFERENCES

Barret M, Briand M, Bonneau S, Préveaux A, Valière S, Bouchez O, et al. (2015). Emergence shapes the structure of the seed microbiota. Appl Environ Microbiol 81: 1257–1266.
Bartoli C, Berge O, Monteil CL, Guilbaud C, Balestra GM, Varvaro L, et al. (2014). The Pseudomonas viridiflava phylogroups in the P. syringae complex are characterized by genetic variability and phenotypic plasticity of pathogenicity-related traits. Environ Microbiol 16: 2301–2315.
Berge O, Monteil CL, Bartoli C, Chandeysson C, Guilbaud C, Sands DC, et al. (2014). A user’s guide to a data base of the diversity of Pseudomonas syringae and its application to classifying strains in this phylogenetic complex. PLoS ONE 9: e105547.
Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Lozupone CA, Turnbaugh PJ, et al. (2011). Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc Nat Acad Sci 108: 4516–4522.
Carrascal LM, Galván I, Gordo O. (2009). Partial least squares regression as an alternative to current regression methods used in ecology. Oikos 118: 681–690.
Guilbaud C, Morris CE, Barakat M, Ortet P, Berge O. (2016). Isolation and identification of Pseudomonas syringae facilitated by a PCR targeting the whole P. syringae group. FEMS Microbiol Ecol 92: fiv146.
Huard-Chauveau C, Perchepied L, Debieu M, Rivas S, Kroj T, Kars I, Bergelson J, Roux F, Roby D. (2013). An atypical kinase under balancing selection confers broad-spectrum disease resistance in Arabidopsis. PLoS Genet 9: e1003766.
King EO, Ward MK, Raney DE. (1954). Two simple media for the demonstration of pyocyanin and fluorescin. J Lab Clin Med 44: 301–307.
Lê Cao KA, Meugnier E, McLachlan G. (2010). Integrative mixture of experts to combine clinical factors and gene markers. Bioinformatics 26: 1192–1198.
Lê Cao KA, Rossouw D, Robert-Granié C, Besse P. (2008). Sparse PLS: variable selection when integrating omics data. Stat Appl Mol Biol 7: doi: 10.2202/1544-6115.1390.
Maestre FT. (2004). On the importance of patch attributes, environmental factors and past human impacts as determinants of perennial plant species richness and diversity in Mediterranean semiarid steppes. Diver Distrib 10: 21–29.
Markowitz VM, Chen I-M A, Palaniappan K, Chu K, Szeto E, Pillay M, et al. (2014). IMG 4 version of the integrated microbial genomes comparative analysis system. Nucl Acids Res 42: 560–567.
Meyer D, Lauber E, Roby D, Arlat M, Kroj T. (2005). Optimization of pathogenicity assays to study the Arabidopsis thaliana-Xanthomonas campestris pv. campestris pathosystem. Mol Plant Pathol 6: 327–333.
Montesinos A, Tonsor SJ, Alonso-Blanco C, Picó FX. (2009). Demographic and genetic patterns of variation among populations of Arabidopsis thaliana from contrasting native environments. PLoS ONE 4: e7213.
Parkinson N, Cowie C, Heeney J, Stead D. (2009). Phylogenetic structure of Xanthomonas determined by comparison of gyrB sequences. Int J System Evol Microbiol 59: 264–274.
Roux F, Gao L, Bergelson J. (2010). Impact of initial pathogen density on resistance and tolerance in a polymorphic disease resistance gene system in Arabidopsis thaliana. Genetics 185: 283–291.
Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, et al. (2009). Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol 75: 7537–7541.
Simoes THN, Gonçalves ER, Rosato YB, Metha A. (2006). Differentiation of Xanthomonas species by PCR-RFLP of rpfB and atpD genes. FEMS Microbiol Lett 271: 33–33.
Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. (2013). MEGA6: Molecular evolutionary genetics analysis version 6.0. Mol Biol Evol 30: 2725–2729.
Varghese NJ, Mukherjee S, Ivanova N, Konstantinidis KT, Mavrommatis K, Kyrpides NC, et al. (2015). Microbial species delineation using whole genome sequences. Nucl Acids Res 43: 6761–6771.
Wagner MR, Lundberg DS, Del Rio TG, Tringe SG, Dangl JL, Mitchell-Olds T. (2016). Host genotype and age shape the leaf and root microbiomes of a wild perennial plant. Nature Comm 7: 12151.
Xia X. (2013). DAMBE5: a comprehensive software package for data analysis in molecular biology and evolution. Mol Biol Evol 30: 1720–1728.

SUPPLEMENTARY FIGURES

Supplementary Figure S1. The invasion paradox. (A) Species richness of both native and invasive species is positively correlated with the quality of the environment. Abiotic factors often have similar effects on biodiversity and invader success. (B) The increasing quality of the environment leads to a positive relationship between species richness of invasive species and species richness of native species. (C) Elton’s hypothesis describing a negative association between the increase of biodiversity and success of the invaders. The interplay between resource availability and diversity in determining invasion resistance leads to the well-known phenomenon called ‘invasion paradox’. (D), (E) and (F) indicate application of the theories to microbial communities.

Supplementary Figure S2. Distribution of the number of reads (expressed in log10) for the 6,627 OTUs retained after data filtering (top panel) and for the remaining 271,706 OTUs discarded by the filtering (bottom panel). The mean number of reads per retained OTU was 51.3-fold higher than that of the discarded OTUs.

Supplementary Figure S3. PCoA performed on a Hellinger distance matrix based on rarefied data (top panel) and on a Jaccard similarity coefficient matrix (bottom panel).

Supplementary Figure S4. Phylogenetic tree inferred with a Neighbor Joining (NJ) model (3000 bootstrap repetitions) and based on the cts sequences (350 bp) of the 97 P. syringae strains isolated from the 163 A. thaliana populations. Bootstrap values are shown at each node and names of the strain at each branch. All the strains belong to the P. syringae complex and are mainly distributed in the phylogroups 7, 9, 1, 2, 11 and 13. Reference P. syringae strains representative of the 13 phylogroups are labeled in green. Phylogroup affiliation was based on the previous work from Berge et al. (2014).

[Tree figure. Tip labels: the 97 isolated strains and the reference strains; clade labels: Phylogroups 1 to 13 and Pseudomonas aeruginosa; scale bar: 0.01.]

Supplementary Figure S5. Pathogenicity of Pseudomonas sp. strains isolated from the 163 A. thaliana populations of the Midi-Pyrénées region. (A) Mean bacterial growth across eight accessions collected in the Midi-Pyrénées region for two strains of P. viridiflava (0114-Psy-NAUV-BL and 0124-Psy-SAUB-AL) and two strains of P. syringae sensu stricto (0132-Psy-BAZI-AL and 0143-Psy-THOM-AL). D0, D3, D5 and D7 indicate the number of days after inoculation. (B) Illustration of symptoms observed three days after inoculation for two strains of P. syringae. On each of the eight accessions collected in the Midi-Pyrénées region, symptoms were present for the strain 0111-Psy-RAYR-BL and absent for the strain 0117-Psy-NAZA-AL.

Supplementary Figure S6. Genetic variation among five accessions from the Midi-Pyrénées region for the response to the P. viridiflava strain 0124-Psy-SAUB-AL, three days after infiltration. Mock: infiltration with water.

[Photograph panel. Columns (accessions): NAZA-A-2, RADE-A-6, RAYR-B-13, SAUB-A-3 and BAZI-A-2; rows: mock and 0124-Psy-SAUB-AL.]

Supplementary Figure S7. Heatmap for the symptoms observed three days after inoculation with P. syringae strains. The heatmap shows the interactions between eight natural accessions of A. thaliana from the Midi-Pyrénées region and eight natural strains belonging to the P. syringae complex collected in the same populations as the eight natural accessions.

[Heatmap. Rows: strains; columns: accessions, in the order BAZI-A-2, NAZA-A-2, RAYR-B-13, SIMO-A-15, JACO-C-5, NAUV-B-14, RADE-A-6 and SAUB-A-3. Color classes for symptoms: 0 - 0.05, 0.05 - 0.1, 0.1 - 0.2, 0.2 - 0.3, 0.3 - 0.4, 0.4 - 0.5 and > 0.5.]

Species | Strain | Mean per strain
P. syringae | 0132-Psy-BAZI-AL | 0.01
P. syringae | 0117-Psy-NAZA-AL | 0.00
P. syringae | 0111-Psy-RAYR-BL | 0.62
P. syringae | 0099-Psy-SIMO-AL | 0.03
P. viridiflava | 0105-Psy-JACO-CL | 0.05
P. viridiflava | 0114-Psy-NAUV-BL | 0.21
P. viridiflava | 0106-Psy-RADE-AL | 0.22
P. viridiflava | 0124-Psy-SAUB-AL | 0.26

Mean per accession (in the column order given above): 0.30, 0.10, 0.06, 0.28, 0.23, 0.17, 0.09 and 0.18.

Supplementary Figure S8. Phylogenetic tree inferred with a Neighbor Joining (NJ) model (3000 bootstrap repetitions) and based on the gyrB sequences (280 bp) of the 62 Xanthomonas campestris strains isolated from the 163 A. thaliana populations. Bootstrap values are shown at each node and names of the strain at each branch. All the strains belong to the X. campestris group. Reference Xanthomonas strains are labeled in blue. Sequences for the reference strains were downloaded from GenBank.

[Tree figure. Tip labels: the 62 isolated strains (Xanthomonas campestris group) and reference strains of X. campestris (LMCP11), X. campestris pv. campestris (8004, CFBP1869 and CFBP5817), X. campestris pv. raphani (CFBP5828R and 756C) and other Xanthomonas species (X. oryzae pv. oryzae, X. arboricola, X. vesicatoria, X. citri, X. fuscans and X. translucens); scale bar: 0.01.]

Supplementary Figure S9. Distribution of disease symptoms observed on the accession Kas-1 among the 62 Xanthomonas campestris strains isolated from the 163 A. thaliana populations. The arrows indicate disease symptoms for the accessions Col-0 and Kas-1 and the mutant rks1-1 with the X. campestris control strain Xcc568 (Huard-Chauveau et al., 2013).

Supplementary Figure S10. Phylogenetic tree inferred with a Neighbor Joining (NJ) model (3000 bootstrap repetitions) and based on the gyrB sequences (290 bp) of the Pantoea agglomerans strain (labeled in green) isolated from the MONTI-D population. Bootstrap values are shown at each node and names of the strain at each branch. All the strains belong to the P. agglomerans group. Reference Pantoea sequences were obtained from JGI.

[Tree figure. Tip labels in the Pantoea agglomerans group, with isolation source: Eh318 (environmental), C410P1 (lettuce), LMAE-2 (environmental), RIT273 (environmental), P10c (apple tree), 0001-Pag-MONTI-DL, GR13 (switchgrass) and FDAARGOS_160 (human). Other tip labels: C9-1, DC283, PaMB1, YJ76 and LMG5342, with the species labels Pantoea vegans, Pantoea stewartii and Pantoea ananatis; scale bar: 0.01.]

Supplementary Figure S11. Pathogenicity of a strain of P. agglomerans (0001-Pag-MONTI-DL) isolated from the MONTI-D population of A. thaliana located in the Midi-Pyrénées region. Left panel: illustration of symptoms observed five days after inoculation on the reference accession Col-0. The two leaves on the left were inoculated on both sides of the midrib, while only one side of the midrib was inoculated for the three leaves on the right. Right panel: genetic variation observed among natural accessions collected in the Midi-Pyrénées region. Symptoms shown in the picture were obtained five days after inoculation of the P. agglomerans strain on three A. thaliana accessions of the Midi-Pyrénées region (MONTI-A-17, NAUV-B-14 and VILLEM-A-12). As illustrated, the 0001-Pag-MONTI-DL strain was able to induce disease symptoms, but with different severity among the three accessions.

Supplementary Figure S12. Dynamics of disease symptoms across 23 accessions of the Midi-Pyrénées region inoculated with the strain 0001-Pag-MONTI-DL of P. agglomerans. dpi: days post-inoculation.

Supplementary Figure S13. Box-and-whisker plot illustrating the significant difference of the relative abundance of the pathobiota in the leaf compartment between plants with visible disease symptoms - represented by the blue box (n = 424 samples) - and asymptomatic plants - represented by the red box (n = 396 samples) - when sampled in situ.

Supplementary Figure S14. Comparison of the parameters ‘k’ and ‘q’ of the non-linear model (pathobiota’s diversity ~ k*microbiota’s diversity – q*microbiota’s diversity*microbiota’s diversity) run on the raw data (red arrows) to a null distribution of those parameters obtained by creating 100 random microbiota OTU matrices paired with 100 random pathobiota OTU matrices.
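The comparison described in this legend can be sketched as follows. The code fits the two parameters of the model (pathobiota diversity = k*x - q*x*x, with x the microbiota diversity) by ordinary least squares and builds a null distribution of 'k' and 'q'. It is only an illustration: the fitting method is not stated in the legend, and shuffling the pathobiota diversity values is an assumed simplification of the random OTU matrices used by the authors. Function names and the seed are hypothetical.

```python
import random


def fit_k_q(x, y):
    """Least-squares estimates of k and q in y = k*x - q*x**2
    (no intercept), obtained from the 2 x 2 normal equations."""
    sxx = sum(v ** 2 for v in x)
    sx3 = sum(v ** 3 for v in x)
    sx4 = sum(v ** 4 for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2y = sum(a * a * b for a, b in zip(x, y))
    det = sxx * sx4 - sx3 ** 2
    if det == 0:
        raise ValueError("x values do not allow both k and q to be estimated")
    k = (sxy * sx4 - sx2y * sx3) / det
    q = -(sxx * sx2y - sx3 * sxy) / det
    return k, q


def null_k_q(x, y, n_random=100, seed=0):
    """Null distribution of (k, q): refit the model after randomly
    re-pairing pathobiota and microbiota diversity values
    (assumed stand-in for the random OTU matrices)."""
    rng = random.Random(seed)  # hypothetical seed
    shuffled = list(y)
    null = []
    for _ in range(n_random):
        rng.shuffle(shuffled)
        null.append(fit_k_q(x, shuffled))
    return null
```

The observed (k, q) pair would then be compared with the 100 pairs returned by `null_k_q`, as the red arrows are compared with the null distributions in the figure.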

Supplementary Figure S15. Variation among populations in the dynamics of the β-diversity (PCoA, first axis) of the microbiota between autumn and spring. Each dot corresponds to the mean β-diversity (estimated as BLUPs) of a population. ‘leaf’ n = 74 populations, ‘root’ n = 78 populations.

Supplementary Figure S16. Percentage of variance of the β-diversity among populations. Red and blue bars correspond to microbiota and potential pathobiota, respectively. A correction for the number of tests was performed to control the FDR at a nominal level of 5%: ns non-significant, ** 0.01 > P > 0.001, *** P < 0.001.

Supplementary Figure S17. Geographic variation of α-diversity (inferred as Shannon's index) and β-diversity (when considering the first axis of the PCoA) for the microbiota. For each ‘seasonal group x plant compartment’ combination, blue, green, yellow and red dots correspond to populations from the 1st, 2nd, 3rd and 4th quartiles of either the α-diversity or the β-diversity distribution (based on population BLUPs). Values indicate the percentage of α-diversity and β-diversity variation among populations. Significance after an FDR correction at a nominal level of 5%: ** 0.01 > P > 0.001, *** P < 0.001. Number of populations: ‘Autumn - Leaf’ n = 82, ‘Autumn - Root’ n = 82, ‘Spring w/ Autumn - Leaf’ n = 76, ‘Spring w/ Autumn - Root’ n = 79, ‘Spring w/o Autumn - Leaf’ n = 80, ‘Spring w/o Autumn - Root’ n = 80.

SUPPLEMENTARY TABLES Supplementary Table S1. Names and GPS coordinates (expressed in degrees) of the 163 populations.

Population name Locality Latitude Longitude AMBR-A Ambres 43.733229 1.823869 ANGE-A Saint Angel, Salvagnac 43.911999 1.656649 ANGE-B Saint Angel, Salvagnac 43.91214 1.656855 AULO-A Aulon 43.190552 0.815774 AURE-B Aureville 43.477976 1.452214 AUZE-A Auzeville 43.527792 1.491628 AXLE-A Ax les Thermes 42.724197 1.834034 AXLE-B Ax les Thermes 42.724588 1.833497 BACC-B Baccarets (Cintegabelle) 43.312225 1.515167 BACC-C Baccarets (Cintegabelle) 43.311868 1.515459 BACC-D Baccarets (Cintegabelle) 43.311866 1.515623 BACC-E Baccarets (Cintegabelle) 43.31187 1.515709 BACC-F Baccarets (Cintegabelle) 43.311926 1.515463 BAGNB-A Bagnères de Bigore 43.075729 0.151764 BAGNB-B Bagnères de Bigore 43.076454 0.151533 BANI-B Banios 43.043644 0.234303 BANI-C Banios 43.043644 0.234303 BARA-B Baraqueville 44.269727 2.426322 BARA-C Baraqueville 44.270842 2.427551 BARC-A 43.362044 0.387723 BARR-A Barry le Cas (Caylus) 44.202421 1.767492 BAZI-A Baziège 43.453602 1.620674 BELC-A Belcastel 44.387532 2.336117 BELC-B Belcastel 44.387527 2.336782 BELC-C Belcastel 44.389212 2.336636 BELL-A Belleserre 43.790307 1.106456 BERNA-A Bernac-dessus 43.16215 0.111398 BESS-A Bessuéjouls 44.526359 2.730092 BOULO-A Boulogne-sur-Gesse 43.28908 0.639795 BROU-A Brousse-le-château 43.999349 2.621684 BROU-B Brousse-le-château 44.033129 2.638672 BROU-C Brousse-le-château 44.03326 2.638683 BULA-A Bulan 43.039803 0.277297 BULE-B Buleix (Soulan) 42.91058 1.248122 CAMA-C Camarès 43.824878 2.881661 CAMA-D Camarès 43.823736 2.881003 CAMA-E Camarès 43.824878 2.881661 CAPE-A Lacappelle - Ségalar 44.108545 1.990168 CARL-A Carla-bayle 43.151102 1.3923 CASS-A Cassagne-Begontes 44.17653 2.518164 CAST-A Castelginset 43.698534 1.427856 CASTI-A Castillon en Cousserans 42.920498 1.034063 CEPE-A Cepet 43.755183 1.435978 CERN-A Saint-Rome-de-Cernon 44.01194 2.966488 CERN-B Saint-Rome-de-Cernon 44.014684 2.967927 CHEI-A Chein-dessus 43.013708 0.86707 CIER-A Cier sur Luchon 42.85332 0.602039 CIER-B Cier de Luchon 42.859978 0.600413 
CINT-A Cintegabelle 43.305466 1.520441 41

Supplementary Table S1. (continued)

Population name Locality Latitude Longitude CINT-B Cintegabelle 43.305611 1.520735 CLAR-A Saint Clar-de-Rivière 43.464776 1.219019 CLAR-B Saint Clar-de-Rivière 43.465281 1.218577 CLAR-C Saint Clar-de-Rivière 43.464058 1.21799 COLO-A Colombiès 44.346915 2.340243 COLO-B Colombiès 44.34773 2.339715 COLO-C Colombiès 44.34806 2.339698 COMT-A Villecomtal 44.540652 2.602245 CRAN-A Cransac (Aubin) 44.529845 2.260486 DAMI-A Damiatte 43.654515 1.977636 DAMI-B Damiatte 43.654515 1.977636 DAMI-C Damiatte 43.654515 1.977636 DECA-A Châteaude Cas (Espinas) 44.199896 1.77189 DIEU-A Ville-Dieu-du-temple 44.059797 1.220975 ESPE-B Esperausses 43.693335 2.534582 FAYA-A Fayet 43.8021 2.951709 FERR-A Ferrières 43.657743 2.44371 GAIL-A Gaillac 43.908928 1.900574 GAIL-B Gaillac 43.909032 1.901077 GREZ-A Grézian 42.876896 0.349714 JACO-A Jacoy (Boussenac) 42.905839 1.406513 JACO-C Jacoy (Boussenac) 42.905839 1.406513 JULI-A Saint Julien de Malmont (St Cyprien de Dourdou) 44.522606 2.36351 JUZE-A Juzes 43.448838 1.79053 JUZET-A Juzet d'Izaut 42.977713 0.756373 JUZET-B Juzet d'Izaut 42.977354 0.755498 JUZET-C Juzet d'Izaut 42.977354 0.755498 LABA-A Labarthe-sur-Lèze 43.45155 1.400498 LABA-B Labarthe-sur-Lèze 43.450892 1.40116 LABA-C Labarthe-sur-Lèze 43.451451 1.39935 LABAS-A La bastide de Sérou 43.00844 1.420039 LABAS-B La bastide de Sérou 43.008716 1.420053 LABR-A Labruguière 43.531185 2.262591 LACR-A Lacraste (Montgauch) 42.999869 1.075659 LACR-C Lacraste (Montgauch) 43.000155 1.075624 LAGR-A Lagraulhet St Nicolas 43.795323 1.073752 LAMA-A Lamasquère 43.487424 1.243559 LAMA-B Lamasquère 43.479745 1.241592 LANT-B Lanta 43.564943 1.65239 LANT-C Lanta 43.564822 1.65201 LANT-D Lanta 43.564822 1.65201 LAUZ-A Lauzerte 44.25608 1.140526 LECT-A 43.911721 0.629745 LECT-B Lectoure 43.911721 0.629745 LESP-A Les pujols 43.094237 1.719981 LOUB-A Loubens-Lauragais 43.574273 1.786038 LOUB-B Loubens-Lauragais 43.574647 1.785723 LUNA-A Lunax 43.339706 0.689839 LUZE-A Luzenac (Garanou) 42.764683 1.752959

42

Supplementary Table S1. (continued)

Population name Locality Latitude Longitude LUZE-B Luzenac (Garanou) 42.764419 1.753595 LUZE-D Luzenac (Garanou) 42.764419 1.753595 LUZE-E Luzenac (Garanou) 42.764683 1.752959 MARS-A Glaciane (Marsans) 43.662542 0.718265 MARS-B Glaciane (Marsans) 43.662542 0.718265 MART-A Martres Tolosane 43.202147 1.010976 MASS-A 43.437536 0.579271 MAZA-A Mazamet 43.497754 2.375372 MEDA-A Saint Medard 43.490485 0.461439 MERE-A Merens-les-Vals 42.656618 1.836221 MERE-B Merens-les-Vals 42.656546 1.836175 MERV-A Merville 43.720426 1.296824 MERV-B Merville 43.725141 1.247629 MONB-A 43.46529 0.986273 MONE-A Monestiès 44.115354 2.094725 MONF-A Monferran-Savès 43.616254 0.972435 MONT-A Montans 43.852212 1.87432 MONT-B Montans 43.852723 1.873536 MONTB-A Montbrun bocage 43.130495 1.269927 MONTG-B Montgaillard 43.12729 0.110681 MONTG-D Montgaillard 43.127713 0.110633 MONTI-A Montiès 43.389383 0.67282 MONTI-B Montiès 43.3839336 0.67257 MONTI-D Montiès 43.3839336 0.67257 MONTM-A Montmajou (Cier de Luchon) 42.86156 0.595943 MONTM-B Montmajou (Cier de Luchon) 42.861218 0.596869 MOUL-A Moularès 44.089762 2.296094 NAUV-A Nauviale 44.520751 2.427404 NAUV-B Nauviale 44.520418 2.427129 NAUV-C Nauviale 44.520397 2.42721 NAYR-A Le Nayrac (Cassagnes-Bégontes) 44.161368 2.544711 NAZA-A Saint-Pierre-de-Najac (Miramont de Quercy) 44.220329 1.064953 PAMP-A Pampelonne 44.124864 2.255514 PAMP-B Pampelonne 44.124876 2.255184 PANA-C Villefrance de Panat 44.078884 2.711136 PASD-B Pas du loup (Camarès) 43.811758 2.871661 PREI-A Preignan 43.717856 0.623298 PUYM-B Puymaurin 43.372913 0.765694 RADE-A Sainte Radegonde 44.345163 2.620821 RAYR-A Rayret (Cassagne-Begontes) 44.196005 2.493157 RAYR-B Rayret (Cassagne-Begontes) 44.196006 2.493076 REAL-A Réalmont 43.83165 2.20155 ROME-A Saint-Rome-du-tarn 44.041553 2.909576 ROQU-B Roquecourbe 43.667907 2.290214 SALE-A Saleich 43.024966 0.965965 SALV-A Saint-Salvy-de-la-Balme 43.602578 2.36338 SAMA-A 43.494325 0.92391 SAUB-A Saubens 43.464914 1.365136 SAUB-B Saubens 
43.474107 1.364175 43

Supplementary Table S1. (continued)

Population name Locality Latitude Longitude SAUB-C Saubens 43.475583 1.367589 SEIS-A 43.487302 0.588798 SIMO-A 43.449392 0.734601 SORE-A Sorèze 43.452628 2.072476 TARN-C Villemur-sur-Tarn 43.85328 1.502009 THOM-A Saint Thomas 43.513975 1.082859 VALE-A Valence d'Albiegeois 44.022296 2.403434 VICT-B Saint Victor et Melvieu 44.052243 2.834023 VICT-C Saint Victor et Melvieu 44.052243 2.834023 VIEL-A Vielmur sur Agout 43.623801 2.089616 VILLA-A Villate 43.458174 1.380951 VILLE-A Villenouvelle 43.439784 1.671 VILLE-B Villenouvelle 43.440342 1.669595 VILLE-C Villenouvelle 43.440024 1.670111 VILLE-D Villenouvelle 43.439733 1.670712 VILLEM-A Villembits 43.273815 0.321238


Supplementary Table S2. In planta bacterial growth of four natural Pseudomonas strains in four corresponding local natural populations of A. thaliana, each represented by two accessions.

Model terms | F or LRT | P
block | 28.95 | 9.97E-13
strain | 0.56 | 0.6396
population | 1.29 | 0.3292
time | 178.54 | 7.72E-08
strain*population | 0.21 | 0.9932
strain*time | 1.13 | 0.3373
time*population | 1.62 | 0.2484
strain*population*time | 0.35 | 0.9579
accession(population) | 1.20 | 0.2733
strain*accession(population) | 0.00 | 1.0000
time*accession(population) | 0.00 | 1.0000
strain*time*accession(population) | 0.00 | 1.0000

In planta bacterial growth (logCFU.cm-2) was modeled using a mixed model. Model random terms (in italics) were tested with likelihood ratio tests (LRT) of models with and without these effects. A Bonferroni correction for the number of tests was performed at a nominal level of 5%. Bold values indicate statistically significant results after correction for multiple comparisons.
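The Bonferroni rule mentioned in the note above can be illustrated with a minimal Python sketch. It assumes that the number of tests equals the 12 model terms listed in Supplementary Table S2, which the text does not state explicitly; the function name `bonferroni_significant` is hypothetical.

```python
# P values transcribed from Supplementary Table S2 (model term -> P).
# Assumption: the number of tests is the number of rows in the table (12).
p_values = {
    "block": 9.97e-13,
    "strain": 0.6396,
    "population": 0.3292,
    "time": 7.72e-08,
    "strain*population": 0.9932,
    "strain*time": 0.3373,
    "time*population": 0.2484,
    "strain*population*time": 0.9579,
    "accession(population)": 0.2733,
    "strain*accession(population)": 1.0,
    "time*accession(population)": 1.0,
    "strain*time*accession(population)": 1.0,
}


def bonferroni_significant(p_by_term, alpha=0.05):
    """Return the terms whose P value is below alpha / number of tests."""
    threshold = alpha / len(p_by_term)
    return [term for term, p in p_by_term.items() if p < threshold]


significant = bonferroni_significant(p_values)
print(significant)
```

Under this assumption the per-test threshold is 0.05 / 12, and only the terms with very small P values pass it.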


Supplementary Table S3. Natural interactions between eight local natural A. thaliana accessions and eight corresponding local natural Pseudomonas syringae strains for disease symptom evolution.

Model terms | F or LRT | P
block | 6.90 | 1.80E-05
strain | 1.77 | 0.0904
time | 29.38 | 0.0002
strain*time | 13.35 | 1.93E-13
accession | 0.00 | 1.0000
strain*accession | 0.00 | 1.0000
time*accession | 4.50 | 0.0339
strain*time*accession | 22.40 | 2.21E-06

Disease symptoms were modeled using a mixed model. Model random terms (in italics) were tested with likelihood ratio tests (LRT) of models with and without these effects. A Bonferroni correction for the number of tests was performed at a nominal level of 5%. Bold values indicate statistically significant results after correction for multiple comparisons.


Supplementary Table S4. Genetic variation among 23 accessions of the Midi-Pyrénées region for the response to the strain 0001-Pag-MONTI-DL of P. agglomerans at four time points after inoculation (dpi, days post-inoculation). H²: broad-sense heritability.

Time | Block LRT | Block P | Accession LRT | Accession P | H²
2 dpi | 4.8 | 0.0285 | 22.3 | 2.3E-06 | 0.78
3 dpi | 2.4 | 0.1213 | 36.1 | 1.9E-09 | 0.85
4 dpi | 0.3 | 0.5839 | 34.6 | 4.0E-09 | 0.84
5 dpi | 0.0 | 1.0000 | 10.5 | 0.0012 | 0.65

Disease symptoms were modeled using a mixed model. Model random terms (in italics) were tested with likelihood ratio tests (LRT) of models with and without these effects. A Bonferroni correction for the number of tests was performed at a nominal level of 5%. Bold values indicate statistically significant results after correction for multiple comparisons.
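As an illustration of how a likelihood ratio statistic maps to a P value, the sketch below computes the upper tail of a chi-square distribution with one degree of freedom. The choice of one degree of freedom is an assumption inferred from the tabulated numbers in Supplementary Table S4 (for example LRT = 4.8 with P = 0.0285), not a statement of the authors' exact procedure; `lrt_p_value` is a hypothetical helper name.

```python
import math


def lrt_p_value(lrt):
    """Upper-tail probability of a chi-square variable with 1 df at `lrt`.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    Assumption: the random terms are tested against a chi-square with 1 df.
    """
    return math.erfc(math.sqrt(lrt / 2.0))


# LRT values for the 'Block' effect in Supplementary Table S4.
for lrt in (4.8, 2.4, 0.3, 0.0):
    print(lrt, round(lrt_p_value(lrt), 4))
```

With this assumption, the computed values agree with the tabulated P values to the reported precision.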


Supplementary Table S5. Natural variation of α-diversity and β-diversity estimators of microbiota and potential pathobiota in natural populations of A. thaliana collected both in autumn and spring.

Each cell gives F or LRT, then P.

Model terms | microbiota richness | microbiota Shannon | microbiota PCo1 | microbiota PCo2 | potential pathobiota richness | potential pathobiota Shannon | potential pathobiota PCo1 | potential pathobiota PCo2
season | 0.3 0.7011 | 0.5 0.6277 | 0.3 0.7223 | 0.6 0.5858 | 1.6 0.3593 | 2.3 0.2474 | 1.8 0.3248 | 5.8 0.0473
plant compartment | 17.0 0.0004 | 32.4 1.9E-06 | 73.5 2.5E-11 | 33.1 1.7E-06 | 76.3 1.8E-11 | 8.4 0.0170 | 1.9 0.3117 | 80.5 1.6E-15
season*plant compartment | 1.3 0.4181 | 14.2 0.0016 | 30.8 3.9E-06 | 3.4 0.1460 | 0.0 0.9372 | 4.1 0.1113 | 3.4 0.1432 | 3.8 0.1185
population | 4.1 0.1029 | 0.3 0.7234 | 1.1 0.4567 | 1.0 0.4838 | 0.5 0.6277 | 0.1 0.8547 | 3.1 0.1653 | 1.0 0.4838
season*population | 4.6 0.0806 | 28.9 7.7E-07 | 124.3 5.6E-14 | 88.0 5.6E-14 | 9.9 0.0065 | 2.2 0.2548 | 5.3 0.0588 | 5.5 0.0538
plant compartment*population | 0.4 0.6763 | 1.2 0.4343 | 1.1 0.4567 | 0.1 0.8547 | 3.4 0.1422 | 0.8 0.5392 | 0.2 0.7813 | 0.0 1.0000
season*plant compartment*population | 2.4 0.2326 | 8.0 0.0160 | 6.7 0.0303 | 8.8 0.0108 | 0.0 1.0000 | 0.1 0.8547 | 0.0 1.0000 | 0.0 1.0000
sampling date (season) | 8.4 0.0016 | 11.0 0.0002 | 12.8 4.7E-05 | 17.2 1.7E-06 | 16.7 2.6E-06 | 17.7 1.4E-06 | 2.1 0.2474 | 10.7 0.0003
diameter(season) | 0.8 0.6025 | 0.4 0.7976 | 2.5 0.1726 | 0.1 0.9974 | 0.4 0.7834 | 0.3 0.8305 | 1.3 0.4343 | 0.5 0.7335
leaf number(season) | 0.0 1.0000 | 0.2 0.9372 | 1.2 0.4791 | 0.1 0.9812 | 1.7 0.3248 | 1.3 0.4327 | 2.2 0.2257 | 3.1 0.1083
obs | 6.6 0.0327 | 6.2 0.0390 | 1.1 0.4472 | 2.8 0.1892 | 20.7 4.0E-05 | 2.7 0.2052 | 0.3 0.6991 | 1.2 0.4295

Each trait was modeled separately using a mixed model. Model random terms (in italics) were tested with likelihood ratio tests (LRT) of models with and without these effects. A correction for the number of tests was performed across Supplementary Tables S5, S7, S8, S9, S10 and S12 to control the FDR at a nominal level of 5%. Bold values indicate statistically significant results after correction for multiple comparisons. ‘obs’, total number of observations.
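The FDR control mentioned in the note above can be sketched as follows. The text does not name the procedure used; the Benjamini-Hochberg step-up adjustment shown here is an assumption, and both the function name `benjamini_hochberg` and the input P values are arbitrary illustrations rather than values from the tables.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Assumption: BH is used as a stand-in for the unspecified FDR procedure.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Arbitrary illustrative P values (not taken from the tables).
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

A term would then be declared significant when its adjusted P value falls below the nominal level of 5%.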


Supplementary Table S6. Percentage of variance of α-diversity and β-diversity estimators of microbiota and potential pathobiota explained by the factors ‘season’, ‘plant compartment’, ‘population’ and their interactions in natural populations of A. thaliana collected both in autumn and spring.

Model terms | microbiota richness | microbiota Shannon | microbiota PCo1 | microbiota PCo2 | potential pathobiota richness | potential pathobiota Shannon | potential pathobiota PCo1 | potential pathobiota PCo2
season | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
plant compartment | 3.3 | 5.2 | 5.8 | 4.0 | 17.6 | 1.9 | 0.0 | 14.9
season*plant compartment | 0.0 | 2.3 | 2.7 | 0.4 | 0.0 | 0.6 | 0.5 | 0.4
population | 8.0 | 2.2 | 5.8 | 7.7 | 1.2 | 0.0 | 7.1 | 2.3
season*population | 7.9 | 26.8 | 53.8 | 43.2 | 11.5 | 8.6 | 9.8 | 7.6
plant compartment*population | 2.2 | 3.7 | 1.9 | 0.5 | 5.3 | 5.2 | 1.7 | 0.0
season*plant compartment*population | 6.4 | 9.1 | 3.8 | 6.5 | 0.0 | 1.0 | 0.0 | 0.0
error | 72.1 | 50.8 | 26.2 | 37.8 | 64.4 | 82.8 | 80.9 | 74.8

Bold values indicate statistically significant results after an FDR correction for multiple comparisons (see Supplementary Table S5). Italic values indicate statistically significant results before an FDR correction for multiple comparisons (see Supplementary Table S5).


Supplementary Table S7. Natural variation of α-diversity and β-diversity estimators of microbiota and potential pathobiota in all the natural populations of A. thaliana collected in spring.

Each cell gives F or LRT, then P.

Model terms | microbiota richness | microbiota Shannon | microbiota PCo1 | microbiota PCo2 | potential pathobiota richness | potential pathobiota Shannon | potential pathobiota PCo1 | potential pathobiota PCo2
w/wo_Autumn | 1.3 0.4134 | 0.3 0.7383 | 1.9 0.2984 | 1.7 0.3396 | 1.0 0.4834 | 0.6 0.5881 | 4.1 0.1043 | 1.1 0.4574
plant compartment | 11.5 0.0038 | 10.2 0.0068 | 40.6 2.2E-08 | 60.1 1.7E-11 | 73.7 4.8E-13 | 5.0 0.0694 | 0.0 0.9951 | 53.8 2.7E-10
w/wo_Autumn*plant compartment | 0.4 0.6816 | 0.0 0.9781 | 0.1 0.8894 | 0.7 0.5611 | 2.4 0.2326 | 0.9 0.5147 | 0.0 0.9874 | 4.6 0.0843
population(w/wo_Autumn) | 24.6 5.6E-06 | 36.5 1.9E-08 | 241.8 5.6E-14 | 216.1 5.6E-14 | 19.1 0.0001 | 11.6 0.0028 | 15.7 0.0004 | 8.1 0.0153
plant compartment*population(w/wo_Autumn) | 0.5 0.6277 | 12.5 0.0019 | 20.6 3.8E-05 | 19.5 0.0001 | 1.2 0.4343 | 2.9 0.1828 | 0.0 1.0000 | 2.6 0.2118
sampling date | 13.3 0.0017 | 3.3 0.1493 | 46.5 2.4E-09 | 48.1 1.4E-09 | 48.9 1.2E-09 | 65.1 5.4E-12 | 16.9 0.0004 | 38.2 7.5E-08
diameter | 5.1 0.0651 | 3.0 0.1711 | 2.6 0.2124 | 0.2 0.7791 | 0.5 0.6143 | 1.3 0.4210 | 0.1 0.8643 | 0.0 1.0000
leaf number | 0.6 0.5858 | 2.6 0.2095 | 4.8 0.0719 | 0.1 0.8345 | 2.4 0.2308 | 0.9 0.5199 | 0.4 0.6833 | 7.4 0.0219
obs | 11.0 0.0040 | 6.6 0.0313 | 0.9 0.5146 | 3.6 0.1324 | 35.0 5.6E-08 | 9.8 0.0071 | 0.4 0.6676 | 0.9 0.5105

Each trait was modeled separately using a mixed model. Model random terms (in italics) were tested with likelihood ratio tests (LRT) of models with and without these effects. A correction for the number of tests was performed across Supplementary Tables S5, S7, S8, S9, S10 and S12 to control the FDR at a nominal level of 5%. Bold values indicate statistically significant results after correction for multiple comparisons. ‘obs’, total number of observations.


Supplementary Table S8. Natural variation of α-diversity and β-diversity estimators of microbiota and potential pathobiota in the natural populations of A. thaliana collected in the seasonal group ‘autumn’.

Each cell gives F or LRT, then P.

Model terms | microbiota richness | microbiota Shannon | microbiota PCo1 | microbiota PCo2 | potential pathobiota richness | potential pathobiota Shannon | potential pathobiota PCo1 | potential pathobiota PCo2
plant compartment | 10.4 0.0073 | 31.1 3.8E-06 | 76.4 1.8E-11 | 8.4 0.0170 | 47.2 3.7E-08 | 16.8 0.0005 | 7.2 0.0306 | 24.7 7.6E-06
population | 11.4 0.0031 | 23.4 9.8E-06 | 83.2 5.6E-14 | 33.2 8.8E-08 | 3.8 0.1185 | 2.3 0.2444 | 12.3 0.0020 | 3.5 0.1362
plant compartment*population | 11.1 0.0037 | 23.8 8.1E-06 | 13.9 0.0009 | 15.0 0.0005 | 0.9 0.5118 | 0.2 0.7813 | 0.0 1.0000 | 0.0 1.0000
sampling date | 5.2 0.0660 | 13.3 0.0021 | 3.1 0.1726 | 29.6 4.2E-06 | 5.7 0.0558 | 4.2 0.1061 | 1.8 0.3190 | 6.1 0.0460
diameter | 0.2 0.7791 | 0.6 0.5979 | 1.8 0.3187 | 0.2 0.7791 | 0.6 0.6005 | 0.0 0.9952 | 2.2 0.2506 | 0.1 0.9023
leaf number | 0.0 0.9857 | 0.1 0.8547 | 1.0 0.4967 | 0.1 0.8643 | 0.3 0.7160 | 1.0 0.4850 | 1.5 0.3806 | 0.0 1.0000
obs | 3.0 0.1726 | 2.5 0.2283 | 1.4 0.3826 | 0.1 0.8345 | 5.7 0.0501 | 0.8 0.5395 | 0.1 0.8904 | 0.1 0.9058

Each trait was modeled separately using a mixed model. Model random terms (in italics) were tested with likelihood ratio tests (LRT) of models with and without these effects. A correction for the number of tests was performed across Supplementary Tables S5, S7, S8, S9, S10 and S12 to control the FDR at a nominal level of 5%. Bold values indicate statistically significant results after correction for multiple comparisons. ‘obs’, total number of observations.


Supplementary Table S9. Natural variation of α-diversity and β-diversity estimators of microbiota and potential pathobiota in the natural populations of A. thaliana collected in the seasonal group ‘spring with autumn’.

Each cell gives F or LRT, then P.

Model terms | microbiota richness | microbiota Shannon | microbiota PCo1 | microbiota PCo2 | potential pathobiota richness | potential pathobiota Shannon | potential pathobiota PCo1 | potential pathobiota PCo2
plant compartment | 8.2 0.0186 | 5.7 0.0558 | 13.4 0.0020 | 32.6 1.6E-06 | 54.8 2.0E-11 | 0.9 0.5237 | 0.1 0.9130 | 58.7 1.1E-09
population | 10.2 0.0056 | 23.2 1.1E-05 | 97.8 5.6E-14 | 90.9 5.6E-14 | 17.5 0.0002 | 3.4 0.1422 | 8.1 0.0153 | 7.3 0.0225
plant compartment*population | 0.0 1.0000 | 3.9 0.1130 | 14.7 0.0006 | 7.2 0.0235 | 0.0 1.0000 | 3.4 0.1422 | 0.0 1.0000 | 0.3 0.7234
sampling date | 9.4 0.0108 | 3.9 0.1201 | 26.4 1.4E-05 | 10.5 0.0068 | 19.8 0.0002 | 24.0 4.0E-05 | 5.3 0.0661 | 14.7 0.0013
diameter | 2.3 0.2444 | 0.7 0.5595 | 2.2 0.2622 | 0.0 1.0000 | 0.5 0.6161 | 0.5 0.6455 | 0.3 0.7466 | 0.2 0.7839
leaf number | 0.1 0.8415 | 0.3 0.7249 | 1.1 0.4479 | 0.0 1.0000 | 3.4 0.1435 | 2.1 0.2657 | 2.3 0.2423 | 5.2 0.0639
obs | 4.3 0.0912 | 3.6 0.1332 | 0.0 0.9886 | 2.9 0.1845 | 20.0 0.0001 | 3.0 0.1737 | 0.0 0.9175 | 1.9 0.2946

Each trait was modeled separately using a mixed model. Model random terms (in italics) were tested with likelihood ratio tests (LRT) of models with and without these effects. A correction for the number of tests was performed across Supplementary Tables S5, S7, S8, S9, S10 and S12 to control the FDR at a nominal level of 5%. Bold values indicate statistically significant results after correction for multiple comparisons. ‘obs’, total number of observations.


Supplementary Table S10. Natural variation of α-diversity and β-diversity estimators of microbiota and potential pathobiota in the natural populations of A. thaliana collected in the seasonal group ‘spring without autumn’.

Each cell gives F or LRT, then P.

Model terms | microbiota richness | microbiota Shannon | microbiota PCo1 | microbiota PCo2 | potential pathobiota richness | potential pathobiota Shannon | potential pathobiota PCo1 | potential pathobiota PCo2
plant compartment | 3.7 0.1315 | 4.5 0.0889 | 32.3 1.9E-06 | 27.2 1.0E-05 | 21.3 0.0001 | 6.1 0.0464 | 0.0 0.9723 | 10.7 0.0066
population | 15.0 0.0005 | 14.1 0.0009 | 151.4 5.6E-14 | 119.9 5.6E-14 | 3.5 0.1362 | 9.1 0.0094 | 7.3 0.0225 | 1.9 0.2984
plant compartment*population | 1.2 0.4343 | 9.5 0.0076 | 5.4 0.0560 | 13.0 0.0015 | 4.7 0.0766 | 0.2 0.7813 | 0.0 1.0000 | 2.4 0.2326
sampling date | 4.7 0.0834 | 0.4 0.6974 | 19.8 0.0002 | 42.4 8.3E-08 | 30.8 4.3E-06 | 41.5 1.3E-07 | 11.8 0.0045 | 23.4 0.0001
diameter | 4.0 0.1077 | 3.5 0.1354 | 0.5 0.6174 | 0.8 0.5274 | 0.0 0.9874 | 0.9 0.4998 | 2.0 0.2907 | 0.7 0.5600
leaf number | 4.1 0.1043 | 4.6 0.0836 | 6.7 0.0309 | 0.2 0.7791 | 0.0 1.0000 | 0.1 0.8719 | 1.4 0.3850 | 1.7 0.3435
obs | 6.5 0.0345 | 2.7 0.2066 | 2.1 0.2725 | 1.0 0.4905 | 16.4 0.0003 | 8.2 0.0152 | 1.2 0.4472 | 0.0 1.0000

Each trait was modeled separately using a mixed model. Model random terms (in italics) were tested with likelihood ratio tests (LRT) of models with and without these effects. A correction for the number of tests was performed across Supplementary Tables S5, S7, S8, S9, S10 and S12 to control the FDR at a nominal level of 5%. Bold values indicate statistically significant results after correction for multiple comparisons. ‘obs’, total number of observations.


Supplementary Table S11. Percentage of variance of α-diversity and β-diversity estimators of microbiota and pathobiota explained by the factors ‘population’, ‘plant compartment’ and their interactions in natural populations of A. thaliana collected either in autumn or in spring.

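Season | Model terms | microbiota richness | microbiota Shannon | microbiota PCo1 | microbiota PCo2 | potential pathobiota richness | potential pathobiota Shannon | potential pathobiota PCo1 | potential pathobiota PCo2
Autumn | population | 17.5 | 29.0 | 55.7 | 36.4 | 8.8 | 6.4 | 18.5 | 5.4
Autumn | plant compartment | 3.6 | 10.4 | 12.0 | 2.4 | 18.1 | 6.8 | 2.4 | 9.4
Autumn | plant compartment * population | 13.1 | 15.0 | 5.5 | 11.2 | 3.9 | 2.1 | 0.2 | 0.0
Autumn | error | 65.8 | 45.6 | 26.8 | 50.0 | 69.3 | 84.7 | 78.9 | 85.2
Spring w/ Autumn | population | 13.4 | 27.9 | 67.4 | 59.6 | 17.8 | 9.9 | 16.9 | 11.3
Spring w/ Autumn | plant compartment | 2.4 | 1.6 | 2.3 | 6.1 | 17.6 | 0.0 | 0.0 | 22.6
Spring w/ Autumn | plant compartment * population | 0.1 | 8.5 | 5.6 | 4.1 | 0.0 | 11.1 | 0.0 | 2.7
Spring w/ Autumn | error | 84.1 | 62.1 | 24.7 | 30.2 | 64.7 | 78.9 | 83.1 | 63.3
Spring w/o Autumn | population | 19.5 | 21.8 | 80.7 | 65.7 | 10.9 | 16.7 | 13.2 | 7.7
Spring w/o Autumn | plant compartment | 1.0 | 1.5 | 2.5 | 4.2 | 11.4 | 2.4 | 0.0 | 5.9
Spring w/o Autumn | plant compartment * population | 4.8 | 13.4 | 2.0 | 4.5 | 13.2 | 2.7 | 0.0 | 10.3
Spring w/o Autumn | error | 74.7 | 63.3 | 14.7 | 25.7 | 64.4 | 78.3 | 86.8 | 76.1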

Bold values indicate statistically significant results after an FDR correction for multiple comparisons (see Supplementary Tables S7, S8 and S9).


Supplementary Table S12. Natural variation of α-diversity and β-diversity estimators of microbiota and potential pathobiota for each ‘seasonal group x plant compartment’ combination.

Each cell gives F or LRT, then P.

Season | Compartment | Model terms | microbiota richness | microbiota Shannon | microbiota PCo1 | microbiota PCo2 | potential pathobiota richness | potential pathobiota Shannon | potential pathobiota PCo1 | potential pathobiota PCo2
Autumn | root | population | 50.6 2.0E-11 | 105.9 5.6E-14 | 212.3 5.6E-14 | 115.2 5.6E-14 | 0.1 0.8547 | 0.5 0.6277 | 5.1 0.0646 | 0.0 1.0000
Autumn | root | sampling date | 1.8 0.3155 | 11.4 0.0046 | 2.3 0.2480 | 20.7 0.0001 | 3.7 0.1307 | 3.4 0.1500 | 2.0 0.2882 | 5.1 0.0651
Autumn | root | diameter | 0.1 0.8547 | 0.3 0.7149 | 0.7 0.5611 | 0.0 0.9898 | 0.0 1.0000 | 0.6 0.5858 | 1.6 0.3583 | 1.2 0.4472
Autumn | root | leaf number | 0.1 0.8254 | 0.5 0.6502 | 0.3 0.7466 | 0.0 0.9167 | 0.0 0.9442 | 0.3 0.7466 | 0.7 0.5750 | 2.3 0.2464
Autumn | root | obs | 0.2 0.7791 | 0.0 0.9861 | 0.2 0.7976 | 0.0 0.9296 | 1.8 0.3181 | 2.4 0.2425 | 0.7 0.5611 | 0.9 0.5199
Autumn | leaf | population | 11.8 0.0026 | 37.4 1.2E-08 | 169.6 5.6E-14 | 44.4 3.9E-10 | 12.9 0.0015 | 4.6 0.0806 | 19.5 0.0001 | 1.3 0.4139
Autumn | leaf | sampling date | 10.1 0.0080 | 11.5 0.0046 | 4.5 0.0902 | 30.1 3.9E-06 | 3.7 0.1336 | 2.9 0.1886 | 0.6 0.6005 | 3.4 0.1493
Autumn | leaf | diameter | 0.9 0.5146 | 1.2 0.4343 | 0.1 0.8345 | 2.1 0.2747 | 0.6 0.5979 | 0.2 0.7910 | 0.6 0.5877 | 0.8 0.5395
Autumn | leaf | leaf number | 0.5 0.6519 | 0.0 0.9727 | 0.0 0.9626 | 0.4 0.6955 | 0.4 0.6676 | 1.1 0.4472 | 1.0 0.4967 | 0.9 0.5212
Autumn | leaf | obs | 5.4 0.0578 | 4.3 0.0918 | 2.8 0.1945 | 0.3 0.7131 | 3.5 0.1369 | 0.0 1.0000 | 0.2 0.7791 | 0.8 0.5237
Spring | root | population | 23.3 1.0E-05 | 92.4 5.6E-14 | 468.5 5.6E-14 | 461.8 5.6E-14 | 8.8 0.0108 | 9.6 0.0073 | 0.0 1.0000 | 13.3 0.0013
Spring | root | sampling date | 12.4 0.0025 | 2.0 0.2943 | 48.2 1.3E-09 | 56.0 7.6E-11 | 77.3 1.3E-13 | 78.7 1.1E-13 | 22.9 1.8E-05 | 52.2 8.6E-10
Spring | root | diameter | 1.3 0.4088 | 0.1 0.9023 | 0.3 0.7377 | 1.4 0.3983 | 1.7 0.3334 | 1.6 0.3525 | 0.8 0.5485 | 0.0 0.9918
Spring | root | leaf number | 0.7 0.5750 | 1.5 0.3618 | 2.8 0.1975 | 0.0 0.9823 | 0.4 0.6863 | 0.6 0.5858 | 2.8 0.1896 | 4.9 0.0695
Spring | root | obs | 11.1 0.0039 | 5.1 0.0650 | 1.6 0.3545 | 0.7 0.5553 | 0.6 0.6005 | 0.1 0.8345 | 0.0 1.0000 | 0.2 0.7994
Spring | leaf | population | 19.8 0.0001 | 50.1 2.5E-11 | 351.1 5.6E-14 | 285.8 5.6E-14 | 27.9 1.2E-06 | 16.2 0.0003 | 11.4 0.0031 | 7.0 0.0260
Spring | leaf | sampling date | 6.0 0.0460 | 3.2 0.1584 | 38.4 5.6E-08 | 34.0 3.4E-07 | 20.4 0.0001 | 22.8 2.9E-05 | 5.4 0.0609 | 9.0 0.0114
Spring | leaf | diameter | 5.0 0.0685 | 5.9 0.0460 | 2.7 0.2031 | 0.1 0.8890 | 0.1 0.9045 | 0.4 0.6763 | 0.0 0.9727 | 0.0 1.0000
Spring | leaf | leaf number | 0.8 0.5363 | 1.5 0.3615 | 4.6 0.0834 | 2.4 0.2318 | 4.5 0.0856 | 1.6 0.3462 | 0.2 0.7915 | 8.1 0.0158
Spring | leaf | obs | 3.4 0.1427 | 5.9 0.0462 | 2.8 0.1886 | 5.9 0.0460 | 25.1 6.3E-06 | 7.4 0.0225 | 0.3 0.7463 | 3.5 0.1375
Spring w/ Autumn | root | population | 7.5 0.0204 | 44.6 3.6E-10 | 221.5 5.6E-14 | 217.2 5.6E-14 | 10.0 0.0062 | 7.5 0.0204 | 0.0 1.0000 | 13.2 0.0013
Spring w/ Autumn | root | sampling date | 6.0 0.0471 | 1.4 0.3850 | 28.2 7.6E-06 | 13.5 0.0020 | 25.6 2.1E-05 | 27.6 1.2E-05 | 10.9 0.0047 | 19.9 0.0002
Spring w/ Autumn | root | diameter | 1.3 0.4097 | 0.2 0.7976 | 0.4 0.6833 | 0.7 0.5750 | 0.5 0.6174 | 0.1 0.8985 | 0.3 0.7423 | 0.2 0.8117
Spring w/ Autumn | root | leaf number | 0.6 0.5898 | 1.9 0.3081 | 0.7 0.5593 | 0.0 0.9727 | 0.2 0.7985 | 0.0 0.9861 | 2.0 0.2849 | 3.9 0.1156
Spring w/ Autumn | root | obs | 2.4 0.2326 | 1.0 0.4833 | 0.4 0.6829 | 0.1 0.8345 | 0.5 0.6161 | 0.0 0.9201 | 0.0 0.9346 | 1.1 0.4567
Spring w/ Autumn | leaf | population | 4.4 0.0879 | 27.9 1.2E-06 | 132.7 5.6E-14 | 103.0 5.6E-14 | 9.6 0.0073 | 4.4 0.0879 | 8.6 0.0119 | 0.6 0.5946
Spring w/ Autumn | leaf | sampling date | 7.1 0.0300 | 5.4 0.0617 | 20.5 0.0001 | 5.9 0.0501 | 9.6 0.0100 | 6.9 0.0328 | 1.4 0.3867 | 3.5 0.1451
Spring w/ Autumn | leaf | diameter | 2.6 0.2164 | 1.6 0.3583 | 2.5 0.2202 | 0.7 0.5695 | 0.4 0.6676 | 0.4 0.6704 | 0.8 0.5483 | 0.7 0.5760
Spring w/ Autumn | leaf | leaf number | 1.1 0.4727 | 0.1 0.8274 | 1.2 0.4418 | 0.0 0.9951 | 3.6 0.1315 | 5.1 0.0651 | 0.9 0.5146 | 5.0 0.0685
Spring w/ Autumn | leaf | obs | 3.9 0.1184 | 7.2 0.0249 | 0.0 1.0000 | 3.7 0.1240 | 10.8 0.0047 | 2.2 0.2512 | 0.0 1.0000 | 2.2 0.2603
Spring w/o Autumn | root | population | 17.0 0.0002 | 48.2 6.2E-11 | 244.2 5.6E-14 | 224.1 5.6E-14 | 0.0 1.0000 | 1.1 0.4567 | 0.0 1.0000 | 2.3 0.2444
Spring w/o Autumn | root | sampling date | 6.9 0.0328 | 0.7 0.5704 | 21.0 0.0001 | 47.2 2.0E-08 | 67.6 1.7E-12 | 59.7 1.9E-09 | 12.2 0.0027 | 34.3 3.9E-06
Spring w/o Autumn | root | diameter | 0.2 0.7801 | 0.1 0.8751 | 0.0 0.9823 | 0.4 0.6829 | 6.2 0.0405 | 7.9 0.0187 | 0.3 0.7020 | 1.6 0.3494
Spring w/o Autumn | root | leaf number | 0.0 0.9626 | 0.0 0.9346 | 2.1 0.2753 | 0.1 0.8547 | 0.6 0.5922 | 2.2 0.2603 | 0.1 0.8615 | 0.7 0.5556
Spring w/o Autumn | root | obs | 10.1 0.0066 | 4.9 0.0714 | 1.3 0.4097 | 0.7 0.5715 | 0.6 0.5898 | 0.4 0.6676 | 0.0 0.9727 | 2.7 0.2064
Spring w/o Autumn | leaf | population | 20.5 3.9E-05 | 24.1 7.1E-06 | 242.4 5.6E-14 | 176.1 5.6E-14 | 21.3 2.7E-05 | 10.9 0.0040 | 1.6 0.3511 | 6.3 0.0366
Spring w/o Autumn | leaf | sampling date | 0.8 0.5485 | 0.0 0.9727 | 17.8 0.0003 | 35.3 8.6E-07 | 10.3 0.0072 | 16.7 0.0006 | 4.6 0.0875 | 5.7 0.0552
Spring w/o Autumn | leaf | diameter | 6.0 0.0451 | 8.2 0.0153 | 0.6 0.5898 | 0.4 0.6870 | 0.2 0.7803 | 0.0 1.0000 | 1.7 0.3292 | 0.2 0.8117
Spring w/o Autumn | leaf | leaf number | 8.6 0.0129 | 7.0 0.0277 | 5.0 0.0685 | 4.9 0.0721 | 0.0 0.9845 | 0.7 0.5815 | 3.0 0.1737 | 0.9 0.4998
Spring w/o Autumn | leaf | obs | 0.5 0.6277 | 0.6 0.5992 | 6.1 0.0427 | 2.3 0.2474 | 13.8 0.0013 | 5.3 0.0617 | 1.0 0.4967 | 1.2 0.4343

Each trait was modeled separately using a mixed model. Model random terms (in italics) were tested with likelihood ratio tests (LRT) of models with and without these effects. A correction for the number of tests was performed across Supplementary Tables S5, S7, S8, S9, S10 and S12 to control the FDR at a nominal level of 5%. Bold values indicate statistically significant results after correction for multiple comparisons. ‘obs’, total number of observations.


Supplementary Table S13. Comparison of the fit between a linear model and a non-linear model on the relationship between the α-diversity estimators (species richness and Shannon index) of the microbiota and the α-diversity estimators (species richness and Shannon index) of the potential pathobiota.

Diversity estimate | Compartment | Season | lm intercept | P | lm beta | P | lm goodness of fit | nlm k | P | nlm q | P | nlm goodness of fit
richness | All samples | All seasons | 2.369 | 5.49E-16 | -0.005 | 6.60E-01 | 0.015 | 0.370 | 5.49E-16 | 0.013 | 5.49E-16 | 0.106
richness | All samples | autumn | 2.156 | 5.49E-16 | 0.017 | 2.36E-01 | 0.061 | 0.388 | 5.49E-16 | 0.013 | 5.49E-16 | 0.194
richness | All samples | spring w/ autumn | 2.802 | 5.49E-16 | -0.031 | 1.13E-01 | 0.087 | 0.364 | 5.49E-16 | 0.012 | 5.49E-16 | 0.035
richness | All samples | spring w/o autumn | 2.367 | 5.49E-16 | -0.019 | 3.04E-01 | 0.058 | 0.375 | 5.49E-16 | 0.015 | 5.49E-16 | 0.074
richness | Leaf samples | All seasons | 2.885 | 5.49E-16 | -0.025 | 6.95E-02 | 0.073 | 0.427 | 5.49E-16 | 0.015 | 5.49E-16 | 0.088
richness | Leaf samples | autumn | 2.579 | 5.49E-16 | 0.006 | 7.93E-01 | 0.019 | 0.445 | 5.49E-16 | 0.016 | 5.49E-16 | 0.158
richness | Leaf samples | spring w/ autumn | 3.656 | 5.49E-16 | -0.072 | 6.25E-03 | 0.196 | 0.440 | 5.49E-16 | 0.015 | 5.49E-16 | 0.061
richness | Leaf samples | spring w/o autumn | 2.652 | 5.49E-16 | -0.026 | 3.21E-01 | 0.073 | 0.412 | 5.49E-16 | 0.016 | 8.07E-14 | 0.031
richness | Root samples | All seasons | 1.925 | 5.49E-16 | -0.001 | 9.59E-01 | 0.002 | 0.302 | 5.49E-16 | 0.010 | 5.49E-16 | 0.112
richness | Root samples | autumn | 1.787 | 5.49E-16 | 0.006 | 7.53E-01 | 0.028 | 0.311 | 5.49E-16 | 0.010 | 5.49E-16 | 0.169
richness | Root samples | spring w/ autumn | 1.967 | 1.53E-12 | 0.006 | 8.49E-01 | 0.018 | 0.282 | 5.49E-16 | 0.008 | 7.95E-09 | 0.053
richness | Root samples | spring w/o autumn | 2.162 | 5.49E-16 | -0.027 | 2.75E-01 | 0.096 | 0.335 | 5.49E-16 | 0.013 | 1.27E-13 | 0.133
Shannon | All samples | All seasons | 0.382 | 4.05E-15 | 0.053 | 3.26E-02 | 0.065 | 0.542 | 5.49E-16 | 0.139 | 7.15E-15 | 0.110
Shannon | All samples | autumn | 0.358 | 6.36E-10 | 0.068 | 2.71E-02 | 0.109 | 0.551 | 5.49E-16 | 0.139 | 5.38E-08 | 0.137
Shannon | All samples | spring w/ autumn | 0.394 | 6.47E-04 | 0.063 | 2.84E-01 | 0.060 | 0.561 | 3.68E-13 | 0.141 | 2.75E-05 | 0.141
Shannon | All samples | spring w/o autumn | 0.462 | 1.32E-04 | -0.005 | 9.48E-01 | 0.005 | 0.528 | 2.08E-10 | 0.144 | 1.18E-04 | 0.062
Shannon | Leaf samples | All seasons | 0.468 | 3.91E-12 | 0.025 | 4.79E-01 | 0.030 | 0.569 | 5.49E-16 | 0.145 | 1.77E-10 | 0.069
Shannon | Leaf samples | autumn | 0.448 | 5.85E-08 | 0.052 | 2.36E-01 | 0.079 | 0.575 | 2.32E-13 | 0.139 | 3.28E-05 | 0.073
Shannon | Leaf samples | spring w/ autumn | 0.533 | 3.89E-04 | -0.004 | 9.58E-01 | 0.005 | 0.631 | 1.78E-10 | 0.172 | 3.50E-05 | 0.152
Shannon | Leaf samples | spring w/o autumn | 0.452 | 7.50E-03 | 0.012 | 9.11E-01 | 0.010 | 0.513 | 2.79E-06 | 0.131 | 7.25E-03 | 0.051
Shannon | Root samples | All seasons | 0.309 | 1.71E-05 | 0.070 | 6.55E-02 | 0.087 | 0.525 | 1.32E-15 | 0.141 | 1.85E-06 | 0.144
Shannon | Root samples | autumn | 0.295 | 3.29E-04 | 0.051 | 2.87E-01 | 0.087 | 0.532 | 8.06E-09 | 0.153 | 1.48E-04 | 0.176
Shannon | Root samples | spring w/ autumn | 0.200 | 3.00E-01 | 0.163 | 8.68E-02 | 0.140 | 0.445 | 7.35E-04 | 0.087 | 1.68E-01 | 0.158
Shannon | Root samples | spring w/o autumn | 0.484 | 6.43E-03 | -0.033 | 7.45E-01 | 0.032 | 0.560 | 1.90E-05 | 0.167 | 5.01E-03 | 0.089

lm: linear model (pathobiota’s diversity ~ intercept + beta*microbiota’s diversity); nlm: non-linear model, quadratic function (pathobiota’s diversity ~ k*microbiota’s diversity - q*microbiota’s diversity*microbiota’s diversity). A correction for the number of tests was performed across the 96 P values to control the FDR at a nominal level of 5%. Bold values indicate statistically significant results after correction for multiple comparisons. Goodness of fit values obtained with the non-linear model are significantly higher than the goodness of fit values obtained with the linear model (paired t-test, t-value = 3.521, P = 0.0018).
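The two model forms compared in Supplementary Table S13 can be sketched with ordinary least squares in plain Python. This is an illustration only: the fitting method and the goodness-of-fit measure used by the authors are not specified here, so least squares and R² are assumptions, the function names are hypothetical, and the data are synthetic values generated from a quadratic curve.

```python
def fit_linear(x, y):
    """Least-squares fit of y = intercept + beta * x; returns (intercept, beta)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return my - beta * mx, beta


def fit_quadratic(x, y):
    """Least-squares fit of y = k * x - q * x**2 (no intercept); returns (k, q)."""
    s2 = sum(xi ** 2 for xi in x)
    s3 = sum(xi ** 3 for xi in x)
    s4 = sum(xi ** 4 for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sx2y = sum(xi ** 2 * yi for xi, yi in zip(x, y))
    # Normal equations: k*s2 - q*s3 = sxy and k*s3 - q*s4 = sx2y.
    det = s3 * s3 - s2 * s4
    k = (s3 * sx2y - s4 * sxy) / det
    q = (s2 * sx2y - s3 * sxy) / det
    return k, q


def r_squared(y, fitted):
    """Coefficient of determination (assumed stand-in for 'goodness of fit')."""
    my = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic data (not from the study): an exact quadratic with k=0.4, q=0.013.
x = [float(v) for v in range(1, 31)]
y = [0.4 * xi - 0.013 * xi ** 2 for xi in x]

intercept, beta = fit_linear(x, y)
k, q = fit_quadratic(x, y)
r2_lin = r_squared(y, [intercept + beta * xi for xi in x])
r2_quad = r_squared(y, [k * xi - q * xi ** 2 for xi in x])
print(round(k, 3), round(q, 3), r2_quad > r2_lin)
```

On data with a hump-shaped relationship, the quadratic form recovers k and q and fits better than the straight line, which is the kind of comparison the table reports.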


Supplementary Table S14. Relationships between the relative abundance of pathobiota and α-diversity estimators (species richness and Shannon index) of the microbiota and potential pathobiota.

Compartment | Season | microbiota richness rho | P | microbiota Shannon rho | P | pathobiota richness rho | P | pathobiota Shannon rho | P
All samples | all seasons | 0.044 | 9.85E-02 | 0.096 | 1.75E-04 | 0.514 | 1.32E-15 | 0.328 | 1.32E-15
All samples | autumn | 0.222 | 4.62E-08 | 0.266 | 4.38E-11 | 0.522 | 1.32E-15 | 0.321 | 9.80E-12
All samples | spring w/ autumn | -0.029 | 5.69E-01 | 0.022 | 6.51E-01 | 0.540 | 1.32E-15 | 0.295 | 1.08E-08
All samples | spring w/o autumn | -0.107 | 2.18E-02 | -0.020 | 6.61E-01 | 0.473 | 1.32E-15 | 0.362 | 2.91E-12
Leaf samples | all seasons | -0.071 | 5.85E-02 | -0.004 | 9.09E-01 | 0.481 | 1.32E-15 | 0.232 | 1.62E-09
Leaf samples | autumn | 0.093 | 1.25E-01 | 0.131 | 2.92E-02 | 0.465 | 2.60E-15 | 0.158 | 1.37E-02
Leaf samples | spring w/ autumn | -0.146 | 3.21E-02 | -0.084 | 2.30E-01 | 0.505 | 3.25E-14 | 0.273 | 1.25E-04
Leaf samples | spring w/o autumn | -0.226 | 3.66E-04 | -0.124 | 6.08E-02 | 0.443 | 4.18E-11 | 0.257 | 2.48E-04
Root samples | all seasons | 0.040 | 2.97E-01 | 0.094 | 1.06E-02 | 0.456 | 1.32E-15 | 0.384 | 1.32E-15
Root samples | autumn | 0.223 | 1.38E-04 | 0.266 | 4.12E-06 | 0.451 | 6.33E-10 | 0.382 | 2.55E-07
Root samples | spring w/ autumn | -0.088 | 1.88E-01 | -0.032 | 6.46E-01 | 0.501 | 1.14E-11 | 0.365 | 1.91E-06
Root samples | spring w/o autumn | -0.069 | 3.06E-01 | 0.037 | 6.05E-01 | 0.461 | 3.01E-09 | 0.465 | 2.37E-09

rho: Spearman’s rho. A correction for the number of tests was performed across the 48 P values to control the FDR at a nominal level of 5%. Bold values indicate statistically significant results after correction for multiple comparisons.
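For readers unfamiliar with the statistic, Spearman’s rho is the Pearson correlation computed on ranks. The sketch below is a minimal pure-Python version with average ranks for ties; the function names are hypothetical and the input vectors are arbitrary illustrative numbers, not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Arbitrary illustrative vectors (not from the study), with one tie in y.
print(round(spearman_rho([1, 2, 3, 4, 5], [5, 6, 7, 8, 7]), 4))
```

Because it works on ranks, the statistic captures monotonic relationships without assuming linearity.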
