<<

Exploring the Molecular Basis of Microbial Wine-Terroir From Deep Soil Horizons to Grapevines and Wines

Alex Gobbi

Phd Thesis by Alex Gobbi

Department of Environmental Science (ENVS), Aarhus University Environmental Microbiology and Biotechnology (EMBI) Environmental Microbial Genomics (EMG) Supervisor: Prof. Lars Hestbjerg Hansen Co-Supervisor: Dr. Lea Ellegaard-Jensen

“Getting a PhD is like performing a guitar solo in front of an audience, learning how to play the instrument while playing it” Samuel Butler & Alex Gobbi

“A man cannot discover new oceans unless he has the courage to lose sight of the shore.” Andre Gide

Cover Designed by: Alex Gobbi February 2019 Copyrights statement: - PCoA Plot from Manuscript 1 produced by Alex Gobbi - Phylogenetic Tree from Manuscript 4 by Marie Rønne Aggerback - Topsoil image and grapevine tree pictures are freely available on internet and “no copyright violation is intended” according to the Google copyrights policy.

Acknowledgements I’d like to acknowledge my supervisor Prof. Lars Hestbjerg Hansen, for three main reasons; first of all because he believed in me, at the time of my application, giving the chance of joining his group for this amazing adventure that has been MICROWINE. Second, because he supported me during my work, when necessary. Third, and most important to me; because he let me free to learn and make mistakes (only small ones ) on my own. This, needlessy to say, allowed me to understand what I like and what I don’t, as well as what I am good at and which are my lacks in this kind of job. As I mentioned, I have been part of this fantastic group called the MICROWINE Network, a project funded by Marie-Curie Foundation within the Horizon 2020 Programme. This experience, all those journeys, projected me into an astonishing Network of crazy people and experienced PIs which became, during these three years, colleagues, mentors and dear friends. I want to thank Elisa, my girlfriend, for accepting my choice regardless the sacrifices I have exposed our relationship to. My gratitude for you extends also to the working environment, since you have been, for more than a year, a tireless help in the lab and your skills have been appreciated and acknowledged by several members of MICROWINE. Voglio ringraziare i miei genitori per aver accettato la mia scelta di partire e per essermi stati vicino ogni giorno, in ogni momento, da ben prima dell’inizio di questa esperienza. Grazie. Back in Denmark, I want to thank my office-mates Alexander, Amaru, Ifigeneia and Olivia with whom I shared the coffee-free room and where every-day chats suddenly could turn into meaningful scientific or philosophical discussion. Thanks to my colleagues which always welcomed and help me if I needed; in particular a big thank goes to Lea for the co-supervision, to the IT-Crew of Tue (s), Patrick, Zohaib, Toke and Marie for the collaborations and to all the other PhD students; I wish you good luck for your work . Thanks to all the technicians, Tina, Tanja and Pia for keeping the lab clean and efficient, thanks also for the kindness of the everyday chats along the corridor. Thanks to the other PIs Witold, Niels, Anne, Carsten and Anders with whom I felt comfortable whenever I needed to ask for an advice. Last but not least, thanks to Klaus, our best secretary ever. You are the real Santa Klaus for all the newcomers and it has been a pleasure to be your office-neighbour. Finally I want to thank Alberto, Adrian, Merche, Ana, Veronica, Javier, Eneas, Nuria, Francisco, Ava, Berta, Hector, Charles, Luis and Ignacio, from Biome Makers, for welcoming and hosting me in Valladolid and California and all the wine-makers who allowed me to get samples from their precious vineyards. I want to thank the guys that run the Friday Bar in Risø every week, for the consistent supply of smiles, beers, pizzas and pool challenges. Thanks in particular to Simon, Megha and Ilaria for doing such a big part of the job. I want to thank my friends in Italy. Thanks to Stefano, Valentina, Jennyfer, Giada, Alessandro, Irene, Francesca, Giuditta, Zambo, Tale and Davide which I feel close even

though they are always far. I wish I could make a pocket-size version of you all and stuff it in my luggage. More than this, I wish I could be there for you if you need it. However, although my everyday thoughts goes also to the friends I left home, I am grateful to all the new people I have met here and, despite they come from far, are closer than many. A big hug goes to Susana, Valentina, Jay, Jin, Marta, Kristian, Jibran, Monica, Ulla, Mar all the others that I must admit I have sinfully forgotten. tina, Dario, Giulia, Michał and to In the end, thanks to the Sønderlundsvej Crew, all the people that have been living with me, sharing their life into our old house for a long or short period of time. This living experience made me understand that we don’t need much more than a roof above our head to feel like home. The fact that we will never feel the same as in our country does not mean we cannot feel equally good in other ways. So thanks to Ifigeneia, Christina, Daina, Robertino and the fresh (almost) newcomers Michele and Ginesta. I wish you good luck for your studies. Meanwhile, thanks for the guitar lessons, the Netflix nights on Lars’s couch, the dinners together, to the many Gin-Tonics and to the legendary hammock in the living room.

Abstract Microbes are crucial in each stage of the wine production process. The correlation between microbes and geography, incorporated in the notion of terroir in wine, can be associated to the unique impact the microbial community can have on wine taste and flavour. Applying Next Generation Sequencing technology and in particular meta- barcoding, we revealed the dynamics and the composition of different microbiomes related with the wine-milieu.

During this 3-years long microbiological journey, we extended the concept of the microbial terroir on a global scale, sequencing soil-DNA from a hundred of vineyards worldwide in four continents. This biogeographical correlation exists at different scales; between different wine-regions in the same country and between fields within the same region. We revealed unique biomarkers in remote volcanic vineyards in Azores and temperature-driven microbial gradients from North Ebro Valley to Ribera Del Duero, passing from La Rioja, in Spain. We discriminated the microbial communities between closely related vineyards in Pfalz, Germany, an area famous for the production of Riesling. Further, we characterized the microbiota of Danish vineyards, young-winemaker country, within the regions of Zealand and Funen.

Throughout several fruitful industrial and academic collaborations, we improved the existing DNA extraction methods for analysing the microbes in deep-soil cores taken within vineyards, by testing DNA-enhancer such as G2. Hence, with this new protocol, we characterize the vertical stratification of the microbial community associated with different types of soil in three countries (Spain, Australia and Denmark). In another collaborative work, we characterize the inner-mycobiome of grapevine trunks, looking at the impacts of plant-fungal-disease such as Esca and different fungicide treatments. At the same time, we also characterize the microbial community resident on the phyllosphere and its dependence on the grape cultivar.

Furthermore, by combining NGS and qPCR in a commercial vineyard of Western Cape, South Africa, we measured the dynamics of the leaves’ microbiota during the complete growing season. Therefore, we investigated the impact of two different fungicide treatments: a traditional copper spray compared with the application of Lactobacillus plantarum MW-1 as a biocontrol agent.

All of these contributions to the existing scientific literature will be valuable for microbial ecologist interested in the agricultural part of the winemaking industry. This information will also help the winemakers, raising awareness on biodiversity and the importance it can have within the modern wine industry.

Resumé Mikroorganismer er af afgørende betydning i enhver fase af vinproduktionen. Sammenhængen mellem mikroorganismer og geografi, elementer som blandt andre indgår i vin-miljøets begreb terroir, kan skyldes den unikke påvirkning, som det mikrobielle samfund har på vinens overordnede smag og dens smagsnuancer. Ved anvendelse af Next Generation Sequencing (NGS) teknologi og især metabarcoding afslørede vi dynamikken og sammensætningen af forskellige mikrobiomer relateret til vin-miljøet. I løbet af denne 3-årige lange mikrobiologiske rejse udvidede vi begrebet mikrobielle terroir på globalt plan på baggrund af vores verdensomspændende sekventering af DNA fra hundredevis af vingårde jorde - fordelt på fire kontinenter. Vi fandt at disse biogeografiske korrelationer eksisterer på forskellige skalaer; mellem forskellige vinregioner i samme land og mellem marker inden for samme region. Vi afslørede unikke biomarkører i fjerntliggende vinmarker på Azorerne med vulkaniske jorde og temperaturdrevne mikrobielle gradienter fra nordlige Ebro dalen til Ribera Del Duero, der passerer fra La Rioja i Spanien. Vi sammenlignede de mikrobielle samfund mellem nært beslægtede vinmarker i Pfalz, Tyskland, et område berømt for produktion af Riesling. Endvidere har vi karakteriseret mikrobiota fra danske vinmarker inden for regionerne Sjælland og Fyn - i et relativt nystartet vin-producerende land. Gennem flere frugtbare industrielle og akademiske samarbejder forbedrede vi de eksisterende DNA-ekstraktionsmetoder til analyse af mikrober i jordkerner taget inden for vinmarker f.eks. ved at teste effekten af G2 DNA-enhancer. Med den nyudviklede protokol karakteriserede vi de mikrobielle samfund i forskellige jordlag for forskellige jordtyper i tre lande (Spanien, Australien og Danmark). I et andet samarbejdsarbejde karakteriserede vi det indre mycobiome af vinstammer, og undersøgte virkningerne af plantesygdommen Esca og effekterne af fungicidbehandlinger på denne. På samme tid karakteriserede vi også det mikrobielle samfund, der er hjemmehørende på phyllosfæren og dens afhængighed af druesorter. Ved at kombinere NGS og qPCR i vores undersøgelserne på en vingård i Western Cape, Sydafrika, målte vi også blad-mikrobiome dynamikken i hele vækstsæsonen. Derfor undersøgte vi virkningen af to forskellige fungicidbehandlinger: en traditionel kobberspray sammenlignet med anvendelsen af Lactobacillus plantarum MW-1 som et bio-pesticid. Alle disse bidrag til den eksisterende videnskabelige litteratur vil være værdifulde for mikrobielle økologer, der er interesserede i landbrugsdelen af vinfremstillingsindustrien. Disse oplysninger vil også hjælpe vinproducenterne, øge bevidstheden om biodiversitet og den betydning, det kan have inden for den moderne vinindustri.

CONTENTS I) Statement of originality ...... 7 II) Structure of the manuscript ...... 7 III) Complete List of Publications ...... 8 Published ...... 8 Submitted (in bold only first authorship or shared-first authorship) ...... 8 In Preparation (in bold only first authorship or shared-first authorship) ...... 8 IV) List of dissemination activity ...... 9 V) Disclaimer ...... 10 1. Introduction ...... 12 1.1 The microbiome ...... 12 1.2 High Throughput Sequencing HTS ...... 12 1.3 DNA Sequencing in food microbiology ...... 12 1.4 Microbiology in the wine-milieu...... 12 1.5 Definition of microbial terroir ...... 13 1.6 Effect of geography on microbial terroir ...... 14 1.7 Impact of vineyard management on microbial terroir ...... 14 1.8 Limitations of meta-barcoding ...... 15 1.9 Microbial terroir and plant health ...... 16 1.10 Concluding remarks ...... 17 2 Assessment of methodologies ...... 18 2.1 Sequencing protocol for MICROWINE amplicon-based studies ...... 18 2.2 Bioinformatics Pipeline ...... 19 2.3 Bibliography ...... 21 3.0 Manuscripts ...... 25 Manuscript 1 ...... 26 Manuscript 2 ...... 49 Manuscript 3 ...... 74 Manuscript 4 ...... 110 Manuscript 5 ...... 147 Manuscript 6 ...... 167 Appendix A ...... 190 Appendix B...... 193

STATEMENTS OF THE AUTHOR

I) Statement of originality

This manuscript is the result of 3 years of PhD, conducted mostly at the Department of Environmental Science at Aarhus University (Roskilde), under the supervision of Prof. Lars Hestbjerg Hansen. All the work included in this thesis, where not diversely indicated, has been wrote by myself in absence of any undeclared conflict of interest. All the work performed was financed by the Horizon 2020 Program of the European Commission, through -Curie Innovative Training Network called MicroWine (grant number 643063). All the different sections of this thesis will be adapted for publicationthe elsewhere Marie Skłodowska including the abstract and the introductory chapter. The papers included in the second chapter are already submitted or published in scientific peer- reviewed international journals. All other manuscripts in progress are listed apart in the list of publications below. According to GSST rules, part of this thesis were also used in the progress report for the qualifying examination. As some of these papers present unpublished data, the manuscripts should be treated as confidential.

II) Structure of the manuscript

This manuscript is divided in three main chapters: The first chapter works as introductory background, in which I performed a review of the state-of-the-art regarding wine-related microbiome and its implications in the modern wine industry. The whole section will be adapted and published as part of a book chapter, edited by Elsevier, about Vineyard Microbiome. Furthermore, the summary of this thesis has been adapted and translated in order to be published on La Revue Des Oenologues; a famous French wine-journal. The second chapter includes an assessment of the applied methodologies, which is expected by the GSST guidelines (February 2016). The third part of this manuscript includes all the published and submitted scientific papers I wrote as first author, co-first authors and as second name. I have included in this section, only those that are coherent with my PhD Project within the MICROWINE Network. A complete list of all the publications, completed and expected in the next future, is reported below. The third chapter follow an imaginary line, which connect all the different aspects of the wine-production process and the terroir implications that I have been extensively studying during my PhD. This line ideally starts from the topsoil of different vineyards worldwide, it goes down deep into the soil and then rise up, through the roots, into the trunk of the grapevine trees reaching leaves and grapes. Before each paper there will be a short explanation of my contributions in it as expected from GSST rules.

III) Complete List of Publications

Published - Whole-Metagenome-Sequencing-Based Community Profiles of Vitis vinifera L. cv. Corvina Berries Withered in Two Post-harvest Conditions. Elisa Salvetti, Stefano Campanaro, Ilenia Campedelli, Fabio Fracchetti, Alex Gobbi, Giovanni Battista Tornielli, Sandra Torriani and Giovanna E. Felis. (2016), Frontiers in Microbiology, Vol. 7, pp 937. https://doi.org/10.3389/fmicb.2016.00937. - Assessing the impact of plant genetic diversity in shaping the microbial community structure of Vitis vinifera phyllosphere in the Mediterranean. Prashant Singh, Alex Gobbi, Sylvain Santoni, Lars H Hansen, Patrice This & Jean- Pierre Péros. (2018), Frontiers in Life Science, 11:1, 35-46, https://doi.org/10.1080/21553769.2018.1552628.

Submitted (in bold only first authorship or shared-first authorship) - Quantitative and qualitative evaluation of the impact of the G2 enhancer, bead sizes and lysing tubes on the bacterial community composition during DNA extraction from recalcitrant soil core samples based on community sequencing and qPCR. Alex Gobbi, Rui Gustavo Santini, Elisa Filippi, Lea Ellegaard-Jensen, Carsten Suhr Jacobsen, Lars Hansen. Sub. To Plos-One. Available on: doi: https://doi.org/10.1101/365395 - Seasonal microbial dynamics on grapevine leaves under biocontrol and copper fungicide treatments. Alex Gobbi, Ifigeneia Kyrkou, Elisa Filippi, Lea Ellegaard-Jensen, Lars Hansen. Sub. To Frontiers in Microbiology. https://doi.org/10.1101/523977 - Characterization of the wood mycobiome of Vitis vinifera in a vineyard affected by esca. Spatial distribution of fungal communities and their putative relation to foliar symptoms. Giovanni del Frari, Alex Gobbi, Marie Rønne Aggerbeck, Pedro Talhinhas, Helena Oliveira, Lars Hestbjerg Hansen, Ricardo Boavida Ferreira. Sub to Frontiers in Microbiology. - Vertical stratification of microbial communities in vineyard soils. Rui G. Santini, Alex Gobbi, Lea Ellegaard-Jensen, Lars H. Hansen, Per Möller & Kurt H. Kjær - World Microbial Vineyard’s Map: extending the concept of microbial terroir to a global scale A. Gobbi, A. Acedo, I. Belda, R.G. Santini, L. Ellegaard-Jensen, L.H. Hansen

In Preparation (in bold only first authorship or shared-first authorship) - Vintage variations on bacterial and fungal diversity of grapevine rhizosphere soils as an effect of geography.

M. Oyuela Aguilar, A. Gobbi, P.D. Browne, J.I. Quelas, L. Ellegaard-Jensen, L.H. Hansen and M. Pistorio. - Vineyard fungal and bacterial community variations as an effect of geography M. Oyuela Aguilar, A. Gobbi, P.D. Browne, L.H. Hansen and M. Pistorio. - Fungicides and the grapevine wood mycobiome. A case study on tracheomycotic ascomycete Phaeomoniella chlamydospora reveals potential for two novel control strategies. Giovanni del Frari, Alex Gobbi, Marie Rønne Aggerbeck, Lars Hansen, Ricardo Boavida Ferreira You can find the abstracts of these upcoming publications in the Appendix A at the end of the manuscript.

IV) List of dissemination activity

- Poster at European Bioinformatics Institute (EBI) in Cambridge (UK) - Poster at Danish Microbiological Society Conference 2015 in Copenhagen - Poster at Danish Microbiological Society Conference 2016 in Copenhagen - Poster at Danish Microbiological Society Conference 2017 in Copenhagen - Poster at X International Symposium on Grapevine Physiology and Biotechnology in Verona, (Italy) 2016 - Abstract at OIV Conference, Uruguay (2018) - Seminar at Edmund Mach Foundation in San Michele All'Adige (Italy) - Poster at NENUN 2016 in Copenhagen - Talk at NENUN 2017 in Copenhagen - Talk at ENOFORUM 2017 in Vicenza (Italy) - Talk at Plant Biological Network Symposium 2018 in Copenhagen - Interview at OggiScienza (Italian Scientific Journal) https://oggiscienza.it/2018/11/19/unicita-vino-terroir-microbico-suolo/ - Thesis Memories at La Revue Des Oenologues (French Wine-Journal) - Talk at La Cité du Vin in Bordeaux for the final MICROWINE Conference - Talk at Royal Danish Academy of Science in Copenhagen for the Mid-Term MICROWINE Conference

V) Disclaimer Microbial terroir The concept of terroir has been highly debated for many years; the term “terroir” identifies the unique metabolic signature, which characterize a specific wine. This signature results from a multitude of edaphic, climatic, human and biotic factors, which contributes to modify the quality and the traits of the resulting wine (OIV). The existence of terroir in its comprehensive definition, within the wine industry, is generally recognized and accepted. This led to a series of commercial practices based on certification and rules, which aim to protect and exploit this unicity for its economical value. Within the concept of terroir, microbial ecologist and microbiologist initially hypothesized that the wine-metabolic signature was partially influenced by microbes. In fact, microbes are important at any stage of the wine production process since they could affect plant health and physiology in vineyard and also carry out the fermentation process in wineries. This aspect has been called “microbial terroir”. In this document, I will refer often, directly or not, to the concept of microbial terroir, although in none of the cases, I have tested the metabolic effect on the resulting wine. This choice is due to the fact that I refer to microbial terroir as the biogeographical correlation between microbial community and wine-regions which potentially, could have an impact on wine quality. However, this is not a proof that the microbial terroir, identified in vineyards, determines an effect on the resulting wine. Moreover, the definition itself of terroir lacks of fully scientifically recognized criteria to be unquestionably defined. For instance, it is not clear if, once certified its existence, the microbial or metabolic signature remain constant between different vintages despite any variation (such as climatic change). Furthermore, the word “quality” and “taste” reported in the definition of terroir leads to a subjective interpretation. In this regard, I do not personally consider the existence of a microbial terroir as “condicio sine qua non” to produce a wine-region-specific-metabolic signature. Although this claim is possibly true, it is frequently used for marketing reason without performing scientific assessment. I therefore dissociate myself from the use of these words in such regard while I raise a toast to anyone who can scientifically prove that microbial terroir not only exist, but contributes to create unicity in a bottle of wine.

Inferring Pathogenicity from Phylogenetics Relations This thesis contains six manuscripts related to microbial ecology within the wine-milieu. The methodologies applied in all of them, include DNA amplicon sequencing analyses for the bacterial and fungal population. This allowed us to retrieve, among the others, taxa that include, in their groups, known plant-pathogens. Assessing pathogenicity only from amplicon based study, in such a way, it is scientifically wrong. Most of the times, in fact, it is possible to retrieve, within the same or species, pathogenic or non-pathogenic strains. With our method and the primers we chose, it is impossible to discriminate at strain-level resolution, therefore it would be wrong to infer pathogenicity. Furthermore, these inferences fall into the subject of plant pathology which requires other evidences

to evaluate the incidence and the severity of plant-diseases. For instance, assessing pathogenicity or, viceversa, the biocontrol activity of a microbial strain against a pathogen, requires in-vitro isolation, interaction tests and p identification of the microbes involved, together with repeated field-trials between different years and possibly in several locations. Although some of these studies have been done in collaboration with experienced phytopathologist, all the claims are meant to fall into the field of microbial ecology.

1. INTRODUCTION 1.1 The microbiome All the environments on the terrestrial biosphere host a complex community of microorganisms (Gibbons et al. 2016). The soil represent one of the most complex and biodiverse known habitats, which harbor thousands of different microbial species (Brussaard et al. 2007). The system, which include the habitat and all the genetic material belonging to its microbial community, is called “microbiome” (Backhed et al. 2005). 1.2 High Throughput Sequencing HTS In recent years, the development of high throughput sequencing (HTS) techniques and other -omics methodologies such as meta-transcriptomics, meta-proteomics and metabolomics allowed the scientific community to retrieve novel information about microbial communities in all different kinds of environments; from human body to terrestrial, aquatic and aerial ecosystems (Prosser 2015). Among all these new approaches, those based on HTS, allowed the description of microbial community taxonomic composition by exploiting the phylogenetic relations and gene variability (Orgiazzi et al. 2013). In particular, the PCR amplification of gene-markers such as 16S rRNA gene for , D1/D2 26S rRNA gene and Internal Transcribed Spacer (ITS) (Schmidt et al. 2013) for yeast and fungi, is an approach known as “metabarcoding”. Alternatively the usage of information retrieved from the whole metagenome of a sample can be investigated through a technique called “shotgun sequencing” or “metagenomics” which result less biased by the PCR step since it does not include any primer selection bias (Tedersoo et al. 2015). Both these approaches, allow the researcher to access a broader inedited amount of information that was impossible to retrieve few years before, with low-throughput molecular techniques such as Double Gradient Gel Electrophoreses (DGGE) (Morgan, du Toit, and Setati 2017).

1.3 DNA Sequencing in food microbiology Across all the field of research, the application of DNA-based techniques, especially High Throughput Sequencing (HTS) to food microbiology has been of great importance to improve our understanding on the hidden protagonists of important industrial and traditional processes such as (but not only) alcoholic, lactic and malolactic fermentations (Chen et al., 2017). The first one crucial for bread and for fermented beverages such as beer and wine; the second one mostly responsible for the entire fermented diary and meat product and finally the third one, which occur mainly in wine-related processes to enrich flavor and stabilize the final product. Furthermore, the use of –omics technologies could suddenly change the way in which food industry perform quality control, bringing advantages in terms of safety and traceability along the food chain (Ferri et al. 2015).For a perspective view on how the techniques based on HTS contributes to reveal the microbial landscape within food production, refer to Bokulich et al., 2016.

1.4 Microbiology in the wine-milieu The wine-milieu, among all others type of food, stand out because of its multi-billionaire worldwide market and its historical, traditional and cultural value (Sperling 2005). Although there are several ways of producing wine and tens of factors that could affect

the result, microbes represent one of the main driving force in its production. In fact, the crucial step from turning grape-juice into a fermented beverage is carried out mainly by yeast during alcoholic fermentation (FA) and bacteria during malolactic fermentation (MF).

All this interest about wine allowed the retrieval of a deep knowledge about the two main actors of the winemaking process: Saccharomyces cerevisiae and Oenococcus oeni; the first one responsible for the conversion of sugar into alcohol while the second one perform the crucial reaction of the MF. These two microbes are, in fact, enough and necessary to turn grape-juice into wine under standard fermentation conditions. However, wine is much more than only alcohol and lactic acid. In fact, it includes a complete collection of secondary metabolites, which composed the final “flavorome” as defined by Belda et al., in 2016. The production of these metabolites and its abundance is mostly dependent on strain-specific characteristic of the fermenting yeast and bacteria.

In most of the industrial fermentation, based on pure-starter-culture, the inoculum is enough to guarantee the transformation of grape-juice into wine. An alternative to this is a pure spontaneous fermentation, which characterize most of the wines produced in ancient time (Maro et al. 2007). A spontaneous fermentation is carried out by the resident microbial community persisting on the grapes and coming from vineyard and winery and by unintentional inoculum during the wine-making process (Navarrete-Bolaños 2012). This approach present several drawbacks, which discourage large-scale application, such as the presence of microbes responsible for off-flavors and wine-spoilage. However, it is commonly accepted that a more biodiverse active microbial community produced a wine with a richer complexity due to its bouquet of flavors which is perceived as richer by consumers (Tempère et al. 2018). Taking this risk could result in a high-quality wine, which have the advantage of being unique, but for this reason also almost impossible to reproduce equal between different vintages or in different areas.

1.5 Definition of microbial terroir The unicity of the wine is due to many other variables, which contributes not only during wine fermentation but also during grape development, affecting its sugar content, secondary metabolites production and also berry-health. Among these factors we can find soil type, edaphic factors, clime, grape variety, vineyard management system, solar exposition and then in the winery all the steps of the wine-production and ageing. All these variables, also including cultural, traditional, historical and human factors, shape the concept of terroir (For et al. 2010). The existence of a unique terroir imply the unicity of a wine production process and therefore also imply the presence of a metabolic signature, which characterize a specific wine, produced in a limited area in a specific moment (Vaudour 2010).

Across the two main wine-related production environments, vineyards and wineries, microbes are involved in almost all crucial steps. In vineyard, they can affect the geochemical cycles of nutrients interacting with the minerals in soil, colonize the plant

through the rhizosphere, roots, leaves, flowers and grapes (Zarraonaindia et al. 2015), also establishing a community into the trunks known as endophytic bacteria (Andreolli et al., 2016, Vitulo et al., 2018). These microbial interactions with the plant can affect its productivity, growth and health and, finally, cause an effect on the grape-clusters that are produced. Considering the plant and the living community on and into it as a meta- organism or holobiont (East 2013) allow the introduction, into the concept of terroir, of another variable that we can define as microbial terroir (Gilbert et al. 2014)

1.6 Effect of geography on microbial terroir A few pioneers’ studies, exploiting NGS-based technology, appeared in 2014 and 2015 respectively from Bokulich and Burns, revealing a link between grape must and soil community and the geography of the territory between different American Viticultural Areas (AVA). These areas are distinguished by edaphic factors such as climate, geology, elevation and other physical-chemical parameters. These bio-geographical patterns have been then confirmed in other places such as in Catalonia (Spain), where Portillo et al in 2016 discriminated the bacterial community between different vineyards from Priorat wine-region. Further studies, which addressed different questions, using samples from several locations, also reach similar conclusion; it is the case of Perazzolli et al 2014 in which they addressed the natural resilience of the grapevine phyllosphere after different treatments. In this study, indeed they used samples coming from three different locations and displayed a strong impact of this factor on to the result. Most of these studies take into account closely related wine-regions and different vineyards, generally within the same country. An exception is represented by the study of Mezzasalma et al (2018) in which they analyzed the impact of geography and cultivars in Northern Italy and Northern Spain retrieving consistent conclusion. Although most of these studies only regards the bacteriome, something was done to enlighten the dynamics of the mycobiome, to elucidate if these regional associations belongs also to this fraction of the microbial community. It is the case of Taylor et al, in 2014 where they display significant differences in terms of fungal population between different major areas of production in New Zealand. However, a broad study on world vineyard microbial distribution is still missing and could answer different questions about the variability and biodiversity on landscape level (Manuscript 1).

1.7 Impact of vineyard management on microbial terroir Geographical locations is not the only responsible for shaping the microbial community of a vineyard. All other edaphic and environmental factors seems to have some impacts on the microbiome at different degrees of magnitude (Lauber et al. 2008). For instance, vineyard management system has been show to affect the bacterial community in several occasion. One of the first studies, to analyze this impact of different management system using NGS technology, was Hartmann et al. in 2015 comparing long-term conventional and organic management in the DOK experiment in Switzerland. Although the results suggested certain microbial shifts between different microbiomes attributable to agricultural practices, they concluded that the highly complexity of the interactions, in the soil-system, make any simplistic statements fall short. Therefore, claims such as

“higher biodiversity under low-input farming” may not reflect the real scenario of a specific biome (Hartmann et al. 2015). In 2016 was Burns et al., to examine bacterial shifts among several AVAs in California as function of different soil parameters. In this study, factors such as the occurrence of cover crops and tillage, affect the bacterial populations. However, the effect was not always consistent and robust between different soil types. Until now, we have been mainly discussing about the bacterial community shifts associated with geography, environmental and agricultural factors. However, a vineyard is a much more complex system and microbes are also dispersed on the different plant tissues (C. Yang et al. 2001). One of the rising question was to understand if the biogeographical correlation between soil and microbes produces detectable differences on the fruit microbiome. Following Zarraonaindia et al (2015) it appears clear that the soil act as a major reservoir of bacterial community with a certain degree of co- occurrence in other plant tissues such as roots, leaves, flowers and grapes. Furthermore, they showed that many factors affect the tissue-associated microbial assemblages but the bacteriome was still distinguishable and dependent on biogeography and vineyard management practices. However, this did not happened for all the tissues and all the factors in the same way. In particular, leaves seemed not to be affected by the different cultivar or vineyard of provenance. This might be because the strength of the habitat- selection is high and so, the bacteria responsible for the shift, are negatively selected once they migrate on leaves. The same question, to highlight the effect of different soil-factors on fruit microbial community, can be addressed looking at the different management system used in vineyards. One study, conducted by Chou et al in 2018, demonstrated that vineyard under-vine management produced shifts onto soil-microbial composition but no correspondences were found looking at fruit microbiome. For both types of microbes (bacteria and fungi), the different floor-management treatments affect the soil community but not the fruit-associated microbial composition. In this study, they reduced the impact of geography comparing samples coming from the same area, a vineyard in Ovid (NY), and repeated the sampling along three consecutive years. They focused their attention on the application of pesticides and the selected presence of cover crops and so we cannot exclude that other vineyard management practices could be more influential in shaping the associated grape-microbiome. Reducing the confounding factors such as the timing of the sampling or the effect of geography is therefore a winning strategy to highlight the impact of secondary factors on the microbial community. In fact, in 2018 Singh et al. demonstrated some association at genus-level between different genetic pools analyzing 279 different cultivars within the same vineyard in Montpellier (France) (Manuscript 5).

1.8 Limitations of meta-barcoding All these kind of microbiome analyses based on meta-barcoding allow the researcher to draw conclusion on the composition and distribution of the different microbial communities associated with wine-regions. These microbes are described in terms of richness, evenness, biodiversity and taxonomical composition between areas and vineyards. However, the resolution we can have with these common techniques reaches,

in few cases, only genus or species level and furthermore no quantitative information are available (Elbrecht and Leese 2015). This is due to the techniques itself in which one gene is simply not enough to distinguish between strains, but it is also due to bioinformatics pipeline databases, which may contain only one representative sequence for each species (Coissac et al. 2012). Although a low-rank resolution is enough to recognize the microbial fingerprint of different vineyards or wine-region it is not appropriate to discriminate between different strain of the same species, which can differ by few genes which contribute in shaping the flavorome of wine. It is the case of S. cerevisiae, the yeast mainly responsible for the alcoholic fermentation worldwide (Borneman et al. 2011). This yeast is ubiquitous in wineries and vineyards (Goddard and Greig 2015) but the bouquet of secondary metabolites, its tolerance to ethanol and SO2 are characteristics that vary with its genome. Different strains of this yeast, therefore, behave differently during the fermentation (Tofalo et al. 2013). For this reason, it is important to characterize which strain contributes to the wine-making process. However metabarcoding it is not a suitable approach for this purpose while other molecular methods could be applied such as genome sequencing, pulsed-field gel electrophoresis, mitochondrial DNA – Restriction Fragment Length Polymorphism (mtDNA-RFLP), Random Amplified Polymorphic DNA (RAPD-PCR), microsatellite analyses and Multilocus Sequence Typing (MLST) could achieve the purpose (Tofalo et al. 2013).

A few studies on yeast, employing techniques based on the microbial fingerprint by using DNA polymorphism such as ARISA or T-RFLP on isolates, also reported the existence of patterns between regions and the different strains of S. cerevisiae (Tofalo et al. 2013). Most of these studies are conducted in New Zealand where in 2010 Goddard et al. revealed the link with geography above all the evidences of local dispersion driven by insects and global dispersion human-aided by mean of oak-barrels. This results were subsequently confirmed in 2012 by Gayevskiy & Goddard in which they display the geographic delineation of yeast communities associated with vines and wines reporting a remarkable 94% of region-specific genotypes. Finally, in 2015, Knight et al. demonstrate how the different region-associated yeast population can affect the phenotype of the wine. As for bacteria also yeast are dependent by farming practices, as elucidated by Setati et al. in 2012; Setati also demonstrated the intra-vineyard diversity that is possible to retrieve with a proper sampling of the interested area. Although most of these studies exclusively focus on S. cerevisiae, other yeast could be responsible for the metabolic fingerprint and display regional-association properties with vines and wines. This potential still needs to be fully elucidated although an interesting review on the available results has been written by Capozzi et al. in 2015, highlighting the perspective of using mixed non-Saccharomyces/Saccharomyces starter culture as an alternative to a pure spontaneous fermentation.

1.9 Microbial terroir and plant health A better understanding of the microbial dynamics can have several applications that span from a precision agricultural practice to accurate wine-making processes. In the above- mentioned studies, the impact of microbes appear to be important mainly regards the

resultant wine-phenotype. However, in agriculture, the local biodiversity should be monitored and preserved, to maintain the good health and yield of the vineyard.

Given the effect of different agricultural practices on the soil microbial community, the application of fungicides and pesticides represent another important aspect to consider, in relation with the interaction between plant and pathogens. Grapevine, as most of the modern crops, need to be treated in order to prevent or cure the different disease that can affect the health of soil, plant and grapes, endangering yield and quality of the berries. In recent years, the introduction of bacteria or yeast with biocontrol activity becomes a promising alternative to traditional methods based on chemicals. To measure the impact and interaction of these microbial agents on the resident community, studies based on the analyses of the microbiome in vineyards are necessary. An example of this can be found in Perazzolli et al. in 2014, where they demonstrate the impact of a biological control agent (BCA) known as Lysobacter capsici (AZ78) on bacterial and fungal community, also measuring the effectiveness against Plasmopara viticola, a -like microorganism responsible for Downy Mildew. The study of biological control in the microbiome era account several other studies, using different crops. For a comprehensive view we recommend to read Massart et al. 2015.

1.10 Concluding remarks To conclude, the analyses of the grape-microbiome based on HTS techniques allow the discovery of the complex microbial community underneath the wine-making process and rise further question on its interactions during fermentation, to shape the final product. To deepen the knowledge on the microbiome approximations of the concept of terroir we direct you to the review of Belda et al., in 2017. All other –omics techniques such as meta- transcriptomic, metabolomics and meta-proteomics are also applicable to the wine production process in order to raise the awareness over the importance microbiome can have on the wine flavor and aroma. A quick view over the implication a multi-omics approach can have on wine production is given by the review of Sirén et al 2019. This field of research, already intensively studied in the past years, is going to be updated and discussed in light of the upcoming results from new multi-omics integrated studies.

2 Assessment of methodologies 2.1 Sequencing protocol for MICROWINE amplicon-based studies A brief introduction about HTS has been given in section 1.2, here we will assess the methodology that has been mostly used within our Network and in particular in this project: amplicon-based DNA sequencing or “metabarcoding”. Within the MICROWINE Network, few PhD students are performing amplicon based studies to unravel the microbial communities in different samples under different conditions. To reserve the possibility, in the future, of comparing the outcomes of these separate studies into a single meta-analysis, we developed a standard protocol applied, with minor modifications, to most of the meta-barcoding studies within our group. The development of this sequencing protocol passed through the choice of a sampling grid method, common DNA extraction protocol, PCR-primers pairs for 16S and ITS libraries preparation and the choice of a bioinformatics pipeline. Furthermore, an evaluation of certain biases was measured. In particular, I focused on the effect of DNA dilution, addition of DNA- Extraction Enhancer (G2) and the use of Peptide Nucleic Acids (PNA) to prevent the amplification of unwanted DNA such as plant-DNA in host associated microbiota. Although the details of the protocol we performed is reported in each paper, I want to include here some of the results coming from the test we have done. Some of these results were also included in the mid-term report for the qualifying exam and I cited here according to GSST rules.

Experiment Purpose Results Material and Methods The number of the replicates is more 30 Soil Samples from a Understanding important than the vineyard in Hillerød if the sampling position inside the grid (Denmark) were collected 1 – Sampling scheme can of the field. The diversity in triplicates from 10 spots Strategy affect the final within the single field- along field diagonals and result spot is sometimes higher independently extracted than the diversity in and sequenced. different field-spots. Adopt a direct We based the shared DNA extraction DNA extraction protocol The sequencing protocol methods which 2 – DNA on a bead-beating based has been applied on soil, could be easily Extraction commercial kit for soil rhizosphere, must, leaves, adapted to which could be adapted wine and trunks. several kinds of for different samples samples Samples of leaves (low Understanding A moderate dilution can DNA and high inhibition) how much DNA overwhelm PCR- and soil (high DNA but also 3 – Impact of the dilution can inhibition problem high complexity and Dilution affect the without affecting the inhibition) were sequenced microbial final microbial after diluting 1:10, 1:100 composition composition and 1:1000 with water. Improve the DNA yield G2 increases the yield Soil-core samples were extracted from without affecting the taken from deep layers and 4- Use of G2 – soil cores resultant microbial extracted using different DNA Enhancer without community. More details lysing tubes. Details are affecting the are reported in reported in the M&M microbial Manuscript 2 section of Manuscript 2 representation Leaves samples were Prevent the sequenced to highlight amplification of The use of PNA allow us bacterial population with unwanted DNA to exclude grapevine and without PNA. We 5 – Impact of such as plastidial DNA during compared the bacterial PNA Blocking plastidial and PCR amplification. This community of samples Primers mitochondrial without affecting the where PNA was used with 16S genes in bacterial community samples where plastidial plant-associated reads were removed with a microbiota filtration algorithm Table 1; in this table I summarized the most important aspects we evaluated during the realization of the sequencing protocol. More information are reported in Appendix B. 2.2 Bioinformatics Pipeline The techniques based on –omics technologies and, especially in this work, on next generation sequencing, allow the production of a massive amount of data that requires to be processed. Many studies have been conducted to evaluate which bioinformatics

pipeline performs better and in which conditions, such as (López-García et al. 2018; C. X. Yang et al. 2013). However, especially when it comes to amplicon analyses, none of them has been considered best in all conditions. Among the methods which are considered the most widespread, we find QIIME and now QIIME 2 both from Caporaso et al. 2010 , Mothur (Schloss et al. 2009) , USEARCH (Edgar 2010) and R which recently implemented an algorithm for amplicon data analyses. These pipelines are different from each other with a relatively high degree of customization of the parameters. All of them are open- source and, except for QIIME2 (which include a GUI), all of them works within a UNIX- environment and are managed thorough the command line. These pipelines share a similar thread for data analyses, which include a quality control step on raw reads, and the possibility of merging paired-end reads. Then, a dereplication step is usually performed. In this step, all quality filtered reads that are identical to each other are merged and counted. After this step, we assist to the biggest conceptual differences between pipelines. For several years, dereplicated reads were clustered into Operational Taxonomical Unit (OTUs) using a fixed threshold of identity such as 97% or 99% and filtered for chimeras. Only recently, some pipelines implemented the possibility of using these dereplicated reads as amplicon exact sequence variants (ASVs) (Callahan et al. 2017) without preliminary clusterization. This approach has been integrated in UNOISE (Edgar 2016) and DADA2 (Callahan et al. 2017), respectively in use for USEARCH and QIIME2 or R. This difference is conceptually important since the OTUs clusterization leads to a loss of rare-high-quality reads that differs from the representative sequence of each OTUs only for few nucleotides. Although this difference could be due to PCR- amplification or sequencing errors, there is also the possibility that very similar reads represent different microbes. This might affect the results. However, the use of sequence variants introduce other problems such as an overestimation of the number of different microbes that represent the community. Furthermore, retaining many rare reads, with very low frequency, lead to the presence of many “zeros” within the frequency table of our dataset and this negatively affect statistical analyses. After this step, in either way, the frequency table is obtained and it includes all the representative reads of our community. Therefore, a taxonomical assignment is performed, using one of the databases of phylogenetic gene-markers such as Greengenes, Silva or UNITE. In this way, each read will be assigned to the corresponding bacterial or fungal name within a certain threshold of confidence. This process lead to the characterization of the microbial fingerprint on a specific sample. Changing bioinformatics pipeline, or just the parameters within the same command, can produce similar but not equal results (Pinto and Raskin 2012). This highlight the importance of been aware of what we are doing but also why and how. However, the importance of producing high quality data should not be confined in a secondary position. Indeed, it is rather easy to correct and refine a bioinformatics algorithm in order to adapt it to our need but even the best bioinformatics pipeline cannot retrieve much out of poor quality sequenced dataset. This is extensively analyzed in the paper of Murray et al. 2015, which I invite the readers to read.

2.3 Bibliography

Andreolli, Marco, Silvia Lampis, Giacomo Zapparoli, Elisa Angelini, and Giovanni Vallini. 2016. “Diversity of Bacterial Endophytes in 3 and 15 Year-Old Grapevines of Vitis Vinifera Cv. Corvina and Their Potential for Plant Growth Promotion and Phytopathogen Control.” Microbiological Research 183: 42–52. https://doi.org/10.1016/j.micres.2015.11.009. Backhed, Fredrik, Ruth E. Ley, Justin L. Sonnenburg, Daniel A. Peterson, and Jeffrey I. Gordon. 2005. “Host-Bacterial Mutualism in the Human Intestine - Supplemental Materials.” Science 307 (5717): 1915–20. https://doi.org/10.1126/science.1104816. Belda, Ignacio, Javier Ruiz, Ana Alastruey-Izquierdo, Eva Navascués, Domingo Marquina, and Antonio Santos. 2016. “Unraveling the Enzymatic Basis of Wine ‘Flavorome’: A Phylo- Functional Study of Wine Related Yeast Species.” Frontiers in Microbiology 7 (JAN): 1–13. https://doi.org/10.3389/fmicb.2016.00012. Belda, Ignacio, Iratxe Zarraonaindia, Matthew Perisin, Antonio Palacios, and Alberto Acedo. 2017. “From Vineyard Soil to Wine Fermentation: Microbiome Approximations to Explain the ‘Terroir’ Concept.” Frontiers in Microbiology 8 (MAY): 1–12. https://doi.org/10.3389/fmicb.2017.00821. Bokulich, N. A., J. H. Thorngate, P. M. Richardson, and D. A. Mills. 2014. “PNAS Plus: From the Cover: Microbial Biogeography of Wine Grapes Is Conditioned by Cultivar, Vintage, and Climate.” Proceedings of the National Academy of Sciences 111 (1): E139–48. https://doi.org/10.1073/pnas.1317377110. Bokulich, Nicholas A., Zachery T. Lewis, Kyria Boundy-Mills, and David A. Mills. 2016. “A New Perspective on Microbial Landscapes within Food Production.” Current Opinion in Biotechnology 37: 182–89. https://doi.org/10.1016/j.copbio.2015.12.008. Borneman, Anthony R, Brian A Desany, David Riches, Jason P Affourtit, Angus H Forgan, S Isak, Michael Egholm, and Paul J Chambers. 2011. “Whole-Genome Comparison Reveals Novel Genetic Elements That Characterize the Genome of Industrial Strains of Saccharomyces Cerevisiae” 7 (2). https://doi.org/10.1371/journal.pgen.1001287. Brussaard, Lijbert, Peter C De Ruiter, and George G Brown. 2007. “Soil Biodiversity for Agricultural Sustainability” 121: 233–44. https://doi.org/10.1016/j.agee.2006.12.013. Burns, Kayla N., Nicholas A. Bokulich, Dario Cantu, Rachel F. Greenhut, Daniel A. Kluepfel, A. Toby O’Geen, Sarah L. Strauss, and Kerri L. Steenwerth. 2016. “Vineyard Soil Bacterial Diversity and Composition Revealed by 16S RRNA Genes: Differentiation by Vineyard Management.” Soil Biology and Biochemistry 103: 337–48. https://doi.org/10.1016/j.soilbio.2016.09.007. Callahan, Benjamin J., Paul J. McMurdie, and Susan P. Holmes. 2017. “Exact Sequence Variants Should Replace Operational Taxonomic Units in Marker-Gene Data Analysis.” ISME Journal 11 (12): 2639–43. https://doi.org/10.1038/ismej.2017.119. Caporaso, J Gregory, Justin Kuczynski, Jesse Stombaugh, Kyle Bittinger, Frederic D Bushman, Elizabeth K Costello, Noah Fierer, et al. 2010. “QIIME Allows Analysis of High-Throughput Community Sequencing Data.” Nature Methods 7 (5): 335–36. https://doi.org/10.1038/nmeth.f.303. Capozzi, Vittorio, Carmela Garofalo, Maria Assunta Chiriatti, Francesco Grieco, and Giuseppe Spano. 2015. “Microbial Terroir and Food Innovation: The Case of Yeast Biodiversity in Wine.” Microbiological Research 181: 75–83. https://doi.org/10.1016/j.micres.2015.10.005. Chen, Gu, Congcong Chen, and Zhonghua Lei. 2017. “Trends in Food Science & Technology Meta- Omics Insights in the Microbial Community pro Fi Ling and Functional Characterization of Fermented Foods.” Trends in Food Science & Technology 65: 23–31. https://doi.org/10.1016/j.tifs.2017.05.002. Chou, Ming Yi, Justine Vanden Heuvel, Terrence H. Bell, Kevin Panke-Buisse, and Jenny Kao- Kniffin. 2018. “Vineyard Under-Vine Floor Management Alters Soil Microbial Composition, While the Fruit Microbiome Shows No Corresponding Shifts.” Scientific Reports 8 (1): 1–9. https://doi.org/10.1038/s41598-018-29346-1. East, Roger. 2013. “Microbiome: Soil Science Comes to Life.” Nature 501 (7468): S18–19.

https://doi.org/10.1038/501S18a. Edgar, Robert C. 2010. “Search and Clustering Orders of Magnitude Faster than BLAST.” Bioinformatics 26 (19): 2460–61. https://doi.org/10.1093/bioinformatics/btq461. Edgar, Robert C. 2016. “UNOISE2: Improved Error-Correction for Illumina 16S and ITS Amplicon Sequencing.” BioRxiv, 81257. https://doi.org/10.1101/081257. Elbrecht, Vasco, and Florian Leese. 2015. “Can DNA-Based Ecosystem Assessments Quantify — Sequence Relationships with an Innovative Metabarcoding Protocol,” 1–16. https://doi.org/10.1371/journal.pone.0130Species Abundance ? Testing Primer Bias and324. Biomass Ferri, Emanuele, Andrea Galimberti, Maurizio Casiraghi, Cristina Airoldi, Carlotta Ciaramelli, Alessandro Palmioli, Valerio Mezzasalma, Ilaria Bruni, and Massimo Labra. 2015. “Towards a Universal Approach Based on Omics Technologies for the Quality Control of Food” 2015. For, Method, T H E Determination, Dicarbonyl Compounds, O F Wine, B Y Gaseous, Phase Chromatography, After Derivatization, et al. 2010. “Certified in Conformity Tbilisi, 25,” no. June: 1–8. Gayevskiy, Velimir, and Matthew R. Goddard. 2012. “Geographic Delineations of Yeast Communities and Populations Associated with Vines and Wines in New Zealand.” ISME Journal 6 (7): 1281–90. https://doi.org/10.1038/ismej.2011.195. Gibbons, Sean M, Jack A Gilbert, Woods Hole, and Resource Sciences. 2016. “HHS Public Access,” 66–72. https://doi.org/10.1016/j.gde.2015.10.003.Microbial. Gilbert, J. A., D. van der Lelie, and I. Zarraonaindia. 2014. “Microbial Terroir for Wine Grapes.” Proceedings of the National Academy of Sciences 111 (1): 5–6. https://doi.org/10.1073/pnas.1320471110. Goddard, Matthew R., Nicole Anfang, Rongying Tang, Richard C. Gardner, and Casey Jun. 2010. “A Distinct Population of Saccharomyces Cerevisiae in New Zealand: Evidence for Local Dispersal by Insects and Human-Aided Global Dispersal in Oak Barrels.” Environmental Microbiology 12 (1): 63–73. https://doi.org/10.1111/j.1462-2920.2009.02035.x.

–6. https://doi.org/10.1093/femsyr/fov009. Hartmann,Goddard, Matthew Martin, BeatR, and Frey, Duncan Jochen Greig. Mayer, 2015. Paul “Saccharomyces Mäder, and Franco Cerevisiae : Widmer. A Nomadic 2015. “Distinct Yeast with Soil NoMicrobial Niche ?,” Diversity no. January: under 1 Long-Term Organic and Conventional Farming.” ISME Journal 9 (5): 1177–94. https://doi.org/10.1038/ismej.2014.210. Knight, Sarah, Steffen Klaere, Bruno Fedrizzi, and Matthew R. Goddard. 2015. “Regional Microbial Signatures Positively Correlate with Differential Wine Phenotypes: Evidence for a Microbial Aspect to Terroir.” Scientific Reports 5 (August): 1–10. https://doi.org/10.1038/srep14233. Lauber, Christian L, Michael S Strickland, Mark A Bradford, and Noah Fierer. 2008. “Soil Biology & Biochemistry The Influence of Soil Properties on the Structure of Bacterial and Fungal Communities across Land-Use Types” 40: 2407–15. https://doi.org/10.1016/j.soilbio.2008.05.021. López-García, Adrian, Carolina Pineda-Quiroga, Raquel Atxaerandio, Adrian Pérez, Itziar Hernández, Aser García-Rodríguez, and Oscar González-Recio. 2018. “Comparison of Mothur and QIIME for the Analysis of Rumen Microbiota Composition Based on 16S RRNA Amplicon Sequences.” Frontiers in Microbiology 9 (December): 3010. https://doi.org/10.3389/FMICB.2018.03010. Maro, Elena Di, Danilo Ercolini, and Salvatore Coppola. 2007. “Yeast Dynamics during Spontaneous Wine Fermentation of the Catalanesca Grape” 117: 201–10. https://doi.org/10.1016/j.ijfoodmicro.2007.04.007. Massart, Sébastien, Martinez Medina Margarita, and M. Haissam Jijakli. 2015. “Biological Control in the Microbiome Era: Challenges and Opportunities.” Biological Control 89: 98–108. https://doi.org/10.1016/j.biocontrol.2015.06.003. Mezzasalma, Valerio, Anna Sandionigi, Lorenzo Guzzetti, Andrea Galimberti, Maria S. Grando, Javier Tardaguila, and Massimo Labra. 2018. “Geographical and Cultivar Features Differentiate Grape Microbiota in Northern Italy and Spain Vineyards.” Frontiers in Microbiology 9 (MAY): 1–13. https://doi.org/10.3389/fmicb.2018.00946.

Morgan, Horatio H., Maret du Toit, and Mathabatha E. Setati. 2017. “The Grapevine and Wine Microbiome: Insights from High-Throughput Amplicon Sequencing.” Frontiers in Microbiology 8 (MAY). https://doi.org/10.3389/fmicb.2017.00820. Murray, Dáithí C, Megan L Coghlan, and Michael Bunce. 2015. “From {Benchtop} to {Desktop}: {Important} {Considerations} When {Designing} {Amplicon} {Sequencing} {Workflows}.” PLoS ONE 10 (4): e0124671. https://doi.org/10.1371/journal.pone.0124671. Navarrete-Bolaños, José Luis. 2012. “Improving Traditional Fermented Beverages: How to Evolve from Spontaneous to Directed Fermentation.” Engineering in Life Sciences 12 (4): 410–18. https://doi.org/10.1002/elsc.201100128. Orgiazzi, Alberto, Valeria Bianciotto, Paola Bonfante, Stefania Daghino, Stefano Ghignone, Alexandra Lazzari, Erica Lumini, et al. 2013. “454 Pyrosequencing Analysis of Fungal Assemblages From Geographically Distant, Disparate Soils Reveals Spatial Patterning and a Core Mycobiome.” Diversity 5 (1): 73–98. https://doi.org/10.3390/d5010073. Perazzolli, Michele, Livio Antonielli, Michelangelo Storari, Gerardo Puopolo, Michael Pancher, Oscar Giovannini, Massimo Pindo, and Ilaria Pertot. 2014. “Resilience of the Natural Phyllosphere Microbiota of the Grapevine to Chemical and Biological Pesticides.” Applied and Environmental Microbiology 80 (12): 3585–96. https://doi.org/10.1128/AEM.00415- 14. Pinto, Ameet J., and Lutgarde Raskin. 2012. “PCR Biases Distort Bacterial and Archaeal Community Structure in Pyrosequencing Datasets.” PLoS ONE 7 (8). https://doi.org/10.1371/journal.pone.0043093. Portillo, Maria del Carmen, Judit Franquès, Isabel Araque, Cristina Reguant, and Albert Bordons. 2016. “Bacterial Diversity of Grenache and Carignan Grape Surface from Different Vineyards at Priorat Wine Region (Catalonia, Spain).” International Journal of Food Microbiology 219: 56–63. https://doi.org/10.1016/j.ijfoodmicro.2015.12.002. Prosser, James I. 2015. “Dispersing Misconceptions and Identifying Opportunities for the Use of 'Omics' in Soil Microbial Ecology.” Nature Reviews Microbiology 13 (June): 439. https://doi.org/10.1038/nrmicro3468. Review, Invited. 2012. “Bioinformatic Challenges for DNA Metabarcoding of Plants and Animals,” 1834–47. https://doi.org/10.1111/j.1365-294X.2012.05550.x. Schloss, Patrick D., Sarah L. Westcott, Thomas Ryabin, Justine R. Hall, Martin Hartmann, Emily B. Hollister, Ryan A. Lesniewski, et al. 2009. “Introducing Mothur: Open-Source, Platform- Independent, Community-Supported Software for Describing and Comparing Microbial Communities.” Applied and Environmental Microbiology 75 (23): 7537–41. https://doi.org/10.1128/AEM.01541-09. Schmidt, Philipp André, Miklós Bálint, Bastian Greshake, Cornelia Bandow, Jörg Römbke, and Imke Schmitt. 2013. “Illumina Metabarcoding of a Soil Fungal Community.” Soil Biology and Biochemistry 65: 128–32. https://doi.org/10.1016/j.soilbio.2013.05.014. Setati, Mathabatha Evodia, Daniel Jacobson, Ursula Claire Andong, and Florian Bauer. 2012. “The Vineyard Yeast Microbiome, a Mixed Model Microbial Map.” PLoS ONE 7 (12). https://doi.org/10.1371/journal.pone.0052609. Singh, Prashant, Alex Gobbi, Sylvain Santoni, Lars H Hansen, Patrice This, and Jean-Pierre Péros. 2018. “Assessing the Impact of Plant Genetic Diversity in Shaping the Microbial Community Structure of Vitis Vinifera Phyllosphere in the Mediterranean.” Frontiers in Life Science 11 (1): 35–46. https://doi.org/10.1080/21553769.2018.1552628. Sirén, Kimmo, Sarah Siu Tze Mak, Ulrich Fischer, Lars Hestbjerg Hansen, and M. Thomas P. Gilbert. 2019. “Multi-Omics and Potential Applications in Wine Production.” Current Opinion in Biotechnology 56: 172–78. https://doi.org/10.1016/j.copbio.2018.11.014. Sperling, Brian K. 2005. “Information Distribution in Complex Systems to Improve Team Performance.” ProQuest Dissertations and Theses 3170108 (3): 258–258 p. https://doi.org/10.1128/mBio.00631-16.Editor. Taylor, Michael W., Peter Tsai, Nicole Anfang, Howard A. Ross, and Matthew R. Goddard. 2014. “Pyrosequencing Reveals Regional Differences in Fruit-Associated Fungal Communities.” Environmental Microbiology 16 (9): 2848–58. https://doi.org/10.1111/1462-2920.12456.

Tedersoo, Leho, Sten Anslan, Mohammad Bahram, Sergei Põlme, Taavi Riit, Ingrid Liiv, Urmas Kõljalg, et al. 2015. “Shotgun Metagenomes and Multiple Primer Pair- in Metabarcoding Analyses of Fungi” 43: 1–43. https://doi.org/10.3897/mycokeys.10.4852. Tempère, Sophie, Axel Marchal, Jean Christophe Barbe, Marina Bely, Isabelle Masneuf-Pomarede, Philippe Marullo, and Warren Albertin. 2018. “The Complexity of Wine: Clarifying the Role of Microorganisms.” Applied Microbiology and Biotechnology 102 (9): 3995–4007. https://doi.org/10.1007/s00253-018-8914-8. Tofalo, Rosanna, Giorgia Perpetuini, Maria Schirone, Giuseppe Fasoli, Irene Aguzzi, Aldo Corsetti, and Giovanna Suzzi. 2013. “Biogeographical Characterization of Saccharomyces Cerevisiae Wine Yeast by Molecular Methods” 4 (June): 1–13. https://doi.org/10.3389/fmicb.2013.00166. Vaudour, Emmanuelle. 2010. “The

Notions of TerroirQuality of Grapesat and WineVarious in Relation toScales” Geography : Notions1264. https://doi.org/10.1080/095712602200001798.of Terroir at Various Scales The Quality of Grapes and Wine in Relation to Geography : Vitulo, Nicola, Wilson José Fernandes Lemos Junior, Matteo Calgaro, Marco Confalone, Giovanna E. Felis, Giacomo Zapparoli, and Tiziana Nardi. 2018. “Bark and Grape Microbiome of Vitis Vinifera: Influence of Geographic Patterns and Agronomic Management on Bacterial Diversity.” Frontiers in Microbiology 9 (January): 3203. https://doi.org/10.3389/FMICB.2018.03203. Yang, ChenXue X., YingQiu Q. Ji, XiaoYang Y. Wang, ChunYang Y. Yang, and Douglas W. Yu. 2013. “Testing Three Pipelines for 18S RDNA-Based Metabarcoding of Soil Faunal Diversity.” Science China Life Sciences 56 (1): 73–81. https://doi.org/10.1007/s11427-012-4423-7. Yang, Ching-hong, David E Crowley, James Borneman, and Noel T Keen. 2001. “Microbial Phyllosphere Populations Are More Complex than Previously Realized” 98 (7). Zarraonaindia, Iratxe, Sarah M Owens, Pamela Weisenhorn, Kristin West, Jarrad Hampton- marcell, Simon Lax, Nicholas A Bokulich, et al. 2015. “The Soil Microbiome Influences Grapevine-Associated Microbiota.” MBio 6 (2): 1–10. https://doi.org/10.1128/mBio.02527- 14.Editor.

3.0 MANUSCRIPTS

As mentioned within the paragraph II, the manuscript included in this thesis will be presented in a specific order. Following an imaginary path, that starts from the top-soil of different vineyards worldwide, goes down deep into the soil cores and then back up, through the roots, into the trunk of the grapevine trees reaching leaves and the grapes (Fig.1). Before each manuscript my contributions will be declared according to GSST rules.

Fig. 1 displays the imaginary path we followed during our microbiological journey and will guide the reader through the Manuscript section, third chapter of this thesis.

Copyright statement for Fig. 1: - The full picture is created by Alex Gobbi - Vineyard in Azores, picture taken from Alex Gobbi - World map is edited by Alex Gobbi - Soil cores pictures are taken and edited by Rui Gustavo Santini - Picture of top-soil and roots are freely available on internet and “no copyright violation is intended” according to the Google copyrights policy. - Picture of vineyard and soil horizons is taken from www.lodywine.com - Clipart of a grapevine tree is taken from Dave Johnson for Bay Area News Group and it is believed to be posted within my rights according to the U.S. Copyright Fair Use Act (Title 17 U.S. Code)

MANUSCRIPT 1

World Microbial Vineyard’s Map: extending the concept of microbial terroir to a global scale

Within this work we extended the concept of microbial wine-terroir, intended as the correlation between microbes and wine-regions, to a global level; showing evidences of a different scale-microbial-correlation at country, regions or single vineyards level. To our knowledge, this is the biggest microbial survey conducted within vineyards worldwide. This paper is presented in his final draft almost ready for submission to mBIO or ISME.

I conceived the idea of this paper to answer a compelling question about the existence of microbial dynamics on global scale, which could be ascribed to the elusive concept of microbial terroir. I decided to give my contribution to fill this gap after I noticed that, most of the existing studies, referred to a relatively small area. In light of this, geography plays a secondary role while the differences retrieved in those studies could be due to edaphic, climatic or antropogenic factors. In this study I collected the samples of the MICROWINE database, I performed the wet-lab part and arrange a collaboration with Biome Makers to join our database and effort to complete this study. I finally interpreted the results and wrote the main draft of the paper. A. Acedo, I. Belda and R. Ortiz-Alvarez helped from Biome-Makers to join the database and perform the main bioinformatics analyses. R.G. Santini, I. Belda and R. Ortiz-Alvarez contributed in writing sections of the drafted paper.

World Microbial Vineyard’s Map: Extending the concept of microbial terroir to a global scale

A. Gobbi1, A. Acedo2, I. Belda3, R.G. Santini4, R. Ortiz-Álvarez2, L. Ellegaard- Jensen1, L.H. Hansen1

1: Department of Environmental Science, Aarhus University, Roskilde, Denmark 2: Biome Makers, Valladolid, Spain 3: Rey Juan Carlos University, Madrid, Spain 4: Natural History Museum, Centre for GeoGenetics, University of Copenhagen, Copenhagen, Denmark

Abstract Microbes are crucial in each stage of the wine production process. In recent years, the use of next-generation sequencing technologies allowed new insights about the relation between microbes and geography. In the wine-milieu, this correlation has been more acknowledged within the concept of terroir, and could explain the unique impact the microbial community can have to produce a particular wine. In frame of the few limited evidences available in the literature, our goal is to understand if these dynamics are present on a global scale. Using 16s rRNA and ITS amplicon-based approach, we therefore demonstrated microbial differences linking to the elusive terroir on a global scale, sequencing soil-DNA from 100 vineyards worldwide, in four continents. We highlighted that this link exists at different scales; among countries, between different wine-regions in the same country and between fields within the same region. We revealed unique biomarkers in remote volcanic vineyards in Azores and temperature-driven shared- microbial gradients from North Ebro Valley to Ribera Del Duero, passing from La Rioja, in Spain. We discriminated, for the very first time, the unique microbial terroir shared between vineyards in new-winemaker regions such as Zealand and Funen in Denmark. Finally, we analyzed the impact of different vineyard management system, based on functional prediction of the genome complement. Importance Most of the interest about wine is justified by its massive economic, cultural and traditional value. Winemakers, especially from the Old World, rely on the concept of terroir to explain the unicity of their wine in terms of taste and flavor. Microbes play a central role in vineyards, where they affect grapevine health and growth and in winery during the fermentation. For this reason, understanding the relation between geography and microbial community will raise awareness on a new way to manage the wine- industry. In this work, we performed the hitherto largest microbial survey on vineyards worldwide, using amplicon-based sequencing approach. Our outcomes provide the possibility to detect putative biomarkers from specific wine-makers area, discriminate between vineyards that comes from a relatively small geographic area, within the same wine-region, and finally gaining insights on the functional differences driven by different vineyard management systems.

1 Introduction The interest in Vitis vinifera (wine grape) is justified by the cultural and economic value veiled behind this crop, which represent a multi-billion dollars market, per year (1). Thousands of different grape cultivars have been selected along the history by virtue of their berry’s characteristics such as flavor, color, shape and cluster’s size (2). Furthermore, a multitude of edaphic, climatic, human and biotic factors contributes to modify the quality and the traits of the resulting wines, defining the concept of terroir (3). The soil microbiome is one of the most diverse environments in terms of microbial community complexity (4), and also act as the main microbial reservoir for agricultural crops (5). Soil microbes can, in some cases, reach the phyllosphere or the roots of the plant, colonizing all the different tissues and establishing themselves as inner host- microbiome, defining an indivisible marriage known as the holobiont (6). Within their association with plants, soil microbes can play different roles spanning from mutualism to commensalism and pathogenicity (7). In particular, they are active in regards of plant growth, development, health and productivity and to soil processes that indirectly affect these traits (8). The soil microbial communities in different wine-growing regions are affected by e.g. climate, topography and by soil properties like pH and organic matter (9, 10). Soil properties are again directly and indirectly affected by vineyard management practices (11). Different vineyard’s management systems include a broad range of practices such as tillage, cover crops, and compost application. These can be bound to the winegrowing region, since they depend on climatic conditions and edaphic factors. Other practices, instead, are embedded to the use of conventional, organic or biodynamic regimes, for instance those regarding fertilization and pest treatments. All these agricultural practices, can drive not only the soil/plant-related microbiota composition, but also the diversity and response of fermentative wine-related species (12). In the wine-mileau, pioneering studies based on high-throughput sequencing (HTS) were conducted by (5, 9, 13), revealing microbial biogeography associations across multiple areas. Furthermore, in 2016 N. A. Bokulich et al. (1) and in 2015 S. Knight et al. (14) also shed light on the associations between the microbial and the metabolic fingerprint of the wine produced from different wine regions. In this field, I. Belda et al. (15) also described distinctive and clustered metabolic patterns for wine-related yeasts depending on their geographical origin. The links between the unicity of a wine-metabolic profile and all the factors affecting the grapes are known and valuable, to the viticulturist, supporting the concept of terroir. Within this complex and multifaceted concept, the microbiome of vineyards has been defined as a unique and integrative biomarker (16, 17), affecting wine quality indirectly (by affecting vine health and physiology), but also in a direct way, being the main reservoir of the autochthonous fermentative microbiota. The vast majority of the microbes stemming from soil or living on the plant phyllosphere are not active under the harsh conditions of the fermentation tank, where pH goes down and ethanol is rising in a concurrent anaerobic environment (18)). However, some of these microbes could have influenced the grape chemical composition in a pre- fermentation stage, in vineyards or in cellar during post-harvest treatments (19). Finally, a smaller number of microbes originating from vineyards soil and grapes or the cellar are able to survive, grow and develop phenotypical characteristic during the fermentation and thus directly influence the final product (20). The role of the microbiota within all

the stages of the wine production process, the presence of soil as main reservoir for microbes, and the microbial non-random biogeographical association can reveal the functional impact of the terroir, providing a biological target for future biotechnological applications regarding productivity or disease resistance (21). From a functional point of view, studies show that microbial community structure and functional behavior are linked (22, 22b). This is more evident when looking at the resilience of a soil community in because of its functional redundancy (23). Still, functional redundancy may alter community stability (23), and particular soil functional profiles could be associated to specific terroirs. . Finally, soil diversity is still among the most recognized parameters linked to the concept of sustainable agriculture (24-26). In this study, we applied HTS amplicon library approach to conduct a broad-spectrum survey of the soil microbial communities of vineyards, which span several countries, regions, grape varieties and management systems. Our hypothesis is that the existence of distinctive microbial communities associated with specific wine-regions, is a concept that can be extended to a global scale. The second hypothesis is that vineyard associated microbial communities, from different wine-regions which share similar edaphic and climatic factors, are likely more similar compared to the microbial communities in e.g. different climatic regions. Finally, to verify if some vineyard management practices can influence the microbial community more than geography, we focused on the differences between organic and conventional management system and on the use of artificial irrigations using functional prediction on amplicons data. In this paper, the final dataset was analyzed under three different points of view to explore the information we could recover from an amplicon libraries study. We aimed to evaluate the taxonomical fingerprint within different wine-regions worldwide. We focused on soil diversity and OTUs distribution and finally to the functional potential through an accurate gene prediction algorithm called Tax4fun. 2 Material and Methods 2.1 Material This study is a microbial amplicon based survey created with previously unpublished data combined from two different databases. The data of this study originate partly from the MicroWine project and partly private data from Biome Makers Inc. obtained with WineSeq® Technology.. This project includes a total of 475 soil samples from 100 vineyards. As such, different people collected these samples in the period 2015-2018 around the globe. Although some differences occurred in the sampling scheme, storage conditions and sample processing, a general description of this protocol can be delineated as follows. All the samples were of topsoil taken within a depth between 0-10 cm. MICROWINE samples consist of 5 samples collected and sequenced for each field, while Biome Makers samples were made pooling together top soil from three random spots in each field and extracting the DNA from this composite sample. Shipment conditions of the soil samples were either -20 oC or RT, followed by -20 or -80 C as storage temperature in the laboratories until DNA Extraction. DNA Extractions were performed using bead-beating based DNA extraction kits such as Dneasy Powerlyzer Powersoil Kit (Qiagen) for the WineSeq® platform (Patent publication number: WO2017096385, Biome Makers) and FAST-DNA Spin Kit for Soil (MP-Biochemical) for the MICROWINE Project. A complete overview of all the samples used in this study and the relative origin is reported in Table S1. The use of bead-beating based kits, such as those included in this

study, produces comparable results. Furthermore, they allow the recovery of the highest biodiversity within soil samples (27). 2.2 Library preparation A general protocol for the library preparation is delineate below, although some modifications occurred between studies. All the libraries were prepared following the two-steps PCR Illumina protocol and these were subsequently sequenced on Illumina MiSeq instrument (Illumina, San Diego, CA, USA) using 2x251 paired-end reads for MICROWINE and 2x301 paired-end reads for Biome-Makers samples. Different primers were used for 16S and ITS, however, in both dataset (MICROWINE and Biome-Makers) they are investigating the same variable region. A complete description of the primers is available in Table S2. All PCR reactions were prepared using UV sterilized equipment and negative controls were run alongside the samples. Also, PCR conditions such as number of cycles, annealing temperature, thermocycler and Master- mix composition change between samples coming from different projects. The library prepared for both datasets were done by a two-step PCR as described by Feld et al. 2016, Albers et al. 2018 (28, 29) with modifications. In the MICROWINE dataset, sample concentration was approximately 10-20 ng of DNA, and both PCR were carried out using Veriti Thermal Cycler (Applied Biosystems). In each reaction, of the first PCR for 16S, the mix contained 12 µL of AccuPrime™ SuperMix II (Thermo Scientific™), 0.5 µL of forward and reverse primer from a 10 µM stock, 0.5 µL of bovine serum albumin (BSA) to a final concentration of 0.025 mg/mL, 1.5 µL of sterile water, and 5 µL of template. The reaction mixture was pre-incubated at 95°C for 2 min, followed by 33 (16S) and 40 (ITS) cycles of 95°C for 15 sec, 55°C for 15 sec, 68°C for 40 sec; a final extension was performed at 68°C for 4 min. Samples were subsequently indexed by a second PCR using the protocol in Gobbi et al 2018 (30). Finally, samples were pooled in equimolar amount before sequencing. Biome-Makers samples were obtained amplifying the 16s rRNA V4 region while the ITS were obtained amplifying ITS1 region using WineSeq® custom primers (Patent WO2017096385). 2.3 Bioinformatics All the data produced and collected were subsequently analysed through a QIIME-based custom bioinformatics pipeline (Patent WO2017096385, Biome Makers). A first quality control was used to remove adapters and chimeras. After that, the reads were trimmed and OTU clusters were performed using 97% identity. assignation and abundance estimation were obtained comparing Operational Taxonomic Unit (OTU) clusters obtained with WineSeq® taxonomy database (Patent WO2017096385) after rarefying to a fixed sequencing depth. 2.4 Functional Analyses Functional predictions based on representative genomes are an useful tool for the estimation of metabolic potential (31). Although its limitations regarding strain-specific functional signatures, environmental distributions, or real magnitude of a process, the functional simulations allow the comparison of communities in terms of their predicted functional potential (32). For that purpose we applied an adaptation of the Tax4Fun routine (33) using presence/absence of genes rather than a normalized weighted value per taxa. To obtain the proportion of each community containing each specific function, we filtered a total of 108 KEGGs (functional orthologs) within 34 metabolic pathways

related to Carbon and Nitrogen cycles (34), complex organic matter decomposition pathways (35), phosphorus cycle pathways (36), and other soil relevant nutrients such as Zn, Ca, Mg, Cu, Fe and Mn (Table S6). We estimated the proportion distribution of each metabolism and their mean proportions by Country, and tested whether the vineyard management (art irrigation, conventional and organic practices) type had an impact on metabolism proportions through a Kruskal-Wallis test. 2.6 Sequencing Data To obtain a robust experimental design prior to the analyses, we removed from the original database all the samples that were not collected close to the harvest period. The dataset obtained this way strongly dependent from the geography and exclude meteorological and seasonal confounding factors. The final database represents 100 vineyards, coming from 12 countries distributed in 4 continents; After sequencing and quality filtering our data we obtained 31.194.544 reads, then clustered into 82.718 OTUs for the 16S dataset and 6.859.436 reads grouped into 26.035 OTUs for the ITS community. To overcome the confounding factors, deriving from the two different studies we included in this work, most of the analyses reported in the result section are performed using samples coming only from the same batch. To evaluate the impact of the vineyard management on alpha diversity and functional potential, we subset the database to retain only the samples which provide the indication about the management system (conventional and organic). Furthermore, we retained only the samples coming from a limited geographical area; in this case all samples from Spain were included since they had a relatively high number of samples (n=116 for 16S and n=100 for ITS) and a represent the highest number of wine-regions within the same country (n= 11). To evaluate the effect of temperature reducing the impact of geography, we retained samples coming from three different Spanish regions: North Ebro Valley in Navarra, La Rioja and Ribera del Duero in Castile and Léon. These regions were chosen due to their temperature-driven-gradient, detected looking at the average maximum temperature retrieved in August (right before harvest), that increases from North Ebro Valley, generally colder and humid (T= 22.5 °C - 25.0 °C), to la Rioja (T= 27.0 °C – 30.0 °C) and finally to Ribera del Duero (T= 30.0 °C), a very dry and hot region in summer time. The climatic data were obtained from the Atlas Climático Digital de la Penisula Iberica developed by Universitat Autonoma de Barcelona (UAB). 2.7 Statistics All the results of this study have been evaluated using statistical tests according to the different necessity. Alpha-diversity and functional prediction results are evaluated using Kruskal-Wallis. Beta-diversity distribution was evaluated using PERMANOVA. The validation of the specific regional-biomarker we have found is based on ANCOM. All p- values are considered significant below a threshold of 0.05. These test have been performed using QIIME2 v2018.2. 3 Results Here, we provide insights on the microbial community of vineyard soils worldwide by analyzing amplicon data for bacterial 16S rRNA gene and fungal ITS region sequencing. The resulting database was explored to understand the impact of geography and vineyard management on the characterization of the microbial terroir. A multi-scale evaluation of the data for diversity, taxonomical composition and metabolic prediction was performed and the results are displayed below.

3.1 Shannon Diversity First, we focused on the comparisons of the different vineyards represented in our database looking at the Shannon index as parameters to measure the alpha diversity. Based on Shannon-Wiener analysis, we looked at the comparison based on different scale such as country and wine-region but also between cultivars and vineyard management systems (Figures 1 and S1). Looking at the bacterial Shannon index from each country, the highest values occurred in Hungary (n=3), South Africa (n=3), Australia (n=32) and Denmark (n=15) while the lowest values was detected in Germany (n=15), Portugal (n=9) and France (n=4) which also statistically differ from each other with p-value < 0.05 (Fig 1.a). Regarding the wine region (Figure S1), the highest bacterial diversity was reported in Lehigh Valley in Pennsylvania while the lowest in Valle de la Orotava in Tenerife (Canary Island, Spain) (Fig 1.b). . The results from the analysis of the fungal communities were different from the bacterial ones; in particular, vineyard soils from Germany (n=10) and Denmark (n=15) showed the highest fungal diversity (Shannon index), while Argentina (n=3) and Chile (n=1) showed the lowest (Fig 1a). In terms of wine-region, the highest index belong to Frederiksburg (Texas) and North Ebro Valley (Spain) and the lowest index, for wine-regions, were obtained in Mentrida (Spain) and Mendoza (Argentina). When looking at differences, within Spanish vineyards, about management systems the Shannon Index for bacteria between conventional (n=29) and organic (n=29) management systems does not define any statistical difference (p-value > 0.05) as well as for fungi where n= 19 for conventional and n=32 for organic samples (Fig 1.c).

Figure 1; Shannon index values on rarefaction curves; a) Plots of Shannon-values for all countries in the database regarding ITS (right) and 16S (left). b) Plots of Shannon-values for all Spanish wine-regions in the database regarding ITS (right) and 16S (left). c) Plots of Shannon-values grouped by management system within samples coming from Spain and regarding ITS (right) and 16S (left). 3.2 Beta-Diversity To investigate regionality and other factors effect in shaping microbial communities, we looked at the frequency tables and OTUs distribution for 16S and ITS and depicted the results as PCoA plots. We calculated the distance matrix based on Bray-Curtis Dissimilarity and subsequently visualize the result using PCoA plots. In Figure 2 we show the distribution at country, region and field level on different subsets of the total distribution for 16S and ITS. The clusters for each country are recognizable as shown in Figure 2a. To investigate regional clustering within the same country, samples from Spain were used, due to the abundance of surveyed regions and samples from each; in this case it is visible in Figure 2b that the

separation is rather robust when looking at the wine-region level. Finally, this trend is maintained at field-level when we have several replicates coming from three fields within the same area (here using Denmark as an example) as shown in Figure 2c. These distributions for fields it is also visible in the samples coming from Azores, in which two vineyards coming from the coastline of the same island (Pico Island) cluster apart from each other; furthermore in the Pfalz region in Germany three vineyards from the same area display separate clusters (not shown). This confirms the robustness of our method together with the high variation in biodiversity between different soil samples even if geographically close to each other. The effect of geography at different scales was always statistically relevant (p-value < 0.05).

Figure 2: PCoA plots representing beta-diversity within our database for 16S (left column) and ITS (right column). a) Country-level distribution of the whole database, b) Regional level distribution considering only samples from Spain, c) Field-level distribution considering three fields within the same area in Denmark. Shown graphs represent things investigated and does not contain all. 3.3 Co-occurrence of OTUs for fungal and bacterial community Investigating the percentage of shared OTUs between different locations at multiple scales is an example of how database interrogation can answer specific questions. We

compared the OTUs between Europe and America and between mainland (Europe and Americas) and islands (Australia, Azores, Baleares and Canarias). A higher richness of prokaryotes and fungi were identified in the Europe and in the samples from mainland. Evaluating the microorganisms detected both in Europe and America, 58.5% of prokaryotes and 18.5% of fungi were shared. Finally, at global scale, 37.7% and 14.5% respectively of prokaryotes and fungi are shared between mainland and islands. At regional scale, looking at the three wine-regions in Spain, the results show North Ebro Valley with the higher richness of prokaryotes and fungi and Rioja and Ribera del Duero with similar numbers of fungal and bacterial OTUs. It is worth mention that while 31.9% of prokaryotes are shared between these three Spanish regions, the same occurred for only 7.4% of fungi. Furthermore, the number of prokaryotes shared only in two of the three regions is almost constant, but the amount of fungi is quite different and directional. In fact, only 2.6% of the fungal OTUs are shared between North Ebro and Ribera del Duero while they share 6% and 5.3% respectively with La Rioja. The three vineyards in Denmark showed similar numbers of prokaryotes and fungi, perhaps corresponding to the similar temperatures (T = 13.0 °C) detected in October (right before harvest recorded by dmi.dk). These vineyards shared 18.2% of prokaryotic OTUs and 12.1% of fungal OTUS. However, the highest percentage of OTUs in the distribution is always referred to the individual field for bacteria and fungi except in one case where they are equal. All these results are summarized in Figure 3

Figure 3. Venn Diagrams based on OTU distribution between continents, regions and fields. All OTUs that appeared at least twice in our database were retained for this analysis. MW-1, MW-2 and MW-3 refers to three vineyards sampled in Denmark within the same climatic region.

3.4 Taxonomical Composition When defining the microbial terroir of a wine-region, we infer it to be the unicity of the taxonomical composition in terms of presence/absence and relations between the relative abundances of the different microbes detected. Although soil microbiome is extremely biodiverse and evolve during time, it is also shaped by the local edaphic factors and the resilience to their seasonal change between different years. In this frame, we investigated the microbial community composition during one of the most enological relevant period of the year; harvest-time. This is a crucial moment since the organisms found on soil are more likely to be found on the colonized plant tissue such as the grape berries and the ones attached here will end up in the fermentation must. The taxonomical evaluation of our dataset at phylum level, highlight the dominance of bacteria belonging to: Proteobacteria, Firmicutes, Enterobacteriaceae, Sphingomonadaceae, Planctomycetaceae, Chitinophagaceae, , Gemmatimonadacea, Rhodocyclacea and Hypomicrobiaceae. For Archaea the population was dominated by members of the Nitrososphaeraceae family.

Figure 4: Taxonomical distribution on the represented country, highlighted in green, of our database; red line and blue line represent the isothermals that define the optimal- growth area of the grapevine worldwide. The double circular-crown, linked to each country, represent the taxonomical composition for ITS (external ring) and 16S, internal ring while the dominant taxa are shown on the side of the picture for Bacteria (legend on the left) and Fungi (legend on the right). Sample from Cyprus only display 16S taxonomical composition. 3.5 Taxa-distribution Amongst the 45 prokaryotic phyla identified in our samples, 12 showed relative abundances above 1% in at least one out of the 14 countries. Proteobacteria occurred with the highest relative abundances in all the 14 countries, with values varying between 27.6% (Portugal) and 44.3% (Croatia). Except for Croatia, Spain and USA, Actinobacteria was the second highest abundant bacterial phylum varying between 10.6% (Germany) and 25.4% (Portugal). Firmicutes in Spain and USA and Bacteroidetes in Croatia replaced Actinobacteria as the second highest abundant bacterial phyla. Additionally, Firmicutes in Australia and Chile, Bacteroidetes in France and Planctomycetes in Italy and South Africa also showed considerable abundances, with values above 10%. The archaea phylum

Crenarchaeota showed large relative abundance variation comparing the 14 countries, with the main values detected in Portugal (10.5%), Chile (13.1%) and Germany (26.8%) whilst in the other countries’ values ranged between 1.9% and 9.2%. Evaluating the fungal taxa detected with relative abundances above 1%, Cryptococcus when present showed as the dominant genus in 8 out of the 12 evaluated countries (Figure 5). The relative abundance of Cryptococcus in Argentina, Chile and South Africa reached ~90%, with only two to three other taxa with values above 1%. In the countries where Cryptococcus was not identified (Australia, Denmark, Germany and Portugal), Fusarium dominated. In Portugal Fusarium reached 93% while only two other taxa exceed 1%. Gibberella intricans was identified in all the 12 countries but in 4 of these countries the relative abundance values were below 0.7%. In the other 8 countries G. intricans showed relative abundances varying from 1.1-17.2%. Other more broadly detected fungal taxa refer to the phyla of , the order of and Hypocreales, the families of Cladosporiaceae, Mortierellaceae, and the genus of Cladosporium, , Epicoccum, Penicillium, Tetracladium, Monographella, Cortinarius, and Mortierella.and Cephalosporium serrae,. However, despite broad geographical detection, the following taxa showed relative abundance above 1% just in the mentioned countries: Epicoccum and Clavicipitaceae (Australia), Penicillium (USA and Chile), Tetracladium (Croatia and Spain) C. serrae (Germany and Denmark), Botrytis cinerea (Spain), , Metarhizium (Germany and Australia) and Cryptococcaceae (USA).

Figure 5; the figure displayed the relative abundances of the five topmost abundant taxa for Fungi (left) and Bacterial phyla (right)

Based on taxonomical composition and statistical test that allow us to detect differentially abundant microbes, we could interrogate our database in order to find taxa which belongs only to a very specific and small area, also known as bio-markers. Among the 384 bacterial taxa (372 assigned at least to family-level ) and 98 fungal taxa ( 47 of them assigned at least to family-level)that were detected as differentially abundant country-wise (Table S3 and S4), we focused on those that appeared to be dominant only one area and are assigned to genus or species level. In the Azores islands (Portugal) for instance, it is possible to retrieve two abundant taxa that are Ktedonobacter sp. and Crossiella sp. which selectively appear in percentages higher than 1% in relative abundance in this area. 3.6 Functional Prediction Analyses We then explored the functional predictions from the 16S rRNA gene using Tax4Fun2. We focused our attention to the impact that a few variables such as country, management system and artificial irrigation could potentially produce on specific metabolic pathways and nutrient transporters. Figure 6 shows the ranges and the geographical distribution of inferred soil nutrition-related metabolisms. Some of them are widespread and abundant in different regions (i.e. aerobic respiration, phosphate transport systems, zinc export) while other metabolisms seem less abundant (i.e. Aerobic CH4 oxidation, Lignine break, Anammox, Manganese import). In addition, metabolic activities that show a larger range of relative abundance in the observed microbial communities are the most interesting ones for finding geographical- or management-dependent patterns. In particular, we highlight the impact of artificial irrigation in the increment of organic matter degradation activities (cellulose, chitin and hemicellulose break) (Figure 6, right panel). About the different management systems we detected a decrease in nitrification, phosphate transfer, manganese import and zinc export in conventional management vs organic.

Figure 6. Functional prediction of 34 metabolic pathways related to Carbon, Nitrogen, Phosphorus, Potassium and micronutrients use and transport. For each metabolic pathway, the proportion distribution across the whole dataset is drawn in the left panel. The heatmap display the metabolism mean value by country, while the right panel shows statistical different metabolism according to vineyard management practices. 4 Discussion 4.1 Database When joining different datasets, coming from different studies, there is a need to address the impact of the so-called batch-effect bias. Being the database composed of samples coming from several countries, although produced in a similar way, we expected to see confounding factors in the global distribution. To overcome this issue we proceeded in a very conservative way, performing most of the analyses using different data subset which includes only samples coming from the same study. The batch effect is a very common bias, in many meta-analytical studies (37) which could influence the results. This risk is even higher when we compare the same type of sample, while the effect is lower when different types of samples are analyzed together as several meta-analyses report (37-39) . Doing so, we measured the impact of the geography on regional and vineyard scale for alpha and beta-diversity; we performed the functional prediction and most of the OTUs distribution analyses displayed through Venn diagrams. This observation could not be performed when looking at the global database distribution, since we needed to include

all the samples obtained from this study. However, despite this bias, the effect of geography is still visible and statistically relevant (p-value < 0.05). At global scale, we limited our observation to alpha-diversity using Shannon index, which is relatively stable against batch-effect coming from the different primers used since it accounts also for evenness in the OTUs distribution and it is not only based on presence/absence (40). In order to further reduce , another recognized solution is to compare the whole database after taxonomy assignment and preferably at higher taxonomical ranks such as phyla (40). In fact, two OTUs that have a different nucleotide sequence, coming from different regions of the same gene-marker, are more likely to be assigned to the same taxa, despite some differences that may occur in relative abundance. Based on taxonomical composition and ANCOM we highlighted potential biomarkers based on the presence/absence of certain taxa in specific wine-regions. However, these taxa require further investigation to be surely asserted, for instance using a group-specific qPCR. 4.2 Diversity The OTUs-distribution results displayed in Sections 3.2 and 3.3 provide information about the use of gene-markers microbial survey to answer ecological question. The PCoA plots confirm that the link between geography and microbial community of vineyards exist at regional scale as already demonstrated by K. N. Burns et al. (9) and N. A. Bokulich et al. (10) in California, S. Knight et al. (14) in New Zealand and L. E. Castañeda and O. Barbosa (41) in Chile. In this study, we also demonstrated that between different countries the geography has a strong effect on shaping the microbial community with few cluster overlapping, For instance, the microbial community of Argentina and Chile are not statistically different from each other but many others that are undoubtedly significant with p-value < 0,05 with PERMANOVA (Table S4). This led us to suggest that there is a hierarchy on the geographical distances and a general trend reflects the rule that a longer distance corresponds to a more diverse community. This result also reflects what is happening on a regional scale. In fact, the OTUs distribution between the three Spanish regions of North Ebro Valley (Navarra), Rioja and Ribera del Duero, display a similar trend. La Rioja is however in the geographical intermediate region and display an intermediate climate, which might explain why the fungal OTUs, shared with the other regions, are more abundant, as displayed in Figure 3b. This is true for bacteria but even more remarkable when looking at the fungal OTUs distribution. In fact, from the Venn diagrams in Figure 3, the percentages of shared microbes between countries, regions and fields are lower in fungi compared with bacteria. This fact, suggest the use of mycobiome as a more sensitive parameter to discriminate the microbial terroir of a location. When we measured the impact of other parameters, such as the vineyard management system or the use of artificial irrigation, no trend was detected. This could suggest that the agricultural practices have a secondary impact on the microbial community compared with the geography at country-scale (including all the climatic and edaphic factors). For this reason, to highlight this impact of the management system in use, it is necessary to reduce the area of the sampling; in fact, reducing the variables connected with the geography, studies like K. N. Burns et al. (11) revealed some bacterial shift linked with certain agricultural practices. 4.3 Taxonomy Proteobacteria and Acidobacteria have previously been identified as the most abundant bacterial phyla within soils (42) including topsoil studies carried out in vineyards (5, 9). In accordance with such observations, Proteobacteria showed as the most abundant

phylum in all the 14 countries evaluated; however, instead of Acidobacteria other phyla occurred in higher relative abundances. The widespread occurrence of Crenarchaeota members in our samples agrees with the detection of this archaeal phylum in a range of environments around the world, such as agricultural fields, sandy soil, forest soil, contaminated soil and the rhizosphere (43-48). However, while the relative abundance fraction of this phylum typically is reported as representing up to 5% of the total prokaryotic community (44, 46), we find it in 10 out of the 14 countries in values exceeding 5%, with emphasis on samples from Chile (13.1%) and Germany (26.8%). The high abundance of Crenarchaeota (12-38%) seen here in samples from Germany (Figure 5) was also identified by D. Kemnitz et al. (49) but in samples coming from an acidic forest soil. Differing from our observation, studies of soil samples coming from Chilean vineyards previously reported relative abundances of Crenarchaeota below 0.1% (41). For the Crenarchaeota, the ammonia-oxidizing archaea (AOA) Nitrososphaera was the main genus detected in all the countries surveyed in our study. This genus was observed by K. Zhalnina et al. (50) to significantly respond to agricultural management practices Soil samples from Azores (Portugal) showed higher abundance of members from at 4.66 % (Crossiella sp.) and Ktedonobacteriaceae at 5.62 % (after BLAST assigned to Ktedonobacteria) amongst the main four identified families. Pseudonocardiaceae have been found with important relative abundances in lava caves (i.e., caves formed during active lava flows) (51) including Azores (52). This could explain the high abundance of this family in our samples based on the volcanic soil distributed all along this island. Furthermore, it might suggest a link to the special terroir of volcanic- wines or at least to wines from Azores but further investigation are required. Information about Ktedonobacteriaceae is very limited, except that, together with the families Thermosporotrichaceae and Thermogemmatisporaceae, all belonging to Chloroflexi, have broadly flexible aerobic metabolisms capable of utilizing a wide range of carbohydrate polymers (53, 54). Several members of the phylum Chloroflexi are thermophilic (55) and the dark volcanic soil of the sampled Azorean vineyard is likely to reach temperatures allowing thermophilic species to thrive. In addition, the stone walled currais system built within the vineyards to protect from the oceanic wind help the soil to retain the heat. Within the fungi, Cryptococcus is the dominant genus in 8 out of the 12 countries. It belongs to the group of oxidative basidiomycetous yeasts and have been found associated with the phyllosphere, grapes and soil (56). Cryptococcus, together with the genera Candida, Hanseniaspora, and Pichia (these three genera not identified in our samples) provide most of the diversity of the frequently isolated yeast species related with grape or isolated from fermented grape juice (57). Studies have shown that Cryptococcus are not severely affected by fungicides which may justify their higher abundance on vineyards (58, 59). The other main fungal taxa identified in our samples belong to documented plant pathogens. A member of Fusarium was detected as the dominant taxa in Australia, Denmark, Germany and Portugal, besides important occurrence in Spain (4%). The genus Fusarium is recognized as containing many potent plant pathogen species (60). Gibberella intricans which was identified in the 12 studied countries is a teleomorph (i.e., the sexual reproductive stage) of Fusarium equiseti, a cosmopolitan soil fungus associated with stem and roots rots in different plant species (61). Pancher et al. (62) isolated the specie Gibberella pulicaris (Fusarium sambucinum) from the grapevine

endosphere of samples from Trentino, Northern Italy. Cladosporiaceae and within this family Cladosporium, also broadly detected in our samples, contains several species identified as fungal pathogenic on grapevines (63). Monographella, detected in 8 countries, was only recently described as occurring in vineyards. Members of this genus is a recognized leaf pathogen on barley, wheat, rice and maize (63). However, the detection of these phytopathogens does not directly indicate plant infection because the success of microorganisms to severely affect plant health relies on the whole soil and rhizosphere microbial interactions (64). This can be due to the fact they are present as spores or dead cells. Furthermore, it is not possible to infer pathogenicity based on a phylogenetic relation to a pathogen. 4.5 Functional Prediction Analyses Functional predictions based on representative genomes are an useful tool for the estimation of metabolic potential (31). Such predictions based on a curated database of barcoded sequences, could lead to the discovery of patterns that require to be further investigated by metagenomics. Despite its limitations regarding strain-specific functional signatures, environmental distributions, or real magnitude of a process, the functional simulations allow the comparison of communities in terms of their predicted functional potential (32). The use of artificial irrigation has been defined as responsible for substantial changes in soil structure. On one hand, water is a powerful agent for soil structure decline and on the other hand it is essential for the biological processes which also determine soil structure. In these conditions, an enhanced biological activity contributes to improve soil structure through increasing the concentrations of organic matter (65). In the light of the results shown in Figure 5, soils with artificial irrigation showed increase in degradation of organic matter that seems to be buffering the negative impact of water in soil structure. In the same line, the observation of G. M. King and M. Hungria (66) suggested that, regarding the soil-atmosphere gases exchange, agricultural ecosystems are able to consume Carbon monoxide (CO) at rates between 3-6 mg of CO m 2/day, with no significant effect of tillage method or crop, but CO exchange for a pasture− soil was near zero. They used corn and soybeans as model crops that are known to receive high doses of irrigation in comparison with pasture soils. Here, when comparing vineyard soils with and without artificial irrigation, we see an increased presence of bacterial species able to perform CO and CH4 oxidation in those receiving artificial irrigation (Fig. 6).

In regards to CH4, a major aim in the contribution of agriculture to the emission of greenhouse-related gases, the methanogenesis activity appeared to be increased in soils under conventional management (Figure 6, left panel). In a previous study analyzing the greenhouse gas emission from conventional and organic herbaceous crops in Spain, E. Aguilera et al. (67) reported a bigger CH4 emission from the former soils (except in the case of rice crops). This fact was explained by the lower input use and/or a higher carbon sequestration in the organic farming systems and a direct increased methanogenesis in conventional farmed soils (in our case, under vines) is suggested here. Finally, it is interesting to mention the decreased activity in mobilizing phosphorus and potassium in the conventional farmed vineyards. It has been reported that P and K are among the main nutrients differing between organic and conventional farmed soils (68). The yearly amount of external P and K inputs added in conventional systems is two times higher than in the organic systems (69) and the retention of them in conventional soils seems to be higher than in organic ones. We could suggest that a major input reduce the

need for mineral solubilization. However, P. Gosling and M. Shepherd (68) reported that when changing between farming systems, organic farming tend to mine the reserves of P and K built up by conventional management. In this context, they suggested that bigger P and K inputs are required in organic farmed soils if long-term declines in soil fertility are to be avoided. However, at a long-term the use of nutrients in soils, and thus their availability for plant uses, should be considered.Our results suggest a potential lower P and K solubilization capacity in conventional farming vineyards than in organic ones that requires further assessment. Despite all these results coming from the functional prediction, based on taxonomical differences, there are no clear evidences of any alpha and beta-diversity shifts due to the agricultural practices included in this study. This could suggest the impact of vineyard management on the microbial community composition is of secondary magnitude, compared with the geography. To conclude, this study provides new insights on microbial terroir. Enlarging this concept on a global scale we could detect region-specific putative biological markers and describe the overall microbial community that characterize several vineyards around the world. As a future perspective, the introduction of other studies’ datasets, the analysis of the terroir of specific areas will extend and detail the global microbial vineyards map initiated here. References 1. Bokulich NA, Collins TS, Masarweh C, Allen G, Heymann H, Ebeler SE, Mills DA. 2016. Associations among wine grape microbiome, metabolome, and fermentation behavior suggest microbial contribution to regional wine characteristics. MBio 7:e00631-16.

2. Terral J-F, Tabard E, Bouby L, Ivorra S, Pastor T, Figueiral I, Picq S, Chevance J-B, Jung C, Fabre L. 2009. Evolution and history of grapevine (Vitis vinifera) under domestication: new morphometric perspectives to understand seed domestication syndrome and reveal origins of ancient European cultivars. Annals of botany 105:443-455. 3. OIV R. 2010. VITI 333/2010, 2010. Definition of vitivinicultural “Terroir. 4. Fierer N. 2017. Embracing the unknown: disentangling the complexities of the soil microbiome. Nature Reviews Microbiology 15:579. 5. Zarraonaindia I, Owens SM, Weisenhorn P, West K, Hampton-Marcell J, Lax S, Bokulich NA, Mills DA, Martin G, Taghavi S. 2015. The soil microbiome influences grapevine-associated microbiota. MBio 6:e02527-14. 6. Hassani MA, Durán P, Hacquard S. 2018. Microbial interactions within the plant holobiont. Microbiome 6:58. 7. Schlaeppi K, Bulgarelli D. 2015. The plant microbiome at work. Molecular Plant-Microbe Interactions 28:212-217. 8. Vandenkoornhuyse P, Quaiser A, Duhamel M, Le Van A, Dufresne A. 2015. The importance of the microbiome of the plant holobiont. New Phytologist 206:1196-1206. 9. Burns KN, Kluepfel DA, Strauss SL, Bokulich NA, Cantu D, Steenwerth KL. 2015. Vineyard soil bacterial diversity and composition revealed by 16S rRNA genes: differentiation by geographic features. Soil Biology and Biochemistry 91:232-247. 10. Bokulich NA, Thorngate JH, Richardson PM, Mills DA. 2014. Microbial biogeography of wine grapes is conditioned by cultivar, vintage, and climate. Proceedings of the National Academy of Sciences 111:E139-E148. 11. Burns KN, Bokulich NA, Cantu D, Greenhut RF, Kluepfel DA, O'Geen AT, Strauss SL, Steenwerth KL. 2016. Vineyard soil bacterial diversity and composition revealed by 16S rRNA genes: differentiation by vineyard management. Soil Biology and Biochemistry 103:337-348. 12. Grangeteau C, Roullier‐Gall C, Rousseaux S, Gougeon RD, Schmitt‐Kopplin P, Alexandre H, Guilloux‐Benatier M. 2017. Wine microbiology is driven by vineyard and winery anthropogenic factors. Microbial biotechnology 10:354-370.

13. Bokulich NA, Joseph CL, Allen G, Benson AK, Mills DA. 2012. Next-generation sequencing reveals significant bacterial diversity of botrytized wine. PloS one 7:e36357. 14. Knight S, Klaere S, Fedrizzi B, Goddard MR. 2015. Regional microbial signatures positively correlate with differential wine phenotypes: evidence for a microbial aspect to terroir. Scientific reports 5:14233. 15. Belda I, Ruiz J, Alastruey-Izquierdo A, Navascués E, Marquina D, Santos A. 2016. Unraveling the enzymatic basis of wine “flavorome”: a phylo-functional study of wine related yeast species. Frontiers in microbiology 7:12. 16. Gilbert JA, van der Lelie D, Zarraonaindia I. 2014. Microbial terroir for wine grapes. Proceedings of the National Academy of Sciences 111:5-6. 17. Belda I, Zarraonaindia I, Perisin M, Palacios A, Acedo A. 2017. From vineyard soil to wine fermentation: microbiome approximations to explain the “terroir” concept. Frontiers in microbiology 8:821. 18. Vaudano E, Costantini A, Garcia‐Moruno E. 2017. Microbial ecology of wine, p 547-559. In de Souza Sant'Ana A (ed), Quantitative Microbiology in Food Processing: Modeling the Microbial Ecology. John Wiley & Sons. 19. Salvetti E, Campanaro S, Campedelli I, Fracchetti F, Gobbi A, Tornielli GB, Torriani S, Felis GE. 2016. Whole-metagenome-sequencing-based community profiles of Vitis vinifera L. cv. Corvina berries withered in two post-harvest conditions. Frontiers in microbiology 7:937. 20. Morrison‐Whittle P, Goddard MR. 2018. From vineyard to winery: a source map of microbial diversity driving wine fermentation. Environmental microbiology 20:75-84. 21. van der Heijden MG, Hartmann M. 2016. Networking in the plant microbiome. PLoS Biology 14:e1002378. 21b. Galand PE, Pereira O, Hochart C, Aguet JC, Debroas D. 2018. A strong link between marine microbial community and function challenges the idea of functional redundancy. 22. Marschner P, Kandeler E, Marschner B. 2003. Structure and function of the soil microbial community in a long-term fertilizer experiment. Soil Biology and Biochemistry 35:453-461. 23. Allison SD, Martiny JBH. 2008. Resistance, resilience and redundancy in microbial communities 24. Altieri MA. 1999. The ecological role of biodiversity in agroecosystems, p 19-31, Invertebrate Biodiversity as Bioindicators of Sustainable Landscapes. Elsevier. 25. Brussaard L, De Ruiter PC, Brown GG. 2007. Soil biodiversity for agricultural sustainability. Agriculture, ecosystems & environment 121:233-244. 26. Nielsen UN, Wall DH, Six J. 2015. Soil biodiversity and the environment. Annual Review of Environment and Resources 40:63-90. 27. Vishnivetskaya TA, Layton AC, Lau MC, Chauhan A, Cheng KR, Meyers AJ, Murphy JR, Rogers AW, Saarunya GS, Williams DE. 2014. Commercial DNA extraction kits impact observed microbial community composition in permafrost samples. FEMS microbiology ecology 87:217- 230. 28. Feld L, Nielsen TK, Hansen LH, Aamand J, Albers CN. 2016. Establishment of Bacterial Herbicide Degraders in a Rapid Sand Filter for Bioremediation of Phenoxypropionate-Polluted Groundwater. Applied and environmental microbiology 82:878-887. 29. Albers CN, Ellegaard-Jensen L, Hansen LH, Sørensen SR. 2018. Bioaugmentation of rapid sand filters by microbiome priming with a nitrifying consortium will optimize production of drinking water from groundwater. Water research 129:1-10. 30. Gobbi A, Santini RG, Filippi E, Ellegaard-Jensen L, Jacobsen CS, Hansen LH. in press. Quantitative and qualitative evaluation of the impact of the G2 enhancer, bead sizes and lysing tubes on the bacterial community composition during DNA extraction from recalcitrant soil core samples based on community sequencing and qPCR doi:https://doi.org/10.1101/365395, PlosOne. 31. Langille MG, Zaneveld J, Caporaso JG, McDonald D, Knights D, Reyes JA, Clemente JC, Burkepile DE, Thurber RLV, Knight R. 2013. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nature biotechnology 31:814.

32. Ortiz-Álvarez R, Fierer N, de los Ríos A, Casamayor EO, Barberán A. 2018. Consistent changes in the taxonomic structure and functional attributes of bacterial communities during primary succession. The ISME journal:1. 33. Aßhauer KP, Wemheuer B, Daniel R, Meinicke P. 2015. Tax4Fun: predicting functional profiles from metagenomic 16S rRNA data. Bioinformatics 31:2882-2884. 34. Llorens-Marès T, Yooseph S, Goll J, Hoffman J, Vila-Costa M, Borrego CM, Dupont CL, Casamayor EO. 2015. Connecting biodiversity and potential functional role in modern euxinic environments by microbial metagenomics. The ISME journal 9:1648. 35. Lüneberg K, Schneider D, Siebe C, Daniel R. 2018. Drylands soil bacterial community is affected by land use change and different irrigation practices in the Mezquital Valley, Mexico. Scientific reports 8:1413. 36. Vila‐Costa M, Sharma S, Moran MA, Casamayor EO. 2013. Diel gene expression profiles of a phosphorus limited mountain lake using metatranscriptomics. Environmental microbiology 15:1190-1203. 37. Henschel A, Anwar MZ, Manohar V. 2015. Comprehensive meta-analysis of ontology annotated 16S rRNA profiles identifies beta diversity clusters of environmental bacterial communities. PLoS computational biology 11:e1004468. 38. Lozupone C, Stombaugh J, Gonzalez A, Ackermann G, Wendel D, Vázquez-Baeza Y, Jansson JK, Gordon JI, Knight R. 2013. Meta-analyses of studies of the human microbiota. Genome research:gr. 151803.112. 39. Lozupone CA, Knight R. 2007. Global patterns in bacterial diversity. Proceedings of the National Academy of Sciences 104:11436-11440. 40. Engelbrektson A, Kunin V, Wrighton KC, Zvenigorodsky N, Chen F, Ochman H, Hugenholtz P. 2010. Experimental factors affecting PCR-based estimates of microbial species richness and evenness. The ISME journal 4:642. 41. Castañeda LE, Barbosa O. 2017. Metagenomic analysis exploring taxonomic and functional diversity of soil microbial communities in Chilean vineyards and surrounding native forests. PeerJ 5:e3098. 42. Janssen PH. 2006. Identifying the dominant soil bacterial taxa in libraries of 16S rRNA and 16S rRNA genes. Applied and environmental microbiology 72:1719-1728. 43. Bintrim SB, Donohue TJ, Handelsman J, Roberts GP, Goodman RM. 1997. Molecular phylogeny of Archaea from soil. Proceedings of the National Academy of Sciences 94:277-282. 44. Buckley DH, Graber JR, Schmidt TM. 1998. Phylogenetic analysis of nonthermophilic members of the kingdom Crenarchaeota and their diversity and abundance in soils. Applied and Environmental Microbiology 64:4333-4339. 45. Nicol GW, Glover LA, Prosser JI. 2003. Spatial analysis of archaeal community structure in grassland soil. Applied and Environmental Microbiology 69:7420-7429. 46. Ochsenreiter T, Selezi D, Quaiser A, Bonch‐Osmolovskaya L, Schleper C. 2003. Diversity and abundance of Crenarchaeota in terrestrial habitats studied by 16S RNA surveys and real time PCR. Environmental Microbiology 5:787-797. 47. Sandaa R-A, Enger Ø, Torsvik V. 1999. Abundance and diversity of Archaea in heavy- metal-contaminated soils. Applied and Environmental Microbiology 65:3293-3297. 48. Simon HM, Dodsworth JA, Goodman RM. 2000. Crenarchaeota colonize terrestrial plant roots. Environmental microbiology 2:495-505. 49. Kemnitz D, Kolb S, Conrad R. 2007. High abundance of Crenarchaeota in a temperate acidic forest soil. FEMS Microbiology Ecology 60:442-448. 50. Zhalnina K, de Quadros PD, Gano KA, Davis-Richardson A, Fagen JR, Brown CT, Giongo A, Drew JC, Sayavedra-Soto LA, Arp DJ. 2013. Ca. Nitrososphaera and Bradyrhizobium are inversely correlated and related to agricultural practices in long-term field experiments. Frontiers in microbiology 4:104. 51. Lavoie KH, Winter AS, Read KJ, Hughes EM, Spilde MN, Northup DE. 2017. Comparison of bacterial communities from lava cave microbial mats to overlying surface soils from Lava Beds National Monument, USA. PloS one 12:e0169339. 52. Riquelme C, Rigal F, Hathaway JJ, Northup DE, Spilde MN, Borges PA, Gabriel R, Amorim IR, Dapkevicius MdLN. 2015. Cave microbial community composition in oceanic islands:

disentangling the effect of different colored mats in diversity patterns of Azorean lava caves. FEMS microbiology ecology 91. 53. Tomazini Jr A, Lal S, Munir R, Stott M, Henrissat B, Polikarpov I, Sparling R, Levin DB. 2018. Analysis of carbohydrate-active enzymes in Thermogemmatispora sp. strain T81 reveals carbohydrate degradation ability. Canadian journal of microbiology 64:1-12. 54. Yabe S, Sakai Y, Abe K, Yokota A. 2017. Diversity of Ktedonobacteria with Actinomycetes- Like Morphology in Terrestrial Environments. Microbes and environments 32:61-70. 55. Hanada S. 2014. The phylum Chloroflexi, the family Chloroflexaceae, and the related phototrophic families Oscillochloridaceae and Roseiflexaceae, p 515-532, The prokaryotes. Springer. 56. Barata A, Malfeito-Ferreira M, Loureiro V. 2012. The microbial ecology of wine grape berries. International journal of food microbiology 153:243-259. 57. Kachalkin A, Abdullabekova D, Magomedova E, Magomedov G, Chernov IY. 2015. Yeasts of the vineyards in Dagestan and other regions. Microbiology 84:425-432. 58. , Raspor P. 2010. The effect of fungicides on yeast communities associated with grape berries. FEMS yeast research 10:619-630. 59. ČadežComitini N, ZupanF, Ciani J M. 2008. Influence of fungicide treatments on the occurrence of yeast flora associated with wine grapes. Annals of microbiology 58:489-493. 60. Kepler RM, Maul JE, Rehner SA. 2017. Managing the plant microbiome for biocontrol fungi: examples from Hypocreales. Current opinion in microbiology 37:48-53. 61. Pitt JI, Hocking AD. 2009. The ecology of fungal food spoilage, p 102, Fungi and food spoilage. Springer. 62. Pancher M, Ceol M, Corneo PE, Longa CMO, Yousaf S, Pertot I, Campisano A. 2012. Fungal endophytic communities in grapevines (Vitis vinifera L.) respond to crop management. Applied and environmental microbiology:AEM. 07655-11. 63. Jayawardena RS, Purahong W, Zhang W, Wubet T, Li X, Liu M, Zhao W, Hyde KD, Liu J, Yan J. 2018. Biodiversity of fungi on Vitis vinifera L. revealed by traditional and high-resolution culture-independent approaches. Fungal Diversity 90:1-84. 64. Berendsen RL, Pieterse CM, Bakker PA. 2012. The rhizosphere microbiome and plant health. Trends in plant science 17:478-486. 65. Murray RS, Grant CD. 2007. The impact of irrigation on soil structure. Land and Water Australia:1-31. 66. King GM, Hungria M. 2002. Soil-atmosphere CO exchanges and microbial biogeochemistry of CO transformations in a Brazilian agricultural ecosystem. Applied and environmental microbiology 68:4480-4485. 67. Aguilera E, Guzmán G, Alonso A. 2015. Greenhouse gas emissions from conventional and organic cropping systems in Spain. II. Fruit tree orchards. Agronomy for Sustainable Development 35:725-737. 68. Gosling P, Shepherd M. 2005. Long-term changes in soil fertility in organic arable farming systems in England, with particular reference to phosphorus and potassium. Agriculture, ecosystems & environment 105:425-432. 69. Mäder P, Fliessbach A, Dubois D, Gunst L, Fried P, Niggli U. 2002. Soil fertility and biodiversity in organic farming. Science 296:1694-1697.

Conflict of Interest A.Acedo, and R. Ortiz-Álvarez are currently employed at Biome-Makers. I. Belda was employed by Biome Makers (contributing to this work under the framework of his postdoctoral Torres Quevedo Grant – PTQ08253) but currently working at Rey Juan Carlos University (Spain). Part of the data of this study, necessary to replicate the experiment will be available under request and are covered by patent. All the material concerning the MICROWINE Project will be stored in a public repository.

Acknowledgements The authors would like to thank the Horizon 2020 Programme of the European Commission within the Marie Schlodowska-Curie Innovative Training Network “MicroWine” (grant number 643063) for financial support and all the wineries that provide samples to perform this study

MANUSCRIPT 2

Quantitative and qualitative evaluation of the impact of the G2 enhancer, bead sizes and lysing tubes on the bacterial community composition during DNA extraction from recalcitrant soil core samples based on community sequencing and qPCR

This paper describe the impact on the bacterial community of a reagent called G2 DNA/RNA Enhancer, which should increase the yield of DNA obtained from soil, preventing the adsorption to the matrix after the lyses step. Within this paper we tested also the impact of beads and tubes. The paper has been submitted to Plos- One and it is currently under revision. Furthermore, it is available on Biorxiv at the following link https://doi.org/10.1101/365395

I conceived together with R.G. Santini, the idea of this paper, while testing G2 DNA/RNA Enhancer, under suggestion of C.S. Jacobsen, on the soil-core samples collected from R.G. Santini. After a preliminary study it was clear to us that there was the potential to produce a comprehensive experiment to determine and develop an improved direct DNA extraction methods from difficult soil samples. This after noticing that most of the existing protocols exploit indirect DNA extraction methods to face the problem of inhibitors and soil-DNA-binding properties. I performed the laboratory part with E. Filippi and R.G. Santini. I therefore analyzed and interpreted the results. Finally I and R.G. Santini wrote the publication.

Quantitative and qualitative evaluation of the impact of the G2 enhancer, bead sizes and lysing tubes on the bacterial community composition during DNA extraction from recalcitrant soil core samples based on community sequencing and qPCR

Alex Gobbi1¶, Rui G. Santini2¶, Elisa Filippi1, Lea Ellegaard-Jensen1, Carsten S. Jacobsen1, Lars H. Hansen1*

1 Department of Environmental Science, Aarhus University, Roskilde, Denmark 2 Natural History Museum, Centre for GeoGenetics, University of Copenhagen, Copenhagen, Denmark

* Corresponding author E-mail: [email protected] (LHH)

¶ These authors contributed equally to this work.

Abstract Soil DNA extraction encounters numerous challenges that can affect both yield and purity of the recovered DNA. Clay particles lead to reduced DNA extraction efficiency, and PCR inhibitors from the soil matrix can negatively affect downstream analyses when applying DNA sequencing. Further, these effects impede molecular analysis of bacterial community compositions in lower biomass samples, as often observed in deeper soil layers. Many studies avoid these complications by using indirect DNA extraction with prior separation of the cells from the matrix, but such methods introduce other biases that influence the resulting microbial community composition. To address these issues, a direct DNA extraction method was applied in combination with the use of a commercial product, the G2 DNA/RNA Enhancer®, marketed as being capable of improving the amount of DNA recovered after the lysis step. The results showed that application of G2 increased DNA yields from the studied clayey soils from layers from 1.00 to 2.20 m. Importantly, the use of G2 did not introduce bias, as it did not result in any significant differences in the biodiversity of the bacterial community measured in terms of alpha and beta diversity and taxonomical composition. Finally, this study considered a set of customised lysing tubes for evaluating possible influences on the DNA yield. Tubes customization included different bead sizes and amounts, along with lysing tubes coming from two suppliers. Results showed that the lysing tubes with mixed beads allowed greater DNA recovery compared to the use of either 0.1 or 1.4 mm beads, irrespective of the tube supplier. These outcomes may help to improve commercial products in DNA/RNA extraction kits, besides raising awareness about the optimal choice of additives, offering opportunities for acquiring a better understanding of topics such as vertical microbial characterisation and environmental DNA recovery in low biomass samples.

Introduction The complex chemical and physical structure of soil greatly influences the binding strength of DNA to its particles. Soil factors, such as clay type and content, concentration and valence of cations, the amount of humic substances and pH, dictate the adsorption of DNA into the soil matrix [1, 2]. Clay minerals effectively bind DNA and other charged molecules, such as Ca2+, Mg2+ and Al3+. This binding effect can be either direct, due to the positively charged edges of the clay particles, or indirect due to the neutralising bridge between negative charges on clay surfaces and DNA macromolecules caused by the presence of polyvalent cations (Ca2+, Mg2+ and Al3+) [2-4]. The increased binding of DNA to soil particles significantly reduces degradation of this biological material by extracellular microbial DNases and nucleases [5-7]. However, the same binding force that protects DNA from degradation reduces the amount of DNA recovered during DNA extraction. Appropriate DNA recovery from the soil matrix is also affected when the soil layers, especially those located below the topsoil, contain lower biomass and hence smaller DNA quantities. Indeed, deeper soil layers generally have a lower carbon content, lower nutrient concentrations and consequently lower microbial biomass [8, 9]. Despite

the decrease in microbial biomass with soil depth, the hitherto poorly characterised communities dwelling in the deeper layers of the soil perform important roles in carbon sequestration [10], nutrient cycling [11, 12], mineral weathering and soil formation [13, 14], contaminant degradation [13] and groundwater quality [11]. Depending on the methodological approach applied, DNA extraction methods can be divided into direct and indirect methods. In direct extraction protocols, lysis is the first step and the microorganisms are treated with the matrix. During indirect extraction, however, the first step involves the detachment of the microbes from the soil matrix, generally conducted in a liquid media or supplemented by the use of density centrifugations such as in Nycodenz extractions [15]. Both these methods introduce a different extraction bias [16]. During direct extractions from inorganic soils that are particularly rich in clay, the DNA liberated from the microbial cells is quickly adsorbed onto clay particles, preventing complete recovery. During indirect extractions, regardless of which sample or soil type, the bias is due to the different efficacy of the separation treatment on specific microbes, which may enrich one particular microbial fraction over another [17]. These complications may impede the extractions from soil and sediment layers, particularly from lower depths, and thus hamper the study of topics such as vertical microbial characterisation and environmental ancient DNA. The importance of obtaining high DNA yields is also related to greater representativeness of the soil gene pool, reducing the bias introduced into the successive analyses [18, 19] . Although many authors working with soil or sediments have opted for indirect DNA extraction methods to overcome these problems [20-23], the present study focused on a direct approach. Direct extraction methods have been shown to recover the greatest diversity in terms of the number of OTUs, especially if based on mechanical lysis such as bead beating when compared with other lysis methods [24]. With the aim of reducing the retention capacity of clay particles, the commercial product G2 DNA/RNA Enhancer® (Ampliqon A/S, Odense, Denmark), hereafter referred to as G2, has been introduced into lysing matrix tubes. The G2 product is marketed as being able to improve the amount of DNA recovered after the lysis step [25]. G2 is a product made from freeze-dried highly mutagenised salmon sperm DNA [25] that adsorbs, like environmental DNA, to the clay particle before cell lysis [26]. However, unlike salmon sperm DNA, the G2 enhancer is not amplifiable in downstream polymerase chain reaction (PCR) applications [25]. The primary objective of the present study was to test the effects of the commercial product G2 on DNA yield and the diversity of the bacterial community from silty clay soil samples from layers between 1.00 and 2.20 m, thus aiming to develop an improved and bias-reduced direct DNA extraction protocol for this type of challenging sample. In addition to the effect of G2 on DNA yield, tests were conducted using both customised and commercially available DNA extraction kits. Customised tubes for evaluating possible influences on the DNA yield were prepared using different bead sizes and amounts, along with different plastic lysing tubes from different suppliers.

Material and methods Soil core sampling Soil sampling was performed in September 2016 at a vineyard located in the municipality of La Horra in the Ribera del Duero region in Spain (41°43'46.82"N, 3°53'29.85"W). The sampling approach was designed to obtain two undisturbed soil cores, allowing later sub- sampling for DNA analysis at specific soil core depths. To do so, a Fraste Multidrill model PL was applied for intact soil core sampling using a hydraulic hammer. This model is typically used for standard penetration test (SPT) analysis, but in this case it was fitted with a PVC tube adapted internally to the metal probe rod to allow undisturbed soil core recovery. The soil cores were recovered in the PVC tubes with a 2.5” diameter in lengths of 0.60 m. The cores were then sealed at both ends, labelled and stored in a cold room (4- -sampling.

6Soil ˚C) sub until-sampling further sub and homogenisation Sub-sampling was performed by opening the cores and recovering soil from specific depths and putting the samples into sterile 15 ml plastic tubes. Sub-samples were taken from the central, untouched part of the cores collected using a sterile spatula and tweezers. After sub-sampling, the 15 mL plastic tubes were promptly frozen at - shipped to Denmark and kept frozen until DNA extraction. 18 ˚C, To obtain a homogenous soil sample for distribution between the DNA extraction tubes, a composite soil sample was produced by mixing 22.5 g of soil. This pool was composed by mixing 1.5 g of soil from 15 selected soil sub-samples representing soil depths ranging from 1.00 to 2.20 m, a depth range chosen on the basis of previous pilot studies. The pilot studies showed that below the depth of approximately 2.20 m, no measurable DNA could be recovered without using G2, which would make a comparison of microbial communities between the extractions with and without G2 unviable. To proceed with the DNA extraction, 0.4 g of the homogenised soil pool was placed in 51 lysing tubes. The DNA extraction followed the protocol of the FastDNA® Spin Kit for Soil (MP Biomedicals, LLC, Solon, OH, USA), but was modified with regard to the lysing tubes. For this study, different types of lysing tubes were prepared using commercial products made by MP Biomedicals and Ampliqon in order to test the effect of plastics, beads and G2. All the preparations are summarised in Table 1. With these lysing tube combinations we aimed to compare the microbial communities for samples extracted with: 1) tubes containing different bead sizes (series compared: “g” with “h” and “i” and “a with “c” and “d”), 2) tubes containing G2 (series compared: “a” with “b”; “e” with “g”; and “f” with “h”), and 3) different tubes supplier (series compared: “a” with “i”; “c” with “g”; and “d” with “h”).

Table 1. All the lysing tubes used in the experiments.

Series n Tube Beads G2 a* 5 MP-Bio Mixed No b 5 MP-Bio Mixed Yes c 5 MP-Bio 1.4 mm No d 5 MP-Bio 0.1 mm No e* 5 Ampliqon 1.4 mm Yes f* 5 Ampliqon 0.1 mm Yes g 5 Ampliqon 1.4 mm No h 5 Ampliqon 0.1 mm No i 5 Ampliqon Mixed No A-NEG+G2 3 MP-Bio Mixed Yes A-NEG 3 MP-Bio Mixed No *Products already commercialised, while the remainder are customised preparations. A- NEG+G2 represents a negative control of the extraction with G2 added; A-NEG represents a negative control of the extraction using the kit’s lysing tube without a sample. With mixed beads, the intention was to have an exact proportion of 0.1 and 1.4 mm beads plus a large 4 mm glass bead. When mixed beads are used, the final weight of the beads is greater than the single type beads. It was decided not to modify this parameter since the test was intended to compare already existing commercial products. The customised preparations were made to evaluate the statistical effect of the different variables involved. Ampliqon and MP-Bio define the different tube supplier: Ampliqon A/S (Denmark) and MP-Biochemicals (Germany).

DNA quantification qPCR, library preparation and sequencing Following the extractions, the DNA yields were measured using Qubit®2.0 fluorometer (Thermo Scientific™). Qubit measurements were taken from all the tube preparations in triplicates and the average results reported in Table in S1 Supporting Information. All PCR reactions were prepared using UV sterilised equipment and negative controls were run alongside the samples. The qPCR with primers targeting the 16S rRNA gene was carried out on a CFX Connect™ Real-Time PCR Detection System (Bio-Rad). The primers used were 341F (TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-CCTAYGGGRBGCASCAG) and 806R (GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-GGACTACNNGGGTATCTAAT) [27], complete with adapters for Illumina MiSeq sequencing. Single qPCR reactions contained 4 µL of 5x HOT FIREPol® EvaGreen® qPCR Supermix (Solis BioDyne, Tartu, Estonia), 0.4 µL of forward and reverse primers (10 µM), 2 µL of bovine serum albumin (BSA) to a final concentration of 0.1 mg/mL, 12.2 µL of PCR grade

sterile water and 1 ng of template DNA. A standard curve consisting of dilution series of 16S standard was prepared from DNA extracts of Escherichia coli K-12, with seven 16S rRNA gene copies per genome [28]. The quantity of 10-1 16S standard was 8.45 × 107 16S genes/µL. Quantification parameters showed an efficiency of E=87.7 % and R2 of 0.997. The qPCR cycling conditions included initial denaturation at 95 °C for 12 min, followed by 40 cycles of denaturation at 95 °C for 15 sec, annealing at 56 °C for 30 sec, and an extension at 72 °C for 30 sec, with a final extension performed at 72 °C for 3 min. Amplicon library preparation was performed by a two-step PCR, as described by Feld, Nielsen (29) and Albers, Ellegaard-Jensen (30) with slight modifications. Sample concentration was approximately 5 ng of DNA, and both PCRs were carried out using a Veriti Thermal Cycler (Applied Biosystems). In each reaction of the first PCR, the mix contained 12 µL of AccuPrime™ SuperMix II (Thermo Scientific™), 0.5 µL of forward and reverse primer from a 10 µM stock, 0.5 µL of bovine serum albumin (BSA) to a final concentration of 0.025 mg/mL, 1.5 µL of sterile water and 5 µL of template. The reaction mixture was pre-incubated at 95 °C for 2 min, followed by 33 cycles of 95 °C for 15 sec, 55 °C for 15 sec, 68 °C for 40 sec, with a final extension performed at 68 °C for 4 min. Samples were subsequently indexed by a second PCR using the following PCR protocol.

(Thermo Scientific™), 2 µL of primers complete with indexes and P7/P5 ends, 7 µL of sterileAmplification water wasand performed5 µL of PCR1in 28 μLproduct. reactions The with cycling 12 µL conditioof AccuPrime™ns included SuperMix initial II denaturation at 98 °C for 1 min, followed by 13 cycles of denaturation at 98 °C for 10 sec, annealing at 55 °C for 20 sec, and extension at 68 °C for 40 sec, with a final extension performed at 68 °C for 5 min. The primer dimers formed in the PCR were removed, along with PCR components, using a clean-up step. In this step, HighPrep™ PCR reagent (MAGBIO) was used to selectively bind the DNA fragments to sequence according to the manufacturer’s protocol. PCR products were finally checked by electrophoresis on a 1.5 % agarose gel. Samples were then pooled in an equimolar amount of 10 ng and sequenced at Aarhus University (Roskilde, Denmark) on Illumina MiSeq instrument using 2x250 paired-end reads with V2 Chemistry.

Bioinformatics Sequencing data were analysed and visualised using QIIME 2 v. 2017.9 [31]. Demultiplexed reads from the Illumina MiSeq were quality filtered using the plugin quality filter with default parameters of QIIME2 [32]. Reads were then denoised, chimera checked and dereplicated using a DADA2 denoise-paired plugin [33]. The output was rarefied to the lowest sample at 16047 reads using qiime feature-table rarefy [34]. Thereafter a multiple-sequence alignment was performed using MAFFT [35] and subsequently a phylogenetic tree generated using FastTree [36]. Alpha and beta diversity analyses were performed through a q2-diversity plugin [37] with the core-metrics- phylogenetic method on the rarefied sequence-variant table. For the alpha diversity, two different parameters were measured: richness and evenness. Richness was measured based on Faith-pd [38], while evenness [39] was reported through the Pielou score [40]. This produced box plots and PCoA plots visualised through Emperor [41]. Taxonomic assignments were performed using qiime feature-classifier classify-sklearn in which a

pre-trained Naïve-Bayes classifier with Greengenes v_13.8 [42] was used. Taxa bar plots were built using the plugin qiime taxa bar plot with different filtered, unfiltered and grouped rarefied tables. All the data are available in the Sequence Read Archive (SRA) with the accession number PRJEB24676. Statistics A statistical evaluation of the results was performed separately for DNA quantification and sequencing dataset. For DNA quantification, assessments based on one-way ANOVA followed by Scheffe’s test were performed. The ANOVA test indicates whether there is at least one statistically significant (p-value below 0.05) difference in the whole dataset. After this we performed a t-test, on Qubit and qPCR results, to evaluate the impact of G2 on the whole dataset. Scheffe’s method, instead, indicates which group comparison is statistically significant (comparison score > critical Scheffe’s score). The critical value of Scheffe’s method is calculated starting from the F-critic of ANOVA test multiplied by (N- 1), where N is the number of comparisons performed with Scheffe’s method. Scheffe’s test was used here because it is less sensitive to an unequal number of samples representing the different variables analysed. Furthermore Scheffe’s methods have been used to reduce false positive results due to type I errors when multiple comparisons are performed on the same dataset [43]. Both tests were applied to the dataset of the DNA quantification obtained using Qubit and qPCR for the gene copy number. Since both results were consistent and G2 is mainly used to obtain DNA for PCR-based downstream applications, only the results of the qPCR are discussed. All statistical evaluations and visualisations regarding DNA amount and gene copy number quantification were performed in Microsoft Office Excel 2010. Sequencing data after QIIME 2 pipeline processing (2.4) were statistically evaluated using the Kruskal-Wallis test for alpha and beta diversity, a non-parametric method substitute of ANOVA when the normal distribution of data cannot be assumed. The resulting p-value of the alpha diversity comparison was based on the medians of different parameters (richness and evenness) calculated between the different series analysed. Beta diversity analyses were performed using both Kruskal Wallis and PERMANOVA with 999 permutations. Finally a statistical evaluation was performed of differentially abundant features based on an analysis of composition of microbiomes (ANCOM). ANCOM computes Aitchinson’s [44] log-ratio of relative abundance for each taxon, controlling the false discovery rate (FDR) using the Benjamini-Hochberg procedure. This test is based on the assumption that few features change in a statistical way between the samples, and hence it is very conservative [45]. All these statistical tests were applied using QIIME2 v2017.9.

Results The main aim of the present study was to evaluate the impact of the commercial product G2 on DNA extraction from a silty clay soil layer between 1.00 and 2.20 m. Soil samples description, defined as silty clay, was performed together with an experienced geologist. G2, freeze-dried inside the lysing tubes, was applied with the purpose of preventing or reducing the adsorption of environmental DNA onto the clay particles in the soil, thus improving the recovery of nucleic acids during cellular lysis. This paper presents the

results of the impact of G2 on final DNA extraction efficiency and its possible impact on the composition of soil microbial communities. Furthermore the G2 effect was compared with other variables such as bead size and tube supplier in different commercialised and customised lysing tubes, as summarised in Table 1. G2 enhanced DNA recovery from deep soil layers In order to test the influence of the presence of G2 on DNA yield, a series of extractions were set up using different bead size and tube combinations. Following DNA quantification via Qubit, the results were analysed. A first evaluation, using t-test to compare the series that differ for the presence of G2, resulted in a p-value of 9.29*10-16, indicating a higher yield when G2 is used. As seen in Fig 1A, the G2 component consistently improved the DNA yield (Scheffe’s score > 27.612 – Table in S1 Supporting Information). Fig 1A shows that there were differences between the different tube preparations, irrespective of the presence of G2. Different bead combinations and tubes without the addition of G2 were therefore tested. These variations could potentially be attributed to the different beads and the plastic of the tubes (Fig 1B). The mixed beads allowed greater DNA recovery compared to the use of either 1.4 mm or 0.1 mm beads, regardless of the type of plastic tubes used (Scheffe’s score > 27.612 – Table in S1 Supporting Information). The differences in the DNA yield between the 1.4 mm and 0.1 mm beads were less pronounced, including for the differences due to the plastic composition of the tubes. Mixed beads always allowed recovery of the highest DNA yield in all the tube preparations in which they were used.

Evaluation of G2 Effect on DNA Yield * 1.40 1.20 * 1.00 0.80 * 0.60 0.40 0.20 0.00 FAST DNA tubes Ampliqon tube Ampliqon tube

DNA DNA Concentration ng/ul + Mix Beads + 1,4mm beads + 0,1mm beads With G2 1.28 1.10 0.57 Without G2 0.68 0.32 0.26 a Beads Size and Tubes Comparison

0.80 * * 0.70 0.60 0.50 0.40 ** ** 0.30 0.20 0.10 0.00

DNA DNA Concentration ng/ul MP-Bio Tubes Ampliqon Tubes Mixed Beads 0.68 0.50 1,4mm beads 0.32 0.34 0,1 mm Beads 0.29 0.26 b Fig 1. Results based on Qubit quantification after DNA extraction expressed as ng/µl. (A) Comparison of three commercial products with and without G2. (B) Comparison of three different bead sizes and the two tube types, without added G2. Values shown are averages of n=3 independent measurements on n=5 biological replicates. Error bars are calculated from the standard deviation for each series. There was a statistical significance (Scheffe’s score > 27.612) for the three comparisons with and without G2 in test A, while for test B, it occurred in the comparisons between mixed vs. 1.4 mm beads and mixed vs. 0.1 mm beads, but not between 1.4 mm vs. 0.1 mm beads or between the two tube suppliers (Table in S1 Supporting Information).

Greater DNA yield corresponded to a higher number of 16S genes Considering that G2 is a DNA-based product, this could potentially affect the Qubit measurements since the signal recorded using a fluorescent dye is emitted when it binds dsDNA. In such a situation there might be an overestimation of soil DNA from the Qubit

result. To examine whether this was the case, the copy number variation of a genetic marker, such as the 16S rRNA gene for bacteria, was tested via qPCR. The results of the qPCR (Fig 2 and in Table in S3 Supporting Information) confirmed the trend observed with the DNA yield provided by the Qubit analysis. The ANOVA based on the qPCR results described the dataset with a p-value of 1.0147*E-36 (Table in S4 Supporting Information). This meant that in all the data produced by qPCR, taking into account the variance within and between groups, there was at least one difference that was statistically significant. We therefore performed a t-test, comparing the sample with and without G2, and it resulted in a p-value of 4.3*10-9, which indicate the statistical relevance of the usage of G2. However, to identify all the other differences statistically relevant in our database such as beads and tube suppliers, Scheffe’s method was applied (Table in S2 Supporting Information). Based on these results, it was evident that G2 always allowed the recovery of the highest number of genes, while mixed beads also improved gene recovery (Scheffe’s score > 26.117 – Table in S2 “a” Supporting Information). These results were also evident and consistent when looking at the DNA yield measured via Qubit, as reported in Table in S1 Supporting Information. The comparison of different plastic tubes (Table in S2 “e” Supporting Information) showed a non-significant difference. Instead, the beads used in lysis were shown to have a significant effect on DNA yield (Table in S2 “b” Supporting Information) and, as discussed below, also on microbial composition. From the Scheffe’s statistics, it was possible to observe that the mixed beads from the original lysing tube of the FastDNA® Spin Kit for Soil always recovered more DNA than the other bead preparations containing either 0.1 mm or 1.4 mm beads. Furthermore it was noted that although there was no statistically relevant difference between the use of 1.4 mm or 0.1 mm beads, the results changed, showing a statistical significance, when G2 was added (Table in S2 “d” Supporting Information).

Evaluation of G2 Effect on Copy Number 700000.00 * 600000.00

500000.00 *

400000.00 300000.00 ** 200000.00

Number of 16s Numberof 16s Genes/ul 100000.00

0.00 FAST DNA tubes + Ampliqon tube + Ampliqon tube + Mix Beads 1,4mm beads 0,1mm beads With G2 530846.34 385098.38 195552.21 Without G2 288930.60 146080.73 128784.14

Fig 2. 16S rRNA gene copy number variation after qPCR expressed as Genes/µl. A comparison between three commercial products with and without G2. Values shown are averages of n=3 technical replicates for n=3 biological replicates for each series. Error bars are calculated from the standard deviation for each series. (*) Significantly relevant comparison (Scheffe’s score > 26.117). (**) Not statistically relevant (Scheffe’s score < 26.117) (Table in S2 Supporting Information). Furthermore, the results reported in Table S2 “c” Supporting Information considered the possibility of basal contamination of the kit that may be relevant during the analyses or alternatively the possibility of a source of contamination introduced during the G2 freeze- drying process. In both cases, the control samples showed that basal contamination of the kit did not affect the results, and furthermore that the freeze-drying process did not introduce any more detectable DNA than the negative control. To visualise this statistical evaluation based on all the different comparisons using Scheffe’s method, the following procedure was applied: Scheffe’s score (Table in S2 Supporting Information) was averaged by grouping different comparisons based on the variables examined, i.e. G2, bead size and tube plastic. The average score was then plotted and the results shown in Fig 3. Calculations are reported in Table S5 Supporting Information. The results illustrated in Fig 3 show that the main influence to DNA yield came from the presence or absence of G2 in the lysing tubes, while the bead effect showed a smaller, but still relevant, significance. The results related to the two plastic tubes tested were non-significant.

G2

Plastic Beads Scheffe

Fig 3. Visualisation of the average score of the Scheffe’s test. Visualisation of the average score of the Scheffe’s test of categories a, b and e, according to Table in S2 Supporting Information, visualising the impact of G2 compared with the effect of plastics and beads. The values on which this visualisation is based are reported in Table in S5 Supporting Information.

Influence of G2 on the bacterial community structure We also tested whether the use of G2 influenced microbial community composition, for example by enriching particular taxa or skewing the relative taxa abundance between samples. Hence, a subset of the DNA samples was sequenced for the V3-V4 region of the 16S rRNA gene. A statistical evaluation of the differences in alpha and beta diversity allowed an assessment of the impact of G2 on the microbial community DNA from this sample. Similarly, the impact of other variables, such as tubes and beads, on microbial community DNA composition was also evaluated. After data pre-processing through the QIIME2 pipeline, denoised and rarefied exact sequence variants were obtained. From these, the effects of G2 and bead size on alpha diversity were examined, focusing on two parameters: richness and evenness. These results are summarised in Fig 4.

Fig 4. Alpha diversity analyses on G2 and bead impact on richness and evenness. (a) Boxplot based on the Faith-pd index and comparing all the samples with G2 (n=7) and without G2 (n=9). (b) Boxplot based on the Pielou-S score for evenness between samples obtained with G2 (n=7) and without G2 (n=9). (c) Boxplot based on the Faith-pd index and comparing all the samples with 0.1 mm (n=5), 1.4 mm (n=5) and mixed (n=6) beads. (d) Boxplot based on the Pielou-S score for evenness on samples obtained with 0.1 mm (n=5), 1.4 mm (n=5) and mixed (n=6) beads. The presence of G2 did not statistically significant affect the richness (p = 0.08) or evenness (p = 0.874) of the microbial community, as shown in Figs 4A and 4B respectively. The same tests were applied to the results of the bead and tube comparison. While the use of two different plastic tubes did not show any statistically significant effect in terms of richness (p = 0.874) or evenness (p=0.322), this was not the case for the beads. In particular, the use of mixed bead sizes introduced a significant negative difference in richness when compared to 0.1 mm beads (p = 0.03), and in evenness when performing the same pairwise comparison with 0.1 mm beads (p=0.01). This change in the microbial community resulted in a smaller amount of sequence variants and evenness when mixed beads were used. A further analysis was performed of the differences between the samples’ beta diversity using unweighted Unifrac and Bray-Curtis dissimilarity as metrics. These results were visualised using principal coordinates analysis (PCoA) plots and statistically evaluated in PERMANOVA. Fig 5 presents the PCoA plots showing the effects of the use of G2 (Fig 5A) and different beads (Fig 5B). These plots were obtained with the unweighted Unifrac distance matrix. The Bray-Curtis PCoA plot was consistent with this representation (not shown). PERMANOVA tests were performed with 999 permutations on beads, tubes and the G2 effect and the results are summarised in Table 2. These results showed that only mixed beads had a statistically relevant effect on beta diversity in the present study’s samples (p-value = 0.005 and 0.002). A PCoA plot of the tube type used did not allow a

clear distribution to be distinguished between the two different tubes (not shown) and this was confirmed by the PERMANOVA analysis (p-value = 0.465) (Table 2).

Fig 5. PCoA plots obtained from the UniFrac distance matrix of the sequenced samples from the different tube preparations. (A) Comparison of data obtained from samples with G2 (blue dots) and without G2 (red dots). (B) Comparison of data obtained from samples using 0.1 mm beads (green dots), 1.4 mm beads (blue dots) and mixed beads (red dots).

Table 2. Results of the PERMANOVA analyses.

Variable Groups Pseudo f- p- q-value value G2 Present /Absent 0.979 0.499 0.499 Beads 0.1 mm / 1.4 mm 0.907 0.787 0.787 Beads 0.1 mm / Mixed 1.639 0.005 0.0075* Beads 1.4 mm / Mixed 1.715 0.002 0.006* Tube Ampliqon/MP-Bio 0.979 0.465 0.465 The first row refers to the use of G2, the second, third and fourth rows refer to the comparison of different groups of beads, and the fifth row refers to the tube effect. *Statistically relevant comparisons. Pseudo f- refers to the ratio between cluster variance and within cluster variance; p-value is considered statistically relevant below 0.05; q- value is an adjusted p-value for false discovery rate (FDR) when multiple comparisons are performed. The compositional bar chart made on phylum resolution (Fig in S7 Supporting Information) shows that the bacterial communities of the DNA extracts with and without G2 were not visually distinguishable. By contrast, when looking at the profiles using different types of beads, these differences were more evident, especially between the mixed beads compared with 1.4 mm or 0.1 mm beads. All the data reported so far were obtained by excluding the negative control from the alpha and beta diversity analyses in order to maximise the effect of the different variables reducing any false negative results. This could be done after checking the composition of the negative control through a taxa- bar plot. The dominant taxa in the negative control belonged to the genus Ralstonia, already reported and known to be common contaminants of kits and PCR reagents [46]. This taxon was filtered out of the other sample since it was present in low amounts (~0.1 % across the different series). Furthermore, the other taxa that appeared in the negative controls were checked and since they were dominants in all the samples and represented only a small fraction of the negative control, it was decided not to remove them. The taxa composition (Fig in S1 Supporting Information) of the microbial community in the soil horizon between 1.00 and 2.20 m was shown to be mainly composed of members of the phyla of Proteobacteria, Actinobacteria, Acidobacteria and Chloroflexi. Archaea were also present and the dominant phylum was Crenarcheota, representing 10 % of the total relative abundance. For a better visualisation of the differences in terms of phyla attributable to the different variables involved, Fig 6 shows a heatmap of the microbial community composition of the dataset. Looking at the dendrogram on the left of Fig 6, two main groups can be distinguished according to the beads in the second heatmap. The use of mixed beads in particular altered the community profile, whereas none of the other variables produced this effect.

Fig 6. Heatmap at phylum level of the sequencing dataset. The logarithmic scale in which colour intensity determines the abundance of the taxa can be seen in the bottom right-hand corner. The heatmap represents the samples grouped by presence/absence of G2 (left legend), different bead sizes (middle legend), and different tube suppliers (right legend). The names of the phylum are stated below. On the left the dendrogram of similarity between all the samples is presented.

Finally, an ANCOM test was run in order to identify differentially abundant microbes in the different groups of samples and determine which variables mostly affected the result. Specifically, two different group comparisons were performed: one based on the presence/absence of G2 when mixed beads were used, and the other based on the difference between the use of mixed beads compared to 0.1 mm beads when G2 was used. These two tests confirmed that G2 did not statistically alter any taxa among the samples, as reported in Tables S8 and S9 Supporting Information. However there were four statistically relevant differences between the use of mixed and 0.1 mm beads. None of these four taxa belonged to the dominant fraction of the microbial community, but they represented around 1 % of the microbial community. Three of them were barely represented in the samples when 0.1 mm beads were used, while they were not present at all when mixed beads were used. They belong to the phyla of Acidobacteria and Gemmatimonadetes. Only one of these microbes was represented, at up to 1.36 %, when mixed beads were used compared to the 0.1 mm beads and it belongs to the order of Legionellales. The results of this last test are summarised in Tables S8 and S9 Supporting Information. Discussion The use of G2 allowed the recovery of more DNA in all settings and samples, making it the largest impact variable compared to the other variables of interest (tubes and beads). The increased DNA yield found in the presence of G2 was in accordance with Bælum et al. [47] and Jacobsen et al. [48]. The recovered DNA in the first case [47] was several orders of

magnitude higher, probably related to a higher clay content and/or the clay mineral type increasing the retention capacity of the matrix on the released DNA. A 7.5-fold average increase in DNA yield was obtained in the second study, which was an inter-laboratory test [48]. To resolve whether the amount of DNA detected was derived from the microorganisms dwelling in the soil and not to any residual G2, the impact on 16S rRNA gene copy numbers was measured using qPCR (Fig 2). The output confirmed the same trend observed for the Qubit results, meaning a significantly higher recovery of bacterial genes in all tested cases. The combined use of qPCR and Qubit guarantees the best qualitative and quantitative analyses of DNA [47, 49]. The differences between the measured DNA amount and the gene copy number can be explained by the qPCR only being applied to the bacterial population, leading to an underestimation of the total number of cells in the samples. This could partially be compensated for by the copy number variations of the 16S rRNA gene in different bacterial populations possibly leading to an overestimation of the recovered amount of cells within the present samples. G2 allowed the recovery of the largest amount of DNA when combined with mixed beads, but a relevant increase could also be observed by using just 1.4 mm beads, regardless of the plastic tube used. The number of cells could be estimated from the 16S rRNA gene numbers using the same methodology as Vishnivetskaya et al. [24], in which it was assumed that the average 16S copy number for the microbial community was 3.6, based on the observation of Klappenbach et al. [50]. This led to the calculation of a maximum of 3.7*105 cells g-1 of soil using G2 and 2*105 without G2. This value was lower than other reported cell counts, such as 2.2*108 using the same kit, but in topsoil [24], a difference that can be attributed to a deeper sampled layer. Comparing the present results with a direct DNA extraction from a deep soil layer, such as the one performed by Taylor et al. [51], a more similar value is obtained of around 1.2*106. Differences here could be related to the different soil type and the different kit used for the DNA extraction. The combination of the two statistical tests, ANOVA and Scheffe, allowed multiple comparison analyses on a single dataset, as reported by Mermillod-Blondin et al. [52] and Brown [53]. In addition to what is reported in Table in S5 Supporting Information and visualised in Fig 3, it was interesting to note that although there was not a statistically relevant difference between the use of 1.4 mm or 0.1 mm beads, the results changed when G2 was added and showed a statistical significance (Table in S2 “d” Supporting Information). This conversion might be due to the fact that 1.4 mm beads allowed improved lysis of the cells in the sample, probably because of a better dissolution of the soil particles, compared with 0.1 mm beads. However, if G2 was not added, most of the extracted DNA was suddenly adsorbed to the clay particles of the matrix, preventing its recovery. To verify whether the use of G2 influenced the microbial composition, a subset of the DNA extracted from the soil samples was sequenced for the 16S rRNA gene. An evaluation of alpha diversity showed that evenness and richness did not change when comparing samples obtained with and without G2 or when using different tubes. Non-residual contamination of G2 has also been confirmed by Jacobsen at al. [48], where in-depth

sequencing did not show traces of the DNA originating from G2 in the final DNA extract [48]. In contrast, bead size type had a statistically relevant effect on these two parameters. The lower richness and evenness when mixed beads were used could be due to the fact that most of the total DNA extracted came from the dominant fraction of soil bacteria. Since these dominant bacteria were not selectively lysed by the 1.4 and 4.0 mm beads alone, but were by the 0.1 mm beads, this ratio between dominant and rare taxa would be maintained with every bead preparation, although the absolute number of lysed cells would be different. However, the library size from each sample was not proportional to the starting amount of DNA, leading to an uneven representation of rare taxa in the sample with a higher starting amount of DNA. Since the reads belonging to the dominant bacteria were sequenced multiple times for each sample, the richness when mixed beads were used was slightly lower. This was a common issue in all the PCR-based surveys and has also been confirmed by Gonzalez et al. [54]. The lower evenness was also expected. In this case, samples from mixed bead tubes had a lower number of rare taxa, leading to a greater imbalance between dominant and rare taxa and thus to less evenness. With a larger amount of reads per sample, the effect on richness and evenness could probably be cancelled out or reverted due to the sequencing depth effect [55]. Some of the differences caused by uneven sequencing coverage could be reduced by performing a rarefaction on the sample, as was done in this study, but cannot be avoided completely [56]. In terms of beta diversity, only the use of mixed beads had an effect on the different samples. The PCoA distributions presented in Fig 5A show a clear cluster belonging to extracted mixed bead samples. The three bead sizes (0.1 mm, 1.4 mm and 4.0 mm) could act together to provide a better dissolution of the sample and recover more DNA, but if sequencing is not deep enough it could lead to the detection of a reduced number of taxa, as was the case in this study. Furthermore, since commercial products were used here, it is worth noting that the amount of beads differed between the MP-Biochemical tubes and the Ampliqon tubes. In particular MP-Bio tubes have a higher volume of beads, and this could explain some of the differences detected in terms of the amount of DNA and the quality of the taxa. The synergistic bead effect, both positive and negative, was not evaluated in this study. Furthermore, the PCoA plots presented in Fig 5, and then confirmed by the PERMANOVA analyses (Table 2), showed that there was no statistical difference in microbial composition between the samples with or without the use of G2 (Fig 5A) and also irrespective of the plastic tubes used (not shown). Finally, in terms of taxonomic composition of the microbial community (Fig in S1 Supporting Information), these results are consistent with Janssen [57] and He [58] with regard to the dominant phyla of Proteobacteria, Actinobacteria and Acidobacteria in soil at different horizon levels.

Conclusions Based on DNA yield quantification and gene copy-number detection after qPCR, the current study demonstrated that the use of the commercial product G2 DNA/RNA Enhancer® (Ampliqon A/S, Odense, Denmark) increased the amount of DNA recovered from composite and homogenised silty clay soil samples. Furthermore, the use of G2 did not introduce any significant differences in the richness or evenness of the bacterial

community obtained after amplicon library sequencing when compared with those samples sequenced without the addition of G2. The two plastic lysing tubes tested had no effect on either the yield or the composition of the microbiota. In contrast, the use of different bead sizes had a significant effect. A higher DNA yield was obtained with the simultaneous presence of differently sized beads. Moreover, the use of mixed beads in this case led to a slightly lower richness and evenness in the taxa distribution, an effect that could be explained by the sequencing depth. In terms of future perspectives coming out of this study, it would be worth applying G2 to other kinds of samples in which DNA recovery could be affected by proteins or other compounds biasing downstream application. These tests may provide useful information for the improvement of existing commercial products in DNA/RNA extraction kits, raising awareness about the optimal choice of additives. Acknowledgments We acknowledge Jesper S. Pedersen (Ampliqon A/S) for preparing and providing laboratorial material and Patricia Benitez for fieldwork support.

References 1. Ladd JN, Foster RC, Nannipieri P, Oades J. Soil structure and biological activity. In: G S, JM B, editors. Soil biochemistry. 9. New York: Marcel Dekker; 1996. p. 23-78. 2. Nielsen KM, Calamai L, Pietramellara G. Stabilization of extracellular DNA and proteins by transient binding to various soil components. In: Nannipieri P, Smalla K, editors. Nucleic acids and proteins in soil. Soil biology: Springer; 2006. p. 141-57. 3. Greaves M, Wilson M. The adsorption of nucleic acids by montmorillonite. Soil Biology and Biochemistry. 1969;1(4):317-23. 4. Paget E, Simonet P. On the track of natural transformation in soil. FEMS Microbiology Ecology. 1994;15(1-2):109-17. 5. Khanna M, Stotzky G. Transformation of Bacillus subtilis by DNA bound on montmorillonite and effect of DNase on the transforming ability of bound DNA. Applied and Environmental Microbiology. 1992;58(6):1930-9. 6. Crecchio C, Stotzky G. Binding of DNA on humic acids: effect on transformation of Bacillus subtilis and resistance to DNase. Soil Biology and Biochemistry. 1998;30(8):1061-7. 7. Ogram A, Sayler GS, Gustin D, Lewis RJ. DNA adsorption to soils and sediments. Environmental science & technology. 1988;22(8):982-4. 8. Agnelli A, Ascher J, Corti G, Ceccherini MT, Nannipieri P, Pietramellara G. Distribution of microbial communities in a forest soil profile investigated by microbial biomass, soil respiration and DGGE of total and extracellular DNA. Soil Biology and Biochemistry. 2004;36(5):859-68. 9. Fritze H, Pietikäinen J, Pennanen T. Distribution of microbial biomass and phospholipid fatty acids in Podzol profiles under coniferous forest. European Journal of Soil Science. 2000;51(4):565-73. 10. Fierer N, Chadwick OA, Trumbore SE. Production of CO 2 in soil profiles of a California annual grassland. Ecosystems. 2005;8(4):412-29. 11. Madsen EL. Impacts of agricultural practices on subsurface microbial ecology. Advances in agronomy (USA). 1995. 12. -9. 13. Konopka A, Turco R. Biodegradation of organic compounds in vadose zone and aquifer sediments. Applied andRichter Environmental DD, Markewitz Micro D. biology.How deep 1991;57(8):2260 is soil? BioScience.-8. 1995;45(9):600 14. Hiebert FK, Bennett PC. Microbial control of silicate weathering in organic-rich ground water. Science-New York then Washington-. 1992:278-. 15. Holmsgaard PN, Norman A, Hede SC, Poulsen PH, Al-Soud WA, Hansen LH, et al. Bias in bacterial diversity as a result of Nycodenz extraction from bulk soil. Soil Biology and Biochemistry. 2011;43(10):2152-9.

16. Lombard N, Prestat E, van Elsas JD, Simonet P. Soil‐specific limitations for access and analysis of soil microbial communities by metagenomics. FEMS microbiology ecology. 2011;78(1):31-49. 17. Delmont TO, Robe P, Clark I, Simonet P, Vogel TM. Metagenomic comparison of direct and indirect soil DNA extraction approaches. Journal of microbiological methods. 2011;86(3):397-400. 18. Bürgmann H, Pesaro M, Widmer F, Zeyer J. A strategy for optimizing quality and quantity of DNA extracted from soil. Journal of microbiological methods. 2001;45(1):7-20. 19. Kennedy K, Hall MW, Lynch MD, Moreno-Hagelsieb G, Neufeld JD. Evaluating bias of Illumina-based bacterial 16S rRNA gene profiles. Applied and environmental microbiology. 2014;80(18):5717-22. 20. Wang Q, Cheng C, He L, Huang Z, Sheng X. Characterization of depth-related changes in bacterial communities involved in mineral weathering along a mineral-rich soil profile. Geomicrobiology Journal. 2014;31(5):431-44. 21. Duarte GF, Rosado AS, Seldin L, Keijzer-Wolters AC, van Elsas JD. Extraction of ribosomal RNA and genomic DNA from soil for studying the diversity of the indigenous bacterial community. Journal of Microbiological Methods. 1998;32(1):21-9. 22. Crecchio C, Gelsomino A, Ambrosoli R, Minati JL, Ruggiero P. Functional and molecular responses of soil microbial communities under differing soil management practices. Soil Biology and Biochemistry. 2004;36(11):1873-83. 23. Williamson KE, Kan J, Polson SW, Williamson SJ. Optimizing the indirect extraction of prokaryotic DNA from soils. Soil Biology and Biochemistry. 2011;43(4):736-48. 24. Vishnivetskaya TA, Layton AC, Lau MC, Chauhan A, Cheng KR, Meyers AJ, et al. Commercial DNA extraction kits impact observed microbial community composition in permafrost samples. FEMS microbiology ecology. 2014;87(1):217-30. 25. Bælum J, Jacobsen CS, inventorsImprovement of low-biomass soil DNA/RNA extraction yield and quality. Denmark patent EP 2 443 251 B1. 2012 25.04.2012. 26. Ampliqon. G2 DNA/RNA Enhancer 2017 [cited 2017 July 3]. Available from: http://ampliqon.com/en/products/pcr-enzymes/dnarna-extraction/g2-dnarna-enhancer/. 27. Hansen CHF, Krych L, Nielsen DS, Vogensen FK, Hansen LH, Sørensen SJ, et al. Early life treatment with vancomycin propagates Akkermansia muciniphila and reduces diabetes incidence in the NOD mouse. Diabetologia. 2012;55(8):2285-94. 28. Blattner FR, Plunkett G, Bloch CA, Perna NT, Burland V, Riley M, et al. The complete genome sequence of Escherichia coli K-12. science. 1997;277(5331):1453-62. 29. Feld L, Nielsen TK, Hansen LH, Aamand J, Albers CN. Establishment of Bacterial Herbicide Degraders in a Rapid Sand Filter for Bioremediation of Phenoxypropionate-Polluted Groundwater. Applied and environmental microbiology. 2016;82(3):878-87. 30. Albers CN, Ellegaard-Jensen L, Hansen LH, Sørensen SR. Bioaugmentation of rapid sand filters by microbiome priming with a nitrifying consortium will optimize production of drinking water from groundwater. Water research. 2018;129:1-10. 31. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, et al. QIIME allows analysis of high-throughput community sequencing data. Nature methods. 2010;7(5):335-6. 32. Bokulich NA, Subramanian S, Faith JJ, Gevers D, Gordon JI, Knight R, et al. Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing. Nature methods. 2013;10(1):57-9. 33. Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP. DADA2: high-resolution sample inference from Illumina amplicon data. Nature methods. 2016;13(7):581-3. 34. McDonald D, Clemente JC, Kuczynski J, Rideout JR, Stombaugh J, Wendel D, et al. The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome. GigaScience. 2012;1(1):7. 35. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular biology and evolution. 2013;30(4):772-80. 36. Price MN, Dehal PS, Arkin AP. FastTree 2–approximately maximum-likelihood trees for large alignments. PloS one. 2010;5(3):e9490. 37. Lozupone C, Knight R. UniFrac: a new phylogenetic method for comparing microbial communities. Applied and environmental microbiology. 2005;71(12):8228-35. 38. Faith DP. Conservation evaluation and phylogenetic diversity. Biological conservation. 1992;61(1):1-10. 39. Mulder C, Bazeley‐White E, Dimitrakopoulos P, Hector A, Scherer‐Lorenzen M, Schmid B. Species evenness and productivity in experimental plant communities. Oikos. 2004;107(1):50-63. 40. Pielou EC. The measurement of diversity in different types of biological collections. Journal of theoretical biology. 1966;13:131-44.

41. Vázquez-Baeza Y, Pirrung M, Gonzalez A, Knight R. EMPeror: a tool for visualizing high-throughput microbial community data. Gigascience. 2013;2(1):16. 42. N/A. QIIME 2 plugin supporting taxonomic classification 2017 [cited 2017 02 Dec 2017]. Available from: https://github.com/qiime2/q2-feature-classifier. 43. Petrinovich LF, Hardyck CD. Error rates for multiple comparison methods: Some evidence concerning the frequency of erroneous conclusions. Psychological Bulletin. 1969;71(1):43. 44. Aitchison J. The statistical analysis of compositional data. Journal of the Royal Statistical Society Series B (Methodological). 1982:139-77. 45. Mandal S, Van Treuren W, White RA, Eggesbø M, Knight R, Peddada SD. Analysis of composition of microbiomes: a novel method for studying microbial composition. Microbial ecology in health and disease. 2015;26(1):27663. 46. Salter SJ, Cox MJ, Turek EM, Calus ST, Cookson WO, Moffatt MF, et al. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC biology. 2014;12(1):87. 47. Bælum J, Scheutz C, Chambon JC, Jensen CM, Brochmann RP, Dennis P, et al. The impact of bioaugmentation on dechlorination kinetics and on microbial dechlorinating communities in subsurface clay till. Environmental pollution. 2014;186:149-57. 48. Jacobsen CS, Nielsen TK, Vester JK, Stougaard P, Nielsen JL, Voriskova J, et al. Inter-laboratory testing of the effect of DNA blocking reagent G2 on DNA extraction from low-biomass clay samples. Scientific Reports. 2018;8(1):5711. 49. Simbolo M, Gottardi M, Corbo V, Fassan M, Mafficini A, Malpeli G, et al. DNA qualification workflow for next generation sequencing of histopathological samples. PLoS One. 2013;8(6):e62692. 50. Klappenbach JA, Saxman PR, Cole JR, Schmidt TM. rrndb: the ribosomal RNA operon copy number database. Nucleic acids research. 2001;29(1):181-4. 51. Taylor J, Wilson B, Mills MS, Burns RG. Comparison of microbial numbers and enzymatic activities in surface soils and subsoils using various techniques. Soil Biology and Biochemistry. 2002;34(3):387-401. 52. Mermillod-Blondin F, Rosenberg R, François-Carcaillet F, Norling K, Mauclaire L. Influence of bioturbation by three benthic infaunal species on microbial communities and biogeochemical processes in marine sediment. Aquatic Microbial Ecology. 2004;36(3):271-84. 53. Brown AM. A new software for carrying out one-way ANOVA post hoc tests. Computer methods and programs in biomedicine. 2005;79(1):89-95. 54. Gonzalez JM, Portillo MC, Belda-Ferre P, Mira A. Amplification by PCR artificially reduces the proportion of the rare biosphere in microbial communities. PLoS One. 2012;7(1):e29973. 55. Smith DP, Peay KG. Sequence depth, not PCR replication, improves ecological inference from next generation DNA sequencing. PLoS One. 2014;9(2):e90234. 56. Weiss S, Xu ZZ, Peddada S, Amir A, Bittinger K, Gonzalez A, et al. Normalization and microbial differential abundance strategies depend upon data characteristics. Microbiome. 2017;5(1):27. 57. Janssen PH. Identifying the dominant soil bacterial taxa in libraries of 16S rRNA and 16S rRNA genes. Applied and environmental microbiology. 2006;72(3):1719-28. 58. He S, Guo L, Niu M, Miao F, Jiao S, Hu T, et al. Ecological diversity and co-occurrence patterns of bacterial community through soil profile in response to long-term switchgrass cultivation. Scientific Reports. 2017;7(1):3608.

Figure S1: Compositional bar charts on phylum resolution comparing G2 presence/absence (left), beads-effect (middle) and differences among tube supplier (right) with relative Taxonomic legend. The chromatic orders in the label start from the top-most abundant and re-cycle every eight taxa.

Supplemental Material Tab. S1 - Scheffe score for Qubit dataset based on the difference series comparisons FAST tube + Mix beads FAST + G2 FAST tube + Mix beads FAST - G2 172.86 FAST tube + 1.4 mm beads - G2 FAST tube + 0.1 mm beads - G2 1.12 FAST tube + Mix beads FAST - G2 FAST tube + 1.4 mm beads - G2 55.01 FAST tube + Mix beads FAST - G2 FAST tube + 0.1 mm beads - G2 71.78 Ampliqon tube + 1.4 mm beads + G2 Ampliqon tube + 1.4 mm beads - G2 292.02 Ampliqon tube + 0.1 mm beads + G2 Ampliqon tube + 0.1 mm beads - G2 46.50 27.61* Ampliqon tube + 1.4 mm beads + G2 Ampliqon tube + 0.1 mm beads + G2 132.48 FAST tube + 1.4 mm beads - G2 FAST tube + 0.1 mm beads - G2 1.12 Ampliqon tube + 1.4 mm beads - G2 Ampliqon tube + 0.1 mm beads - G2 1.54 Ampliqon tube + Mix beads FAST - G2 FAST tube + Mix beads FAST - G2 15.80 Ampliqon tube + 1.4 mm beads - G2 FAST tube + 1.4 mm beads - G2 0.31 Ampliqon tube + 0.1 mm beads - G2 FAST tube + 0.1 mm beads - G2 0.55 FAST tube + Mix beads FAST - G2 Ampliqon tube + Mix beads FAST - G2 15.80 Fast Tube + Mixed Beads + Neg + G2 FAST tube + Mix beads FAST + G2 675.95

(*) Scheffe Critic - calculated based on DNA amount (172.86) Statistically significant

Supplemental Material Tab. S2: Scheffe’s test results comparison. A result is considered to be statistically relevant when the Scheffe’s score is higher than the Scheffe’s critical value (Scheffe Critic = 26.117). * identify statistically relevant comparison. while letters identify the category in which they belong. a: Comparison based on presence/absence of G2. b: Comparison between mixed-beads and. respectively. 1.4 mm and 0.1 mm. c: Comparison between the series A-NEG+G2 and B. A-NEG+G2 and A-NEG (all reported in Table 1) to show that. based on the controls. G2 does not have any effect statistically relevant in terms of yield of DNA. d: Comparison between different bead-size (1.4 and 0.1 mm) in presence/absence of G2 e: Comparison between the different plastic tubes to evaluate the plastic-effect. Scheffe Critic First Comparison Second Comparison Scheffe Score Fast tube + Mix beads + G2 Fast tube + Mix beads 139.74 a*

Fast tube + 1.4 beads Fast tube + 0.1 mm beads 2.14

Fast tube + Mix beads Fast tube + 1.4 mm beads 48.62 b*

Fast tube + Mix beads Fast tube + 0.1 mm beads 71.14 b*

Ampliqon tube + 1.4 mm beads + G2 Ampliqon tube + 1.4 mm beads 136.41 a*

Ampliqon tube + 0.1 mm beads + G2 Ampliqon tube + 0.1 mm beads 10.64 a

Ampliqon tube + 1.4 mm beads + G2 Ampliqon tube + 0.1 mm beads + G2 85.78 d* 26.117 Ampliqon tube + 1.4 mm beads Ampliqon tube + 0.1 mm beads 0.71 d

Ampliqon tube + Mix beads Fast tube + Mix beads 0.24 e

Ampliqon tube + 1.4 mm beads Fast tube + 1.4 mm beads 0.00 e

Ampliqon tube + 0.1 mm beads Fast tube + 0.1 mm beads 0.37 e

Fast tube + Mix beads Ampliqon tube + Mix beads 0.24

Fast Tube + Mix Beads + Neg + G2 Fast Tube + Mix Beads + G2 334.82 c*

Fast Tube + Mix Beads + Neg + G2 Fast Tube + Mix Beads + Neg 5.04 c

Supplemental Material Tab. S3 - qPCR copy-number results (quantification based on 16s Gene/µl)

FAST tube + FAST tube + FAST tube + FAST tube + Ampliqon tube Ampliqon tube Ampliqon tube Ampliqon tube Ampliqon tube + Mix beads Mix beads 1.4 mm beads 0.1 mm beads + 1.4 mm + 0.1 mm + 1.4 mm + 0.1 mm Mix beads FAST FAST - G2 FAST + G2 - G2 - G2 beads + G2 beads + G2 beads - G2 beads - G2 - G2

239.674 533.510 150.081 136.344 344.158 197.511 146.488 122.612 306.625 288.782 474.536 146.095 119.145 361.691 217.290 147.103 99.747 278.049 278.633 373.218 156.656 86.502 447.643 139.696 135.779 104.346 257.049 357.835 441.625 120.722 168.695 460.500 174.806 168.359 161.059 289.533 310.236 711.104 161.245 126.655 417.669 188.429 134.722 141.191 286.099 324.357 541.500 98.555 114.850 436.516 190.285 142.007 143.750 271.976 265.700 547.510 154.716 126.569 383.342 246.825 139.759 128.784 266.401 271.138 608.738 158.943 78.312 347.810 193.563 143.349 128.784 275.977 264.020 545.876 169.024 89.805 266.555 211.566 157.162 128.784 278.869 Average

288.931 530.846 146.226 116.320 385.098 195.552 146.081 128.784 278.953 Std.

34.101 91.386 21.150 26.673 58.663 27.878 10.108 17.944 13.435

Supplemental Material Table S4 - ANOVA Table based on qPCR dataset

Source of Variation SS df MS F P-value F crit

Between Groups 1.58179E+12 9 1.75755E+11 93.2553326 1.0147E-36 2.0090249 Within Groups 1.39465E+11 74 1.884.663.581 Total 1.72126E+12 83

Supplemental Material Table S5 – Figure 3 Calculation

G2 Beads Tubes 139.74 2.1 0.00005 136.4 48.2 0.37 10.6 71.1 0.23 Average 95.58 40.46667 0.200017

S8 Supporting Information. ANCOM summary based on G2 comparison. (Online) S9 Supporting Information. ANCOM summary based on beads comparison. (Online)

MANUSCRIPT 3

Vertical stratification of microbial communities in vineyard soils

This paper describe how the microbial community of soil cores, taken in vineyards, differ along the depth in three different countries. This study was conducted in order to fill a gap in the existing literature about which microbes live in deep soil and how they are distributed between countries. Various microorganisms detected are potential influencers of plant health via microbe- root interactions while certain phyla strongly correlate with soil edaphic factors suggesting a soil composition driven selective process. This work is currently submitted to FEMS Microbiology Ecology. This study was initially conceived by R.G. Santini and his Supervisor K. Kjaer and included geochemistry analyses on soil cores using X-ray fluorescence techniques combined with microbiological analyses. After the arising of unexpected problems from the geochemical side I step more into the designing part of this work, where we focused on the development of the microbiological section. I performed all the experiments together with R.G. Santini which I trained in sequencing library preparation and then I performed the bioinformatics data analyses. To fulfill our goal we developed dedicated protocols which have been also resulting in Manuscript 2. I assisted R.G. Santini in the interpretation of the results and wrote parts of the manuscript.

Vertical stratification of microbial communities in vineyard soils

Rui G. Santini1*, Alex Gobbi2*, Lea Ellegaard-Jensen2, Lars H. Hansen2, Per Möller3 and Kurt H. Kjær1

• the two authors contributed equally to the study

1 Natural History Museum of Denmark, Centre for GeoGenetics, University of Copenhagen, Copenhagen, Denmark 2 Department of Environmental Science, Aarhus University, Roskilde, Denmark 3 Department of Geology/Quaternary Sciences, Lund University, Lund, Sweden

Abstract

A better understanding of the functional roles of microorganisms dwelling in different soil horizons relies on their taxonomical identification. Due to the relatively small number of studies that evaluate deeper soil depths, information about the changes in microbial communities along soil profiles is limited, however. To fill this knowledge gap, we retrieved 19 undisturbed soil cores taken down to depths of between 2.4 and 6 m from vineyards in Australia, Spain and Denmark. For microbial assessment, we sub- sampled the soil cores at different depths and sequenced the 16S rRNA gene and the ITS regions for, respectively, prokaryotes (313 samples) and fungi evaluation (63 samples). Microbial communities were shown to vary significantly with depth and geography. While several microbes’ taxa were detected solely in specific soil layers and countries, some prokaryotes were ubiquitous, dwelling at different soil depths, or depending on the geographical conditions. Various microorganisms detected along the soil profiles are potential influencers of plant health via microbe-root interactions, including the deep, fine roots found in plants such as grapevines. In addition, several bacterial phyla were correlated with soil edaphic factors, suggesting a strong effect of soil composition in shaping the microbial communities.

INTRODUCTION

Even though soil profiles can extend to many metres’ depth, soil microbiology studies focus mainly on the uppermost part, and generally the upper 0.25 m (1-3). Compared with deeper horizons, findings within the topsoil often show a significantly higher microbial density (1, 4, 5), a higher microbial diversity (1, 2, 6) and a differing composition (3, 5, 7). Despite such observations, the microbial communities dwelling in deeper soil profiles have important roles in carbon sequestration (8), nutrient cycling (9, 10), mineral weathering and soil formation (11, 12), contaminant degradation (11) and groundwater quality (9). In addition, the phylogeny and taxonomy of the microbial communities found in the deeper soil layers have hardly been studied hitherto (3). Such microbial community structures and diversity can therefore provide indispensable information for a better understanding of the soil system.

Soil microbial communities in forests and agricultural systems are important for nutrient transformation, mineralisation and cycling and plants’ root absorption, and are thus important for plant growth and productivity (13, 14). Another important positive influence on plant development is the symbiotic association between a wide range of microbes and plant roots (13, 15). On the other hand, two main negative effects of soil microbes on plant development are the presence of microbial pathogens and the bacteria-plant nutrient competition (13). Taking into consideration that these direct and indirect interactions all occur along the soil profile where the roots are found, deep- rooted plants, such as grapevines, can be expected to be influenced by the microbial communities found in the deeper soil layers.

The objective of this study is therefore to investigate the vertical distribution of microbial communities in vineyards in different geographical regions (Australia, Spain and Denmark) with expected different soil properties. We hypothesise that distinct microbial communities are adapted to different soil depths, with decreasing microbial diversity with depth. Soils for this study were collected in vineyards located in the countries mentioned above, from drill cores up to six metres long, a depth considered not uncommon for the vertical extension of grapevine roots (16). The soil-bacterial and archaeal communities were recovered via 16S rRNA gene sequencing, and information about soil-fungal communities was obtained through sequencing of the ITS region.

MATERIAL AND METHODS

Site descriptions

The three vineyards studied are located in: (1) the Margaret River region of southwestern Western Australia (33°49'22”S, 115°2'34”E); (2) the municipality of La Horra, Ribera del Duero region, north-central Spain (41°43'47”N, 3°53'29”W); and (3) Sjællands Odde, a peninsula in Odsherred municipality, located on the northwest coast of Zealand, in Denmark (55°57'21”N, 11°25'21”E) (Figure 1).

Figure 1. Vineyard areas and soil core-sampling locations. (A) La Horra, Ribera del Duero region, Spain, (B) Sjællands Odde, Denmark, and (C) Margaret River, Australia.

The soil from the vineyard in Western Australia is classified as a Chromosol, formed as a result of pedogenic processes during laterisation of the original geological strata (17, 18). Chromosol is a chemically weathered soil, mainly composed of Fe, Al, Ti, and Mn oxides, and generally lacking basic cations and primary silicate (19). The Spanish vineyard is located in the eastern domain of the Tierra de Campos facies, which is predominantly shale, interbedded by cross-bedded sands and gravels and calcareous horizons, which are regarded as being formed in a marginal part of an alluvial depositional setting (20). The Danish vineyard is situated at an intersection of glacial deposits from the last glaciation of the area, and onlapping postglacial sediments (21).

Soil core sampling and soil aliquot preparation

The undisturbed vineyard soil cores from Australia, Spain and Denmark were extracted during March 2016, September 2016 and August 2017, respectively. Soil cores were collected at an approximate distance of between 0.2 and 0.3 m from grapevine stems. Figure 1 shows soil sampling locations while Supplementary Table 1 presents soil core information.

The eight soil cores collected in Australia and the five cores collected in Denmark were obtained using the direct push technology, through a Geoprobe® model 6610 (Kejr, Inc., USA). In Spain, the six soil cores were collected using a Fraste Multidrill model PL (Fraste Spa, Italy), with a hydraulic hammer with a PVC tube adapted internally to the metal probe rod, allowing undisturbed soil recovery.

The collected soil cores were sub-sampled at several depth intervals, resulting in 12 to

22 sub-samples per core, depending on the final core depth (Supplementary Table 1). Sub-samples were taken from the central, untouched part of the cores, using a sterile spatula and tweezers. A total of 332 sub-samples were obtained from the, in total, 19 soil cores. The soil-core sub-sampling services were performed at the laboratories of the South West Institute of Technology (Western Australia), at the Spanish winery laboratory and at the University of Copenhagen (Denmark). The sub-sampled soil aliquots were transferred to individual 15-ml sterile plastic tubes and stored immediately at -20oC.

Soil properties

Soil pH was evaluated by mixing the freshly sampled soil and distilled water in a ratio of 1:2 (by volume). After one hour of incubation, the samples were measured with a pH probe (Hanna pH meter HI 98217, Hanna Instruments, USA). The pH values were characterised according to Schoeneberger et al., 2012 (22).

Sample preparation and analysis of total organic carbon (TOC) and nitrogen (N) were performed following the capsule method, based on established protocols (23, 24). Three soil cores from Australia, two from Spain and one from Denmark were selected for TOC and N analysis. From each of these six cores, we collected and analysed five to six soil sub- samples at depths of between 0.1 and 5 m.

Soil elemental analysis of the occurrence of Mg, Al, Si, P, S, Cl, K, Ca, Ti, Cr, Mn, Fe, Co, Ni, Cu and Zn was performed using a non-destructive Itrax core scanner (Cox Analytical Systems, Sweden). The Itrax core scanner applies X-ray fluorescence (XRF) spectrometry for evaluation of the elemental variation.

DNA extraction

Soil DNA extraction was performed on the 332 sub-samples from the 19 soil cores, using FastDNA® Spin Kit for Soil (MP Biomedicals, LLC, USA), combined with use of the commercial product G2 DNA/RNA Enhancer® (Ampliqon A/S, Denmark), hereinafter referred to as G2. Under controlled conditions, at the Ampliqon A/S facilities, 500 µL of freeze-dried G2 was added to each lysing matrix tube from the FastDNA® Spin Kit for Soil. Further information about G2 is available in Gobbi et al., in press (25).

To control the variability of the bacterial community within the same soil interval we produced triplicates of six sub-samples (Supplementary Table 1), resulting in 18 samples. A triplicate refers to three soil aliquots coming from the same 15-mL plastic tube, in this way representing the same soil horizon interval.

As a negative control, to verify possible sequenced DNA coming from the G2 product, we prepared five lysing tubes without soil addition, but with G2 added.

DNA extraction followed the FastDNA® Spin Kit for Soil protocol, and DNA yields were measured using a Qubit® 2.0 fluorometer (Thermo Fisher Scientific Inc., USA).

Library preparation and Illumina MiSeq amplicon sequencing

Amplicon library preparation was performed as two-step PCR, as reported by Feld et al. (29) and Albers et al. (30), but with minor modifications. In brief, the bacterial universal primers 341f/806r and fungal ITS1f/ITS2r were used to amplify the 16S rRNA gene and ITS regions, respectively. Further protocol details for 16S library preparation are described in Gobbi et al. in press (25). For ITS library preparation, the protocol and the cycle are the same as for 16S, except that Supreme NZYTaq II 2x Green Master Mix (NZYTech, Lda. – Genes and Enzyme, Portugal) was used instead of AccuPrime™. Samples pooled in equimolar amounts were sequenced on an Illumina MiSeq (Illumina Inc., USA) instrument, using the V2 Chemistry kit with 2x250 paired-end reads.

From the 332 initial samples for library preparation for 16S rRNA gene sequencing, 20 were excluded as a consequence of non-measurable DNA. Counting the triplicates and control samples, in total 327 samples were sent for 16S rRNA gene sequencing. For ITS region sequencing, a total of 67 samples, coming from 4 soil cores, were selected for library preparation. From this total, 4 samples were excluded and 63 were then sent for sequencing (Supplementary Table 1).

Bioinformatics and statistical analysis

Sequencing data was analysed, statistically evaluated and visualised, using QIIME 2 v. 2018.2 [31]. The bioinformatics pipeline for DNA-sequenced data treatment was performed in accordance with Gobbi et al. 2018 (in press) (25).

Statistical evaluation included tests for alpha and beta diversity, using the Kruskal-Wallis method, which is a non-parametric method substitute for ANOVA when the normal data distribution cannot be assumed. The resulting p-value of the alpha diversity comparison was based on the medians of different parameters (phylogenetic diversity PD and evenness), calculated between the different series analysed. Beta diversity analyses were performed, using both Kruskal Wallis and PERMANOVA with 999 permutations. A statistical evaluation was performed of differentially abundant features, based on an analysis of the composition of microbiomes (ANCOM). ANCOM computes Aitchinson’s (26) log-ratio of relative abundance for each taxon, controlling the false discovery rate (FDR) using the Benjamini-Hochberg procedure. This test is based on the assumption that few features change statistically between the samples, so that it is very conservative (27).

The relationship between the soil physico-chemical factors (pH, TOC, C:N ratio, Mg, Al, Si, P, S, Cl, K, Ca, Ti, Cr, Mn, Fe, Co, Ni, Cu, Zn) and the relative abundances of the main bacteria and archaea phyla observed at different depths were evaluated, using linear correlation (Pearson) in the StatPlus:mac LE v.6 program.

Data subsets

Soil profiles were divided into three categories: topsoil, medium layer and bottom layer. Further microbial discussion corresponds to the samples from soil depth intervals defined in these categories (Table 1).

Table 1. Soil depth intervals considered for discussion of microbial communities Australia Spain Denmark Soil layer Soil depth interval (m) Topsoil 0-0.4 0-0.3 0-0.3 Medium 0.5-2.4 0.4-2.4 0.4-2.4 Bottom 2.5-6 2.5-4.2 2.5-5

RESULTS

Soil properties

For the Australian samples, laboratory pH measurements at different depths of the soil cores (Supplementary Figure 1) indicated neutral conditions at a depth of 0.5 m, with a pH varying between 7.11 and 6.62, while measurements from further down, from a depth of 1 m and up to 4 m, indicated pH values of between 6.58 and 4.30, which are classified as slightly acidic to extremely acidic. Topsoil samples from Spain showed neutral to strongly alkaline conditions, with pH varying between 7.16 and 8.57, while pH measurements at depths of between 1 and 4.1 m indicated slightly alkaline to strongly alkaline conditions (7.36 to 8.64). For the Danish samples, the measurements indicated slightly acidic to neutral conditions in the topsoil, with pH varying between 6.31 and 6.99. An exception was seen for soil core Ob5, presenting a topsoil pH of 7.82. Measurements at between 1.5 and 5 m indicated slightly alkaline to strongly alkaline conditions, with values ranging between 7.45 and 8.65.

The highest values of TOC and total N occurred in the top-most samples (0.1 m), except for sample N in the core Bar1 (Spain), which showed a higher N value in the 0.5 m-depth sample (Figure 2). From a depth of 1 m and down to 3 to 5 m, depending on the soil core, measurement of TOC and N showed either relatively stable conditions, or less important variations. For the three countries, the C:N ratio generally decreased with depth (Figure 2).

Figure 2. TOC, N and C:N ratio measurements for selected soil core profiles. (A) Per cent of total soil organic carbon plotted on log scale, (B) Per cent of total N plotted on log scale, and (C) C:N ratio.

Filtering and initial assignment of amplicon sequences

After 16S rRNA gene sequencing, we obtained 12,877,007 quality-filtered reads of a minimum of 460 bp, with 10,459 to 128,705 sequences for each of the 313 samples. These sequences represent a total of 94,177 features, of which 92,006 represent bacteria and 1,967 belong to the archaea kingdom, while 204 features are unassigned. From the total bacteria and archaea features, 7,739 and 53, respectively, were only identified at kingdom level. For the ITS region sequencing, we obtained a total of 3,389,068 quality- filtered reads of 250 bp. They are divided into 63 samples, with 31,514 to 101,663 reads for each sample. In total, the ITS data set included 7,569 features assigned to fungi, 6 to protista, and 1 to chromista, besides 126 unassigned and 1 unidentified. Of the total number of fungi features, 5,807 were only identified at kingdom level.

Alpha diversity

Prokaryote phylogenetic diversity (Faith’s PD) showed a slightly uneven distribution between the vineyards (p = 0.024), with the Danish vineyard showing lower diversity, compared to Spain and Australia (Figure 3). Moreover, comparing the different cultivars, no significant phylogenetic diversity difference (p = 0.23) was measured. However, a significant phylogenetic diversity difference (p < 0.001) was observed between the three soil layers (top soil, medium layer and bottom layer). The phylogenetic diversity showed a consistent decrease from the topsoil to the bottom layer.

A

B

C

Figure 3. Alpha diversity box plots of prokaryotic phylogenetic diversity. Comparison of prokaryotic phylogenetic diversity according to: (A) vineyard location, (B) soil layers, and (C) cultivar.

The phylogenetic diversity distribution of fungi followed a similar trend as observed for prokaryotes (Figure 4). Even though vineyard location did not result in a significant difference in terms of number of species (p = 0.225), the three soil layers, defined according to depth intervals, showed significant phylogenetic diversity differences (p < 0.001). The topsoil revealed higher phylogenetic diversity, especially when compared with the bottom layer. Because only one cultivar per vineyard was considered for fungi evaluation, no phylogenetic diversity was measured for this parameter.

A

B

Figure 4. Alpha diversity box plots of fungi phylogenetic diversity comparison. Comparison of fungi phylogenetic diversity according to: (A) vineyard location and (B) soil layers.

Beta diversity Applying principal coordinate analysis (PCoA) to visualise the result of the calculated communities, based on the vineyard’s geographical position (Figure 5). The samples’ similaritydistance matrix, with each we otherobtained includes a significant a qualitative (p ≤ 0.001) and (semi)quantitative separation of the evaluationprokaryotic of the differences within the resulting data set. Samples from Denmark amid samples from Australia are all from topsoil, while the samples from Denmark clustered closer to Axis 2 all come from glacial sediment deposits (diamict) found at average depths of 1.40 m.

Figure 5. Vineyard location influences soil prokaryotic composition. The graph shows PCoA comparisons of prokaryotic (bacterial and archaeal) community compositions unweighted UniFrac distance (n=313).

As a consequence of the significant distinct prokaryotic communities, depending on vineyard geography, the influence of depth on the bacterial and archaeal communities according to soil layers was evaluated independently per vineyard (Figure 6). Significant distinct prokaryotic communities, depending on soil depth, were observed for the three -depth layer plot at the furthermost right-hand side of Axis 1 (Fig. 6A), near the cluster of the bottom layer, are samplesvineyards from (p ≤soil 0.001). layers Australianat between samples1.6 m and from 2.4 m, the and medium from the same soil core, K2. In the Danish vineyard, the medium-layer samples, amid the bottom-layer samples, all come from the same glacial sediment deposit layer (diamict). Furthermore, significantly distinct prokaryotic communities were shown according to grape cultivar in Australia (p

including different ≤ 0.001) and Denmark (p = 0.002). However, significant distinct prokaryotic communities (p ≤ 0.001) also occurred between the soil cores of the same vineyard,

cores within the same cultivar.

A B C

Figure 6. Soil depth influence on soil prokaryotic community compositions. The graphs are PCoA comparisons of bacterial and archaeal Bray Curtis dissimilarity for samples from (A) Australia (n=119), (B) Spain (n=99) and (C) Denmark (n=95).

In the same way as was observed for prokaryotes, a distinct fungal community

(Figure 7A). Due to the reduced number of samples, a PCoA of fungal communities’ distributionstructure was according observed, to dependingsoil layers was on thecreated vineyard’s that included geographical the three location countries (p ≤ (Figure 0.001) 7B). Despite the separation of the fungal-communities structure, depending on vineyard position, samples from the topsoil and from the bottom layer, irrespective of vineyard, are clustered according to respective soil layer, while medium-layer samples showed more dispersed distribution (p < 0.001). The two topsoil samples amid the middle layer samples correspond to the 0.3 m samples from the Spanish vineyard, while the samples from 0.1 m are clustered with the other topsoil samples from Australia and Denmark.

Figure 7. Vineyard location and soil depth influence on soil fungal community composition. The graphs are PCoA comparisons of fungal Bray Curtis dissimilarity from samples from Australia (n=17), Spain (n=27) and Denmark (n=19). (A) country and (B) soil depth effects on fungal communities.

Community structure and phylogenetic composition of prokaryotes

Based on the 313 samples evaluated after quality filtering, 1,415 taxa were identified, with an average of 194 taxa per sample (std. dev. = 69). Taxa shared between the three vineyard areas reached 726, while taxa shared between the three soil layers per vineyard varied from 415 to 491. Further details of the taxa numbers for vineyards and soil layers are shown in Figure 8. The term taxa applied here refers to the highest taxonomic level identified.

Figure 8. Venn diagrams showing the prokaryote distribution of the 1415 taxa detected in total. (A) Overlapping coloured areas indicate the number of taxa shared between the vineyard locations and between pairs of these. Numbers in brackets indicate taxa detected in only one out of the three countries (B, C and D) Overlapping coloured areas indicate the number of taxa shared between the three soil layers, and exclusively between pairs of soil layers, for samples coming from Australia, Spain and Denmark, respectively. Numbers in brackets indicate taxa detected in only one out of the three soil layers.

Counting the number of taxa shared between the soil layers per vineyard, the topsoil and medium layer showed the highest values of shared taxa, in comparison with taxa shared between medium and bottom layers, and between topsoil and bottom layer. Taxa detected within the three different soil layers showed similar values in the three countries, corresponding to 42% in Australia and Denmark, and 46% in Spain out of the total number of taxa detected in the respective vineyard area.

Out of the total of 1,415 taxa identified, 946 occurred in less than 10% of the 313 samples, while 41 were detected in more than 75%, and 16 in more than 90% of the samples. The

following five taxa, according to the higher taxonomic level, with the phyla specified in brackets, are emphasised as occurring in more than 300 out of the 313 samples: member of the families Syntrophobacteraceae (Proteobacteria) and Gaiellaceae (Actinobacteria), member of the classes S085 and Ellin6529 (both Chloroflexi), and member of the order iii1-15 (Acidobacteria).

Comparing the relative abundance of bacteria and archaea according to vineyard area, the highest values of archaea occurred for samples coming from Denmark (14.4%), followed by Spain (6.7%) and Australia (2.1%). Based on the higher taxonomic level among the 48 archaea taxa, members of the SAGMA-X and Cenarchaeaceae families, both belonging to Crenarchaeota phyla, were shown to be most abundant in the Danish soil. SAGMA-X and Cenarchaeaceae were also detected in Spain and Australia.

From the 54 bacterial phyla identified in the samples from the three vineyard areas, considering the entire soil profile evaluated, 10 phyla, besides one unclassified, showed relative abundance dominance, reaching summed values of 76%, 85% and 88%, respectively, in Denmark, Spain and Australia. These dominant phyla are: Proteobacteria, Actinobacteria, Acidobacteria, Chloroflexi, Firmicutes, Verrucomicrobia, Bacteroidetes, Nitrospirae, Planctomycetes and Gemmatimonadetes.

According to ANCOM test statistical results, of the 59 prokaryote phyla identified for our samples, 17 belonging to bacteria and 2 to archaea showed significant differences between the three vineyard areas. Running the ANCOM test for the class level, out of the 225 identified, 67 classes belonging to bacteria and 5 to archaea showed significant differences between the three vineyard areas (Supplementary Table 2).

Based on soil bacteria documented as plant growth-promoting rhizobacteria (PGPR), 19 species belonging to the genera Azospirillum, Bacillus, Clostridium and Pseudomonas were detected in our samples (Supplementary Table 3). Other genera with several members described as PGPR were also identified in our samples, but not assigned to species level, such as: Arthrobacter, Gluconacetobacter, Micrococcus, Flavobacterium, Rhizobium, Azoarcus, Serratia and Enterobacter. Of the PGPR identified, members of Bacillus were shown to be the most abundant, representing 0.7% in Australia, 0.8% in Spain and 1% in Denmark. Community structure and phylogenetic composition of fungi

The 63 sequenced samples provided 475 taxa belonging to the fungi kingdom, of which 65 are shared between the three vineyard areas. ANCOM test statistical results calculated that, of these 65 shared taxa, 4 showed significant differences between the three vineyard areas (Supplementary Table 4). The numbers of taxa identified in Australia, Spain and Denmark were 206, 240 and 265, respectively. An observed pattern is the pronounced reduction in the number of taxa at soil depths below 2.5 m. Comparing the topsoil with the bottom layer, the numbers of identified taxa belonging to fungi were 161 and 13,

respectively, in the Australian vineyard, 184 and 17 in the Spanish vineyard, and 179 and 28 in the Danish vineyard. Compared with the topsoil, the number of taxa in the medium layer also showed a reduction (117 in Australia, 121 in Spain and 170 in Denmark), although this was less striking than for the decrease observed in the bottom layer.

Of the seven documented phyla belonging to fungi, six were detected in Australia and Denmark (Ascomycota, Basidiomycota, Chytridiomycota, Glomeromycota, Rozellomycota and Zygomycota) and five in Spain (the same as previously listed, except for Rozellomycota).

As was the case for bacteria described as PGPR, we crossed the taxonomic data obtained from our samples with fungi listed as potential phytopathogens associated with grapevine diseases (28). Based on these listed fungi, 16 taxa were identified in our samples. Moreover, five other genera with several species associated with grapevine diseases were observed. Via the ITS region sequenced, it was not possible to reach species level identification (Supplementary Table 5), however. Of these 21 taxa, 16 were identified in just one or two samples from the corresponding vineyard, in general with a relative abundance <0.7%. More widespread taxa, occurring in a higher number of samples in the three vineyards investigated, and with a relative abundance generally >1%, refer to Alternaria sp., Clonostachys rosea, Fusarium sp. and Fusarium solani. The genera Cladosporium. Ilyonectria destructans was also frequent in the samples from the Spanish vineyard, but did not occur in Danish and Australian vineyards. Relative abundance, reaching values between 31.4% and 63.5%, was observed for Alternaria sp. in the Australian vineyard, Cladosporium in the Danish vineyard, and I. destructans in the Spanish vineyard.

Microbial communities according to soil layer - Prokaryotes

Evaluation of relative abundance within the soil core profiles, according to the three soil layer intervals (topsoil, medium layer and bottom layer), showed Acidobacteria and Proteobacteria among the four main phyla detected in the three vineyard areas and in the three soil layers. Actinobacteria was also detected among the four main phyla in the three soil layers from Australia and Spain, but only in the topsoil from Denmark. The other phyla observed among the main four, considering relative abundance values, for each vineyard and its respective soil layer are: Chloroflexi (medium layer in Australia and Spain, and bottom layer in Denmark), Firmicutes (bottom layer in Australia and topsoil in Spain and Denmark), Crenarchaeota (medium layer in Denmark and bottom layer in Demark and Spain), and Verrucomicrobia (topsoil in Australia and medium layer in Denmark). Individual relative abundances of these seven phyla varied between 7.7% and 39.6% (Supplementary Table 6). Evaluation of the main phyla-relative abundance variations, depending on soil layers, is presented in Figure 9. ANCOM test statistical results showed significant differences for relative abundances obtained in the three soil layers for 9 out of the 11 phyla mentioned in Figure 9. Exceptions occurred for

Gemmatimonadetes and Actinobacteria (Supplementary Table 7).

Figure 9. Relative abundance variations of the main phyla within the three soil layers. Phyla that showed relative abundance decrease ( arrow down ) and increase ( arrow up ) from the topsoil to the bottom layer in the Australian, Spanish and Danish

vineyards. Topsoil, medium and bottom layer respectively correspond to samples from the following depth intervals in m: 0-0.4, 0.5-2.4 and 2.5-6 in Australia; 0-0.3, 0.4-2.4 and 2.5-4.2 in Spain; and 0-0.3, 0.4-2.4 and 2.5-5 in Denmark.

Regarding archaea, it should be noted that, while samples from the Danish vineyard showed the highest relative abundance for this kingdom amongst the three vineyard areas, important differences can be observed on comparing the three soil layers. Danish samples from bottom layers showed a relative abundance of 39.6% for archaea, with 39.5% belonging to the Thaumarchaeota (Crenarchaeota) class. For comparison, in the topsoil archaea represented 5.4% and in the medium layer 14.9%, of the prokaryotes’ relative abundance.

Evaluating the higher taxonomic levels for the different soil layers, a series of bacterial taxa occurred as dominant in the three vineyards and in the three soil layers: iii1-15, Gaiellaceae, S085, Syntrophobacteraceae and Ellin6529. These five taxa are the same as previously described as occurring in more than 300 out of the 313 samples evaluated, and according to ANCOM test statistical results, there are no significant differences in the relative abundance of these between the three soil layers in the three vineyard areas. ANCOM calculated that the other 203 taxa have significant relative abundance differences between the three soil layers in the three vineyards (Supplementary Table 8.).

Regarding PGPR, we identified 60 taxa belonging to Rhizobiales, divided into 13 families, including two unclassified. On comparing the relative abundance of Rhizobiales in the three soil layers, a reduction by depth in the three vineyard areas occurred. Among the 13 Rhizobiales families identified in our samples, Hyphomicrobiaceae showed the highest relative abundance. For members of the Bacillus genus, a relative abundance reduction according to soil depth was also observed in the three vineyards. The other PRGP taxa showed relative abundances generally lower than 0.1%, not allowing for good correlation of the values between the different soil layers.

Microbial communities according to soil layer - Fungi

Observing phyla occurrence in the different soil layers (Figure 10), Chytridiomycota, Glomeromycota and Rozellomycota were not detected in the samples from the bottom layers, while Zygomycota showed a decrease in relative abundance in the soil-core bottom layers in the three vineyard areas.

A remarkable variation for our samples, on comparing relative abundances in the topsoil with the bottom layers, is the increase in unidentified fungi, especially for the Australian and Spanish vineyards. In these, the values in the topsoil were 28% and 15.7%, respectively, while the values for the bottom layers were 50.3% and 56.7%. In the Danish vineyard, unidentified fungi reached 52% in the bottom layers, although topsoil samples also showed a high value of 45%. Of the 5,807 features identified only until fungi kingdom level, 1,998 were detected in the Australian vineyard, 1,979 in the Spanish vineyard, and

2,020 in the Danish vineyard. Of these features, only 29 were detected in all the three vineyard areas, in addition to 21 shared between Australia and Spain; 53 between Australia and Denmark; and 29 between Spain and Denmark.

Unidentified features that occurred more than 10 times in the dataset were 1,094 in Australia, 1,154 in Spain and 1,147 in Denmark. From each vineyard area we blasted the five most predominant features (counts between 23,445 and 84,982) and the taxonomy reports generally showed a high score for uncultured fungi in the three vineyard areas, besides Alternaria sp., Phialocephala sp. and Neonectria sp. in Australia; Ilyonectria sp. and Dactylonectria sp. in Spain and Mortierella sp. in Denmark. Moreover, we checked the main shared features, considering an occurrence higher than 100 times in the three vineyard areas. The blast taxonomy report on the six features belonging to this main shared group showed a high score for uncultured fungus, Cladosporium sp., Meyerozyma guilliermondii Alternaria sp. and Truncatella sp. Most of the features of unidentified fungi detected in each vineyard area occurred exclusively in a specific soil layer, and only one feature in the Australian and four in the Danish vineyard occurred in the three soil layers. The occurrences of these shared features were 21,538 in Australia and between 107 and 12,217 in Denmark, and the blast showed that they seem to be assigned to uncultured fungus, besides Alternaria sp. in Australia, and Meyerozyma guilliermondii and Cladosporium sp. in Denmark.

Evaluation of the relative abundances for the higher taxonomic level identified per vineyard area according to soil layer generally showed a difference between those that were dominant (Supplementary Figure 2).

Figure 10. Fungi phyla relative abundance variation in the three soil layers per vineyard area. (A) Australian samples showed an increase in unidentified fungi and Ascomycota in the bottom layer, with a reduction in Basidiomycota and Zygomycota. (B) In Spain, the relative abundance of unidentified fungi greatly increased in the bottom

layer, while Ascomycota and Zygomycota decreased. (C) The Danish vineyard mainly showed an increase in Ascomycota in the bottom layer. Unidentified fungi in the topsoil were the highest from the three vineyard areas. Basidiomycota and Zygomycota were reduced in the bottom layer.

For the yeast group, we identified 15 taxa belonging to the Saccharomycetales order. The relative abundance variance of Saccharomycetales showed the same pattern of increase, with depth in the three vineyard areas. While Saccharomycetales relative abundances in the three vineyards were not detected above 0.42% in the topsoil, nor above 2% in the medium layer, bottom layer samples showed values of 26.7% in Australia, 29.2% in Spain and 17.8% in Denmark. Overall Saccharomycetales was the most abundant taxa identified in the bottom layer.

Contrary to what was observed for Saccharomycetales, taxa belonging to the Hypocreales and Mortierellales orders showed poor occurrence or were not detected in the bottom layer, while occurring amongst the most abundant taxa in the topsoil and medium layer.

Correlation between prokaryotic communities and soil edaphic factors

Variances with depth of measured pH, TOC and C:N were independently correlated with the main bacteria and archaea phyla relative abundances (Table 2). The relationship analysis between pH and the main phyla of samples from the Danish vineyard showed a statistically significant correlation (p < 0.05) for five groups. Positive significant correlation between pH and phyla occurred for the relative abundance of Crenarchaeota and Chloroflexi, and negative correlation was observed for Acidobacteria, Firmicutes and Proteobacteria. Australian vineyard samples showed a significant negative correlation between pH and Crenarchaeota, and a positive correlation for Verrucomicrobia. While the samples from the Spanish vineyard only showed a significant negative correlation between pH and Firmicutes.

Variations with depth in TOC and the main phyla-relative abundances showed significant positive (p < 0.05) correlation for Acidobacteria in the Spanish vineyards, for Actinobacteria in the Australian vineyard, and for Bacteroidetes and Firmicutes in the Danish vineyard. For variation with depth of the C:N ratio and bacteria, significant correlation only occurred for Acidobacteria in the Spanish vineyard and Actinobacteria in the Australian vineyard. However, several soil elements showed significant correlation (p < 0.05) with phyla relative abundances (Table 2).

1 DISCUSSION

2 Soil prokaryotic community composition and diversity

3 Our findings for the uppermost soil layer, as compared with deeper soil horizons, 4 are consistent with an often-significant higher microbial density, in general 5 associated with a higher diversity (1, 2, 4-6). Varying microbial composition with 6 depth, such as reported by K. G. Eilers et al. (3), , M. Hartmann et al. (5), C. M. Hansel 7 et al. (7) and M. Sagova-Mareckova et al. (29), was also confirmed, in our PCoA plots 8 and by the statistically significant differences obtained between the communities 9 found in the topsoil, medium layer and bottom layer. On comparing topsoil with the 10 medium layer in the three vineyard areas, microbial communities dwelling in 11 deeper horizons showed a lower diversity. A substantial number of taxa identified 12 in the bottom layer were shared with the upper soil layer; although numerous 13 unique taxa were only detected at this deeper layer. These microbial diversity 14 patterns at deeper soil depths have also been identified in other studies (3, 30, 31).

15 Despite a significant prokaryotic communities difference, depending on grape 16 cultivar, the fact that soil cores from the same vineyard and the same cultivar also 17 showed significant differences in these communities corroborates the general 18 concept that high microbial diversity differences can be found at very small scales 19 (32).

20 A reported increase of archaeal:bacterial ratio with soil depth (7) was also observed 21 in our samples from the three vineyard areas. The remarkable relative abundance 22 increase with depth for archaea in the samples from the Danish vineyard is probably 23 the result of the different pedogenic processes in this area. The local sediment at 24 depths below around 1.4 m is a diamict, which is a non-genetic term defined as an 25 unsorted, or poorly sorted, sediment in a wide range of particle sizes (33). This 26 material was deposited by the last Scandinavian ice sheet advance, flowing from 27 central Sweden into Denmark, and during its subsequent withdrawal that took place 28 during the Late Weichselian, (between 25 and 17 thousand years ago) (34). Younger 29 deposits above the diamict are ‘glaciofluvial sand and gravel’ and ‘marine sand and 30 gravel’ (35). Soil horizons inhabited by microbes coming from the original 31 depositional material, and that have persisted over time, were previously reported 32 (36, 37). Thaumarchaeota, which is the main archaea class detected by us in 33 Denmark, is known to be abundant in marine sediments (38), with considerable 34 detection at greater soil depths (3, 5, 7). Thaumarchaeota species are involved in 35 nitrogen cycling, including all known ammonia-oxidizers archaea, besides other 36 archaea groups with unknown energy metabolism (39). In several studies 37 Thaumarchaeota is regarded as a new phylum, as proposed by C. Brochier-Armanet 38 et al. (40). However, in accordance with the Greengenes taxonomy employed in this 39 study, it is still included as a class of the Crenarchaeota phylum.

40 Proteobacteria and Acidobacteria are referred to as the most abundant bacterial 41 phyla within soils (41). These two phyla have also been detected amongst the main 42 phyla in studies of vineyard topsoil (42, 43), and we demonstrate that this is also the 43 case for the topsoil in our studied vineyards. Most soil samples analysed for bacterial 44 communities evaluation come from the uppermost soil layer; and therefore phyla 45 distribution at greater depths is still poorly understood (3). Despite the detection of 46 Proteobacteria and Acidobacteria as the main phyla within the soil profiles from our 47 three vineyard areas, we observed gradual reductions in the relative abundances of 48 Acidobacteria in all three vineyards, and also a reduction with depth of 49 Proteobacteria in the Spanish and Danish vineyards. Relative abundance reductions 50 of Proteobacteria with depth have previously been reported (3, 31). However, other 51 studies have also detected the opposite, i.e. increases of this phyla at greater depths 52 (44, 45). For Acidobacteria, different studies have not shown any clear shifts (3, 29, 53 44), while other findings report increases of Acidobacteria at greater depths (7, 45). 54 Due to the different study locations, soil type, and land use and management, these 55 inconsistencies suggest the importance of site-specific soil physico-chemical 56 characteristics, which influence the bacterial community composition (3, 46). Site- 57 specific characteristics can also be the reason for the decrease with depth of 58 Firmicutes at our Spanish and Danish vineyards, while we observed an almost 59 threefold increase of Firmicutes relative abundance with depth at our Australian 60 vineyard. Firmicutes in the bottom layer of samples from Australia reached a relative 61 abundance close to that of Proteobacteria, the latter being the main phylum 62 observed within this deeper layer. Western Australian soils are highly weathered, 63 which implies a lack of basic cations, and that they are iron rich and have an acidic 64 pH (47). Firmicutes are referred to as thriving in extreme environments (48). Higher 65 relative abundances of Firmicutes at deeper soil depths have been reported (7, 29, 66 45), but also reduced abundances, e.g. in C. Will et al. (2).

67 As observed for our three vineyard areas, a relative abundance reduction with depth 68 for Verrucomicrobia has also been reported in other studies (2, 3, 7, 44). Moreover, 69 K. G. Eilers et al. (3) and C. M. Hansel et al. (7) detected an increase in the relative 70 abundance of Verrucomicrobia in the soil layers just below the topsoil, followed by 71 a reduction at greater depths, which is a similar pattern to that observed for the 72 Danish vineyard. Knowledge of Verrucomicrobia is still limited, other than that they 73 are ubiquitous in soils (49) and suggested to be free-living decomposers, with the 74 Spartobacteria as the dominant class (31). The genus DA101, belonging to 75 Spartobacteria, was indeed detected in our three vineyard areas and found to be 76 dominant in the topsoil and medium layer samples from the Australian and Danish 77 vineyards. The functional role of DA101 is not yet fully understood, but recent works 78 indicate that it is a widespread global soil species (50, 51). Also belonging to 79 Verrucomicrobia, two unidentified members of the Pedosphaerales order showed 80 broad detection in our three vineyard areas.

81 Plant growth-promoting rhizobacteria

82 Nineteen species recognised as PGPR were detected in this study, indicating 83 potential interaction of these bacteria with the grapevine roots and aboveground 84 grapevine organs. Examples of the species identified include: Azospirillum brasilense, 85 Bacillus muralis, B. firmus, B. endophyticus, Clostridium butyricum, Pseudomonas 86 stutzeri and P. fragi (Supplementary Table 3). In addition, several genera containing 87 members described as PGPR were identified in our samples, but these genera could 88 not be assigned to a species level. I. Zarraonaindia et al. (43) found a correlation 89 between bacteria in the soil and in the above-ground parts of grapevine organs, 90 suggesting soil as a main microbial reservoir potentially influencing grapes and 91 wine typicity. However, for further consideration based on our findings, evaluations 92 of the microbial communities dwelling in the rhizosphere and grapevine organs 93 (roots, leaves, flowers and grapes) are recommended.

94 The relative abundance reduction of Rhizobiales with soil depth in the samples from 95 the three vineyard areas is consistent with previous observations, showing a 96 positive correlation with the C:N ratio (2, 52). Due to the close relation between 97 plant roots and Rhizobiales, as well as other PGRP, it can be inferred that, in addition 98 to C:N ratio, the relative abundance reduction with soil depth of these microbes is 99 also a result of the lower root density in the deeper soil layers. Lower root density 100 leads to a reduced rhizosphere, an area known to typically support a higher 101 microbial density than bulk soil (53). Studying grapevines, D. R. Smart et al. (54) 102 found that around 80% of the roots are located within the upper 1 m of soil, which 103 correlates with the decreased abundance of Rhizobiales with depth in our samples. 104 A higher abundance of PRGPs in the rhizosphere will probably increase the 105 population found in the bulk soil near the root zone. Besides the already well-known 106 relationship of Rhizobiales with legume species, this order is also widely found to be 107 associated with higher plants, with studies indicating beneficial functional 108 interactions with its hosts (55, 56).

109 Correlation between prokaryotic communities and soil edaphic factors

110 The statistically significant correlation between pH and relative abundances, 111 observed for five phyla in the Danish vineyard (Crenarchaeota, Chloroflexi, 112 Acidobacteria, Firmicutes and Proteobacteria), but just two phyla in the Australian 113 vineyard (Crenarchaeota and Verrucomicrobia) and only one phyla in the Spanish 114 vineyard (Firmicutes), are probably a result of the greater pH variance in the Danish 115 soil profiles. The soils in the latter showed a pH pattern of a slightly acidic to neutral 116 character in the topsoil, gradually shifting to a slightly to strongly alkaline character, 117 while the Australian and Spanish soil profiles were found to be acidic for the former 118 and alkaline for the latter. The Spanish soil profiles showed minor changes in pH 119 values with depth, potentially causing a less measurable influence in the microbial

120 communities. A weak correlation between pH and microbial abundance, probably 121 as a result of small soil pH variation, is reported in C. Will et al. (2). Nevertheless, as 122 observed for the samples from the Danish vineyard and in other studies (57-60), pH 123 is considered to be one of the main drivers of bacterial community composition.

124 The effect of soil-carbon influencing microbial communities (60) was observed at 125 weak significant levels for our three vineyards, restricted to positive correlations 126 between TOC and Bacteroidetes and Firmicutes in the Danish soil profiles, to 127 Actinobacteria in the Australian soil profiles, and to Acidobacteria in the Spanish soil 128 profiles. Significant positive correlation with the C:N ratio was only observed for 129 Actinobacteria in Australia and Bacteroidetes in Denmark. One reason for such an 130 outcome might be the small numbers of samples available for these evaluations.

131 Bacteria and archaea main phyla showed significant correlation with most of the 132 elements evaluated (Al, Si, P, S, Cl, K, Ca, Ti, Mn, Fe, Ni and Cu). The same element 133 was shown to correlate either positively or negatively with different phyla, which 134 was consistent with studies showing different bacteria inversely reacting to 135 environmental changes in the soil. Such differences in the bacteria response to 136 environmental factors are probably the result of adaptation mechanisms and 137 intrinsic distinct tolerance levels, supported by these microorganisms (61-63). 138 Exceptions occurred for Cr and Zn, which for all the significantly correlated phyla 139 showed positive relationships. Based on the literature accessed, contrary to the 140 abundant information available correlating soil pH and carbon content with 141 bacterial community composition, there is still an information gap concerning the 142 correlation between soil chemical constitution and microbial community 143 composition. Data corroborating our findings of significant negative correlation 144 between soil elements and phyla refers to: K and Ca and Proteobacteria (57); K2O 145 and Proteobacteria (64); and Ca and Acidobacteriaceae (Acidobateria) (65). In the 146 same way, significant positive correlation is described for: S and Chloroflexi (64); Al 147 and Verrucomicrobia (64); K2O and Bacteroidetes (64); and Al and 148 Betaproteobacteria (Proteobacteria) (64). Inversely to our observations, negative 149 correlation between S and Acidobacteria and Actinobacteria is reported by S. N. 150 Yurgel et al. (64).

151 Actinobacteria in our Australian soil cores showed significant negative correlation 152 with Al, while positive correlation in the Danish soil cores. Al and Actinobacteria 153 presenting significant negative correlation have been observed in a study conducted 154 in acidic soil (64), a pH condition also observed in the Australian vineyard. Opposite 155 significant correlations were also observed between Verrucomicrobia and K in the 156 Australian and Danish soil cores, as a negative correlation for the former and a 157 positive correlation for the latter. Positive significant correlation between 158 Verrucomicrobia and K2O is reported by S. N. Yurgel et al. (64).

159 In view of the measured variations in soil edaphic factors along the soil profiles, taxa 160 detected dwelling at different depths are described as being habitat generalist. On 161 the other hand, taxa restricted to a specific soil layer are categorised as being habitat 162 specialist (66); yet it might be the case that these taxa were not detected in our 163 samples in other soil horizons.

164 Soil fungal community composition and diversity

165 There are approximately 120,000 fungi species described worldwide (67), thus 166 representing only a fraction of the estimated global fungi diversity. In less 167 conservative numbers, these are estimated at 6 million species (68), but they more 168 probably range between 2.2 and 3.8 million (67). Consistent with such data, ITS 169 sequencing from our 63 soil samples showed relative abundances of unidentified 170 members of the fungi kingdom reaching 33.6% in the Spanish soil cores, 34.7% in 171 the Australian, and 44.1% in the Danish. It should be emphasised that out of the 172 7,569 features assigned to fungi, 5,807 of these were only identified to kingdom level, 173 and the 1,762 features identified at higher taxonomic levels represented 475 taxa. 174 Correspondingly, the 5,807 unidentified features most likely represent numerous 175 fungi taxa with varied relative abundances, and not a few taxa occurring in high 176 abundances.

177 In a global study of soil samples from natural communities, collected at a depth of 178 0.05 m, members of the Agaricomycetes class dominated all of the 11 ecosystems, 179 followed mainly by Sordariomycetes, Leotiomycetes, Eurotiomycetes and 180 Mortierellomycotina, with these four classes presenting relative abundance 181 variations according to ecosystem (69). Compared with these five most abundant 182 classes detected globally, our samples taken at 0.1 m depths showed 183 Sordariomycetes as the main class in the Australian and Spanish vineyards and the 184 second-largest class in the Danish vineyard. Mortierellomycotina was also detected 185 among the main classes observed in Spain and Denmark, but Agaricomycetes, 186 Leotiomycetes and Eurotiomycetes did not show values among the most abundant 187 classes. While in the Australian and Spanish vineyards and 188 Tremellomycetes in the Spanish and Danish vineyards occurred as the main fungi 189 classes, in the aforementioned global study these two classes were detected with 190 less important relative abundances in the natural ecosystems evaluated. Since the 191 samples evaluated here come from vineyards, such relative abundance differences 192 in the dominance of the fungal communities inhabiting the uppermost soil layer may 193 be a result of land-use change (59, 70). As reported by A. Orgiazzi et al. (70), who 194 evaluated soil samples taken at 0.2 m depths in a natural cork-oak forest, a pasture, 195 and a managed meadow, vineyard samples showed important relative abundance of 196 Zygomycota, after Ascomycota and Basidiomycota, which is the same pattern that is 197 observed in our topsoil samples.

198 Important relative abundance variation with depths of different fungi phyla is 199 present in the current study. However, due to the lack of reported studies of 200 vertically stratified soil-fungi communities to the same depths as evaluated here, we 201 cannot properly compare the trends obtained. Most soil fungi research has been 202 performed within the uppermost soil layer. In addition, according to a research 203 review (71) and other more recent studies (72-75), fungi vertical stratification 204 assessment is mainly restricted to the upper 0.15 m of the soil profiles, or does not 205 reach depths > 0.6 m.

206 Dominant fungal taxa in our soil cores typically varied according to vineyard area 207 and depth of soil layer, probably indicating the recognised major influence of 208 climatic, biotic and abiotic factors on fungi community composition (69). The 209 changes in the fungi communities with soil depth are relevant for plants that develop 210 deep roots, since many fungi species influence plant health, and they act as bio- 211 stimulants, biocontrol agents, decomposers and pathogens (76). For instance, for 212 our samples from soil depths below 2.4 m, the species Meyerozyma guilliermondii 213 and Cyberlindnera jadinii showed important relative abundances in all vineyards at 214 such depths, while they had low abundance or were non-detected in the overlaying 215 soil layers. Both species are yeast (order Saccharomycetales), with studies of their 216 biotechnological applications under way (77, 78). Several species of yeast are 217 described as presenting ecological functions influencing plant growth (79). 218 Moreover, the yeast population is normally in an order of magnitude that is higher 219 in the rhizosphere, compared with bulk soil (80). The same trend is thus expected 220 for the vineyards studied here.

221 Members of the Hypocreales and Mortierellales orders were abundant in all topsoil 222 and medium layer samples from our study, while they were poorly abundant or non- 223 detected in the bottom layer. Mortierellales are saprobic organisms (81), while 224 Hypocreales are distinguished by the ability to derive nutrition from varied nutrient 225 sources (82). Several species of Hypocreales are commercially developed and 226 applied as biocontrol of plant diseases and pests, while others are subject to 227 research. Nevertheless, the genus Fusarium, belonging to Hypocreales and detected 228 within most of our samples, is a well-known potent plant pathogen (82). Hypocreales 229 is the largest class belonging to Ascomycota, with worldwide distribution, and 230 includes species with varied life modes, including saprobic, pathogenic or 231 endophytic species (83).

232 The combined functional diversity of the main groups detected in the different soil 233 layers (Saccharomycetales, Hypocreales, Mortierellales and Dothideomycetes) can be 234 related to the concept of soil health, which refers to soil as a living system influenced 235 by different processes and properties, and highly influenced by the soil microbiota 236 activity (76).

237 Vineyard-associated diseases

238 Soil samples from the three vineyard areas and from different soil-depth layers 239 showed 16 fungal taxa described as potential grapevine pathogens. Five other 240 identified genera, with several species, are associated with grapevine diseases (28). 241 Nevertheless, it should be emphasised that the detection of these phytopathogens is 242 not a direct indicator of plant infection. The total soil and rhizosphere microbial 243 activity and their interaction deeply influence the ability of a pathogen to affect plant 244 health severely (53). This total microbiome and its interaction are responsible for a 245 soil phenomenon known as ‘general disease suppression’. When disease 246 suppression is characterised by the action of specific microorganisms, it is referred 247 to as ‘specific disease suppression’ (53, 84). However, general suppression may 248 occur due to the absence of information about the action of specific microorganisms 249 involved in limiting either the survival or the growth of the pathogens (84). The 250 information retrieved here shows the potential applications of ITS region 251 sequencing in identifying the soil fungi associated with plant health.

252 In conclusion, this study expands the general understanding that microbial 253 communities composition vary significantly with depth and geography and also that 254 some prokaryotes are ubiquitous, dwelling in the different soil depths and countries 255 here evaluated. Considering the grapevine roots’ potential to reach deep soil depths 256 and the recognised association between roots and microorganisms, we identified 257 well-known plant-health associated microbes that significantly differed with soil 258 depth, including exclusive taxa detected in specific soil layers. The large knowledge 259 gap concerning soil fungi species was evidenced by the current study, with most of 260 the sequenced ITS region not being assigned in the database, requiring further 261 efforts for its identification. As a fundamental element of wine production and in 262 direct relation to plant health, more attention should be given to the soil fungal 263 species occurring in different soil layers.

264 ACKNOWLEDGEMENTS

265 We acknowledge Patricia Benitez, Jakob Walløe Hansen, Niels Esbjerg, Michael 266 Bunce and Virginia Willcock for fieldwork and logistical support; Elisa Filippi for 267 laboratorial assistance and Jesper S. Pedersen for preparing and providing 268 laboratorial material.

269 FUNDING

270 This study was funded by the Horizon 2020 Programme of the European 271 -Curie Innovative Training Network 272 “MicroWine” (grant number 643063) Commission within the Marie Skłodowska

273 (https://ec.europa.eu/programmes/horizon2020/en/h2020-section/marie- 274 sklodowska-curie-actions). The funders had no role in study design, data collection 275 and analysis, decision to publish, or preparation of the manuscript.

276 REFERENCES 277 1. Fierer N, Schimel JP, Holden PA. 2003. Variations in microbial community composition through two 278 soil depth profiles. Soil Biology and Biochemistry 35:167-176. 279 2. Will C, Thürmer A, Wollherr A, Nacke H, Herold N, Schrumpf M, Gutknecht J, Wubet T, Buscot F, Daniel 280 R. 2010. Horizon-specific bacterial community composition of German grassland soils, as revealed by 281 pyrosequencing-based analysis of 16S rRNA genes. Applied and Environmental Microbiology 76:6751- 282 6759. 283 3. Eilers KG, Debenport S, Anderson S, Fierer N. 2012. Digging deeper to find unique microbial 284 communities: the strong effect of depth on the structure of bacterial and archaeal communities in soil. 285 Soil Biology and Biochemistry 50:58-65. 286 4. Blume E, Bischoff M, Reichert J, Moorman T, Konopka A, Turco R. 2002. Surface and subsurface 287 microbial biomass, community structure and metabolic activity as a function of soil depth and season. 288 Applied Soil Ecology 20:171-181. 289 5. Hartmann M, Lee S, Hallam SJ, Mohn WW. 2009. Bacterial, archaeal and eukaryal community structures 290 throughout soil horizons of harvested and naturally disturbed forest stands. Environmental 291 Microbiology 11:3045-3062. 292 6. Goberna M, Insam H, Klammer S, Pascual J, Sanchez J. 2005. Microbial community structure at different 293 depths in disturbed and undisturbed semiarid Mediterranean forest soils. Microbial Ecology 50:315- 294 326. 295 7. Hansel CM, Fendorf S, Jardine PM, Francis CA. 2008. Changes in bacterial and archaeal community 296 structure and functional diversity along a geochemically variable soil profile. Applied and 297 environmental microbiology 74:1620-1633. 298 8. Fierer N, Chadwick OA, Trumbore SE. 2005. Production of CO 2 in soil profiles of a California annual 299 grassland. Ecosystems 8:412-429. 300 9. Madsen EL. 1995. Impacts of agricultural practices on subsurface microbial ecology. Advances in 301 agronomy (USA). 302 10. Richter DD, Markewitz D. -609. 303 11. Konopka A, Turco R. 1991. Biodegradation of organic compounds in vadose zone and aquifer 304 sediments. Applied and Environmental1995. How deep Microbiology is soil? BioScience 57:2260 45:600-2268. 305 12. Hiebert FK, Bennett PC. 1992. Microbial control of silicate weathering in organic-rich ground water. 306 Science-New York then Washington-:278-278. 307 13. Van Der Heijden MG, Bardgett RD, Van Straalen NM. 2008. The unseen majority: soil microbes as 308 drivers of plant diversity and productivity in terrestrial ecosystems. Ecology letters 11:296-310. 309 14. Bitton G. 2002. Encyclopedia of environmental microbiology. Wiley Online Library. 310 15. Sprent JI. 2001. Nodulation in legumes. Royal Botanic Gardens, Kew. 311 16. White RE. 2015. Understanding vineyard soils. Oxford University Press. 312 17. GSWA. 1967. Busselton and Augusta Western Australia - 1: 250,000 Geological Series - Explanatory 313 Notes. GSWA (Geological Survey of Western Australia). Bureau of Mineral Resources, Geology 314 Geophysics. 315 18. Brown K, McKenzie N, Isbell R, Jacquier D. 2001. Australian Soil Poster. CSIRO Land and Water. 316 19. McDonald RC, Isbell RF. 2009. Soil Profile, p 147-204. In Terrain NCoSa (ed), Australian Soil and Land 317 Survey Field Handbook, 3rd ed. CSIRO PUBLISHING, Collingwood. 318 20. Cabrera R, Crespo J, García J, Mediavilla B, Armenteros I. 1997. Mapa Geológico y Minero de Castilla y 319 León, escala 1: 400.000. Junta de Castilla y León Sociedad de explotación e Investigación Minera de 320 Castilla y León, SA (SIEMCALSA), Valladolid. 321 21. GEUS. n.a. Surface Geology Map of Denmark 1:200 000. http://www.geus.dk/UK/data- 322 maps/Pages/j200-dk.aspx. Accessed February 16, 2018. 323 22. Schoeneberger PJ, Wysocki DA, Benham EC, Staff SS. 2012. Field book for describing and sampling soils, 324 Version 3.0. Natural Resources Conservation Service NSSC, Lincoln, NE. 325 23. Verardo DJ, Froelich PN, McIntyre A. 1990. Determination of organic carbon and nitrogen in marine 326 sediments using the Carlo Erba NA-1500 Analyzer. Deep Sea Research Part A Oceanographic Research 327 Papers 37:157-165. 328 24. Brodie CR, Leng MJ, Casford JS, Kendrick CP, Lloyd JM, Yongqiang Z, Bird MI. 2011. Evidence for bias in 329 - 330 analysis acid preparation methods. Chemical Geology 282:67-83. 331 25. GobbiC and NA, concentrations Santini RG, Filippi and E,δ13C Ellegaard composition-Jensen of L, terrestrial Jacobsen CS,and Hansen aquatic LH. organic in press. materials Quantitative due to preand 332 qualitative evaluation of the impact of the G2 enhancer, bead sizes and lysing tubes on the bacterial 333 community composition during DNA extraction from recalcitrant soil core samples based on 334 community sequencing and qPCR doi:https://doi.org/10.1101/365395, PlosOne.

335 26. Aitchison J. 1982. The statistical analysis of compositional data. Journal of the Royal Statistical Society 336 Series B (Methodological):139-177. 337 27. Mandal S, Van Treuren W, White RA, Eggesbø M, Knight R, Peddada SD. 2015. Analysis of composition 338 of microbiomes: a novel method for studying microbial composition. Microbial ecology in health and 339 disease 26:27663. 340 28. Jayawardena RS, Purahong W, Zhang W, Wubet T, Li X, Liu M, Zhao W, Hyde KD, Liu J, Yan J. 2018. 341 Biodiversity of fungi on Vitis vinifera L. revealed by traditional and high-resolution culture- 342 independent approaches. Fungal Diversity 90:1-84. 343 29. Sagova-Mareckova M, Zadorova T, Penizek V, Omelka M, Tejnecky V, Pruchova P, Chuman T, Drabek O, 344 Buresova A, Vanek A. 2016. The structure of bacterial communities along two vertical profiles of a deep 345 colluvial soil. Soil Biology and Biochemistry 101:65-73. 346 30. 347 2012. Active and total microbial communities in forest soil are largely different and highly 348 Baldrianstratified P,during Kolařík decomposition. M, Štursová M,The Kopecký ISME journ J, Valáškováal 6:248. V, Větrovský T, Žifčáková L, Šnajdr J, Rídl J, 349 31. HelgasonVlček Č. B, Konschuh H, Bedard-Haughn A, VandenBygaart A. 2014. Microbial distribution in an 350 eroded landscape: Buried A horizons support abundant and unique communities. Agriculture, 351 ecosystems & environment 196:94-102. 352 32. Tiedje JM, Asuming-Brempong S, Nüsslein K, Marsh TL, Flynn SJ. 1999. Opening the black box of soil 353 microbial diversity. Applied soil ecology 13:109-122. 354 33. Hambrey MJ. 1994. Glacial environments. UBC Press, Vancouver. 355 34. Houmark-Nielsen M. 2007. Extent and age of Middle and Late Pleistocene glaciations and periglacial 356 episodes in southern Jylland, Denmark. Bulletin of the Geological Society of Denmark 55:9-35. 357 35. GEUS. 2015. Surface geology map of Denmark 1:25.000. 358 36. Kennedy MJ, Reader SL, Swierczynski LM. 1994. Preservation records of micro-organisms: evidence of 359 the tenacity of life. Microbiology 140:2513-2529. 360 37. Kieft T, Murphy E, Haldeman D, Amy P, Bjornstad B, McDonald E, Ringelberg D, White D, Stair J, Griffiths 361 R. 1998. Microbial transport, survival, and succession in a sequence of buried sediments. Microbial 362 Ecology 36:336-348. 363 38. Polónia AR, Cleary DF, Duarte LN, de Voogd NJ, Gomes NC. 2014. Composition of Archaea in seawater, 364 sediment, and sponges in the Kepulauan Seribu reef system, Indonesia. Microbial ecology 67:553-567. 365 39. Oton EV, Quince C, Nicol GW, Prosser JI, Gubry-Rangin C. 2016. Phylogenetic congruence and ecological 366 coherence in terrestrial Thaumarchaeota. The ISME journal 10:85. 367 40. Brochier-Armanet C, Boussau B, Gribaldo S, Forterre P. 2008. Mesophilic Crenarchaeota: proposal for 368 a third archaeal phylum, the Thaumarchaeota. Nature Reviews Microbiology 6:245. 369 41. Janssen PH. 2006. Identifying the dominant soil bacterial taxa in libraries of 16S rRNA and 16S rRNA 370 genes. Applied and environmental microbiology 72:1719-1728. 371 42. Burns KN, Kluepfel DA, Strauss SL, Bokulich NA, Cantu D, Steenwerth KL. 2015. Vineyard soil bacterial 372 diversity and composition revealed by 16S rRNA genes: differentiation by geographic features. Soil 373 Biology and Biochemistry 91:232-247. 374 43. Zarraonaindia I, Owens SM, Weisenhorn P, West K, Hampton-Marcell J, Lax S, Bokulich NA, Mills DA, 375 Martin G, Taghavi S. 2015. The soil microbiome influences grapevine-associated microbiota. MBio 376 6:e02527-14. 377 44. Liermann LJ, Albert I, Buss HL, Minyard M, Brantley SL. 2015. Relating microbial community structure 378 and geochemistry in deep regolith developed on volcaniclastic rock in the Luquillo Mountains, Puerto 379 Rico. Geomicrobiology Journal 32:494-510. 380 45. Li C, Yan K, Tang L, Jia Z, Li Y. 2014. Change in deep soil microbial communities due to long-term 381 fertilization. Soil Biology and Biochemistry 75:264-272. 382 46. Agnelli A, Ascher J, Corti G, Ceccherini MT, Nannipieri P, Pietramellara G. 2004. Distribution of 383 microbial communities in a forest soil profile investigated by microbial biomass, soil respiration and 384 DGGE of total and extracellular DNA. Soil Biology and Biochemistry 36:859-868. 385 47. GWA. 2017. Soil acidity in Western Australia, on Government of Western Australia, Department of 386 Primary Industries and Regional Development's Agriculture and Food. 387 https://www.agric.wa.gov.au/soil-acidity/soil-acidity-western-australia. Accessed November 28, 388 2017. 389 48. Filippidou S, Wunderlin T, Junier T, Jeanneret N, Dorador C, Molina V, Johnson DR, Junier P. 2016. A 390 combination of extreme environmental conditions favor the prevalence of endospore-forming 391 firmicutes. Frontiers in microbiology 7:1707. 392 49. Bergmann GT, Bates ST, Eilers KG, Lauber CL, Caporaso JG, Walters WA, Knight R, Fierer N. 2011. The 393 under-recognized dominance of Verrucomicrobia in soil bacterial communities. Soil Biology and 394 Biochemistry 43:1450-1455. 395 50. Pershina EV, Ivanova EA, Korvigo IO, Chirak EL, Sergaliev NH, Abakumov EV, Provorov NA, Andronov 396 EE. 2018. Investigation of the core microbiome in main soil types from the East European plain. Science 397 of The Total Environment 631:1421-1430. 398 51. Brewer TE, Handley KM, Carini P, Gilbert JA, Fierer N. 2017. Genome reduction in an abundant and 399 ubiquitous soil bacterium ‘Candidatus Udaeobacter copiosus’. Nature microbiology 2:16198.

400 52. Shu W, Pablo GP, Jun Y, Danfeng H. 2012. Abundance and diversity of nitrogen-fixing bacteria in 401 rhizosphere and bulk paddy soil under different duration of organic management. World Journal of 402 Microbiology and Biotechnology 28:493-503. 403 53. Berendsen RL, Pieterse CM, Bakker PA. 2012. The rhizosphere microbiome and plant health. Trends in 404 plant science 17:478-486. 405 54. Smart DR, Schwass E, Lakso A, Morano L. 2006. Grapevine rooting patterns: a comprehensive analysis 406 and a review. American Journal of Enology and Viticulture 57:89-104. 407 55. Garrido-Oter R, Nakano RT, Dombrowski N, Ma K-W, Team TA, McHardy AC, Schulze-Lefert P. 2018. 408 Modular Traits of the Rhizobiales Root Microbiota and Their Evolutionary Relationship with Symbiotic 409 Rhizobia. Cell host & microbe 24:155-167. e5. 410 56. Fischer D, Pfitzner B, Schmid M, Simões-Araújo JL, Reis VM, Pereira W, Ormeño-Orrillo E, Hai B, 411 Hofmann A, Schloter M. 2012. Molecular characterisation of the diazotrophic bacterial community in 412 uninoculated and inoculated field-grown sugarcane (Saccharum sp.). Plant and soil 356:83-99. 413 57. Kim JM, Roh A-S, Choi S-C, Kim E-J, Choi M-T, Ahn B-K, Kim S-K, Lee Y-H, Joa J-H, Kang S-S. 2016. Soil 414 pH and electrical conductivity are key edaphic factors shaping bacterial communities of greenhouse 415 soils in Korea. Journal of Microbiology 54:838-845. 416 58. Hartman WH, Richardson CJ, Vilgalys R, Bruland GL. 2008. Environmental and anthropogenic controls 417 over bacterial communities in wetland soils. Proceedings of the National Academy of Sciences:pnas. 418 0808254105. 419 59. Lauber CL, Strickland MS, Bradford MA, Fierer N. 2008. The influence of soil properties on the structure 420 of bacterial and fungal communities across land-use types. Soil Biology and Biochemistry 40:2407- 421 2415. 422 60. Seuradge BJ, Oelbermann M, Neufeld JD. 2017. Depth-dependent influence of different land-use 423 systems on bacterial biogeography. FEMS microbiology ecology 93. 424 61. Chanal A, Chapon V, Benzerara K, Barakat M, Christen R, Achouak W, Barras F, Heulin T. 2006. The 425 desert of Tataouine: an extreme environment that hosts a wide diversity of microorganisms and 426 radiotolerant bacteria. Environmental Microbiology 8:514-525. 427 62. Evans SE, Wallenstein MD. 2014. Climate change alters ecological strategies of soil bacteria. Ecology 428 letters 17:155-164. 429 63. Jorquera MA, Hernández M, Martínez O, Marschner P, de la Luz Mora M. 2010. Detection of aluminium 430 tolerance plasmids and microbial diversity in the rhizosphere of plants grown in acidic volcanic soil. 431 European Journal of Soil Biology 46:255-263. 432 64. Yurgel SN, Douglas GM, Comeau AM, Mammoliti M, Dusault A, Percival D, Langille MG. 2017. Variation 433 in bacterial and eukaryotic communities associated with natural and managed wild blueberry habitats. 434 Phytobiomes 1:102-113. 435 65. Lima AB, de Souza Cannavan F, Germano MG, Dini-Andreote F, de Paula AM, Franchini JC, Teixeira WG, 436 Tsai SM. 2015. Effects of vegetation and seasonality on bacterial communities in Amazonian dark earth 437 and adjacent soils. African Journal of Microbiology Research 9:2119-2134. 438 66. Barberán A, Bates ST, Casamayor EO, Fierer N. 2012. Using network analysis to explore co-occurrence 439 patterns in soil microbial communities. The ISME journal 6:343. 440 67. Hawksworth DL, Lücking R. 2017. Fungal Diversity Revisited: 2.2 to 3.8 Million Species. Microbiology 441 spectrum 5. 442 68. Taylor DL, Hollingsworth TN, McFarland JW, Lennon NJ, Nusbaum C, Ruess RW. 2014. A first 443 comprehensive census of fungi in soil reveals both hyperdiversity and fine‐scale niche partitioning. 444 Ecological Monographs 84:3-20. 445 69. Tedersoo L, Bahram M, Põlme S, Kõljalg U, Yorou NS, Wijesundera R, Ruiz LV, Vasco-Palacios AM, Thu 446 PQ, Suija A. 2014. Global diversity and geography of soil fungi. science 346:1256688. 447 70. Orgiazzi A, Lumini E, Nilsson RH, Girlanda M, Vizzini A, Bonfante P, Bianciotto V. 2012. Unravelling soil 448 fungal communities from different Mediterranean land-use backgrounds. PloS one 7:e34847. 449 71. Bahram M, Peay KG, Tedersoo L. 2015. Local‐scale biogeography and spatiotemporal variability in 450 communities of mycorrhizal fungi. New Phytologist 205:1454-1463. 451 72. Rime T, Hartmann M, Brunner I, Widmer F, Zeyer J, Frey B. 2015. Vertical distribution of the soil 452 microbiota along a successional gradient in a glacier forefield. Molecular ecology 24:1091-1108. 453 73. McGuire KL, D’Angelo H, Brearley F, Gedallovich S, Babar N, Yang N, Gillikin C, Gradoville R, Bateman 454 C, Turner BL. 2015. Responses of soil fungi to logging and oil palm agriculture in Southeast Asian 455 tropical forests. Microbial ecology 69:733-747. 456 74. Kyaschenko J, Clemmensen KE, Hagenbo A, Karltun E, Lindahl BD. 2017. Shift in fungal communities 457 and associated enzyme activities along an age gradient of managed Pinus sylvestris stands. The ISME 458 journal 11:863. 459 75. Powell JR, Karunaratne S, Campbell CD, Yao H, Robinson L, Singh BK. 2015. Deterministic processes 460 vary during community assembly for ecologically dissimilar taxa. Nature communications 6:8444. 461 76. 2018. Fungal biodiversity and their role in soil health. 462 Frontiers in microbiology 9. 463 77. YanFrąc Y, M, Zhang Hannula X, Zheng SE, Bełk X, Apaliyaa M, Jędryczka MT, Yang M. Q, Zhao L, Gu X, Zhang H. 2018. Control of postharvest blue 464 mold decay in pears by Meyerozyma guilliermondii and it’s effects on the protein expression profile of 465 pears. Postharvest Biology and Technology 136:124-131.

466 78. Buerth C, Tielker D, Ernst JF. 2016. Candida utilis and Cyberlindnera (Pichia) jadinii: yeast relatives 467 with expanding applications. Applied microbiology and biotechnology 100:6981-6990. 468 79. Botha A. 2011. The importance and ecology of yeasts in soil. Soil Biology and Biochemistry 43:1-8. 469 80. Amprayn K-o, Rose MT, Kecskés M, Pereg L, Nguyen HT, Kennedy IR. 2012. Plant growth promoting 470 characteristics of soil yeast (Candida tropicalis HY) and its effectiveness for promoting rice growth. 471 Applied Soil Ecology 61:295-299. 472 81. Wagner L, Stielow B, Hoffmann K, Petkovits T, Papp T, Vágvölgyi C, De Hoog G, Verkley G, Voigt K. 2013. 473 A comprehensive molecular phylogeny of the Mortierellales (Mortierellomycotina) based on nuclear 474 ribosomal DNA. Persoonia: Molecular Phylogeny and Evolution of Fungi 30:77. 475 82. Kepler RM, Maul JE, Rehner SA. 2017. Managing the plant microbiome for biocontrol fungi: examples 476 from Hypocreales. Current opinion in microbiology 37:48-53. 477 83. Liu J-K, Hyde KD, Jeewon R, Phillips AJ, Maharachchikumbura SS, Ryberg M, Liu Z-Y, Zhao Q. 2017. 478 Ranking higher taxa using divergence times: a case study in Dothideomycetes. Fungal Diversity 84:75- 479 99. 480 84. Garbeva Pv, Van Veen J, Van Elsas J. 2004. Microbial diversity in soil: selection of microbial populations 481 by plant and soil type and implications for disease suppressiveness. Annu Rev Phytopathol 42:243- 482 270. 483 484 Supplementary Section 1. Replicates and control sample comparison

485 We prepared replicates of six sub-samples (Supplementary Table 1), resulting in 18 486 samples to detect the intra-variability of the bacterial and archaeal communities 487 within the same depths in different vineyards, and their soil cores. Replicates refer 488 to three soil aliquots taken from the same 15-mL plastic tube, in this way 489 representing the same soil horizon interval. PCoA comparisons of bacterial and 490 archaeal unweighted UniFrac distance for these samples showed clustering for each 491 group of replicates (Supplementary Figure 3).

492

493

494 Supplementary Figure 3. Prokaryotes communities’ distribution according 495 soil replicates. Graph is PCoA comparisons of bacterial and archaeal unweighted 496 UniFrac distance. The six replicates refer to samples E1, E15, M2, M10, P1 and P16, 497 followed by the letters a, b, and c.

498 In relation to the G2 product negative-controls, from five samples prepared without 499 soil addition, but with G2 added, two showed measurable DNA. These two samples 500 were sent for 16s rRNA gene sequencing and only one passed the quality filtering.

501 Based on the bacterial composition outcome of this control sample, we filtered out 502 a series of taxa from our samples. The taxa excluded refer to members of the 503 Ralstonia, Burkholderia and Pelomonas genera, all recognised as common 504 contaminants of DNA extraction kits and laboratory reagents (1).

505

506 1. Salter SJ, Cox MJ, Turek EM, Calus ST, Cookson WO, Moffatt MF, Turner P, 507 Parkhill J, Loman NJ, Walker AW. 2014. Reagent and laboratory contamination can 508 critically impact sequence-based microbiome analyses. BMC biology 12:87.

509

510 Supplementary Figure 1. pH measurements along the soil profiles

511

512 Supplementary Figure 2. Relative abundances for the higher taxonomical level identified 513 per country according to soil layer. In general it is noticed a difference between the soil taxa 514 comparing soil layers from the same country. Topsoil, medium and bottom layer 515 correspond, respectively to samples coming from the following intervals in mbgl: 0.00-0.40, 516 0.41-2.49, 2.50-6.00 in Australia, 0.00-0-30, 0.31-2.49 and 2.50-4.20 in Spain; 0.00-0.30, 517 0.31-2.49 and 2.50-5.00 in Denmark.

518 OTHER SUPPLEMENTARY MATERIAL AVAILABLE ONLINE

519 Supp. Table 1. Soil cores information 520 Supp. Table 2. ANCOM results for prokaryotes at phyla and class levels 521 Supp. Table 3. List of reported and potential PGPRs detected on the samples 522 Supp. Table 4. ANCOM results for fungi at specie level 523 Supp. Table 5. Information of fungi associated with vineyard diseases 524 Supp. Table 6. Bacteria and Archaea phyla relative abundances (%) variation 525 Supp. Table 7. ANCOM results for prokaryotic phyla detected in the three layers 526 Supplementary Table 8. ANCOM results for prokaryotes detected in the three soil 527 layers in higher taxonomic levels. 528 529

530

531

532

533

534

535

536

537

538

539

540 MANUSCRIPT 4 541

542 Characterization of the wood mycobiome of 543 Vitis vinifera in a vineyard affected by esca. 544 Spatial distribution of fungal communities 545 and their putative relation to foliar 546 symptoms.

547

548 This paper includes an extensive characterization of the inner mycobiome of 549 grapevine trunks comparing asymptomatic with symptomatic plants affected by 550 Esca. We focused on the fungal community since the pathogens involved in this 551 disease are ascribed to this kingdom. This paper has been submitted to Frontiers in 552 Microbiology and it is actually accepted for revision.

553 This paper was conceived by G. Del Frari and myself as a significant achievement 554 towards the comprehension of the trunks microbial dinamycs occurring during the 555 establishment of a symptomatic condition, on grapevine leaves, affected by Esca 556 disease.

557 G. Del Frari and I developed the protocols and performed the experiments. I 558 analyzed and interpreted the data together with M. R. Aggerback and G.Del Frari. I 559 wrote part of the manuscript and revised the final draft.

560

561 Characterization of the wood mycobiome of Vitis vinifera in a vineyard 562 affected by esca. Spatial distribution of fungal communities and their putative 563 relation with foliar symptoms. 564 565 Giovanni Del Frari1†*, Alex Gobbi2†, Marie Rønne Aggerbeck2, Lars Hestbjerg 566 Hansen2, Ricardo Boavida Ferreira1 567 568 1 LEAF – Linking Landscape Environment Agriculture and Food, Instituto Superior 569 de Agronomia, University of Lisbon, Tapada da Ajuda, 1349-017 Lisbon Portugal 570 2 Department of Environmental Science - Section for Environmental Microbiology 571 and Biotechnology - Environmental Microbial Genomics Group (EMG), Aarhus 572 University, 4000 Frederiksborgvej, Roskilde, Denmark 573 574 †These authors contributed equally to the study 575 576 *Correspondence: 577 Corresponding Author 578 [email protected] 579 580 Keywords Microbial ecology, Vitis, metabarcoding, mycobiome, esca disease, 581 grapevine trunk diseases 582 583 584 Abstract 585 Esca is a disease complex belonging to the grapevine trunk diseases cluster. It 586 comprises five syndromes, three main fungal pathogenic agents and several 587 symptoms, both internal (e.g. affecting the stem) and external (i.e. affecting leaves 588 and bunches). The etiology and epidemiology of this disease complex remain, in 589 part, unclear. Some of the points that are still under discussion concern the sudden 590 rise in disease incidence, the simultaneous presence of multiple wood pathogens in 591 affected grapevines, the causal agents and the discontinuity in time of foliar 592 symptoms manifestation. The standard approach to the study of esca has been 593 mostly through culture-dependent studies, yet, leaving many questions 594 unanswered. 595 In this study, we used Illumina® next-generation amplicon sequencing to 596 investigate the mycobiome of the wood of grapevines in a vineyard with history of 597 esca. We characterized the wood mycobiome composition, investigated the spatial 598 dynamics of the fungal communities in different areas of the stem and in the canes, 599 and assessed the putative link between mycobiome and foliar symptoms. 600 An unprecedented diversity of fungi is presented (289 taxa), including five genera 601 reported for the first time in association with grapevine’s wood (Debaryomyces, 602 Trematosphaeria, Biatriospora, Lopadostoma and Malassezia) and numerous 603 hitherto unreported species. Esca-associated fungi Phaeomoniella chlamydospora 604 and Fomitiporia sp. dominate the fungal community, and numerous other fungi 605 associated with wood syndromes are also encountered (e.g. Eutypa spp., Inonotus 606 hispidus). The spatial analysis revealed different abundances of taxa, the exclusive

607 presence of certain fungi in specific areas of the plants, and tissue specificity. Lastly, 608 the mycobiome composition of the woody tissues in proximity to the canopy that 609 manifested foliar symptoms of esca, as well as in foliar symptomatic canes, was 610 highly similar to that of plants not exhibiting any foliar symptomatology. This 611 observation supports the current understanding that foliar symptoms are not 612 directly linked with the fungal community in the wood. 613 This work builds to the understanding of the microbial ecology of the grapevines 614 wood, offering insights and a critical view on the current knowledge of the etiology 615 of esca. 616 617 Introduction 618 The phyllosphere, rhizosphere and endosphere of grapevine (Vitis vinifera L.) are 619 characterized by the presence of complex communities of microorganisms that 620 constantly interact with one another and with the plant, affecting it positively, 621 neutrally or negatively (Bruez et al., 2014; Pinto et al., 2014; Zarraonaindia et al., 622 2015). Until a decade ago, the approach to characterize the mycobiome - namely the 623 fungal community present in/on an organism - of grapevines, focused on culture- 624 dependent studies in which fungi were isolated in vitro and identified 625 morphologically and/or molecularly (Morgan et al., 2017). This approach remains 626 valid to this day, however it presents several limitations, such as the impossibility 627 of detecting uncultivable fungi, the bias of the cultivation conditions (e.g. growth 628 medium, incubation parameters) and the difficulty of isolating species present in 629 low abundances (Morgan et al., 2017). In recent years, technologies like next- 630 generation sequencing (NGS) have improved in quality and reduced in cost, which, 631 in combination with ever more efficient bioinformatics tools, have allowed the 632 exploitation of this method in the study of the molecular ecology of environmental 633 DNA (eDNA) samples. In particular, DNA metabarcoding approaches have taken the 634 investigations of microbiomes to a new level, surpassing some of the limitations 635 which characterize culture-dependent studies. In fact, NGS studies have revealed a 636 higher diversity of taxa and accurate relative abundances in samples coming from 637 different environments, including the vineyard (Peay et al., 2016; Morgan et al., 638 2017; Jayawardena et al., 2018). Despite these recent advances, culture- 639 independent studies describing the microbial endosphere of grapevines are still 640 scarce. 641 DNA metabarcoding is a promising tool to investigate the microbial communities 642 present in the wood of grapevines, as it may lead to a new understanding of the 643 complexity that characterizes grapevine trunk diseases (GTDs). This cluster of 644 fungal diseases affects primarily the perennial organs of the plants, such as the trunk 645 and roots, however secondary symptoms may be observed in leaves, bunches and 646 shoots. Overall, GTDs cause a loss in vigor, productivity, quality of the yield and 647 lifespan of the plants, with conspicuous economic consequences (Bertsch et al., 648 2013; Fontaine et al., 2016; Hofstetter et al., 2012). GTDs pathogens are 649 phylogenetically unrelated, belonging to different families, orders and even phyla, 650 although plants infected may reveal similar symptomatology. For example, wood 651 discoloration and necrosis are symptomatology shared by all GTDs (e.g. Figure 1 C), 652 whereas the ‘tiger stripes’ pattern in the leaves can be attributed to both grapevine

653 leaf stripe disease (GLSD; Figure 1 F) and black dead arm (Mugnai et al., 1999; 654 Larignon et al., 2009; Bertsch et al., 2013). Moreover, the simultaneous presence of 655 several possible causal agents in infected grapevines complicates the outline of a 656 clear etiological pattern (Bruez et al., 2016; Edwards and Pascoe, 2004; Mondello et 657 al., 2017). This is especially true in the case of esca, a disease complex consisting of 658 five separate syndromes (brown wood streaking of rooted cuttings, Petri disease, 659 GLSD, esca and esca proper) in which, according to current literature, several 660 pathogenic fungi play a role (Surico, 2009). These fungi may infect vines in the field, 661 where conidia or other propagules reach fresh pruning wounds and start colonizing 662 the xylem, or during the propagation process in nurseries (Gramaje et al., 2018). The 663 pathogens most frequently associated with the first three syndromes are 664 Phaeomoniella chlamydospora and Phaeoacremonium minimum, two 665 tracheomycotic ascomycetes; while the latter two syndromes are associated with 666 the presence of the wood rotting basidiomycete Fomitiporia mediterranea (esca), 667 especially common in Europe, or with the simultaneous presence of both 668 tracheomycotic and wood rotting pathogens (esca proper) (Surico, 2009). Along 669 with these three players, other wood pathogens including, but not limited to, 670 members of the Diatrypaceae and Botryosphaeriaceae are often found in 671 symptomatic plants (Bruez et al., 2014; Edwards and Pascoe, 2004; Hofstetter et al., 672 2012; Travadon et al., 2016). Studies that used the NGS approach to learn more 673 about the grapevine endosphere are scarce (Dissanayake et al., 2018; Jayawardena 674 et al., 2018) and none of them investigated the mycobiome of GTDs-affected plants. 675 This study focuses on a vineyard in Portugal, which ranks 11th in the world for wine 676 production, with a total vineyard area of 195 kha (Aurand, 2017). V. vinifera cultivar 677 Cabernet Sauvignon is the most cultivated worldwide, with a total vineyard area of 678 340 kha. It is considered susceptible to trunk diseases (Darrieutort and Pascal, 679 2007; Eskalen et al., 2007) and has already been cited in studies concerning the 680 microbial ecology of wood, phyllosphere and grapes (González and Tello, 2011; 681 Bruez et al., 2014; Morgan et al., 2017; Singh et al., 2018). This work aims to 682 investigate the fungal communities present in the wood of grapevines, in a vineyard 683 with history of esca proper. Three main objectives were set: (1) to characterize the 684 mycobiome of the wood of V. vinifera cv Cabernet Sauvignon, in a vineyard located 685 in the Lisbon area (Portugal), using Illumina® NGS; (2) to understand the spatial 686 distribution of the communities present in different areas of perennial wood and in 687 annual wood; (3) to understand whether there is a link between the microbial 688 communities of the wood and the expression of foliar symptoms of esca. 689 690 Materials and Methods 691 The vineyard 692 Field sampling took place in the experimental vineyard (Almotivo) of the Instituto 693 Superior de Agronomia, in Lisbon (38°42'32.7"N, 9°11'11.5"W). The vineyard has a 694 density of 3333 plants/ha, the soil is classified as vertisoil, it is managed under 695 conventional agricultural practices and there is no irrigation system. The selected 696 cultivar was Cabernet Sauvignon grafted on 140 RU rootstock, 19 years old at the 697 moment of sampling (planted in 1998), trained as Cordon Royat Bilateral and spur 698 pruned. The field has a history of esca, with leaf-symptomatic grapevines accounting

699 for less than 1% of the total plants in all recorded years (2015, 2016 and 2017). The 700 selection of the plants used in this experiment was restricted to a block of 450 m2. 701 The immediate surroundings of the vineyard, within a 25 m radius from the 702 perimeter, are characterized by the presence of diverse vegetation. The majority of 703 the species are listed in Table S4, found in the supplementary materials. 704 705 Sampling and experimental setup 706 Samples of perennial wood (PW) were taken in a non-destructive way, in April 2017, 707 by means of hand-drilling the plants with a gimlet (Figure 1 B). The sampling 708 procedure occurred as follows. The bark, on each sampling point, was removed with 709 the use of a knife and the wounds were disinfected with ethanol (70% v/v); the 710 gimlet was placed perpendicularly on the open wounds and manually forced in the 711 wood until it went through the whole width of the plant. This allowed us to extract 712 cores of wood (5 mm of diameter and approximately 60 mm long) which were 713 immediately placed in sterile 15 mL falcon tubes and temporarily stored in ice 714 (Figure 1 C). Samples were then frozen, freeze-dried and stored at -80 °C. After 715 extracting each core of wood, the gimlet was sterilized by dipping it in a sodium 716 hypochlorite solution (0.35 w/w of active chlorine) for 1 minute, followed by a rinse 717 with ethanol (70% v/v) and then double-rinsed with sterile distilled water (SDW), 718 in order to minimize cross-contamination. 719 Canes grown in the 2017 growing season were sampled, as annual wood (AW), in 720 September, detaching them with pruning scissors, approximately 3 cm above the 721 spur from which they departed (total length of the canes sampled 50 mm; Figure 1 722 F). Each sample was deprived of its bark, frozen, freeze-dried and stored at -80°C 723 until processing. 724 725 Concerning the PW, ten grapevines were sampled in 9 areas each, as shown in Figure 726 1 (B). Five of these plants did not show foliar symptoms of esca during the previous 727 two growing seasons (years 2015 and 2016), while the other five presented leaf 728 ‘tiger stripes’ symptoms, only in one of the two cordons (cordon 2; Figure 1 A), 729 during the previous growing season (year 2016). The terms ‘asymptomatic’ and 730 ‘symptomatic’, which will often be encountered in the rest of the text, refer 731 exclusively to the foliar symptomatology and not to the wood symptomatology 732 (unless specifically stated). 733 The tissue types corresponding to the nine sampling areas are: ‘Graft Union’ (GU), 734 located approximately (3±1 cm) above the soil, on the graft union; ‘Trunk’ (T), (22±1 735 cm) above GU; ‘Upper Trunk’ (UT), (22±1 cm) above T; ‘Arm 1’ (A1), was located on 736 the cordon, (36±2 cm) away from UT; the sample point ‘Spur 1’ (S1) is located on 737 the cordon, right below the spur, (10±1 cm) from A1; ‘Arm 2’ (A2), located (22±1 738 cm) to the right of A1; ‘Spur 2’ (S2), (10±1 cm) from A2 (Figure 1). Sample points 739 (A1, S1, A2 and S2) belonged to cordon 1, and the canopy departing from this cordon 740 never exhibited foliar symptoms, for all 10 sampled plants. Sampling points (SA, 741 symptomatic arm) and (SS, symptomatic spur) are the equivalent of points (A1) and 742 (S1), but located in cordon 2. In this case, 5 out of the 10 sampled grapevines 743 presented foliar symptoms in the canopy departing from (SS), while the other 5 744 plants had a non-symptomatic canopy.

745 746 Only one tissue type was examined in the AW, namely the ‘Canes’, where 15 canes 747 were sampled from 10 plants. Five of them came from asymptomatic plants, while 748 the other 10 came from symptomatic plants. Within these 10, 5 canes were foliar- 749 symptomatic, while the other 5 were asymptomatic and sampled in cordon where 750 the whole canopy was asymptomatic (Figure 1 F). 751 752 To address the three objectives of this study the sample points corresponding to 753 different tissue types were combined as follows. (1) To characterize the mycobiome 754 of the wood of the vineyard, all sample points were taken in consideration (n= 80 755 from wood, n= 15 from canes; Figure 1 A). (2) To learn about the spatial distribution 756 of the mycobiome in the different areas of the plants, we used tissue types (GU – S2) 757 (n= 10 per tissue type, total n= 70; Figure 1 D), along with asymptomatic canes (n= 758 10). (3) To understand the link between foliar symptoms expression and 759 mycobiome of PW or AW, we created three groups per category. In the category 760 ‘PW’, group (i) consisted in asymptomatic plants (cordon 1, points ‘A1 and S1’; n= 761 10), group (ii) consisted in symptomatic plants, sampled in the asymptomatic 762 cordon (cordon 1, points ‘A1 and S1’; n= 10), group (iii) consisted in symptomatic 763 plants, sampled in the symptomatic cordon (cordon 2, points ‘SA and SS’; n= 10) 764 (Figure 1 E). The same applied for the category ‘AW’, where group (i) consisted in 765 asymptomatic canes, sampled from asymptomatic plants, group (ii) consisted in 766 asymptomatic canes, sampled from symptomatic plants, group (iii) consisted in 767 symptomatic canes.

768 769 Figure 1. Sampling points in the permanent wood or annual wood of grapevine cv 770 Cabernet sauvignon. (GU) Graft union, (T) Trunk, (UT) Upper trunk, (A1) Arm 1, (S1) 771 Spur 1, (A2) Arm 2, (S2) Spur 2, (SA) Symptomatic arm, (SS) Symptomatic spur. 772 Cordon (1) presented canopy with healthy leaves in all ten sampled plants, while 773 cordon (2) presented foliar symptoms in the canopy, departing from SS, in five of 774 the sampled plants (circles and letters in red). (A) Sampling points used to 775 characterize the mycobiome of permanent wood – objective 1-. (B) Sampling 776 procedure involved using a gimlet to drill the wood and extract wood cores. (C) 777 Cores of wood extracted with a gimlet (red arrows indicate wood symptomatology). 778 From right to left: brown wood streaking, wood necrosis, extensive wood necrosis, 779 wood decay-white rot-wood necrosis. (D) Sampling points used to test the spatial 780 distribution of fungal communities – objective 2-. (E) Sampling points used to

781 examine the mycobiome present in the wood in proximity of the canopy with foliar 782 symptomatology (AS, SS) and healthy (A1, S1) – objective 3-. (F) From left to right: 783 symptomatic canes sampled from plants with foliar symptoms, asymptomatic canes 784 sampled from plants with no foliar symptoms in either of the cordons or with foliar 785 symptoms in only one of the two cordons; the sampling area for each cane is 786 indicated by the blue rectangle. 787 788 DNA extraction, amplification, library preparation and sequencing 789 Wood samples were ground to dust using sterile mortars and pestles aiding the 790 process with liquid nitrogen. An aliquot of ground wood (0.25 ± 0.01 g) of each 791 sample was added to DNA extraction columns (FastDNA™ SPIN Kit for Soil, MP 792 Biomediacals® LLC) and total DNA was extracted as described by the kit 793 manufacturer. Three negative controls of the DNA extraction procedure were added. 794 The amplicon chosen in this study targeted the Internal Transcribed Spacer ITS1 795 region, and the primer set selected was ITS1F2 - ITS2 (Gaylarde et al., 2017) with 796 overhang recommended by Illumina. For building libraries, we have been using a 797 double-step PCR approach as reported by Feld et al. (2015). The full sequence of the 798 primers, including Illumina overhangs, is the following: ITS1F2 (5’- 799 TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-GAACCWGCGGARGGATCA-3’) and 800 ITS2 (5’-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG- 801 GCTGCGTTCTTCATCGATGC-3’) (Gaylarde et al., 2017). 802 Each first-step PCR reaction contained 12.5 µL of Supreme NZYTaq II 2x Green 803 Master Mix™ (NZYtech™), 0.5 µL of forward and reverse primers from a 10 µM stock, 804 1.5 µL of sterile water, and 5 µL of template. Each reaction was pre-incubated at 95 805 °C for 2 min, followed by 40 cycles of 95 °C for 15 s, 75 °C for 10 s, 55 °C for 15 s, 72 806 °C for 40 s; a further extension was performed at 72 °C for 10 min. 807 Second PCR-step for barcoding, fragment-purification by using MagBio beads and 808 Qubit quantification was performed as reported in Gobbi et al. (2018). Final pooling 809 was performed at 10 ng/sample. DNA Sequencing was performed using an in-house 810 Illumina MiSeq instrument and 2x250 paired-end reads with V2 Chemistry. 811 812 Bioinformatics 813 After sequencing, demultiplex was performed using our Illumina MiSeq platform 814 and the raw data were analyzed using QIIME 2 v. 2018.2 (Caporaso et al., 2010) 815 using the same pipeline described in Gobbi et al. (2018); denoised reads were 816 trimmed 15 bp on the left to remove the adapters and then they were analyzed using 817 DADA2 with the exact sequence variants (EVS) methods (Callahan et al., 2017). Each 818 ESV appears at least twice in the dataset. Singletons were discarded. Taxonomic 819 assignments were performed at 99% identity using qiime feature-classifier classify- 820 sklearn with a Naïve-Bayes classifier trained with UNITE (Nilsson et al., 2013) v7.2 821 for ITS. To test the three hypotheses underlying this work we separate the frequency 822 table in three sub-tables which were tested under different conditions. The raw data 823 of this study will be available on European Nucleotide Archive (ENA) (Study 824 Accession Number: PRJEB31028). 825 826

827 Statistics 828 The frequency table and its taxonomy were combined, converted to biom format in 829 QIIME (Caporaso et al., 2010), then merged with a table of metadata into an S4 object 830 and analysed in R (v. 3.4.3) using the following packages: phyloseq, v. 1.22.3 831 (McMurdie and Holmes, 2013); biomformat, v. 1.10.0 (McMurdie and Paulson, 832 2018), vegan, v. 2.5.2 (Oksanen et al. 2007); ggplot2, v. 3.0.0 (Wickham, 2016); 833 igraph, v. 1.2.2 (Csardi, 2006), MetacodeR, v. 0.2.1.9005 (Foster et al., 2017); 834 adespatial, v. 0.1.1 (Dray et al., 2018); data.table, v 1.10.4.3 (Dowle and Srinivasan, 835 2017); and microbiome, v. 1.4.0.R code is publicly available at 836 https://github.com/marieag/EMG. 837 To assess the alpha diversity, Shannon diversity index and Pielou’s evenness were 838 calculated and tested with pairwise ANOVA, determining differences in these 839 indexes between tissue groups. We analyzed the ꞵ-dispersion to measure between- 840 sample variances in abundance, computing average distances of the individual 841 samples. The resulting ordination was plotted using the non-metric 842 multidimensional scaling (NMDS) combined with a Jaccard index matrix. These 843 ordinations were also performed with a Bray-Curtis dissimilarity matrix. To assess 844 overall inter-group variance, we performed a PERMANOVA, using a Jaccard distance 845 matrix with 999 permutations. In order to illustrate this effect size compared to the 846 relative abundance of the taxa, we created differential heat trees using MetacodeR, 847 illustrating the log2 fold change in species abundance. A Wilcoxon Rank Sum test 848 was applied to test differences between the same species in different tissue groups, 849 and the resulting p-values were corrected for multiple comparisons using FDR, as 850 implemented in MetacodeR. P-value threshold was set to 0.05. 851 852 Results 853 Sequencing dataset description 854 This dataset, obtained by sequencing, consists of a total of 95 samples. This includes 855 80 samples collected from different parts of the PW and 15 from canes (AW). All of 856 them represent a total of 20 plants and each one of these counts as an independent 857 biological replicate. This dataset contains 2805 exact sequence variants which 858 appear a total of 8.184.885 times among all the different samples. 859 860 Visual examination of the sampled wood 861 Before processing the samples for the molecular analysis, wood cores of permanent 862 wood and samples of annual wood were visually examined to assess the presence of 863 symptomatology. Approximately 10% of the wood cores were fully asymptomatic, 864 75% presented symptoms of tracheomycosis (e.g. brown wood streaking and/or 865 wood necrosis), and 15% showed the presence of white rot, which was always 866 associated with other tracheomycosis symptoms. The examination of the annual 867 wood revealed that 100% of the samples were fully asymptomatic. 868 869 The wood mycobiome 870 The identification of sequences in our dataset revealed an unprecedented diversity. 871 Taxa that were assigned to genus or species level are 289, 50 of them are found in 872 relative abundance (RA) greater than 0.1%, while the remaining 239 are considered

873 rare taxa (RA < 0.1%). Within these 239 taxa, 146 are found in a RA included 874 between 0.1 and 0.01%, and 93 have a RA lower than 0.01%. The full list of taxa is 875 available in the supplementary material (Table S5). 876 The qualitative overview of the wood mycobiome will focus mainly on the 30 most 877 abundant taxa in PW and the 12 most abundant in AW, which account for 79.1 and 878 80.8% of the total RAs respectively (Table 1), while the remaining percentages 879 represent unidentified taxa or fungi found in lower abundances. Within this group 880 of taxa, five genera and nine species of fungi are described for the first time in 881 association with the grapevine wood mycobiome, while the remaining 18 taxa have 882 already been reported (Table 1). 883 The community encountered in perennial wood is characterized by the presence of 884 both ascomycetes and basidiomycetes (66.7% and 27.7% RA), with high 885 abundances of tracheomycotic pathogen Phaeomoniella chlamydospora (25.8%) 886 and white rot agent Fomitiporia sp. (14.6%), two organisms directly associated with 887 esca proper and other esca-related syndromes. Among all sampled wood cores (n= 888 80), P. chlamydospora was present in 68 of them (85%; RA> 0.1%), while 889 Fomitiporia sp. in 58 (64%; RA> 0.1%) or 14 (17.5%; RA> 35%). Other GTD 890 pathogens among the 30 most abundant taxa are Eutypa lata (0.7%) and E. 891 leptoplaca (0.9%), within the Diatrypaceae. More members of this family are 892 Anthostoma gastrinum (0.9%), a potential wood pathogen, as well as E. flavovirens, 893 Eutypella citricola and Cryptovalsa ampelina, identified as rare taxa. Members of the 894 Botryosphaeriaceae (e.g. Diplodia pseudoseriata, Neofusicoccum parvum, N. 895 australe), Ilyonectria sp. and Neonectria sp. are also found, although represented 896 only as rare taxa (Table S5). Decay agents, such as Fomitiporia sp., Fomitiporia 897 mediterranea (0.2%) and Inonotus hispidus (0.3%), were also identified in this study, 898 along with several others represented in minor abundances (e.g. Fomitiporella sp.). 899 Among the endophytes and saprophytes, Alternaria sp. (3.2%), Cladosporium sp. 900 (1.9%), Aureobasidium pullulans (0.4%) and Psathyrella sp. (0.5%) are the most 901 abundant. Several other genera or species, identified for the first time is association 902 with grapevine wood, amount to 14 taxa out of the 33 most abundant in PW or AW 903 (Table 1). 904 Annual wood is also colonized by both ascomycetes (76.3%) and basidiomycetes 905 (18.8%). The most abundant taxa are endophytic and saprophytic fungi, with 906 Alternaria sp. (14.6%), Ramularia sp. (9.4%) and Cladosporium sp. (8.2%) being 907 among most abundant, as well as other species reported for the first time (e.g. 908 Debaryomyces prosopidis; Table 1). Only two wood pathogens are present in 909 moderate abundances in AW, namely P. chlamydospora (3.9%) and Diaporthe sp. 910 (0.8%), while other pathogenic agents are found in minor abundances (RA < 0.2%; 911 e.g. Neofusicoccum australe). 912 The core mycobiome, namely the taxa shared between PW and AW, is constituted 913 by 44 taxa. Only 10 taxa are unique to AW and the remaining 235 are unique to PW. 914 All the 10 unique taxa found in AW are considered rare taxa, as their RAs are lower 915 than 0.1% of the total, while among the many taxa unique to PW we find organisms 916 belonging to the Hymenochaetaceae, , Pleomassariaceae, 917 Xylariaceae and several others (Table S5).

918 Table 1. List of most abundant taxa, identified to genus or species level, found in grapevine wood. The list includes the 30 most abundant 919 taxa found in permanent wood and the 12 most abundant in annual wood, for a total of 32 taxa. The numbers between brackets represent 920 the relative abundance of that Phylum or Family in the permanent wood or annual wood (PW% - AW%) based on the table created to 921 address objective (1). The ecology of the identified taxa in wood of grapevines or of other plants is shown based on available literature (E= 922 endophyte, S= saprophyte, P= pathogen, na= unknown ecology). The presence of taxa in different tissue types is based on the table created 923 -) indicates absence or presence in RA < 0.1%. Meaning of GU, T, UT, A1, A2, 924 S1, S2 and C as in the legend of Figure 1. 925 to address objective (2), (+) indicates presence (RA ≥ 0.1%), ( Relative Presence in different tissue type Ecology abundance (%) Phylum Family Species in wood PW AW GU / T / A1 / S1 / C * UT A2 S2 Ascomycetes Biatriosporaceae (0.6 – 0) Biatriospora 0.6 - E a - / - / - - / - - / - - (66.7 – 76.3) mackinnonii‡ Bionectriaceae (0.4 – 0) Clonostachys rosea 0.4 - E, S, P b - / - / - - / - - / + - Davidiellaceae (2.0 – 8.2) Cladosporium sp. 1.9 8.2 E, S b + / + / + + / + + / - + Diaporthaceae (<0.1 – Diaporthe sp. <0.1 0.8 E, S, P b - / - / - - / - - / - + 0.8) Diatrypaceae (2.6 – 0) Anthostoma gastrinum† 0.9 - S, P c, d + / + / + + / + + / - - Eutypa lata 0.7 - P b - / - / - - / - - / + - Eutypa leptoplaca 0.9 - P b - / - / + - / - + / + - Dothioraceae (0.4 – 4.0) Aureobasidium 0.4 4.0 E, S b - / + / + + / + - / - + pullulans Glomerellaceae (<0.1 – Colletotrichum sp. <0.1 0.4 P b - / - / - - / - - / - - 0.7) Herpotrichiellaceae (27.1 Exophiala xenobiotica† 0.5 - na - / - / - - / + + / + - – 3.9) Phaeomoniella 25.8 3.9 P d + / + / + + / + + / + + chlamydospora Hypocreales (0.3 – 0.2) Acremonium sp. 0.2 - E e - / - / - - / - - / + -

Lophiostomataceae (3.8 – Angustimassarina 0.5 - S f - / - / + + / + + / + - 0) acerina† sp. 2.7 - E, S b + / + / - + / - + / - - Lophiostoma cynaroidis† 0.3 - E g - / + / - - / - - / - - Lophiotrema rubi 0.3 - na + / - / - - / - - / - -

Table 1. (continue).

Relative Presence in different tissue type Ecology abundance (%) Phylum Family Species in wood PW AW GU / A1 / S1 / C * T / UT A2 S2 Massarinaceae (0.5 – 0) sp. 0.5 - E h + / + / + - / - + / - - Mycosphaerellaceae (3.3 Ramularia sp. 3.1 9.4 na + / + / + + / + + / + + – 9.4) Pleomassariaceae (2.9 – Trematosphaeria 2.9 - S i + / + / - - / + + / + - 0) pertusa‡ Pleosporaceae (3.9 – Alternaria sp. 3.2 14.6 E b, j + / + / + + / + + / + + 14.8) Saccharomycetaceae Debaryomyces 10.4 31.5 na + / + / + + / + + / + + (10.9 – 31.8) prosopidis‡ Xylariaceae (0.7 – 0) Lopadostoma 0.3 - S k - / - / - - / - - / - - meridionale‡ Lopadostoma 0.4 - S k + / - / - - / - - / - - quercicola‡ Basidiomyce Filobasidiaceae (0.2 – Filobasidium magnum† 0.1 0.2 na + / - / - - / + - / - - tes (26.7 – 0.2) 18.8)

Hymenochaetaceae (15.2 Fomitiporia sp. 14.6 - P d + / + / + + / + + / + - – 0) Fomitiporia 0.2 - P d - / + / - + / - - / - - mediterranea Inonotus hispidus 0.3 - S, P l + / - / - - / + - / - - Malasseziaceae (0.3 – Malassezia restricta‡ 0.2 0.2 na + / - / - - / - - / - + 0.3) Psathyrellaceae (0.5 – 0) Psathyrella sp. 0.5 - S m + / - / - - / - - / - - Sporidiobolaceae (0.6 – Rhodotorula 0.4 <0.1 S n - / + / + + / + - / - - 0) mucilaginosa† Tremellaceae (6.4 – 8.3) Cryptococcus sp. 2.7 7.6 E, S b + / + / + + / + + / + + Cryptococcus 0.3 - - - / - / + - / - - / - - heimaeyensis† Cryptococcus victoriae† 2.9 0.7 - + / + / + + / + + / + + TOTAL 79.1 80.8 926 * References. a ., 2017), b(Jayawardena et al., 2018), c(Haynes, 2016), d(Gramaje et al., 2018), e(González and Tello, 2011), 927 f(Thambugala et al., 2015), g(Xing et al., 2011), h(Casieri et al., 2009), i(Suetrong et al., 2011), j(Pancher et al., 2012), k(Jaklitsch et al., 2014), 928 l(González et al(Kolařík., 2009), etm(Bruez al et al., 2016), n ., 2017). 929 ‡First report of genus and species in grapevine wood. 930 †First report of species in grapevine wood. (Haňáčková et al

931 Diversity and spatial distribution of the mycobiome 932 933 Alpha diversity 934 The Shannon (H’) and Pielou’s (J’) indexes vary significantly among tissue types 935 (Figure 2, Table 2). The Shannon diversity analysis reveals that the three sampling 936 points located in the trunk, namely GU, T and UT, differ from both the spurs points 937 S1 and S2 (p < 0.01, p < 0.05, p < 0.001, respectively). The spur tissue is also 938 significantly different from the annual wood (Canes; p < 0.001). No differences are 939 observed among samples GU, T and UT, as well as between the two sample points in 940 the arm (A1 and A2) and in the spurs (S1 and S2). Exact p-values are available in 941 Table S6 in the supplementary materials. The different tissue types also vary in the 942 evenness of the mycobiome, with the fungal communities of the canes, GU, T and UT 943 being more evenly distributed than those of the spurs (Figure 2); canes’ evenness 944 differ from both the Arm tissues as well (Arm 1 and Arm 2; p < 0.05). Also, for this 945 parameter the three tissue types in the trunk (GU, T and UT) do not differ, as well as 946 the two tissues Arm 1 and Arm 2, and tissues Spur 1 and Spur 2 (Figure 2). 947 948 Beta diversity 949 The Jaccard’s index, when visualized in a non-metric multidimensional scaling 950 (NMDS) plot, shows a considerable overlap for different tissue types (Figure 3 A). 951 The PERMANOVA indicates significant difference between groups (p < 0.001), but 952 looking at the ordination, the difference is arguably in the clustering of the 953 observations, rather than a distinct difference in sample composition. A pattern 954 emerges when examining each tissue type separately (Figure 3 B). We observe a 955 reduction in between-sample variability starting from the GU to T and until UT. The 956 UT variability is very similar to that of both the Arm points (A1 and A2), while the 957 Spurs (S1 and S2) have higher between-sample variability, when compared to the 958 Arms, but similar to one another. Concerning the AW, the variability of this tissue 959 type is very low (Figure 3 B). Bray Curtis test revealed a similar profile and 960 significant differences (p < 0.001; data not shown). Matrixes of the Jaccard and Bray 961 Curtis are available in the supplementary materials. 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976

977 Table 2. Shannon diversity (H’) and Pielou’s evenness (J’) indexes of the spatial 978 distribution analysis (objective 2) and the mycobiome analysis in the wood 979 associated with symptomatic canopy (objective 3). 980 Shannon Pielou's evenness (H’) (J’) Spatial distribution (objective 2) Graft Union 1.85 0.60 Trunk 1.73 0.57 Upper trunk 1.92 0.61 Arm 1 1.26 0.40 Spur 1 1.08 0.34 Arm 2 1.58 0.49 Spur 2 1.09 0.37 Canes 1.93 0.61 Foliar symptomatology (objective 3) Permanent wood Asymptomatic arm 1.29 0.41 in asymptomatic plant Asymptomatic arm 1.05 0.33 in symptomatic plant Symptomatic arm 1.25 0.39 Annual wood Asymptomatic cane 2.04 0.63 in asymptomatic plant Asymptomatic cane 1.81 0.59 in symptomatic plant Symptomatic cane 2.09 0.68 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996

997 998 999 1000 1001

1002 1003 Figure 2. Box plots of diversity indexes (Shannon, Pielou’s evenness) of the fungal 1004 communities present in different sampling areas of the permanent wood (Graft 1005 Union, Trunk, Upper trunk, Arm 1, Spur 1, Arm 2, Spur 2) and annual wood (Canes). 1006 Statistical differences are shown by (*), where the Pair-wise ANOVA analysis 1007 resulted in a p < 0.5 (*), p < 0.1 (**) and p < 0.01 (***). 1008 1009 1010 1011

1012 1013 Figure 3. Non-metric multidimensional scaling (NMDS) plots based on Jaccard’s 1014 index. Fungal communities present in different tissue types in grapevine. (A) Shows 1015 all samples ordinated together, while (B) is the same data, split up per tissue group. 1016 Ellipses illustrate the multivariate normal distribution of samples within the same 1017 tissue group. The groups are colour-coded to correspond with Figure 2. 1018 1019 1020 Mycobiome composition and differentially represented taxa 1021 The bar plot in Figure 4 shows the relative abundance of the 20 most abundant taxa 1022 in different tissue types, giving an overview of the presence/absence of taxa and 1023 their differential representation. For example, Eutypa lata and Acremonium sp. are 1024 present exclusively in the wood below the spurs, while Trematosphaeria pertusa 1025 was detected in the same tissue type and in the graft union. Fomitiporia sp., very 1026 represented in most of the perennial wood, is nearly absent in the graft union. This 1027 tissue contains three unique taxa, Psathyrella sp., Lophiotrema rubi and 1028 Lopadostoma quercicola, as well as other taxa which are present in higher 1029 abundances when compared to the rest of the plant, such as Lophiostoma sp. and 1030 Massarina sp. The heat trees shown in Figures 5 and 6 give a thorough view of the 1031 differently abundant taxa, when comparing each tissue type, for taxa with RAs > 1032 0.1% (n= 50). The majority of the statistical differences observed, for both 1033 ascomycetes and basidiomycetes, concern the comparison between the Canes (AW) 1034 and all other tissue types of the permanent wood. Among the ascomycetes, the AW 1035 presents lower abundance of P. chlamydospora and Diatrypaceae, and higher 1036 abundance of Aureobasidium pullulans, Debaryomyces prosopidis, Diaporthe sp., 1037 Capnodiales, depending on the woody tissues compared. 1038 1039

1040 1041 Figure 4. Barplots of the relative abundance of the 20 most abundant taxa identified 1042 to species (s_) or genus (g_) level, found in different sampling areas of the stem and 1043 in the canes of grapevine. ‘Unassigned’ are taxa identified to a lower taxonomic level 1044 than genus, ‘Others’ are taxa not included in the 20 most abundant. 1045 1046 1047 1048 1049 1050 1051 1052 1053

1054 1055 1056 Figure 5. Differential heat tree matrix depicting the change in taxa abundance 1057 between different tissue groups, for ascomycetes, represented in the dataset (RA > 1058 0.01%). The size of the individual nodes in the grey cladogram depicts the number 1059 of taxa identified at that taxonomic level. The smaller cladograms show pairwise 1060 comparisons between each tissue group: an orange node indicates a higher 1061 abundance of the taxon in the tissue group stated on the abscissa, than in the tissue 1062 group stated on the ordinate. A blue node indicates the opposite. Taxa identified as 1063 statistically differently represented, according to the Wilcoxon test, are tagged with 1064 a black star. 1065 1066

1067 1068 1069 1070 Figure 6. Differential heat tree matrix depicting the change in taxa abundance 1071 between different tissue groups, for basidiomycetes, represented in the dataset (RA 1072 > 0.01%). The size of the individual nodes in the grey cladogram depicts the number 1073 of taxa identified at that taxonomic level. The smaller cladograms show pairwise 1074 comparisons between each tissue group: an orange node indicates a higher 1075 abundance of the taxon in the tissue group stated on the abscissa, than in the tissue 1076 group stated on the ordinate. A blue node indicates the opposite. Taxa identified as 1077 statistically differently represented, according to the Wilcoxon test, are tagged with 1078 a black star.

1079 Mycobiome and foliar symptoms 1080 Alpha and beta diversity 1081 The examination of the mycobiome of the wood in the proximity of the canopy that 1082 exhibited foliar symptoms of esca revealed no significant differences in term of 1083 diversity (H’) and evenness (J’), when compared with the fungal communities in the 1084 wood in proximity of non-symptomatic canopy, both in symptomatic or 1085 asymptomatic vines (p > 0.05; Table 2). The same results were obtained when 1086 comparing the mycobiome of the annual wood (canes) presenting foliar symptoms 1087 or non-symptomatic leaves, in both symptomatic and asymptomatic plants. The 1088 Jaccard indexes, when plotted in a NMDS matrix, reveal overlapping communities 1089 with a very similar between-sample variability, suggesting an overall similarity in 1090 beta diversity. In fact, the PERMANOVA analysis revealed no statistical differences 1091 among the three tissue groups in both permanent wood (p = 0.067) and annual 1092 wood (p = 0.429). Boxplots of the diversity indexes and NMDS of beta dispersion are 1093 available in the supplementary materials (Figure S9, Figure S10). 1094 1095 Mycobiome composition and differentially represented taxa 1096 The mycobiome of the permanent wood in proximity of symptomatic or non- 1097 symptomatic canopy is characterized by high abundances of P. chlamydospora, 1098 Fomitiporia sp. and Debaryomyces prosopidis (Figure 7). The most frequent taxa of 1099 known GTD-associated pathogens are presented in Table 3. 1100 The MetacodeR analysis revealed no statistical differences among taxa for their 1101 differential abundance in the three tissue groups compared in permanent wood and 1102 in annual wood. Nevertheless, trends of change are present for some taxa, and they 1103 are shown in the heat trees in Figure 8. 1104 1105 The permanent wood in proximity of non-symptomatic canopy in non-symptomatic 1106 plants presents higher abundances of Ramularia sp., Cladosporium spp., Alternaria 1107 sp., Debaryomyces sp., Cryptococcus sp. and, to a lower extent, P. chlamydospora; 1108 while Fomitiporia spp. is under-represented, when compared with the wood near 1109 symptomatic canopy. 1110 When comparing the PW in proximity of non-symptomatic and symptomatic 1111 canopy, both collected from a symptomatic vine, similar trends are observed. In fact, 1112 Ramularia sp., members of the Pleosporales and Aureobasidium pullulans are over- 1113 represented in the former, along with a minor over-representation of the other taxa 1114 previously mentioned. Also, in this case a trend of underrepresentation is observed 1115 for Fomitiporia spp. 1116 Some differences are also observed when comparing the PW near asymptomatic 1117 canopy coming from either asymptomatic or symptomatic plants. In the former we 1118 observe an over-representation of Alternaria sp., Debaryomyces sp. and 1119 Cryptococcus spp., and an under-representation of Aureobasidium pullulans and 1120 Anthostoma sp. 1121 1122 Annual wood communities are characterized by similar relative abundances, among 1123 the most abundant taxa, for all three groups analyzed. Several trends of variation

1124 are shown in Figure 8, involving primarily genera Malassezia, Cryptococcus, 1125 Acremonium and Diaporthe. 1126 1127 Table 3. Relative abundances of genera or species of fungi known to be involved in 1128 GTDs, encountered in the permanent wood (PW) or annual wood (AW), relative to 1129 objective (3). Three groups of wood were examined, namely the PW in proximity of 1130 symptomatic canopy (SYM); asymptomatic canopy, but in vines that presented foliar 1131 symptoms (ASYM-SYM); asymptomatic canopy, in vines that did not present foliar 1132 symptoms (ASYM). The same groups were examined in AW, namely canes with 1133 manifested foliar symptoms (SYM); asymptomatic canes, but in vines that presented 1134 symptomatic canopy (ASYM-SYM); asymptomatic canes, in vines that did not 1135 present symptomatic canopy (ASYM). Taxa listed are found in RA > 0.01% of the 1136 total dataset, other known wood pathogens (RA < 0.01%) are not included. 1137 Pathogens’ RA in PW Pathogens’ RA in AW Genus/Species ASYM- ASYM- SYM ASYM SYM ASYM SYM SYM Ascomycetes Anthostoma gastrinum* 1.3 0.7 - - - - Diaporthe sp. - - <0.1 0.6 <0.1 1.2 Eutypa lata 0.2 0.1 <0.1 - - - Eutypa leptoplaca 0.7 3.7 <0.1 - - - Neofusicoccum parvum 0.1 - - - - - Neofusicoccum australe - 0.2 - Phaeomoniella 21.8 50.1 43.1 3.5 3.4 4.9 chlamydospora

Basidiomycetes Fomitiporia sp. 34.9 24.1 7.7 - - - Fomitiporia mediterranea 0.9 <0.1 0.1 - - -

Total Ascomycetes 22.7 53.9 43.1 4.1 3.6 6.1 Total Basidiomycetes 35.8 24.1 7.8 - - - TOTAL 58.5 78.0 50.9 4.1 3.6 6.1 1138 1139 *genus Anthostoma has been associated to pathogenicity in grapevine wood, but not 1140 A. gastrinum, being this the first report of this species in grapevine wood. The RAs 1141 of this taxon were not calculated in the Total(s).

1142 1143 Figure 7. Barplots of the relative abundance of the 20 most abundant taxa identified to species (s_), genus (g_) or family (f_) level. 1144 ‘Unassigned’ are taxa identified to a lower taxonomic level than family. ‘Others’ are taxa not included in the 20 most abundant. (Left) 1145 Communities found in the PW in proximity of symptomatic canopy (‘Symptomatic_arm’) or of asymptomatic canopy, either in symptomatic 1146 plants (‘Asymptomatic_arm symptomatic_plant’) or in asymptomatic plants (‘Asymptomatic_arm asymptomatic_plant’). (Right)

1147 Communities found in the canes with manifested foliar symptoms (‘Symptomatic_cane’) or asymptomatic, but coming from symptomatic 1148 (‘Asymptomatic_cane symptomatic_plant’) or asymptomatic plants (‘Asymptomatic_cane asymptomatic_plant’).

1149 1150 Figure 8. Differential heat tree matrix depicting the change in species abundance between different tissue groups, represented in the 1151 dataset with a (RA > 0.01%). The size of the individual nodes in the grey cladogram depicts the number of taxa identified at that taxonomic 1152 level. The smaller cladograms show pairwise comparisons between each tissue group, with the colour illustrating the log2 fold change: a 1153 red node indicates a higher abundance of the taxon in the tissue group stated on the abscissa, than in the tissue group stated on the 1154 ordinate. A blue node indicates the opposite.

Discussion Characterization of the mycobiomes in PW and AW The diversity of the fungal communities living in the stem of woody plants, characterized by NGS approaches, is largely unknown. Research performed over the past decades on the grapevine mycobiome, using both culture-dependent and independent approaches, revealed over 900 fungal taxa (Jayawardena et al., 2018). The richness of the wood mycobiome in esca infected vineyards was estimated to be 88 taxa in Montpellier, France (Travadon et al., 2016); 85 species in Bordeaux, France, from 108 single strand conformation polymorphism (SSCP) profiles (Bruez et al., 2014); and 150 operational taxonomic units (OTUs) in Switzerland (Hofstetter et al., 2012). This study, the first using a NGS approach, reveals an even greater richness (289 taxa), adding numerous fungi to the known list of 900+ known taxa. Several factors may play a role in shaping the fungal community composition, such as location, cultivar and age of the plants (Dissanayake et al., 2018; Travadon et al., 2016). This is the primary reason why we expect that using NGS to assess the diversity of the wood mycobiome in other vineyards will considerably increase the number of fungal species found in association with grapevine wood. Interestingly, 239 out of the 289 taxa (80%) are rare taxa. The ecology of several of them is known (Jayawardena et al., 2018), while that of several others needs to be assessed, nevertheless the extent to which they contribute to the grapevine-mycobiome and fungus- fungus interactions remains to be elucidated. Grapevines permanent wood seems to function as a reservoir of fungal diversity, where species that are found in minor abundances may thrive under specific environmental conditions (e.g. extreme weather conditions, mechanical injuries), leading to positive or negative effects for the plant’s wellbeing. Unsurprisingly, within this massive richness, numerous fungi associated with GTDs were identified, both as frequent and rare taxa. This agrees with previous reports showing that esca-affected plants may host numerous other wood pathogens (Edwards and Pascoe, 2004; Rumbos and Rumbou, 2001). However, Phaeoacremonium spp., a genus of tracheomycotic fungi often involved in esca-associated syndromes (Mostert et al., 2006), was not detected. The vineyard under study, as well as other esca-symptomatic fields examined in previous works (Bruez et al., 2014; Hofstetter et al., 2012; Travadon et al., 2016), are dominated by wood pathogens (e.g. P. chlamydospora or Botryosphaeriaceae), and the presence of decay agents was confirmed in all studies. On the other hand, the wood mycobiome of healthy vineyards, characterized by other authors, is dominated by endophytes and occasionally saprobes, some of which hold potential for biological control of wood pathogens, with genera Trichoderma, Aureobasidium, Acremonium, Cladosporium, Alternaria and Epicoccum being predominant (Dissanayake et al., 2018; González and Tello, 2011; Pancher et al., 2012). Despite the fact that wood pathogens were encountered, in minor abundance, also in healthy vineyards (González and Tello, 2011; Pancher et al., 2012), decay agents were not reported. Unfortunately, previous studies did not report data on rare taxa, therefore it is not possible to establish if other wood pathogens – including decay agents – were present but unable to thrive (e.g. antagonistic interactions with other fungi, strong plant defenses) or if they were completely absent.

The differences observed between PW and AW (e.g. alpha diversity, beta dispersion, taxa composition and abundance), some of which have also been observed in a previous study in grapevines (Hofstetter et al., 2012) and other plants (Qi et al., 2012), can be due to two main factors. First factor is the time that fungi had to colonize the perennial wood (up to 19 years in the present study), which is also subjected to yearly pruning, leaving wounds that are the optimal entry point for colonizers. The annual wood had approximately 5 months of age from bud-burst to sampling and no wounding occurred in the shoots (e.g. summer pruning). A second factor that explains the these differences is the tissue-specificity, as considerable anatomical, biochemical and physiological differences characterize permanent wood and annual wood, which may prevent some fungi, such as decay agents, from colonizing the latter. In fact, the absence of decay agents in annual wood (e.g. Fomitiporia sp., Inonotus hispidus, Fomitiporella sp.) observed in this study, and also reported in canes and nursery propagation material in other studies (Casieri et al., 2009; Halleen et al., 2003; Hofstetter et al., 2012; Rumbos and Rumbou, 2001), suggests that the infection by these pathogens occurs exclusively in older wood and under field conditions. A brief introduction of the five genera identified for the first time in grapevine’s wood follows. Debaryomyces. Members of the genus Debaryomyces (Saccharomycetales, Saccharomycetaceae) are yeast-like ascomycetes found in several ecological niches. There are 93 species in the genus, with D. hansenii being the most widely studied. D. hansenii has been isolated from shoots of Sequoia sempervirens and from the soil under the tree (Middelhoven, 2003), as well as from white and brown rot of several woody species (González et al., 1989). Debaryomyces sp. was also detected in slime fluxes of Prosopis juliflora, a deciduous woody plant, as well as in insects associated with it (Drosophila carbonaria and Aulacigaster leucopeza) (Ganter et al., 1986). Molecular analysis identified some of these strains as a new species, D. prosopidis (Phaff et al., 1998), which is the only record of this organism in literature. Concerning the vineyard, Debaryomyces sp. has been identified on grape berries (Jara et al., 2016) and during wine fermentation (Varela and Borneman, 2017). Trematosphaeria. Members of the genus Trematosphaeria (Pleosporales, Pleomassariaceae) are saprotrophs or hemibiotrophs of terrestrial woody plants, with some exceptions reported from freshwater or marine habitats (Tanaka et al., 2005). T. pertusa, from this study, is known to grow on the surface of decaying terrestrial wood (Suetrong et al., 2011), however, it has also been retrieved from wood of Fagus sylvatica and Pinus sylvestris submerged in a river (Kane et al., 2002). Terrestrial woody hosts of T. pertusa include also Fraxinus excelsior, Fagus sp. and Platanus sp. Lopadostoma. Members of the genus Lopadostoma (Xylariales, Xylariaceae) are ascomycetes currently considered saprotrophs (Anthony et al., 2017). Members of the Xylariaceae are typically saprobes, despite some members of this family are endophytes and others are plant pathogens (Mehrabi and Hemmati, 2015). To date, due to the lack of studies on this topic, the ecology of the genus Lopadostoma remains to be fully understood. There are 12 species described in this genus,

most of which have been isolated from hosts Quercus spp. or Fagus sylvatica (Jaklitsch et al., 2014). Biatriospora. The genus Biatriospora (Pleosporales, Biatriosporaceae) contains species that have been isolated as endophytes of both terrestrial and marine- et al., 2017), as well as from , as an endolichenic fungus (Zhou et al., 2016). Known hosts of this genus, in temperate climate, are Ulmus spp. and Acer pseudoplatanusassociated plants, along (Kolařík with . ., 2017). Biatriospora sp. is a source of bioactive compounds (heptaketides) with antifungal severalproperties other (Zhou hosts et al., from 2016) tropical and preliminary climates (Haňáčková studies suggest et al , some 2017; potential Kolařík etin albiological B. mackinnonii, from this study, has been isolated from terrestrial plant material and it has also been linked with human mycetoma, a skin disease (Ahmedcontrol (Haňáčkováet al., 201 et al., 2017).., 2017). Malassezia. Members of the genus Malassezia (Malasezziales, Malasezziaceae) are yeast or yeast-like basidiomycetes4; Kolařík isolated et al from numerous niches. Fourteen species belong to this genus and most of them are obligatory lipophiles (Sommer et al., 2015). M. restricta, from this study, has been identified as endophyte of both woody and herbaceous plants (e.g. Eucalyptus sp., Populus deltoids, Spiranthes spiralis, Solanum tuberosum) as well as found on orchid roots, and in soil (Abdelfattah et al., 2016; Connell and Staudigel, 2013; Paulo et al., 2017).

It is important to mention the genera Cryptococcus, Angustimassarina and Exophiala, found in this study, which have been previously associated once to grapevine’s wood, either as saprobes or endophytes, by a culture-independent study (Jayawardena et al., 2018). Lastly, species of Ramularia have been isolated from leaves (both as epiphytes and endophytes) of several hosts, including Platanus sp., Prunus cerasus and Vitis vinifera (Bakhshi and Arzanlou, 2017), although never in association with wood.

Among the genera and species that were found associated with grapevines for the first time in this study, some have previously been detected in the endosphere or phyllosphere of other woody plants that were found in the proximities of the vineyard used in the present study (Table S4). For example, Lopadostoma spp. and Anthostoma gastrinum were reported in Quercus (Haynes, 2016; Jaklitsch et al., 2014), Rhodotorula mucilaginosa and Trematosphaeria pertusa in Fraxinus Malassezia restricta in Eucalyptus (Paulo et al., 2017). This suggests that the fungi present in the endosphere and phyllosphere of the(Haňáčková flora in proximity et al., 2017; of a vineyard Suetrong might et al., influence 2011), andthe composition of the mycobiome of grapevines wood, acting as a reservoir of multi-host fungi, with wind, rain and insects being possible vectors for mycobiome exchange.

Spatial distribution of the fungal communities The concept of spatial distribution and tissue specificity in woody and herbaceous plants is not new in microbial ecology (Kovalchuk et al., 2018; Paulo et al., 2017; Singh et al., 2017; Wearn et al., 2012). However, studies that investigated the spatial distribution of fungal

communities that colonize different areas or tissues in the wood of adult grapevines are scarce (Bruez et al., 2014; Travadon et al., 2016), and none of them used NGS approaches. When examining the relative abundances of identified taxa with NGS, it is important to remember that the ITS marker, our target amplicon, can be found in multiple copies in the genome of fungi and the number of copies may vary considerably from species to species (Schoch and Seifert, 2012), which inherently leave fungal relative distributions with great uncertainties. From a qualitative point of view, the communities of various areas in PW shared numerous taxa, overall in agreement with the findings of Bruez et al. (2014), although some tissue types were more variable than others. Also, considerable differences are evident from a quantitative point of view. High variability was observed among individual plants, as already noted in grapevines (Travadon et al., 2013). The graft union and the wood below the spurs host unique taxa or taxa that are found only in low percentages (RA < 0.1%; Table 1) in other parts of the plant. Supposedly, the root system (in this study Vitis berlandieri x V. rupestris), whose endosphere is influenced by the soil microbiome (Zarraonaindia et al., 2015) and by the rootstock used in the propagation process, harbor unique fungi that have a limited capacity of colonizing the stem of V. vinifera. Interestingly, Fomitiporia sp., a species that is well represented in all PW tissues (8.9% < RA < 22.9%), is nearly absent in the graft union (0.6%; Figure 4). Artificial inoculations of F. punctata in rootstock Kober 5BB (V. berlandieri x V. riparia) led to low re-isolation percentages of this pathogen (8% of inoculated plants; Sparapano et al., 2000), and we are not aware of any other literature that describes the presence of Fomitiporia sp. in the root system of naturally infected vines. The spur tissue is located in proximity of pruning wounds, which are considered the main entry point of GTDs-associated fungal pathogens (Bertsch et al., 2013). Through this woody tissue some colonizers successfully spread throughout the cordon and trunk (e.g. P. chlamydospora or Fomitiporia sp.), while others may be restrained by the antagonistic interactions with the resident mycobiome and/or plant defenses (e.g. Eutypa spp., Clonostachys rosea, Acremonium sp.; Table 1). The remaining tissues, namely trunk, upper trunk and arm points, do not differ by any of the parameters measured in this study (e.g. alpha and beta diversity, MetacodeR; Figures 2, 3, 5 and 6), where unique taxa among the most represented fungi are nearly absent (Table 1), and relative abundances differ primarily in the representation of P. chlamydospora (Figure 4). The results of this study show that the wood mycobiome of grapevines may vary and, in order to have a representative understanding of the diversity and abundance of the fungal communities, multiple sampling areas are recommended. We propose four samples per plant. The first two are the graft union and the wood below one spur, where some taxa are uniquely represented and abundances vary considerably; the third is any point of the trunk or arm; and the forth is annual wood.

Fungal communities and foliar symptomatology The outcome of the statistical analyses of the mycobiome of both permanent wood and annual wood highlight that the fungal communities were not affected by the manifestation of foliar symptoms, or vice versa. The manifestation of “tiger stipes” foliar symptoms, always associated with an advanced stage of infection by tracheomycotic pathogens, is in large part cryptic due to its discontinuity and unidentified causal agents (Mugnai et al., 1999). Greenhouse and field trials failed to reproduce such symptoms with artificial inoculations of Pa. chlamydospora and/or Pm. minimum (Zanzotto et al., 2007; Gramaje et al., 2010), except for one study in which esca-like foliar symptoms were replicated in a very small percentage of inoculated plants (Sparapano et al., 2001). Other microbial ecology studies of the wood from foliar-symptomatic and asymptomatic vines showed that there are no differences in the fungal community composition, finding similar abundances of wood pathogens (Bruez et al., 2014; Hofstetter et al., 2012). Nevertheless, tracheomycotic fungi are currently believed to be directly or indirectly responsible for foliar symptoms as they are frequently isolated from symptomatic wood (Surico, 2008). The two arguments supporting this hypothesis are that the translocation of (1) fungal toxins, (2) byproducts of the wood degradation or (3) a combination of both, via xylem sap, from the permanent wood to the leaves, are responsible for the appearance of leaf symptoms (Mugnai et al., 1999). While other studies showed that some toxins or culture filtrates of tracheomycotic fungi could lead to leaf discoloration and necrosis, solid evidence of the replication of the “tiger stripes” pattern is lacking (Abou- Mansour et al., 2004; Andolfi et al., 2011; Evidente et al., 2000; Sparapano et al., 2000). No evidence is currently available in support of the second hypothesis, namely the role of byproducts of wood degradation. The results of the present study are, to some extent, in agreement with the findings of Hofstetter et al. (2012) and Bruez et al. (2014). In fact, the fungal communities present in the PW in proximity of canopy that had exhibited symptoms were overall similar to those found in plants with healthy canopies (in symptomatic or asymptomatic plants). When considering trends, Pa. chlamydospora is more abundant in the wood near non-symptomatic canopies, while Fomitiporia sp. is the most represented taxa in the wood near symptomatic canopy. This observation suggests a higher wood decay activity, which likely leads to a greater presence of byproducts of wood degradation, therefore supporting the second hypothesis for foliar symptoms manifestation. Regardless, it does not explain the manifestation of foliar symptomatology in grapevines not infected by Fomitiporia sp. (Edwards et al., 2001). The similarity in the community structure of symptomatic and non-symptomatic canes is an additional evidence in support of the current understanding that fungi present in AW are not directly linked with the foliar symptomatology. Foliar symptoms’ manifestation remains cryptic, however, it is important to note that this study, as well as the one by Bruez et al. (2014), analyzed the microbial communities of the permanent wood several months after the plant had exhibited foliar symptoms. It is not known whether the pathogens abundance in the moment pre-/during/post- symptomatology vary. In fact, the study by Bruez et al. (2014) revealed that alterations in the wood mycobiome may occur in different seasons, therefore further research is needed

in this direction. However, this may not apply to the annual wood, as foliar symptoms were still present, and manifested only two months before wood tissue sampling.

Further considerations In this study, the mycobiome of the canes (AW) is composed principally by endophytes and saprobes, with the exception of pathogens P. chlamydospora and Diaporthe sp., whose presence did not induce the appearance of wood symptoms. Concerning the former, this observation suggests that this fungus may fall along a continuum ‘from mutualist to saprophyte or latent pathogen’ (Wearn et al., 2012), as also proposed by other authors (Hofstetter et al., 2012; Rumbos and Rumbou, 2001), and whose pathogenicity is triggered by external factors. Interestingly, also Fomitiporia sp. was often identified in woody tissue that did not present the typical white rot symptom. This fungus may live asymptomatically in wood when found in low abundances, and produce white rot when its presence increases considerably. Hypothetically, the wood cores showing this symptom (15% of the total) may be the same in which Fomitiporia sp. was present in a RA > 35% (17.5%). The presence of P. chlamydospora, and that of other wood pathogens, in vineyards around the world might be currently blatantly underestimated due to the elusiveness of internal symptoms. The Almotivo vineyard (this study) presented an incidence of foliar symptoms manifestation of ca. 1%, for three consecutive years, nevertheless, 100% of the sampled plants were colonized by both P. chlamydospora (in PW and AW) and Fomitiporia sp. (in PW). It is already known that foliar symptoms are not a reliable parameter to assess the health status of a vineyard (Pollastro et al., 2000), although it is the only one currently employed. It is of utmost importance to develop tools that allow vine growers to assess the real extent of infections in the wood, and apply appropriate control measures.

Conclusion The characterization of the grapevine wood mycobiome, using NGS, in a vineyard affected by esca proper is an important step that lays the foundations for future studies to compare microbial community structures of vineyards affected by esca or other GTDs. Some parameters that may influence the mycobiome composition and which may be of interest to investigate upon are: the flora surrounding the vineyard, the climatic conditions and seasonality, geographical location and year. Moreover, useful comparisons can be made between cultivars and vineyard management strategies (e.g. conventional, organic, biodynamic). Eventually, a meta-analysis of the mycobiome that takes into account several of these parameters may reveal a pattern that could elucidate some of the obscure points that still prevent a full understanding of the etiology of this disease complex.

Conflict of Interest The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Author contributions GDF: conceived the study, performed field sampling, performed wet-lab work, interpreted and discussed the results, wrote the manuscript; AG: performed wet-lab work, performed bio-informatics analyses, interpreted the results, wrote the manuscript; MRA: performed statistical analyses and data visualization; HO, LHH, RBF: supervised the study, reviewed the manuscript.

Funding This study was funded by the Horizon 2020 Programme of the European Commission within -Curie Innovative Training Network “MicroWine” (grant number 643063 to G. Del Frari, A. Gobbi, L. Hestbjerg Hansen and R. Boavida Ferreira) and by Portuguesethe Marie Skłodowskanational fund FCT Unit funding UID/AGR/04129/2013 (LEAF).

Acknowledgements The authors would like to acknowledge João Costa, Inês Valente, David van der Kellen, Lúcia Gomes, Elisa Filippi for technical support; Pedro Talhinhas for advises; Professor Carlos Lopes for allowing the use of the Almotivo vineyard; Professor José Carlos Costa for identifying the flora adjoining the Almotivo vineyard.

References Abdelfattah, A., Wisniewski, M., Droby, S., and Schena, L. (2016). Spatial and compositional variation in the fungal communities of organic and conventionally grown apple fruit at the consumer point-of-purchase. Hortic Res, 3. https://doi.org/10.1038/hortres.2016.47 Abou-Mansour, E., Couché, E., and Tabacchi, R. (2004). Do fungal naphthalenones have a role Phytopathol Mediterr, 43(1), 75–82. Ahmed, S. A., van de Sande, W. W. J., Stevens, D. A., Fahal, A., van Diepeningen, A. D., Menken, S.in B.the J., developmentet al. (2014). Revisionof esca symptoms? of agents of black-grain eumycetoma in the order Pleosporales. Persoonia, 33, 141–54. https://doi.org/10.3767/003158514X684744 Andolfi, A., Mugnai, L., Luque, J., Surico, G., Cimmino, A., and Evidente, A. (2011). Phytotoxins produced by fungi associated with grapevine trunk diseases. Toxins (Vol. 3). https://doi.org/10.3390/toxins3121569 Anthony, M. A., Frey, S. D., and Stinson, K. A. (2017). Fungal community homogenization, shift in dominant trophic guild, and appearance of novel taxa with biotic invasion. Ecosphere, 8(9). https://doi.org/10.1002/ecs2.1951 Aurand, J.M. (2017). State of the vitiviniculture world market. 38th OIV World Congress of Vine and Wine, 1–14. Bakhshi, M., and Arzanlou, M. (2017). Multigene phylogeny reveals a new species and novel records and hosts in the genus Ramularia from Iran. Mycol Prog, 16(7), 703–712. https://doi.org/10.1007/s11557-017-1308-y Bertsch C., Ramírez-Suero, M., Magnin-Robert, M., Larignon, P., Chong, J., Abou-Mansour, E., et al. (2013). Grapevine trunk diseases: complex and still poorly understood. Plant Pathology 62(2), 243–65. doi: 10.1111/j.1365-3059.2012.02674.x.

Bruez, E., Baumgartner, K., Bastien, S., Travadon, R., Guérin-Dubrana, L., and Rey, P. (2016). Various fungal communities colonise the functional wood tissues of old grapevines externally free from grapevine trunk disease symptoms. Aust J Grape Wine Res, 22(2), 288– 295. https://doi.org/10.1111/ajgw.12209 Bruez, E., Vallance, J., Gerbore, J., Lecomte, P., Da Costa, J. P., Guerin-Dubrana, L., et al. (2014). Analyses of the temporal dynamics of fungal communities colonizing the healthy wood tissues of esca leaf-symptomatic and asymptomatic vines. PLoS ONE, 9(5). https://doi.org/10.1371/journal.pone.0095928 Callahan B.J., McMurdie, P.J., and Holmes, S.P. (2017). Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. ISME Journal, 11(12), 2639–2643. https://doi.org/10.1038/ismej.2017.119 Caporaso G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., et al. (2010). QIIME allows analysis of high-throughput community sequencing data. Nature Methods, 7(5), 335–336. https://doi.org/10.1038/nmeth0510-335 Casieri, L., Hofstetter, V., Viret, O., and Gindro, K. (2009). Fungal communities living in the wood of different cultivars of young Vitis vinifera plants. Phytopathol Mediterr, 48(1), 73–83. Connell, L., and Staudigel, H. (2013). Fungal diversity in a dark oligotrophic volcanic ecosystem (DOVE) on Mount Erebus, Antarctica. Biology, 2(2), 798–809. https://doi.org/10.3390/biology2020798 Darrieutort, G., and Pascal, L. (2007). Evaluation of a trunk injection technique to control grapevine trunk diseases. Phytopathol Mediterr, 46, 50–57. https://doi.org/10.14601/-1853 Csardi, G., and Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, Complex Systems, 1695. http://igraph.org Dissanayake, A.J., Purahong, W., Wubet, T., Hyde, K.D., Zhang, W., Xu, H., et al. (2018). Direct comparison of culture-dependent and culture-independent molecular approaches reveal the diversity of fungal endophytic communities in stems of grapevine (Vitis vinifera). Fungal Diversity, 90(1), 85–107. https://doi.org/10.1007/s13225-018-0399-3 Dowle, M., and Srinivasan, A. (2017). Data.table: extension of data frame. https://CRAN.R- project.org/package=data.table. Dray, S., Blanchet, G., Borcard, D., Clappe, S., Guenard, G., Jombart, T., et al. (2018). adespatial: multivariate multiscale spatial analysis. R package version 0.1-1. https://CRAN.R- project.org/package=adespatial Edwards, J., Marchi, G., and Pascoe, I. G. (2001). Young esca in Australia. Phytopathol Mediterr, 40(3), 303–310. https://doi.org/10.14601/Phytopathol Edwards, J., and Pascoe, I. G. (2004). Occurrence of Phaeomoniella chlamydospora and Phaeoacremonium aleophilum associated with Petri disease and esca in Australian grapevines. Australas Plant Pathol, 33(2), 273–279. https://doi.org/10.1071/AP04016 Eskalen, A., Feliciano, A., and Gubler, W. (2007). Susceptibility of grapevine pruning wounds and symptom development in response to infection by Phaeoacremonium aleophilum and Phaeomoniella chlamydospora. Plant Dis, 91(9), 1100–1104. https://doi.org/Doi 10.1094/Pdis-91-9-1100

Evidente, A., Sparapano, L., Andolfi, A., and Bruno, G. (2000). Two naphthalenone pentaketides from liquid cultures of Phaeoacremonium aleophilum, a fungus associated with esca of grapevine. Phytopathol Mediterr, 39(1), 162–168. https://doi.org/10.14601/Phytopathol_Mediterr-1559 Feld, L., Nielsen, T. K., Hansen, L. H., Aamand, J., and Albers, C. N. (2015). Establishment of bacterial herbicide degraders in a rapid sand filter for bioremediation of phenoxypropionate-polluted groundwater. Appl Environ Microbiol, 82(3), 878–887. https://doi.org/10.1128/AEM.02600-15 Fontaine, F., Gramaje, D., Armengol, J., Smart, R., Nagy, Z.A., Borgo, M., et al. (2016). Grapevine trunk diseases. A review. OIV publications (December). Available at: http://www.oiv.int/public/medias/4650/trunk-diseases-oiv-2016.pdf. Foster, Z., Sharpton, T., and Grünwald, N. (2017). Metacoder: An R package for visualization and manipulation of community taxonomic diversity data. PLOS Computational Biology, 13(2). doi: 10.1371/journal.pcbi.1005404 Ganter, P. F., Starmer, W. T., Lachance, M. A., and Phaff, H. J. (1986). Yeast communities from host plants and associated Drosophila in southern arizona: new isolations and analysis of the relative importance of hosts and vectors on comunity composition. Oecologia, 70(3), 386– 392. https://doi.org/10.1007/BF00379501 Gaylarde, C., Ogawa, A., Beech, I., Kowalski, M., and Baptista-Neto, J. A. (2017). Analysis of dark crusts on the church of Nossa Senhora do Carmo in Rio de Janeiro, Brazil, using chemical, microscope and metabarcoding microbial identification techniques. Int Biodeterior Biodegradation, 117(January), 60–67. https://doi.org/10.1016/j.ibiod.2016.11.028 Gobbi A., Santini, R., Filippi, E., Ellegaard-Jensen, L., Jacobsen, C.S., and L. H. Hansen (2018). Quantitative and qualitative evaluation of the impact of the G2 enhancer, bead sizes and lysing tubes on the bacterial community composition during DNA extraction from recalcitrant soil core samples based on community sequencing and qPCR. https://doi.org/10.1101/365395 González, A. E., Martínez, A. T., Almendros, G., and Grinbergs, J. (1989). A study of yeasts during the delignification and fungal transformation of wood into cattle feed in Chilean rain forest. Antonie van Leeuwenhoek, 55(3), 221–236. https://doi.org/10.1007/BF00393851 González, V., Sánchez-Torres, P., Hinarejos, R., and Tuset, J. J. (2009). Inonotus hispidus fruiting bodies on grapevines with esca symptoms in mediterranean areas of Spain. J Plant Pathol, 91(2), 465–468. González, V., and Tello, M. L. (2011). The endophytic mycota associated with Vitis vinifera in central Spain. Fungal Divers, 47, 29–42. https://doi.org/10.1007/s13225-010-0073-x Gramaje, D., García-Jiménez, J., and Armengol, J. (2010). Field evaluation of grapevine rootstocks inoculated with fungi associated with petri disease and esca. Am J Enol Vitic, 61(4), 512–520. https://doi.org/10.5344/ajev.2010.10021 Gramaje, D., Urbez-Torres, J. R., and Sosnowski, M. R. (2018). Managing grapevine trunk diseases with respect to etiology and epidemiology: current strategies and future prospects. Plant Dis, 102(1), 12–39. https://doi.org/10.1094/PDIS-04-17-0512-FE

Halleen, F., Crous, P. W., and Petrini, O. (2003). Fungi associated with healthy grapevine cuttings in nurseries, with special reference to pathogens involved in the decline of young vines. Australas Plant Pathol, 32(1), 47–52. https://doi.org/10.1071/AP02062 endophytes in ash shoots – diversity and inhibition of Hymenoscyphus fraxineus. Balt For, 23(1),Haňáčková, 89–106. Z., Havrdová, L., Černý, K., Zahradník, D., and Koukol, O. (2017). Fungal Haynes, J. D. (2016). The developmental morphology of Anthostoma gastrinum. Mycologia, 61(3), 518–525. Hofstetter, V., Buyck, B., Croll, D., Viret, O., Couloux, A., and Gindro, K. (2012). What if esca Fungal Divers, 54, 51–67. https://doi.org/10.1007/s13225-012-0171-z Jaklitsch,disease ofW. grapevineM., Fournier, were J., Rogers, not aJ. D., fungal and Voglmayr, disease? H. (2014). Phylogenetic and taxonomic revision of Lopadostoma. Persoonia, 32, 52–82. https://doi.org/10.3767/003158514X679272 Jara, C., Laurie, V. F., Mas, A., and Romero, J. (2016). Microbial terroir in chilean valleys: diversity of non-conventional yeast. Front Microbiol, 7(MAY). https://doi.org/10.3389/fmicb.2016.00663 Jayawardena, R.S., Purahong, W., Zhang, W., Wubet, T., Li, X., Liu, M., et al. (2018). Biodiversity of fungi on Vitis vinifera L. revealed by traditional and high-resolution culture-independent approaches. Fungal Diversity, 4. https://doi.org/10.1007/s13225-018-0398-4 Kane, D. F., Tam, W. Y., and Jones, E. B. G. (2002). Fungi colonising and sporulating on submerged wood in the River Severn, UK. Fungal Divers, 10, 45–55.

Biatriospora (Ascomycota: Pleosporales) is an ecologically diverse genus including Kolařík,facultative M., marine Spakowicz, fungi and D.J., endophytes Gazis, R., Shaw, with J.,biotechnological Kubátová, A., Nováková,potential. Plant A., et Systematics al. (2017). and Evolution, 303(1), 35–50. https://doi.org/10.1007/s00606-016-1350-2 Kovalchuk, A., Mukrimin, M., Zeng, Z., Raffaello, T., Liu, M., Kasanen, R., et al. (2018). Mycobiome analysis of asymptomatic and symptomatic Norway spruce trees naturally infected by the conifer pathogens Heterobasidion spp. Environmental Microbiology Reports, 10(5), 532-41. https://doi.org/10.1111/1758-2229.12654 Larignon, P., Fontaine, F., Farine, S., Clément, C., and Bertsch, C. (2009). Esca et Black Dead C R Biol, 332(9), 765–783. https://doi.org/10.1016/j.crvi.2009.05.005 Arm :McMurdie, deux P.J.,acteurs and majeursHolmes, S.des (2013). maladies Phyloseq: du bois an chez R package la vigne. for reproducible interactive analysis and graphics of microbiome census data. PloS One 8 (4). Public Library of Science: e61217. McMurdie, P.J., and Paulson J.N. (2018). biomformat: an interface package for the BIOM file format. https://github.com/joey711/biomformat/, http://biom-format.org/ Mehrabi, M., and Hemmati, R. (2015). Two new records of Lopadostoma for mycobiota of Iran. Mycol Iran, 2(1), 59–64. Retrieved from http://mi.iranjournals.ir/article_14225.html Middelhoven, W. J. (2003). The yeast flora of the coast redwood, Sequoia sempervirens. Folia Microbiol, 48(3), 361–362. https://doi.org/10.1007/BF02931367

Miguel, P. S. B., Delvaux, J. C., de Oliveira, M. N. V., Moreira, B.C., Borges A. C., Totola, M. R., et al. (2017). Diversity and distribution of the endophytic fungal community in eucalyptus leaves. Afr J Microbiol Res, 11(3), 92–105. https://doi.org/10.5897/AJMR2016.8353 Mondello, V., Songy, A., Battiston, E., Pinto, C., Coppin, C., Trotel-Aziz, P., et al. (2017). Grapevine trunk diseases: a review of fifteen years of trials for their control with chemicals and biocontrol agents. Plant Dis, PDIS-08-17-1181-FE. https://doi.org/10.1094/PDIS-08- 17-1181-FE Morgan, H. H., du Toit, M., and Setati, M. E. (2017). The grapevine and wine microbiome: Insights from high-throughput amplicon sequencing. Front Microbiol, 8(MAY). https://doi.org/10.3389/fmicb.2017.00820 Mostert, L., Halleen, F., Fourie, P., and Crous, P. W. (2006). A review of Phaeo-acremonium species involved in Petri disease and esca of grapevines. Phytopathol Mediterr, 45(SUPPL. 1), 12–29. Mugnai, L., Graniti, A., and Surico, G. (1999). Esca (black measles) and brown wood- streaking: two old and elusive diseases of grapevines. Plant Dis, 83(5), 404–418. https://doi.org/10.1094/PDIS.1999.83.5.404 Nilsson, R.H., Taylor, A.F.S., Bates, S.T., Thomas, D., Bengtsson-Palme, J., Callaghan, T.M., et al. (2013). Towards a unified paradigm for sequence-based identification of fungi. Molecular Ecology, 22(21), 5271–5277. https://doi.org/10.1111/mec.12481 Oksanen, J., Kindt, R., Legendre, P., O’Hara, B., Stevens, M.H.H., Oksanen, M.J., et al. (2007). The Vegan package. Community Ecology Package, 10, 631–37. Pancher, M., Ceol, M., Corneo, P. E., Longa, C. M. O., Yousaf, S., Pertot, I., et al. (2012). Fungal endophytic communities in grapevines (Vitis vinifera L.) respond to crop management. Appl Environ Microbiol, 78(12), 4308–4317. https://doi.org/10.1128/AEM.07655-11 Peay, K., Kennedy, P.G., and Talbot, J. (2016). Dimensions of biodiverisity in the Earth mycobiome. Nature, 14 (7), 434-447. 10.1038/nrmicro.2016.59 Phaff, H. J., Vaughan-Martini, A., and Starmer, W. T. (1998). Debaryomyces prosopidis sp. nov., a yeast from exudates of mesquite trees. Int J Syst Bacteriol, 48, 1419–1424. https://doi.org/10.1099/00207713-48-4-1419 Pinto, C., Pinho, D., Sousa, S., Pinheiro, M., Egas, C., and Gomes, A. C. (2014). Unravelling the diversity of grapevine microbiome. PLoS ONE, 9(1). https://doi.org/10.1371/journal.pone.0085622 Pollastro, S., Dongiovanni, C., Abbatecola, A., and Faretra, F. (2000). Observations on the fungi associated with esca and on spatial distribution of esca-symptomatic plants in Apulian (Italy) vineyards. Phytopathol Mediterr, 39(1), 206–210. Qi, F., Jing, T., and Zhan, Y. (2012). Characterization of endophytic fungi from Acer ginnala Maxim. in an artificial plantation: media effect and tissue-dependent variation. PLoS ONE, 7(10). https://doi.org/10.1371/journal.pone.0046785 Rumbos, I., and Rumbou, A. (2001). Fungi associated with esca and young grapevine decline in Greece. Phytopathol Mediterr, 40(3), 330–335. https://doi.org/10.1002/pssa.2210290216

Schoch, C. L., and Seifert, K. A. (2012). Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for fungi. Proc Natl Acad Sci U S A, 109(27), E1812–E1812. https://doi.org/10.1073/pnas.1207508109 Singh, D. K., Sharma, V. K., Kumar, J., Mishra, A., Verma, S. K., Sieber, T. N., et al. (2017). Diversity of endophytic mycobiota of tropical tree Tectona grandis Linn.f.: spatiotemporal and tissue type effects. Sci Rep, 7(1), 1–14. https://doi.org/10.1038/s41598-017-03933-0 Singh, P., Santoni, S., This, P., and Péros, J.-P. (2018). Genotype-environment interaction shapes the microbial assemblage in grapevine’s phyllosphere and carposphere: an NGS approach. Microorganisms, 6(4), 96. https://doi.org/10.3390/microorganisms6040096 Sommer, B., Overy, D. P., and Kerr, R. G. (2015). Identification and characterization of lipases from Malassezia restricta, a causative agent of dandruff. FEMS Yeast Res, 15(7), 1–8. https://doi.org/10.1093/femsyr/fov078 Sparapano, L., Bruno, G., Ciccarone, C., and Graniti, A. (2000). Infection of grapevines by some fungi associated with esca. I. Fomitiporia punctata as a wood-rot inducer. Phytopathol Mediterr, 39(1), 46–52. https://doi.org/10.14601/PHYTOPATHOL_MEDITERR-1542 Sparapano, L., Bruno, G., and Graniti, A. (2000). Effects on plants of metabolites produced in culture by Phaeoacremonium chlamydosporum, P. aleophilum and Fomitiporia punctata. Phytopathol Mediterr, 39(1), 169–177. https://doi.org/10.14601/phytopathol_mediterr- 1535 Sparapano, L., Bruno, G., and Graniti, A. (2001). Three-year observation of grapevines cross- inoculated with esca-associated fungi. Phytopathol Mediterr, 40(3), 376–386. Suetrong, S., Hyde, K.D., Zhang, Y., Bahkali, A.H., and Jones, G. (2011). Trematosphaeriaceae fam. nov. (Dothideomycetes, Ascomycota). Cryptogamie, Mycologie, 32(4), 343–358. Surico, G., Mugnai, L., and Marchi, G. (2008). The esca disease complex. Integrated Management of Diseases Caused by Fungi, Phytoplasma and Bacteria, 119–136. Surico, G. (2009). Towards a redefinition of the diseases within the esca complex of grapevine. Phytopathol Mediterr, 48(1), 5–10. Tanaka, K., Harada, Y., and Barr, M. E. (2005). Trematosphaeria: taxonomic concepts, new species from Japan and key to species. Fungal Divers, 19, 145–156. Thambugala, K. M., Hyde, K. D., Tanaka, K., Tian, Q., Wanasinghe, D. N., Ariyawansa, H. A., et al. (2015). Towards a natural classification and backbone tree for Lophiostomataceae, Floricolaceae, and Amorosiaceae fam. nov. Fungal Divers, 74(1), 199–266. https://doi.org/10.1007/s13225-015-0348-3 Travadon, R., Rolshausen, P. E., Gubler, W. D., Cadle-Davidson, L., and Baumgartner, K. (2013). Susceptibility of cultivated and wild Vitis spp. to wood infection by fungal trunk pathogens. Plant Dis, 97(12), 1529–1536. https://doi.org/10.1094/PDIS-05-13-0525-RE Travadon, R., Lecomte, P., Diarra, B., Lawrence, D. P., Renault, D., Ojeda, H, et al. (2016). Grapevine pruning systems and cultivars influence the diversity of wood-colonizing fungi. Fungal Ecol, 24, 82–93. https://doi.org/10.1016/j.funeco.2016.09.003 Varela, C., and Borneman, A. R. (2017). Yeasts found in vineyards and wineries. Yeast, 34(3), 111–128. https://doi.org/10.1002/yea.3219

Wearn, J. A., Sutton, B. C., Morley, N. J., and Gange, A. C. (2012). Species and organ specificity of fungal endophytes in herbaceous grassland plants. J Ecol, 100(5), 1085–1092. https://doi.org/10.1111/j.1365-2745.2012.01997.x Wickham, H. (2016). Ggplot2: Elegant Graphics for Data Analysis. Springer. Xing, X. K., Chen, J., Xu, M. J., Lin, W. H., and Guo, S. X. (2011). Fungal endophytes associated with Sonneratia (Sonneratiaceae) mangrove plants on the south coast of China. For Pathol, 41(4), 334–340. https://doi.org/10.1111/j.1439-0329.2010.00683.x Zanzotto, A., Autiero, F., Bellotto, D., Dal Cortivo, G., Lucchetta, G., and Borgo, M. (2007). Occurrence of Phaeoacremonium spp. and Phaeomoniella chlamydospora in grape propagation materials and young grapevines. Eur J Plant Pathol, 119(2), 183–192. https://doi.org/10.1007/s10658-007-9160-6 Zarraonaindia, I., Owens, S. M., Weisenhorn, P., West, K., Hampton-marcell, J., Lax, S et al. (2015). The soil microbiome influences grapevine-associated microbiota. MBio, 6(2), 1–10. https://doi.org/10.1128/mBio.02527-14.Editor Zhou, Y. H., Zhang, M., Zhu, R.-X., Zhang, J.-Z., Xie, F., Li, X.-B., et al. (2016). Heptaketides from an endolichenic fungus Biatriospora sp. and their antifungal activity. J Nat Prod, 79(9), 2149– 2157. https://doi.org/10.1021/acs.jnatprod.5b00998

MANUSCRIPT 5

Assessing the impact of plant genetic diversity in shaping the microbial community structure of Vitis vinifera phyllosphere in the Mediterranean

This paper show the effect of grape cultivar on shaping and selecting the phyllosphere microbial community.The experimental setup includes samples collected in a field-station in Montpellier, which account for 279 different cultivar within the same vineyard. This paper was published on Frontiers in Life Science. This paper is presented and discussed in this thesis since it is complementary to Manuscript 4 and Manuscript 6 and assess the microbial community living on leaves of Vitis vinifera. In this way, I could contribute to characterize also the phyllosphere microbiota and how it is affected by the different cultivar and further discuss it within Manuscript 6. My contribution to this paper is minor since I took part in the early stages of the bioinformatics analyses and I finally revised the draft to ensure a correct interpretation of the data.

Assessing the impact of plant genetic diversity in shaping the microbial community structure of Vitis vinifera phyllosphere in the Mediterranean

Prashant Singha*, Alex Gobbib, Lars H Hansenb, Sylvain Santonia, Patrice Thisa, Jean-Pierre Pérosa aAGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France, bDepartment of Environmental Science, Aarhus University, Roskilde, Copenhagen, Denmark

Key words: Genetic diversity, Grapevine, Microbiome, Phyllosphere.

Abstract The aerial surface of the plant (phyllosphere) is the habitat of complex microbial communities and the structure of this microbiome may be dependent on plant genetic factors, local environment or interactions between them. In this study, we explored the microbial diversity present in the phyllosphere of a very diverse set of grapevine cultivars representing the three genetic pools of the species, grown on an experimental plot at Montpellier (French Mediterranean region). We assessed microbiome variation in the phyllosphere using amplicon sequencing of the 16S rRNA gene and of the internal transcribed spacer (ITS), according to the grapevine genetic pools or cultivars, and organs (i.e. leaves and grape berries). The observed microbiome was complex; out of 542 bacterial genera; Pseudomonas, Pantoea, Sphingomonas, and Acinetobacter were the most abundant and almost ubiquitously present across the samples, and out of 267 fungal genera; Aureobasidium, Alternaria, Mycosphaerella and Aspergillus were most represented. Our results illustrated that the microbial taxa were almost uniformly distributed among the genetic pools and only a few cultivar or genetic pool level differences were found, but a very clear differential taxa abundance was found between the leaf and berry samples. Some genus level associations were also observed with certain genetic pools.

Introduction Vitis vinifera (subsp. vinifera L.), is the main grape species grown for fruit and wine production over the world. It is a natural host of a wide variety of prokaryotic and eukaryotic microorganisms that interact with grapevine, having either beneficial or phytopathogenic effects (Schulz et al. 1999). These microbes also play a major role in fruit yield, grape quality and, ultimately, in the pattern of grape fermentation and wine production (Compant et al. 2011; Bokulich et al. 2016; Belda et al. 2017). The grapevine phyllosphere is rather less extensively studied as compared to the

rhizosphere and endosphere (Vorholt 2012). The phyllosphere (in general) also harbors complex microbial communities involved in many crucial functions such as nitrogen fixation (Jones 1970), carbon sequestration, degradation of pesticides and organic pollutants (Brandl et al. 2001; Kishore et al. 2005; Bulgarelli et al. 2013). It is a significant and ubiquitous habitat for microorganisms and also an open system that microbes can invade by migration from the atmosphere, soil, other plants and insects (Lugtenberg et al. 2002; Williams et al. 2007). But microbial populations on phyllosphere are also known to live and thrive under harsh environmental factors such as UV radiation, air pollution, temperature fluctuations, water and nutrient availability (Andrews and Harris. 2000; Lugtenberg et al. 2002; Müller & Ruppel. 2014). A very fundamental question in microbial ecology is what drivers shape the microbiome on leaf age have been identified as important drivers (Kadivar & Stapleton 2003, Ikeda et al. 2011).phyllosphere? Some reports Environmental on grapevine conditions phyllosphere at the particular also suggested location that and the biotic bacterial factors and such fungal as communities of the phyllosphere are minimally affected by the chemical and biological treatments tested, and they mainly differed according to the grapevine location (Perazzolli et al. 2014; Bokulich et al. 2014). Few authors suggested that in the tropical and temperate forests, the plant genotype is a major driver of the composition of the bacterial communities in the phyllosphere (Lambais et al. 2006; Redford et al. 2010). Another study on Arabidopsis thaliana also illustrated that the plant genetic factors may influence the community composition of the phyllosphere (Bodenhausen et al. 2014). Until recently, there have been no scientific reports available, analyzing the effect of grapevine genetic factors on the microbiome structure of the phyllosphere. Considering that the microbial diversity present in phyllosphere could be relevant for plant health (Lugtenberg et al. 2002; Compant et al. 2005; Vorholt 2012), a better understanding of how the microbiota associated with grapevine phyllosphere is structured according to the grapevine genetic diversity available at a particular geographic location may provide unexpected opportunities to develop innovative and natural biocontrol methods or phytostimulators against plant pathogen or new breeding scheme for the creation of innovative resistant cultivars. As a first step towards this goal, we explored the bacterial and fungal diversity in the phyllosphere of leaf and berry samples from a set of rather diverse grapevine cultivars that belongs to the three genetic pools of the cultivated grapevine (Nicolas et al. 2016), in the French Mediterranean region. These experiments led us to address two major questions: i) What microbial diversity is present in the phyllosphere of our Mediterranean vineyards and ii) how this microbiome structure itself according to the grapevine genetic diversity and plant organs.

Materials and Methods Sample collection and DNA extraction A total of 279 grapevine cultivars were grown in a completely randomized block design at

Le Chapitre INRA Villeneneuve-Les-Maguelonne field station near Montpellier (French Mediterranean region). A panel of cultivars representing three genetic pools (western Europe, WW; from eastern Europe, WE; and table grape, TE) was constructed for genome- wide association studies while minimizing relatedness and retaining the main founders of modern cultivated grapevine to optimize the genetic diversity (Nicolas et al. 2016). Nine cultivars were randomly selected from each genetic pool and leaf (with sizes +1 to +4) samples were taken from four to five plants of each cultivar. Leaf samples were taken before spraying of pesticides; each plant had the same age. We collected the leaf samples in the Spring (mid-May 2016) and at the beginning of harvesting season, we also collected samples of berries from the same cultivars. A metadata table containing all the information about the samples and replicates can be downloaded from the GitHub repository (https://github.com/PrashINRA/MetaData_GrapevinePhyllo.git). All samples were washed with an isotonic solution of sodium chloride (0.15M) containing 0.01% Tween 20 using a horizontal shaker for 1hr at 100 RPM. Afterward, samples were given an ultrasonic bath for 7-10 min using Ultrasonic Cleaner (Branson 5510) for maximum recovery of microbes from the sample surface. The remaining solution was centrifuged at 4,000g and microbial pellets obtained in a 2-ml Eppendorf tube were collected and stored at -20°C. DNA was extracted from the pellets using the Meta-G-Nome Isolation Kit (Epicentre, Illumina) following the manufacturer’s instructions.

PCR amplifications and MiSeQ library preparation. To access bacterial communities, the V4 region of the 16S ribosomal gene was amplified using primers 515F and 806R (Caporaso et al. 2011). Fungal community diversity and abundance were accessed using modified ITS9 and ITS4 primers targeting the ITS2 region (Blaadid et al. 2013; Lundberg et al. 2013). Two-step PCR was performed to prepare sequencing libraries. PCR1 was designed to perform amplification of the target regions and to add Illumina Nextera transposase sequence to the amplicons. Primers from Illumina kit for dual indexing of the amplicons was used in PCR2. Both forward and reverse primers for PCR1 were amended with frameshift (FS) sequences in their 5’ overhang to improve sequence diversity and overall read quality (Souza et al. 2016). PCR1 was performed in 20 L reactions with 30ng of sample DNA using the Advantage 2 PCR kit (Clontech, 639206). PNA PCR clamps were also used to reduce host organelle contamination (Souza et al. 2016). Theμ same PCR1 was performed for ITS amplification except for the step of PNA annealing. Amplicon replicates were pooled, purified using Agencourt AMPure XP beads (Beckman Coulter) at a bead-to-DNA ratio of 0.7:1, resuspended in 30 L MilliQ water and evaluated in agarose gels. In PCR2, each cleaned PCR1 product within the same sample received a unique combination of forward and reverse primers (respectively,μ N7 and S5 Illumina dual index oligos). Afterward, samples were again cleaned using AmPure XP magnetic beads, pooled in equimolar concentrations and sequenced using 2x250bp MiSeq v2 sequencing (Illumina Inc., San Diego, CA, USA).

Data processing and analysis All RAW data files were imported and processed in the R-environment (R Core Team, 2017) using various codes and inbuilt functions available in different R-packages. The whole dataset for 16S and ITS amplicon sequences were uploaded and available at the institutional server http://agap-ng6.supagro.inra.fr/inra. Data processing and further analysis were done in two phases. In phase-I, raw data files from both the datasets were filtered and trimmed using the fastqPairedFilter() function of the dada2 package (Callahan et al. 2016) and bases with low- quality scores were discarded. These filtered files were then processed using Divisive Amplicon Denoising Algorithm (DADA) pipeline which included the steps of dereplication, core denoising algorithm and merging of the base pairs. Merging function provided global ends-free alignment between paired forward and reverse reads, and merged them together if they overlapped exactly and a table for ribosomal sequence variants (RSVs, a higher analog of operational taxonomic units-OTUs) was constructed, which records the number of times each amplicon sequence variant was observed in each sample. DADA infers sample sequences exactly and resolves differences of as little as one nucleotide (Callahan et al. 2016). Chimeras were removed using the removeBimeraDenovo() function of the dada2 package. OTU sequences were assigned a taxonomy using the RDP classifier and the UNITE database (Wang et al. 2007; Abarenkov et al. 2010) with assignTaxonomy() function of the same dada2 package for 16S and ITS sequences, respectively. Then, at the end of phase-I data processing, a phyloseq data object was created to initiate phase-II data analysis. In phase-II, a phylogenetic tree for the taxa was constructed using the R-package ape (Paradis et al. 2004) and merged with the phyloseq data object of phase-I. Unassigned taxa and singletons were also removed using the subset_taxa() and prune_taxa() functions of the phyloseq package in R (McMurdie & Holmes 2013). This data object was then used to calculate microbial abundances, , diversity analysis and for other statistical tests using various functions in the phyloseq and vegan packages (McMurdie & Holmes 2013; Oksanen et al. 2017). α β Prevalence plot for taxa abundances was made using ggplot() function of the ggplot2 package (Wickham 2009) using the entire 16S and ITS data-sets. Chao1 estimates of diversity (Chao 1987) was measured within sample categories using estimate_richness() α function of the phyloseq package. Relative abundances of microbial genera were also plotted using the ggplot2 package (Wickham 2009) on the above data, which were also rarified to even depth of 5,000 reads per sample. Multidimensional scaling (MDS, also known as principal coordinate analysis; PCoA) was performed using Bray-Curtis dissimilarity matrix between samples and visualized by using their base functions in the phyloseq package (McMurdie & Holmes 2013).

Statistical Analysis

We analyzed all the data from 16S and ITS amplifications separately in R version 3.3.4 using the dada2, phyloseq and vegan packages. CRAN packages plyr and ggplot2 (Wickham 2009; Wickham 2011) were also used to draw the figures. We assessed the statistical significance (P < 0.05) throughout and whenever necessary, we adjusted P-values for multiple comparisons according to the Benjamini and Hochberg method to control False Discovery Rate (Benjamini & Hochberg 1995) while performing multiple testing on taxa abundance according to sample categories. We performed an analysis of variance (ANOVA) among sample categories while measuring the Chao1 estimates of -diversity. Stratified permutational multivariate analysis of variance (PERMANOVA) with 999 permutations was conducted on all principal coordinates obtained during PCoA with theα adonis() function of the vegan package, to observe the statistical significance of clusters according to the sample categories. Linear regression (parametric test), and Wilcoxon (Non-parametric) test were performed on taxa abundances against genetic pools using their base functions in R (Myles and Douglas 1973; Bauer 1972).

Results Quality assessment of the data Raw demultiplexed sequence data files were generated using high-throughput amplicon sequencing of 16S and ITS ribosomal RNA genes and the number of reads per sample has been taken into account to obtain the depth of the sequencing. Rarefaction curves (number of reads vs number of OTUs) from both the datasets (Fig.1A, B) began to level off for most of the samples suggesting a good quality and coverage of both the data-sets and thus we can assume that the microbial communities were reasonably characterized with the sampling effort.

Microbial diversity in the phyllosphere A total of 5,772,135 16S and 3,807,033 ITS amplicon sequences were generated from 80 samples covering two sample types (or organ types) and three genetic pools, respectively. We identified 12,875 unique bacterial and 3,413 unique fungal OTUs, in our phyllosphere samples. After removal of unassigned taxa (genus level assignment) and singletons, 6,017 unique bacterial and 2,075 unique fungal OTUs belonging to 542 bacterial and 267 fungal genera were recovered. Phylum level classification of bacterial and fungal communities was also identified (Fig.2A, B) using the feature prevalence of entire16S and ITS data-sets, which is the number of samples in which a taxon appeared at least once. For example, the phylum Ignavibacteriae had only five unique taxa with the cumulative abundance of thirty-eight and its presence is observed in less than 10% of the samples. Bacterial and fungal communities were heavily dominated by phylum Proteobacteria (relative abundance > 55%) and Ascomycota (> 65%) respectively.

Effects of genetic diversity on microbial communities in the phyllosphere Multiple testing on each of bacterial 6017 and fungal 2075 OTUs (with adjusted p-values:adjp and controlling false discovery rates) was performed according to cultivars and genetic pools and apart from two bacterial taxon (OTU1309, genus: Gemmatimonas, adjp = 0.0209, FDR = 0.06017 and OTU120, genus: Hymenobacter, adjp = 0.036, FDR = 0.05), and one fungal taxon (OTU63, genus: Penicillium, adjp = 0.02, FDR = 0.028), which were differentially abundant between WW, and WE, we did not recover any taxa whose abundance is significantly different (statistically) among the genetic pools. Relative abundances for the twenty most abundant genera were plotted as well for each cultivar within their genetic pools (Fig.3A, B) and microbial genera were quite uniformly abundant among the three genetic pools. This pattern was also the same when we analyzed the abundances in leaf and berry samples individually within these three genetic pools (Fig.4A, B), except for few cultivar level differences (e.g- bacterial genus Vagococcus in the cultivars of TE and fungal genus Pichia in the cultivars of WW genetic pool). To test the association of these genera with genetic pools, we performed a linear regression for abundances of these genera against genetic pools (parametric test). As the Pichia also seems more abundant in the phyllosphere of berries (Fig. 4B), we also added this as confounders to the regression and the results indicated a highly significant association of these genera to TE and WW genetic pools, respectively (Table 1&2). We also observed that the abundance data for these genera were not normally distributed and therefore performed a nonparametric test (Wilcoxon rank sum test), that confirmed the association. The Chao1 estimator of alpha diversity was also measured and plotted according to the genetic pools (Fig.3C, D) and again we did not observe a very significant genetic pool wise differences in these estimates (ANOVA, for 16S data: Chao1, P = 0.033; for ITS data: Chao1, P = 0.041). Microbial community structure assemblages among the three genetic pools were also compared using PCoA to look for the genetic pool wise patterns of microbiota present in the phyllosphere. Taxa in both the PCoA plot (Fig. 3E, F) were clustered together (PERMANOVA, for 16S data: at F = 0.971, R2 = 0.285, P = 0.408; for ITS data: at F = 0.991, R2 = 0.172, P = 0.394), which also indicated the impact of genetic diversity is less evident. Results were the same when PCoA was performed on the data-sets grouped within 27 grapevine cultivars (Supplementary data S1).

Effect of organs on phyllosphere microbiome Multiple testing on taxa abundances in the phyllosphere of leaves and berries gave 17 bacterial and 33 fungal OTUs whose abundance was significantly different between these two organs. The data revealed the organ-specific patterns of phyllosphere microbiota in these grapevine cultivars. Table 3 and 4 are provided for 16S and ITS data respectively to display various bacterial and fungal OTUs (along with their respective genera) with their false discovery rates (FDRs) and adjusted p-values. According to the corrected p-values and

FDRs, 5 bacterial (e.g- Pseudomonas and Pantoea. adjusted P-value; adjp = 0.0038 & FDR = 0.00118) and 31 fungal genera (e.g- Aspergillus and Mycosphaerella. adjp = 0.0005 & FDR = 0.000129) were most differed between leaf and berries were, respectively. Relative microbial abundances for top twenty taxa was also calculated on leaf and berry samples (also grouped in genetic pools; Fig.4A, B) and differential abundances on both sample type were clearly visible. Leaf phyllosphere was heavily occupied by bacterial and fungal genera of Pseudomonas and Pantoea & Aureobasidium, Mycosphaerella, respectively. On the other hand, berry surfaces mainly comprised of bacterial genera of Acinetobacter and Sphingomonas & with fungal genera of Aureobasidium, Aspergillus and Pichia. To investigate the influence of leaves and berries, we also compared Chao1 estimates of alpha diversity between leaf and berry samples (Fig. 4C, D) and these estimates were also significantly different ( ANOVA, for 16S data: Chao1, P = 0.007; for ITS data: Chao1, P = 4.53e- 08). PCoA also indicated the same as it identified clear, separate clusters (Fig. 4E, F) corresponding to both organs (PERMANOVA; for 16S data: at F = 45.384, R2 = 4.121, P = 0.001; for ITS data: at F = 48.306, R2 = 2.539, P = 0.001).

Discussion Our analysis based on high throughput 16S and ITS profiling identifies the presence of complex microbiota in the phyllosphere of leaves and fruits (berries) of grapevine cultivars grown in our Mediterranean vineyard and it is dominated by bacterial genera of Pseudomonas, Sphingomonas, Enterobacter and the fungal genera of Aureobasidium, Alternaria, Cladosporium, respectively which is concordant with the findings of other grapevine related studies (Zarraonaindia et al. 2015 & Zhang et al. 2017). High relative abundances of some other microbial genera such as Pantoea and Mycosphaerella have also been identified in this study, which has been reported in few grapevine cultivars before as endophytes (Bell et al. 1995; Baldan et al. 2015). This is not uncommon as epiphytes and endophytes are separated by a thin boundary between their habitats and due to vertical and horizontal microbial transfers (Frank et al. 2017) sharing of the major chunk of OTUs are inevitable (Bodenhausen et al. 2013). The bacterial genus Vagococcus has also not been widely reported in plants by research communities except rhizosphere and phyllosphere of Rice (Mwajita et al. 2013). Hence, the specific abundance of this genus must be further identified to have the preliminary view of its functionality in the Mediterranean vineyards. The reason for its abundance could be the interaction between plant genetic factors and environmental conditions at this specific geographic location of the vineyard. Epiphytes (the phyllosphere microbes) associated with grapevine have been suggested to originate from soil, but are distinct from those in the rhizosphere microbiome (Zarraonaindia et al. 2015); this is most likely a consequence of the physio-chemical composition and surrounding environment which strongly modulates microbial community structure and its dynamics (Wagner et al. 2016). Other reports also evidenced that both environment and the plant

genotype could be the major drivers for epiphytic community structuring (Redford et al. 2010; Turner et al. 2013). Our preliminary study based on random sampling of 27 cultivars (from three genetic pools) also indicated that there is probably an impact of grapevine genetic diversity over the microbial composition in the phyllosphere, but it is not quite evident as we found only a few microbial OTUs were differentially abundant among genetic pools. Sampling from cultivars (among these genetic pools) which are more distant in the context of their genetic relatedness must be done in the future to further explore the impact of this genetic diversity. Few genera were also found specifically associated with some cultivars of certain genetic pools (e.g- Vagococcus with the genetic pool TE) and this association (if further confirmed with above mentioned sampling strategies), should be taken into account in developing new selective breeding strategies in order to have the putative beneficial role of the phyllosphere as a performance trait of the cultivars. Moreover, environmental control in shaping microbiome in these genetically diverse grapevine cultivars should also be further investigated by sampling at different geographic locations displaying variable climatic conditions. On the other hand, leaf and berry samples clearly displayed very distinct microbial patterns. Type of genera present and their taxa abundances was significantly different in both the organs. The physical features of berry surface like the number of waxy layers (or bloom, which prevent water loss from the skin) and their thicknesses are cultivar-specific (Knoche and Lang 2017). These physical features could influence the contact and permeability of the grape berry cuticle to different microorganisms as observed for some pathogens, such as Botrytis cinerea (Herzog et al., 2015) and could be the reason for organ-specific microbiome differences and deserve further investigations. This finding is also consistent with the few other findings, in which organ-specific microbial patterns have been reported in sugarcane (Souza et al. 2016) and in some commercially important grapevine cultivars (Bokulich et al. 2014). Our leaf samples were majorly occupied by the bacterial and fungal genus Pseudomonas, Pantoea and Sphingomonas & Aureobasidium, Mycosphaerella and Cladosporium. At the other end, berry surfaces displayed higher abundances of bacterial genus like Acinetobacter, Gluconobacter, Enterobacter, but major fungal abundances were similar in both leaf and fruit surfaces except the genus of Aspergillus and Pichia. Pichia (a yeast, family Saccharomycetaceae) was also found specifically abundant in berries of grapevine cultivars of the genetic pool WW. Taxonomy of Pichia is not fully resolved, and thus, a large diversity of roles in winemaking may be expected within this genus with some species inducing potential faults in winemaking (Fugelsang and Edwards 2010). Therefore the information regarding its association with certain genotypes should further be investigated in the context of wine fermentation. A richly diverse fungal component of the grapevine microbiome has also been uncovered in this work and it could also be particularly significant because there is not sufficient information on the potential risks or benefits of plant-fungi associations. Grapevine

associated microbial communities are relevant to industrial fermentation processes for wine production. Based on our results it can be assumed that grape juice used for wine production harbors a diverse bacterial and fungal community originating from its phyllosphere as well (e.g- Pantoea and Aspergillus). Some species of most abundant microbial genera we found (e.g- Pseudomonas and Mycosphaerella) have been previously reported for acting as biocontrol agents or BCAs (Kurose et al. 2016; Jousset et al. 2006). An interesting question would be to evaluate how

integration could provide a better understanding of how microbial communities are interactingto integrate with microbial each other, community with the studies host plant, into pathogen traditional or biocontrol with BCAs, approaches? which would This be definitely helpful for designing a novel biocontrol method. This also suggests that there is an open field for further studies of the possible role of bacterial and fungal colonizers in plant growth, development and response to biotic and abiotic stress. A whole-genome shotgun sequencing followed by metagenomic analysis (Qin et al 2010) can add a more detailed layer of information to the taxonomical characterization of a wide variety of grapevine samples, by generating information on the gene composition of the bacteria and fungi present. This information can, in turn, be used to discover new genes and to formulate putative functional pathways and modules, thus could provide insight into functional and genetic microbiome variability. Apart from metagenomics, the use of additional tools such as RNA-Seq (for meta-transcriptomics) may offer a more informative perspective as it can reveal details about populations that are transcriptionally active and not just identify the taxa and genetic content of microbial populations. Moreover, the integration of different omic approaches (e.g- meta-transcriptomics & meta-proteomics) may open a window into discovering the regulatory mechanisms orchestrating observed gene expressions, thereby uncovering how host-microbe and microbe-microbe interactions that regulate microbiome activity.

References Abarenkov K et al. 2010. The UNITE database for molecular identification of fungi - recent updates and future perspectives. New Phytologist 186: 281–285. Andrews JH, Harris RF. 2000. The ecology and biogeography of microorganisms on plant surfaces. Annual Reviews of Phytopathology 38: 145-180. Baldan E, Nigris S, Romualdi C, D’Alessandro S, Clocchiatti A, Zottini M, et al. 2015. Beneficial Bacteria Isolated from Grapevine Inner Tissues Shape Arabidopsis thaliana Roots. PLoS ONE 10(10): e0140252. Bauer D 1972. Constructing confidence sets using rank statistics. Journal of the American Statistical Association 67:687–690. Beals AW 1984. Bray-Curtis ordination: an effective strategy for analysis of multivariate ecological data. Advances in Ecological Research 14: 1-55. Belda I, Zarraonaindia I, Perisin M, Palacios A and Acedo A. 2017. From Vineyard Soil to Wine Fermentation: Microbiome Approximations to Explain the “terroir” Concept. Front. Microbiol. 8:821

Bell CR, Dickie GA, WLG Harvey, Chan JWYF. 1995. Endophytic bacteria in grapevine. Canadian Journal of Microbiology 41(1): 46-53. Benjamini Y, Hochberg Y. 1995. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society 57: 289-300. Blaalid R et al. 2013. ITS1 versus ITS2 as DNA metabarcodes for fungi. Molecular Ecology Resources 13: 218–224. Bokulich NA, Thorngate JH, Richardson PM and Mills DA. 2014. Microbial biogeography of wine grapes is condi-tioned by cultivar, vintage, and climate. PNAS 111:E139. Bokulich NA, Collins TS, Masarweh C, Allen G, Heymann H, Ebeler SE, Mills DA. 2016. Associations among wine grape microbiome, metabolome, and fermentation behavior suggest microbial contribution to regional wine characteristics. mBio 7(3):e00631-16. Bodenhausen N, Horton MW, Bergelson J. 2013. Bacterial Communities Associated with the Leaves and the Roots of Arabidopsis thaliana. PLoS ONE 8(2):e56329. Bodenhausen N, Bortfeld-Miller M, Ackermann M and Vorholt JA. 2014. A synthetic community approach reveals plant genotypes affecting the phyllosphere microbiota. PLoS Genet. 2014 Apr 17;10(4):e1004283. Brandl MT, Quinones B, and Lindow SE. 2001. Heterogeneoustranscription of an indoleacetic acid biosynthetic gene in Erwinia herbicola on plant surfaces. Proceedings of the National Academy of Science of the USA 98: 3454-3459. Bringel F, and Couée I. 2015. Pivotal roles of phyllosphere microorganisms at the interface between plant functioning and atmospheric trace gas dynamics. Frontiers in Microbiology 6: 486. Bulgarelli D, Spaepen SS, Themaat EVL and Shulze-Lefert P. 2013. Structure and functions of the bacterial microbiota of plants. Annual Reviews in Plant Biology 64: 807-838. Callahan BJ, Mcmurdie PJ, Rosen MJ, Han AW, Johnson AJ, and Holmes SP. 2016. DADA2: high-resolution sample inference from Illumina amplicon data. Nature Methods 13: 581–583. Caporaso JG et al 2011. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proceedings of National Academy of sciences 108: 4516–4522. Chao A. 1987. Estimating the population size for capture-recapture data with unequal catchability. Biometrics 43: 783-791. Compant S, Mitter B, Colli-Mull, JG, Gangl H, and Sessitsch A 2011. Endophytes of grapevine flowers, berries, and seeds: identification of cultivable bacteria,comparison with other plant parts, parts, and visualization of niches of colonization. Microbial Ecology 62: 188-197. Compant S, Duffy B, Nowak J, Clément C, and Barka EA. 2005. Use of Plant Growth- Promoting Bacteria for Biocontrol of Plant Diseases: Principles, Mechanisms of Action, and Future Prospects. Applied and Environmental Microbiology 71: 4951-4959. Copeland, JK, Yuan L, Layeghifard M, Wang PW, and Guttman DS. 2015. Seasonal community succession of the phyllosphere microbiome. Molecular Plant Microbe Interaction 28: 274–285. Faith DP, Minchin PR, and Belbin L. 1986. Compositional dissimilarity as a robust measure of ecological distance. Vegetatio 69: 57–68. Frank AC, Saldierna Guzmán JP, Shay JE. 2017. Transmission of Bacterial Endophytes. Microorganisms.5(4):70. Fugelsang K and Edwards C. 2010. Wine Microbiology Second Edition. New York, USA: Springer Science and Business Media. Pages 3:28.

Gu L, Bai Z, Jin B, Hu Q, Wang H, Zhuang G and Zhang H. 2010. Assessing the impact of fungicide enostroburin application on bacterial community in wheat phyllosphere. Journal of Environmental Sciences 22: 134-141. Hollander M, and Wolfe DA. 1973. Nonparametric Statistical Methods. New York: John Wiley & Sons. Pages 115–120. Herzog K, Wind R, and Töpfer R. 2015. Impedance of the grape berry cuticle as a novel phenotypic trait to estimate resistance to Botrytis cinerea. Sensors 15:12498–12512. Ikeda S et al. 2011. Autoregulation of nodulation interferes with impacts of nitrogen fertilization levels on the leaf associated bacterial community in soybeans. Appl. Environ. Microbiol. 77:1973–1980. Jones K. 1970. Nitrogen fixation in phyllosphere of Douglas Fir Pseudotsuga-Douglasii. Annals of Botany 34: 239-244. Jousset A., Lara E, Wall LG, and Valverde C. 2006. Secondary Metabolites Help Biocontrol Strain Pseudomonas fluorescens CHA0 To Escape Protozoan Grazing. Applied and Environmental Microbiology 72: 7083-7090. Kadivar H & Stapleton AE. 2003. Ultraviolet radiation alters maize phyllosphere bacterial diversity. Microb. Ecol. 45:353–361. Kishore GK, Pande S, Podile AR. 2005. Biological control of late leaf spot of peanut (Arachis hypogaea) with chitinolytic bacteria. Phytopathology 95: 1157-1165. Knoche M, and Lang A. 2017. Ongoing growth challenges fruit skin integrity. CRC Crit. Rev. Plant Sci. 36:190–215. Kurose D, Furuya N, Saeki T, Tsuchiya K, Tsushima S, and Seier MK. 2016. Species- Specific Detection of Mycosphaerella polygoni-cuspidati as a Biological Control Agent for Fallopia japonica by PCR Assay. Molecular Biotechnology 58: 626-633. Lambais, MR, Crowley DE, Cury JC, Bull RC, Rodrigues RR. 2006. Bacterial diversity in tree canopies of the Atlantic forest. Science 312: 1917-1917. Lugtenberg BJ, Chin-A-Woeng, TF, and Bloemberg, GV 2002. Microbe-plant interactions: principles and mechanisms. Antonie van Leeuwenhoek 81: 373-383. Lundberg DS, Yourstone S, Mieczkowski P, Jones CD, and Dangl JL. 2013. Practical innovations for high-throughput amplicon sequencing. Nature Methods 10: 999–1002. McMurdie PJ, and Holmes S. 2013. “phyloseq: An R package for reproducible interactive analysis and graphics of microbiome census data”. PLoS ONE 8: e61217. Müller T, and Ruppel S. 2014. Progress in cultivation independent phyllosphere microbiology. Fems Microbiology Ecology 87: 2-17. Mwajita MR, Murage H, Tani A, and Kahangi EM. 2013. Evaluation of rhizosphere, rhizoplane and phyllosphere bacteria and fungi isolated from rice in Kenya for plant growth promoters. SpringerPlus 2:606. Myles H and Douglas AW. 1973. Nonparametric Statistical Methods. New York: John Wiley & Sons. Pages 27–33 (one-sample), 68–75. Nicolas SD. et al. 2016. Genetic diversity, linkage disequilibrium and power of a large grapevine (Vitis vinifera L) diversity panel newly designed for association studies. BMC Plant Biology 16:74. Oksanen JF et al. 2017. vegan: community Ecology Package. R package version 2.4-3. https://CRAN.R- project.org/package=vegan. Paradis E, Claude J, and Strimmer K. 2004. APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20: 289-290.

Perazzolli M et al. 2014. Resilience of the natural phyllosphere microbiota of the grapevine to chemical and biological pesticides. Applied & Environmental Microbiolgy 80:3585–3596. Qin J, Li R, Raes J, et al. 2010. MetaHIT Consortium. A human gut microbial gene catalog established by metagenomic sequencing. Nature 464(7285):59–65 R Core Team. 2017. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/. Redford AJ, Bowers RM, Knight R; Linhart Y, Fierer N. 2010. The ecology of the phyllosphere: geographic and phyllogenetic variability in the distribution of bacteria on tree leaves. Environmental Microbiology 12: 2885-2893. Schulz B, Rommert AK, Dammann U, and Aust HJ 1999. The endophyte-host interaction: Mycological Research 103:1275–1283. Souza, RSC. et al. 2016. Unlocking the bacterial and fungal communities assemblages of sugarcanea balanced microbiome.antagonism? Scientific Reports 6:28774. Tuomisto H. 2010. exist. Oecologia 4: 853–860. Turner TR, James EK,A consistent Poole PS. terminology(2013). The forplant quantifying microbiome. species Genome diversity? Biology Yes, 14:209. it does Vorholt JA. 2012. Microbial life in the phyllosphere. Nature Reviews Microbiology 10:828- 840. Wagner MR., Lundberg DS, Rio TG, Tringe SG, Dangl JL, and Olds TM. 2016. Host genotype and age shape the leaf and root microbiomes of a wild perennial plant. Nature Communications 7:12151. Wang Q, Garrity GM, Tiedje JM, and Cole JR. 2007. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied Environmental Microbiology 73: 5261–5267. Wickham H. 2009. ggplot2: Elegant Graphics for Data Analysis. New York, USA: Springer- Verlag. Wickham H. 2011. The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software 40:1-29 Williams TR, Moyne AL, Harris LJ, and Marco ML. 2007. Season, Irrigation, Leaf Age, and Escherichia coli Inoculation Influence the Bacterial Diversity in the Lettuce Phyllosphere. PLoS ONE 8:e68642. Zarraonaindia I et al. 2015. The soil microbiome influences grapevine-associated microbiota. Mbio 6: e02527-14. Zhang S, Chen X, Zhong Q, Huang Z, and Bai Z. 2017. Relations among epiphytic microbial communities from soil, leaves and grapes of the grapevine. Frontiers in Life Science10: 73-83.

Acknowledgements We thank Audrey Weber & Valerie Laucou of INRA for their assistance in experimental procedures of sequencing library preparation. We are also grateful to Philippe Chatelett for providing aid to edit this manuscript. This study was funded by the Horizon 2020 wska-Curie Innovative Training Network “MicroWine” (grant number 643063). programme of the European Commission within the Marie Skłodo Author Contribution Prashant Singh, Jean-Pierre Péros and Patrice This designed the research; Prashant Singh & Sylvain Santoni performed the lab experiments; Prashant Singh & Alex Gobbi analyzed the

data; Prashant Singh, Jean-Pierre Péros and Patrice This wrote the paper. All authors read and approved the final manuscript and declare no conflict of interest.

Data Availability All of the data are provided fully in the result section of this paper and the sequence data for 16S and ITS sequences are available at institutional server http://agap- ng6.supagro.inra.fr/inra and can be obtained upon reasonable request to authors.

Table 1: Summary of association tests for bacterial genus Vagococcus against three genetic pools Statistical Tests Genetic Pools P-values Linear regression TE 0.000985 (Parametric test) WE 0.06004 WW 0.125 Wilcoxon Rank Sum test TE 0.0008407 (Non-parametric test) WE 0.06324 WW 0.1072

Table 2: Summary of association tests for fungal genus Pichia against three genetic pools Statistical Tests Genetic Pools P-values Linear regression TE 0.1932 (Parametric test) WE 0.2294 WW 0.01239 Wilcoxon Rank Sum test TE 0.0732 (Non-parametric test) WE 0.3316 WW 0.005286

Table 3: Bacterial OTUs with differential abundance between leaf and berry samples with their respective genera, adjusted p-values, and FDRs. OTUs Genus adj p-values FDRs OTU5 Pseudomonas 0.0038 0.00118 OTU1 Pantoea 0.0038 0.00118 OTU25 Erwinia 0.0038 0.00118 OTU12 Enterobacter 0.0038 0.00118 OTU70 Gluconobacter 0.0038 0.00118 OTU81 Cloacibacterium 0.007 0.001475 OTU184 Comamonas 0.007 0.001475 OTU112 Carnobacterium 0.007 0.001475 OTU150 Bacillus 0.0104 0.001475

OTU389 Stenotrophomonas 0.0104 0.001475 OTU44 Staphylococcus 0.0104 0.001475 OTU149 Glucononacetobacter 0.0104 0.001475 OTU136 Duganella 0.0205 0.00317 OTU37 Orbus 0.0259 0.00379 OTU16 Acinetobacter 0.0273 0.00393 OTU92 Nocardioides 0.0355 0.00479 OTU29 Bartonella 0.0405 0.00520

Table 4: Fungal OTUs with differential abundance between leaf and berry samples with their respective genera, adjusted p-values, and FDRs. OTUs Genus adj p-values FDRs OTU29 Phaeosphaeria 0.0005 0.000129 OTU27 Dioszegia 0.0005 0.000129 OTU20 Stemphylium 0.0005 0.000129 OTU46 Nodulosphaeria 0.0005 0.000129 OTU78 Golovinomyces 0.0005 0.000129 OTU51 Pyrenophora 0.0005 0.000129 OTU36 Coniothyrium 0.0005 0.000129 OTU81 Ramularia 0.0005 0.000129 OTU24 Chalastospora 0.0005 0.000129 OTU54 Naevala 0.0005 0.000129 OTU53 Bullera 0.0005 0.000129 OTU45 Vishniacozyma 0.0005 0.000129 OTU124 Blumera 0.0005 0.000129 OTU59 Cryptovalsa 0.0005 0.000129 OTU71 Lachnum 0.0005 0.000129 OTU57 Hormonema 0.0005 0.000129 OTU43 Boeremia 0.0005 0.000129 OTU91 Cryptococcus 0.0005 0.000129 OTU111 Phoma 0.0005 0.000129 OTU88 Sydowia 0.0005 0.000129 OTU2 Mycosphaerella 0.0005 0.000129

OTU95 Angustimassarina 0.0005 0.000129 OTU5 Aspergillus 0.0005 0.000129 OTU117 Sigarispora 0.0005 0.000129 OTU89 Diplodia 0.0005 0.000129 OTU126 Hortaea 0.0005 0.000129 OTU15 Botrytis 0.0005 0.000129 OTU275 Diaporthe 0.0005 0.000129 OTU96 Acaromyces 0.0005 0.000129 OTU120 Candida 0.0005 0.000129 OTU18 Pichia 0.0005 0.000129 OTU84 Metschnikowia 0.001 0.00025 OTU1 Aureobasidium 0.0195 0.00606

A

B

Fig 1. Rarefaction curves for (A) bacterial and (B) fungal datasets based on sequencing reads, describing the observed number of OTUs as a function of the sequencing reads per samples. Each color represents the sample (n = 80). Saturation of the curves represents the good coverage and quality of the data-sets.

Fig 2. Prevalence plot (taxa prevalence versus total count) for (A) bacterial and (B) fungal taxa representing the phylum level diversity across samples. Each point corresponds to a

different or unique taxon. The y-axis represents the fraction of samples, these taxa are present.

Fig 3. Relative abundances of (A) bacterial and (B) fungal genera present on each cultivar, grouped within their genetic pools (9 cultivars per genetic pool, top 20 taxa, characterized to the genus level and datasets were rarified to 5,000 sequence reads per sample). Chao1 estimates of -diversity for (C) bacterial and (D) fungal data-sets for three genetic pools.

α

PCoA plots using Bray-Curtis distance between samples for (E) bacterial and (F) fungal data- sets among three genetic pools, explaining >60% variations with first two axes (taxa with variance < 1e-05 were trimmed).

Fig 4. Relative abundances of (A) bacterial and (B) fungal genera present on leaf and berry samples, also grouped within their genetic pools (top 20 taxa, characterized to the genus level, datasets were rarified to 5,000 sequence reads per sample). Chao1 estimates of - diversity for (C) bacterial and (D) fungal data-sets for both the organ types. PCoA plots using Bray-Curtis distance between samples for (E) bacterial and (F) fungal data-sets as per leafα and berry samples based on Bray-Curtis distance matrices, explaining >60% variations with first two axes (taxa with variance < 1e-05 were trimmed).

MANUSCRIPT 6

Seasonal microbial dynamics on grapevine leaves under biocontrol and copper fungicide treatments

This last manuscript reports the result of a microbial survey conducted in Western Cape, South Africa, where we tested two different fungicide treatments on grapevine leaves. From one side we applied a conventional copper spray treatment while at the other side we tested a novel strain of Lactobacillus plantarum MW-1, which showed in-vitro fungicide activity. The aim of this study was i) to understand the dynamics of the microbial community along the season and ii) the ecological impact of these two treatments, on fungal and bacterial populations. Therefore, the quantitative approach we integrated, to measure the fungal load between two treatments was not meant to draw any phytopathological conclusion on the efficacy of the biocontrol agent over the traditional copper treatment. This work is currently under revision after it has been submitted to Frontiers in Microbiology. I conceived this study together with I. Kyrkou within the framework of our project MICROWINE. After we designed the experiment, different private companies took a role in performing periodic sampling and sprays and on the logistics of the samples until they arrived, frozen, to our laboratory. I performed, with E. Filippi, all the experiment in laboratory. I interpreted the data and wrote all the manuscript except for the introductory section that was written by I. Kyrkou.

Seasonal microbial dynamics on grapevine leaves under biocontrol and copper fungicide treatments Gobbi Alex1, Kyrkou Ifigeneia1, Filippi Elisa, Ellegaard-Jensen Lea1, Hansen Lars Hestbjerg1 1Environmental Microbial Genomics (EMG), Aarhus University, Department of Environmental Science, Roskilde, DK, Denmark * Correspondence: Lars Hestbjerg Hansen [email protected]

Keywords: Biocontrol, Fungicide, Grapevine, Meta-barcoding, qPCR, Copper, Amplicons, Sequencing Abstract Winemakers have long used copper as a fungicide on grapevine. However, the potential of copper to accumulate on soil and affect the biota poses a challenge to achieving sustainable agriculture. One recently developed option is the use of biocontrol agents to replace or complement traditional methods. In the present study, a field experiment was conducted in South Africa in which the leaves in two blocks of a vineyard were periodically treated with either copper sulphate or sprayed with Lactobacillus plantarum MW-1 as a biocontrol agent. This study evaluated the impact of the two treatments on the bacterial and fungal communities as they changed during the growing season. To do this, NGS was combined with quantitative strain-specific and community qPCRs. The results revealed the progression of the microbial communities throughout the season and how the different treatments affected the microbiota. Bacteria appeared to be relatively stable at the different time points, with the only taxa that systematically changed between treatments being Lactobacillaceae, which included reads from the biocontrol agent. Cells of Lactobacillus plantarum MW-1 were only detected on treated leaves using strain-specific qPCR, with its amount spanning from 103 to 105 cells/leaves. Conversely, the fungal community was largely shaped by a succession of different dominant taxa over the months. Between treatments, only a few fungal taxa appeared to change significantly and the number of ITS copies was also comparable. In this regards, the two treatments seemed to affect the microbial community similarly, revealing the potential of this biocontrol strain as a promising alternative among sustainable fungicide treatments, although further investigation is required.

1. Introduction Copper (Cu) compounds have traditionally been the means to combat phytopathogenic microbes, especially fungi, in crop plants (Banik and Pérez-de-luque 2017). In organic agriculture including organic vineyards, the use of Cu-based pesticides is currently the only chemical treatment allowed, although it is limited to a maximum of 6 kg Cu ha-1 per year in the EU (Commission Regulation, 2002). Such a strict regulation is explained by the adverse effects of Cu on soil organisms and its long-term persistence on surface horizons (Van

Zwieten et al. 2004). Another rational behind this is that Cu overuse may lead to the development of Cu resistance in pathogenic fungi (Wang et al. 2011). Lastly, Cu residues on grapes may compromise wine quality (Tromp and Klerk 1988). In contrast, biocontrol agents, such as lactic acid bacteria, are not harmful to either the environment or health. Indeed, they are classified as “generally recognized as safe” (GRAS) by the Food and Drug Administration (FDA, USA) and have received the “qualified presumption of safety” (QPS) status from the European Food Safety Agency (EFSA) (Trias et al. 2008). The biocontrol potential of the lactic acid bacterium Lactobacillus plantarum has recently emerged as an important subject of study. The effectiveness of L. plantarum against several fungi has been emphasized in a large-scale screening study of over 7.000 lactic acid bacteria, which showed that most of the strains with antifungal properties belonged to L. plantarum (Crowley, Mahony, and van Sinderen 2013). Furthermore, many in vitro studies have demonstrated the antifungal properties of L. plantarum strains against plant parasites of the genera Botrytis (Sathe et al. 2007; Wang et al. 2011; Fhoula et al. 2013; de Senna and Lathrop 2017), Fusarium (Lavermicocca et al. 2000; Sathe et al. 2007; Smaoui et al. 2010; Wang et al. 2011; Crowley, Mahony, and van Sinderen 2013), Aspergillus (Lavermicocca et al. 2000; Djossou et al. 2011), Rhizopus (Sathe et al. 2007; Djossou et al. 2011), Alternaria, Phytophtora and Glomerella (Wang et al. 2011), Sclerotium, Rhizoctonia and Sclerotinia (Sathe et al. 2007). Additionally, chili seeds infected by Colletotrichum gloeosporioides germinated well after treatment with L. plantarum (El-Mabrok et al. 2012). In the field, a mix of L. plantarum and Bacillus amyloliquefaciens applied to durum wheat from heading to anthesis showed promising control of the fungal pathogens F. graminearum and F. culmorum (Baffoni et al. 2015). In vitro trials using different L. plantarum strains have shown a promising inhibition of plant pathogenic bacteria belonging to the genera Clavibacter (Oloyede et al. 2017), Xanthomonas (Visser et al. 1986; Trias et al. 2008; Roselló Prados 2016), Erwinia (Visser et al. 1986) and Pseudomonas (Tajudeen et al. 2011; Fhoula et al. 2013; Roselló Prados 2016). In fact, the inhibitory activity of L. plantarum against P. syringae in planta has long been documented (Visser et al. 1986), while that against Rhizobium radiobacter has only recently been reported (Korotaeva N, Ivanytsia T, and Franco BDGM 2015). Meanwhile, field sprays of L. plantarum on Chinese cabbage alleviated the severity of soft rot by Pectobacterium carotovorum subsp. carotovorum (Tsuda et al. 2016) and contributed to a reduction of fire blight by Erwinia amylovora on apple and pear (Daranas et al. 2018). However, the aforementioned studies focused on the effects of L. plantarum on single organisms while its effects on the microbial community in a field remain to be investigated. Next-generation sequencing (NGS) and quantitative PCR (qPCR) techniques facilitate rapid and cost-competitive mapping of complex microbiomes by tracking fastidious, unculturable or even unknown taxa and their roles (Schloss and Handelsman 2005; Clark et al. 2018). In particular, through accurate, real-time enumeration of taxonomic or functional gene markers, qPCR has enabled microbiologists to quantify specific species or phylotypes out of an environmental “genetic soup” (Clark et al. 2018) Nonetheless, qPCR assays rely

exclusively on known genes and thus overlook taxa distinct from those already described (Smith and Osborn 2009). This limitation is circumvented by NGS technologies and the approaches of either marker gene amplification (amplicon sequencing) or total (shotgun) sequencing of environmental DNA. Owing to the PCR amplification of specific molecular gene markers, amplicon sequencing can still be quite inefficient for drawing conclusions about the genus or species level (Escobar-Zepeda, De León, and Sanchez-Flores 2015). Moreover, biases can be introduced due to horizontal gene transfer, interspecific gene copy number variation within a microbiome and underrepresentation (Escobar-Zepeda, De León, and Sanchez-Flores 2015). Nevertheless, both NGS and qPCR have substantially contributed to explorations of the planet’s microbiome, and amplicon sequencing remains a valuable tool for comparative studies of microbial communities (Warinner et al. 2014; Roggenbuck et al. 2014). From an anthropocentric viewpoint, there has been increasing interest in using the above technologies to decipher the microbial interactions that directly affect human health and resources (e.g. on crops and livestock). In this context, the positive role of probiotic bacteria has long been acknowledged and their potential is being investigated in some detail (Soccol et al. 2010). L. plantarum is a versatile lactic acid bacterium and probiotic, and is ubiquitous on plants. The bacterium has been isolated from various environments, such as fermented food products, the human mouth and grape must (König and Fröhlich, 2009), and hence constitutes a promising case study. L. plantarum has been examined for, among other things, its adequacy as a biocontrol agent against phytopathogens and, as mentioned previously, it is generally considered a good candidate for use in agriculture. Even though available results of the antagonism of L. plantarum against phytopathogens in vitro are encouraging, harsh field conditions call for the careful design of field application trials (Visser et al. 1986). For example, a lack of nutrients can be circumvented by repeated, surface spray applications to sustain high viable biocontrol numbers (Visser et al. 1986). The scarcity of field studies makes it more difficult to evaluate this prediction, as well as the effect L. plantarum may have on a field’s native microbiome. The aim of this study was to evaluate this effect by investigating the progression of the bacterial and fungal communities in two plots in the same vineyard, one treated with Cu and the other with a biocontrol strain of L. plantarum, as the season progressed. Furthermore, the persistence of the applied biocontrol organism on the vines was monitored. To the authors’ knowledge, this study is the first preventative intervention using a biocontrol L. plantarum strain in a field-experiment. 2. Results This study used a combination of amplicon library NGS for both 16S and ITS, and qPCR with universal and strain-specific primers. The outcome of the DNA sequencing is presented below, followed by the results of qPCR analyses on the vine leaves. Finally, the results are presented of analyses of the must during different steps in the fermentation process.

2.1 DNA sequencing dataset description

The sequencing provided a dataset of 147 samples. After quality filtering, 14,028 exact sequence variants were attained from 3,011,656 high-quality reads concerning just the 16S community on leaves. For the ITS sequencing of the leaves, 6,901 features were retained, coming from 4,854,508 reads. All the samples were sequenced with sufficient coverage to unravel the complexity of the microbial community harboured on the leaves, as seen by the rarefaction curves in Figure S1. To minimise statistical issues that could arise from differences in terms of sequencing depth, the samples were normalised by randomly extracting 24,000 reads from each of the samples before any downstream analyses were performed.

2.2 Alpha diversity on leaves This section focuses on two different parameters that describe the microbial community on leaves: phylogenetic diversity (PD) and evenness. The alpha diversity results are summarised in Figure 1. Looking at the bacterial community on leaves, it was noted that PD was significantly higher in the Cu-treated samples compared to the biocontrol samples, as shown in Figure 1a (p-value= 0.014), and that PD seemed largely unaffected by the sampling time during the season (p-value= 0.538), as shown in Figure 1c. Looking at the evenness in Figure 1a, it can be seen that the Cu samples had a greater evenness than the biocontrol samples (p-value =0.0003). In contrast, no statistical effect on evenness was evident when looking at the collection time for 16S (p-value= 0.38), as displayed in Figure 1c. In contrast, the fungal community seemed to be unaffected by the treatments in terms of PD and evenness (p > 0.05), as visualised in Figure 1b. However sampling time produced a strong effect on PD (p-value<<0.05) and evenness (p-value<<0.05) on the ITS distribution, as revealed in Figure 1d. In particular, the number of sequence variants in the community tended to increase over time from September to November, and then suddenly decreased from December to the post-harvest period in February and March.

Figure 1: Boxplots of phylogenetic diversity (PD) (left) and evenness (right), grouped by treatments (above) and period (below) for bacteria and fungi. a) PD (left) and evenness (right) of bacteria grouped by treatment, b) PD (left) and evenness (right) of fungi grouped by treatment, c) PD (left) and evenness (right) of bacteria grouped by period, d) PD (left) and evenness (right) of fungi grouped by treatment. 2.3 Beta diversity on leaves This section shows the differences between the samples using Bray-Curtis dissimilarity, looking at the distribution and relative abundances of the single feature between samples.

The beta diversities of 16S and ITS sequence datasets, coloured by treatment or collection time, were visualised by PCoA plots and are shown in Figure 2. With regard to the bacterial community in Figure 2a, it can clearly be seen that the 16S distribution on leaves strongly clustered by treatment (p-value=0.001), while no significant pattern (p-value>0.05) was found in relation to the sampling collection time (Fig. 2b). The opposite trend was apparent for the fungal community. In fact, although ITS distribution does not appear to be affected by the treatment (p-value >0.05) as displayed in Figure 2b, it does strongly depend on the period of collection during the season, not only at month level (p-value= 0.001) as shown in Figure 2d, but sometimes even at day level (p-value= 0.001), as presented in Figure S2.

Figure 2: PCoA plots displaying the beta diversity within the dataset for 16S (left column) and ITS (right column) with samples coloured by treatment (above) and period of collection (below). a) Distribution of 16S data based on treatment, b) distribution of ITS dataset grouped by treatment, c) distribution of 16S dataset coloured by period, and d) distribution of ITS coloured by period. All distance matrixes are calculated based on Bray-Curtis dissimilarity. 2.4 Taxonomical composition on leaves

The results below show the microbial composition in terms of taxa assigned to the different features found, highlighting the microbial representation on grapevine leaves. All the information regarding bacterial and fungal community are summarised in Figure 3. Figure 3a shows the taxa bar plots of the bacterial population. In order, the dominant bacterial taxa inferred from the sequence features were Lactobacillaceae, Bacillus, Oxalobacteraceae, Enterobacteriaceae, Planococcaceae, Pseudomonas, Enterococcus, Sphingomonas and Staphylococcus. The main notable difference between the biocontrol-treated samples and the Cu-treated leaves was in the variation of Lactobacillaceae. This family includes the potential biocontrol agent and is an important indicator of its presence on leaves. Although members of Lactobacillaceae are generally present on the leaves, the difference between the biocontrol and Cu-treated samples was statistically significant (based on ANCOM). When the exact sequence variant assigned to this taxa was identified, an alignment was performed against the NCBI database using BLAST. The result of this confirmed the sequence belong to Lactobacillus plantarum. The presence of Lactobacillaceae on treated samples varied from 42 % to 68 % of the total bacterial community in the period from September to December. In the post-harvest months (February and March), the relative abundance decreased to between 0.1 % and 3.8 %, resembling the values found in the control/Cu-treated samples. This means that three months after the final spray (December) the relative abundance of Lactobacillaceae resembled the levels found on copper-treated leaves. The fungal communities, displayed in Figure 3b, were relatively similar for the biocontrol and Cu-treated samples when looking at the entire growing season. The dominant taxa belonged to Pleosporaceae, Cladosporium, Alternaria, Capnodiales, Sporobolomyces and Aureobasidium pullulans. Although no differences appeared between the two treatments, another pattern was seen with respect to the collection time. In fact, during the growing season, there was a significant change in the taxonomical composition of the fungal communities. Several fungal outbreaks led to a variation in relative abundances during the period studied. For instance, Pleosporacea members tended to increase in relative abundance from September to December, reaching 55 % of the total community, before suddenly decreasing to 9 % in March, although they still appeared among the dominant taxa. Sporobolomyces, Sporidiobolus and Botryosphaeriaceae were more dominant in September, while Cladosporium and Alternaria were mostly represented in the post-harvest months. In light of this finding, it was of interest to examine the differences between the treatments just within the same sampling period. Interestingly Botryosphaeriaceae varied in relative abundance from 18 % in the Cu-treated leaves to 0.67 % to the biocontrol in September. Further analyses were performed using BLAST on the dominant sequence variant assigned to Botryosphaeriaceae and the read was assigned at species level to Diplodia seriata, a known pathogen that causes bot canker on grapevine. Conversely Leptosphaeriaceae detected in September ranged from 2.7 % in Cu-treated leaves to 12 % recorded in biocontrol-treated samples for the same period.

Figure 3: Taxonomical bar plots for the bacterial and fungal community. Figure 3a) from left to right: taxa bar plots in which all the 16S samples are grouped by treatment, a taxa bar plot in which the 16S samples are grouped by treatment during a specific period, and finally the legend representing the 18 most abundant bacteria. Figure 3b) from left to right: taxa bar plots in which the ITS samples are divided by treatment, a taxa bar plot in which the ITS samples are grouped by treatment within a specific period, and finally the legend representing the 18 most abundant fungi.

2.5 ANCOM and Gneiss on leaves To test which taxa change in a statistically relevant way, ANCOM was run using treatment and period as discriminants for these samples. The result was that for 16S between different treatments, the only taxa that changed in a statistically significant way was Lactobacillaceae, the family to which this study’s biocontrol agent belongs, with the main sequence variant assigned to Lactobacillus plantarum (Table S1). Only two statistically relevant differences were found when looking at the period regarding mitochondria and Ralstonia, but none were associated with grapevine and furthermore they appeared in low abundance (Table S2). In the fungal community there were only two taxa that changed significantly between treatments. These taxa belong to the species Kondoa aeria and to the order of Filobasidiales (Table S3). They were in very low abundances and did not seem to be related to any known disease or have an impact on the plant itself. Instead, for this period a large amount of taxa were observed that changed during the season, 42 of which are classified at least at class level (Table S4). These findings provide information that can be used to understand the evolution of the fungal community as the season progresses. These 42 taxa included potential pathogens such as Alternaria, Cladosporium or members of Botryosphaeriaceae, with the main sequence assigned to Diplodia seriata using BLAST. As the period shaped the fungal community, an investigation was carried out on which taxa changed between different treatments during specific periods in the season. This ANCOM test returned 50 taxa, of which 46 were classified at order level (Table S5). Among the taxa that appeared to be differentially distributed between treatments, in a short period of time (such as September alone), there were Botryosphaeriaceae and Leptosphaeriaceae, which were previously highlighted when looking at the taxonomical composition in Figure 3b. Gneiss was then run on the bacterial community dataset to evaluate the impact of MW-1 on other bacteria. Interestingly, Lactobacillacea appeared to be sensitively different (p-value <0.05) from the rest of the microbial community, based on the position occupied in the tree (Fig. S3), but the fact that this taxon branches out from the rest means that it does not interact with other species in the community within the same microbial niche. This is a further evidence of that the introduction of this biocontrol agent does not alter the bacterial community that would normally be found on leaves sprayed with copper. 2.6 Quantifying fungal and MW-1 biomass on leaves The results of the quantitative evaluation by qPCR of the microbial community on the samples are summarised in Figure 4. The detection of the biocontrol agent was determined by strain-specific qPCR combined with a high-resolution melting curve. Furthermore, the total relative abundance of fungi during the period and between the treatments was compared using universal primers for fungi (with the same primer set used for the generation of the sequencing reads to reduce biases). The biocontrol agent was detected in all the treated samples after the first spray (26 September) and up to 2 February, one month after harvest. The cell numbers showed a range spanning from 10^3 to 10^5 cells/leaf. The maximum amount of Lactobacillus plantarum MW-1 cells was detected after the spray on 15

November, while on two dates the number of cells detected was below the detection limit. This study’s time zero, 15 September, was the first time that MW-1 was not detectable, with no spraying having previously been done, while the second time was on 9 March, two months after harvest. The total fungal abundance could be estimated at cell level number without introducing significant biases due to an uneven ploidy variation between the different taxa detected. However, since the community composition between the treatments appeared to be very similar, it was assumed to have a negligible difference in terms of genome distribution. For this reason, a relative quantification was performed between the two fungal communities grouped by treatments. With this in mind, the results obtained showed that the fungal communities between the two treatments followed a similar trend, with no profound variations between the different treatments. It was decided to estimate the number of fungal genomes, assuming a fungal-genome average size of 27.5 Mb. Accordingly, the number of genomes ranging from 10^2 to 10^4 genomes/leaves during this period was calculated. After normalising the raw-qPCR signals using the gDNA amount, an ANOVA was run which gave a p-value of 0.56. This confirmed that the variation between the two fungal communities coming from the two treatments was not significant. Finally a correlation analysis was performed between the number of cells of MW-1 detected and the corresponding fungal genome amount for each sample. In this case, the analyses returned a coefficient of -0.39, which could be interpreted as a moderate negative correlation between fungi and MW-1. To further demonstrate the link between initial gDNA amount and qPCR signal, an additional correlation analysis was performed. First two index ratios (IR) were calculated: IR1 between input gDNA in Cu versus biocontrol-treated leaves, and IR2 by dividing qPCR genome-copies detected on Cu versus biocontrol-treated leaves. Finally, the correlation between IR1 and IR2 during the period was determined. The correlation coefficient was 0.81, confirming that the two IRs were highly positively correlated, as shown in Figure 4a. A t-test on the two IR series (the assumption of equal variance was tested with a F-test that resulted in a p-value of 0.165) confirmed that the difference was not significant (p-value= 0.699). This means that the tendency of the two IRs did not differ significantly between the treatments. In conclusion, this indicated that not only did the two fungal communities not differ significantly in composition between treatments, but they were also quantitatively comparable.

Figure 4: a) Positive correlation between index-ratios IR1 and IR2; letters identify the collection date as reported in Table 1. b) Normalised genomes count for MW-1 (light blue triangles), estimated fungal genomes on community treated with biocontrol agent (dark blue squares) and fungal community genomes count under Cu treatment (black diamonds); curves are expressed as log10 of the genome count. The red dashed line indicates the harvest.

3. Discussion Understanding the structure and dynamics of microbial communities in the phyllosphere is important because of their effects on plant protection and plant growth. Since the structure and dynamics of the microbiome are affected by environmental factors such as geography (Bokulich et al. 2014), grape cultivar (Singh et al. 2018) and climate (Perazzolli et al. 2014), this study focused on different treatments applied in two blocks of the same vineyard. Both treatments, Cu and MW-1, were applied to control fungal diseases. The treatments were applied throughout the growing season and were followed by the periodic collection of

samples. The number of sprays was unequally distributed and was based on the visible symptom evaluation on the plants within the two blocks. This is important to note since the impacts of any chemical and biological treatments depend on the dosage and frequency of applications, as well as other biotic and abiotic factors (Perazzolli et al. 2014). This study revealed the development of the grapevine leaf microbiome over the growing season and displayed the impact of different types of vineyard management using NGS technology combined with qPCR. The combined use of these approaches allowed quantitative and qualitative information to be recovered from the compositional dataset (Perazzolli et al. 2014). In the present study, the biocontrol-treated phyllosphere-bacterial community had a reduced phylogenetic diversity and evenness compared with Cu-treated leaves, seemingly due to the presence of highly dominant taxa (i.e. the biocontrol agent, which appears as Lactobacillaceae, being dominant in almost all the treated samples). These highly abundant taxa probably compact the remaining populations, reducing evenness and phylogenetic diversity, and excluding some of the rare species that are likely to be missed at a fixed sequencing depth. In fact, phylogenetic diversity positively correlates with sequencing depth, as shown by Davison and Birch (2008). The beta diversity of the bacteria displayed significant differences since both communities, by treatment, scattered in two distinct directions along axis 1, which explained 28.5 % of the variation. In contrast to the bacterial communities, the fungal communities were affected by the sampling period, during which PD and evenness showed similar trends for both the biocontrol and Cu-treated leaves. This is in line with Singh et al. (2018) who showed that the shift of the microbial community in samples collected at different times was detected for fungi only. The fact that phylogenetic diversity increased, together with evenness, until December suggested that the complexity of the fungal community increased due to a higher number of equally abundant taxa. In December, the fall in PD and evenness was probably due to the presence of one or a few taxa with a high relative abundance that replaced rare species. This hypothesis is based on the fact that the sequencing depth was comparable between samples. The beta diversity of the fungal communities showed that the two treatments had an insignificant effect on the fungal community when the whole period was considered. This result is consistent with the study of Perazzolli et al. (2014) where a chemical and a biocontrol agent against downy mildew were tested in different locations and climatic conditions. They showed that the fungal community was impacted mostly by geography and was resilient to the treatments tested. Instead, it is clear from the present study that the fungal community was strongly affected by the sampling time, suggesting an evolution of the harboured microbiota due to seasonal changes such as temperature, solar exposition, rainfall and other abiotic fractions. The fact that microbial seasonal shift only impacts the fungal community and not the bacterial community is consistent with the work of Singh et al. (2018) in which two sets of samples from the same vineyards were analysed and the impact of different sampling times on the microbial community was measured.

The differences in beta diversity reflected the change in taxonomical composition in both bacterial and fungal communities. In particular, the bacterial communities differed significantly only for the relative abundance of Lactobacillaceae, as shown in Table S1, obtained by applying ANCOM. This taxon appeared in the biocontrol-treated leaves in relative abundance up to 40 % and contained reads of the biocontrol agent MW-1. After blasting the exact sequence variant against NCBI, the analyses could be refined to species level, identifying Lactobacillus plantarum as the most likely assignment. The present biocontrol strain, which here appeared as the dominant sequence variant, was not the only representative of Lactobacillaceae. The presence of other sequence variants assigned to Lactobacillaceae was also retrieved on copper-treated samples. This is consistent with the existing literature in which a community of LAB is found living on the phyllosphere naturally (Daranas et al. 2018). Furthermore, the strain-specific qPCR applied to recognise and quantify MW-1 produced a precise signal peak only on the biocontrol-treated samples. This allowed the conclusion to be drawn that the different amount detected in relative abundance for the Lactobacillaceae family was due to the presence of reads from the biocontrol agent. Finally the bacterial community, which appeared to be relatively stable throughout the season, reflected the composition already reported by Singh et al. (2018) The dominance of Proteobacteria and Firmicutes highlighted in their paper was consistent with the distribution shown in the present study, where the dominant taxa belonged to the same phyla. Furthermore, Singh et al. (2018) analysed two different sets of samples, one collected in spring and one during the harvest period, which displayed similar bacterial community composition with a few differences in relative abundance, e.g. the taxa Cyanobacteria. The analysis of the microbial community throughout the season offered interesting insight into the possible correlations between specific fungal outbreaks and seasonal variations. When looking at the whole period, there was a clear succession of dominant fungal taxa with Botryosphaeriaceae and Sporobolomyces in September, Pleosporacea in December and Cladosporium, Alternaria and Aureobasidium pullulans in February and March. All of these taxa are listed as being commonly recovered on grapevine plants in the review of Dissanayake et al. (2018). However, focusing on the differences between treatments that appeared only at a specific period of the season, other taxa could be found that were sensitive to the treatments. These taxa were generally hidden when looking at the complete timeline, but appeared clear when the focus was on a specific period. For instance, in September, just after spraying started, there were differences in the dominant taxa between treatments. The biocontrol leaves showed a reduced relative abundance of Botryosphaeriaceae compared to the Cu-treated samples, with the dominant sequence variant assigned to Diplodia seriata, a known grapevine pathogen that can cause bot canker (Úrbez-Torres et al. 2008). Conversely, within the family Botryosphaeriaceae, there is also the genus Botryosphaeria, whose anamorph stage has been confused with Diplodia (Phillips, Crous and Alves 2007). This genus appears not to be strongly affected by Cu (Mondello et al. 2017), but from the results of the present study it might be impacted by MW-1. Conversely, this biocontrol agent seems not to affect the taxon Leptosphaeriaceae, whose group contains pathogenic taxa such as Leptosphaeria that can cause black-leg disease in brassica (Hammond, Lewis, and Musa

1985). However, to the authors’ knowledge, Leptosphaeriaceae does not include pathogenic strains on grapevine. Furthermore, Leptosphaeriaceae are sensitive to chemical pesticides (Eckert et al. 2010), and this could explain the reduction when Cu was applied. However targeted quantitative methods are required to confirm this. Finally, the outcome of the NGS approach was combined with a targeted strain-specific qPCR to detect and quantify the biocontrol agent. A qPCR was applied on the fungal communities from the two treatments. This combined approach has previously been reported (Perazzolli et al. 2014) as an efficient way of drawing conclusions about microbial ecology when biocontrol agents are involved. The number of cells detected for MW-1 was consistent with the existing literature in which biocontrol agents on the phyllosphere are studied and quantified, and could eventually be used as an indicator of the effectiveness of the treatment (Perazzolli et al. 2014). The detection limit in this study was 102 cells/gram of leaves, a frequent limit of detection when qPCR is used (Hierro et al. 2006). The number of cells detected during the season varied between 105 and 103. A sudden tenfold decrease was expected 1-6 days after the spray, as previously reported in other field studies such as Daranas et al. (2018) using L. plantarum PM411 on apple, kiwi, strawberry and pear plants. When investigating the amount of fungi, the number of cells was not estimated due to the biases that could have been introduced by different ploidy between species. However, an average amount of fungal genomes was estimated based on the information reported in Li et al. (2018), with the average size for ascomycetes and basidiomycetes estimated at 13 and 42 Mb respectively. Accordingly, since members of both phyla were retrieved, 27.5 Mb was used as the average size for genomes. However, the results could be slightly different if these parameters were changed, consistent with other studies on fungal genomes such as Mohanta and Bae (2015), which reported different sizes. All the analyses established that the two fungal communities were quantitatively similar. Finally, when comparing the two normalised qPCR signals, an ANOVA test confirmed that there were no significant differences between treatments. Based on these results, leaves treated with L. plantarum MW-1 showed minor differences at specific periods of the year in terms of taxonomical composition, and there were no significant differences in terms of fungal load when compared with Cu-treated leaves. To conclude, although further evidences are required, this study provides affirmation that Lactobacillus plantarum MW-1 offers a promising biocontrol alternative or supplement to the use of Cu as a fungicide treatment on grapevine leaves. Finally, it paves the way for more environmentally-friendly treatment practices, not only in vineyards but also in other crops that are sensitive to fungal diseases and require chemical treatments.

4. Material and Methods 4.1 Materials Leaves were collected periodically from September 2016 to March 2017 from a vineyard in a winery in the Western Cape (South Africa). Two plots in the same field, separated by several lines of grapevine, were managed in two different ways: one block was sprayed with

Cu in a traditional vineyard management system, while the other was sprayed with a Lactobacillus plantarum MW-1 as a potential biocontrol agent. The Lactobacillus plantarum MW-1 strain has previously been isolated from grapes in the same wine region (data not shown). During the studied period, 12 different samplings were carried out in which leaves were collected in five spots for each of the treatment areas within the vineyards. For each spot and each treatment there were three independent biological replicates. Each sample consisted of five leaves that were placed in a 50 ml Falcon tube. The tubes were frozen at - 20 °C and shipped to the laboratory in Denmark. Sample collection followed the spray calendar for the biocontrol agent. After the grape harvest in January, there was a follow-up with two extra samplings in February and March. All the plants from which leaves had been collected were marked and did not change throughout the experimental plan. A complete calendar of the sampling is given in Table 1. Table 1: Sampling dates in this study; second column report the code used during data- processing and analyses; third column identify the state of the vineyard at the time of the samples collection

Date of collection Code in the paper Note 15 September 2016 1.S Time zero, no sprays performed 26 September 2016 2.S After biocontrol agent spraying 29 September 2016 3.S After biocontrol agent spraying 10 October 2016 1.O After biocontrol agent spraying 17 October 2016 2.O After biocontrol agent spraying 25 October 2016 3.O After biocontrol agent spraying 3 November 2016 1.N After biocontrol agent spraying 18 November 2016 2.N After biocontrol agent spraying 30 November 2016 3.N After biocontrol agent spraying 12 December 2016 1.D After biocontrol agent spraying 2 February 2017 1.F After harvest 9 March 2017 1.M After harvest

4.2 Sample preparation and DNA extraction The tubes containing the leaves were thawed at room temperature and 20 ml of a washing solution was subsequently added (Singh et al. 2018) and placed on a rocket inverter, applying one hour of gently shaking rotation. After this, the leaves were discarded using sterile pincettes. The washing solution was centrifuged for 15 minutes at 6000 rpm to create a pellet. Following removal of the washing solution, the pellet was resuspended in 978 ul of phosphate buffer solution and DNA was extracted according to the manufacturer’s protocol using FAST DNA Spin Kit for Soil (MP-Biochemical, CA). After the extraction, the DNA quality

and concentration were measured with Nanodrop (Thermo Fisher Scientific) and Qubit®2.0 fluorometer (Thermo Scientific™).

4.3 Library preparation for sequencing Amplicon library preparation for 16S bacterial gene was performed as described by Gobbi et al. (2018), with minor modifications hereby reported. To reduce the amount of plastidial DNA from the grapevine leaves, specific mPNA and pPNA were used with the same sequences as those recommended by Lundberg et al. (2013). Each reaction contained 12 µL of AccuPrime™ SuperMix II (Thermo Scientific™), 0.5 µL of forward and reverse primer from a 10 µM stock, 0.625 ul of pPNA and mPNA to a final concentration of 0.25 mg/mL, 1.5 µL of sterile water, and 5 µL of template. The reaction mixture was pre-incubated at 95 °C for 2 min, followed by 33 cycles of 95 °C for 15 sec, 75 °C for 10 sec, 55 °C for 15 sec and 68 °C for 40 sec. A further extension was performed at 68 °C for 10 min. The fungal community was sequenced using the same double-step PCR approach for library preparation, but the primers in use were ITS1 (TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-GAACCWGCGGARGGATCA) and ITS2 (GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-GCTGCGTTCTTCATCGATGC) with adapters for Illumina MiSeq Sequencing. Each reaction for the first PCR on ITS contained 12 µL of AccuPrime™ SuperMix II (Thermo Scientific™), 0.5 µL of forward and reverse primer from a 10 µM stock, 0.5 µL of bovine serum albumin (BSA) to a final concentration of 0.025 mg/mL, 1.5 µL of sterile water, and 5 µL of template. PCR cycles were the same as above, but without the step at 75 °C for 10 sec. The number of cycles was set to 40. The second PCR of fragment purification using beads and Qubit quantification were common for 16S and ITS and followed the protocol reported in Gobbi et al. (2018). The final pooling was proportional to the length of the fragment in order to sequence an equimolar amount of gene fragments for all the samples. Thus by sequencing all the samples within the same sequencing run, biases due to run variations were minimised. Sequencing was performed using an in-house Illumina MiSeq instrument and 2x250 paired-end reads with V2 Chemistry.

4.4 Strain-specific primer for the biocontrol agent In order to find a region that was unique for the biocontrol agent, part of the genome was uploaded and scanned using PHASTER (PHAge Search Tool Enhanced Research) (Arndt et al. 2016) to detect phage and prophage genomes within the bacterial genome. Following this, the uniqueness was tested of an amplicon generated from one primer that targets the genome and another targeting the phage sequence against the NCBI database and a private genome database repository of the strain provider. Thus a fragment of 300 bp was obtained that did not show any hit in NCBI and only one full hit in the private genome repository. To amplify this fragment, the primers MWP3F (CATCCCAACCGCTAACAA) and MWP3R (CGCAGAAAAGGTAGCAAA) were used. These primers were tested further against other strains of L. plantarum from the authors’ collection, showing amplification only toward the

biocontrol agent. To create a standard curve, the DNA was extracted from a pure culture of the biocontrol agent and serial dilutions were performed from 10-2 to 10-7. Knowing the length of the fragment of 300 bp, the qPCR signal and the fact that the fragment appears only once in a genome, the number of cells of the biocontrol agent L. plantarum MW-1 was estimated for each reaction. This information was used to estimate the number of cells in leaf extract samples.

4.5 qPCR All PCR reactions were prepared using UV sterilised equipment and negative controls were run alongside the samples. The qPCR with primers specific for the biocontrol agent and the fungal community was carried out on a CFX Connect™ Real-Time PCR Detection System (Bio- Rad). The primers used for the bacterial biocontrol agent were MWP3F and MWP3R, while for the total fungal community the same primers ITS1 and ITS2 were used as during library preparation. To reduce biases the adapters for Illumina MiSeq Sequencing were also maintained in qPCR. Single qPCR reactions contained 4 µL of 5x HOT FIREPol® EvaGreen® qPCR Supermix (Solis BioDyne, Tartu, Estonia), 0.4 µL of forward and reverse primers (10 µM), 2 µL of bovine serum albumin (BSA) to a final concentration of 0.1 mg/mL, 12.2 µL of PCR grade sterile water, and 1 µL of template DNA. Since DNA concentration in some of the samples was very low (around 0.5 ng/ul), normalisation by diluting was avoided before qPCR, but the qPCR signal have been normalized using the total input DNA, previously measured by Qubit. The qPCR cycling conditions included initial denaturation at 95 °C for 12 min, followed by 40 cycles of denaturation at 95 °C for 15 sec, annealing at 56 °C for 30 sec, and extension at 72 °C for 30 sec; a final extension was performed at 72 °C for 3 min. Quantification parameters showed an efficiency of E=87.7 % and R2 of 0.999. The same qPCR cycle conditions were applied for both bacterial and fungal qPCR. All qPCR reactions were followed by a dissociation curve in which temperature was increased from 72 °C to 95 °C, rising by 0.5 °C in each cycle, and fluorescence measured after each increment. Since the initial concentration of DNA was not homogeneous and in a few samples was below 1 ng/ul, it was decided to normalise the qPCR result using the initial concentration of total DNA, as reported in Yun et al. (2006). 4.6. Bioinformatics Sequencing data were analysed and visualised using QIIME 2 v. 2018.2 (Caporaso et al. 2010) and the same pipeline described in Gobbi et al. (2018). Taxonomic assignments were performed using qiime feature-classifier classify-sklearn with a pre-trained Naïve-Bayes classifier with Greengenes v_13.8 for 16S and UNITE v7.2 for ITS. The raw data of this study will be available on ENA (Study Accession Number)

4.7 Statistics Statistical evaluation of these results was performed separately for qPCR and the sequencing dataset. All statistical evaluations and visualisations regarding the qPCR dataset were

performed on Microsoft Office Excel 2010, while the NGS dataset was analysed through QIIME2 v. 2018.2. The qPCR signals obtained allowed the calculation of the number of cells of the biocontrol agent detected on the sprayed leaves throughout the period, while for the fungal community it was possible to perform a relative quantification between the two different treatments of Cu and the biocontrol. For both datasets the signal obtained was normalised using the total DNA as input (Yun et al., 2006). To relate the differences in qPCR signal on the fungal community to the presence/absence of the biocontrol treatment, the qPCR signal had to be correlated with the initial DNA amount before checking the differences and their statistical relevance. To do this two different index ratios (IR) were created: IR1 dividing the raw qPCR signal from Cu samples from that from biocontrol samples, and IR2 dividing the initial amount of DNA of Cu-treated with that of biocontrol-treated leaves. Two series of IRs were then obtained to test for equal variance with a F-test. This was intended to establish whether the two IRs, deriving from two different datasets, had a comparable variance within their samples. The F-test for variances, with a p- value= 0.16, proved that the two variances were equal and then a t-test was run on two series assuming equal variances between the IRs on the two treatments, Cu and biocontrol. A correlation analysis was also performed between the initial DNA amount and the resulting qPCR signal. Finally correlation analyses and a t-test were repeated on the samples treated with the bacterial agent as well to compare the variation between the total fungal community on biocontrol-treated leaves and the number of cells detected of the biocontrol bacteria. Sequencing data after QIIME 2 pipeline processing were statistically evaluated using the Kruskal-Wallis test for alpha and beta diversity. The resulting p-value of the alpha diversity comparison was based on two parameters (phylogenetic diversity and evenness) calculated between the different series analysed. Beta-diversity analyses were evaluated using PERMANOVA with 999 permutations. Finally a statistical evaluation of differentially abundant features was performed based on analysis of composition of microbiomes (ANCOM) (Mandal et al. 2015). This test is based on the assumption that few features change in a statistical way between the samples and hence is very conservative. To individuate microbial niches, Gneiss (Morton et al. 2017) was also applied on the bacterial dataset, using treatment as the determining variable. Acknowledgements The authors would like to thank the Horizon 2020 Programme of the European Commission within the Marie Schlodowska-Curie Innovative Training Network “MicroWine” (grant number 643063) for financial support. Conflict of Interest Nothing declared Bibliography

Arndt, David, Jason R. Grant, Ana Marcu, Tanvir Sajed, Allison Pon, Yongjie Liang, and David S. Wishart. 2016. “PHASTER: A Better, Faster Version of the PHAST Phage Search Tool.” Nucleic Acids Research 44 (W1): W16– 21. https://doi.org/10.1093/nar/gkw387.

Baffoni, Loredana, Francesca Gaggia, Nereida Dalanaj, Antonio Prodi, Paola Nipoti, Annamaria Pisi, Bruno Biavati, and Diana Di Gioia. 2015. “Microbial Inoculants for the Biocontrol of Fusarium Spp. in Durum Wheat.” BMC Microbiology 15 (1): 8–10. https://doi.org/10.1186/s12866-015-0573-7. Banik, Susanta, and Alejandro Pérez-de-luque. 2017. “In Vitro Effects of Copper Nanoparticles on Plant Pathogens, Beneficial Microbes and Crop Plants.” Spanish Journal of Agricultural Research 15 (2). https://doi.org/10.5424/sjar/2017152-10305.

Bokulich, N. A., J. H. Thorngate, P. M. Richardson, and D. A. Mills. 2014. “PNAS Plus: From the Cover: Microbial Biogeography of Wine Grapes Is Conditioned by Cultivar, Vintage, and Climate.” Proceedings of the National Academy of Sciences 111 (1): E139–48. https://doi.org/10.1073/pnas.1317377110. Clark, Dave R., Robert M. W. Ferguson, Danielle N. Harris, Kirsty J. Matthews Nicholass, Hannah J. Prentice, Kate C. Randall, Luli Randell, Scott L. Warren, and Alex J. Dumbrell. 2018. “Streams of Data from Drops of Water: 21st Century Molecular Microbial Ecology.” Wiley Interdisciplinary Reviews: Water, no. January: e1280. https://doi.org/10.1002/wat2.1280.

Crowley, Sarah, Jennifer Mahony, and Douwe van Sinderen. 2013. “Broad-Spectrum Antifungal-Producing Lactic Acid Bacteria and Their Application in Fruit Models.” Folia Microbiologica 58 (4): 291–99. https://doi.org/10.1007/s12223-012-0209-3.

Daranas, Núria, Esther Badosa, Jesús Francés, Emilio Montesinos, and Anna Bonaterra. 2018. “Enhancing Water Stress Tolerance Improves Fitness in Biological Control Strains of Lactobacillus Plantarum in Plant Environments.” PLoS ONE 13 (1): 1–21. https://doi.org/10.1371/journal.pone.0190931. Davison, Kirsten Krahnstoever, and Leann L Birch. 2008. “NIH Public Access” 64 (12): 2391–2404. https://doi.org/10.1038/jid.2014.371.

Dissanayake, Asha J., Witoon Purahong, Tesfaye Wubet, Kevin D. Hyde, Wei Zhang, Haiying Xu, Guojun Zhang, et al. 2018. “Direct Comparison of Culture-Dependent and Culture-Independent Molecular Approaches Reveal the Diversity of Fungal Endophytic Communities in Stems of Grapevine (Vitis Vinifera).” Fungal Diversity 90 (1): 85–107. https://doi.org/10.1007/s13225-018-0399-3.

Djossou, Olga, Isabelle Perraud-Gaime, Fatma Lakhal Mirleau, Gabriela Rodriguez-Serrano, Germain Karou, Sebastien Niamke, Imene Ouzari, Abdellatif Boudabous, and Sevastianos Roussos. 2011. “Robusta Coffee Beans Post-Harvest Microflora: Lactobacillus Plantarum Sp. as Potential Antagonist of Aspergillus Carbonarius.” Anaerobe 17 (6): 267–72. https://doi.org/10.1016/j.anaerobe.2011.03.006. Eckert, Maria R., Stephen Rossall, Andrew Selley, and Bruce D.L. Fitt. 2010. “Effects of Fungicides on in Vitro Spore Germination and Mycelial Growth of the Phytopathogens Leptosphaeria Maculans and L. Biglobosa (Phoma Stem Canker of Oilseed Rape).” Pest Management Science 66 (4): 396–405. https://doi.org/10.1002/ps.1890.

El-Mabrok, Asma Saleh W., Zaiton Hassan, Ahmed Mahir Mokhtar, and Mohamed Mustafa Aween. 2012. “Efficacy of Lactobacillus Plantarum C5 Cell and Their Supernatant Against Colletotrichum Gloeosporiodes on Germination Rate of Chilli Seeds.” Medwell Journals, 159–64. Escobar-Zepeda, Alejandra, Arturo Vera Ponce De León, and Alejandro Sanchez-Flores. 2015. “The Road to Metagenomics: From Microbiology to DNA Sequencing Technologies and Bioinformatics.” Frontiers in Genetics 6 (DEC): 1–15. https://doi.org/10.3389/fgene.2015.00348.

Fhoula, Imene, Afef Najjari, Yousra Turki, Sana Jaballah, Abdelatif Boudabous, and Hadda Ouzari. 2013. “Diversity and Antimicrobial Properties of Lactic Acid Bacteria Isolated from Rhizosphere of Olive Trees and Desert Truffles of Tunisia.” BioMed Research International 2013. https://doi.org/10.1155/2013/405708. Gobbi, A., Santini, R., Filippi, E., Ellegaard-Jensen, L., Jacobsen, C.S., Hansen, L.H. 2018. “Quantitative and Qualitative Evaluation of the Impact of the G2 Enhancer, Bead Sizes and Lysing Tubes on the Bacterial Community Composition during DNA Extraction from Recalcitrant Soil Core Samples Based on Community Sequencing and QPCR.” https://doi.org/10.1101/365395.

Gregory Caporaso, J, Justin Kuczynski, Jesse Stombaugh, Kyle Bittinger, Frederic D Bushman, Elizabeth K Costello, Noah Fierer, et al. 2010. “Correspondence QIIME Allows Analysis of High- Throughput Community Sequencing Data Intensity Normalization Improves Color Calling in SOLiD Sequencing.” Nature Publishing Group 7 (5): 335–36. https://doi.org/10.1038/nmeth0510-335. HAMMOND, KIM E., B. G. LEWIS, and T. M. MUSA. 1985. “A Systemic Pathway in the Infection of Oilseed Rape Plants by Leptosphaeria Maculans.” Plant Pathology 34 (4): 557–65. https://doi.org/10.1111/j.1365- 3059.1985.tb01407.x.

Hierro, Núria, Braulio Esteve-Zarzoso, Ángel González, Albert Mas, and Jose M. Guillamón. 2006. “Real-Time Quantitative PCR (QPCR) and Reverse Transcription-QPCR for Detection and Enumeration of Total Yeasts in Wine.” Applied and Environmental Microbiology 72 (11): 7148–55. https://doi.org/10.1128/AEM.00388-06. Korotaeva N, Limanska N, Biscola V Ivanytsia T, and Merlich A Franco BDGM. 2015. “Study of the Potential Application of Lactic Acid Bacteria in the Control of Infection Caused by Agrobacterium Tumefaciens.” Journal of Plant Pathology & Microbiology 06 (08). https://doi.org/10.4172/2157-7471.1000292. Lavermicocca, Paola, Francesca Valerio, Antonio Evidente, Silvia Lazzaroni, Aldo Corsetti, Marco Gobbetti, Istituto Tossine, et al. 2000. “Purification and Characterization of Novel Antifungal Compounds from the Sourdough Lactobacillus Plantarum Strain 21B” 66 (9): 4084–90.

Li, Huiying, Surui Wu, Xiao Ma, Wei Chen, Jing Zhang, Shengchang Duan, Yun Gao, et al. 2018. “The Genome Sequences of 90 Mushrooms.” Scientific Reports 8 (1): 6–10. https://doi.org/10.1038/s41598-018-28303-2. Lundberg, Derek S., Scott Yourstone, Piotr Mieczkowski, Corbin D. Jones, and Jeffery L. Dangl. 2013. “Practical Innovations for High-Throughput Amplicon Sequencing.” Nature Methods 10 (10): 999–1002. https://doi.org/10.1038/nmeth.2634.

Mandal, Siddhartha, Will Van Treuren, Richard A White, Merete Eggesbo, Rob Knight, and Shyamal D Peddada. 2015. “Analysis of Composition of Microbiomes: A Novel Method for Studying Microbial Composition.” Microbial Ecology in Health and Disease 26: 27663. https://doi.org/10.3402/mehd.v26.27663.

Mohanta, Tapan Kumar, and Hanhong Bae. 2015. “The Diversity of Fungal Genome.” Biological Procedures Online 17 (1): 1–9. https://doi.org/10.1186/s12575-015-0020-z. Mondello, Vincenzo, Aurélie Songy, Enrico Battiston, Catia Pinto, Cindy Coppin, Patricia Trotel-Aziz, Christophe Clément, Laura Mugnai, and Florence Fontaine. 2017. “Grapevine Trunk Diseases: A Review of Fifteen Years of Trials for Their Control with Chemicals and Biocontrol Agents.” Plant Disease, no. July: PDIS-08-17-1181-FE. https://doi.org/10.1094/PDIS-08-17-1181-FE.

Morton, James T, Jon Sanders, Robert A Quinn, Daniel Mcdonald, Antonio Gonzalez, Yoshiki Vázquez-baeza, and Jose A Navas-molina. 2017. “Crossm Differentiation” 2 (1): 1–11. https://doi.org/10.1128/mSystems.00162- 16.

Oloyede, A R, M A Ojona, A O Tubi, Biotechnology Centre, Ogun State, and Ogun State. 2017. “Evaluation of Antibacterial Potential of Some Lactic Acid Bacteria on Strains of Clavibacter Michiganensis Subsp. Michiganensis 1” 31 (1): 3793–97.

Perazzolli, Michele, Livio Antonielli, Michelangelo Storari, Gerardo Puopolo, Michael Pancher, Oscar Giovannini, Massimo Pindo, and Ilaria Pertot. 2014. “Resilience of the Natural Phyllosphere Microbiota of the Grapevine to Chemical and Biological Pesticides.” Applied and Environmental Microbiology 80 (12): 3585–96. https://doi.org/10.1128/AEM.00415-14.

Phillips, A.J.L., Pedro W Crous, and Artur Alves. 2007. “Diplodia Seriata , the Anamorph of ‘ Botryosphaeria ’ Obtusa.” Fungal Diversity 25 (1892): 141–55. Roggenbuck, Michael, Ida Bærholm Schnell, Nikolaj Blom, Jacob Bælum, Mads Frost Bertelsen, Thomas Sicheritz Pontén, Søren Johannes Sørensen, M. Thomas P. Gilbert, Gary R. Graves, and Lars H. Hansen. 2014. “The Microbiome of New World Vultures.” Nature Communications 5 (May 2014): 1–8. https://doi.org/10.1038/ncomms6498.

Roselló Prados, Gemma. 2016. “Characterization and Improvement of Plant-Associated Lactobacillus Plantarum. Novel Biocontrol Agent for Fire Blight Disease.” TDX (Tesis Doctorals En Xarxa). https://doi.org/10.1016/S0890-4065(99)00021-3.

Sathe, S. J., N. N. Nawani, P. K. Dhakephalkar, and B. P. Kapadnis. 2007. “Antifungal Lactic Acid Bacteria with Potential to Prolong Shelf-Life of Fresh Vegetables.” Journal of Applied Microbiology 103 (6): 2622–28. https://doi.org/10.1111/j.1365-2672.2007.03525.x.

Schloss, Patrick D, and Jo Handelsman. 2005. “Introducing DOTUR, a Computer Program for Defining Operational Taxonomic Units and Estimating Species Richness.” Applied and Environmental Microbiology 71 (3): 1501 LP-1506. http://aem.asm.org/content/71/3/1501.abstract.

Senna, Antoinette de, and Amanda Lathrop. 2017. “Antifungal Screening of Bioprotective Isolates against Botrytis Cinerea, Fusarium Pallidoroseum and Fusarium Moniliforme.” Fermentation 3 (4): 53. https://doi.org/10.3390/fermentation3040053.

Singh, Prashant, Sylvain Santoni, Patrice This, and Jean-Pierre Péros. 2018. “Genotype-Environment Interaction Shapes the Microbial Assemblage in Grapevine’s Phyllosphere and Carposphere: An NGS Approach.” Microorganisms 6 (4): 96. https://doi.org/10.3390/microorganisms6040096. Smaoui, Slim, Lobna Elleuch, Wacim Bejar, Ines Karray-Rebai, Imen Ayadi, Bassem Jaouadi, Florence Mathieu, Hichem Chouayekh, Samir Bejar, and Lotfi Mellouli. 2010. “Inhibition of Fungi and Gram-Negative Bacteria by Bacteriocin BacTN635 Produced by Lactobacillus Plantarum Sp. TN635.” Applied Biochemistry and Biotechnology 162 (4): 1132–46. https://doi.org/10.1007/s12010-009-8821-7. Smith, Cindy J., and A. Mark Osborn. 2009. “Advantages and Limitations of Quantitative PCR (Q-PCR)-Based Approaches in Microbial Ecology.” FEMS Microbiology Ecology 67 (1): 6–20. https://doi.org/10.1111/j.1574- 6941.2008.00629.x.

Soccol, Carlos Ricardo, Luciana Porto, De Souza Vandenberghe, Michele Rigon Spier, Adriane Bianchi, Pedroni Medeiros, and Caroline Tiemi Yamaguishi. 2013. “” 48 (4): 413–34.

Tajudeen, Bamidele, Adeniyi Bolanle, Ogunbanwo Samuel T, Smith Stella, and Omonigbehin Emmanuel. 2011. “Antibacterial Activities of Lactic Acid Bacteria Isolated from Selected Vegetables Grown in Nigeria: A Preliminary Report.” ©Sierra Leone Journal of Biomedical Research 3 (3): 2076–6270. Trias, Rosalia, Lluís Bañeras, Emilio Montesinos, and Esther Badosa. 2008. “Lactic Acid Bacteria from Fresh Fruit and Vegetables as Biocontrol Agents of Phytopathogenic Bacteria and Fungi.” International Microbiology 11 (4): 231–36. https://doi.org/10.2436/20.1501.01.66.

Tromp, A, and CA De Klerk. 1988. “Effect of Copperoxychloride on the Fermentation of Must and Wine Quality.” South African Journal for Enology and Viticulture 9 (1): 31–36. http://www.sawislibrary.co.za/dbtextimages/19898.pdf.

Tsuda, Kazuhisa, Gento Tsuji, Miyako Higashiyama, Hiroshi Ogiyama, Kenji Umemura, Masaaki Mitomi, Yasuyuki Kubo, and Yoshitaka Kosaka. 2016. “Biological Control of Bacterial Soft Rot in Chinese Cabbage by Lactobacillus Plantarum Strain BY under Field Conditions.” Biological Control 100: 63–69. https://doi.org/10.1016/j.biocontrol.2016.05.010.

Úrbez-Torres, J. R., G. M. Leavitt, J. C. Guerrero, J. Guevara, and W. D. Gubler. 2008. “Identification and Pathogenicity of Lasiodiplodia Theobromae and Diplodia Seriata , the Causal Agents of Bot Canker Disease of Grapevines in Mexico.” Plant Disease 92 (4): 519–29. https://doi.org/10.1094/PDIS-92-4-0519. Visser, R., W. H. Holzapfel, J. J. Bezuidenhout, and J. M. Kotze. 1986. “Antagonism of Lactic Acid Bacteria against Phytopathogenic Bacteria.” Applied and Environmental Microbiology 52 (3): 552–55. Wang, Haikuan, Hu Yan, Jing Shin, Lin Huang, Heping Zhang, and Wei Qi. 2011. “Activity against Plant Pathogenic Fungi of Lactobacillus Plantarum IMAU10014 Isolated from Xinjiang Koumiss in China.” Annals of Microbiology 61 (4): 879–85. https://doi.org/10.1007/s13213-011-0209-6. Warinner, Christina, João F.Matias Rodrigues, Rounak Vyas, Christian Trachsel, Natallia Shved, Jonas Grossmann, Anita Radini, et al. 2014. “Pathogens and Host Immunity in the Ancient Human Oral Cavity.” Nature Genetics 46 (4): 336–44. https://doi.org/10.1038/ng.2906. Zwieten, Lukas Van, Josh Rust, Tim Kingston, Graham Merrington, and Steven Morris. 2004. “Influence of Copper Fungicide Residues on Occurrence of Earthworms in Avocado Orchard Soils.” Science of the Total Environment 329 (1–3): 29–41. https://doi.org/10.1016/j.scitotenv.2004.02.014.

Appendix A

I hereby report all the abstracts of the upcoming publications, which I have not included in this manuscript since the complete paper still is in draft. Vintage variations on bacterial and fungal diversity of grapevine rhizosphere soils as an effect of geography. M. OYUELA AGUILAR1* and A. GOBBI2*, P.D. BROWNE, J.I. QUELAS1, L. ELLEGAARD- JENSEN2, L. HESTBJERG HANSEN2 and M. PISTORIO1

1Instituto de Biotecnología y Biología Molecular (IBBM), CCT-La Plata, CONICET, Dto de Cs. Biológicas, Fac. Cs. Exactas, UNLP, Calles 49 y 115 La Plata (1900), Argentina 2Department of Environmental Science, Aarhus University, Roskilde, Denmark. * the two authors contributed equally to this work Next Generation Sequencing (NGS) technologies has allowed access to previously inaccessible microbial communities in soil. The exuberating amount of information obtained from the application of such technologies has breached traditional agriculture. Aiming toward the development of more sustainable crop management practices. With the future perspective of incorporating these sustainable viticulture tendencies in the Argentinian wine industry, this research managed to create a representative profile of bacteria and fungi in grapevine rhizosphere soil of a few San Juan vineyards. The study’s main objective was to evaluate the abiotic effects on the microbial taxonomic composition found near grapevine roots, prior to the consecutive harvest periods of 2015, 2016 and 2017. Taking into consideration the climatic conditions, soil composition, location, grape variety and vintage. The samples originated from four sites located in two different estates of the central Ullum Valley region in San Juan. Each containing a Malbec and a Cabernet Sauvignon cultivar, with mainly loamy soils and no grafted vines. The results dictated the taxonomic bacterial groups were Proteobacterias, Bacteroidetes and Firmicutes. While for fungi these were the Ascomycetes, Basidiomycetes, Mortierellomycetes and a small percentage of the arbuscular mycorrhiza (AM), the Glomeromycetes. Between vintages, the main taxonomic composition seemed to vary primarily in 2015. Observations in bacteria indicated this was mainly due to the increase of the archeal group Thaumarchaeota, in 2016 and 2017. While in fungi, this seemed due to the higher presence of Ascomycetes in the last two years of research. The overall diversity indicated there were significant differences between vintages, estate locations and soil composition. Yet no significant differences due to grape variety. Further analysis showed the sites grouped together mainly because of the organic matter, carbon and nitrogen, and phosphorus content. Most importantly, results showed that fungal and bacterial diversity of a given sight might indeed vary from year to year. Making a better understanding of the complexity and depth faced when studying microbial terroir in vineyards.

Vineyard fungal and bacterial community variations as an effect of geography M. OYUELA AGUILAR1*, A. GOBBI2*, P.D. BROWNE, L. HESTBJERG HANSEN2 and M. PISTORIO1

1Instituto de Biotecnología y Biología Molecular (IBBM), CCT-La Plata, CONICET, Dto. de Cs. Biológicas, Fac. Cs. Exactas, UNLP, Calles 49 y 115 La Plata (1900), Argentina

2Department of Environmental Science, Aarhus University, Roskilde, Denmark. * the two authors contributed equally to this work

The increasing quality demands in the wine industry has created a shift toward microbial terroir understanding for improving sustainable viticulture practices. Thus, much scientific effort has focused on implementing Next Generation Sequencing (NGS) technologies. To best explain the microbial complexity in vineyards through the formation of phylogenetic libraries. Therefore, the main objective of this study was to focus on the varying geographic effects on the microbial composition of different grapevine organs. Taken from Malbec and Cabernet Sauvignon cultivars located in fourteen different vineyards of Argentina and five in Europe. In total, 174 samples from grapes, leaves and rhizosphere soil were collected in 2016, just days before the harvest period. Out of which 137 samples were successfully sequenced for both 16s rRNA and ITS1 analyses. Results showed that the highest diversity found was in rhizosphere soil as oppose to grapes and leaves. With the exception of one Cabernet Sauvignon site in the Alentejo region of Portugal. Furthermore, the similarities between grape and rhizosphere samples grouped according to regions, country and continent. While in leaves, Portuguese sites seemed to differ between regions more distinctly. Additionally, observations showed significant differences between training systems, grapevine grafting and -with the exception of rhizosphere soil- grape variety. The taxonomic groups in fungal communities were mostly unidentified in all of the samples. However, the recovered data showed fungal communities in rhizosphere as being the most diverse, seconded by grapes and then leaves. The two most common and predominant fungi classes in grapes, leaves and rhizosphere were Dothideomycetes and Tremellomycetes. As to bacteria, the diversity in both leaves and rhizosphere was much higher than in fungi. With the predominant class of Alphaprotobacteria at rhizosphere level and Actinobacteria at foliar level. The observations showed each class was very low in the abundant presence of the other, with some chloroplast contamination in leaf samples.

Fungicides and the grapevine wood mycobiome. A case study on tracheomycotic ascomycete Phaeomoniella chlamydospora reveals potential for two novel control strategies. GIOVANNI DEL FRARI1, ALEX GOBBI2, MARIE RØNNE AGGERBECK2, LARS HESTBJERG HANSEN2, RICARDO BOAVIDA FERREIRA1

1 LEAF – Linking Landscape Environment Agriculture and Food, Instituto Superior de Agronomia, University of Lisbon, 1349- 017 Tapada da ajuda, Lisbon, Portugal

2 Department of Environmental Science - Section for Environmental Microbiology and Biotechnology - Environmental Microbial Genomics Group (EMG), Aarhus University, 4000 Frederiksborgvej, Roskilde, Denmark Esca is a disease complex belonging to the grapevine trunk diseases cluster. It comprises five syndromes, three main fungal pathogenic agents and several symptoms both internal (i.e. affecting the stem) and external (i.e. affecting leaves and bunches). The etiology and epidemiology of this disease complex remain in part unclear. Some of the points still under discussion concern; the sudden rise in the disease’s incidence, the simultaneous presence of multiple wood pathogens in affected grapevines, the causal agents and discontinuity of foliar symptoms manifestation. The standard approaches to study esca has been mostly through culture-dependent studies, yet, leaving many questions unanswered. In this study, we used Illumina® next-generation sequencing to investigate the mycobiome of the wood of grapevines in a vineyard with a history of esca. The three main objectives were to: (i) characterize the mycobiome composition; (ii) understand the spatial dynamics of the fungal communities in different areas of the stem and in the canes; (iii) assess the hypothetical link between mycobiome and foliar symptoms. An unprecedented diversity of fungi is presented (289 taxa), including five genera reported for the first time in association with grapevine’s wood (Debaryomyces, Trematosphaeria, Biatriospora, Lopadostoma and Malassezia) and numerous species. Esca-associated fungi Phaeomoniella chlamydospora and Fomitiporia sp. dominate the fungal community, and numerous other fungi associated with wood syndromes are also encountered (e.g. Eutypa spp., Inonotus hispidus). The spatial analysis revealed different abundances of taxa, the exclusive presence of certain fungi in specific areas of the plants, and tissue specificity. Lastly, the mycobiome composition of the woody tissue in proximity to the canopy that manifested foliar symptoms of esca, as well as in foliar symptomatic canes, was highly similar to that of plants not exhibiting any foliar symptomatology. This observation supports the current understanding that foliar symptoms are not directly linked with the fungal community in the adjacent wood. This work contributes to the understanding of the microbial ecology of the grapevines wood, offering insights and a critical view on the current knowledge of the etiology of esca.

Appendix B

I hereby reports some of the results I obtained during the first part of this PhD while I was working on the MICROWINE Sequencing Protocol which have been adopted by all the studies included in this thesis except for Manuscript 5. These results are summarized in Table 1 in paragraph 2.2. 1- Sampling Grid In this test, we performed an extensive sampling within a vineyard following the scheme reported in Fig. B1a. From each spot we collected and then extracted DNA from three independent biological replicates. After analyzing the data using DADA2 approach in R we obtained a PCoA plot (Fig. B1b) which show the distribution of our samples. The samples are grouped by color and each color define the spot where they belong. Applying PERMANOVA on this it resulted in a p-value = 0,601 with 9999 permutations. In terms of alpha-diversity also we could not see any effect in terms of richness (p-value=0,875) and evenness (p- value=0,672). These results highlight that the diversity within the same spot is sometimes higher than the diversity between samples coming from different spots within the field. This place the focus on the amount of replicates necessary to represent a field more than the number of sampling spots within a vineyard. Based on literature, other important parameters to consider are: storage conditions (Choo et al. 2015), position within the rows of plants, which might be affected by agricultural practices such as tillage (Burns et al. 2016) but also the time of the sampling. The sampling moment in fact, is possibly affected by meteorological conditions (such as rainfalls) and field-treatments (pesticides and fertilizations). 2- DNA Extraction One of the challenges of this sequencing protocol was to choose a single method for DNA extraction that could work with all the different kinds of samples collected within the MICROWINE Network. After testing a few protocol we ended up choosing a commercial kit marketed from MP-Biochemicals and called FAST DNA Spin Kit for Soil since it allow us to retrieve the highest amount of DNA in all conditions and with low-inhibition which could be solved most of the times applying a moderate dilution step or a purification kit. Furthermore, this kit is based on bead-beating and from the literature it is indicated as the best way to retrieve the highest biodiversity (Vishnivetskaya et al. 2014). Then, we set the different preparation steps to adapt all the samples to this kit. The samples we analyzed are: soil, rhizosphere, leaves, must, wine and trunks. After DNA-extraction we tested the DNA for PCR amplification and visualize the results on agarose gel as summarized in Fig. B2

Figure B1: outcomes of the sampling test conducted; a) the sampling grid we applied in our studies within each vineyards. b) PCoA plots using bray-curtis dissimilarity and jaccard to show the distribution of the samples colored by spot in the field; The letters in this legend correspond to the numbers in panel a. PCoA plot analyses was made in collaboration with Prashant Singh at INRA, Montpellier (France)

Figure B2: 16S rRNA gene amplification with primers 341F and 806R. In order from left to right: marker, washed leaves, crushed leaves, must, wine, trunks, rhizosphere, soil, negative control, marker. 3- Dilution In this case, we performed a test to verify if the dilution with water, intended as a very common practice to normalize DNA amount or to remove PCR inhibitors, can affect the results of microbiome analyses. We selected two kind of samples to test; leaves and soil. Leaves were chosen because they displayed the lowest yield of DNA after an indirect extraction performed on the washing-solution. We performed an indirect DNA extraction to separate the bacteria from the leaf, thus avoiding lysing plant-cells and so getting high amount of host-DNA. After the extraction the DNA was diluted 1:10 with water and amplified with PCR and 16S rRNA gene primers 341F and 806R. Soil was chosen because it is the type of samples in which we recovered the highest amount of DNA. Furthermore, it is characterize by a high-degree of complexity due to its biodiversity. After the extraction the DNA was diluted 1:10, 1:100, 1:1000, and 1:2000 with water and qPCR amplification was performed followed by High Resolution Dissociation Curve analyses, in which the temperature rose from 65° C to 95° C with 0.5° C increase for each cycle. The results are summarized in Figure 3. In Fig. 3a it is possible to see three clusters of two samples each. Each cluster represent a samples coming from an independent DNA extraction from washed grapevine leaves (green) and its dilution 1:10. Therefore, diluting 1:10 does not affect the microbial community allowing us to maintain even small difference in the microbial community. We could have influenced this result adding for instance another sample type (such as soil) to the analyses or using leaves coming from different fields but we kept the amount of confounding variables at the minimum. In this way, we maximized the effect of dilution and this does not appear to be significant with any of the different distance matrixes in Fig. B3a (Bray-Curtis, Jaccard or Unifrac) since all p-values measured with PERMANOVA are much higher than 0,05. In Fig. B3b, over-dilutions change the microbial community (p-value < 0.05), nonetheless the effect of geography is always significant (p-value < 0.05) with any of the tested diluting factor. In our study, however, we never applied a dilution higher than 1:10 in order to normalize the amount of DNA we used in our library preparation.

Figure B3; here the two panels report different PCoA plots obtained on the two different experiment we performed to test the effect of DNA dilution on microbial community. In a) we display three PCoA plots obtained with Bray-Curtis dissimilarity, Jaccard and UNIFRAC distance on 3 samples of leaves that were amplified after DNA extraction (green) and after a 1:10 dilution with water (pink). The panel a was made in collaboration with Prashant Singh at INRA, Montpellier (France). In b) we reported a PCoA plot calculated using Euclidean distance on soil samples coming from France (green), Australia (Red) and Germany (blue) amplified using different dilutions rate 1:10 (circles), 1:100 (triangles), 1:1000 (squares) and 1:2000 (crosses). Panel b was made in collaboration with Toke Bang-Andreasen at Aarhus University (Denmark).

5- Blocking Primers In 2013 Lundberg et al. describe the use of Peptide-Nucleic-Acid based (PNA) blocking primers as a way to prevent the amplification of unwanted DNA from specific host- associated samples. In this publication, he suggested two PNA blocking primer that should prevent the amplification of plastidial DNA from chloroplast (pPNA), and plant- mitochondria (mPNA). We therefore purchase these reagents and tested them on soil, trunks and leaves samples. In our test we amplified DNA samples with and without PNA (in both cases using PCR-protocol suggested for PNA, which include a pre-annealing step for the PNAs at 75° C before the usual primer annealing, to avoid further biases). After analyzing the results through our bioinformatics pipeline with QIIME 2 (reported in Manuscript 2) we noticed a huge differences in relative abundance relative to the sequences assigned to Chloroplast comparing the different groups of samples. In particular, when PNA blocking primers were used we obtained a reduction from around 95% of the total abundance to around 1%. All these results are summarized in Figure B4 In soil PNA blocking primers for chloroplast and mitochondria were not necessary but we apply them to test if they could affect the overall microbial community composition and this was not the case. Slight change in relative abundance could be due to small differences between soil samples. On trunks and leaves samples the reduction of the amplification of chloroplast is remarkable; Finally, to test if the use of PNA introduce a bias compared to the generally performed exclusion of the reads assigned to chloroplast (via bioinformatics algorithm), we compared the taxonomical composition obtained using PNA with the one obtained removing plastidial reads after sequencing. From the last two columns, on the right side of Fig. B4 it is possible to see that the dominant taxa does not change between the two leaves samples obtained this way. However, a reduction in the rare taxa can be observed when the undesired reads are removed in post- sequencing analyses compared to the samples obtained using PNA. To check if there are differences, attributable to the use of PNA, within the rare taxa we would need to retrieve a sample of leaves with a higher amount of reads not assigned to chloroplast. Unfortunately, this was not possible in our test since the remaining reads often are less than a 1000. This does not allow a good representation of the bacterial community on leaves because we does not reach the saturation of the rarefaction curve when we measure the richness in alpha- diversity analyses (not shown).

Figure B4; taxa bar plots obtained from PNA-test on soil, trunks and leaves samples; from left to right the first group includes soil samples obtained with and without PNA; the second group shows samples of trunk obtained with and without PNA; the third group include leaves samples sequenced with and without PNA; finally last group display leaves obtained with PNA versus leaves sequenced without PNA but after bioinformatics removal of reads assigned to chloroplast.