The Pennsylvania State University

The Graduate School

CHARACTERIZATION OF MICROBIAL DYNAMICS AND VOLATILE

METABOLOME CHANGES DURING FERMENTATION OF

CHAMBOURCIN GRAPES IN TWO PENNSYLVANIA REGIONS

A Thesis in

Food Science

by

Hung Li Wang

© 2020 Hung Li Wang

Submitted in Partial Fulfillment of the Requirements for the Degree of

Master of Science

August 2020

The thesis of Hung Li Wang was reviewed and approved by the following:

Josephine Wee Assistant Professor of Food Science Thesis Advisor

Helene Hopfer Assistant Professor of Food Science

Darrell W. Cockburn Assistant Professor of Food Science

Robert F. Roberts Professor of Food Science

Head of the Department of Food Science

ABSTRACT

Numerous studies have indicated that the microbiome can generate various volatile compounds that lead to distinctive wine characteristics. However, little research on the wine microbiome has investigated specific microorganisms and their roles within the entire microbial community using a comprehensive sampling method. Thus, in this study we sampled directly from Central and Northeast PA wineries, instead of using lab-scale production, to study the effect of the wine microbiome on the wine metabolome. The resulting samples were characterized by next-generation sequencing and headspace solid-phase microextraction gas chromatography-mass spectrometry (HS-SPME-GC-MS). Collectively, the innovative sampling and experimental techniques provided a high-resolution picture of microbial dynamics and the resulting wine volatile profiles. Overall, we illustrated how microbial diversity and the relative abundance of specific microorganisms changed as fermentation progressed. Also, various wine volatile metabolites that are formed during the different fermentation stages were identified. Finally, we were able to establish a prediction model based on the microbial and volatile compositions, which could be used in the future to predict wine quality and wine characteristics in general and could also be applied in winemaking processes to help define regionality and terroir.

The wine industry in PA is relatively young and is experiencing rapid growth that is expected to continue. Besides European Vitis vinifera grapes, PA provides an ideal growing environment for the interspecific hybrid (Vitis spp.) Chambourcin, one of the most commonly grown varieties in the state. In this study, we characterized the impact of winery-specific microorganisms on the aroma profiles of Chambourcin red wine during the fermentation process (0-20 days). Interestingly, the microbial composition of wines made from grapes from different regions within PA was characterized by high fungal and bacterial diversity in the earlier fermentation stages (0-4 days). Additionally, changes in wine metabolites, such as volatile esters and alcohols that appeared in the later fermentation stages (7-20 days), provide insights into potential regional differences among Chambourcin red wines. In addition, key discriminant features (microbes and volatile compounds) between the Central and East regions of PA were identified by partial least squares-discriminant analysis (PLS-DA). Finally, regularized canonical correlation analysis (rCCA) modeling suggested potentially causal correlations between the microbial communities and wine volatile compounds. In summary, these results provide grape growers and wine producers with knowledge-based recommendations on how to improve wine production through an increased understanding of the impact of native microbial populations on the development of important wine aroma attributes.

TABLE OF CONTENTS

LIST OF FIGURES ...... xi

LIST OF TABLES ...... xx

ACKNOWLEDGEMENTS ...... xxii

Chapter 1 LITERATURE REVIEW ...... 1

1.1. The Pennsylvania Wine Industry ...... 1

1.2. Characteristics of Pennsylvania Chambourcin, a French-American hybrid wine grape cultivar that is important to Pennsylvania ...... 4

1.3. Microorganisms found in vineyards and wineries can impact winemaking, fermentation, and final wine characteristics ...... 7

1.3.1. Recent advances in high-throughput next generation sequencing (NGS) technologies have altered the way we study microbial ecology in complex systems such as winemaking ...... 12

1.4. Core aroma compounds in the grape and wine metabolome that contribute to sensory properties of final wines ...... 14

1.4.1 The concept of terroir in winemaking and the contribution of microorganisms to shaping terroir ...... 17

1.4.2 Advances in Gas Chromatography – Mass Spectrometry approaches improve resolution of wine volatile profiling ...... 19

Chapter 2 Significance, Innovation, and Approach ...... 21

2.1. Significance ...... 21

2.1.1. Economic impact of wine industry in Pennsylvania ...... 21

2.1.2. Microorganisms present on grape berries and throughout fermentation influence the

chemical properties in wine ...... 22

2.1.3. Data-driven knowledge for improvement of winemaking ...... 23

2.2. Innovation ...... 23

2.3. Approach ...... 25

2.3.1 Rationale ...... 25

2.3.2. Hypothesis and specific aims ...... 26

Chapter 3 Characterization of microbial dynamics and volatile metabolome changes during fermentation of Chambourcin grapes in two Pennsylvania regions ...... 28

Abstract ...... 28

3.1. Introduction ...... 30

3.2. Materials and Methods ...... 33

3.2.1 Fermentation and winemaking sample collection for microbiome-metabolome analysis ...... 33

3.2.2. Preparation of total microbial DNA ...... 38

3.2.3. Amplification, and purification of target gene sequences ...... 38

3.2.4. Sequencing library preparation and construction ...... 40

3.2.5. Raw sequence data processing and phylogenetic tree construction ...... 40

3.2.6. Taxonomic distribution and diversity analyses of microbial populations obtained from Chambourcin wine fermentations ...... 41

3.2.7. Preprocessing of microbial data for regression analyses ...... 44

3.2.8. Determination of cluster number for data clustering to improve taxa differentiation from Linear discriminant analysis Effect size (LEfSe) ...... 45

3.2.9. Characterization of microbial taxa using LEfSe during fermentation ...... 46

3.2.10. HS-SPME-GC-MS for identification and quantification of volatile compounds throughout fermentation of Chambourcin red wine ...... 47

3.2.11. Mass spectral deconvolution and preprocessing of wine metabolite data for regression analyses ...... 48

3.2.12. Distribution and statistical analyses of wine volatile compounds during the fermentation process ...... 49

3.2.13. Partial least squares-discriminant analysis (PLS-DA), a dimensionality reduction approach for regional classification of wine microbiome and metabolites datasets ...... 50

3.2.14. Construction of multivariate data frame and two-level correlation models ...... 51

3.3. Results ...... 52

3.3.1. Illumina sequencing data provided information of sequence counts and total features from ITS2 and 16S rRNA gene sequences ...... 52

3.3.2. Alpha rarefaction determination of optimal sequencing counts for normalization of fungal and bacterial sequences ...... 56

3.3.3. Individual winery taxonomic plots preserve the representation of microbial terroir better than aggregated taxonomic representation ...... 58

3.3.4. The process of fermentation contributes to the overall decline of microbial diversities but impacted fungal and bacterial communities differently ...... 67

3.3.5. Partitioning Around Medoids (PAM) algorithm-based clustering approach groups fungal- and bacterial-dependent fermentation stages into two or three clusters ...... 75

3.3.6. Linear Discriminant Analysis Effect Sizes (LEfSe) identified predominant fungal and bacterial taxa in different fermentation stages ...... 81

3.3.7. Most of Chambourcin red wine metabolites showed significant change during fermentation process ...... 85

3.3.8. Partial Least Squares Discriminant Analysis (PLS-DA) explains regional differences of wine microbiome and metabolome that could contribute to terroir ...... 95

3.3.9. Aggregated microbial signatures from nine wineries showed different correlation patterns with wine metabolome providing information of potential microorganisms contributing to wine aroma profiles ...... 103

3.4. Discussion ...... 115

3.4.1 Changes in fungal and bacterial communities across fermentation stages represent microbial signatures in Chambourcin red wine ...... 115

3.4.2 Wine volatile metabolites showed significant change across fermentation and the structure of compound distribution revealed the aroma profiles of Chambourcin red wine ...... 119

3.4.3 Wine microbial compositions and metabolite structures suggest regionality of Chambourcin wines and key features provided by PLS-DA highlight regional characteristic of Chambourcin ...... 122

3.4.4 Associations between wine microbiome and volatile metabolome revealed the different clustered patterns and provided the knowledge-based information for PA Wine Industry to better understand the microbe and compound interactions during wine fermentation ...... 125

Chapter 4 Conclusions and future directions ...... 129

4.1. Major findings and research conclusions ...... 129

4.2. Future directions ...... 131

Appendix A Supplemental material and supporting data management information and accessibility for microbiome data analysis pipeline ...... 132

Appendix B The impact of sulfite treatment on microbial populations during early fermentation stages of winemaking ...... 136

References ...... 139

LIST OF FIGURES

Figure 1-1: Number of wineries growth from 2000 to 2017 (Thompson et al., 2019)...... 2

Figure 1-2: Wine production in Pennsylvania from 2003-(Dombrosky & Gajanan, 2013)...... 3

Figure 3-1: Map of sample collection sites in Central and Northeast PA. Locations in Central PA are highlighted by a dotted circle and locations in Northeast PA are represented by dotted triangle. Each red circle represents an individual winery where samples were collected throughout ten predetermined stages of Chambourcin fermentation and winemaking ...... 33

Figure 3-2: Survey given to winemakers highlighting fermentation stages and sample collection for microbiome-metabolome study ...... 37

Figure 3-3: Distributions of sequence counts. (A) Fungal communities; (B) Bacterial communities ...... 55

Figure 3-4: Alpha rarefaction curve on sequence depth. The minimum read count for sufficient richness was around 16,000 in (A) fungal communities and around 7,000 in (B) bacterial communities. Each colored line represents the samples from each winery and fermentation stage ...... 57

Figure 3-5: Fungal and bacterial taxonomic composition is influenced by fermentation state of Chambourcin hybrid grapes. Taxonomic plots demonstrate the relative abundance of top 20 (A) fungal and (B) bacterial taxa throughout fermentation stages (denoted ‘S’ on the x-axis). Saccharomyces cerevisiae is typically added into wine fermentation and its relative abundance is indicated by the dotted line in Panel A. Oenococcus oeni is typically added later in fermentation to drive malolactic fermentation (MLF) and its relative abundance is indicated by the dotted line in Panel B. Unidentified genera were only shown at the family level and ‘unidentified’ taxa were grouped into a category “f__others” ...... 60

Figure 3-6: Fungal taxonomic composition presented from (A) Central and (B) Northeast region. Taxonomic plots demonstrate the relative abundance of top 20 fungal taxa throughout fermentation stages (denoted ‘S’ on the x-axis). Saccharomyces cerevisiae is typically added into wine fermentation and its relative abundance is indicated by the dotted line. Unidentified genera were only shown at the family level and ‘unidentified’ taxa were grouped into a category “f__others” ...... 62

Figure 3-7: Bacterial taxonomic composition presented from (A) Central and (B) Northeast region. Taxonomic plots demonstrate the relative abundance of top 20 bacterial taxa throughout fermentation stages (denoted ‘S’ on the x-axis). Oenococcus oeni is typically added later in fermentation to drive malolactic fermentation (MLF) and its relative abundance is indicated by the dotted line. Unidentified genera were only shown at the family level and ‘unidentified’ taxa were grouped into a category “f__others” ...... 63

Figure 3-8: High microbial diversity was observed at the individual winery level. For each winery, the relative abundance of the top 20 bacteria (A) and fungi (B) taxa that differed significantly not only by fermentation stages but also winery are shown. Superimposed on each panel are (A) Saccharomyces cerevisiae and (B) Oenococcus oeni relative abundances. Dendrograms of each winery were displayed based on the relative abundance of the taxa at the genus level in each winery using Euclidean distances and Ward clustering algorithm ...... 66

Figure 3-9: High microbial diversity observed in early stages for wine fermentation. Shown here are the rarefied α-diversity distributions calculated as Shannon diversity index at each fermentation stage from nine wineries. Box plots represent 1.5*IQR (interquartile range) / sqrt(n), corresponding to the 95% confidence interval. Panel (A) represents fungal richness and (B) represents bacterial richness. Significant differences between stage 1 and other groups were determined using Kruskal-Wallis test with FDR adjusted p-value (q-value). * q-value < 0.05, ** q-value < 0.01, *** q-value < 0.005 ...... 69

Figure 3-10: Rarefied α diversity distributions of microbial richness and evenness throughout stages of fermentation with samples collected from nine wineries. (A-B) Faith’s phylogenetic diversity (richness) for fungi (A) and bacteria (B) and Pielou’s evenness for (C) fungi and (D) bacteria. Significant differences between stage 1 and other groups were determined using Kruskal-Wallis test with FDR adjusted p-value (q-value). * q-value < 0.05, ** q-value < 0.01, *** q-value < 0.005 ...... 70

Figure 3-11: Rarefied β-diversity based on unweighted UniFrac distances for (A) fungi and (B) bacteria, and weighted UniFrac distances for (C) fungi and (D) bacteria colored by fermentation stages. Principal Coordinates Analysis (PCoA) of taxonomy data plotted according to the first two principal components across fermentation stages. Statistical significance determined by PERMANOVA, p-value = 0.001 ...... 73

Figure 3-12: Cladogram representing different taxa across fermentation stages without clustering process. Linear Discriminant Analysis Effect Sizes (LEfSe) analyses were performed using relative abundance data averaged at each stage for each winery at the highest classification level taxonomy available. Data shown are the hierarchy of discriminating taxa visualized as a cladogram for comparison across different fermentation stages. Only taxa with LDA scores > 3, p-value < 0.05 are shown in the cladogram ...... 77

Figure 3-13: Optimal clustering number for microbial communities at different fermentation stages. Partitioning around medoids (PAM) analysis showed average silhouette width considering two clusters for (A) fungi and (B) bacteria, and total within sum of squares considering 3 clusters for (C) fungi and (D) bacteria ...... 79

Figure 3-14: Clusters of 10 fermentation stages in principal component analysis based on the data obtained from α-diversity analysis (Shannon index, Faith’s PD, and Pielou’s evenness). (A-B) Samples of fungal communities in three (selected for use in this study) and two clusters; (C-D) Samples of bacterial communities in two (selected for use in this study) and three clusters ...... 80

Figure 3-15: Differential taxa among early, middle, and late stages of fermentation for fungal communities. LEfSe analyses were performed using relative abundance data averaged by wineries at the highest classification level taxonomy available. Data shown are the log10 linear discriminant analysis (LDA) scores following LEfSe analyses and the hierarchy of discriminating taxa visualized as cladograms clustered into three groups, early, middle, and late fermentation stages. Differential taxa with LDA scores > 3, p-value < 0.05 were used as cutoff ...... 83

Figure 3-16: Differential taxa among early and late stages of fermentation for bacterial communities. LEfSe analyses were performed using relative abundance data averaged by wineries at the highest classification level taxonomy available. Data shown are the log10 linear discriminant analysis (LDA) scores following LEfSe analyses and the hierarchy of discriminating taxa visualized as cladograms clustered into two groups, early and late fermentation stages. Differential taxa with LDA scores > 3, p-value < 0.05 were used as cutoff ...... 84

Figure 3-17: GC-MS chromatogram of wine metabolites averaged by wineries demonstrating the effect of fermentation stages on the production of volatile compounds. Chromatogram for (A) Stage 1 (early fermentation) and (B) Stage 10 (late fermentation). *data without sample PA_05_S1 ...... 86

Figure 3-18: 84 volatile compounds detected by GC-MS from samples collected across fermentation stages from all nine wineries. Compounds were grouped based on their chemical structures. Categorization of chemical structures were defined as previously published (Ilc et al., 2016) ...... 89

Figure 3-19: The demonstration of significant differences among fermentation stages of total identified volatile compounds (n=84) identified using Kruskal Wallis Test. FDR-adjusted p-value (q-value) < 0.05 (green dots, n=16) ...... 90

Figure 3-20: Distribution of top 20 highest abundant metabolites across wine fermentation. All compounds showed significant differences (Kruskal Wallis Test, FDR-corrected p-value < 0.05) among ten fermentation stages. *, represented compounds without RI validation ...... 91

Figure 3-21: Distribution of top 20 highest abundant metabolites across wine fermentation in (A) Central and (B) Northeast regions. All compounds showed significant differences (Kruskal Wallis Test, FDR-corrected p-value < 0.05) among ten fermentation stages. *, represented compounds without RI validation ...... 94

Figure 3-22: Distribution of omics data before (left, skewed) and after (right, more normally distributed) the data preprocessing involving log2 transformation and Pareto scaling. Plots are presented for (A) fungal omics data; (B) bacterial omics data; (C) wine metabolite omics data ...... 97

Figure 3-23: 3D plot between the selected PCs with PLS-DA classification with Leave-one-out cross-validation (red star indicates the best classifier). (A-B) Fungal community; (C-D) Bacterial community; (E-F) Wine metabolites. Permutation test (n = 1000) was performed for the significance of class discrimination ...... 99

Figure 3-24: Important features identified by PLS-DA in the selected component based on the highest explained variance. The colored boxes on the right indicate the relative concentrations of the corresponding metabolite in each group under study. (A) Fungal community; (B) Bacterial community; (C) Wine metabolites. Top 15 features were listed based on VIP scores. VIP > 1.0 were considered as important contributors to the PLS-DA model. (1R,2R,5R,E)-7-Ethylidene-# was denoted as (1R,2R,5R,E)-7-Ethylidene-1,2,8,8-tetramethylbicyclo[3.2.1]octane. *, represented compounds without RI validation ...... 102

Figure 3-25: Leave-one-out cross validation (CV) results of lambda values used in rCCA model. Values calculated from (A) fungal data set versus metabolic data set and (B) Bacterial data set versus metabolic data set were shown in the heatmap. Arrow indicated the location of highest CV-score determined by λ1 and λ2 ...... 104

Figure 3-26: A scree plot of the canonical correlation coefficient from each dimension on rCCA model. Results were calculated from (A) fungal community and (B) bacterial community ...... 104

Figure 3-27: Explained variance bar plots of the first 5 dimensions in the rCCA model from (A) fungal data and (B) bacterial data ...... 105

Figure 3-28: Circle plot of rCCA model highlighted the contribution of correlation between microorganisms and metabolites to each selected dimension. Clustering of two data sets indicated the overall correlations between them. The strength of correlation is demonstrated by the distance from the center of the circle (the further distance the better). The distribution of each variable was defined by (A) dimension 1 and 2 in fungal data and (B) dimension 1 and 3 in bacterial data ...... 106

Figure 3-29: rCCA results of relationships between fungal community (shown in genus level) and wine metabolites were visualized in the heat map showing positive correlation (red) and negative correlation (blue). Unidentified genus was represented by its family or order name. Variables with correlations below 0.3 in absolute value are not plotted. *, represented compounds without RI validation ...... 109

Figure 3-30: rCCA results of relationships between bacterial community (shown in genus level) and wine metabolites were visualized in the heat map showing positive correlation (red) and negative correlation (blue). Unidentified genus was represented by its family or order name. Variables with correlations below 0.3 in absolute value are not plotted. *, represented compounds without RI validation ...... 110

Figure 3-31: Top highest positive and negative loading values of taxa from rCCA model were selected for the species level correlation. The values were defined by (A) fungal data in first dimension and (B) bacterial data in first and third dimension. Genera containing only unidentified species were filtered out for the following correlation analysis ...... 113

Figure 3-32: Heat map of Spearman’s correlation between microbiome (shown in species level) and wine metabolites. Correlation coefficient with p-value > 0.05 (shown in white block) and unidentified species underlying the selected genus were removed. *, represented compounds without RI validation ...... 114

Figure A1-1: Principal-coordinate analysis (PCoA) of unweighted UniFrac method using (a) a de novo phylogenetic approach; (b) de novo approach with manual distance adjustment; (c) reference phylogenetic approach (Janssen et al., 2018) ...... 133

Figure A1-2: PCoA of ITS sequences from saliva (blue) and restroom (red) generated using (a) Binary Jaccard; (b) unweighted UniFrac with MUSCLE alignment; (c) unweighted UniFrac using ghost-tree generated phylogeny (https://github.com/JTFouquier/ghost-tree-trees) ...... 134

Figure B1-1: Taxonomic plots demonstrate the relative abundance of top 20 (A) fungal taxa at different spontaneous fermentation times (hr) and treatments, for Chambourcin grapes collected from different wineries. Unidentified genera were only shown at the family level and ‘unidentified’ taxa were grouped into categories, “g_others” (species unidentified), “f_others” (genus unidentified), “O_others” (family unidentified) ...... 138

Figure B1-2: Rarefied alpha diversity distributions of microbial richness and evenness between Sodium metabisulfite treatment of spontaneous fermentation wine samples. (A) Faith’s phylogenetic diversity (richness) (B) Pielou’s evenness. Significant differences between treatments were determined using Kruskal-Wallis test with FDR adjusted p-value (q-value). * q-value < 0.05 ...... 138

LIST OF TABLES

Table 3-1: Summary of fermentation stages and sample collection...... 34

Table 3-2: Summary of fermentation stages and samples collected...... 35

Table 3-3: Denoised ITS2 and 16S sequence counts and total features of wine samples...... 53

Table 3-4: Results of the PERMANOVA pairwise comparison of fungal and bacterial β-diversity using unweighted and weighted UniFrac distance metrics. FDR-adjusted p-values (q-value) are reported for statistical significance...... 74

Table 3-5: Number of taxa identified at each taxonomic level between fungal (ITS2) and bacterial (16S) communities ...... 76

Table 3-6: Distributions of clustered fermentation stages of all samples. Three clusters resulted in better performance for fungal community and two clusters for bacterial community ...... 81

Table 3-7: 84 Target Chambourcin Core VOCs detected in all wineries in the last fermentation stages (S10) were selected for the downstream analyses. Categorization of chemical structures were defined as previously published (Ilc et al., 2016). RT, Retention Time; RI, Retention Index; *, represented compounds without RI validation ...... 87

Table 3-8: Wine important metabolites with significant differences across fermentation stages. Data shown here are based on Kruskal Wallis Test for non-parametric analysis ...... 91

Table 3-9: Leave-one-out cross-validation results of PLS-DA model from the omics data of fungi, bacteria, and wine metabolites. The overall accuracy is defined by the ratio of the sum of the correct diagonals / number of cases ...... 98

ACKNOWLEDGEMENTS

First, I would like to express my greatest gratitude to my advisor, Dr. Josephine Wee, for her continuous guidance throughout my research. She has always encouraged and advised me during my studies; I would not have come this far without her help. Also, I would like to acknowledge my committee members, Dr. Helene Hopfer and Dr. Darrell W. Cockburn, for their insightful comments and guidance on my microbial and chemical experiments. I also want to thank my lab members, Dr. Xue Du and Chun Tang (Elena) Feng, who spent their time helping me set up the experiments and discuss the results. I will never forget all the valuable feedback that contributed to my research and this thesis. I would also like to thank Taejung Chung (Dr. Jasna Kovac laboratory), who helped me with the microbial analysis pipeline, and Andrew Poveromo and Stephanie Keller (Dr. Helene Hopfer laboratory), who taught me how to conduct GC-MS analysis.

Last but not least, I would like to express my deepest appreciation to my parents and Chieh Yu (Ann) Liu for their genuine support and care during these years.

This research, conducted at Penn State University Park, was supported by the USDA National Institute of Food and Agriculture and Hatch Appropriations under Project #PEN04699 and Accession #1019351 and by the Crouch Endowment for Viticulture, Enology, and Pomology Research. The findings and conclusions in this study do not necessarily reflect the view of the funding agency.

CHAPTER 1

LITERATURE REVIEW

1.1. The Pennsylvania Wine Industry

In the United States, more than 900 million gallons of wine were produced in 2018 (General Industry Stats, 2019). This volume accounts for 12% of global wine production. The estimated value of United States wine exports reached $1.46 billion in 2018, and 95% of exported wines originate from California. Within the US, California produced 85% of wine, followed by Washington (4.5%) and New York (3.5%) (General Industry Stats, 2019).

Pennsylvania (PA) is not typically viewed as a wine region. The economic impact of the wine production industry in PA is estimated at $200 million annually (Happer & Kime, 2013). Because the industry is young, wine production and quality are of high importance to PA growers and winemakers. The state is primarily composed of smaller wineries: 88% of PA wineries produce fewer than 20,000 gallons per year, while the larger wineries, representing 12% of the industry, produce more than 20,000 gallons per year (MKF Research LLC, 2009). Notably, a large share of the wine produced is sold within the state, with approximately 85% sold directly to the consumer (MKF Research LLC, 2009). In 2018, 285 wineries in Pennsylvania produced about 2.1 million gallons of wine, accounting for about 0.2% of all US wine production, which places PA as the 10th largest wine producer in the US by volume (General Industry Stats, 2019). The production of wine grapes accounted for 30.2% of total grape production in PA. PA is traditionally known as a large juice and jelly grape producer, and wine grape production is of lower economic value (7.8 million dollars in 2018) than juice grape production (17.4 million dollars in 2018). Still, based on the economic impact of wine grape production in 2018, which includes wine sales directly to consumers, to on-site restaurants, or to retailers, the PA wine industry is economically important and contributed around 418.3 million dollars to PA’s economy in 2018 (John Dunham & Associates, 2019).

Overall, the PA wine industry has experienced continuous growth in both the number of wineries (Figure 1-1) and the gallons of wine produced (Figure 1-2) (Thompson et al., 2019). However, the low percentage of PA-produced wine sold at the distributor level and the amount that is exported out of state could limit industrial development (Dombrosky & Gajanan, 2013).

Figure 1-1: Number of wineries growth from 2000 to 2017 (Thompson et al., 2019).

Figure 1-2: Wine production in Pennsylvania from 2003-(Dombrosky & Gajanan, 2013).

Although PA ranks 10th in US wine production and 7th in number of wineries, its total economic impact was lower than that of comparable nearby states, such as New York, Ohio, Virginia, or Michigan, all of which experienced a larger economic impact from their respective wine industries (General Industry Stats, 2019). One possible reason for this lower competitiveness is that the growth rate of wineries in PA still has not been as rapid as in other eastern states, and this condition might be further exacerbated by the lack of in-state wine grape production, which forces PA wineries to purchase grapes or juice from other states (Maurus Brown, 2000). An explanation for the limited in-state production of wine grapes is presented by Centinari et al. (2016), who surveyed 39 Pennsylvania wine and grape growers about the challenges of growing grapes in PA. The most important challenge was winter injury, followed by disease pressure, both of which could lead to unstable and limited production of high-quality wine grapes and potentially increase costs (Dombrosky & Gajanan, 2013). It was also reported that Pennsylvania wineries might suffer from inconsistent wine quality across vintages, whereas consistent quality is preferred in the American market (Gardner, 2016). Overall, although the Pennsylvania wine industry continues to grow, many wine grape growers in the region still face similar challenges, such as damaging winters and high disease pressure, which lead to shortages in wine grape production and variable wine quality from vintage to vintage.

A potential solution is to determine strategies that provide consistent cultivation of suitable grape varieties and to better understand how the ecology of microorganisms on these grapes affects grape and final wine quality. With this greater knowledge, Pennsylvania winemakers may be able to maintain and engineer red wine characteristics through optimized microorganism management that strategically meets the preferences of wine consumers.

1.2. Characteristics of Pennsylvania Chambourcin, a French-American hybrid wine grape cultivar that is important to Pennsylvania

Pennsylvania has a relatively humid climate, with heavier rainfall throughout the growing season and strong winds during harvest time that result in grape berries with thick skins and loose clusters (Thompson et al., 2019). Among all varieties of wine grapes grown in PA, native American wine grapes account for 70%, with 67% being Concord, followed by about 4% of Catawba and Delaware. Although interspecific hybrid wine grapes are relatively new in the grape community, varieties such as Chambourcin and Vignoles, among others, account for 13% of the wine grapes grown in PA due to their higher resilience in harsher climates and consistent flavor profile, while European Vitis vinifera grapes, including Cabernet Franc and Grüner Veltliner, among others, make up about 9% (Happer & Kime, 2013; Tettemer, 2017; Thompson et al., 2019).

It has been previously reported that unsuitable varietal selection of wine grapes can lead to growth deficiency of the plant and reduce fruit quality as a result of disease or climate (Happer & Kime, 2013; Reisch et al., 1993). Furthermore, the quality of wine depends on the condition of the grapes during growth and at harvest, including factors such as sugar content, acidity, and various aroma compounds (Jackson & Lombard, 1993; van Leeuwen & Darriet, 2016). The frequent extreme weather events and heavier rainfall in PA (Thompson et al., 2019), including fluctuating temperatures and unstable seasonal rainfall patterns, can aggravate the spread of grape disease; for instance, warmer and damper conditions favor the growth of powdery mildew (Erysiphe necator) (Caffarra et al., 2012). Consequently, choosing a grape variety that matches the growing season and adapts well to an unsteady climate will support flavor development in the berries and enhance the sustainability of wine production (Hoemmen et al., 2015).

Pennsylvania exhibits hot summers and cold winters, a climate more similar to that of central European wine regions than to that of California. This type of climate presents the ideal terroir for the growth of hybrid grapes (Thompson et al., 2019). Chambourcin, pronounced “SHAM-bour-sin”, is a French-American hybrid (Seyve-Villard 12-417 x ) wine grape variety with a relatively dark skin and neutral flavor. Chambourcin was first commercialized in 1963, and this grape variety produces a red wine with low tannins and high acidity (Dewey, 2017; Gardner, 2016). In Pennsylvania, this variety is the most abundant hybrid grape, grown primarily in the North Central, South West, South Central, and South East regions, making Chambourcin an important grape cultivar for winemaking in the state (Dewey, 2017). Previous research suggests that, compared to V. vinifera, Chambourcin is more tolerant of temperature fluctuations and relatively more resistant to cold temperatures (Dombrosky & Gajanan, 2013; Gardner, 2016; Homich et al., 2016).

In addition, Chambourcin grape berries are more tolerant to disease pressures such as downy mildew, powdery mildew, and Grapevine vein clearing virus (GVCV) (Barlass et al., 1987; John Hartman & Julie Beale, 2008). Another advantage of Chambourcin is that it has a relatively long growing season and is one of the last grapes to be harvested (Coia & Ward, 2017; Gardner, 2016; Guo et al., 2014; Reisch et al., 1993).

As mentioned above, Chambourcin grapes generally produce wines with low tannins and high acidity. From a sensory perspective, one study indicated that Chambourcin red wine does not detract from Vitis vinifera red wine made from European wine grapes and can instead enhance its taste and aroma properties (Coia & Ward, 2017). In other words, it was suggested that Chambourcin could play a role in high-quality red wines and provide another route towards sustainable viticulture owing to its disease resistance and tolerance of heat and humidity. Although native American wine grapes are the main grape varieties grown in PA, along with a small share of Vitis vinifera, there is increasing interest in hybrid grapes such as Chambourcin because of their versatility and unique wine flavor profile. These characteristics give Chambourcin the ability to produce quality red wine and offer a route toward sustainability in viticulture and the wine industry. For example, Chambourcin’s higher tolerance to disease pressures, combined with its tolerance to colder weather, results in more stable grape production and potentially lower pesticide use for disease control. One proposed explanation for the greater climate and disease tolerance of Chambourcin is the presence of a range of indigenous microorganisms on grape berries, which can control spoilage as well as contribute to final wine quality (Bozoudi & Tsaltas, 2016). Therefore, given the limited information on the microbiome of hybrid grapes, research on the microbial ecology of wine grapes could be seen as a necessary first step towards controlling PA wine fermentation to achieve stable, high-quality wine.

1.3. Microorganisms found in vineyards and wineries can impact winemaking, fermentation, and final wine characteristics.

Recent studies have demonstrated that microbial populations present on grape berries and throughout fermentation can impact characteristics of final wines (Bozoudi & Tsaltas, 2016).

Several studies have shown that microbial populations present in the soil, on grape berries, in winemaking environments, and in the air contribute to the final characteristics of wine. Winemaking is driven by fermentation, a bioconversion process in which microorganisms participate in different biosynthetic pathways, converting grape juice into wine and determining final wine quality. With progress in understanding the complex mechanisms of biochemistry, molecular biology, and microbial physiology, the fundamental impact of microbial communities and individual species during the conversion of must/juice into final wine is becoming clearer (Romano et al., 2019). During winemaking, a diverse population of fermentative yeasts, filamentous fungi, and bacteria plays a crucial role in the final characteristics of wine. During alcoholic and malolactic fermentation, microorganisms not only convert carbohydrates into ethanol but also provide deacidification and generate small quantities of important metabolites that contribute to the sensory properties of wine (Romano et al., 2019; Versari et al., 1999).

Yeasts, a community of microorganisms essential to alcoholic fermentation, also contribute to wine aroma and flavor and can have both positive and negative effects on wine quality. The yeasts described by Cletus Kurtzman et al. (2011) within the grape and wine ecosystem are generally grouped as Saccharomyces and non-Saccharomyces yeasts. Studies of wine ecology have shown that non-Saccharomyces yeasts (also commonly known as native wild yeasts) include the genera Starmerella, Candida, Pichia, Hanseniaspora/Kloeckera, Cryptococcus, Rhodotorula, Saccharomycodes, Metschnikowia, Kluyveromyces, and Zygosaccharomyces, all of which are present in various abundances on grape berries and at different stages of fermentation (Masneuf-Pomarede et al., 2016). Although a diverse population of yeasts is present on wine grapes and throughout fermentation, Saccharomyces cerevisiae is recognized as the dominant yeast during fermentation and is involved in the production of alcohol and sensory-related metabolites (Swiegers et al., 2005). Both the scientific literature and the historical use of S. cerevisiae in winemaking support S. cerevisiae as the hallmark yeast of choice for the production and accumulation of ethanol and sensorially important compounds. Another desirable characteristic of S. cerevisiae is its high efficiency in metabolizing sugar to ethanol and producing by-products such as higher alcohols, esters, and carbonyl compounds, which have been shown to have a positive impact on the sensory profile of wine, although it has also been reported to produce volatile sulfur compounds, which are detrimental to wine aroma (Cordente et al., 2012).

In the wine industry today, commercial S. cerevisiae strains produced by different companies and laboratories, such as K1M ICV-INRA from Lallemand Inc or ICV yeast from Scott Laboratories, are widely used in winemaking (Valero et al., 2005). It should be appreciated that although S. cerevisiae has been recognized as an important yeast during winemaking, there exist many different strains and varieties of S. cerevisiae. Bozoudi & Tsaltas (2016) describe how the use of commercial S. cerevisiae leads wine fermentations to become “universally flat”, which might not allow winemakers to produce wines with unique characteristics across different wineries from the same region.

On the other hand, the process of spontaneous fermentation traditionally used in winemaking is thought to result in unpredictable final wine quality. Thus, in general, non-Saccharomyces species (wild yeasts) are not viewed favorably in winemaking because of their lower fermentative capabilities and production of undesired compounds. Therefore, the addition of commercial S. cerevisiae during winemaking (also known as controlled fermentation) is a more common practice in wineries to control the vinification process. However, Heard & Fleet (1985) indicated that under certain conditions, populations of wild yeasts persist and continue to grow together with the commercial S. cerevisiae until the end of fermentation. Recently, Andorrà et al. (2019) showed that wild yeasts present together with S. cerevisiae actively participate in the fermentative process, resulting in a variety of interactive enzymatic activities that could play a role in the complexity of wine aroma. If the balance between S. cerevisiae and wild yeasts is managed well, their interaction and metabolic impact can have positive influences on the sensory attributes of wines. For instance, Schizosaccharomyces strains are used to reduce urea content (a precursor of ethyl carbamate) to enhance food safety and to increase pyruvic acid to stabilize the color of wines (Benito et al., 2014). Kluyveromyces, Hanseniaspora, and Metschnikowia were reported to have β-glucosidase activity, that is, the ability to hydrolyze glycoside-bound aroma compounds, which enhances the sensory attributes of final wines (Pérez et al., 2011). Therefore, the importance of wild yeasts cannot be overlooked, and their contributions to ecological diversity and metabolic attributes require more research and understanding in order to exploit their beneficial properties (Fleet, 2003).

Filamentous fungi can also impact wine production at different points during the winemaking process. Several fungi detected during grape cultivation have the potential to influence wine flavor. The most abundant filamentous fungi associated with wine grapes during winemaking and fermentation are Aspergillus, Penicillium, Alternaria, Botrytis, Rhizopus, Plasmopara, Uncinula, and Cladosporium (Fleet, 2003; Romano et al., 2019). The occurrence of filamentous fungi has been indicated to influence wine flavor indirectly. For example, β-glucans produced by Botrytis cinerea and ochratoxin A produced by Aspergillus and Penicillium fungi could retard the activities of other microorganisms that produce sensory-related compounds during fermentation (Fleet, 2003; Reed & Nagodawithana, 1988). Some filamentous fungi may stimulate fermentation or improve growing conditions for acetic acid bacteria by diminishing the activity of yeasts (Ribéreau-Gayon, 1985). In general, filamentous fungi are typically associated with grape berries in the vineyard and are not usually contributors during winemaking. Hence, this group of fungi could be a potential research target for the improvement of wine quality.

Besides the fungal species, bacteria can also have positive and negative effects on wine quality characteristics. There are two main groups of bacteria commonly found in the wine ecosystem: lactic acid bacteria (LAB), which are responsible for malolactic fermentation (MLF), and acetic acid bacteria (AAB), which are known to produce oxidation products such as acetaldehyde and acetic acid (Fleet, 2003; Swiegers et al., 2005). While a lower level of acetic acid (0.2–0.6 g/L) can add a pleasant sourness to the wine (Luo et al., 2013), at a higher level (>0.7 g/L) it is considered a wine spoilage fault that is legally regulated in most wine-producing countries (e.g., 1.4 g/L for red wines in the USA; Denise Gardner, 2015).
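The acetic acid ranges quoted above can be summarized as a simple lookup. The following is a minimal sketch that assumes only the three values given in the text; the function name and category labels are hypothetical, and concentrations outside the quoted ranges are reported as not characterized instead of being guessed.

```python
# Values quoted in the text, in g/L of acetic acid.
PLEASANT_RANGE = (0.2, 0.6)   # adds a pleasant sourness
SPOILAGE_THRESHOLD = 0.7      # above this, regarded as spoilage
US_RED_LEGAL_LIMIT = 1.4      # legal limit cited for red wines in the USA


def classify_acetic_acid(g_per_l: float) -> str:
    """Return an illustrative label for an acetic acid concentration (g/L)."""
    if g_per_l < 0:
        raise ValueError("concentration cannot be negative")
    if g_per_l > US_RED_LEGAL_LIMIT:
        return "above US legal limit for red wine"
    if g_per_l > SPOILAGE_THRESHOLD:
        return "spoilage range"
    if PLEASANT_RANGE[0] <= g_per_l <= PLEASANT_RANGE[1]:
        return "pleasant sourness range"
    # Below 0.2 g/L and between 0.6 and 0.7 g/L the text gives no description.
    return "not characterized in the text"


for value in (0.4, 0.65, 0.9, 1.5):
    print(value, "g/L:", classify_acetic_acid(value))
```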

Within the LAB, four genera, Lactobacillus, Pediococcus, Leuconostoc, and Oenococcus, are the most relevant and are able to survive the low pH and high ethanol concentration of wine (Lonvaud-Funel, 1999). LAB not only ferment residual sugars left by yeasts but also contribute to biochemical reactions such as esterase and glycosidase activities and methionine metabolism (Inês & Falco, 2018). After completion of alcoholic fermentation, Oenococcus oeni (formerly Leuconostoc oenos) is the best-adapted species and is used almost exclusively for MLF, in which malic acid is converted to lactic acid, resulting in a softer mouthfeel in the final wine. Moreover, it is reported that citric acid is also metabolized by O. oeni and that one of the intermediary compounds is diacetyl, considered an important flavor compound produced by MLF (Nielsen & Richelieu, 1999). However, LAB can also produce off-flavors depending on the available substrates. One example is potassium sorbate, added as a yeast inhibitor, which LAB can degrade to 2-ethoxyhexa-3,5-diene, a compound with an unpleasant odor of crushed geranium leaves (Chisholm & Samuels, 1992). The AAB are commonly considered wine spoilage microorganisms because of their production of oxidation products such as acetaldehyde and acetic acid (Swiegers et al., 2005). Acetobacteraceae is the main family of the ubiquitous AAB; growth can occur in grape must, and a few strains survive during fermentation (Bartowsky & Henschke, 2008). It is known that AAB are able to transform glucose into gluconic acid and ethanol into acetic acid, which can easily accumulate in wine (Mamlouk & Gullo, 2013). However, it has also been reported that some Acetobacter and Gluconobacter bacteria, especially Acetobacter pasteurianus, are able to oxidize lactate into acetoin, which has a "butter-like" aroma (Drysdale & Fleet, 1988), demonstrating the complex effects of AAB on final wine quality.

In summary, microorganisms such as yeasts, filamentous fungi, and bacteria, both those found in vineyards and wineries and those added during winemaking, play different roles and can impact fermentation and final wine characteristics. Taken together, identifying and understanding the interactions that occur between yeasts and other microbial populations will be critical for developing strategies to tailor the sensory properties of wine to meet consumer demands within a region. As consumption of premium wines and consumer interest in unique wines continue to grow, managing the impact of microbial diversity on grape berries and wine fermentations could be one way to meet this demand for unique, consistent, high-quality wines.

1.3.1. Recent advances in high-throughput next generation sequencing (NGS) technologies have altered the way we study microbial ecology in complex systems such as winemaking.

In a complex system such as winemaking, the biodiversity of native microorganisms has been found to be incredibly complex, even when the must is inoculated with commercial strains of S. cerevisiae during primary fermentation and O. oeni during MLF. Recently, there has been increased interest in the use of indigenous (native or wild) microorganisms as an alternative to commercial inoculants to create unique wines with distinct sensory attributes. Indigenous microorganisms were reported to adapt better to a specific grape must matrix, which could preserve the influence of grape variety and regional characteristics. In other words, the particular mixture of naturally present microbes has the potential to influence the aromas, flavors, and color of wines (Bozoudi & Tsaltas, 2016). Accordingly, it is important to study the taxonomic profile and biodiversity of the wine microbiome in a given environment so that strategies to improve wine quality can be established on the basis of that knowledge.

Targeted culture-dependent analysis methods have been extensively applied to the wine grape system to investigate the effect of specific microbial populations on grapes or in wines on wine quality (Luca Cocolin et al., 2013). For example, among the selected indigenous yeast strains Sc5, Sc11, Sc21, and Sc24, strain Sc11 has been shown to produce more 2-phenylethanol, a desirable floral odor compound, in Mencía wines (Blanco et al., 2014). The microbiome is defined as “the entire habitat, including the microorganisms (bacteria, archaea, lower and higher eukaryotes, and viruses) and their genomes (i.e., genes), in the surrounding environmental conditions” (Marchesi & Ravel, 2015), whereas the metagenome is defined as “the genome of the total microbiota found in nature”, which is the assemblage of genomic information from the whole microbiome in a given ecosystem (Handelsman et al., 1998). The use of emerging sequencing technologies such as massively parallel sequencing, also known as next-generation sequencing (NGS), allows researchers to study entire microbial populations in complex fermentation systems. NGS microbiome studies allow millions of sequencing reactions to be conducted in parallel to identify microbial populations without a priori knowledge of the community. Researchers are now able to explore the contribution of microbial populations in complex and dynamic systems without the need for cultivation to decipher the taxonomic populations (Morgan et al., 2017). Compared to other sequencing techniques, NGS with the Illumina platform requires shorter amplicons (~500 bp); however, the millions of generated DNA templates can be quantified in a given population. Furthermore, the requirement for shorter sequences contributes to a significant improvement in data accuracy compared to Sanger sequencing (Reis-Filho, 2009).

NGS can also be used for the longitudinal characterization of microbial genomes, including changes in population abundance across time points and fluctuations in taxonomic diversity in a given environment (Bergström et al., 2014). For instance, when NGS was applied to bacterial communities associated with the vineyard, the most abundant phyla identified were Proteobacteria, Bacteroidetes, and Acidobacteria (Burns et al., 2015; Zarraonaindia & Gilbert, 2015). Moreover, NGS was also used to investigate the fungal consortia associated with wine grapes, which mainly comprised filamentous fungi, including Aspergillus, Alternaria, and Penicillium, and yeasts, including Hanseniaspora, Issatchenkia, Pichia, and Candida (David et al., 2014; Kecskeméti et al., 2016). In addition, it was reported that NGS more often reveals information on filamentous fungi than on yeast species present at low abundance, which suggests that some cultivable yeasts, such as Kazachstania and Malassezia, were overlooked in non-NGS studies (David et al., 2014; Morgan et al., 2017; Pinto et al., 2015).
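As an illustration of what tracking population abundance and taxonomic diversity across time points involves, the sketch below computes relative abundances and the Shannon diversity index from per-taxon read counts. The counts are hypothetical values chosen for illustration, not data from any cited study, and the Shannon index is used here as one common diversity measure, not necessarily the one applied in the cited work.

```python
import math


def relative_abundance(counts):
    """Convert raw read counts per taxon into proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with p_i > 0."""
    proportions = relative_abundance(counts).values()
    return -sum(p * math.log(p) for p in proportions if p > 0)


# Hypothetical read counts per genus at two fermentation time points.
day0 = {"Hanseniaspora": 400, "Pichia": 300, "Candida": 200, "Saccharomyces": 100}
day10 = {"Hanseniaspora": 20, "Pichia": 10, "Candida": 10, "Saccharomyces": 960}

for label, sample in (("day 0", day0), ("day 10", day10)):
    share = relative_abundance(sample)["Saccharomyces"]
    print(f"{label}: Shannon H = {shannon_index(sample):.3f}, "
          f"Saccharomyces share = {share:.2f}")
```

In this toy example the index falls as a single genus comes to dominate the community, which is the kind of shift a longitudinal NGS survey is designed to detect.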

Consequently, this culture-independent approach provides the opportunity to monitor the microbial metagenome, demonstrate its influence on wine quality, and reveal its functional contributions, information that allows wine producers to improve the production and consistency of high-quality wine.

1.4. Core aroma compounds in the grape and wine metabolome that contribute to sensory properties of final wines.

Assessing the chemical composition of red wine allows us to better understand the factors that affect wine aroma perception. Although water and ethanol account for 97% of wine, the remaining 3% (w/w) is responsible for most of its flavor and color (Waterhouse et al., 2016). During alcoholic fermentation, native wild yeasts present in grape must or commercially added S. cerevisiae yeasts convert sugars to ethanol and different metabolites. Furthermore, while alcoholic fermentation is the primary reaction that occurs, several volatile compounds produced during the different stages of fermentation contribute to the aroma profiles of final wines. These metabolic compounds can thus be classified into two categories: primary compounds (compounds originating from the grapes and persisting unchanged in wine) and secondary compounds (compounds produced during the fermentation process) (Styger et al., 2011). Flavor is defined as the “perception resulting from stimulating a combination of the taste buds, the olfactory organs, and chemesthetic receptors within the oral cavity” (Waterhouse et al., 2016). Therefore, compounds perceived through olfaction, chemesthesis (chemical activation of sensations), and taste in the mouth are considered flavor compounds. In addition, knowledge of the volatile compounds in wines is of great interest, since these compounds are highly associated with beverage flavor (Zhu et al., 2016). Furthermore, wine volatile aroma, perceived through the nose or retronasally from the mouth, is commonly regarded as a major determinant of overall flavor perception, and higher alcohols, esters, and acids are quantitatively predominant in wine aroma composition and important to its sensory attributes. Thus, controlling these compounds is essential for producing wine of consistent quality (Polášková et al., 2008; Zhu et al., 2016).

As for primary compounds, known as varietal aroma, they depend strongly on genetic variation in the enzymes that synthesize different grape aroma compounds; for instance, 1-deoxy-D-xylulose-5-phosphate synthase causes accumulation of terpenoids, giving the wines a distinct floral flavor (Battilana et al., 2009). Importantly, with regard to secondary compounds, known as fermentative aroma, wine grapes store many volatile compounds in the form of glycosides in a precursor pool, from which yeasts and malolactic bacteria can cleave the sugar moiety and release the compounds, resulting in a more complex flavor profile that determines the final wine quality (Maicas & Mateo, 2005). Likewise, yeasts can develop the wine aroma by producing different volatile compounds, particularly alcohols and esters, and after alcoholic fermentation some bacteria can also alter the aroma profile by promoting deglycosylation (Ugliano & Moio, 2006). Therefore, the fermentation process profoundly impacts the development of wine aroma, which is mostly composed of fermentation-derived volatiles. Furthermore, understanding the composition of common wine volatile compounds provides a knowledge base for better controlling their levels in the final product to meet major market interests. A review analyzed 19 grape and wine aroma-related publications and identified 141 commonly present volatile compounds, categorized into 12 classes, of which aliphatic alcohols, aliphatic esters, and monoterpenes were the most represented (Table 1-1) (Ilc et al., 2016). Generally, aliphatic alcohols are a common group of grape volatiles, and their C6-alcohols can be utilized by yeast, while some short-chain alcohols, including isoamyl alcohol, are instead formed by yeast (Mauricio et al., 1997). Aliphatic esters are well known for contributing fruity notes to wine, and most of them are produced during fermentation by yeast acyltransferase enzymes. Ethyl esters and acetate esters are considered the two major groups in wine: ethyl esters mainly contribute an apple-like aroma, and acetate esters are known for their banana aroma (Saerens et al., 2010). Finally, monoterpenes are a large group of plant metabolites, and many wine monoterpenes provide floral and citrusy aromas. Moreover, fermentation has been observed to affect the release of volatile monoterpenes from glycosylated precursors (Ilc et al., 2016).

Overall, familiarity with the formation and composition of wine volatile compounds could help the industry anticipate changes during fermentation and their effects on wine quality.

Table 1-1: Key aroma compounds associated with red wine (Ilc et al., 2016).

Class | Evident key wine odorants | Potential key wine odorants | Hidden key wine odorants
Aliphatic alcohols | (Z)-3-hexen-1-ol; 1-hexanol; Isobutanol; Isoamyl alcohol | 2-methyl-1-butanol; 1-butanol | -
Aliphatic esters | Ethyl acetate; Ethyl decanoate; Ethyl butanoate; Ethyl hexanoate; Ethyl octanoate; Isoamyl acetate; Ethyl 3-methylbutanoate | Butyl acetate; Hexyl acetate | ethyl 2- and 3-methylbutanoate
Monoterpenes | Geraniol; Linalool | alpha-terpineol; Limonene | Wine lactone; cis-rose oxide

1.4.1 The concept of terroir in winemaking and the contribution of microorganisms to shaping terroir.

Van Leeuwen and Seguin described terroir as an interactive ecosystem comprising local climate, soil, and grapevine variety in a given environment (Van Leeuwen & Seguin, 2006). In biology, terroir reflects how different environments contribute to differences in fruit composition (Teixeira et al., 2013). Overall, the varying definitions of terroir all agree on its geographical component.

Terroir is important in viticulture and enology because of its association with sensory attributes, and it is used to explain the classification of quality wines. However, the concept is not well defined in enology and viticulture. Diverse factors including environment (climate, soil), grape varieties, human practices, microbial communities, and their interactions are all thought to contribute to terroir. Recently, studies have successfully used different metabolomics approaches to discriminate the complex chemical profiles of wine grapes, must, and corresponding wines from closely neighboring regions and vineyards, differences attributed to terroir (Roullier-Gall et al., 2014; Schueuermann et al., 2016). Specifically, in one of these studies, differences in wine volatile profiles were responsible for differences in sensory attributes between Pinot noir wines from different locations within a small area of 13 km² in Otago, New Zealand (Schueuermann et al., 2016), while another study showed that non-targeted metabolomics analysis was able to discriminate between four different vineyards within a 30 km radius in Burgundy, France (Roullier-Gall et al., 2014). Taken together, environmental differences, even across small geographical areas, are an important factor in the concept of terroir and seem to lead to distinct chemical compositions of fine wines. Nevertheless, the driving factors behind these differences remain elusive.

It is already known that microorganisms play crucial roles in wine quality. Hence, regional microbial signatures could be another aspect of terroir that leads to differential geographic phenotypes of wines. For example, Knight et al. used wine as a research model to show regionally differentiated populations of natural Saccharomyces yeasts, which were thought to be responsible for distinct wine chemical profiles (Knight et al., 2015). Furthermore, it is reported that yeast and bacterial communities differed regionally between vineyards and that these microbial differences correlated with fermentation characteristics such as fermentation rate, final Brix, or YAN of and Cabernet Sauvignon wine (Bokulich et al., 2016).

Thus, it is important to define the role of microorganisms in terroir, as these microorganisms are known to create differential metabolites in the wine. In addition, a knowledge-based concept of terroir could help in achieving a consistent, high-quality wine with higher consumer acceptance while still staying true to its geographical uniqueness (Belda, Zarraonaindia, et al., 2017).

1.4.2 Advances in Gas Chromatography – Mass Spectrometry approaches improve resolution of wine volatile profiling

Wine is a complex beverage made of thousands of volatile and non-volatile compounds that define final wine characteristics as well as quality. The combination of these compounds contributes to what is collectively known as sensory quality. An improved ability to identify the specific microorganisms that contribute to wine sensory qualities during grape must fermentation and throughout winemaking could help winemakers optimize the process. For example, profiling wine composition throughout winemaking could help to develop strategies to modify wine metabolites and thus optimize wine quality (Cuadros-Inostroza et al., 2016). Gas chromatography coupled with mass spectrometry (GC-MS) provides many advantages for trace detection, identification, and quantification of volatile compounds. Used in a non-targeted approach, it is valuable for characterizing differences and similarities among unknown compounds in a set of samples. Furthermore, compounds can be tentatively identified using mass spectral databases when standards are unavailable. When GC-MS is combined with adsorptive sample preparation techniques, for example solid-phase microextraction (SPME), its sensitivity can be further enhanced (Ebeler, 2015). SPME directly extracts analytes from the headspace (HS) above the sample, or from the sample matrix itself, onto the extraction fiber, which saves sample preparation time and, more importantly, lowers detection limits owing to the fiber coating (Xu et al., 2016).
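Tentative identification against a mass spectral database amounts to scoring an unknown spectrum against reference spectra and reporting the closest entry. The sketch below uses a plain cosine similarity between intensity vectors as the score, which is an assumption made for illustration and not necessarily the scoring used by any particular database or in this study. The spectra and reference names are toy values, not measured data.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity of two spectra given as {m/z: intensity} dictionaries."""
    mz_values = set(spec_a) | set(spec_b)
    dot = sum(spec_a.get(mz, 0.0) * spec_b.get(mz, 0.0) for mz in mz_values)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(unknown, library):
    """Return (name, score) of the library spectrum most similar to the unknown."""
    scores = {name: cosine_similarity(unknown, ref) for name, ref in library.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Toy reference library and unknown spectrum (hypothetical m/z and intensities).
library = {
    "reference A": {43: 100.0, 61: 15.0, 70: 12.0, 88: 5.0},
    "reference B": {41: 60.0, 55: 100.0, 70: 80.0},
}
unknown = {43: 95.0, 61: 18.0, 70: 10.0}

name, score = best_match(unknown, library)
print(f"best match: {name} (similarity {score:.3f})")
```

A match of this kind remains tentative until it is confirmed with an authentic standard.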

Several studies demonstrate the suitability of HS-SPME-GC-MS for profiling grape and wine volatile metabolomes. Cuadros-Inostroza et al. (2016) conducted a longitudinal study with GC-MS, identifying 115 metabolites across the stages of grape growth and revealing the shift in metabolites during the ripening of grape berries. Moreover, another study used GC-MS with SPE to analyze the volatiles in Pinot noir wines, providing evidence that winemaking conditions, vintage, and barrel maturation were the dominant factors shaping final wine volatile metabolites. Joseph et al. (2015) surveyed 95 strains of Brettanomyces, a yeast sometimes considered a contaminant in dry wines, and used HS-SPME-GC-MS to identify whether strains consistently give positive aroma characteristics such as spicy, fruity, or floral notes in a preconstructed model wine.

With increasing global competition among ‘premium’ and ‘super-premium’ wines, strategies directed towards the consistent production of high-quality wine are becoming more important. Selecting a suitable method for profiling wine volatile compounds can increase the efficiency of metabolite biomarker identification and support the construction of knowledge-based winemaking protocols, and can therefore have a positive impact on the wine industry (Atanassov et al., 2009).

Chapter 2

Significance, Innovation, and Approach

In this chapter, I discuss the significance, innovation and overall experimental approach in my research project.

2.1. Significance

2.1.1. Economic impact of wine industry in Pennsylvania

There are over 200 wineries covering all regions of Pennsylvania (PA), producing more than 1 million gallons of wine annually. Further, the wine industry contributed around 4.8 billion dollars to the state’s economy in 2017. Although PA is the 5th largest grower of grapes in the US and ranks 10th in the country in the amount of wine produced, compared with other states such as California, Washington, and New York (General Industry Stats, 2019), the PA wine industry is still considered a relatively young industry that is primarily composed of smaller wineries (Thompson et al., 2019). Therefore, improving the quality and regional characteristics of PA wines is of high importance for growers and winemakers in maintaining the industry’s market competitiveness.

2.1.2. Microorganisms present on grape berries and throughout fermentation influence the chemical properties in wine

Microbial biodiversity plays an important role in wine characteristics and quality (Bokulich et al., 2013). Traditionally, wine was produced by natural or spontaneous fermentation of grape must/juice by yeasts that originate from grapes and winery equipment (Combina et al., 2005).

Today, even though commercial yeasts are added during winemaking, the presence of microorganisms from grapes and winery equipment can still differentially impact the chemical composition of wine, thus preserving distinctive characteristics of the region (Bokulich et al., 2016; Spano & Torriani, 2016). Microbial terroir could be viewed as all naturally occurring microorganisms that are transferred from grape berries within a geographic location into the winemaking process and that influence stages of fermentation, thereby directly impacting wine quality (Salvetti et al., 2016).

For example, non-Saccharomyces yeasts like Candida, Pichia or Cryptococcus have been shown to play a prominent role in grape quality and can influence winemaking, particularly during early fermentation (Bozoudi & Tsaltas, 2016). Further, bacterial genera such as Lactobacillus, Oenococcus and Acetobacter are associated with the formation of compounds in different fermentation stages such as malolactic fermentation (Piao et al., 2015). The formation and extent of several important wine aroma compounds like higher alcohols, acetate esters or phenolics are attributed to the presence of microbial communities during fermentation (Gamero et al., 2016). Taken together, the role and dynamics of bacterial and fungal species on grapes as well as throughout the winemaking process are important determinants of wine quality and characteristics through their impact on the volatile and non-volatile composition of wine.

2.1.3. Data-driven knowledge for improvement of winemaking

Since the dynamics of endogenous microbial populations are largely unpredictable and inconsistent, a precise understanding of how microbial populations impact important wine metabolites is needed (Bokulich et al., 2016; Knight et al., 2015). This exploratory study was aimed at characterizing the impact of microbial biodiversity on wine metabolite composition without changing current winemaking practices such as the use of commercial S. cerevisiae and O. oeni during the winemaking process. Thus, providing data-constructed knowledge of microbial patterns and corresponding chemical profiles could enable production of high-quality wines with unique flavor characteristics. Overall, the findings from this study could provide a knowledge-based tool for winemakers to improve regional wine flavor characteristics that preserve ‘terroir’.

2.2. Innovation

So far, there is little research analyzing the wine microbiome with a more comprehensive sampling strategy. Current research focuses on specific microorganisms and their individual characteristics instead of investigating their role within an entire microbial community. Wine fermentation is widely known as a complex fermentation system involving the interaction of various microorganisms and the formation of sensorial compounds. However, the understanding of this interaction still remains ambiguous (Belda, Ruiz, et al., 2017). In the different fermentation stages, from the early introduction of commercial yeast to the crushed grapes (must) to late malolactic fermentation, both fungal and bacterial species interact and lead to the formation of desired and less desired compounds. Data-constructed knowledge is required for the elucidation of the microbial activity and derived chemical compounds. Therefore, to achieve the goal of understanding the role and dynamics of microbial populations, the strategy of direct sampling from the field (wineries) instead of lab-scale production would provide an informative view of microbial dynamics that is representative of the industrial environment. Following samples from the initial grape must through the early, middle, and late stages of fermentation to the product just before racking would also provide insight into the kinetics and temporal evolution of both microbes and the metabolites they produce.

To obtain an overall view of the role and dynamics of microbial communities in winemaking, next-generation sequencing (also known as ‘high throughput sequencing’) approaches can be used. This technology enables rapid construction of a broad range of microbial dynamic profiles with high resolution. Therefore, this analysis can provide a view on how microbial diversity and the relative abundance of specific microorganisms change as fermentation progresses. In addition, a comprehensive chemical profile can be obtained by SPME-GC-MS, which is capable of detecting a wide range of compounds with varying polarity and volatility and can reach detection levels in the parts per trillion (ppt) range for certain compounds. Thus, this inclusive analysis provides a broad screen of how various wine metabolites are formed throughout fermentation.

Overall, given the lack of knowledge of how the grape and wine microbiome contributes to chemical compounds throughout fermentation, the combination of next-generation sequencing (NGS) and volatile profiling with HS-SPME-GC-MS allows us to understand the role of microbial activity in the chemical composition of wine and, further, allows for the creation of a database that can be used to establish a prediction model for wine quality and characteristics. Thus, by demonstrating the characteristics of microbial patterns and wine metabolite profiles in PA wines, it is possible that the PA wine industry will be able to enhance its recognition in the US wine market.

2.3. Approach

2.3.1 Rationale

The wine industry in PA is a relatively young industry with an economic impact of approximately 4 billion dollars in 2017. Based on the growth in the number of PA wineries, this impact is expected to keep increasing (Dombrosky & Gajanan, 2013). Thus, maintaining wine quality and production consistency could be viewed as an important objective for the wine industry. Our view is that data-driven knowledge of the microbiome and of the chemical profiles in the winemaking process is required to achieve this goal.

Pennsylvania has a climate of hot summers and cold winters, which is more like that of Europe than that of California. This type of climate presents an ideal environment for the growth of hybrid grapes.

Chambourcin, a French-American interspecific hybrid grape, is one of the most commonly grown varieties in PA and is relatively resistant to cold temperatures and diseases. Therefore, for analyzing the red wine system in PA, this hybrid grape is chosen as the representative research model.

Microbial compositions and activities change continuously as fermentation progresses. Although Saccharomyces cerevisiae is added to the must at the start of fermentation to control the quality and consistency of the wines produced, other wild yeasts, fungi, and bacteria can also influence the production of important wine metabolites. Therefore, next-generation Illumina sequencing targeting the V4 region of the bacterial 16S rRNA gene and the fungal ITS2 region can provide information about the entire microbial population and its diversity in our grape model system. In addition, by selecting different stages and time points for sample collection, which include initial crush (grape must), prior to bulk inoculation with commercial S. cerevisiae, early, middle, and late fermentation stages, and just before racking, it is possible to fully characterize the microbes in different taxa. While samples from mock or lab-scale fermentations are more easily controlled and collected, this approach misses the impact of certain factors such as resident microbes from vineyards and wineries. Finally, we now understand that geographical differences in grape growing regions can shape distinctive microbial communities on grapes and in the fermentation environment, which results in final wines with completely different flavor profiles (Bokulich et al., 2016; Gilbert et al., 2014). Therefore, samples collected directly from the wineries best represent the real conditions and environment of microbial communities and chemical compositions.

2.3.2. Hypothesis and specific aims

This exploratory study was aimed at characterizing the impact of microbial biodiversity present on Chambourcin hybrid grapes and its influence on wine metabolite composition. Our hypothesis is that microbial populations present at individual wineries in different regions of PA can affect production of volatile compounds present in grape musts and throughout fermentation.

Therefore, the study is conducted based on the following research aims with the flowchart of experimental design (Figure 2-1):

Aim 1: Determine the compositions of fungal and bacterial communities in grape must and fermentation samples from different wineries using next-generation Illumina sequencing.

Aim 2: Examine volatile compounds in grape must and fermentation samples from different fermentation stages using a non-targeted gas chromatography–mass spectrometry (GC-MS) with solid-phase microextraction (SPME) approach.

Aim 3: Identify the association between wine microbial communities and volatiles relative to regional signatures (terroir) using a Partial Least Squares-Discriminant Analysis (PLS-DA) and regularized Canonical Correlation Analysis (rCCA) data analysis approach.

Figure 2-1: The flowchart of experimental design.

Chapter 3

Characterization of microbial dynamics and volatile metabolome changes during fermentation of Chambourcin grapes in two Pennsylvania regions

Abstract

Wine grape varieties and the regional distinction of wine characteristics (terroir) are important aspects of wine quality. In addition, the regional wine microbiome pattern is also associated with different wine compositional characteristics. Studies show that various microorganisms generate different volatile compounds during fermentation which are able to influence wine aroma. However, we still do not fully understand how microbial biodiversity and the predominance of unique taxa are associated with wine volatiles from different regions. In this study, our goal was to conduct the first study of the microbiome of red Chambourcin wine from different wineries in Pennsylvania. We used Illumina-based next generation sequencing to characterize the impact of microorganisms on the volatile profiles of Chambourcin red wine, obtained using HS-SPME-GC-MS, throughout fermentation. We observed high fungal diversity during the early fermentation stage (4 days after crushing), which likely contributes to a microbial terroir in the resulting wines. In addition, fungal diversity significantly decreased in the middle and late stages (5-20 days after fermentation). Changes in timing and abundance of wine volatiles throughout fermentation provide insights into the unique traits of Chambourcin red wines. In addition, significant regional discrimination between Central and Northeast PA was obtained from the wine microbiome and metabolome. Key discriminant regionality-driven features include differences in the relative abundance of Botryosphaeria, Neofusicoccum, Enterobacteriaceae, and Burkholderiaceae, as well as different concentrations of (E)-2-Hexenoic acid, Heptanal, and 2-Hexenoic acid ethyl ester. Furthermore, rCCA modeling provided fundamental correlations between wine microbial communities and volatile compounds which can help explain microbiome and metabolome-related regionality attributes. Collectively, these results could provide grape growers and wine producers with targeted recommendations to improve wine production through an increased understanding of the impact of native microbial populations responsible for developing aroma attributes.

3.1. Introduction

Microbial communities play critical roles in complex fermentation systems such as winemaking. Several lines of evidence suggest that changes in microbial diversity and abundance can influence the physicochemical properties of final wines, control wine spoilage, and alter wine perception (Bokulich et al., 2013). The microbial populations that are ubiquitously present on grapes, in the vineyard, in the soil, and in wine processing facilities throughout winemaking contribute to final wine quality and characteristics (Romano et al., 2019). Unique microbial populations or “microbial fingerprints” of wine grape berries have further been associated with specific geographical locations. Regionally distinct microbial population patterns present in the vineyard and winery that influence aspects of wine production and the quality of final wines are collectively known as microbial terroir (Bokulich et al., 2016). Terroir is a concept used to denote the unique features of a region, including winemaking practices and environmental factors, that can influence and shape a final food product such as wine (Marlowe & Bauman, 2019). For example, we can appreciate that wines made from the same wine grape variety (cultivar) but produced in different regions have different sensory attributes. The concept of terroir presented together with the story of a bottle of wine typically results in increased consumer acceptance and appreciation, leading to important economic value (Belda, Zarraonaindia, et al., 2017). Previous studies have shown that the grape and wine microbiome in different regions can shape final wine quality and characteristics (Bokulich et al., 2016; Capozzi et al., 2015). Thus, understanding microbial diversity within a region could be valuable to preserve unique “microbial fingerprints” that potentially drive regional wine sensory profiles.

Traditionally, characterization of microbial biodiversity relied on culture-dependent approaches such as cultivation of microorganisms on selective and differential media. Culture-dependent approaches are often limited by the culture conditions and dramatically underestimate the number of microbial populations within an environment (Al-Awadhi et al., 2013). Advances in DNA-based microbiome sequencing technologies have allowed scientists to precisely identify and characterize microbial populations (collectively known as the ‘microbiome’) in complex ecosystems. Thus, with an understanding of the microbial genetic fingerprint, which comprises microbial diversity and relative abundance, we are able to maintain food heritage and standards and improve the authenticity of food products (Bozoudi & Tsaltas, 2016).

To date, Vitis vinifera grape varieties are a primary focus in many studies on the role and dynamics of microorganisms throughout winemaking (Mezzasalma et al., 2017; Nikolaou et al., 2006; Zott et al., 2008). However, interspecies grape varieties (also known as hybrid grapes) are less well studied, although they represent an important part of many winemaking regions, including the Eastern US, and are critical in dealing with changing climate (Santos et al., 2020). Chambourcin, a French-American hybrid wine grape variety (Vitis sp. ‘Chambourcin’) (Julius Kühn-Institut, 2020), is grown across Pennsylvania, primarily in the North Central, South West, South Central, and South East regions of the Commonwealth and is the most abundant hybrid grape variety in the state (Dewey, 2017). Compared to V. vinifera, Chambourcin is more tolerant of lower winter temperatures and higher summer temperatures and has a higher tolerance to disease pressures (Coia & Ward, 2017; Gardner, 2016; Guo et al., 2014). The use of hybrid grapes in the state is of importance from an economic standpoint as well as a viticulture perspective, as interspecific varieties such as Chambourcin are able to grow and thrive better in regional climates. This option not only produces quality red wines but also provides a route toward sustainability in the wine industry. The greater resistance of hybrid grapes to disease and weather damage could result in less spoilage and waste within the industry.

Consequently, understanding the impact of microbial populations and their interactions in winemaking on the physicochemical properties of hybrid red wines like Chambourcin could help draw interest to the use of interspecies varieties and provide options for winemakers within the industry to expand their selection of grapes for winemaking.

In this study, we utilize an Illumina-based next generation sequencing (NGS) approach together with an untargeted metabolomics approach in a Chambourcin hybrid grape model system to (1) characterize the microbiome and metabolome throughout fermentation, (2) explore the impact of regional differences on microbial populations and metabolic profiles, and (3) identify associations between microbial populations and specific volatile compounds. To achieve this, we collected 88 commercial samples, from crushed grapes (musts) and throughout fermentation, from two Chambourcin regions in Central and Northeast Pennsylvania (PA) (Figure 3-1). To profile wine volatiles, we used gas chromatography-mass spectrometry (GC-MS) with headspace-solid-phase microextraction (HS-SPME) for non-targeted metabolite profiling of volatile compounds in all samples. Understanding regional microbial signatures and metabolites would allow for future targeted microbiome manipulation to improve PA wine quality and competitiveness in national and international markets.

3.2. Materials and Methods

3.2.1 Fermentation and winemaking sample collection for microbiome-metabolome analysis

Samples were collected from five wineries located in Central PA and four wineries located in the Northeast region, for a total of nine wineries, during the 2019 vintage (the shortest distance between wineries in the Central and Northeast regions is around 85 km) (Figure 3-1). For this study, our selection criteria involved choosing wineries and vineyards that grew Chambourcin grapes and processed Chambourcin wine. At each of the nine Chambourcin sampling sites (i.e., wineries), approximately 50 mL of sample was collected by winemakers in duplicate into provided sterile centrifuge tubes (VWR, Radnor, PA, USA) at ten predetermined time points across a 20-day period during fermentation and winemaking. Samples were stored at -20°C immediately after sampling until further processing (i.e., pick-up from the wineries by the research team, transfer to campus on dry ice within 1 day, and further storage at -80°C). To obtain a comprehensive picture of the microbial composition during fermentation, we collected samples at ten stages throughout fermentation based on recommendations by head winemakers at two participating wineries. These ten stages were Crush (Stage 1: Day 0), Must 1, Must 2, and Must 3 (Stages 2-4: Days 1-3), Early Fermentation (Stage 5: Day 4), Mid Fermentation (Stage 6: Day 7), Late Fermentation (Stage 7: Day 10), End of Fermentation (Stage 8: Day 13), Malolactic Fermentation (MLF) (Stage 9: Day 16), and final wine (collected before racking, Stage 10: Day 20) (Table 3-1). A total of 90 samples were planned. However, due to uncontrollable circumstances, one sample was lost during transport back to the PSU campus and another sample was not collected during winemaking by winery staff, resulting in a total of 88 unique samples (9 wineries x 10 fermentation stages minus 2 lost samples) (Table 3-2).

Figure 3-1: Map of sample collection sites in Central and Northeast PA. Locations in Central PA are highlighted by a dotted circle and locations in Northeast PA by a dotted triangle. Each red circle represents an individual winery where samples were collected at ten predetermined stages of Chambourcin fermentation and winemaking.

Table 3-1: Summary of fermentation stages and sample collection.

Stage   Timeline (day)   Process
S1      0                Crush (de-stemming)
S2      1                Must 1
S3      2                Must 2
S4      3                Must 3
S5      4                Early fermentation (Early)
S6      7                Middle fermentation (Mid)
S7      10               Late fermentation (Late)
S8      13               End fermentation (End)
S9      16               Malolactic fermentation (MLF)
S10     20               Wine (before racking and filtration)

Table 3-2: Summary of fermentation stages and samples collected.

        Winery
Stage   PA19_01  PA19_02  PA19_03  PA19_04  PA19_05  PA19_06  PA19_07  PA19_08  PA19_09
S1      V        V        V        V        V        V        V        V        V
S2      V        V        V        V        V        V        V        V        V
S3      V        V        V        X        V        V        V        V        V
S4      V        V        X        V        V        V        V        V        V
S5      V        V        V        V        V        V        V        V        V
S6      V        V        V        V        V        V        V        V        V
S7      V        V        V        V        V        V        V        V        V
S8      V        V        V        V        V        V        V        V        V
S9      V        V        V        V        V        V        V        V        V
S10     V        V        V        V        V        V        V        V        V

Total number of collected samples: 88
*V indicates sample successfully collected; X indicates sample loss.
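The sample accounting in Table 3-2 can be reproduced with a short script. This is only an illustrative sketch: the winery and stage labels come from the table, while the `<winery>_<stage>` ID format is an assumption inferred from sample names such as PA19_03_S8, and the function name is hypothetical.

```python
# Sample manifest sketch. Winery and stage labels follow Table 3-2; the
# "<winery>_<stage>" ID format is an assumption based on IDs like PA19_03_S8.
WINERIES = [f"PA19_{i:02d}" for i in range(1, 10)]  # PA19_01 ... PA19_09
STAGES = [f"S{i}" for i in range(1, 11)]            # S1 ... S10

# Samples marked X in Table 3-2 (lost in transport or not collected).
MISSING = {("PA19_04", "S3"), ("PA19_03", "S4")}


def build_manifest():
    """Return the IDs of all samples that were actually collected."""
    return [
        f"{winery}_{stage}"
        for winery in WINERIES
        for stage in STAGES
        if (winery, stage) not in MISSING
    ]


manifest = build_manifest()
print(len(manifest))  # 9 wineries x 10 stages minus 2 missing samples
```

Keeping the missing samples as an explicit set makes it easy to check that every downstream table contains the same set of sample IDs.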

In this study, participating wineries were provided with sample collection and handling instructions, which included a survey log outlining the ten predetermined stages of sample collection. Importantly, we did not ask wineries to alter existing winemaking procedures; i.e., wineries that commonly added commercial S. cerevisiae to the initial fermentation and/or Oenococcus oeni during malolactic fermentation (MLF) were free to continue doing so. We felt that retaining existing winemaking practices could help us capture the individual diversity of microbial populations that could help explain final wine characteristics.

However, we asked winemakers to record the specific time points at which these commercial microorganisms were introduced to fermentation tanks (i.e., on which day S. cerevisiae and O. oeni were added), the type of microorganisms added (supplier information), and when sulfite treatments were made during winemaking, if any (Figure 3-2). These metadata were gathered to understand the differences in winemaking protocols across individual wineries.

Figure 3-2: Survey given to winemakers highlighting fermentation stages and sample collection for microbiome-metabolome study

3.2.2. Preparation of total microbial DNA

Total genomic DNA was extracted and prepared for microbiome sequencing as previously published with minor modifications (Bokulich et al., 2016). In summary, samples from different fermentation stages and wineries were thawed and centrifuged at 8000 x g for 15 min and supernatants were discarded. Next, pellets were washed with ice-cold phosphate-buffered saline (PBS, pH 7.4) and centrifuged again at 8000 x g for 15 min. Centrifugation and PBS washes were repeated three times. DNA was extracted from approximately 200 mg of washed pellet from each sample using the Quick-DNA™ Fecal/Soil Microbe Miniprep DNA extraction kit according to the manufacturer’s instructions (Zymo Research, Irvine, CA, USA). The resulting DNA was eluted in the elution buffer provided with the extraction kit. The DNA concentration of each sample was quantified using a NanoDrop One (Thermo Scientific, Waltham, MA, USA) and quality was monitored by the 260/280 ratio, which indicates the amount of protein contamination. DNA samples were normalized to 3 ng/µL by dilution in nuclease-free water and stored at -80°C until further use.
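Normalizing each extract to a common concentration follows the dilution relation C1·V1 = C2·V2. The sketch below is an illustration only: it assumes concentrations in ng/µL and volumes in µL, and the stock concentration and final volume in the example are hypothetical values, not measurements from this study.

```python
def dilution_volumes(stock_conc, target_conc, final_volume):
    """Return (stock_volume, diluent_volume) from C1 * V1 = C2 * V2.

    Concentrations share one unit (assumed ng/uL); volumes share one unit (uL).
    """
    if stock_conc < target_conc:
        raise ValueError("stock is already more dilute than the target")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


# Hypothetical example: a 45 ng/uL extract diluted to 3 ng/uL in 30 uL total.
stock_ul, water_ul = dilution_volumes(45.0, 3.0, 30.0)
print(f"{stock_ul:.1f} uL DNA + {water_ul:.1f} uL nuclease-free water")
```

Extracts that are already below the target concentration cannot be normalized by dilution, which is why the function raises an error in that case instead of returning a negative volume.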

3.2.3. Amplification and purification of target gene sequences

Fungal and bacterial populations in collected samples were identified by targeting the internal transcribed spacer 2 (ITS2) sequence and the V4 domain of the 16S rRNA gene, respectively. The first round of PCR amplification for each sample included 12 ng of DNA template, 1 x KAPA HiFi HotStart ReadyMix (Kapa Biosystems, Wilmington, MA), 0.25 µM of each primer (forward and reverse), nuclease-free water, and 0.05 mg/mL bovine serum albumin (BSA) (Sigma-Aldrich, Saint Louis, MO, USA). BSA was previously demonstrated to improve PCR yield from samples that contain GC-rich templates and phenolic compounds (Farell & Alexandre, 2012; Samarakoon et al., 2013). The fungal ITS2 locus was amplified using the forward primer ITS9 (5’-GAA CGC AGC RAA IIG YGA-3’) and reverse primer ITS4 (5’-TCC TCC GCT TAT TGA TAT GC-3’) (Nordberg et al., 2014), with forward Illumina adapter overhang sequences (5’-TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG-[ITS2 sequences]-3’) and reverse Illumina adapter overhang sequences (5’-GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA G-[ITS2 sequences]-3’) (PCR Amplicon, PCR Clean-up, and Index PCR, 2013). PCR amplification was carried out with an initial denaturation at 98°C for 5 min, followed by 30 cycles of 95°C for 45 s, 55°C for 60 s, and 72°C for 60 s, and a final extension at 72°C for 5 min. Bacterial 16S rRNA genes were amplified with forward primer 515F (5’-GTG YCA GCM GCC GCG GTA A-3’) (Parada et al., 2016) and reverse primer 806R (5’-GGA CTA CNV GGG TWT CTA AT-3’) (Apprill et al., 2015), with forward Illumina adapter overhang sequences (5’-TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG-[16S rRNA V4 gene sequences]-3’) and reverse Illumina adapter overhang sequences (5’-GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA G-[16S rRNA V4 gene sequences]-3’). Reaction conditions consisted of 98°C for 2 min, followed by 25 cycles of 95°C for 15 s, 59°C for 15 s, and 72°C for 15 s, with a final extension at 72°C for 5 min.
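The primer construction described above (adapter overhang followed by a locus-specific primer containing degenerate bases) can be sketched in a few lines. This is a hedged illustration, not part of the published protocol: the sequences are copied from the text with spaces removed, the helper names are hypothetical, and the ITS9 primer is left out because it contains inosine (I), which is not a standard IUPAC degenerate code.

```python
# Number of bases each IUPAC nucleotide code can stand for.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

# Sequences as given in the text (spaces removed).
FWD_OVERHANG = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
REV_OVERHANG = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"
PRIMER_515F = "GTGYCAGCMGCCGCGGTAA"
PRIMER_806R = "GGACTACNVGGGTWTCTAAT"


def degeneracy(primer):
    """Count how many distinct sequences a degenerate primer represents."""
    n = 1
    for base in primer.upper():
        n *= IUPAC_DEGENERACY[base]
    return n


def with_overhang(overhang, primer):
    """Join an adapter overhang and a locus-specific primer, 5' to 3'."""
    return overhang + primer


full_fwd = with_overhang(FWD_OVERHANG, PRIMER_515F)
full_rev = with_overhang(REV_OVERHANG, PRIMER_806R)
print(len(full_fwd), degeneracy(PRIMER_515F))
print(len(full_rev), degeneracy(PRIMER_806R))
```

Counting degeneracy is a quick sanity check that a primer was transcribed correctly, since a mistyped degenerate code changes the number of sequences the primer pool covers.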

PCR amplicons were purified using the GenElute™ PCR Clean-Up Kit (Sigma-Aldrich, Saint Louis, MO, USA) to remove single primers and primer dimers. Purified PCR amplicons were visualized using gel electrophoresis (1.5% agarose) to confirm successful amplification of target sequences and the exclusion of primer dimers. PCR amplicons of fungal ITS2 sequences ranged between 370 bp and 590 bp and PCR amplicons of bacterial 16S rRNA V4 gene sequences were approximately 390 bp. First-round PCR amplicons outside these size ranges were removed using a GeneJET Gel Extraction Kit (Thermo Scientific, Waltham, MA, USA).

3.2.4. Sequencing library preparation and construction

Purified PCR amplicons were submitted to the Penn State Huck Institutes of the Life Sciences Genomics Core Facility for Illumina next-generation sequencing. Purified fungal and bacterial PCR amplicons from every twelve samples were pooled together and the resulting pooled PCR products were analyzed on a Bioanalyzer to check for off-target artifacts in the samples. The same quality control and cleanup protocol was applied to the index PCR step, in which the remaining Illumina adapter sequences were added along with the indexes. Equimolar concentrations of pooled libraries containing PCR amplicons were sequenced on an Illumina MiSeq using 250 x 250 paired-end sequencing (Illumina, San Diego, CA, USA).
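The text does not describe how equimolar amounts were calculated, so the following is only a sketch of one common approach: converting a mass concentration to molarity using an approximate mass of 660 g/mol per base pair of double-stranded DNA, then choosing volumes that deliver the same molar amount of each library. The function names and all example values are hypothetical.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of double-stranded DNA


def library_molarity_nm(conc_ng_per_ul, mean_length_bp):
    """Convert a dsDNA concentration (ng/uL) to molarity (nM)."""
    return conc_ng_per_ul / (AVG_BP_MASS * mean_length_bp) * 1e6


def equimolar_volumes(molarities_nm, fmol_per_library):
    """Volume (uL) of each library that contains `fmol_per_library` fmol.

    1 nM equals 1 fmol/uL, so volume = amount / molarity.
    """
    return [fmol_per_library / m for m in molarities_nm]


# Hypothetical libraries: (concentration in ng/uL, mean fragment length in bp).
libraries = [(6.6, 500), (13.2, 500)]
molarities = [library_molarity_nm(c, n) for c, n in libraries]
print(molarities)
print(equimolar_volumes(molarities, 40.0))
```

Because molarity depends on fragment length, two libraries with the same ng/µL reading are not equimolar unless their amplicons are the same size, which matters here given the 370 to 590 bp range of the ITS2 amplicons.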

3.2.5. Raw sequence data processing and phylogenetic tree construction

Raw sequences of fungal and bacterial DNA obtained from the Illumina MiSeq were analyzed using QIIME 2 Core 2019.7 (Bokulich et al., 2018). The data, in Casava 1.8 paired-end demultiplexed format, were imported using qiime tools import, which combines forward and reverse reads into a single file. The bioinformatics analysis workflow was adapted from a previously published workflow (Pearson et al., 2019). Forward and reverse reads of 16S rRNA gene sequences (bacterial sequences) were truncated at base positions 196 and 204, respectively, followed by denoising with the q2-DADA2 plugin (Callahan et al., 2016). To acquire a rooted phylogenetic tree, we used the fragment-insertion plugin based on the SEPP algorithm (Janssen et al., 2018), which inserts representative sequences (FeatureData[Sequence]) of our samples into a high-quality preconstructed reference phylogeny (SILVA version 128 99% threshold OTUs, https://github.com/smirarab/sepp-refs/blob/master/silva/README.md). Furthermore, to enable use of the phylogenetic tree for diversity computation, we used QIIME’s fragment-insertion filter-features plugin to filter the feature table so that it contains only fragments present in the insertion tree mentioned above (Quast et al., 2013).

Raw fungal ITS2 sequences were trimmed using the q2-ITSxpress plugin (Rivers et al., 2018) and denoised using the q2-DADA2 plugin. Representative fungal sequences with less than 80% identity to the reference database were filtered out, and the remaining sequences were clustered against the UNITE ver8 99% OTUs reference database using the QIIME vsearch cluster-features-closed-reference plugin (Nilsson et al., 2019). Clustered sequences were then aligned with a pre-built phylogenetic reference tree made from the UNITE ITS extension database and the SILVA 18S database using the q2-ghost-tree plugin to construct a reference-based fungal phylogenetic tree (Fouquier et al., 2016).

3.2.6. Taxonomic distribution and diversity analyses of microbial populations obtained from Chambourcin wine fermentations

Taxonomic identification of bacterial communities was performed using the q2-feature-classifier plugin and a pre-trained Naïve Bayes classifier with the SILVA 128 99% OTU reference database extracted/trimmed to primer sites (Bokulich et al., 2018). The same plugin and classifier type were used for analysis of the fungal community, but the classifier was trained on the full reference sequences of the UNITE ver8 99% OTU database without any extraction. The q2-diversity plugin and the fungal and bacterial phylogenetic trees were used to compute alpha (α) and beta (β) diversity metrics. To determine in advance which samples with low sequence reads to exclude, ITS2 and 16S sequence tables from the DADA2 plugin were imported for rarefaction analysis. Rarefaction is a non-parametric resampling technique that normalizes library size by randomly subsampling reads within each sample without replacement (Nipperess, 2016). Therefore, this analysis can be a useful tool in microbial diversity analyses by providing an appropriate sequencing depth at which to normalize the total sequence count of every sample to the same level, minimizing possible bias from the sequencing process. After identifying samples with sufficient read counts based on alpha rarefaction, in which sequence reads reached a sufficient average number of observed OTUs, the minimum ITS2 and 16S sequence counts remaining after filtering out unrelated features (e.g., chloroplast, mitochondria) were set at 16306 and 1669, respectively, as the library sizes for normalization in downstream relative abundance and microbial diversity analyses.
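The rarefaction step described above, random subsampling of reads without replacement to a common depth, can be illustrated with the standard library alone. This is a minimal sketch rather than the QIIME 2 implementation: the function name is hypothetical, the example counts are invented, and only the depth of 1669 is taken from the text.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample a {taxon: read_count} dict to `depth` reads without replacement.

    Returns None when the sample has fewer than `depth` reads, mirroring the
    exclusion of low-read samples before diversity analysis.
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand to one entry per read, then draw reads without replacement.
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    rarefied = {taxon: 0 for taxon in counts}
    for taxon in rng.sample(reads, depth):
        rarefied[taxon] += 1
    return rarefied


# Hypothetical 16S sample with 2000 reads, rarefied to the depth from the text.
sample = {"Lactobacillus": 1200, "Acetobacter": 500, "Enterobacteriaceae": 300}
print(rarefy(sample, 1669))
```

Fixing the seed makes the subsample reproducible, which is useful when the same rarefied table has to feed both the relative abundance and the diversity analyses.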

To improve the resolution of wild yeast and bacterial compositions, read counts of Saccharomyces cerevisiae or Oenococcus oeni were first excluded from the OTU table and the remaining taxa at the genus level were extracted and normalized by the sum of reads. Using the Euclidean distance measure and the Ward clustering algorithm, a dendrogram representing each winery was generated based on the relative abundance of each taxon at the genus level from each winery. Prior to diversity analyses, select samples were removed based on results obtained from rarefaction analyses and taxonomic identification (samples PA19_03_S8, S9, and S10 were removed from ITS2 data, and PA19_02_S10 and PA19_09_S4 from 16S data).
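Excluding the inoculated species and renormalizing the remaining taxa by the sum of reads can be sketched as follows. The function name, the taxon labels used as dictionary keys, and the read counts are all hypothetical; the sketch only illustrates the normalization described above, not the clustering step.

```python
def wild_relative_abundance(
    counts, excluded=("Saccharomyces cerevisiae", "Oenococcus oeni")
):
    """Drop inoculated species, then normalize the rest by the sum of reads.

    `counts` maps a taxon label to its read count in one sample.
    """
    kept = {taxon: n for taxon, n in counts.items() if taxon not in excluded}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no reads left after excluding inoculated species")
    return {taxon: n / total for taxon, n in kept.items()}


# Hypothetical fungal sample dominated by the inoculated yeast.
sample = {
    "Saccharomyces cerevisiae": 9000,
    "Botryosphaeria": 600,
    "Neofusicoccum": 300,
    "Pichia": 100,
}
print(wild_relative_abundance(sample))
```

In this invented example the three non-inoculated genera make up only 10% of the reads, yet after exclusion their proportions (0.6, 0.3, 0.1) become directly comparable across samples.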

106 Microbial diversity is defined as the variety of microorganisms in a given ecosystem. To

107 understand the perturbations to a microbial community from the given environments, α-diversity

108 can be a useful approach by measuring amplicon sequencing data (Willis, 2019). αdiversity

109 summarizes the diversity within an ecological community. It considers two perspectives, richness

110 (the number of different taxonomic groups present within a sample) and evenness (the degree of

111 similarity in taxonomic abundance within a sample) (Poos et al., 2009). In this study, four

112 measurements of microbial α-diversity were chosen: Shannon’s diversity (Reese & Dunn, 2018),

113 Faith’s phylogenetic diversity (Faith, 1992), and Pielou’s Evenness (Pielou, 1966). Pairwise

114 comparisons were tested using the Kruskal-Wallis rank-based approach for nonparametric data

115 and both p-value and false discovery rate (FDR) adjusted p-value (q-value) were used to indicate

116 statistical significance (p, q-value < 0.05) (Nahm, 2016).
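To make the richness and evenness concepts concrete, the sketch below computes Shannon's diversity and Pielou's evenness from a vector of read counts. It is a minimal illustration, not the QIIME2 implementation; the logarithm base (2) and the example counts are assumptions made for this example.

```python
import math


def shannon(counts, base=2.0):
    """Shannon's diversity H = -sum(p * log(p)) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in proportions)


def pielou(counts, base=2.0):
    """Pielou's evenness J = H / log(S), where S is the number of observed taxa."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return 0.0
    return shannon(counts, base) / math.log(richness, base)


# Hypothetical communities with the same richness but different evenness.
even_community = [25, 25, 25, 25]
skewed_community = [97, 1, 1, 1]

h_even = shannon(even_community)
j_even = pielou(even_community)
h_skewed = shannon(skewed_community)
j_skewed = pielou(skewed_community)
```

Both communities contain four taxa, but the skewed community scores lower on both measures because one taxon dominates the reads.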

117 In contrast to α-diversity, which analyzes diversity within a community, β-diversity

118 focuses on the differences between one community and other communities. In other words, β-

119 diversity provides insight into the ecological dissimilarity between microbial communities.

120 As qualitative and quantitative measures of microbial community dissimilarity (β-diversity),

121 unweighted UniFrac distance (C. Lozupone & Knight, 2005) and weighted UniFrac distance (C. A.

122 Lozupone et al., 2007) were used, respectively. These measures can perform better than other approaches that

123 compare microbial communities because phylogenetic relatedness is taken into account

124 (Goodrich et al., 2014). Both unweighted and weighted UniFrac distance metrics were exported

125 from QIIME2 and imported into R in order to be visualized in a Principal Coordinates Analysis

126 (PCoA) plot by using the R package “qiime2R” (Bisanz, 2018). Pairwise comparisons of β-

127 diversity were tested using PERMANOVA, a non-parametric approach of multivariate analysis of

128 dissimilarity based on pairwise distance (McArdle & Anderson, 2001). Both p- and q-values were

129 used to indicate statistical significance. Additionally, in order to match the sample size used in

130 the analyses of microbial diversity, samples that were not present in the diversity data were

131 excluded from taxonomic distribution data. In addition, taxa which were not identified as bacteria

132 or fungi or identified as chloroplast or mitochondria were also filtered out of the data analysis for

133 microbial diversity.

134 3.2.7. Preprocessing of microbial data for regression analyses

135 Raw read counts of each feature (taxon) in all samples were examined using the

136 nearZeroVar function from the R package “caret” with two cutoffs: TRUE if the frequency ratio ≧ freqCut = 82/3 (the ratio of

137 the frequency of the most prevalent value to that of the second most frequent value) and TRUE if the percentage of unique values ≦

138 uniqueCut = 5 (the number of unique values divided by the total number of samples times 100)

139 (Kuhn, 2008). Features flagged TRUE by both cutoffs were considered near-zero variance

140 predictors and removed because they are uninformative as predictors. Next, one pseudo-count was

141 added to each feature in each sample using R package “selbal” to prevent error resulting from

142 logarithmic transformation (Rivera-Pinto et al., 2018). Cumulative Sum Scaling (CSS)

143 normalization corrects the bias in assessing differential relative abundance that is associated

144 with total-sum scaling (TSS) normalization. Following CSS normalization, a binary log transformation

145 (for correction of heteroscedasticity) was applied to the transposed data frame, where samples

146 were listed in columns and taxa in rows, using the R package “metagenomeSeq” (Paulson et al., 2013).

147 Next, the data were uploaded to MetaboAnalyst, an online interface for omics data, scaled by the Pareto

148 approach in the Time-series / Two-factor module to reduce the masking effect of highly abundant taxa,

149 and exported for the downstream regression analyses (Chong et al., 2019).
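The two-cutoff rule used for near-zero variance filtering can be sketched in a few lines of Python. This is a re-expression of the rule as documented for caret's nearZeroVar (frequency ratio above freqCut and percentage of unique values at or below uniqueCut), not the caret source code; the function name and the example count vectors are hypothetical, while the cutoff values are those stated in the text.

```python
from collections import Counter


def is_near_zero_variance(values, freq_cut, unique_cut):
    """Return True when a feature is flagged by both cutoffs (or is constant)."""
    tallies = Counter(values).most_common()
    if len(tallies) == 1:
        return True  # a constant feature has zero variance
    # Ratio of the most common value's frequency to the second most common one.
    freq_ratio = tallies[0][1] / tallies[1][1]
    # Number of distinct values as a percentage of the number of samples.
    percent_unique = 100.0 * len(tallies) / len(values)
    return freq_ratio > freq_cut and percent_unique <= unique_cut


FREQ_CUT = 82 / 3  # cutoff stated in the text
UNIQUE_CUT = 5     # cutoff stated in the text

# Hypothetical read counts for two taxa across 88 samples.
rare_taxon = [0] * 86 + [1] * 2   # almost always absent
common_taxon = list(range(88))    # a different count in every sample

rare_flagged = is_near_zero_variance(rare_taxon, FREQ_CUT, UNIQUE_CUT)
common_flagged = is_near_zero_variance(common_taxon, FREQ_CUT, UNIQUE_CUT)
```

In this example only the rare taxon would be removed: its frequency ratio is 43 and only about 2.3% of its values are unique.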

150 3.2.8. Determination of cluster number for data clustering to improve taxa differentiation

151 from Linear discriminant analysis Effect size (LEfSe)

152 In order to differentiate the relative abundance of microbial communities and visualize

153 more informative output figures, simplifying the groups from ten predetermined fermentation

154 stages into fewer clustered time points was needed. We observed that not all stages were

155 significantly different from each other; therefore, clustering stages that were more

156 similar could yield more meaningful groups. To determine the similarity between stages,

157 we selected α-diversity measures representing microbial richness and evenness obtained from individual

158 stages; Shannon’s diversity, Faith’s phylogenetic diversity, and Pielou’s Evenness were

159 imported into R v3.6.3 (R Core Team, 2013). Average Silhouette width (ASw) was used to assess

160 the quality of clustering for eleven different cluster numbers, k (from 2 to 12), with the R

161 package ‘factoextra’ (Alboukadel Kassambara & Fabian Mundt, 2016). Average Silhouette width

162 provides an evaluation for selecting an appropriate number of clusters for downstream analysis

163 (Rousseeuw, 1987). In addition, Koren et al. (2013) recommended the use of at least two different

164 clustering score assessments. Therefore, the total within-cluster sum of squares (TWS) (Elbow

165 method) was used for the cross-validation of clustering number generated by ASw. The

166 Partitioning Around Medoids (PAM) clustering approach using a k-medoid algorithm with a

167 Manhattan distance measurement was chosen to confirm the separation of clusters by microbial

168 factors using the R package ‘cluster’ (Martin Maechler et al., 2019). The final determination of

169 cluster number was based on the separation of sample clusters. Separation of fungal communities

170 into three clusters and bacterial communities into two clusters showed the best performance in

171 distinguishing fermentation stages into categories. Therefore, the 10 pre-determined fermentation

172 stages for ITS2 were categorized into early, middle and late phases (3 clusters), and for 16S were

173 categorized into early and late phases (2 clusters).
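The average silhouette width used to compare candidate cluster numbers can be illustrated with a short Python sketch. It assumes Manhattan distance, as in the PAM step above, and compares two hand-assigned labelings of hypothetical two-dimensional points; it does not perform the PAM clustering itself and is not the ‘factoextra’ or ‘cluster’ implementation.

```python
def manhattan(a, b):
    """Manhattan (city-block) distance between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def average_silhouette_width(points, labels):
    """Mean of s(i) = (b - a) / max(a, b) over all points."""
    widths = []
    for i, point in enumerate(points):
        distances = {}
        for j, other in enumerate(points):
            if i != j:
                distances.setdefault(labels[j], []).append(manhattan(point, other))
        own = distances.pop(labels[i], [])
        if not own or not distances:
            widths.append(0.0)  # singleton cluster or a single cluster overall
            continue
        a = sum(own) / len(own)                                 # mean distance within own cluster
        b = min(sum(d) / len(d) for d in distances.values())    # nearest other cluster
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Hypothetical diversity profiles forming three visually separate groups.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0), (10.0, 1.0)]
labels_k3 = [0, 0, 1, 1, 2, 2]
labels_k2 = [0, 0, 1, 1, 1, 1]

asw_k3 = average_silhouette_width(points, labels_k3)
asw_k2 = average_silhouette_width(points, labels_k2)
```

The labeling with the larger average width (here k = 3) is the better-separated solution, which is the criterion applied when choosing among k = 2 to 12.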

174 3.2.9. Characterization of microbial taxa using LEfSe during fermentation

175 Linear discriminant analysis was used for identification of differential taxa across groups.

176 Data with raw read counts at the genus level were first filtered by the nearZeroVar function in R and

177 one pseudo-count was added to each feature in each sample. Then, CSS normalization was applied

178 to the data without the log2 and Pareto steps because a logarithm is applied automatically in the following

179 discriminant analysis. Then, individual relative abundances obtained from fungal communities

180 were grouped into Early_phase, Mid_phase, and Late_phase (3 clusters) and relative

181 abundances for bacterial communities were grouped into Early_phase and Late_phase (2 clusters)

182 based on the predetermined clustering numbers previously established. These data were then

183 imported onto the Linear discriminant analysis (LDA) Effect Size (LEfSe) platform to differentiate

184 the features between clusters. LEfSe supports multidimensional group comparisons and enables

185 identification of differences among groups by coupling standard tests for statistical significance.

186 To test the statistical differences of features among biological classes, the non-parametric factorial

187 Kruskal-Wallis (KW) sum-rank test was used (Kruskal & Wallis, 1952) followed by the

188 investigation of biological consistency using pairwise tests among subclasses using the Wilcoxon

189 rank-sum test. Finally, LDA was used to estimate the effect size of differentially abundant features

190 (Segata et al., 2011). In this study, an α value of 0.05 was set for both the Kruskal-Wallis and Wilcoxon

191 tests, an LDA score ≧ 3 was set as the threshold for discriminative features, and the all-against-all

192 approach was selected for multi-class analysis in order to acquire more meaningful results. For the

193 purpose of LEfSe and consideration of information that can be extracted from this data analysis,

194 unidentified and unrecognizable microorganisms were removed manually from the histogram and

195 the cladogram (Pandit et al., 2018; Segata et al., 2011).
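The first step of the LEfSe procedure, the Kruskal-Wallis test across classes, is based on the H statistic, which can be sketched as follows. This is a simplified illustration that omits the tie correction and the p-value lookup; the phase labels mirror the clusters above, but the abundance values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (n (n + 1)) * sum(R_g^2 / n_g) - 3 (n + 1), without tie correction."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum * rank_sum / len(group)
        start += len(group)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical abundances of one genus in the three fungal phases.
early_phase = [1.0, 2.0, 3.0]
mid_phase = [4.0, 5.0, 6.0]
late_phase = [7.0, 8.0, 9.0]

h_statistic = kruskal_wallis_h([early_phase, mid_phase, late_phase])
```

A larger H indicates stronger separation of the rank distributions among phases; in LEfSe, features passing this test proceed to the pairwise Wilcoxon step and the LDA effect size estimate.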

196 3.2.10. HS-SPME-GC-MS for identification and quantification of volatile compounds

197 throughout fermentation of Chambourcin red wine

198 Volatile compounds present in samples obtained across fermentation stages were detected

199 and quantified using gas chromatography−mass spectrometry (GC-MS) (Agilent Technologies

200 7890B GC System-5977B MSD). Chromatographic separations were carried out using Rtx-Wax

201 capillary columns (25 m length, 0.25 mm inner diameter, 0.25 μm film thickness) with a Carbowax™

202 polyethylene glycol stationary phase (RESTEK, Bellefonte, PA, USA). A 2 cm solid phase

203 microextraction (SPME) fiber coated with divinylbenzene, carboxen, and polydimethylsiloxane

204 (DVB/CAR/PDMS) was chosen for extracting different kinds of volatile compounds. 2-Octanol

205 (13.7 mg/L) and Naphthalene-D8 (9.9 mg/L) were used as internal standards (IS) for the

206 normalization of volatile compounds. Each batch contained (1) one blank vial containing 10 μL of

207 IS mix (BLANK), (2) one vial containing 10 μL of C8-C20 standard solution (SIGMA-ALDRICH,

208 Saint Louis, MO, USA), (3) vials containing 10 μL of IS mix with 2 mL of sample, 3 g of NaCl,

209 and 0.5 g D-gluconic acid lactone (as an inhibitor of grape β-glucosidase activity), and (4) three

210 vials containing 10 μL of IS mix (protocol was modified from Pedneault, 2013). Each glass vial

211 was incubated at 30°C for 5 min with shaking at 250 rpm. The SPME fiber was exposed to the

212 headspace of the inner vial for 30 min. Then, the fiber was injected into the column using the

213 splitless mode and thermally desorbed for 10 min from initial temperature 60°C to 240°C with

214 10.00°C/s rate. The oven temperature was programmed as follows: an initial temperature of 30°C held for 1

215 min, then raised to 250°C at a rate of 10°C/min and held for 5 min, giving a total run time of 28 min with a column

216 flow rate of 1 mL/min. The mass ion source and quadrupole temperatures were set at 230°C and 150°C,

217 respectively, and 70 eV was used for electron energy. Mass spectral data were collected based on

218 a 33 – 350 amu scan range, 2 A/D sampling rate, threshold 100, and 10 min run time (MS only).
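The stated total oven run time can be checked from the program parameters given above: a 1 min initial hold, a ramp from 30°C to 250°C at 10°C/min, and a 5 min final hold. The helper function below is a hypothetical name introduced only for this arithmetic check.

```python
def oven_run_time(start_c, end_c, ramp_c_per_min, initial_hold_min, final_hold_min):
    """Total run time (min) = initial hold + ramp duration + final hold."""
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Values taken from the oven program described in the text.
total_min = oven_run_time(start_c=30, end_c=250, ramp_c_per_min=10,
                          initial_hold_min=1, final_hold_min=5)
```

The ramp takes 22 min, so the holds and the ramp together account for the 28 min run time reported.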

219 3.2.11. Mass spectral deconvolution and preprocessing of wine metabolite data for regression

220 analyses

221 Gas chromatography and mass spectrometry data were imported to OpenChrom, a cross-

222 platform open source software (Wenig & Odermatt, 2010). Denoising was first carried out by

223 removing silyl derivative- and phthalate-related ions (m/z 147, 148, and 149) and column bleeding-

224 related ions (m/z 207, 221, 267, and 281) from the mass spectra (Halket & Zaikin, 2003; Marinetti, 2007;

225 Rood, 1997), followed by the Savitzky-Golay filter (width, default = 15; order, default = 2), a

226 digital smoothing process to increase the precision of data extraction from the spectra (Manfred,

227 1981; Bromba & Ziegler, 1981; Savitzky & Golay, 1964). PARAllel FACtor analysis 2

228 (PARAFAC2) based Deconvolution and Identification System (PARADISe) (version 3.9) was

229 used to deconvolute overlapping signals, resolve chromatographic peaks with a low signal-to-noise (S/N) ratio,

230 and address retention time shifts (Johnsen et al., 2017). The PARAFAC2 model was built

231 with the non-negativity constraint and 5,000 iterations for manually set retention

232 time intervals. One to seven components calculated from the PARAFAC2 model were selected by

233 the user to differentiate the underlying co-eluted compounds and baseline. Deconvoluted mass

234 spectra were identified using the National Institute of Standards and Technology (NIST14) mass

235 spectral library (version 2.2; Anzor Mikaia et al., 2014). A spectral library match of at least 70%,

236 a match with literature retention indices, and the presence of a compound in all nine wineries were

237 set as the selective cutoff for identified compounds (n = 84). Relative concentrations of each

238 compound were normalized using the mean of the two IS for downstream analysis, and the

239 relative abundance in blank samples was subtracted. The normalized relative concentration of each compound

240 was log2-transformed to correct for data skewness and Pareto scaling was applied to reduce the masking

241 effect of highly abundant metabolites in MetaboAnalyst prior to downstream regression analyses

242 (Chong et al., 2019).
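The preprocessing chain for a single compound (internal standard normalization, blank subtraction, log2 transformation, and Pareto scaling) can be sketched as follows. This is a minimal illustration that assumes Pareto scaling means mean-centering followed by division by the square root of the standard deviation; the function names and peak areas are hypothetical, and the sketch assumes the blank-corrected ratios remain positive.

```python
import math
import statistics


def pareto_scale(values):
    """Mean-center, then divide by the square root of the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / math.sqrt(sd) for v in values]


def preprocess_compound(peak_areas, is1_areas, is2_areas, blank_ratio):
    """Normalize by the mean of two IS, subtract the blank, log2-transform, Pareto-scale."""
    logged = []
    for area, is1, is2 in zip(peak_areas, is1_areas, is2_areas):
        ratio = area / ((is1 + is2) / 2) - blank_ratio
        logged.append(math.log2(ratio))  # requires ratio > 0 (assumption of this sketch)
    return pareto_scale(logged)


# Hypothetical peak areas for one compound in three samples.
peak_areas = [200.0, 400.0, 800.0]
is1_areas = [100.0, 100.0, 100.0]
is2_areas = [100.0, 100.0, 100.0]

scaled = preprocess_compound(peak_areas, is1_areas, is2_areas, blank_ratio=0.0)
```

Because Pareto scaling divides by the square root of the standard deviation rather than the standard deviation itself, highly variable compounds are down-weighted less aggressively than under unit-variance scaling.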

243 3.2.12. Distribution and statistical analyses of wine volatile compounds during the

244 fermentation process

245 Chromatographic peaks identified by GC-MS from the first to the last stage of fermentation

246 were aggregated across the nine wineries to demonstrate the differences in the number of

247 compounds as well as changes in relative concentration throughout the fermentation process.

248 Furthermore, 84 identified compounds were categorized into different groups based on their

249 chemical structure as previously published (Ilc et al., 2016). The relative concentration of

250 compounds without logarithm transformation and scaling were averaged across the nine wineries

251 for each fermentation stage and used in demonstrating the overall distribution of compounds.

252 Furthermore, to test the significant difference of relative concentration among the different

253 fermentation stages, preprocessed metabolite data were imported into MetaboAnalyst (Chong et

254 al., 2019). The Kruskal-Wallis nonparametric test was used to identify the compounds that differed

255 significantly in concentration across the ten fermentation stages using an FDR-corrected p-value threshold of

256 0.05.
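The FDR correction applied to the per-compound p-values can be illustrated with the Benjamini-Hochberg procedure. The text does not state which FDR method MetaboAnalyst applied, so Benjamini-Hochberg is an assumption here, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return FDR-adjusted p-values (q-values) in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical Kruskal-Wallis p-values for four compounds.
p_values = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(p_values)
significant = [q < 0.05 for q in q_values]
```

In this example three compounds have raw p-values below 0.05, but only the first remains significant after correction.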

257 3.2.13. Partial least squares-discriminant analysis (PLS-DA), a dimensionality reduction

258 approach for regional classification of wine microbiome and metabolites datasets

259 Partial least squares-discriminant analysis (PLS-DA) is a supervised method using

260 multivariate regression to calculate linear combinations of independent variables (fungal, bacterial

261 community or wine metabolites) that may help explain the response of different class members

262 (regions). Relative abundances obtained from microbiome data (fungal and bacterial communities)

263 and metabolomics data obtained from HS-SPME-GC-MS were subjected to PLS-DA in

264 MetaboAnalyst to assess regional discrimination using the classifier regions, Central and Northeast

265 PA. The predictive ability of the PLS-DA model was estimated by Q2, calculated via leave-one-out

266 cross-validation. To test the effectiveness of discrimination, a permutation test was conducted

267 based on the ratio of between group sum of squares and within group sum of squares (B/W-ratio)

268 with 1000 permutations. Variable Importance in Projection (VIP) scores, a measure of the

269 importance of each feature (either relative abundance of bacterial and/or fungal microbiome data

270 or metabolomics data) in the PLS-DA discrimination, were used and features with VIP scores > 1

271 were considered for downstream analysis and interpretation.

272 3.2.14. Construction of multivariate data frame and two-level correlation models

273 A two-level approach was conducted to identify potential association pairs of

274 microorganisms and volatile metabolites. First, considering the feature size and complexity of the

275 multivariate data, the data frame was composed of preprocessed data of microbial abundance at

276 the genus level (first level) and metabolites on the same experimental sample size. Each variable

277 in the microbiome data was denoted as the X data set and metabolite data was denoted as the Y

278 data set in the modeling procedure. Regularized Canonical Correlation Analysis (rCCA) in the R

279 package “mixOmics” was chosen to explore maximum correlations between the two data sets

280 while overcoming the condition where the total number of variables is higher than the number of

281 samples (González et al., 2008). Leave-one-out (LOO) cross validation was used to examine the

282 goodness of the regularization parameters. The rCCA modeling was then performed using the

283 optimized regularization parameters via the ridge method. The plotvar function was used to

284 display variables having a correlation coefficient above 0.5. Finally, a correlation heatmap which

285 contained the similarity matrix between the variables was generated using the function cim, and

286 complete linkage of Euclidean distances was used in a hierarchical clustering to provide order to

287 the variables. Fungi and bacteria with the ten highest positive or negative loading factor values

288 were selected from each first component as candidate taxa for the second-level correlation analysis.

289 To identify correlation patterns between the microbiome at a lower taxonomic level and metabolites, the raw

290 read counts of fungal and bacterial features at the species level were exported from the taxonomic

291 data, followed by near-zero variance predictor filtering, a log2 transformation and Pareto scaling. Finally,

292 the taxa within the selected genera which contained identified and recognizable species names were

293 extracted from the processed data and combined with metabolite data to form a species level

294 dataframe. Spearman correlation was analyzed using the R package “ggcorrplot” where the

295 resulting correlation plot only displays correlation coefficients with p-values less than 0.05

296 (Alboukadel Kassambara, 2019).
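The species-level Spearman correlation can be illustrated as the Pearson correlation of rank-transformed values. The sketch below computes only the coefficient, not the p-value used to filter the plot, and the abundance and concentration vectors are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical species abundance and metabolite concentration across five samples.
species_abundance = [1.0, 2.0, 3.0, 4.0, 5.0]
metabolite_level = [2.0, 4.0, 8.0, 16.0, 32.0]

rho_increasing = spearman(species_abundance, metabolite_level)
rho_decreasing = spearman(species_abundance, list(reversed(metabolite_level)))
```

Because only ranks are used, a monotonic but non-linear relationship such as the doubling series above still yields a coefficient of 1.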

297 3.3. Results

298 3.3.1. Illumina sequencing data provided information of sequence counts and total features

299 from ITS2 and 16S rRNA gene sequences.

300 Paired-end sequencing of ITS2 sequences (ITS2) and 16S rRNA gene sequences (16S) was

301 performed on an Illumina MiSeq platform. Raw sequence data were processed through the most

302 updated version of QIIME2 (v2019.7) which included a denoising pipeline where non-clustered

303 and chimeric sequences were filtered out. Total sequence counts after denoising process of ITS2

304 and 16S were 3,558,611 and 2,158,724, respectively (Table 3-3). Both fungal and bacterial

305 sequence counts ranged between 20,000 and 40,000 (Figure 3-3) with an average of 40,439 and

306 24,531 counts for fungal and bacterial sequences, respectively. In addition, the number of detected features,

307 also known as amplicon sequence variants (ASVs) in QIIME2 or OTUs in QIIME1, was determined. Since OTUs

308 is a more common term used in microbiome literature, I chose to use OTUs for the remainder of

309 this thesis. Four hundred and eighty-four OTUs were identified within fungal communities, which

310 was lower than the 3,935 OTUs identified within bacterial communities. Although the identity of

311 these sequences has not been characterized, this indicates that the variety of bacteria during

312 fermentation could be higher than that of fungi.

Table 3-3: Denoised ITS2 and 16S sequence counts and total features of wine samples.

Winery samples               ITS2 sequence counts   16S sequence counts
PA19_01_S1                   37325                  22731
PA19_01_S2                   31675                  18258
PA19_01_S3                   44266                  25769
PA19_01_S4                   60839                  28397
PA19_01_S5                   52122                  21183
PA19_01_S6                   49151                  18784
PA19_01_S7                   21749                  26219
PA19_01_S8                   37685                  15404
PA19_01_S9                   33591                  29462
PA19_01_S10                  28972                  25033
PA19_02_S1                   36070                  13822
PA19_02_S2                   52950                  17314
PA19_02_S3                   47946                  19941
PA19_02_S4                   66505                  24732
PA19_02_S5                   53792                  27150
PA19_02_S6                   42174                  22086
PA19_02_S7                   26156                  15105
PA19_02_S8                   38930                  14034
PA19_02_S9                   32069                  35169
PA19_02_S10                  16306                  975*
PA19_03_S1                   82023                  23536
PA19_03_S2                   51495                  26266
PA19_03_S3                   67506                  13448
PA19_03_S5                   23804                  20523
PA19_03_S6                   37792                  10258
PA19_03_S7                   34587                  16705
PA19_03_S8                   803*                   12198
PA19_03_S9                   1623*                  9635
PA19_03_S10                  2902*                  19202
PA19_04_S1                   60926                  19878
PA19_04_S2                   47524                  20083
PA19_04_S4                   83135                  18873
PA19_04_S5                   35713                  13331
PA19_04_S6                   58153                  26846
PA19_04_S7                   35955                  15586
PA19_04_S8                   32932                  49675
PA19_04_S9                   20847                  39029
PA19_04_S10                  24999                  28245
PA19_05_S1                   46802                  24001
PA19_05_S2                   40926                  25944
PA19_05_S3                   20376                  27880
PA19_05_S4                   46667                  26039
PA19_05_S5                   39009                  31484
PA19_05_S6                   63278                  20521
PA19_05_S7                   44242                  34031
PA19_05_S8                   19124                  19250
PA19_05_S9                   38321                  32537
PA19_05_S10                  28704                  66069
PA19_06_S1                   23380                  31823
PA19_06_S2                   24479                  33878
PA19_06_S3                   57882                  26753
PA19_06_S4                   68281                  30695
PA19_06_S5                   49927                  37485
PA19_06_S6                   58262                  21591
PA19_06_S7                   34332                  37084
PA19_06_S8                   43727                  26839
PA19_06_S9                   54418                  35874
PA19_06_S10                  25036                  37833
PA19_07_S1                   25805                  34451
PA19_07_S2                   142380                 33279
PA19_07_S3                   21329                  19822
PA19_07_S4                   35349                  32808
PA19_07_S5                   35452                  30797
PA19_07_S6                   46723                  24479
PA19_07_S7                   39341                  18798
PA19_07_S8                   59548                  19577
PA19_07_S9                   32068                  20056
PA19_07_S10                  33248                  19642
PA19_08_S1                   25598                  12242
PA19_08_S2                   45453                  18472
PA19_08_S3                   64841                  8988
PA19_08_S4                   45512                  16229
PA19_08_S5                   45683                  24349
PA19_08_S6                   64249                  22885
PA19_08_S7                   46574                  28625
PA19_08_S8                   43788                  34996
PA19_08_S9                   41463                  37833
PA19_08_S10                  34456                  14515
PA19_09_S1                   27959                  28350
PA19_09_S2                   31303                  28852
PA19_09_S3                   32092                  18013
PA19_09_S4                   24795                  46578*
PA19_09_S5                   31165                  17675
PA19_09_S6                   38417                  25224
PA19_09_S7                   18101                  20771
PA19_09_S8                   21843                  14867
PA19_09_S9                   32640                  24887
PA19_09_S10                  31271                  28168
Total sequence counts        3,558,611              2,158,724
Mean sequence counts ± SEM   40,439 ± 2,070         24,531 ± 1,028
Number of features           484                    3,935

Figure 3-3: Distributions of sequence counts. (A) Fungal communities; (B) Bacterial communities.

318 3.3.2. Alpha rarefaction determination of optimal sequencing counts for normalization of

319 fungal and bacterial sequences.

320 To obtain the optimal sequencing depth for normalization of total sequencing

321 counts in each sample, rarefaction analysis was used to minimize possible biases originating from

322 the sequencing process to provide a relatively unbiased result for downstream analyses (Moyer et

323 al., 1998). Rarefaction curves were plotted using OTUs to represent estimated feature richness

324 against the specific number of sequencing counts among samples. In this study, our sequencing

325 data was trained against the UNITE fungal and SILVA bacterial reference-based phylogenetic tree

326 for calculation of Faith’s richness which results in a more precise estimation than OTUs (McCoy

327 & Matsen, 2013). The results of rarefaction curves demonstrated that approximately 16,000 to

328 17,000 sequencing counts could reach the saturation of richness for ITS2 and around 7,000 to

329 8,000 for 16S (Figure 3-4). Therefore, the minimum library size (also referred to as ‘rarefaction

330 level’) helped us determine a cut-off level which removes samples with sequencing counts lower

331 than 16,000 in ITS2 and 7,000 in 16S for the downstream analyses. As a result of this cut off, 3

332 samples (PA19_03_S8, PA19_03_S9, PA19_03_S10) were removed for ITS2 analyses and one

333 sample PA19_02_S10 was removed from 16S analyses (see Table 3-2).
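The effect of the depth cut-offs can be reproduced on a few rows copied from Table 3-3. The sketch below applies the 16,000 (ITS2) and 7,000 (16S) thresholds stated above; the dictionary layout and function name are hypothetical, and only a subset of samples is included for brevity.

```python
# (ITS2, 16S) denoised read counts copied from selected rows of Table 3-3.
counts = {
    "PA19_02_S9": (32069, 35169),
    "PA19_02_S10": (16306, 975),
    "PA19_03_S7": (34587, 16705),
    "PA19_03_S8": (803, 12198),
    "PA19_03_S9": (1623, 9635),
    "PA19_03_S10": (2902, 19202),
}

ITS2_CUTOFF = 16000
S16_CUTOFF = 7000


def below_cutoff(table, column, cutoff):
    """Return the sample names whose count in the given column is under the cut-off."""
    return sorted(name for name, row in table.items() if row[column] < cutoff)


its2_removed = below_cutoff(counts, 0, ITS2_CUTOFF)
s16_removed = below_cutoff(counts, 1, S16_CUTOFF)
```

Applied to these rows, the thresholds flag the same three ITS2 samples and the single 16S sample named in the text.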

Figure 3-4: Alpha rarefaction curve on sequence depth. The minimum read count for sufficient richness was around 16,000 in (A) fungal communities and around 7,000 in (B) bacterial communities. Each colored line represents the samples from each winery and fermentation stage.

340 3.3.3. Individual winery taxonomic plots preserve the representation of microbial terroir

341 better than aggregated taxonomic representation.

342 The goal of this section is to present results of aggregated taxonomic plots and individual

343 winery taxonomic plots and compare/contrast fungal and bacterial plots. To demonstrate the

344 distribution of fungal and bacterial populations throughout Chambourcin

345 fermentation and winemaking, a comprehensive taxonomical composition demonstrated a total of

346 166 fungal genera as identified by ITS2 and 507 unique bacterial genera identified by 16S.

347 Unidentified genera are listed at the family level. Sample PA19_09_S4 was removed from the

348 bacterial analyses pipeline due to failure in taxonomic identification as only taxa at the kingdom

349 level was detected.

350 In this study, it is important to note that we did not suggest or require changes to current

351 winemaking practices. Two steps in the winemaking process call for the addition of commercially

352 produced microorganisms, (1) Saccharomyces cerevisiae used for the alcoholic fermentation and

353 (2) Oenococcus oeni used for malolactic fermentation. Addition of commercial S. cerevisiae

354 strains was indicated by dotted lines in Figure 3-5 (Panel A) and O. oeni in (Panel B). Non-S.

355 cerevisiae yeasts (we collectively call this group ‘native wild yeasts’) and non-O. oeni bacteria

356 were dominant in the first few stages of fermentation where the commercial microorganisms, S.

357 cerevisiae and O. oeni, only occupied about 10% of the total fungal and bacterial populations.

358 However, as fermentation progressed, both S. cerevisiae and O. oeni eventually became the main

359 microorganisms, taking over 80% of whole populations (Figure 3-5).

360 Although wild yeast and bacteria only made up approximately 10% of the total population

361 at the end of fermentation, the evenness of bacterial compositions was observed to be higher than

362 that of fungi. The distribution of fungal communities demonstrates that a few wild yeast taxa were

363 distinctly predominant among the whole populations and S. cerevisiae became the dominant

364 species after stage 2 of fermentation (over 50% of total fungal population). Additionally,

365 Starmerella was the most abundant wild yeast across all stages (34.32% in stage 1 and 49.88% in

366 stage 10 without S. cerevisiae) followed by Aureobasidium (26.97% in stage 1 and 13.77% in stage

367 10). Interestingly, while the abundance of most wild yeasts experienced a decreasing trend

368 throughout fermentation, Kazachstania and Torulaspora showed a different pattern with

369 increasing abundances toward the end of fermentation (Figure 3-5A).

370 O. oeni was the dominant species of fermentative bacteria after stage 7 (over 50% of

371 total bacterial population). Relative to indigenous bacteria (non-Oenococcus bacteria),

372 Sphingomonas (11.02% in S1 and 9.40% in S10 without O. oeni) and Enterobacteriaceae (family

373 level, 6.61% in S1 and 15.23% in S10 without O. oeni) were the most abundant taxa in the bacterial

374 communities across fermentation. The majority of bacterial taxa showed very little fluctuation across

375 fermentation stages. For example, Pseudomonas (14.50% in S10) and Lactobacillus (7.95% in

376 S10) had relatively higher abundance among non-Oenococcus taxa in the later fermentation stages

377 than others (Figure 3-5B). Even though commercial S. cerevisiae and O. oeni dominated

378 throughout many fermentation stages, we found that both fungal and bacterial communities

379 showed high diversity during early stages of fermentation. Interestingly, we observed that a small

380 population of non-S. cerevisiae and non-O. oeni persisted within the population until the end of

381 fermentation.

Figure 3-5: Fungal and bacterial taxonomic composition is influenced by fermentation state of Chambourcin hybrid grapes. Taxonomic plots demonstrate the relative abundance of top 20 (A) fungal and (B) bacterial taxa throughout fermentation stages (denoted ‘S’ on the x-axis). Saccharomyces cerevisiae is typically added into wine fermentation and its relative abundance is indicated by the dotted line in Panel A. Oenococcus oeni is typically added later in fermentation to drive malolactic fermentation (MLF) and its relative abundance is indicated by the dotted line in Panel B. Unidentified genera were only shown at the family level and ‘unidentified’ taxa were grouped into a category “f__others”.

392 Next, we were interested in the distribution of fungal communities from Central and

393 Northeast regions. S. cerevisiae showed dramatic increase in abundance after stage 2 in both

394 regions. We also observed higher taxonomic diversity in earlier stages of fermentation with the

395 wild yeasts Starmerella and Aureobasidium dominating the population (Figure 3-6). We further

396 observed key differences in taxonomic abundances between two regions. For example, we

397 observed higher abundance of Starmerella in wineries of the Central region (46.46% in Central

398 and 19.14% in Northeast) while higher abundances of Mycosphaerella were observed in the

399 Northeast region (0.86% in Central and 9.43% in Northeast) at the beginning of fermentation

400 (S1). Also, in the later fermentation stages (S8 – S10), Central region showed higher abundance

401 of Kazachstania but Northeast region showed higher abundance of another genus, Torulaspora

402 (Figure 3-6).

403 On the other hand, both regions demonstrated a similar growth pattern of O. oeni, which was

404 predominant after the middle fermentation stages (S6-7). Also, higher bacterial diversity was

405 present in both regions at the beginning of fermentation. In addition, wineries in the Northeast

406 region showed higher abundance of Enterobacteriaceae (family) than Central region throughout

407 fermentation stages. Central region had higher abundance of Sphingomonas throughout

408 fermentation stages and Lactobacillus in the middle and later fermentation stages (S5 – S10).

409 Interestingly, it was also observed that the relative abundance of the genus Methylobacterium was

410 relatively stable throughout fermentation stages from both regions (Figure 3-7). Although

411 statistical analysis was not performed on relative abundances in taxonomic plot data, we identified

412 key regional differences in fungal and bacterial communities between Central and Northeast

413 regions.

Figure 3-6: Fungal taxonomic composition presented from (A) Central and (B) Northeast region. Taxonomic plots demonstrate the relative abundance of top 20 fungal taxa throughout fermentation stages (denoted ‘S’ on the x-axis). Saccharomyces cerevisiae is typically added into wine fermentation and its relative abundance is indicated by the dotted line. Unidentified genera were only shown at the family level and ‘unidentified’ taxa were grouped into a category “f__others”.

Figure 3-7: Bacterial taxonomic composition presented from (A) Central and (B) Northeast region. Taxonomic plots demonstrate the relative abundance of top 20 bacterial taxa throughout fermentation stages (denoted ‘S’ on the x-axis). Oenococcus oeni is typically added later in fermentation to drive malolactic fermentation (MLF) and its relative abundance is indicated by the dotted line. Unidentified genera were only shown at the family level and ‘unidentified’ taxa were grouped into a category “f__others”.

427 We observed that the overall distribution of microbial patterns in each fermentation stage

428 was impacted by the introduction of commercial microorganisms that reshaped the communities.

429 We also observed that distribution patterns of microbial communities from two regions

430 demonstrate distinct microbial terroir. However, similarities and differences in microbial patterns

431 could be due to regional differences in winemaking practices. Therefore, investigating the

432 microbial distribution of individual wineries might preserve a more detailed view of regional or

433 vineyard differences. Based on the hierarchical analysis of fungal communities, wineries PA19_01

434 and 02 had relatively similar fungal signatures; Starmerella was largely predominant in the wild

435 yeast population, with a relatively higher abundance of Mycosphaerella. Also, wild yeast populations

436 were mainly dominated by both Starmerella and Aureobasidium in wineries PA19_06 and 07.

437 Additionally, PA19_09 showed the most distinct fungal signature which could be due to the higher

438 abundance of Cladosporium and Pichia in the earlier fermentation stages and Torulaspora in the

439 later stages (Figure 3-8A).

440 Similar to the patterns of S. cerevisiae dominating toward the end of fermentation, we

441 observed O. oeni becoming a dominant species toward the end of fermentation, although

442 abundances in populations of indigenous bacteria were relatively even (Figure 3-5B). Taxonomic

443 patterns of diversity highlight individuality in microbial terroir of PA wineries. High evenness of

444 bacterial populations during fermentation was observed in three wineries, PA19_02, 07, and 09.

445 Wineries PA19_01 and 03 were clustered together, showing relatively higher abundance of

446 Lactobacillus in the middle to later stages, followed by the group of PA19_04, 05, and 06 that

447 showed higher abundance of Methylobacterium. Finally, winery PA19_08 exhibited dramatically

448 high levels of Enterobacteriaceae (at the family level) across all stages compared with others (Figure

449 3-8B).

Therefore, taxonomic classification of fungal and bacterial communities within individual wineries across two regions enabled observation of the microbial patterns which drive Chambourcin grape fermentation (Figure 3-4 and 5). In addition, we observed regional differences in fungal and bacterial populations in wineries of the Central and Northeast regions (Figure 3-6 and 7). Finally, the high-resolution microbial taxonomic compositions obtained from each winery provide a deeper view of the winemaking practices that influence microbial compositions of Chambourcin red wine (Figure 3-8).


Figure 3-8: High microbial diversity was observed at the individual winery level. For each winery, the relative abundance of the top 20 fungal (A) and bacterial (B) taxa that differed significantly not only by fermentation stage but also by winery are shown. Superimposed on each panel are (A) Saccharomyces cerevisiae and (B) Oenococcus oeni relative abundances. Dendrograms of the wineries are displayed based on the relative abundance of the taxa at the genus level in each winery, using Euclidean distances and the Ward clustering algorithm.


3.3.4. The process of fermentation contributes to the overall decline of microbial diversity but impacts fungal and bacterial communities differently.

In section 3.3.3, I examined how an aggregated taxonomic representation differs from the representation of individual taxonomic differences between bacterial and fungal species. In this section, my research question was: how does the fermentation process influence microbial diversity? Specifically, I was interested in whether microbial populations differ statistically across fermentation stages. Microbial diversity is defined as the variety of microorganisms in a given ecosystem. We used a previously established biodiversity analysis pipeline which focuses on α-diversity measurements (“Moving Pictures” tutorial, https://docs.qiime2.org/2020.2/tutorials/moving-pictures/#) with modifications in the fragment-insertion plugin (Mirarab et al., 2012) and q2-ghost-tree plugin (Fouquier et al., 2016). α-diversity summarizes the diversity within an ecological community and is commonly used to assess the statistical significance of microbial fluctuations during the fermentation process.

α-diversity indices (Shannon index, Faith’s PD, and Pielou’s evenness) were used to measure microbial richness and evenness within each individual fermentation stage. Microbial richness is defined as the number of different taxonomic groups present within a sample, whereas microbial evenness indicates the degree of similarity in abundance among the taxa within a sample.
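As a minimal sketch of the two non-phylogenetic indices defined above, the following Python code computes the Shannon index and Pielou’s evenness from a vector of taxon counts. The count vectors are hypothetical, the logarithm base of 2 is an assumption (the text does not state which base was used), and Faith’s PD is omitted because it additionally requires a phylogenetic tree. This is an illustration only, not the QIIME 2 implementation used in this study.

```python
import math


def shannon_index(counts, base=2.0):
    """Shannon diversity H = -sum(p_i * log(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in proportions)


def observed_richness(counts):
    """Number of taxa with at least one observation in the sample."""
    return sum(1 for c in counts if c > 0)


def pielou_evenness(counts, base=2.0):
    """Pielou's J = H / log(S), where S is the observed richness."""
    richness = observed_richness(counts)
    if richness <= 1:
        return 0.0  # evenness is undefined for a single taxon; 0 is used by convention here
    return shannon_index(counts, base) / math.log(richness, base)


# Hypothetical samples: same richness (4 taxa), very different evenness.
even_sample = [25, 25, 25, 25]
dominated_sample = [97, 1, 1, 1]

for name, sample in [("even", even_sample), ("dominated", dominated_sample)]:
    print(name, observed_richness(sample),
          round(shannon_index(sample), 3), round(pielou_evenness(sample), 3))
```

Both samples contain four taxa, so richness alone cannot separate them; the Shannon index and Pielou’s evenness drop sharply once a single taxon dominates, which is the situation expected late in fermentation.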

Shannon diversity, which accounts for both ecological richness and evenness, is the most commonly used diversity metric. The Kruskal-Wallis non-parametric test was used to test for significant changes throughout fermentation stages (“non-parametric approaches based on ranks, or medians are robust to outliers” Shaun Burke, 1998). Significant differences in Shannon diversity of fungal communities were observed in stages 5 to 10 compared with stage 1, meaning that the diversity of fungi changed significantly beginning at day 4 of fermentation (Table 3-1 & Figure 3-9A). On the other hand, significant differences in bacteria were observed after stage 7, which indicated a dramatic change in bacterial communities around day 13 of the fermentation process (Table 3-1 & Figure 3-9B). However, because the Shannon index reflects overall diversity, it can be unclear whether richness or evenness drives the differences. Faith’s phylogenetic diversity (Faith’s PD), computed with the prebuilt phylogenetic tree, was used to measure species richness within individual samples, and Pielou’s index was used for the evenness of species abundance. Although both indices showed similar decreasing trends as fermentation progressed, we observed a significant decline in richness and evenness in fungal communities after stage 3, indicating that the number of different taxa decreased dramatically, followed by succession of the community by the remaining species in the population (Figure 3-10A & C). In contrast, only Faith’s PD produced significant results in bacterial communities, though both indices showed a decreasing trend during fermentation (Figure 3-10B & D). Thus, for bacterial communities, changes in Shannon diversity were driven primarily by richness, which suggests a decrease in the number of bacterial taxa while the evenness of their abundances remained relatively unchanged.
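The rank-based comparison of stages described above can be sketched as follows. This is a plain-Python illustration of the Kruskal-Wallis H statistic with tie correction, not the QIIME 2 routine used in this study; the per-stage Shannon values are hypothetical, and the conversion of H to a p-value (a chi-square distribution with k − 1 degrees of freedom) is left out.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (tie-corrected) for a list of groups."""
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_term = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        rank_term += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * rank_term - 3.0 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_sum = sum(t ** 3 - t for t in tie_counts.values())
    correction = 1.0 - tie_sum / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0


# Hypothetical Shannon diversity values for three fermentation stages.
stage_1 = [4.1, 3.9, 4.3]
stage_5 = [3.0, 2.8, 3.2]
stage_10 = [1.2, 1.5, 1.1]
print(kruskal_wallis_h([stage_1, stage_5, stage_10]))
```

Because the statistic depends only on ranks, a single extreme sample cannot dominate the result, which is the robustness property quoted in the text.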

In conclusion, α-diversity analyses demonstrated decreasing patterns of richness in both fungal and bacterial communities, but significant differences were only detected in fungal community richness. Interestingly, fungal richness decreased earlier than bacterial richness.


Figure 3-9: High microbial diversity observed in early stages of wine fermentation. Shown here are the rarefied α-diversity distributions calculated as Shannon diversity index at each fermentation stage from nine wineries. Box plots show 1.5*IQR (interquartile range) / sqrt(n), which corresponds to a 95% confidence interval. Panel (A) represents fungal richness and (B) represents bacterial richness. Significant differences between stage 1 and other groups were determined using Kruskal-Wallis test with FDR adjusted p-value (q-value). * q-value < 0.05, ** q-value < 0.01, *** q-value < 0.005.


Figure 3-10: Rarefied α diversity distributions of microbial richness and evenness throughout stages of fermentation with samples collected from nine wineries. (A-B) Faith’s phylogenetic diversity (richness) for fungi (A) and bacteria (B) and Pielou’s evenness for (C) fungi and (D) bacteria. Significant differences between stage 1 and other groups were determined using Kruskal-Wallis test with FDR adjusted p-value (q-value). * q-value < 0.05, ** q-value < 0.01, *** q-value < 0.005.


Next, we were interested in how individual taxa in a particular microbial community contribute to the differences between fermentation stages. This can be computed using distance metrics (β-diversity), which account for the taxonomy and abundance of one species relative to another. In other words, β-diversity describes the degree of variance between samples, revealing how microbial dissimilarity (differences in taxa) affects the change in a sample’s microbial composition during fermentation. To explore the effects of the fermentation process on each sample, principal coordinate analysis (PCoA) of UniFrac distances, an unsupervised phylogeny-based approach, was chosen for both fungal and bacterial community comparisons. This helps us understand not only diversity within samples but also differences among samples. The UniFrac metric has two forms: unweighted (qualitative) and weighted (quantitative) UniFrac.
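The difference between the two UniFrac forms can be illustrated with a toy example. In the sketch below, a phylogenetic tree is represented simply as a list of branches, each with a length and the set of leaf taxa beneath it; this representation, the taxon labels, and the counts are all hypothetical, and the weighted distance is the raw (non-normalized) form. It is not the q2-diversity implementation used in this study.

```python
def unweighted_unifrac(branches, sample_a, sample_b):
    """Fraction of observed branch length that leads to taxa in only one sample."""
    present_a = {t for t, c in sample_a.items() if c > 0}
    present_b = {t for t, c in sample_b.items() if c > 0}
    unique = shared = 0.0
    for length, leaves in branches:
        in_a = bool(leaves & present_a)
        in_b = bool(leaves & present_b)
        if in_a and in_b:
            shared += length
        elif in_a or in_b:
            unique += length
    total = unique + shared
    return unique / total if total else 0.0


def weighted_unifrac(branches, sample_a, sample_b):
    """Raw weighted UniFrac: branch lengths weighted by abundance differences."""
    total_a = sum(sample_a.values())
    total_b = sum(sample_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both samples must contain at least one observation")
    distance = 0.0
    for length, leaves in branches:
        frac_a = sum(sample_a.get(t, 0) for t in leaves) / total_a
        frac_b = sum(sample_b.get(t, 0) for t in leaves) / total_b
        distance += length * abs(frac_a - frac_b)
    return distance


# Toy tree ((A:1,B:1):1,(C:1,D:1):1) written as (branch length, leaves below).
toy_branches = [
    (1.0, frozenset({"A"})), (1.0, frozenset({"B"})),
    (1.0, frozenset({"A", "B"})),
    (1.0, frozenset({"C"})), (1.0, frozenset({"D"})),
    (1.0, frozenset({"C", "D"})),
]

early = {"A": 19, "B": 1}   # hypothetical counts
late = {"A": 1, "B": 19}    # same taxa, reversed abundances
other = {"C": 5, "D": 5}    # entirely different taxa

print(unweighted_unifrac(toy_branches, early, late),
      weighted_unifrac(toy_branches, early, late))
print(unweighted_unifrac(toy_branches, early, other),
      weighted_unifrac(toy_branches, early, other))
```

The `early` and `late` samples share the same taxa, so the unweighted distance cannot distinguish them, whereas the weighted distance responds to the shift in relative abundance. This is why the two forms can lead to different conclusions for the same set of samples.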

Unweighted UniFrac accounts only for the taxonomic differences between samples. My results demonstrate that clustering of samples by fermentation stage was readily apparent. Later stages were mainly located on the left side of the plot and earlier stages on the right side of the panels for both fungal and bacterial consortia (Figure 3-11A & B). Furthermore, permutational multivariate analysis of variance (PERMANOVA) tests indicated that microbial taxonomy was significantly different between at least two fermentation stages. Meanwhile, fermentative fungi had a higher ratio of between-group to within-group variance compared to bacteria (fermentative fungi, pseudo-F = 3.424, P = 0.001; bacteria, pseudo-F = 1.677, P = 0.001) (Figure 3-11A & B). Results indicate that as fermentation progresses, fungal compositions fluctuate more (ranging from -0.25 to 0.5 on the x-axis and -0.3 to 0.2 on the y-axis) than bacterial compositions (ranging from -0.2 to 0.2 on the x-axis and -0.2 to 0.2 on the y-axis), though the plotted principal coordinates together explained only 22.84% of the variance within bacterial samples. For quantitative analysis, weighted UniFrac, which accounts for both taxonomy and relative abundance of species, demonstrated similar patterns between fungal and bacterial communities and showed that the fermentation stage was the main factor driving dissimilarity among individual samples, with higher explained variances (total 87.16% explained variance in the fungal community and 79.06% in the bacterial community) (Figure 3-11). PERMANOVA further strengthens the observation that fungal and bacterial samples were differentiated predominantly by fermentation stage (fungi, pseudo-F = 6.874, P = 0.001; bacteria, pseudo-F = 8.726, P = 0.001).
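To make the pseudo-F values above easier to interpret, the following sketch computes the PERMANOVA pseudo-F statistic (the ratio of among-group to within-group variation) directly from a distance matrix and estimates a permutation p-value. The sample positions and stage labels are hypothetical, and the 999 permutations are an assumption, since the text does not state how many were used. With p estimated as (hits + 1) / (permutations + 1), 0.001 is the smallest value attainable with 999 permutations.

```python
import random


def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F from a symmetric distance matrix and group labels."""
    n = len(labels)
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    a = len(groups)
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for members in groups.values():
        pair_sum = sum(dist[i][j] ** 2
                       for pos, i in enumerate(members)
                       for j in members[pos + 1:])
        ss_within += pair_sum / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical one-dimensional "ordination" positions for six samples.
positions = [0, 1, 2, 10, 11, 12]
stage_labels = ["early", "early", "early", "late", "late", "late"]
distances = [[abs(p - q) for q in positions] for p in positions]

f_value, p_value = permanova(distances, stage_labels)
print(f_value, p_value)
```

A larger pseudo-F means that samples differ more between stages than within them. Note that with only six samples the p-value in this toy case cannot become very small, however clean the separation, because few distinct relabelings exist.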

Based on the results of the PERMANOVA tests of unweighted and weighted UniFrac distance metrics, we could identify the degree of difference between samples based on fermentation stages and wineries. Specifically, fungal compositions differed significantly among individual wineries in the earlier stages (S1 to S6), based on the larger dissimilarity in PCoA and the PERMANOVA test. In contrast, when only taxonomic differences were considered, changes in bacterial compositions appeared in the middle of fermentation. However, when the relative abundance of each taxon was considered, significant differences in fungal and bacterial communities among groups occurred during early fermentation (Table 3-4). These results demonstrate that the time factor (beginning to end of fermentation) explains a larger share of microbial variation in both fungi and bacteria, which show similar trends in the differences of individual taxon abundances. Furthermore, since fungal communities are associated with fewer taxa compared to bacteria, individual bacterial species could have a higher influence on bacterial populations in a wine system, suggesting the use of different strategies when studying these two communities.


Figure 3-11: Rarefied β-diversity based on unweighted UniFrac distances for (A) fungi and (B) bacteria, and weighted UniFrac distances for (C) fungi and (D) bacteria colored by fermentation stages. Principal Coordinates Analysis (PCoA) of taxonomy data plotted according to the first two principal components across fermentation stages. Statistical significance determined by PERMANOVA, p-value = 0.001.


Table 3-4: Results of the PERMANOVA pairwise comparison of fungal and bacterial β-diversity using unweighted and weighted UniFrac distance metrics. FDR-adjusted p-values (q-value) are reported for statistical significance.

Pairwise PERMANOVA, q-value

Fermentation stages    ITS2                      16S
Group 1    Group 2     Unweighted    Weighted    Unweighted    Weighted
S1         S2          0.061         0.063       0.996         0.910
S1         S3          0.039         0.023       0.996         0.773
S1         S4          0.008         0.020       0.754         0.444
S1         S5          0.008         0.008       0.946         0.073
S1         S6          0.008         0.008       0.090         0.036
S1         S7          0.008         0.008       0.054         0.004
S1         S8          0.008         0.008       0.011         0.004
S1         S9          0.008         0.008       0.011         0.006
S1         S10         0.008         0.013       0.011         0.004
S2         S3          0.947         0.848       0.996         1.000
S2         S4          0.927         0.608       0.996         0.949
S2         S5          0.319         0.151       0.996         0.240
S2         S6          0.039         0.051       0.754         0.057
S2         S7          0.009         0.063       0.548         0.006
S2         S8          0.008         0.052       0.080         0.004
S2         S9          0.008         0.052       0.025         0.004
S2         S10         0.009         0.062       0.011         0.004
S3         S4          0.947         0.848       0.996         0.977
S3         S5          0.544         0.345       0.996         0.240
S3         S6          0.097         0.038       0.996         0.080
S3         S7          0.015         0.151       0.552         0.006
S3         S8          0.009         0.033       0.168         0.004
S3         S9          0.008         0.008       0.069         0.004
S3         S10         0.013         0.038       0.018         0.004
S4         S5          0.679         0.739       0.996         0.366
S4         S6          0.137         0.061       0.754         0.121
S4         S7          0.027         0.323       0.279         0.010
S4         S8          0.008         0.048       0.035         0.004
S4         S9          0.008         0.020       0.048         0.004
S4         S10         0.026         0.052       0.011         0.008
S5         S6          0.821         0.116       0.996         0.773
S5         S7          0.054         0.388       0.218         0.073
S5         S8          0.013         0.069       0.068         0.041
S5         S9          0.053         0.052       0.011         0.004
S5         S10         0.138         0.074       0.011         0.010
S6         S7          0.323         0.848       0.212         0.266
S6         S8          0.179         0.739       0.105         0.158
S6         S9          0.584         0.633       0.034         0.012
S6         S10         0.774         0.812       0.011         0.016
S7         S8          0.927         0.812       0.287         0.773
S7         S9          0.714         0.606       0.090         0.073
S7         S10         0.319         0.812       0.018         0.073
S8         S9          0.679         0.739       0.320         0.266
S8         S10         0.135         0.848       0.180         0.306
S9         S10         0.442         0.747       0.996         0.941

3.3.5. Partitioning Around Medoids (PAM) algorithm-based clustering approach groups fungal- and bacterial-dependent fermentation stages into two or three clusters.

To identify the variation in both fungal and bacterial community compositions during fermentation, we performed LEfSe, a linear discriminant analysis, to explore the taxa which differed significantly among fermentation periods. α- and β-diversity analyses provided us with information on how microbial compositions change during fermentation. However, in this study, a total of 102 fungal families, 166 fungal genera, 257 bacterial families, and 507 bacterial genera were identified from all samples (Table 3-5). These data alone do not allow us to characterize the changes in microbial compositions that are attributable to each taxon.


Table 3-5: Number of taxa identified at each taxonomic level between fungal (ITS2) and bacterial (16S) communities.

Levels of Taxonomy    No. taxa
                      ITS2    16S
Phylum                5       26
Class                 16      64
Order                 43      138
Family                102     257
Genus                 166     507
Species               241     661

Relative abundance data obtained from ten pre-determined fermentation stages were compared with each other. We observed several fungal taxa for which significant differences were detected (LDA > 3, P < 0.05) (Figure 3-12). Specifically, microbial taxa with significantly higher abundance in a certain stage were listed. Nevertheless, these results, which combine data obtained from ten fermentation stages, could not provide meaningful information because the relative abundances of taxa overlapped among stages; for example, taxa assigned to stages 1 and 2 overlapped, and those assigned to stages 4 and 9 were hidden beneath stage 7. The cladogram (Figure 3-12) was ambiguous, with too many overlapping areas in which the taxa might not be representative of their assigned stage, which could be caused by similarities in the relative abundances of specific taxa across fermentation stages. In other words, stage 2 might have a microbial composition similar to that of stage 3 or even stage 4, leading to undifferentiated patterns in which taxa with close phylogeny were assigned to those stages.


Figure 3-12: Cladogram representing different taxa across fermentation stages without clustering process. Linear Discriminant Analysis Effect Size (LEfSe) analyses were performed using relative abundance data averaged at each stage for each winery at the highest classification level taxonomy available. Data shown are the hierarchy of discriminating taxa visualized as a cladogram for comparison across different fermentation stages. Only taxa with LDA scores > 3, p-value < 0.05 are shown in the cladogram.

Therefore, a Partitioning Around Medoids (PAM) clustering algorithm was used to categorize fermentation stages into fewer groups or ‘clusters’ based on their microbial similarities, community state-types, richness and evenness. This algorithm is less sensitive to noise and outliers due to the use of the medoid (the most centrally located data point in a cluster) as the cluster center instead of the mean value. For PAM clustering, the number of clusters (k) needs to be determined before clustering. Therefore, the ‘Average Silhouette width’ (ASw) and ‘Total Within sum of squares’ (TWs) were used to optimize the number of clusters. ASw, which is optimized for a maximum value, suggested a 2-cluster approach for the fermentation stages of fungal and bacterial communities (Figure 3-13A & B), while results from the TWs analysis, which optimizes for a minimum, suggested a 3-cluster approach for both fungi and bacteria (Figure 3-13C & D). Based on the ASw and TWs results, PAM clustering was tested with 2 and 3 clusters based on α-diversities throughout fermentation stages to better parse the α-diversities obtained from fungi and bacteria. Fungal communities showed a more rational partition with three clusters (Figure 3-14A) than with two (Figure 3-14B). On the other hand, for the bacterial consortia a 2-cluster solution (Figure 3-14C) performed better than three clusters (Figure 3-14D). Based on the optimal number of clusters from the PAM algorithm, the distribution of fermentation stages is shown in Table 3-6, with stages clustered into three groups for fungi (early phase: S1; mid phase: S2-S5; late phase: S6-S10) and two clusters for bacteria (early phase: S1-S6; late phase: S7-S10). Based on our study, these clusters, which group fermentation stages into two (bacteria) or three (fungi) clusters, are the optimal representation of microbial taxa during fermentation and will be used for the remainder of this chapter.
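The idea of choosing k by the average silhouette width can be sketched in a few lines of Python. Two simplifications are made and should be read as assumptions: medoids are found by exhaustive search over all candidate sets (feasible only for very small data, and not the BUILD/SWAP heuristic of the actual PAM algorithm), and the input is a single hypothetical diversity value per stage rather than the three α-diversity indices used in this study.

```python
from itertools import combinations


def k_medoids_exhaustive(dist, k):
    """Return (medoid indices, labels) minimizing total distance to the nearest medoid."""
    n = len(dist)
    best_cost, best_medoids = None, None
    for medoids in combinations(range(n), k):
        cost = sum(min(dist[i][m] for m in medoids) for i in range(n))
        if best_cost is None or cost < best_cost:
            best_cost, best_medoids = cost, medoids
    labels = [min(best_medoids, key=lambda m: dist[i][m]) for i in range(n)]
    return list(best_medoids), labels


def average_silhouette_width(dist, labels):
    """Mean silhouette over all points; requires at least two clusters."""
    n = len(dist)
    clusters = set(labels)
    widths = []
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            widths.append(0.0)  # singleton clusters get width 0 by convention
            continue
        a = sum(dist[i][j] for j in own) / len(own)
        b = min(
            sum(dist[i][j] for j in range(n) if labels[j] == other)
            / sum(1 for j in range(n) if labels[j] == other)
            for other in clusters if other != labels[i]
        )
        widths.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(widths) / n


# Hypothetical diversity value for each of ten fermentation stages (S1..S10).
stage_values = [4.1, 4.0, 3.8, 3.7, 3.5, 2.0, 1.9, 1.8, 1.7, 1.6]
stage_dist = [[abs(x - y) for y in stage_values] for x in stage_values]

for k in (2, 3):
    medoids, labels = k_medoids_exhaustive(stage_dist, k)
    print(k, medoids, round(average_silhouette_width(stage_dist, labels), 3))
```

In this toy series the two-cluster partition separates the first five stages from the last five and yields the higher average silhouette width, so k = 2 would be preferred by that criterion. A within-cluster sum-of-squares criterion always decreases as k grows and is therefore read from the bend in the curve, which is one reason the two criteria can suggest different values of k.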


Figure 3-13: Optimal clustering number for microbial communities at different fermentation stages. Partitioning around medoids (PAM) analysis showed average silhouette width supporting two clusters for (A) fungi and (B) bacteria, and total within sum of squares supporting three clusters for (C) fungi and (D) bacteria.


Figure 3-14: Clusters of 10 fermentation stages in principal component analysis based on the data obtained from α-diversity analysis (Shannon index, Faith’s PD, and Pielou’s evenness). (A-B) Samples of fungal communities in three (selected for use in this study) and two clusters; (C-D) Samples of bacterial communities in two (selected for use in this study) and three clusters.


Table 3-6: Distributions of clustered fermentation stages of all samples. Three clusters resulted in better performance for fungal community and two clusters for bacterial community.

                                           No. samples in each fermentation stage
Microorganisms    In use    Cluster No.    S1   S2   S3   S4   S5   S6   S7   S8   S9   S10
Fungi             Y         1              7    0    2    1    1    0    0    0    0    0
Fungi             Y         2              2    4    5    4    6    3    3    2    1    1
Fungi             Y         3              0    3    2    3    3    6    6    6    7    7
Fungi             N         1              7    3    1    1    1    0    0    0    0    0
Fungi             N         2              2    6    7    7    8    9    9    8    8    8
Bacteria          Y         1              9    8    7    6    8    6    4    3    1    1
Bacteria          Y         2              0    1    1    1    1    3    5    6    8    7
Bacteria          N         1              3    5    4    3    5    4    1    1    1    0
Bacteria          N         2              6    4    3    3    3    3    3    2    0    1
Bacteria          N         3              0    0    1    1    1    2    5    6    8    7

3.3.6. Linear Discriminant Analysis Effect Sizes (LEfSe) identified predominant fungal and bacterial taxa in different fermentation stages

After data minimization and clustering of the ten fermentation stages into fewer groups using a PAM-based clustering approach, our goal was to obtain more meaningful results from LEfSe.

First, in the fungal communities, the number of differential taxa in the early phase (15 genera) was larger than in the middle and late phases (2 genera), showing that at the beginning of fermentation the relative abundances of those taxa were significantly higher than in the late fermentation stage. It is also important to note that S. cerevisiae had not emerged as the dominant species in early fermentation; instead, the population was dominated by native wild yeasts (non-Saccharomyces yeasts). For example, the genera Starmerella, Aureobasidium, Sporobolomyces, Alternaria, and Occultifur were the top 5 most abundant native wild yeasts in the early phase (LDA scores > 4, P < 0.05) (Figure 3-15A). Next, the genus Neopestalotiopsis (a filamentous fungus) showed relatively higher abundance in both early and late phases, suggesting that this genus might have a lower growth rate, given its detection in the later stage. Also, its growth could be independent of the influence of wild yeasts, because its abundance was significantly higher than that of the wild yeasts detected in the early phase. In the late phase (fermentation stages 6 to 10), Saccharomyces (LDA scores > 5, P < 0.05) had the highest relative abundance among the fungal taxa, followed by another, less abundant wild yeast, Kazachstania (LDA scores > 3, P < 0.05), which could be considered to share some characteristics with Saccharomyces (Figure 3-15A). Furthermore, we observed a distinct pattern of fungal taxa between the early and late phases, including high diversity of wild yeast in early fermentation and the predominance of Saccharomycetaceae (family level) in the late stages of fermentation (Figure 3-15B).

Using LEfSe to analyze bacterial populations, 18 genera were identified with significantly higher relative abundance in the early phase, compared to the late phase, where only one genus, Oenococcus, was present, indicating that non-Oenococcus bacteria were predominant in the early stages of fermentation (Figure 3-16A). Moreover, the grape epiphytes Sphingomonas, Methylobacterium, Komagataeibacter, Gluconobacter, and Pseudomonas were the most abundant genera in the early phase, followed by the acetic acid bacterium Acetobacter (LDA scores > 3.6, P < 0.05) (Figure 3-16A). We observed that the phylogenetic relationships among the differential taxa were diverse, especially in the early phase, where individual taxa came from different families and orders (Figure 3-16B). In summary, we were able to identify the degree of differentiation between individual taxa at different taxonomic levels and the statistical significance of differences in the relative abundance of individual taxa.


Figure 3-15: Differential taxa among early, middle, and late stages of fermentation for fungal communities. LEfSe analyses were performed using relative abundance data averaged by wineries at the highest classification level taxonomy available. Data shown are the log10 linear discriminant analysis (LDA) scores following LEfSe analyses and the hierarchy of discriminating taxa visualized as cladograms clustered into three groups: early, middle, and late fermentation stages. LDA scores > 3 and p-value < 0.05 were used as the cutoff for differential taxa.


Figure 3-16: Differential taxa between early and late stages of fermentation for bacterial communities. LEfSe analyses were performed using relative abundance data averaged by wineries at the highest classification level taxonomy available. Data shown are the log10 linear discriminant analysis (LDA) scores following LEfSe analyses and the hierarchy of discriminating taxa visualized as cladograms clustered into two groups: early and late fermentation stages. LDA scores > 3 and p-value < 0.05 were used as the cutoff for differential taxa.


3.3.7. Most Chambourcin red wine metabolites showed significant change during the fermentation process

In this section of the study, our goal is to explore the chemical profiles, with a focus on volatile compounds, of winery samples across ten pre-determined fermentation stages. The research question we were interested in is “how do microbial populations present on grapes and throughout fermentation impact metabolites that could influence final wine characteristics?”. Further understanding of the development of wine metabolites, including aroma compound accumulation and degradation, is important to the Pennsylvania wine industry because knowledge of the chemical compounds in hybrid PA red wine could help winemakers make decisions on the selection of grape cultivars. Although much is known about volatile metabolites of wine made from Vitis vinifera grapes, metabolites of hybrid grapes such as Chambourcin are less well characterized. Here, we evaluated volatile organic compounds (VOCs) emitted from samples collected from nine wineries throughout ten fermentation stages using solid-phase microextraction (SPME) coupled with gas chromatography-mass spectrometry (GC-MS) (see Table 3-1).

Over 100 VOCs were detected from 88 samples collected during the winemaking process. Broadly, I observed an increase in the overall complexity of GC-MS profiles, characterized by (1) the appearance of new VOCs and (2) an increase in the abundance of VOCs as fermentation progressed from grape must to final wines (Figure 3-17). To identify Chambourcin-associated VOCs, I selected 84 compounds detected across all wineries in the last fermentation stage (stage 10) as target compounds for the downstream analyses, and we term these “Target Chambourcin Core VOCs”. Target Chambourcin Core VOCs were categorized into 14 classes; esters were the most abundant compounds (31% of the total number of compounds) in Chambourcin red wine, followed by alcohols (21%) and acids (17%) (Table 3-7; Figure 3-18).

Figure 3-17: GC-MS chromatogram of wine metabolites averaged by wineries demonstrating the effect of fermentation stages on the production of volatile compounds. Chromatogram for (A) Stage 1 (early fermentation) and (B) Stage 10 (late fermentation). *data without sample PA_05_S1


Table 3-7: 84 Target Chambourcin Core VOCs detected in all wineries in the last fermentation stage (S10) were selected for the downstream analyses. Categorization of chemical structures was defined as previously published (Ilc et al., 2016). RT, Retention Time; RI, Retention Index; *, compounds without RI validation.

Compound                                                                RT        RI

Esters
Acetic acid, methyl ester*                                              2.162     NA
Ethyl Acetate*                                                          2.559     NA
Butanoic acid, methyl ester                                             3.462     972
Isobutyl acetate                                                        3.814     1001
Butanoic acid, ethyl ester                                              4.103     1024
Butanoic acid, 3-methyl-, ethyl ester                                   4.489     1053
1-Butanol, 3-methyl-, acetate                                           5.236     1104
Acetic acid, pentyl ester                                               5.877     1150
Hexanoic acid, methyl ester                                             6.059     1163
Hexanoic acid, ethyl ester                                              6.854     1216
Acetic acid, hexyl ester                                                7.319     1249
ETHYL (S)-(-)-LACTATE                                                   8.177     1308
Heptanoic acid, ethyl ester                                             8.180     1308
2-Hexenoic acid, ethyl ester                                            8.281     1316
Octanoic acid, methyl ester                                             8.909     1364
Octanoic acid, ethyl ester                                              9.630     1417
Isopentyl hexanoate                                                     9.890     1438
Pentanoic acid, 2-hydroxy-4-methyl-, ethyl ester                        10.802    1510
Decanoic acid, methyl ester                                             11.525    1571
Decanoic acid, ethyl ester                                              12.127    1623
Octanoic acid, 3-methylbutyl ester                                      12.346    1642
Butanedioic acid, diethyl ester                                         12.360    1643
Ethyl 9-decenoate                                                       12.636    1668
Acetic acid, 2-phenylethyl ester                                        13.894    1781
Dodecanoic acid, ethyl ester                                            14.389    1828
Hexadecanoic acid, ethyl ester*                                         18.436    NA

Alcohols
3-Buten-2-ol, 2-methyl-                                                 4.154     1028
1-Butanol, 3-methyl-, acetate                                           5.236     1104
1-Butanol                                                               5.523     1125
1-Butanol, 3-methyl-                                                    6.556     1194
1-Pentanol, 4-methyl-                                                   7.850     1285
1-Pentanol, 3-methyl-                                                   8.018     1296
1-Hexanol                                                               8.463     1330
3-Hexen-1-ol, (Z)-                                                      8.764     1353
2-Hexen-1-ol, (Z)-                                                      9.157     1381
1-Hexanol, 2-ethyl-                                                     10.167    1460
cis-Hept-4-enol                                                         10.295    1470
2,3-Butanediol, [R-(R*,R*)]-                                            10.695    1501
Linalool                                                                10.874    1516
1-Octanol                                                               11.002    1527
2,3-Butanediol, [S-(R*,R*)]-                                            11.121    1537
Propylene Glycol                                                        11.265    1550
Benzyl alcohol                                                          14.464    1835
Phenylethyl Alcohol                                                     14.886    1875

Acids
Acetic acid                                                             9.400     1398
Propanoic acid, 2-methyl-                                               10.975    1525
Butanoic acid                                                           11.687    1584
Butanoic acid, 3-methyl-                                                12.189    1628
Butanoic acid, 2-methyl-                                                12.209    1630
Hexanoic acid                                                           14.150    1804
Heptanoic acid                                                          15.277    1912
2-Hexenoic acid, (E)-                                                   15.380    1922
2,3-Dimethylfumaric acid*                                               16.205    NA
Octanoic acid*                                                          16.355    NA
Nonanoic acid*                                                          17.373    NA
n-Decanoic acid*                                                        18.354    NA
2-Furancarboxylic acid*                                                 19.502    NA
2,4-Hexadienedioic acid*                                                24.271    NA

Aldehydes
Acetaldehyde*                                                           1.700     NA
Butanal, 3-methyl-                                                      2.784     906
Hexanal                                                                 4.609     1062
Heptanal                                                                6.003     1159
Benzaldehyde, 4-methyl-                                                 11.959    1607
5-Hydroxymethylfurfural*                                                20.190    NA

Acetal
Ethane, 1-ethoxy-1-methoxy-*                                            2.293     NA
Ethane, 1,1-diethoxy-*                                                  2.630     NA
Pentane, 1-(1-ethoxyethoxy)-                                            5.095     1094

Dione
2,3-Pentanedione                                                        4.313     1040
Acetyl valeryl                                                          5.541     1127
1,2-Cyclopentanedione                                                   13.274    1725

Ketone
2-Heptanone                                                             5.982     1157
Acetoin                                                                 7.327     1250
2-Buten-1-one, 1-(2,6,6-trimethyl-1,3-cyclohexadien-1-yl)-, (E)-        14.018    1792

Ethers
Ethene, ethoxy-*                                                        1.673     NA
1-Propanol, 3-ethoxy-                                                   8.641     1344

Anhydride
Pentanoic acid, 2-methyl-, anhydride*                                   2.152     NA

Terpenes
D-Limonene                                                              6.190     1171

Sulfur
1-Propanol, 3-(methylthio)-                                             12.755    1678

Nitrogen
Rheadan-8-ol, 2,3,10,11-tetramethoxy-16-methyl-, (6α,8α)-*              23.134    NA

Phenol
2,4-Di-tert-butylphenol*                                                18.727    NA

Others
Carbon dioxide*                                                         1.464     NA
n-Hexane*                                                               1.569     NA
Bicyclo[4.2.0]octa-1,3,5-triene                                         6.943     1222
(1R,2R,5R,E)-7-Ethylidene-1,2,8,8-tetramethylbicyclo[3.2.1]octane*      10.718    1503


Figure 3-18: 84 volatile compounds detected by GC-MS from samples collected across fermentation stages from all nine wineries. Compounds were grouped based on their chemical structures. Categorization of chemical structures was defined as previously published (Ilc et al., 2016).

Changes in the abundance of VOCs can help us understand potential reactions that occur during fermentation in the winemaking process. To test the statistical significance of changes in the abundance of Target Chambourcin Core VOCs, I used the Kruskal-Wallis non-parametric test and demonstrated that the relative abundances of 68 VOCs changed significantly during the fermentation process (q-value < 0.05), indicating a dramatic change in volatile profiles throughout fermentation stages (Figure 3-19). For example, VOCs such as pentanoic acid, 2-hydroxy-4-methyl-, ethyl ester; (S)-(-)-ethyl lactate; and butanedioic acid, diethyl ester showed the most significant q-values, while 1-propanol, 3-ethoxy- had the least significant q-value (Table 3-8). As fermentation progressed, most compounds started to increase after stages 2 and 3; 1-butanol, 3-methyl- was the most abundant compound and increased dramatically after the middle stage of fermentation. Other core VOCs followed a similar pattern, such as 1-butanol, 3-methyl-; octanoic acid, ethyl ester; and phenylethyl alcohol. By contrast, 1-hexanol had a higher abundance at the beginning of fermentation and decreased after stage 3. Levels of the VOC hexanal showed a similar trend to 1-hexanol, with decreasing abundance throughout fermentation (Figure 3-20).
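The q-values referred to above are FDR-adjusted p-values. The text does not name the adjustment procedure; the sketch below assumes the Benjamini-Hochberg step-up method, which is the most common choice, and uses hypothetical p-values. As a rough consistency check on that assumption, the smallest p-value reported in Table 3-8 (4.89E-11) multiplied by 84 tests gives approximately 4.11E-09, the q-value listed for that compound.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum of
    # p * m / rank over its own rank and all larger ranks (monotonicity).
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical Kruskal-Wallis p-values for four compounds.
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = benjamini_hochberg(example_p)
significant = [q < 0.05 for q in example_q]
print(example_q, significant)
```

Because several neighbouring p-values can share the same adjusted value after the monotonicity step, runs of identical q-values, such as those visible in Table 3-8, are expected under this procedure.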

Figure 3-19: The demonstration of significant differences among fermentation stages of total identified volatile compounds (n=84) identified using Kruskal Wallis Test. FDR-adjusted p-value (q-value) < 0.05 (green dots, n=16).

To explore the distinct volatile profiles between different regions, I investigated the distribution of individual volatile compounds throughout fermentation stages, comparing Central and Northeast PA. First, the distribution of volatile compounds from both regions showed a similar pattern, in which I observed increases in the relative abundance of most volatile compounds after mid-fermentation, with 1-Butanol, 3-methyl- detected at the highest abundance. Interestingly, we observed a dramatic increase in VOCs from stage 4 to 6 and then a decrease in abundance of VOCs in Northeast wineries. This trend was not observed in VOC patterns from samples obtained across fermentation in the Central region. Furthermore, the second most abundant compound was Octanoic acid, ethyl ester, in the Central region and Phenylethyl Alcohol in the Northeast region. Consequently, the distribution patterns of volatile profiles indicate some distinction between Central and Northeast regions (Figure 3-21).

Figure 3-20: Distribution of top 20 highest abundant metabolites across wine fermentation. All compounds showed significant differences (Kruskal Wallis Test, FDR-corrected p-value < 0.05) among ten fermentation stages. *, represented compounds without RI validation.

Table 3-8: Important wine metabolites with significant differences across fermentation stages. Data shown here are based on Kruskal Wallis Test for non-parametric analysis.

Metabolites                                                             p-value       FDR-adjusted p-value
Pentanoic acid, 2-hydroxy-4-methyl-, ethyl ester                        4.89E-11      4.11E-09
ETHYL (S)-(-)-LACTATE                                                   4.38E-09      1.38E-07
Butanedioic acid, diethyl ester                                         4.94E-09      1.38E-07
Propylene Glycol                                                        9.22E-09      1.92E-07
2,3-Butanediol, [S-(R*,R*)]-                                            1.38E-08      1.92E-07
Pentanoic acid, 2-methyl-, anhydride                                    1.59E-08      1.92E-07
2-Hexen-1-ol, (Z)-                                                      1.60E-08      1.92E-07
2,3-Butanediol, [R-(R*,R*)]-                                            2.04E-08      2.14E-07
Butanoic acid, 3-methyl-, ethyl ester                                   2.76E-08      2.57E-07
Phenylethyl Alcohol                                                     5.83E-08      4.90E-07
1-Pentanol, 3-methyl-                                                   7.11E-08      5.43E-07
2-Hexenoic acid, ethyl ester                                            9.39E-08      6.49E-07
Butanal, 3-methyl-                                                      1.00E-07      6.49E-07
Pentane, 1-(1-ethoxyethoxy)-                                            1.13E-07      6.49E-07
Heptanoic acid, ethyl ester                                             1.16E-07      6.49E-07
Hexanoic acid, ethyl ester                                              1.88E-07      8.90E-07
Linalool                                                                1.90E-07      8.90E-07
5-Hydroxymethylfurfural                                                 1.93E-07      8.90E-07
1-Butanol, 3-methyl-                                                    2.01E-07      8.90E-07
1-Pentanol, 4-methyl-                                                   2.16E-07      9.08E-07
1-Butanol, 3-methyl-, acetate                                           2.82E-07      1.13E-06
Butanoic acid, ethyl ester                                              3.39E-07      1.29E-06
Octanoic acid, ethyl ester                                              3.77E-07      1.36E-06
n-Hexane                                                                3.88E-07      1.36E-06
Octanoic acid                                                           5.98E-07      2.01E-06
Acetic acid, 2-phenylethyl ester                                        7.48E-07      2.42E-06
1-Propanol, 2-methyl-                                                   1.12E-06      3.37E-06
Dodecanoic acid, ethyl ester                                            1.12E-06      3.37E-06
Decanoic acid, ethyl ester                                              1.21E-06      3.49E-06
Isopentyl hexanoate                                                     1.44E-06      4.04E-06
3-Hexen-1-ol, (Z)-                                                      1.54E-06      4.18E-06
Octanoic acid, 3-methylbutyl ester                                      2.22E-06      5.83E-06
Decanoic acid, methyl ester                                             2.59E-06      6.60E-06
Butanoic acid                                                           3.22E-06      7.96E-06
Hexadecanoic acid, ethyl ester                                          3.41E-06      8.18E-06
n-Decanoic acid                                                         3.69E-06      8.61E-06
Ethyl 9-decenoate                                                       3.97E-06      9.02E-06
Hexanoic acid                                                           4.86E-06      1.07E-05
Octanoic acid, methyl ester                                             5.04E-06      1.09E-05
Hexanal                                                                 5.55E-06      1.17E-05
(1R,2R,5R,E)-7-Ethylidene-1,2,8,8-tetramethylbicyclo[3.2.1]octane       6.81E-06      1.40E-05
Acetic acid, hexyl ester                                                1.31E-05      2.62E-05
cis-Hept-4-enol                                                         1.47E-05      2.87E-05
Isobutyl acetate                                                        1.67E-05      3.19E-05
1-Hexanol                                                               2.93E-05      5.47E-05
Butanoic acid, methyl ester                                             5.39E-05      9.85E-05
2,4-Di-tert-butylphenol                                                 6.38E-05      0.00011406
Ethene, ethoxy-                                                         0.00010806    0.00018627
1-Propanol, 3-(methylthio)-                                             0.00010866    0.00018627
Hexanoic acid, methyl ester                                             0.00015227    0.00025581
2,3-Pentanedione                                                        0.0002505     0.00041259
2,3-Dimethylfumaric acid                                                0.00028013    0.00045252
1-Octanol                                                               0.00032479    0.00051476
Butanoic acid, 2-methyl-                                                0.00038788    0.00060336
Benzyl alcohol                                                          0.0004608     0.00069693
1-Butanol                                                               0.00046462    0.00069693
Benzaldehyde, 4-methyl-                                                 0.00069982    0.0010313
Ethane, 1-ethoxy-1-methoxy-                                             0.0007513     0.0010881
Acetoin                                                                 0.00076998    0.0010962
Ethane, 1,1-diethoxy-                                                   0.00085595    0.0011983
2-Buten-1-one, 1-(2,6,6-trimethyl-1,3-cyclohexadien-1-yl)-, (E)-        0.0010975     0.0015114
Rheadan-8-ol, 2,3,10,11-tetramethoxy-16-methyl-, (6α,8α)-               0.0012374     0.0016765
3-Buten-2-ol, 2-methyl-                                                 0.0016        0.0021333
1,2-Cyclopentanedione                                                   0.003691      0.0048444
Acetyl valeryl                                                          0.0061403     0.0079351
2-Furancarboxylic acid                                                  0.0072004     0.0091641
Acetaldehyde                                                            0.0075476     0.0094627
Bicyclo[4.2.0]octa-1,3,5-triene                                         0.008925      0.011025
2-Hexenoic acid, (E)-                                                   0.011107      0.013521
Butanoic acid, 3-methyl-                                                0.012678      0.015214
1-Propanol, 3-ethoxy-                                                   0.03447       0.040782

In summary, SPME-GC-MS data can demonstrate the timing and level of VOCs produced during winemaking. Most compounds, such as esters and alcohols, were produced during fermentation and appeared only at the later time points of fermentation. Most wine metabolites changed significantly in abundance during fermentation. In addition, 1-Butanol, 3-methyl-, Octanoic acid, ethyl ester, and Phenylethyl Alcohol were the most abundant compounds in PA Chambourcin red wine, and some differences in volatile distribution patterns were observed between the Central and Northeast regions.


Figure 3-21: Distribution of top 20 highest abundant metabolites across wine fermentation in (A) Central and (B) Northeast regions. All compounds showed significant differences (Kruskal Wallis Test, FDR-corrected p-value < 0.05) among ten fermentation stages. *, represented compounds without RI validation.
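The FDR correction applied to the Kruskal-Wallis p-values can be illustrated with a short sketch. The source does not name the adjustment procedure, so the Benjamini-Hochberg step-up method used below is an assumption. It is, however, consistent with the first and last rows of Table 3-8 if 84 compounds were tested (4.89E-11 × 84/1 ≈ 4.11E-09 and 0.03447 × 84/71 ≈ 0.0408). The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: this is the FDR procedure behind Table 3-8; the source
    only says "FDR-adjusted".
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four compounds.
example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Tied adjusted values, such as the repeated 1.92E-07 entries in Table 3-8, arise naturally from the running-minimum step.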


3.3.8. Partial Least Squares Discriminant Analysis (PLS-DA) explains regional differences of wine microbiome and metabolome that could contribute to terroir.

The first part of Aim 3 of this thesis is to explore the association between the microbiome and metabolome with respect to terroir. The research question here is: Do regional differences observed in microbial populations obtained from NGS-based microbiome approaches correlate with specific wine-important metabolites identified using an untargeted metabolomics-based approach? To understand the discrimination of wine microbial populations and metabolites between the Central and Northeast regions in PA (the smallest distance between wineries in the Central and Northeast regions is around 85 km), I used partial least squares discriminant analysis (PLS-DA), a supervised regression model, to identify the key features responsible for regional differences. Because the data outputs of the microbiome and metabolite analyses are multivariate and not normally distributed, they do not fit a parametric statistical method such as PLS-DA. In addition, for downstream linear regression analyses, non-normally distributed data may lead to inaccurate and misleading results (Castro & Pereira-Filho, 2016).

Therefore, I used a data preprocessing step that functions to filter and clean, integrate, and transform raw data into a normalized format prior to PLS-DA.

Data outputs from the microbiome and metabolome were log2 transformed to help with fold change normalization and to correct for heteroscedasticity (this makes skewed data more symmetric). Next, Pareto scaling was used to reduce the masking effect of highly abundant features and to retain low-abundance differential taxa and metabolites during the analyses. After this data preprocessing step, the skewed distributions of the fungal, bacterial, and metabolic data were adjusted and centered, showing a more symmetric and normal distribution for better interpretation of the regression model analyses (Figure 3-22).
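The two preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the thesis: the function name, the pseudocount added before the log2 transform (to avoid taking the log of zero), and the toy abundance values are all assumptions. Pareto scaling is implemented in its usual form, mean-centering each feature and dividing by the square root of its standard deviation.

```python
import math


def log2_pareto(matrix, pseudocount=1.0):
    """Log2-transform, then Pareto-scale each feature (column).

    matrix: list of samples, each a list of feature abundances.
    pseudocount: assumed offset so that zero abundances can be logged.
    """
    logged = [[math.log2(v + pseudocount) for v in row] for row in matrix]
    n_samples = len(logged)
    n_features = len(logged[0])
    scaled = [[0.0] * n_features for _ in range(n_samples)]
    for j in range(n_features):
        column = [row[j] for row in logged]
        mean = sum(column) / n_samples
        # Sample standard deviation of the log-transformed feature.
        sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n_samples - 1))
        # Pareto scaling divides by sqrt(sd); constant features are only centered.
        denom = math.sqrt(sd) if sd > 0 else 1.0
        for i in range(n_samples):
            scaled[i][j] = (logged[i][j] - mean) / denom
    return scaled


# Hypothetical abundances: 3 samples x 2 features.
toy = [[1.0, 100.0], [3.0, 400.0], [7.0, 1600.0]]
toy_scaled = log2_pareto(toy)
```

Compared with unit-variance scaling, dividing by the square root of the standard deviation shrinks large features less aggressively, which is why it is often chosen when low-abundance features should remain visible without amplifying noise.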

Figure 3-22: Distribution of omics data before (left, skewed) and after (right, more normally distributed) the data preprocessing involving log2 transformation and Pareto scaling. Plots show (A) fungal omics data; (B) bacterial omics data; (C) wine metabolite omics data.

The PLS-DA model demonstrates that there were structural differences in the data sets (fungal microbiome data, 57.3%; bacterial microbiome data, 36.4%; wine metabolite data, 58.1%) between the Central and Northeast regions in PA (Figure 3-23A, C, & E). The differences shown in the model were calculated using the relative abundance of each feature, including microbial taxa and metabolic compounds. Subsequently, leave-one-out cross-validation was carried out to validate the goodness of fit and predictive ability of the discrimination, in which R2 represents the coefficient of determination (goodness of fit) while Q2 represents the coefficient of prediction (also called cross-validated R2, goodness of prediction), and 1000 random data permutation tests were performed to confirm the significance of regional discrimination. For the fungal communities, cross-validation with three components (Table 3-9) gave an overall accuracy of 0.75 (75 correct predictions out of 100 trials) with R2 = 0.53 and Q2 = 0.43, indicating that the model had a moderate predictive effect (Peng & Lai, 2012). This moderate predictive effect suggests that the data provided in this model can be used as a predictor for regional discrimination. To support this, a permutation test of the fungal microbiome data set using PLS-DA confirmed that regional discrimination is statistically significant (P = 0.01). In addition, the bacterial and metabolic data sets showed an even better predictive ability in the PLS-DA model: the overall accuracy for both data sets was higher than 90% with R2 > 0.67, indicating substantial predictive power according to Peng and Lai (2012). To support the PLS-DA model, permutation tests on the bacterial and metabolic data sets further confirmed the significance of discrimination, P = 0.002 (bacterial) and P < 0.001 (metabolite) (Figure 3-23 C-F). Therefore, PLS-DA coupled with permutation tests summarized the overall classification of microbial and metabolic signatures based on regional differences. Although the fungal data set had less separation power than the bacterial and metabolite data sets, I believe that these results still provide a preliminary view of the effect of regional differences on the microbial and wine metabolic structures throughout the fermentation stages of winemaking.

Table 3-9: Leave-one-out cross-validation results of the PLS-DA model from the omics data of fungi, bacteria, and wine metabolites. The overall accuracy is defined as the sum of the correct diagonals divided by the number of cases.

PLS-DA | Fungal data | Bacterial data | Metabolic data
Overall accuracy | 0.75 | 0.93 | 0.94
R2 | 0.53 | 0.71 | 0.84
Q2 | 0.43 | 0.69 | 0.73
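The quantities in Table 3-9 can be made concrete with a small sketch. The confusion matrix and the predicted values below are hypothetical and are not the thesis data; the function names are also illustrative. Overall accuracy follows the definition in the table caption, and R2 and Q2 are assumed to share the usual form 1 - (residual sum of squares / total sum of squares), with Q2 computed from predictions for held-out samples.

```python
def overall_accuracy(confusion):
    """Sum of the diagonal (correct calls) divided by the number of cases."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def coefficient_of_determination(observed, predicted):
    """1 - SS_res / SS_tot.

    Gives R2 when `predicted` comes from the fitted model, and Q2 when
    `predicted` holds leave-one-out predictions (an assumed convention).
    """
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical two-region confusion matrix (rows: true, columns: predicted).
toy_confusion = [[40, 10], [15, 35]]
toy_accuracy = overall_accuracy(toy_confusion)

# Hypothetical class labels coded 0/1 and their held-out predictions.
toy_q2 = coefficient_of_determination([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
```

Because held-out predictions are generally worse than fitted ones, Q2 is typically lower than R2, as is the case for all three data sets in Table 3-9.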

PLS-DA provided information on the overall differences in microbial and metabolic data set features between the two regions. The model also identified the importance of each feature (i.e., relative abundance for the microbiome or VOC level for SPME-GC-MS) that contributed to the variance of PLS-DA. In this study, Variable Importance in Projection (VIP), a weighted sum of squares of feature loadings in the model, was used to characterize key contributors. A feature with a VIP score > 1.0 can be considered an important microbial or metabolic feature in the model because the sum of squared VIP scores equals the number of features, so the mean squared VIP score is equal to 1 (Cho et al., 2008).
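The reasoning behind the VIP > 1.0 threshold can be checked numerically. The sketch below uses a common definition of VIP based on normalized PLS weight vectors and the explained sum of squares of each component; the source does not give the formula, so this definition, the function name, and the toy numbers are assumptions.

```python
import math


def vip_scores(weights, explained_ss):
    """VIP for each feature from PLS weights (assumed definition).

    weights: one weight vector per component, each of length n_features.
    explained_ss: sum of squares explained by each component.
    """
    n_features = len(weights[0])
    total_ss = sum(explained_ss)
    scores = []
    for j in range(n_features):
        weighted = 0.0
        for w, ss in zip(weights, explained_ss):
            norm_sq = sum(v * v for v in w)
            # Squared, normalized weight of feature j, weighted by the
            # variance explained by this component.
            weighted += ss * (w[j] ** 2) / norm_sq
        scores.append(math.sqrt(n_features * weighted / total_ss))
    return scores


# Hypothetical model: 2 components, 3 features.
toy_weights = [[3.0, 4.0, 0.0], [1.0, 0.0, 0.0]]
toy_explained = [2.0, 1.0]
toy_vip = vip_scores(toy_weights, toy_explained)
toy_sum_sq = sum(v * v for v in toy_vip)
```

Under this definition the squared VIP scores sum to the number of features (3 in the toy case), so a feature with VIP > 1 contributes more than the average feature.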


Figure 3-23: 3D plot between the selected PCs with PLS-DA classification with Leave-one-out cross-validation (red star indicates the best classifier). (A-B) Fungal community; (C-D) Bacterial community; (E-F) Wine metabolites. Permutation test (n = 1000) was performed for the significance of class discrimination.


First, according to the VIP result in component 2, selected based on the highest explained variance, the first three discriminative contributors were the filamentous fungi Botryosphaeria, Neofusicoccum, and Cladosporium (VIP > 2.0), which had a higher relative abundance in wineries of the Northeast region. Furthermore, Kazachstania spp., a genus of wild yeast (VIP = 1.9), was abundant in the Central region, whereas Lachancea spp. (VIP = 1.7) was abundant in the Northeast region of PA. In addition, Saccharomyces (VIP = 1.0) was less discriminatively important between the two regions, which was expected because commercial S. cerevisiae was added in both regions (Figure 3-24A). As for bacterial communities, Enterobacteriaceae and Burkholderiaceae (VIP > 2.5), which are families commonly associated with grapevine, were identified as the two most important features in PLS-DA; Enterobacteriaceae were found in higher abundance in the Northeast, while Burkholderiaceae were abundant in the Central region. Next, Aureimonas, a plant endophyte with VIP = 2.5 (higher abundance in Central), followed by Lactococcus, a genus of lactic acid bacteria with VIP = 2.0 (higher abundance in Northeast), were also key contributors to the PLS-DA model (Figure 3-24B). Additionally, although Oenococcus was not shown in the figure, it had a VIP score of 1.0, meaning that it was also considered a less important feature for regional discrimination in PLS-DA. With regard to metabolites, esters and acids were the most numerous among the most important features (VIP > 1.0) shown in Figure 3-24C. Specifically, 2-Hexenoic acid, (E)- (VIP = 4.1), which is commonly associated with a fatty, long-lasting odor, was the most important discriminative metabolite in PLS-DA (NIST Chemistry WebBook SRD 69, 2018). Within the group of esters, 2-Hexenoic acid, ethyl ester, commonly associated with a fruity, green odor, was the key ester with VIP = 1.3 (PubChem 519129, 2020). Both compounds were identified as significantly more abundant in the Central PA region. Finally, other than esters and acids, aldehyde-related compounds were identified as important features in the PLS-DA model (Figure 3-24C).

In summary, preprocessing of the omics data sets followed by PLS-DA regression modeling enabled identification of the key features with the highest predictive power, represented by VIP scores, which reflect the discriminative effect of each variable in the model. This analysis revealed several important filamentous fungi, yeasts, and non-Oenococcus bacteria, as well as wine-important metabolites (acids, esters, and aldehydes), which are potential features that could discriminate between wines produced in the Central and Northeast regions in PA.


Figure 3-24: Important features identified by PLS-DA in the selected component based on the highest explained variance. The colored boxes on the right indicate the relative concentrations of the corresponding metabolite in each group under study. (A) Fungal community; (B) Bacterial community; (C) Wine metabolites. The top 15 features are listed based on VIP scores. Features with VIP > 1.0 were considered important contributors to the PLS-DA model. (1R,2R,5R,E)-7-Ethylidene-# denotes (1R,2R,5R,E)-7-Ethylidene-1,2,8,8-tetramethylbicyclo[3.2.1]octane. *, represented compounds without RI validation.


3.3.9. Aggregated microbial signatures from nine wineries showed different correlation patterns with wine metabolome providing information of potential microorganisms contributing to wine aroma profiles

To gain insights into how microbial signatures could be related to wine-important metabolites, we used a correlation-based analysis to investigate the degree of correlation between the relative abundances of fungal and bacterial taxa and volatile compounds in samples collected throughout winemaking. Regularized canonical correlation analysis (rCCA) is a multidimensional statistical method that explores the degree of correlation between two data sets of quantitative variables measured on the same experimental samples. In our rCCA approach, the final penalty values (λ1 and λ2) differed between the two paired data sets, and both the fungal and bacterial data sets versus the metabolic data set had < 0.9 CV scores, indicating a good fit of penalties for the rCCA model (Figure 3-25). As previously mentioned, two groups of paired data sets were analyzed in this study: (1) fungi versus metabolites and (2) bacteria versus metabolites. The canonical correlation coefficient, which measures the strength of association between two variates in the data sets, is shown for each dimension (Figure 3-26). In the fungal and metabolic data sets, the first canonical correlation coefficient was 0.9999, with a first-dimension explained variance of 30.11% for the fungal data and 36.82% for the metabolic data (Figure 3-25A & 3-27A). Likewise, the first correlation coefficient of the bacterial and metabolic data sets was 0.9629, with a first-dimension explained variance of 7.20% for the bacterial data and 35.29% for the metabolic data (Figure 3-25B & 3-27B). Thus, a general conclusion from rCCA is that both fungal and bacterial structures were correlated with wine metabolites.


Figure 3-25: Leave-one-out cross-validation (CV) results of the lambda values used in the rCCA model. Values calculated from (A) the fungal data set versus the metabolic data set and (B) the bacterial data set versus the metabolic data set are shown in the heatmap. The arrow indicates the location of the highest CV score determined by λ1 and λ2.

Figure 3-26: A scree plot of the canonical correlation coefficient from each dimension on rCCA model. Results were calculated from (A) fungal community and (B) bacterial community.


Figure 3-27: Explained variance bar plots of the first 5 dimensions in the rCCA model from (A) fungal data and (B) bacterial data.

Then, rCCA modelling was used to integrate the microbial and wine metabolic data. A correlation circle plot highlighted the overall patterns of correlated variables between the two data sets, where the outer circle represented higher correlation power and overlapping variables were considered positively correlated. With respect to the fungi and wine metabolites, some fungal variables (taxa) correlated with metabolites. For example, most variables were located in the outer circle, where a few fungal variables (taxa) overlapped with most of the metabolite variables (volatile compounds) on the left side of the circle. Further, a large number of fungal taxa were positively correlated with fewer metabolites on the right side of the circle (Figure 3-28A). On the other hand, most of the bacterial taxa had little correlation with metabolites, as shown by the variables located within the inner circle. Nonetheless, a few taxa had higher correlation power with volatile compounds, with overlapping areas on both the left and right sides of the outer circle (Figure 3-28B). In other words, the bacterial community might have lower correlation power (positive or negative correlation) with volatile compounds than the fungal community. Still, a few bacterial taxa might play some role in these relationships.

The circle plot from the rCCA model shows the overall differences in correlation patterns with wine metabolites between the fungal and bacterial communities. Moreover, rCCA provided more specific correlations between each microbial taxon and volatile compound, which were visualized in a heat map where red and blue represent positive and negative correlations, respectively.

Figure 3-28: Circle plot of the rCCA model highlighting the contribution of correlations between microorganisms and metabolites to each selected dimension. Clustering of the two data sets indicates the overall correlations between them. The strength of correlation is shown by the distance from the center of the circle (the greater the distance, the stronger the correlation). The distribution of each variable was defined by (A) dimensions 1 and 2 in the fungal data and (B) dimensions 1 and 3 in the bacterial data.

Correlations between the fungal community and wine metabolites could be categorized into 5 groups based on different correlation properties, and only variables with a correlation coefficient > 0.3 were shown on the heat map. The first group (group 1), including Aureobasidium, Candida, and several filamentous fungi such as Botrytis and Alternaria, had an overall higher correlation coefficient with wine volatile metabolites. This group was negatively correlated with most of the compounds but positively correlated with a few compounds such as 1-Hexanol and 2-Hexenoic acid, (E)-. Group 2, including the taxa Sporobolomyces, Starmerella, Hanseniaspora, and Pichia, had overall lower correlation power (positive and negative) than group 1, according to the lighter color shown in the heatmap (Figure 3-29). On the other hand, group 3, which consisted entirely of filamentous fungi, also had lower correlations, similar to group 2, but the taxa in group 2 had the opposite correlation with a group of compounds from Butanoic acid (the leftmost compound) to Ethene, ethoxy-. Taxa in group 4 had a pattern similar to group 1 but a lower level of correlation with the compounds from Butanoic acid to Ethene, ethoxy-; in particular, the wild yeast genus Bulleromyces showed a higher correlation pattern in this group. Finally, I observed higher correlations between group 5, which contains Saccharomyces and Wickerhamomyces, and most wine metabolites, especially Butanoic acid, 3-methyl-, ethyl ester; this group also showed a completely different pattern from the taxa in group 1 (Figure 3-29).

As for the bacterial community, we also grouped bacterial taxa into 5 categories based on their correlative properties with wine metabolites, and the variables were shown on the heat map (correlation coefficient > 0.3). First, Oenococcus formed its own group, which was positively correlated with most of the wine metabolites, especially Acetic acid, 2-phenylethyl ester, ETHYL (S)-(-)-LACTATE, and 1-Octanol, while it showed the most negative correlation with 2-Hexen-1-ol, (Z)-. Group 2, which contains Lactococcus, Rhodanobacter, and the 1174-901-12 genus (Family: Beijerinckiaceae), showed negative correlations across wine metabolites from Heptanoic acid to 2-Hexenoic acid, ethyl ester. In group 3, Brevundimonas could be considered to have the lowest association with the formation and degradation of wine metabolites. In group 4 (e.g. Geodermatophilus) and group 5 (e.g. Allorhizobium), we observed negative correlation to Oenococcus. Both groups were positively correlated with a few compounds, from 3-Hexen-1-ol, (Z)- to 2-Hexenoic acid, (E)-. In general, group 4 appears to have a lower overall level of negative correlation with metabolites than group 5 (Figure 3-30).


Figure 3-29: rCCA results of relationships between fungal community (shown in genus level) and wine metabolites were visualized in the heat map showing positive correlation (red) and negative correlation (blue). Unidentified genus was represented by its family or order name. Variables with correlations below 0.3 in absolute value are not plotted. *, represented compounds without RI validation.


Figure 3-30: rCCA results of relationships between bacterial community (shown in genus level) and wine metabolites were visualized in the heat map showing positive correlation (red) and negative correlation (blue). Unidentified genus was represented by its family or order name. Variables with correlations below 0.3 in absolute value are not plotted. *, represented compounds without RI validation.


After investigating pair-wise associations between microbial communities and wine metabolites, rCCA could further provide loading factors, which include weights that contribute to the correlation between two variates. In other words, extracting individual features could help prioritize associations with more important roles. We selected priority taxa based on the ten highest positive and negative loading values from each dimension. We further analyzed individual species underlying the selected genera for a second-level correlation. However, due to the low resolution of bacterial sequences at the genus level (many unidentified genera), we chose to extract the taxa from two dimensions to acquire more information on the bacterial community. In addition, representative taxa identified only at the family level were excluded from the list before proceeding to the next step of correlation analysis. In summary, we identified a total of 17 fungal genera and 28 bacterial genera and visualized them using a heatmap generated by Spearman’s correlation analysis (Figure 3-31).

rCCA first provided key features that contributed to the overall correlations (fungi vs. metabolites; bacteria vs. metabolites), and we used these features to identify and prioritize individual species, providing a list of microbial candidates and associated wine metabolites. After filtering out taxa that contained unidentified species, we obtained a total of 22 fungal species and 5 bacterial species. Based on Spearman’s correlation, as we predicted, the added commercial microorganisms S. cerevisiae and O. oeni had broader relationships with wine metabolites. Interestingly, we still observed compounds that correlated with several wild yeasts and bacteria. For example, Kazachstania humilis correlated with Acetic acid pentyl ester, and Starmerella bacillaris (the most abundant wild yeast in our samples) was associated with Acetic acid and pentyl ester independent of S. cerevisiae interactions. In relation to bacteria, Bacillus gibsonii showed a characteristic correlation with compounds such as Ethyl Acetate. In addition, Bacillus gibsonii is the only bacterium associated with 2-Heptanone (Figure 3-32).

In conclusion, the rCCA model explored maximum correlations between microbial communities and wine metabolites and provided key features with higher influence on the model. In our study, the different correlation patterns of the key microbial taxa, visualized via a heatmap using Spearman’s correlation, provide a more targeted list of microorganisms and metabolites for future studies.
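The species-level step relies on Spearman’s rank correlation, which can be sketched in a few lines. This is an illustrative implementation only: the function names and the toy abundance and peak-area values are hypothetical, ties receive average ranks, and the p-value filtering used for the heatmap is not reproduced here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical relative abundance of one species and peak area of one
# volatile compound across five samples.
toy_abundance = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_peak_area = [2.0, 1.0, 4.0, 3.0, 5.0]
toy_rho = spearman_rho(toy_abundance, toy_peak_area)
```

Because it operates on ranks, this measure captures monotonic rather than strictly linear relationships, which suits relative abundance data that are not normally distributed.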


Figure 3-31: Taxa with the highest positive and negative loading values from the rCCA model, selected for species-level correlation. The values were defined by (A) fungal data in the first dimension and (B) bacterial data in the first and third dimensions. Genera that contained only unidentified species were filtered out before the subsequent correlation analysis.


Figure 3-32: Heat map of Spearman’s correlation between microbiome (shown in species level) and wine metabolites. Correlation coefficient with p-value > 0.05 (shown in white block) and unidentified species underlying the selected genus were removed. *, represented compounds without RI validation.


3.4. Discussion

3.4.1 Changes in fungal and bacterial communities across fermentation stages represent microbial signatures in Chambourcin red wine.

α-diversity analyses of aggregated microbial distributions demonstrate overall changes in the relative abundance of microbial populations, with high diversity of indigenous fungi and bacteria in early fermentation (Figure 3-5). High microbial diversity in the initial stages of fermentation has previously been shown to influence the aroma characteristics of final wines (Steensels & Verstrepen, 2014). Compared to the overall changes in fungal communities, bacterial populations maintained a high level of diversity across all samples. These patterns (initially high diversity of fungal and bacterial communities) were similar to the microbial dynamics in either spontaneous or inoculated fermentation (Bokulich et al., 2016; Guzzon et al., 2020; Pinto et al., 2015).

Commercial S. cerevisiae and O. oeni were predominant in the middle and later stages of fermentation. This supports current winemaking practices that rely heavily on inoculated fermentation using commercial strains of yeast and bacteria. The predominance of commercial microbes in alcoholic fermentation led to a significant decrease in microbial richness (Anagnostopoulos et al., 2019). Such decreases in microbial richness, observed as a decrease in operational taxonomic units (OTUs), were also previously identified (Bokulich et al., 2016). Interestingly, the evenness of fungal populations decreased significantly throughout the fermentation stages, but this decrease was not observed in bacterial populations. Although the number of bacterial taxa decreased during fermentation, the relative abundances of most of the remaining taxa (non-Oenococcus bacteria) did not change dramatically. While the relative abundance of non-Oenococcus bacteria remained relatively stable, the genera Lactobacillus and Pseudomonas increased during the later stages of fermentation. It has previously been reported that inoculation with O. oeni could result in an increase of lactic acid bacteria, which might contribute to MLF (duPlessis et al., 2017). Furthermore, the Lactobacillus genus was shown to increase during spontaneous fermentation (Chen et al., 2020). Likewise, some studies mentioned that the relative abundance of Pseudomonas populations was higher during the early stages of fermentation for controlled and spontaneous fermentation (Bokulich et al., 2016; Chen et al., 2020; Piao et al., 2015). Yet, we found that this genus had a higher relative abundance in late fermentation (which does not necessarily indicate growth), which could be due to its capacity for antagonism during alcoholic fermentation (Dimitrios et al., 2019).

Linear discriminant analysis revealed the dominant fungal and bacterial taxa that differentiated each fermentation phase (early, middle, and late). As for fungal communities, studies have shown that non-Saccharomyces yeasts such as Hanseniaspora, Candida, Pichia, Metschnikowia, Kluyveromyces, and Lachancea are associated with the initial stages of fermentation (L. Cocolin et al., 2000; Dimitrios et al., 2019). In our study, Starmerella, Aureobasidium, and Sporobolomyces were present at the highest relative abundance in the early phase (fermentation stage 1). These yeast populations present in our samples at the early stages of fermentation differed from those in the previously mentioned studies and could be a characteristic difference of PA Chambourcin. For example, Starmerella was the wild yeast genus with the highest relative abundance, and Starmerella bacillaris (synonym Candida zemplinina) (Sipiczki, 2003) was the only underlying species in this taxon (data not shown). Starmerella bacillaris has previously been shown to be a wild yeast present during wine fermentation and is reported to be frequently associated with overripe and botrytized wine grapes (Englezos et al., 2017). Furthermore, some studies conducted in pilot-scale laboratory settings demonstrate that Starmerella spp. is commonly thought to reduce ethanol content while increasing glycerol content and volatile compounds (e.g. 1-Hexanol, Diethyl succinate) when compared with S. cerevisiae (Englezos et al., 2017). To the best of my knowledge, no studies to date have investigated the role of Aureobasidium and Sporobolomyces in the early stages of winemaking and how these genera contribute to wine characteristics. First, Aureobasidium pullulans (also known as ‘black yeast’), the only Aureobasidium sp. within this taxon, was the second most abundant fungus throughout the different stages of fermentation (data not shown). This species was found to be present in high abundance during winemaking in studies conducted in Italy, Spain, and Canada (Bozoudi & Tsaltas, 2016). One study reported that A. pullulans was never detected at harvest of grape berries (Renouf et al., 2005). However, our results showed that A. pullulans made up approximately 25% of the fungal population in crushed grapes (fermentation stage 1, see Table 3-1). A. pullulans is well known for the production of amylase and β-glucosidase enzymes, used to aid the release of varietal aromas, and for the production of pullulan, an unbranched homopolysaccharide that can improve the aroma perception of red wines (Baffi et al., 2013; Bozoudi & Tsaltas, 2016). In addition, other studies reported that A. pullulans can emit typical flavor compounds of red wine (i.e., 3-methyl-1-butanol and Octanoic acid, ethyl ester) (Verginer et al., 2010).

On the other hand, Sporobolomyces spp. are commonly detected on grapes and normally die off during fermentation. In support of this, our data demonstrate that Sporobolomyces spp. formed the third largest wild yeast population during the first few days of fermentation and decreased as fermentation progressed. Although the two detected species (S. patagonicus and S. symmetricus) were not mentioned in the study, it was reported that some of the secondary metabolites from this genus can impart a mouldy off-flavor in wines (Lederer et al., 2013). Collectively, considering regional climate and the soil structure associated with the grape variety, this heterogeneity of the indigenous fungal community could be characteristic of a fungal signature of PA red wines produced from Chambourcin grapes and has the potential to contribute to distinct aroma profiles (Gilbert et al., 2014).

Bacterial communities are also influenced by grape varieties and the microenvironment of grapes (Barata et al., 2012). Acetic acid bacteria (AAB, e.g. Gluconobacter spp., Acetobacter spp.) and lactic acid bacteria (LAB, e.g. Oenococcus sp., Lactobacillus spp.) are common grape-associated bacteria (Barata et al., 2012; Bozoudi & Tsaltas, 2016). In this study, Sphingomonas, Methylobacterium, Komagataeibacter, and Gluconobacter were the most abundant grape epiphytes present during the early fermentation process. Some studies reported that Sphingomonas and Methylobacterium, which are the major bacteria present on Vitis vinifera grapes, were detected at relatively different proportions (Sphingomonas < 2% in cherry wine, H. M. Li et al., 2019; Sphingomonas & Methylobacterium > 6% in Vitis vinifera red wine, Bokulich et al., 2013). In addition, it was reported that Sphingomonas could be positively correlated with the initial fermentation rate, which might enhance the efficiency of alcoholic fermentation, yet its impact on the wine aroma profile remains unclear (Bokulich et al., 2016). Moreover, Komagataeibacter and Gluconobacter are acetic acid bacteria that could decrease wine quality through the oxidation of sugars or sugar alcohols (D-glucose, glycerol, ethanol). In addition, Gluconobacter is also reported to have the ability to inhibit the growth of Saccharomyces spp. (Bokulich et al., 2013; Zhang et al., 2017).


In summary, biodiversity analysis and high-throughput sequencing revealed several dominant genera during Chambourcin winemaking, highlighting similarities to and differences from the microbial structures of previously studied red wines. This knowledge could therefore inform strategies for improving PA red wines that directly rely on the contributions of microbial populations on grapes and during winemaking.

3.4.2 Wine volatile metabolites showed significant change across fermentation and the structure of compound distribution revealed the aroma profiles of Chambourcin red wine.

Volatile aroma compounds are associated with grape wine quality due to the effects these compounds impart on sensory perception (Vilanova et al., 2010). Thus, determining a large number of odor-active compounds is needed for understanding the nature of wine aroma profiles.

We employed gas chromatography-mass spectrometry (GC-MS) with solid-phase microextraction (SPME) to quantify and identify volatile compounds in samples obtained from ten predetermined fermentation stages. Our results showed that most wine volatile metabolites changed significantly during the winemaking process, exemplifying the importance of fermentation, which makes wine volatile profiles far more complex than those of juice and hence determines wine quality (Sharma et al., 2012; Waterhouse et al., 2016). Moreover, most of the volatile compounds derived from fermentation increased in relative concentration after fermentation stage 2 (48 hr) and reached a stationary phase (plateau) at around stage 6. The general patterns of volatile compounds during Cabernet Sauvignon fermentation were described by Callejón et al., who showed that alcohols and esters began to increase in the first 24 to 36 hr of fermentation until the exponential phase (72 - 84 hr), and that concentrations decreased slightly or remained constant after reaching maximum levels. In addition, 1-Hexanol levels were reported to increase from 24 hr and then decrease after 48 hr (Callejón et al., 2012).

The distribution of individual volatile compounds throughout fermentation illustrated the aroma characteristics of Chambourcin red wine. Notably, 1-Butanol, 3-methyl- (Isoamyl alcohol) was the most abundant compound in the samples, followed by Octanoic acid, ethyl ester (Ethyl octanoate) and Phenylethyl Alcohol. Both Isoamyl alcohol and Phenylethyl alcohol were identified as inherent alcohols in different red wines produced from Cabernet franc, Cabernet Sauvignon, Meritage, Merlot (which had the highest baseline concentration of these compounds), Pinot gris, and Pinot noir (Bejaei et al., 2019). In addition, Isoamyl alcohol could negatively affect wine quality at higher levels (430 mg/L). On the other hand, Phenylethyl alcohol is repeatedly mentioned as a potential contributor to floral character, but its contribution is negligible in the complex volatile matrix of wines (De-La-Fuente-Blanco et al., 2016).

Furthermore, we identified a number of esters with high abundance, such as 1-Butanol, 3-methyl-, acetate (Isoamyl acetate), Decanoic acid, ethyl ester (Ethyl decanoate), and Hexanoic acid, ethyl ester (Ethyl hexanoate), which are also common esters in red wines (Bejaei et al., 2019).

Interestingly, Octanoic acid, ethyl ester and Decanoic acid, ethyl ester, the second and fifth most abundant compounds, have been reported to act as aroma enhancers in Cabernet Sauvignon, Cabernet Gernischet and Chardonnay wines (Welke et al., 2014). Acetic acid was the most abundant acid in Chambourcin red wine and is a common acid in several red wines, especially Syrah and Meritage wines. Moreover, Octanoic acid and Hexanoic acid, the second and third most abundant acids, which give fruity-acid and cheese odors, were also commonly detected in red wine (Bejaei et al., 2019).

Additionally, we detected several unusual compounds that few studies have associated with the winemaking process. For instance, we detected (1R,2R,5R,E)-7-Ethylidene-1,2,8,8-tetramethylbicyclo[3.2.1]octane (also called Cyprotene), previously isolated from tropical sedge, and Rheadan-8-ol, 2,3,10,11-tetramethoxy-16-methyl-, (6a,8a)- (also called Alpinigenine), a plant alkaloid isolated from Papaver bracteatum (Lalezari et al., 1973; Olawore et al., 2006). The presence of these compounds could indicate that other plant materials were involved during fermentation or that the compounds were possibly produced by microorganisms.

In conclusion, our data showed the distribution of individual volatile compounds during the fermentation of Chambourcin red wine. The present study increased basic knowledge about aroma formation in hybrid red wines, providing interesting clues that can help wineries explain certain wine characteristics. However, according to previous studies, volatile compounds at different concentrations can influence wine quality either positively or negatively (De-La-Fuente-Blanco et al., 2016; Welke et al., 2014). Thus, to precisely understand the role of each volatile compound within a complex matrix or a wine model system, analysis of absolute concentrations and downstream sensory tests are needed.

3.4.3 Wine microbial compositions and metabolite structures suggest regionality of Chambourcin wines and key features provided by PLS-DA highlight regional characteristic of Chambourcin.

According to the aggregated microbial distribution, we now understand the changes in fungal and bacterial populations over time during red wine fermentation. However, individual wineries can exhibit various microbial compositions (Figure 3-6), which are influenced by climatic conditions and soil structures as well as wine grape varieties (Bokulich et al., 2016).

Terroir expresses the idea that a particular region’s climate and soil structures, the associated grape varieties, and their interactions influence the phenotypes of wines. Recent studies have found positive correlations between geographic wine microbial signatures and volatile and non-volatile wine metabolites (Knight et al., 2015). These findings contributed, for the first time, to our understanding that microbial terroir could also influence the phenotypes of final wines with respect to their chemical characteristics.

Thus, to understand the effect of terroir in Pennsylvania, we used a PLS-DA modeling approach to determine regional discrimination of the wine microbiome and volatile metabolites. Our results suggest that both microbial and metabolic compositions were distinct between the two regions.

Of note, volatile metabolites had the most discriminative outcome, followed by the relative abundances of bacterial communities. The smallest distance between wineries of the Central and Northeast regions in our study is approximately 85 km. Previous studies have shown that distinct volatile and non-volatile profiles of wines produced from different vineyards could be detected even when the wineries were located within an area of 13 km² or within a distance of 30 km (Roullier-Gall et al., 2014; Schueuermann et al., 2016). Previous work has also shown that regional bacterial or fungal signatures could be associated with climate, soil type (co-associated with plants), and the distance between two regions (Bokulich et al., 2014). In addition, other studies further mentioned the contribution of microbial terroir, which is correlated with wine flavor (Bokulich et al., 2016; Pretorius, 2020). Collectively, regional differences in microbial signatures from Central and Northeast PA could contribute to the distinct volatile profiles of regional Chambourcin red wine.

Variable importance in projection (VIP) scores enabled extraction of the key features contributing to class discrimination between the regionally distinct wines. According to the PLS-DA model of fungal consortia, the genera Botryosphaeria, Neofusicoccum, and Cladosporium were the most discriminative fungi, with higher abundance in the Northeast region. Previous studies demonstrate that the higher abundance of these three genera could be attributed to wetter and cooler weather (Espinoza et al., 2009; Úrbez-Torres et al., 2006), and these genera are considered pathogenic to grapevines, thereby influencing wine quality (Latorre et al., 2011; Lorenzini et al., 2015; Pitt et al., 2012). Furthermore, Kazachstania was the most discriminative wild yeast, with higher abundance in the Central region. A previous study mentioned that the co-culture of different strains of Kazachstania together with Saccharomyces cerevisiae produced distinct fermentation-derived aromas such as vinegar, cheesy, or floral notes, which are known to positively contribute to aroma attributes (Jood et al., 2017).
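The way VIP scores rank features can be sketched as follows. This is a minimal illustration and not the pipeline used in this study: it assumes that the PLS weight vectors and the sum of squares of the response explained by each component have already been extracted from a fitted PLS-DA model. The function name `vip_scores` and all numbers are hypothetical.

```python
import math

def vip_scores(weights, ss_explained):
    """Variable importance in projection for each feature.

    weights: p rows (features) x A columns (components) of PLS weights w_ja.
    ss_explained: A values, the sum of squares of Y explained by each component.
    VIP_j = sqrt(p * sum_a(SS_a * (w_ja / ||w_a||)^2) / sum_a(SS_a))
    """
    p = len(weights)
    n_comp = len(ss_explained)
    # Euclidean norm of each component's weight vector
    norms = [math.sqrt(sum(weights[j][a] ** 2 for j in range(p)))
             for a in range(n_comp)]
    total_ss = sum(ss_explained)
    scores = []
    for j in range(p):
        acc = sum(ss_explained[a] * (weights[j][a] / norms[a]) ** 2
                  for a in range(n_comp))
        scores.append(math.sqrt(p * acc / total_ss))
    return scores

# Hypothetical weights for 4 taxa on 2 components (illustration only)
toy_weights = [[0.9, 0.1], [0.3, 0.2], [0.2, 0.9], [0.1, 0.3]]
toy_ss = [8.0, 2.0]  # component 1 explains most of the class separation
toy_vip = vip_scores(toy_weights, toy_ss)
for taxon, score in enumerate(toy_vip):
    print(f"taxon {taxon}: VIP = {score:.3f}")
```

By construction the squared VIP scores average to one, so VIP > 1 is a commonly used convention for flagging features that contribute more than average to the discrimination.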

As for the bacterial model, the family Enterobacteriaceae (higher abundance in the Northeast region), the family Burkholderiaceae, and the genus Aureimonas (higher abundance in the Central region) were the most discriminative bacteria. Enterobacteriaceae was reported to be the most abundant family in grape must (Pinto et al., 2015) and to correlate negatively with fermentation rate, suggesting a potential wine quality issue (Bokulich et al., 2016). Burkholderiaceae was considered a major contributor to grape veraison (the change of color of grape berries) (O’Bryon, 2018) and to be associated with soil disease (Mendes et al., 2011). Finally, although they have received little attention with respect to wine-related characteristics, Aureimonas spp. are considered plant endophytes that are usually isolated from plant leaves (Madhaiyan et al., 2013).

On the other hand, although wine aroma profiles are not determined by microbial communities alone, our PLS-DA and rCCA results and other studies indicate that microbial signatures are still an important factor leading to distinct volatile compositions (Bokulich et al., 2016; Knight et al., 2015). For instance, the discriminative patterns showed higher production of several volatile compounds in the Central region, including 2-Hexenoic acid, (E)-, which is reported to be associated with grape ripening (Vilanova et al., 2012); Carbon dioxide; and Heptanal, which is often associated with the oak aging step of winemaking (Dumitriu et al., 2019). In contrast, Butanoic acid and 2,4-Hexadienedioic acid were more abundant in the Northeast region. These differences could be one of the outcomes of interactions between the regional microorganisms and wine grapes, though further research is needed.

Accordingly, although it remains doubtful whether the identified key microorganisms could be applied to replicate all aspects of red wine terroir, the microbial markers that differentiated the two regions could provide actionable, knowledge-based information to winemakers for improving unique wine volatile characteristics and mitigating problem fermentations in Pennsylvania regions.

3.4.4 Associations between wine microbiome and volatile metabolome revealed the different clustered patterns and provided the knowledge-based information for PA Wine Industry to better understand the microbe and compound interactions during wine fermentation.

As mentioned throughout this study, the microbiome plays crucial roles during the fermentation process. Identifying relationships between microbial populations and wine metabolites can help us decipher the patterns of microbial behavior that influence wine quality. However, little information is available on microbial dynamics and related chemical compounds in PA Chambourcin red wine. Therefore, through longitudinal sampling of 88 Chambourcin wine samples representing the microbial and aroma status of PA red wine, we conducted a correlation modeling approach using regularized canonical correlation analysis (rCCA) to identify potential interactions between fungal and bacterial consortia and wine volatile compounds. Although the sampling sites did not include regions in West PA, we could still obtain a fundamental overview of the microbial distribution and wine metabolites of Chambourcin red wine.

If there are correlations between two datasets (variates), rCCA will find linear combinations with maximum correlation between the datasets (Härdle & Simar, 2007). In our study, the correlation coefficient in each dimension was maximized by choosing the linear combinations of fungal or bacterial variables and of volatile metabolite variables that were most strongly correlated, which is why the coefficient values for the fungi-metabolite datasets in the first few dimensions were all close to one.
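As a sketch of the underlying optimization, the first regularized canonical correlation can be written as below. The notation is introduced here only for illustration and does not appear elsewhere in this thesis: X and Y denote the column-centered microbial and volatile metabolite matrices, S_XX, S_YY, and S_XY their sample covariance and cross-covariance matrices, a and b the canonical weight vectors, and λ1, λ2 ≥ 0 the ridge penalties, whose values are not restated in this section.

```latex
\rho_1 \;=\; \max_{a,\,b}\;
\frac{a^{\top} S_{XY}\, b}
     {\sqrt{a^{\top}\left(S_{XX} + \lambda_1 I\right) a}\;
      \sqrt{b^{\top}\left(S_{YY} + \lambda_2 I\right) b}}
```

Setting λ1 = λ2 = 0 recovers classical CCA. When the number of variables exceeds the number of samples, the unpenalized covariance matrices are singular and canonical correlations of one can be obtained trivially, which is consistent with the caution that a coefficient close to one is not, by itself, evidence of a meaningful association.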

Because of this analytic property, a high canonical correlation alone cannot ensure a meaningful canonical function without considering the explained variance of the datasets (Dattalo, 2014). For instance, the fungi-metabolite datasets had higher explained variance than the bacteria-metabolite datasets in the first dimension (Figure 3-24). In other words, given the high correlation coefficient in the first dimension, a larger portion of fungal taxa had relationships with wine volatile metabolites, compared with a relatively small portion of bacterial taxa. Collectively, although both fungal and bacterial communities were confirmed to have relationships with volatile compounds, a higher portion of the fungal community was correlated with wine volatile metabolites.

Furthermore, our results demonstrated different correlation patterns among the microbial taxa. Notably, the commercial microorganisms Saccharomyces and Oenococcus showed overall higher correlation levels than the native microorganisms. Studies have mentioned that the selective pressures prevailing during fermentation always favor the microbes with the most efficient fermentative properties, which is why Saccharomyces cerevisiae is universally considered responsible for initiating fermentation and converting grape sugars to ethanol and other sensorially important metabolites (Swiegers et al., 2005). Furthermore, these two microbes positively correlated with most of the detected metabolites, including Butanoic acid, 3-methyl-, ethyl ester and Acetic acid, 2-phenylethyl ester, which are recognized as important fruity flavor compounds in wine (Arias-Pérez et al., 2020; Viana et al., 2009). On the other hand, several wild yeasts, filamentous fungi, and non-Oenococcus bacteria also showed positive correlations with some metabolites, such as Hexanal, one of the common volatile compounds in hybrid grape wines (Slegers et al., 2015), and 3-hexen-1-ol, (Z)-, an important volatile compound with a grassy-green odor that is used in fruit flavors and has also been detected in Cabernet Sauvignon grapes (Darriet et al., 2002; PubChem, 2005). Additionally, Butanal, 3-methyl-, a branched-chain aldehyde associated with some wine products such as botrytized Sauternes, is reported to be produced by some Lactobacillus bacteria and S. cerevisiae (Smit et al., 2009). Nevertheless, our results did not identify such correlation patterns. Generally speaking, rCCA identifies and maximizes the correlation between two variates. However, in this research, it would be misleading to interpret a positive correlation as evidence that a metabolite was produced by an individual microorganism (Vojinovic et al., 2019). For example, the positive correlation we identified between S. cerevisiae and Linalool should be understood to mean that S. cerevisiae, within the entire microbial community, had dynamics similar to those of Linalool. Accordingly, in order to elucidate the mechanisms of volatile metabolite production by each microorganism, culture-dependent methods are necessary for species isolation and inoculation (Hart et al., 2019).

rCCA provided the key contributors to the correlation model based on loading factor values, which indicate how strongly each variable influences the model. We used those contributors to conduct a species-level analysis using Spearman’s rank correlation to reveal the associations between individual species and volatile metabolites. Nevertheless, the model could not establish a direct relationship or mechanism indicating which specific microorganism produced a given compound. In addition, a previous study has mentioned that the advantage of using CCA for the first identification of important features in the correlation pattern is that CCA, with its maximized correlation property, could reveal information that remains hidden to pairwise approaches such as Pearson correlation (Gajdoš et al., 2015) and the rank-based Spearman’s correlation (Liu et al., 2016).
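The difference between a rank-based and a linear pairwise correlation can be illustrated with a short sketch. It is not the analysis code of this study; the relative abundances and peak areas below are invented solely to show that Spearman’s coefficient captures a monotonic but non-linear relationship that Pearson’s coefficient understates.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: one species' relative abundance and one metabolite's
# peak area across six samples (monotonic but strongly non-linear)
abundance = [0.01, 0.05, 0.10, 0.40, 0.70, 0.90]
peak_area = [2.0, 3.0, 10.0, 11.0, 300.0, 5000.0]
rho = spearman(abundance, peak_area)
r = pearson(abundance, peak_area)
print(f"Spearman rho = {rho:.3f}, Pearson r = {r:.3f}")
```

As with the rCCA results, a high coefficient here indicates only that the two variables share similar dynamics across samples, not that the species produced the compound.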

These results illustrated a complex association between microbial consortia and volatile compositions in the wine fermentation ecosystem. In other words, with Chambourcin red wine as the research model, such relationships between microbes and volatile compounds could be useful tools for predicting the suitability of a winemaking process under given environmental factors, for example, the co-inoculation of wild yeasts and S. cerevisiae to mitigate undesirable microflora while avoiding the use of sulfur dioxide (Ciani et al., 2016). Furthermore, these knowledge-based tools can improve vineyard health, for example by maintaining disease-suppressing microbes such as Sphingomonas spp. or Pseudomonas spp., and prevent microbiological problems in unusual vintages, for instance by controlling the timing of occurrence of spoilage AAB species (E. J. Bartowsky, 2009; Bokulich et al., 2016; Zarraonaindia & Gilbert, 2015).

In conclusion, in this study reporting on the Chambourcin microbiome and volatile metabolites in Pennsylvania, I identified and characterized the diversity and abundance of microbial populations and volatile metabolites throughout winemaking using Chambourcin hybrid grapes. My work provides insights into how microbial biodiversity can impact the volatile compounds that contribute to ‘Terroir’. While microbiome studies typically present aggregate data from a large number of wineries, my data suggest that characterization of individual winery microbiomes provides higher resolution and better preserves the idea of ‘microbial terroir’ (see Figure 3-7). Although the associations between the microbiome and volatile metabolome are complex, results from PLS-DA and rCCA analyses highlight key players that drive fungal, bacterial, and volatile signatures in Central PA vs. Northeast PA and their interactive patterns. Finally, I present a list of the top 22 candidate fungi, 5 candidate bacteria, and corresponding volatile metabolites that could be explored further to connect the presence of microbial populations to mechanisms of volatile metabolite production. Future studies will also focus on evaluating the sensory attributes of microbiome-volatile metabolome interactions, as well as on culture-dependent methods for inoculating strains into a model wine, to provide a more comprehensive and accurate perspective on Chambourcin hybrid red wines.

Chapter 4

Conclusions and future directions

4.1. Major findings and research conclusions

1. Fungal communities decreased in richness and evenness during early fermentation, while bacterial communities showed a significant decrease in richness in the later stages of fermentation.

2. β-diversity analyses revealed that fungi and bacteria followed a similar distribution pattern: considering the number of taxonomic identities together with microbial relative abundance explained more of the microbial variation among fermentation stages.

3. LEfSe identified several taxa that predominated in different fermentation periods. Among the fungal communities, the genera Starmerella, Aureobasidium, and Sporobolomyces were the most abundant native wild yeasts in early fermentation, as were the bacteria Sphingomonas, Methylobacterium, and Komagataeibacter. The two consortia were dominated by Saccharomyces and Oenococcus, respectively, in late fermentation.

4. Regarding the metabolites in wine fermentation, GC-MS with SPME revealed the timing and level of volatile compounds: more compounds, most of them esters and alcohols, appeared at the later timepoints of fermentation, and most wine metabolites changed significantly in abundance during winemaking. In addition, 1-Butanol, 3-methyl-; Octanoic acid, ethyl ester; and Phenylethyl Alcohol were detected as the most abundant compounds in Chambourcin red wine.

5. The PLS-DA regression model revealed the important features for regional discrimination: the fungal taxa Botryosphaeria, Neofusicoccum, and Cladosporium and the bacterial taxa Enterobacteriaceae and Burkholderiaceae had higher abundance in the Northeast region.

6. Furthermore, the model also showed that 2-Hexenoic acid, (E)-, a compound commonly associated with a fatty, long-lasting odor, was the most important discriminative metabolite and had higher abundance in the Central region.

7. Regularized canonical correlation analysis (rCCA), which explored the degree of correlation between the wine microbiome and metabolome, demonstrated that Aureobasidium, Candida, and several filamentous fungi had strongly negative correlations with most of the volatile compounds identified. The relative abundance of Saccharomyces and Wickerhamomyces had much higher correlations with most of the wine metabolites, especially Butanoic acid, 3-methyl-, ethyl ester.

8. Spearman’s correlation analysis, which expresses the relationships between two data matrices, revealed that Kazachstania humilis was related to Acetic acid pentyl ester, a relationship not shown for Saccharomyces cerevisiae. Likewise, Bacillus gibsonii correlated with compounds that were not associated with Oenococcus oeni.

4.2. Future directions

Future work could include several modifications to the experimental approaches to improve the data obtained from the wine microbiome and metabolome. First, the sampling plan of this study collected wine fermentation samples from nine wineries in Central and Northeast PA. However, to obtain a more comprehensive study of Chambourcin red wine that represents all of Pennsylvania, samples should be collected from all regions in PA. Furthermore, the sampling plan could be extended to include more than one vintage year in order to account for the effect of climatic differences between vintage years. With a broader sampling range and a longitudinal experiment, we may acquire more complete information on the microbiome and metabolome of Pennsylvania red wine. Second, based on our results, we obtained associations between microorganisms and wine volatile metabolites. However, other wine attributes, such as non-volatile compounds, polyphenol contents, and sensory characteristics, remain unclear. Accordingly, we could conduct experiments to identify these attributes and determine their distribution and their interactions with wine microorganisms. After that, we could select the unique microbes that showed the most potential contribution to overall wine quality based on the results of volatile, non-volatile, color, and sensory trials, and complement this with a culture-dependent approach to isolate candidate strains as tools for the wine industry to improve wine quality or enhance the regional characteristics of wine.

In conclusion, because wine is a growing business in Pennsylvania but detailed research on the properties of its red wines is lacking, it will be valuable to put more effort into this field to provide more comprehensive and useful knowledge for the Pennsylvania Wine Industry.

Appendix A

Supplemental material and supporting data management information and accessibility for microbiome data analysis pipeline

Amplicon sequencing data from this study will be deposited into NCBI, and data analysis pipelines including QIIME2 and R scripts will be made available through Penn State’s new research repository service offered by the University Libraries (https://scholarsphere.psu.edu/collections/8pk02c972r).

Choice of QIIME2 plugin for microbial diversity analyses.

Two modifications related to the handling of NGS-based microbiome analysis are highlighted below. These modifications involved additional plugins to improve our data analysis pipeline. In this study, we chose QIIME2 for analyzing amplicon sequence data. Nonetheless, the plugins used in the standard workflow of the QIIME2 “Moving Pictures” tutorial (docs.qiime2.org/2020.2/tutorials/moving-pictures/) were reported to have limitations for biodiversity analyses, especially for fungal consortia. We therefore employed more recent plugins to improve the traditional sequencing analysis pipeline. First, regarding the analysis of the 16S rRNA gene for the bacterial community, many studies used FastTree, a de novo approach that builds a tree from the sample data rather than using a preconstructed reference tree (Gaulke et al., 2018; X. Li et al., 2020; Loo et al., 2019). However, this existing plugin in QIIME2 can construct an inadequate phylogenetic tree and may even introduce artifactual clustering in diversity analyses (Figure A1-1a) (Janssen et al., 2018). In contrast, q2-fragment-insertion, a more advanced plugin in QIIME2, outperformed the traditional de novo method and showed more accurate data integration and identification by using SATé-enabled phylogenetic placement (SEPP), a phylogenetic placement method that inserts sequences into a preconstructed phylogenetic tree and alignment of full-length sequences from a reference database (Mirarab et al., 2012), to insert V4 fragments into the reference tree (Janssen et al., 2018). Thus, the reference-based phylogenetic tree produced by the q2-fragment-insertion plugin was able to correctly integrate the microbial sequencing data (Figure A1-1c).

Figure A1-1: Principal-coordinate analysis (PCoA) of unweighted UniFrac distances using (a) a de novo phylogenetic approach; (b) a de novo approach with manual distance adjustment; (c) a reference phylogenetic approach (Janssen et al., 2018).

A similar phylogenetic issue arises in the diversity analyses of fungal communities when using QIIME2. While ITS allows us to identify fungal species more easily, its high sequence variability also becomes a barrier to correct sequence alignment and phylogeny construction (Fouquier et al., 2016; Goodrich et al., 2014). Accordingly, in order to construct a fungal phylogenetic tree, a bioinformatics tool called ghost-tree (https://github.com/JTFouquier/ghost-tree-trees), implemented in QIIME2, is used to create a tree by integrating fungal 18S rRNA gene sequences and fungal ITS sequences from reference databases into a single phylogenetic tree. Compared with a non-phylogenetic method or a FastTree-generated tree, ghost-tree was shown to provide better visual separation of microbial communities collected from two distinct environments (Figure A1-2) (Fouquier et al., 2016). As a result, by choosing suitable methods to build higher-resolution phylogenetic trees for both bacterial and fungal communities, we were able to improve our ability to detect the dynamics of the microbiome and understand their roles during wine fermentation.
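Why the choice of tree matters for β-diversity can be seen in a toy comparison of a non-phylogenetic metric (binary Jaccard) with unweighted UniFrac. The sketch below is illustrative only: the four-taxon tree, its branch lengths, and the sample contents are hypothetical, and the tree is encoded simply as a list of branches with the tips that descend from each one.

```python
def binary_jaccard(sample_a, sample_b):
    """Jaccard distance on presence/absence; taxa are treated as unrelated."""
    union = sample_a | sample_b
    shared = sample_a & sample_b
    return 1 - len(shared) / len(union)

def unweighted_unifrac(branches, sample_a, sample_b):
    """Fraction of observed branch length that leads to only one sample.

    branches: list of (length, set_of_descendant_tips), one per branch.
    """
    unique = 0.0
    total = 0.0
    for length, tips in branches:
        in_a = bool(tips & sample_a)
        in_b = bool(tips & sample_b)
        if in_a or in_b:
            total += length
            if in_a != in_b:
                unique += length
    return unique / total

# Hypothetical tree: t1 and t2 are close relatives, as are t3 and t4
toy_branches = [
    (0.1, {"t1"}), (0.1, {"t2"}), (1.0, {"t1", "t2"}),
    (0.1, {"t3"}), (0.1, {"t4"}), (1.0, {"t3", "t4"}),
]
# Pair 1: no shared taxa, but each sample holds a close relative of the other
close_a, close_b = {"t1", "t3"}, {"t2", "t4"}
# Pair 2: no shared taxa and no shared clades
far_a, far_b = {"t1", "t2"}, {"t3", "t4"}

print(binary_jaccard(close_a, close_b), binary_jaccard(far_a, far_b))
print(unweighted_unifrac(toy_branches, close_a, close_b),
      unweighted_unifrac(toy_branches, far_a, far_b))
```

Binary Jaccard assigns both pairs the maximum distance, whereas unweighted UniFrac separates them because it uses shared branch length. The quality of that separation therefore depends on the tree supplied, which is the motivation for using reference-based trees for both 16S and ITS data.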

Figure A1-2: PCoA of ITS sequences from saliva (blue) and restroom (red) samples generated using (a) Binary Jaccard; (b) unweighted UniFrac with Muscle alignment; (c) unweighted UniFrac using ghost-tree generated phylogeny (https://github.com/JTFouquier/ghost-tree-trees).

Choice of fungal ITS region.

Fungal communities, including Saccharomyces cerevisiae, wild yeasts, and filamentous fungi, have key roles during wine fermentation. The ITS region has been chosen for the identification of fungi due to its conserved primer binding sites, high number of copies per cell, and high interspecific variability (Schoch et al., 2012). Under the premise that ITS1 and ITS2 have similar classification accuracy, ITS2 may be better suited for phylogenetic characterization based on the higher number of sequences available in INSD (Nilsson et al., 2009). Moreover, ITS2 was found to be more suitable for revealing operational taxonomic unit (OTU) richness and taxonomic specifics in fungal populations (Yang et al., 2018).

On the other hand, as for the bacterial community, several characteristics make the 16S rRNA gene a suitable biomarker for taxonomic identification and phylogenetic construction of bacterial communities: (i) it is present in almost all known bacteria; (ii) its function is constant over time, which gives higher accuracy for evolutionary measurements; (iii) it has a mosaic structure comprising both conserved and more variable regions; and (iv) its length is suitable for sequencing (Coenye & Vandamme, 2003; Janda & Abbott, 2007). Importantly, Youssef et al. indicated that the V4, V5+V6, and V6+V7 sequencing fragments gave estimates of OTUs and species richness comparable to the results from nearly full-length fragments. Likewise, another study showed that the V4 region (F515-R806) is the most suitable for taxonomic classification of bacterial communities based on either single-read or paired-end NGS approaches (Mizrahi-Man et al., 2013).

Appendix B

The impact of sulfite treatment on microbial populations during early fermentation stages of winemaking

Research questions

Sulphur dioxide (SO2) addition is a well-established practice in the fermentation process, used to inhibit the growth of indigenous yeasts and bacteria. However, the effect of this practice on the indigenous microbiome in Chambourcin red wine is still unclear (Andorrà et al., 2008). Therefore, in this section, we are interested in how Sulphur dioxide impacts (1) the microbial compositions and (2) the richness and evenness of microbial biodiversity at different fermentation timepoints.

Preliminary findings

To explore the effect of SO2 on the indigenous microbial population, we conducted a lab-scale 3-day spontaneous fermentation using Chambourcin wine grapes collected from three different wineries in Pennsylvania; sodium metabisulfite (VWR, PA, USA) was added once the grapes were crushed. Total genomic DNA was extracted at each fermentation timepoint and sequenced using the ITS2 region as the biomarker on the Illumina MiSeq (2x250) platform.

According to the microbial taxonomic compositions, we can first identify that the indigenous fungi differed among individual wineries. Specifically, grapes collected from winery SM showed high abundance of Aureobasidium pullulans and Sporobolomyces patagonicus at the beginning of fermentation. On the other hand, we can also observe that sodium metabisulfite appeared to have limited effect on the microbiome compositions (Figure B1-1), an observation that needs further statistical testing. Therefore, we analyzed the α-diversity of microbial richness and evenness with the Kruskal-Wallis test to see whether the sulfite practice had a significant influence on microbial diversity. We discovered that the sulfite treatment had a significant impact on microbial evenness. In other words, with SO2 treatment, the relative abundance of some taxa changed significantly, leading to uneven microbial compositions. Nevertheless, the treatment did not produce a significant difference in microbial richness, meaning that the number of taxa remained the same (Figure B1-2).
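For two groups, the Kruskal-Wallis comparison can be sketched in a few lines. This is a simplified illustration rather than the QIIME2 implementation used here: the evenness values are invented, the function name is hypothetical, the p-value relies on the chi-square approximation with one degree of freedom, and no FDR adjustment is applied.

```python
import math

def kruskal_wallis_two_groups(group_a, group_b):
    """Return (H, p) for two groups using the chi-square approximation.

    Assumes the pooled values are not all identical (the tie correction
    would otherwise divide by zero).
    """
    pooled = [(v, 0) for v in group_a] + [(v, 1) for v in group_b]
    pooled.sort(key=lambda pair: pair[0])
    n = len(pooled)
    rank_sums = [0.0, 0.0]
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mean_rank = (i + j) / 2 + 1      # average rank for a run of ties
        t = j - i + 1
        tie_term += t ** 3 - t
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += mean_rank
        i = j + 1
    h = 12 / (n * (n + 1)) * (
        rank_sums[0] ** 2 / len(group_a) + rank_sums[1] ** 2 / len(group_b)
    ) - 3 * (n + 1)
    h /= 1 - tie_term / (n ** 3 - n)     # tie correction
    h = max(h, 0.0)                      # guard against rounding below zero
    p = math.erfc(math.sqrt(h / 2))      # chi-square survival function, 1 df
    return h, p

# Hypothetical Pielou's evenness values (illustration only)
control = [0.82, 0.79, 0.85, 0.80, 0.83, 0.81]
sulfite = [0.61, 0.58, 0.66, 0.63, 0.60, 0.65]
h_stat, p_value = kruskal_wallis_two_groups(control, sulfite)
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```

Because the test works on ranks, it makes no normality assumption about the diversity values, which suits small sample sizes such as a three-winery pilot.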

Consequently, we now understand the influence of SO2 on the indigenous microbiome of Chambourcin red wine: the practice had more influence on the relative abundance of taxa than on the number of taxa.
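The distinction between richness and evenness can be made concrete with a small sketch. The counts are hypothetical, and richness is simplified here to the number of observed taxa, whereas this study used Faith’s phylogenetic diversity.

```python
import math

def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon diversity H' using the natural logarithm."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def pielou_evenness(counts):
    """Pielou's J' = H' / ln(S); returned as 0.0 when fewer than two taxa."""
    s = richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0

# Hypothetical read counts for four taxa (illustration only)
untreated = [25, 25, 25, 25]   # all taxa equally abundant
treated = [85, 5, 5, 5]        # same taxa present, one now dominates

for name, counts in (("untreated", untreated), ("treated", treated)):
    print(name, richness(counts), round(pielou_evenness(counts), 3))
```

Both communities contain four taxa, yet evenness drops sharply in the second one. This is the pattern described above, in which the treatment shifts relative abundances without changing the number of taxa.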

Figure B1-1: Taxonomic plots demonstrating the relative abundance of the top 20 fungal taxa at different spontaneous fermentation times (hr) and treatments for wines made from Chambourcin grapes collected from different wineries. Unidentified genera were only shown at the family level, and ‘unidentified’ taxa were grouped into the categories “g_others” (species unidentified), “f_others” (genus unidentified), and “O_others” (family unidentified).

Figure B1-2: Rarefied alpha diversity distributions of microbial richness and evenness between sodium metabisulfite treatments of spontaneous fermentation wine samples. (A) Faith’s phylogenetic diversity (richness); (B) Pielou’s evenness. Significant differences between treatments were determined using the Kruskal-Wallis test with FDR-adjusted p-values (q-values). * q-value < 0.05.

References

Al-Awadhi, H., Dashti, N., Khanafer, M., Al-Mailem, D., Ali, N., & Radwan, S. (2013). Bias problems in culture-independent analysis of environmental bacterial communities: A representative study on hydrocarbonoclastic bacteria. SpringerPlus, 2(1), 1–11. https://doi.org/10.1186/2193-1801-2-369

Alboukadel Kassambara. (2019). Visualization of a Correlation Matrix using “ggplot2.” http://www.sthda.com/english/wiki/ggcorrplot

Alboukadel Kassambara, & Mundt, F. (2016). Extract and Visualize the Results of Multivariate Data Analyses. https://github.com/kassambara/factoextra/issues

Andorrà, I., Landi, S., Mas, A., Guillamón, J. M., &Esteve-Zarzoso, B. (2008). Effect of

oenological practices on microbial populations using culture-independent techniques. Food

Microbiology, 25(7), 849–856. https://doi.org/10.1016/j.fm.2008.05.005

Andorrà, I., Miró, G., Espligares, N., Maria Mislata, A., Puxeu, M., &Ferrer-Gallego, R. (2019).

Wild Yeast and Lactic Acid Bacteria of Wine. In Yeasts in Biotechnology. IntechOpen.

https://doi.org/10.5772/intechopen.84128

Anzor Mikaia, Edward White V, Vladimir Zaikin, Damo Zhu, O. David Sparkman, Pedatsur Neta,

&Igor Zenkevich. (2014). NIST 14 MS Database and MS Search Program v.2.2.

Apprill, A., Mcnally, S., Parsons, R., &Weber, L. (2015). Minor revision to V4 region SSU rRNA

806R gene primer greatly increases detection of SAR11 bacterioplankton. Aquatic Microbial

139

Ecology, 75(2), 129–137. https://doi.org/10.3354/ame01753

Arias-Pérez, I., Ferrero-Del-Teso, S., Sáenz-Navajas, M. P., Fernández-Zurbano, P., Lacau, B.,

Astraín, J., Barón, C., Ferreira, V., & Escudero, A. (2020). Some clues about the changes in

wine aroma composition associated to the maturation of “neutral” grapes. Food Chemistry,

320, 126610. https://doi.org/10.1016/j.foodchem.2020.126610

Atanassov, I., Hvarleva, T., Rusanov, K., Tsvetkov, I., & Atanassov, A. (2009). Wine Metabolite

Profiling: Possible Application in Winemaking and Grapevine Breading in Bulgaria.

Biotechnology & Biotechnological Equipment, 23(4), 1449–1452.

https://doi.org/10.2478/V10133-009-0011-9

Baffi, M. A., Tobal, T., Lago, J. H. G., Boscolo, M., Gomes, E., & Da-Silva, R. (2013). Wine aroma

improvement using a β-glucosidase preparation from Aureobasidium pullulans. Applied

Biochemistry and Biotechnology, 169(2), 493–501. https://doi.org/10.1007/s12010-012-9991-2

Barata, A., Malfeito-Ferreira, M., & Loureiro, V. (2012). The microbial ecology of wine grape

berries. In International Journal of Food Microbiology (Vol. 153, Issue 3, pp. 243–259).

Elsevier. https://doi.org/10.1016/j.ijfoodmicro.2011.11.025

Barlass, M., Miller, R. M., & Douglas, T. J. (1987). Development of Methods for Screening

Grapevines for Resistance to Infection by Downy Mildew. II. Resveratrol Production.

American Journal of Enology and Viticulture, 38(1).

Bartowsky, E. J. (2009). Bacterial spoilage of wine and approaches to minimize it. In Letters in

Applied Microbiology (Vol. 48, Issue 2, pp. 149–156). John Wiley & Sons, Ltd.

https://doi.org/10.1111/j.1472-765X.2008.02505.x

Bartowsky, E. J., & Henschke, P. A. (2008). Acetic acid bacteria spoilage of bottled red wine-

A review. International Journal of Food Microbiology, 125(1), 60–70.

https://doi.org/10.1016/j.ijfoodmicro.2007.10.016

Battilana, J., Costantini, L., Emanuelli, F., Sevini, F., Segala, C., Moser, S., Velasco, R., Versini,

G., & Grando, M. S. (2009). The 1-deoxy-d-xylulose 5-phosphate synthase gene co-localizes

with a major QTL affecting monoterpene content in grapevine. Theoretical and Applied

Genetics, 118(4), 653–669. https://doi.org/10.1007/s00122-008-0927-8

Bejaei, Cliff, Madilao, & van Vuuren. (2019). Modelling Changes in Volatile Compounds in British

Columbian Varietal Wines that Were Bottle Aged for Up to 120 Months. Beverages, 5(3), 57.

https://doi.org/10.3390/beverages5030057

Belda, I., Ruiz, J., Esteban-Fernández, A., Navascués, E., Marquina, D., Santos, A., & Moreno-Arribas,

M. V. (2017). Microbial contribution to Wine aroma and its intended use for Wine

quality improvement. Molecules, 22(2). https://doi.org/10.3390/molecules22020189

Belda, I., Zarraonaindia, I., Perisin, M., Palacios, A., & Acedo, A. (2017). From vineyard soil to

wine fermentation: Microbiome approximations to explain the “terroir” Concept. In Frontiers

in Microbiology (Vol. 8, Issue MAY). Frontiers Media S.A.

https://doi.org/10.3389/fmicb.2017.00821

Benito, S., Palomero, F., Calderón, F., Palmero, D., & Suárez-Lepe, J. A. (2014). Selection of

appropriate Schizosaccharomyces strains for winemaking. Food Microbiology, 42, 218–224.

https://doi.org/10.1016/j.fm.2014.03.014

Bergström, A., Skov, T. H., Bahl, M. I., Roager, H. M., Christensen, L. B., Ejlerskov, K. T.,

Mølgaard, C., Michaelsen, K. F., & Licht, T. R. (2014). Establishment of intestinal microbiota

during early life: A longitudinal, explorative study of a large cohort of Danish infants. Applied

and Environmental Microbiology, 80(9), 2889–2900. https://doi.org/10.1128/AEM.00342-14

Bisanz, J. E. (2018). qiime2R: Importing QIIME2 artifacts and associated data into R sessions.

https://github.com/jbisanz/qiime2R

Blanco, P., Mirás-Avalos, J. M., Pereira, E., Fornos, D., & Orriols, I. (2014). Modulation of

chemical and sensory characteristics of red wine from Mencía by using indigenous

Saccharomyces cerevisiae yeast strains. Journal International Des Sciences de La Vigne et

Du Vin, 48(1), 63–74. https://doi.org/10.20870/oeno-one.2014.48.1.1659

Bokulich, N. A., Collins, T. S., Masarweh, C., Allen, G., Heymann, H., Ebeler, S. E., & Mills, D.

A. (2016). Associations among wine grape microbiome, metabolome, and fermentation

behavior suggest microbial contribution to regional wine characteristics. MBio, 7(3).

https://doi.org/10.1128/mBio.00631-16

Bokulich, N. A., Dillon, M. R., Zhang, Y., Rideout, J. R., Bolyen, E., Li, H., Albert, P. S.,

& Caporaso, J. G. (2018). q2-longitudinal: Longitudinal and Paired-Sample Analyses of

Microbiome Data. MSystems, 3(6). https://doi.org/10.1128/msystems.00219-18

Bokulich, N. A., Ohta, M., Richardson, P. M., & Mills, D. A. (2013). Monitoring Seasonal Changes

in Winery-Resident Microbiota. PLoS ONE, 8(6), e66437.

https://doi.org/10.1371/journal.pone.0066437

Bokulich, N. A., Thorngate, J. H., Richardson, P. M., & Mills, D. A. (2014). Microbial

biogeography of wine grapes is conditioned by cultivar, vintage, and climate. Proceedings of

the National Academy of Sciences of the United States of America, 111(1), E139–E148.

https://doi.org/10.1073/pnas.1317377110

Bozoudi, D., & Tsaltas, D. (2016). Grape Microbiome: Potential and Opportunities as a Source of

Starter Cultures. In Antonio Morata & Iris Loira (Eds.), Grape and Wine Biotechnology.

IntechOpen. https://doi.org/10.5772/64806

Bromba, M. U. A., & Ziegler, H. (1981). Application hints for Savitzky-Golay digital smoothing

filters. Analytical Chemistry, 53(11), 1583–1586. https://doi.org/10.1021/ac00234a011

Burns, K. N., Kluepfel, D. A., Strauss, S. L., Bokulich, N. A., Cantu, D., & Steenwerth, K. L.

(2015). Vineyard soil bacterial diversity and composition revealed by 16S rRNA genes:

Differentiation by geographic features. Soil Biology and Biochemistry, 91, 232–247.

https://doi.org/10.1016/j.soilbio.2015.09.002

Caffarra, A., Rinaldi, M., Eccel, E., Rossi, V., & Pertot, I. (2012). Modelling the impact of climate

change on the interaction between grapevine and its pests and pathogens: European grapevine

moth and powdery mildew. Agriculture, Ecosystems and Environment, 148, 89–101.

https://doi.org/10.1016/j.agee.2011.11.017

Callahan, B. J., McMurdie, P. J., Rosen, M. J., Han, A. W., Johnson, A. J. A., & Holmes, S. P.

(2016). DADA2: High-resolution sample inference from Illumina amplicon data. Nature

Methods, 13(7), 581–583. https://doi.org/10.1038/nmeth.3869

Callejón, R. M., Margulies, B., Hirson, G. D., & Ebeler, S. E. (2012). Dynamic changes in volatile

compounds during fermentation of Cabernet Sauvignon grapes with and without skins.

American Journal of Enology and Viticulture, 63(3), 301–312.

https://doi.org/10.5344/ajev.2012.12009

Capozzi, V., Garofalo, C., Chiriatti, M. A., Grieco, F., & Spano, G. (2015). Microbial terroir and

food innovation: The case of yeast biodiversity in wine. In Microbiological Research (Vol.

181, pp. 75–83). Elsevier GmbH. https://doi.org/10.1016/j.micres.2015.10.005

Centinari, M., Kelley, K. M., Hed, B., Miller, A., & Patel-Campillo, A. (2016). Assessing Growers’

Challenges and Needs to Improve Wine Grape Production in Pennsylvania. Journal of

Extension, 54(3).

Chen, Y., Zhang, W., Yi, H., Wang, B., Xiao, J., Zhou, X., Jiankun, X., Jiang, L., & Shi, X. (2020).

Microbial community composition and its role in volatile compound formation during the

spontaneous fermentation of ice wine made from Vidal grapes. Process Biochemistry, 92,

365–377. https://doi.org/10.1016/j.procbio.2020.01.027

Chisholm, M. G., & Samuels, J. M. (1992). Determination of the Impact of the Metabolites of

Sorbic Acid on the Odor of a Spoiled Red Wine. Journal of Agricultural and Food Chemistry,

40(4), 630–633. https://doi.org/10.1021/jf00016a021

Cho, H. W., Kim, S. B., Jeong, M. K., Park, Y., Miller, N. G., Ziegler, T. R., & Jones, D. P. (2008).

Discovery of metabolite features for the modelling and analysis of high-resolution NMR

spectra. International Journal of Data Mining and Bioinformatics, 2(2), 176–192.

https://doi.org/10.1504/IJDMB.2008.019097

Chong, J., Wishart, D. S., & Xia, J. (2019). Using MetaboAnalyst 4.0 for Comprehensive and

Integrative Metabolomics Data Analysis. Current Protocols in Bioinformatics, 68(1).

https://doi.org/10.1002/cpbi.86

Ciani, M., Capece, A., Comitini, F., Canonico, L., Siesto, G., & Romano, P. (2016). Yeast

Interactions in Inoculated Wine Fermentation. Frontiers in Microbiology, 7(APR), 555.

https://doi.org/10.3389/fmicb.2016.00555

Cletus Kurtzman, J. W. Fell, & Teun Boekhout (Eds.). (2011). The Yeasts: A Taxonomic Study

(5th ed.). Elsevier Science & Technology. https://doi.org/10.1016/B978-0-444-52149-1.00012-4

Cocolin, L., Bisson, L. F., & Mills, D. A. (2000). Direct profiling of the yeast dynamics in wine

fermentations. FEMS Microbiology Letters, 189(1), 81–87. https://doi.org/10.1111/j.1574-6968.2000.tb09210.x

Cocolin, L., Alessandria, V., Dolci, P., Gorra, R., & Rantsiou, K. (2013). Culture independent

methods to assess the diversity and dynamics of microbiota during food fermentation.

International Journal of Food Microbiology, 167(1), 29–43.

https://doi.org/10.1016/j.ijfoodmicro.2013.05.008

Coenye, T., & Vandamme, P. (2003). Intragenomic heterogeneity between multiple 16S ribosomal

RNA operons in sequenced bacterial genomes. FEMS Microbiology Letters, 228(1), 45–49.

https://doi.org/10.1016/S0378-1097(03)00717-1

Coia, L. R., & Ward, D. L. (2017). The hybrid grape Chambourcin has a role in quality red V.

vinifera blends in a New World grape growing region. Journal of Wine Research, 28(4),

326–331. https://doi.org/10.1080/09571264.2017.1392292

Combina, M., Elía, A., Mercado, L., Catania, C., Ganga, A., & Martinez, C. (2005). Dynamics of

indigenous yeast populations during spontaneous fermentation of wines from Mendoza,

Argentina. International Journal of Food Microbiology, 99(3), 237–243.

https://doi.org/10.1016/j.ijfoodmicro.2004.08.017

Cordente, A. G., Curtin, C. D., Varela, C., & Pretorius, I. S. (2012). Flavour-active wine yeasts. In

Applied Microbiology and Biotechnology (Vol. 96, Issue 3, pp. 601–618). Springer.

https://doi.org/10.1007/s00253-012-4370-z

Cuadros-Inostroza, A., Ruíz-Lara, S., González, E., Eckardt, A., Willmitzer, L., & Peña-Cortés, H.

(2016). GC–MS metabolic profiling of Cabernet Sauvignon and Merlot cultivars during

grapevine berry development and network analysis reveals a stage- and cultivar-dependent

connectivity of primary metabolites. Metabolomics, 12(2), 1–17.

https://doi.org/10.1007/s11306-015-0927-z

Darriet, P., Pons, M., Henry, R., Dumont, O., Findeling, V., Cartolaro, P., Calonnec, A.,

& Dubourdieu, D. (2002). Impact odorants contributing to the fungus type aroma from grape

berries contaminated by powdery mildew (Uncinula necator); incidence of enzymatic

activities of the yeast Saccharomyces cerevisiae. Journal of Agricultural and Food Chemistry,

50(11), 3277–3282. https://doi.org/10.1021/jf011527d

Dattalo, P. V. (2014). A Demonstration of Canonical Correlation Analysis with Orthogonal

Rotation to Facilitate Interpretation. http://scholarscompass.vcu.edu/socialwork_pubs

David, V., Terrat, S., Herzine, K., Claisse, O., Rousseaux, S., Tourdot-Maréchal, R.,

Masneuf-Pomarede, I., Ranjard, L., & Alexandre, H. (2014). High-throughput sequencing of amplicons

for monitoring yeast biodiversity in must and during alcoholic fermentation. Journal of

Industrial Microbiology and Biotechnology, 41(5), 811–821.

https://doi.org/10.1007/s10295-014-1427-2

De-La-Fuente-Blanco, A., Sáenz-Navajas, M. P., & Ferreira, V. (2016). On the effects of higher

alcohols on red wine aroma. Food Chemistry, 210, 107–114.

https://doi.org/10.1016/j.foodchem.2016.04.021

Denise Gardner. (2015, September 30). Volatile Acidity in Wine. Penn State Extension.

https://extension.psu.edu/volatile-acidity-in-wine

Dewey, R. (2017). Get to Know Pennsylvania’s Native Grapes. Pennsylvania Winery Association.

https://pennsylvaniawine.com/2017/06/13/ultimate-guide-pennsylvania-grapes/

Dimitrios, A. A., Kamilari, E., & Tsaltas, D. (2019). Contribution of the Microbiome as a Tool for

Estimating Wine’s Fermentation Output and Authentication. In Advances in Grape and Wine

Biotechnology. IntechOpen. https://doi.org/10.5772/intechopen.85692

Dombrosky, J., & Gajanan, S. (2013). Pennsylvania Wine Industry-An Assessment. The Center for

Rural Pennsylvania.

https://www.rural.palegislature.us/documents/reports/pa_wine_industry_2013.pdf

Drysdale, G. S., & Fleet, G. H. (1988). Acetic Acid Bacteria in Winemaking: A Review. American

Journal of Enology and Viticulture, 39(2).

du Plessis, H., du Toit, M., Nieuwoudt, H., van der Rijst, M., Kidd, M., & Jolly, N. (2017). Effect of

Saccharomyces, Non-Saccharomyces Yeasts and Malolactic Fermentation Strategies on

Fermentation Kinetics and Flavor of Shiraz Wines. Fermentation, 3(4), 64.

https://doi.org/10.3390/fermentation3040064

Dumitriu, G. D., Teodosiu, C., Gabur, I., Cotea, V. V., Peinado, R. A., & de Lerma, N. L. (2019).

Evaluation of aroma compounds in the process of wine ageing with oak chips. Foods, 8(12).

https://doi.org/10.3390/foods8120662

Ebeler, S. E. (2015). Analysis of Grapes and Wines: An Overview of New Approaches and

Analytical Tools (Vol. 44). UTC. https://pubs.acs.org/sharingguidelines

Englezos, V., Giacosa, S., Rantsiou, K., Rolle, L., & Cocolin, L. (2017). Starmerella bacillaris in

winemaking: opportunities and risks. In Current Opinion in Food Science (Vol. 17, pp. 30–35).

Elsevier Ltd. https://doi.org/10.1016/j.cofs.2017.08.007

Espinoza, J. G., Briceño, E. X., Chávez, E. R., Úrbez-Torres, J. R., & Latorre, B. A. (2009).

Neofusicoccum spp. associated with stem canker and dieback of blueberry in Chile. Plant

Disease, 93(11), 1187–1194. https://doi.org/10.1094/PDIS-93-11-1187

Faith, D. P. (1992). Conservation evaluation and phylogenetic diversity. Biological Conservation,

61(1), 1–10. https://doi.org/10.1016/0006-3207(92)91201-3

Farell, E. M., & Alexandre, G. (2012). Bovine serum albumin further enhances the effects of

organic solvents on increased yield of polymerase chain reaction of GC-rich templates. BMC

Research Notes, 5(1), 1. https://doi.org/10.1186/1756-0500-5-257

Fleet, G. H. (2003). Yeast interactions and wine flavour. In International Journal of Food

Microbiology (Vol. 86, Issues 1–2, pp. 11–22). Elsevier. https://doi.org/10.1016/S0168-1605(03)00245-9

Fouquier, J., Rideout, J. R., Bolyen, E., Chase, J., Shiffer, A., McDonald, D., Knight, R., Caporaso,

J. G., & Kelley, S. T. (2016). Ghost-tree: Creating hybrid-gene phylogenetic trees for diversity

analyses. Microbiome, 4(1), 11. https://doi.org/10.1186/s40168-016-0153-6

Gajdoš, M., Mračková, M., Elfmarková, N., Rektorová, I., & Mikl, M. (2015). 50. Comparison of

canonical correlation analysis and Pearson correlation in resting state fMRI in patients with

Parkinson’s disease. Clinical Neurophysiology, 126(3), e47–e48.

https://doi.org/10.1016/j.clinph.2014.10.209

Gamero, A., Quintilla, R., Groenewald, M., Alkema, W., Boekhout, T., & Hazelwood, L. (2016).

High-throughput screening of a large collection of non-conventional yeasts reveals their

potential for aroma formation in food fermentation. Food Microbiology, 60, 147–159.

https://doi.org/10.1016/j.fm.2016.07.006

Gardner, D. M. (2016). Tasting Chambourcin: Part I. Penn State Extension Wine & Grapes U.

https://psuwineandgrapes.wordpress.com/2016/07/15/tasting-chambourcin-part-i/

Gaulke, C. A., Arnold, H. K., Humphreys, I. R., Kembel, S. W., O’Dwyer, J. P., & Sharpton, T. J.

(2018). Ecophylogenetics clarifies the evolutionary association between mammals and their

gut microbiota. MBio, 9(5). https://doi.org/10.1128/mBio.01348-18

General Industry Stats. (2019). United States Wine and Grape Industry FAQS | WineAmerica. The

National Association of American Wineries. https://wineamerica.org/policy/by-the-numbers/

Gilbert, J. A., van der Lelie, D., & Zarraonaindia, I. (2014). Microbial terroir for wine grapes. In

Proceedings of the National Academy of Sciences of the United States of America (Vol. 111,

Issue 1, pp. 5–6). National Academy of Sciences. https://doi.org/10.1073/pnas.1320471110

González, I., Martin, P. G. P., & Baccini, A. (2008). CCA: An R Package to Extend Canonical

Correlation Analysis. In JSS Journal of Statistical Software (Vol. 23).

http://www.jstatsoft.org/

Goodrich, J. K., DiRienzi, S. C., Poole, A. C., Koren, O., Walters, W. A., Caporaso, J. G., Knight,

R., & Ley, R. E. (2014). Conducting a microbiome study. In Cell (Vol. 158, Issue 2, pp. 250–262).

Cell Press. https://doi.org/10.1016/j.cell.2014.06.037

Guo, Q., Honesty, S., Xu, M. L., Zhang, Y., Schoelz, J., & Qiu, W. (2014). Genetic diversity and

tissue and host specificity of Grapevine vein clearing virus. Phytopathology, 104(5), 539–547.

https://doi.org/10.1094/PHYTO-03-13-0075-R

Guzzon, R., Malacarne, M., Larcher, R., Franciosi, E., & Toffanin, A. (2020). The impact of grape

processing and carbonic maceration on the microbiota of early stages of winemaking. Journal

of Applied Microbiology, 128(1), 209–224. https://doi.org/10.1111/jam.14462

Halket, J. M., & Zaikin, V. G. (2003). Derivatization in Mass Spectrometry—1. Silylation.

European Journal of Mass Spectrometry, 9(1), 1–21. https://doi.org/10.1255/ejms.527

Handelsman, J., Rondon, M. R., Brady, S. F., Clardy, J., & Goodman, R. M. (1998). Molecular

biological access to the chemistry of unknown soil microbes: A new frontier for natural

products. Chemistry and Biology, 5(10). https://doi.org/10.1016/S1074-5521(98)90108-9

Happer, J. K., & Kime, L. (2013). Wine Grape Production. Penn State Extension.

https://extension.psu.edu/wine-grape-production

Härdle, W., & Simar, L. (2007). Canonical Correlation Analysis. In: Applied Multivariate

Statistical Analysis (W. Härdle & L. Simar (Eds.)). Springer Berlin Heidelberg.

https://doi.org/10.1007/978-3-540-72244-1_14

Hart, R. S., Jolly, N. P., & Ndimba, B. K. (2019). Characterisation of hybrid yeasts for the

production of varietal wine – A review. Journal of Microbiological Methods,

165, 105699. https://doi.org/10.1016/j.mimet.2019.105699

Heard, G. M., & Fleet, G. H. (1985). Growth of natural yeast during the fermentation of

inoculated wines. Applied and Environmental Microbiology, 50(3), 727–728.

https://doi.org/10.1128/aem.50.3.727-728.1985

Hoemmen, G., Altman, I., & Rendleman, M. (2015). Impact of sustainable viticulture programs on

American Viticultural areas: Lodi AVA. Journal of Wine Research, 26(3), 169–180.

https://doi.org/10.1080/09571264.2015.1052047

Homich, L. J., Scheinberg, J. A., Elias, R. J., & Gardner, D. M. (2016). Effects of co-inoculation

on wine-quality attributes of the high-acid, red hybrid variety Chambourcin. American

Journal of Enology and Viticulture, 67(2), 245–250. https://doi.org/10.5344/ajev.2015.15084

Ilc, T., Werck-Reichhart, D., & Navrot, N. (2016). Meta-analysis of the core aroma components of

grape and wine aroma. Frontiers in Plant Science, 7(September 2016), 1472.

https://doi.org/10.3389/fpls.2016.01472

Inês, A., & Falco, V. (2018). Lactic Acid Bacteria Contribution to Wine Quality and Safety. In

Generation of Aromas and Flavours. IntechOpen. https://doi.org/10.5772/intechopen.81168

Jackson, D. I., & Lombard, P. B. (1993). Environmental and Management Practices Affecting

Grape Composition and Wine Quality - A Review. American Journal of Enology and

Viticulture, 44(4).

Janda, J. M., & Abbott, S. L. (2007). 16S rRNA gene sequencing for bacterial identification in the

diagnostic laboratory: Pluses, perils, and pitfalls. In Journal of Clinical Microbiology (Vol.

45, Issue 9, pp. 2761–2764). https://doi.org/10.1128/JCM.01228-07

Janssen, S., McDonald, D., Gonzalez, A., Navas-Molina, J. A., Jiang, L., Xu, Z. Z., Winker, K.,

Kado, D. M., Orwoll, E., Manary, M., Mirarab, S., & Knight, R. (2018). Phylogenetic

Placement of Exact Amplicon Sequences Improves Associations with Clinical Information.

MSystems, 3(3). https://doi.org/10.1128/msystems.00021-18

John Dunham & Associates. (2019). 2018 Economic Impact Study of the Pennsylvania Wine &

Grape Industries. https://pennsylvaniawine.com/wp-content/uploads/2019/09/PAWines_EcoImpact_FULL_2018.pdf

John Hartman, & Julie Beale. (2008). Powdery Mildew of Grape.

http://www.ca.uky.edu/agc/pubs/id/id21/

Johnsen, L. G., Skou, P. B., Khakimov, B., & Bro, R. (2017). Gas chromatography – mass

spectrometry data processing made easy. Journal of Chromatography A, 1503, 57–64.

https://doi.org/10.1016/j.chroma.2017.04.052

Jood, I., Hoff, J. W., & Setati, M. E. (2017). Evaluating fermentation characteristics of

Kazachstania spp. and their potential influence on wine quality. World Journal of

Microbiology and Biotechnology, 33(7). https://doi.org/10.1007/s11274-017-2299-1

Joseph, C. M. L., Albino, E. A., Ebeler, S. E., & Bisson, L. F. (2015). Brettanomyces bruxellensis

Aroma-Active compounds determined by SPME GC-MS olfactory analysis. American

Journal of Enology and Viticulture, 66(3), 379–387. https://doi.org/10.5344/ajev.2015.14073

Julius Kühn-Institut. (2020, June 27). CHAMBOURCIN. Julius Kühn-Institut - Federal Research

Centre for Cultivated Plants (JKI), Institute for Grapevine Breeding - Geilweilerhof (ZR).

http://www.vivc.de/?r=passport%2Fview&id=2436

Kecskeméti, E., Berkelmann-Löhnertz, B., & Reineke, A. (2016). Are Epiphytic Microbial

Communities in the Carposphere of Ripening Grape Clusters (Vitis vinifera L.) Different

between Conventional, Organic, and Biodynamic Grapes? PLOS ONE, 11(8), e0160852.

https://doi.org/10.1371/journal.pone.0160852

Knight, S., Klaere, S., Fedrizzi, B., & Goddard, M. R. (2015). Regional microbial signatures

positively correlate with differential wine phenotypes: Evidence for a microbial aspect to

terroir. Scientific Reports, 5(1), 1–10. https://doi.org/10.1038/srep14233

Koren, O., Knights, D., Gonzalez, A., Waldron, L., & Segata, N. (2013). A Guide to Enterotypes

across the Human Body: Meta-Analysis of Microbial Community Structures in Human

Microbiome Datasets. PLoS Comput Biol, 9(1), 1002863.

https://doi.org/10.1371/journal.pcbi.1002863

Kruskal, W. H., & Wallis, W. A. (1952). Use of Ranks in One-Criterion Variance Analysis. Journal

of the American Statistical Association, 47(260), 583–621.

https://doi.org/10.1080/01621459.1952.10483441

Kuhn, M. (2008). Building Predictive Models in R Using the caret Package. Journal of Statistical

Software, 28(5), 1–26. https://stackoverflow.com/questions/41246134/citing-caret-r-package-in-apa-style

Lalezari, I., Shafiee, A., & Nasseri‐Nouri, P. (1973). Isolation of alpinigenine from Papaver

bracteatum. Journal of Pharmaceutical Sciences, 62(10), 1718–1718.

https://doi.org/10.1002/jps.2600621033

Latorre, B. A., Briceño, E. X., & Torres, R. (2011). Increase in Cladosporium spp. populations and

rot of wine grapes associated with leaf removal. Crop Protection, 30(1), 52–56.

https://doi.org/10.1016/j.cropro.2010.08.022

Lederer, M. A., Nielsen, D. S., Toldam-Andersen, T. B., Herrmann, J. V., & Arneborg, N. (2013).

Yeast species associated with different wine grape varieties in Denmark. Acta Agriculturae

Scandinavica Section B: Soil and Plant Science, 63(1), 89–96.

https://doi.org/10.1080/09064710.2012.723738

Li, H. M., Jiang, D. Q., Dai, Z. G., Zhang, Y. S., Zhang, Y., Sun, S. Y., & Zhao, Y. P. (2019).

Aromatic property of cherry wine produced by malolactic fermentation of controlled and

spontaneous on the bacterial evolution. International Journal of Food Properties, 22(1),

1270–1282. https://doi.org/10.1080/10942912.2019.1640736

Li, X., Trivedi, U., Brejnrod, A. D., Vestergaard, G., Mortensen, M. S., Bertelsen, M. F.,

& Sørensen, S. J. (2020). The microbiome of captive hamadryas baboon. BioRxiv,

2020.01.10.901256. https://doi.org/10.1101/2020.01.10.901256

Liu, X., Zhang, S., Jiang, Q., Bai, Y., Shen, G., Li, S., & Ding, W. (2016). Using community

analysis to explore bacterial indicators for disease suppression of tobacco bacterial wilt.

Scientific Reports, 6(1), 1–11. https://doi.org/10.1038/srep36773

Lonvaud-Funel, A. (1999). Lactic acid bacteria in the quality improvement and depreciation of

wine. Antonie van Leeuwenhoek, International Journal of General and Molecular

Microbiology, 76(1–4), 317–331. https://doi.org/10.1023/A:1002088931106

Loo, W. T., García-Loor, J., Dudaniec, R. Y., Kleindorfer, S., & Cavanaugh, C. M. (2019). Host

phylogeny, diet, and habitat differentiate the gut microbiomes of Darwin’s finches on Santa

Cruz Island. Scientific Reports, 9(1), 1–12. https://doi.org/10.1038/s41598-019-54869-6

Lorenzini, M., Cappello, M. S., & Zapparoli, G. (2015). Isolation of Neofusicoccum parvum from

withered grapes: strain characterization, pathogenicity and its detrimental effects on passito

wine aroma. Journal of Applied Microbiology, 119(5), 1335–1344.

https://doi.org/10.1111/jam.12931

Lozupone, C. A., Hamady, M., Kelley, S. T., & Knight, R. (2007). Quantitative and qualitative β

diversity measures lead to different insights into factors that structure microbial communities.

In Applied and Environmental Microbiology (Vol. 73, Issue 5, pp. 1576–1585). American

Society for Microbiology. https://doi.org/10.1128/AEM.01996-06

Lozupone, C., & Knight, R. (2005). UniFrac: A new phylogenetic method for comparing microbial

communities. Applied and Environmental Microbiology, 71(12), 8228–8235.

https://doi.org/10.1128/AEM.71.12.8228-8235.2005

Luo, Z., Walkey, C. J., Madilao, L. L., Measday, V., & van Vuuren, H. J. J. (2013). Functional

improvement of Saccharomyces cerevisiae to reduce volatile acidity in wine. FEMS Yeast

Research, 13(5), 485–494. https://doi.org/10.1111/1567-1364.12053

Madhaiyan, M., Hu, C. J., Roy, J. J., Kim, S.-J., Weon, H.-Y., Kwon, S.-W., Ji, L., & Ji, C. L.

(2013). IUMS Printed in Great Britain. International Journal of Systematic and Evolutionary

Microbiology, 63, 1702–1708. https://doi.org/10.1099/ijs.0.041020-0

Maicas, S., & Mateo, J. J. (2005). Hydrolysis of terpenyl glycosides in grape juice and other fruit

juices: A review. In Applied Microbiology and Biotechnology (Vol. 67, Issue 3, pp. 322–335).

Appl Microbiol Biotechnol. https://doi.org/10.1007/s00253-004-1806-0

Mamlouk, D., & Gullo, M. (2013). Acetic Acid Bacteria: Physiology and Carbon Sources

Oxidation. Indian Journal of Microbiology, 53(4), 377–384. https://doi.org/10.1007/s12088-013-0414-z

Marchesi, J. R., & Ravel, J. (2015). The vocabulary of microbiome research: a proposal.

Microbiome, 3(1), 31. https://doi.org/10.1186/s40168-015-0094-5

Marinetti, L. (2007). History and Pharmacology of γ‐Hydroxybutyric Acid. In R. L. Bertholf & R.

E. Winecker (Eds.), Chromatographic Methods in Clinical Chemistry and Toxicology (pp.

197–216). John Wiley & Sons, Ltd. https://doi.org/10.1002/9780470023129

Marlowe, B., & Bauman, M. (2019). Terroir Tourism: Experiences in Organic Vineyards.

Beverages, 5(2), 30. https://doi.org/10.3390/beverages5020030

Martin Maechler, Peter Rousseeuw, Anja Struyf, Mia Hubert, Kurt Hornik, Matthias Studer, Pierre

Roudier, Juan Gonzalez, Kamil Kozlowski, Erich Schubert, & Keefe Murphy. (2019).

“Finding Groups in Data”: Cluster Analysis Extended Rousseeuw et al. https://svn.r-project.org/R-packages/trunk/cluster

Masneuf-Pomarede, I., Bely, M., Marullo, P., & Albertin, W. (2016). The genetics of non-

conventional wine yeasts: Current knowledge and future challenges. In Frontiers in

Microbiology (Vol. 6, Issue JAN, p. 1563). Frontiers Media S.A.

https://doi.org/10.3389/fmicb.2015.01563

Mauricio, J. C., Moreno, J., Zea, L., Ortega, J. M., & Medina, M. (1997). The effects of grape must

fermentation conditions on volatile alcohols and esters formed by Saccharomyces cerevisiae.

Journal of the Science of Food and Agriculture, 75(2), 155–160.

https://doi.org/10.1002/(SICI)1097-0010(199710)75:2<155::AID-JSFA853>3.0.CO;2-S

Maurus Brown. (2000). Grant Program To Increase Wine Grape Production in Ohio. Journal of

Extension, 38.

McArdle, B. H., & Anderson, M. J. (2001). Fitting multivariate models to community data: A

comment on distance‐based redundancy analysis. Ecology, 82(1), 290–297.

https://doi.org/10.1890/0012-9658(2001)082[0290:FMMTCD]2.0.CO;2

McCoy, C. O., & Matsen, F. A. (2013). Abundance-weighted phylogenetic diversity measures

distinguish microbial community states and are robust to sampling depth. PeerJ, 2013(1).

https://doi.org/10.7717/peerj.157

Mendes, R., Kruijt, M., De Bruijn, I., Dekkers, E., Van der Voort, M., Schneider, J. H. M., Piceno,

Y. M., DeSantis, T. Z., Andersen, G. L., Bakker, P. A. H. M., & Raaijmakers, J. M. (2011).

Deciphering the rhizosphere microbiome for disease-suppressive bacteria. Science,

332(6033), 1097–1100. https://doi.org/10.1126/science.1203980

Mezzasalma, V., Sandionigi, A., Bruni, I., Bruno, A., Lovicu, G., Casiraghi, M., & Labra, M.

(2017). Grape microbiome as a reliable and persistent signature of field origin and

environmental conditions in Cannonau wine production. PLOS ONE, 12(9), e0184615.

https://doi.org/10.1371/journal.pone.0184615

Mirarab, S., Nguyen, N., & Warnow, T. (2012). SEPP: SATé-enabled phylogenetic placement.

Pacific Symposium on Biocomputing, 247–258.

https://doi.org/10.1142/9789814366496_0024

Mizrahi-Man, O., Davenport, E. R., & Gilad, Y. (2013). Taxonomic Classification of Bacterial 16S

rRNA Genes Using Short Sequencing Reads: Evaluation of Effective Study Designs. PLoS

ONE, 8(1). https://doi.org/10.1371/journal.pone.0053608

MKF Research LLC. (2009). The Economic Impact of Pennsylvania Wine and Grapes: Update

2007. Pennsylvania Winery Association. https://pennsylvaniawine.com/wp-content/uploads/2017/04/PAWines_2007EconomicImpactReport.pdf

Morgan, H. H., du Toit, M., & Setati, M. E. (2017). The grapevine and wine microbiome: Insights

from high-throughput amplicon sequencing. In Frontiers in Microbiology (Vol. 8, Issue

MAY). Frontiers Media S.A. https://doi.org/10.3389/fmicb.2017.00820

Moyer, C. L., Tiedje, J. M., Dobbs, F. C., & Karl, D. M. (1998). Diversity of deep-sea hydrothermal

vent archaea from Loihi Seamount, Hawaii. Deep-Sea Research Part II: Topical Studies in

Oceanography, 45(1–3), 303–317. https://doi.org/10.1016/S0967-0645(97)00081-7

Nahm, F. S. (2016). Nonparametric statistical tests for the continuous data: The basic concept and

the practical use. Korean Journal of Anesthesiology, 69(1), 8–14.

https://doi.org/10.4097/kjae.2016.69.1.8

Nielsen, J. C., & Richelieu, M. (1999). Control of flavor development in wine during and after

malolactic fermentation by Oenococcus oeni. Applied and Environmental Microbiology,

65(2), 740–745. http://www.ncbi.nlm.nih.gov/pubmed/9925610

Nikolaou, E., Soufleros, E. H., Bouloumpasi, E., & Tzanetakis, N. (2006). Selection of indigenous

Saccharomyces cerevisiae strains according to their oenological characteristics and

vinification results. Food Microbiology, 23, 205–211.

https://doi.org/10.1016/j.fm.2005.03.004

Nilsson, R. H., Larsson, K. H., Taylor, A. F. S., Bengtsson-Palme, J., Jeppesen, T. S., Schigel, D.,

Kennedy, P., Picard, K., Glöckner, F. O., Tedersoo, L., Saar, I., Kõljalg, U., & Abarenkov, K.

(2019). The UNITE database for molecular identification of fungi: Handling dark taxa and

parallel taxonomic classifications. Nucleic Acids Research, 47(D1), D259–D264.

https://doi.org/10.1093/nar/gky1022

Nilsson, R. H., Ryberg, M., Abarenkov, K., Sjökvist, E., & Kristiansson, E. (2009). The ITS

region as a target for characterization of fungal communities using emerging sequencing

technologies. FEMS Microbiology Letters, 296(1), 97–101. https://doi.org/10.1111/j.1574-6968.2009.01618.x

Nipperess, D. A. (2016). The Rarefaction of Phylogenetic Diversity: Formulation, Extension and

Application (pp. 197–217). Springer, Cham. https://doi.org/10.1007/978-3-319-22461-9_10

NIST Chemistry WebBook SRD 69. (2018). 2-Hexenoic acid, (E)-. NIST Standard Reference Data;

National Institute of Standards and Technology.

https://webbook.nist.gov/cgi/cbook.cgi?ID=C13419697

O’Bryon, I. (2018). Analysis of Grape Berry Epiphytic Microbiomes via QIIME.

https://scholarworks.rit.edu/theses

Olawore, N. O., Usman, L. A., Ogunwande, I. A., & Adeleke, K. A. (2006). Constituents of

Rhizome Essential Oils of Two Types of Cyperus articulatus L. Grown in Nigeria. Journal

of Essential Oil Research, 18(6), 604–606. https://doi.org/10.1080/10412905.2006.9699179

Pandit, R. J., Hinsu, A. T., Patel, N. V., Koringa, P. G., Jakhesara, S. J., Thakkar, J. R., Shah, T.

M., Limon, G., Psifidi, A., Guitian, J., Hume, D. A., Tomley, F. M., Rank, D. N., Raman, M.,

Tirumurugaan, K. G., Blake, D. P., & Joshi, C. G. (2018). Microbial diversity and community

composition of caecal microbiota in commercial and indigenous Indian chickens determined

using 16s rDNA amplicon sequencing. Microbiome, 6(1), 115.

https://doi.org/10.1186/s40168-018-0501-9

Parada, A. E., Needham, D. M., & Fuhrman, J. A. (2016). Every base matters: Assessing small

subunit rRNA primers for marine microbiomes with mock communities, time series and

global field samples. Environmental Microbiology, 18(5), 1403–1414.

https://doi.org/10.1111/1462-2920.13023

Paulson, J. N., Stine, O. C., Bravo, H. C., & Pop, M. (2013).

Differential abundance analysis for microbial marker-gene surveys. Nature Methods, 10(12),

1200–1202. https://doi.org/10.1038/nmeth.2658

Pearson, T., Caporaso, J. G., Yellowhair, M., Bokulich, N. A., Padi, M., Roe, D. J., Wertheim, B.

C., Linhart, M., Martinez, J. A., Bilagody, C., Hornstra, H., Alberts, D. S., Lance, P.,

& Thompson, P. A. (2019). Effects of ursodeoxycholic acid on the gut microbiome and

colorectal adenoma development. Cancer Medicine, 8(2), 617–628.

https://doi.org/10.1002/cam4.1965

Peng, D. X., & Lai, F. (2012). Using partial least squares in operations management research: A

practical guideline and summary of past research. Journal of Operations Management, 30(6),

467–480. https://doi.org/10.1016/j.jom.2012.06.002

Pérez, G., Fariña, L., Barquet, M., Boido, E., Gaggero, C., Dellacassa, E., & Carrau, F. (2011). A

quick screening method to identify β-glucosidase activity in native wine yeast strains:

Application of Esculin Glycerol Agar (EGA) medium. World Journal of Microbiology and

Biotechnology, 27(1), 47–55. https://doi.org/10.1007/s11274-010-0425-4

Piao, H., Hawley, E., Kopf, S., DeScenzo, R., Sealock, S., Henick-Kling, T., & Hess, M. (2015).

Insights into the bacterial community and its temporal succession during the fermentation of

wine grapes. Frontiers in Microbiology, 6(JUL), 809.

https://doi.org/10.3389/fmicb.2015.00809

Pielou, E. C. (1966). The measurement of diversity in different types of biological collections.

Journal of Theoretical Biology, 13(C), 131–144.

https://doi.org/10.1016/0022-5193(66)90013-0

Pinto, C., Pinho, D., Cardoso, R., Custódio, V., Fernandes, J., Sousa, S., Pinheiro, M., Egas, C.,

& Gomes, A. C. (2015). Wine fermentation microbiome: A landscape from different

Portuguese wine appellations. Frontiers in Microbiology, 6(SEP).

https://doi.org/10.3389/fmicb.2015.00905

Pitt, W. M., Sosnowski, M. R., Huang, R., Qiu, Y., Steel, C. C., & Savocchia, S. (2012). Evaluation

of fungicides for the management of Botryosphaeria canker of grapevines. Plant Disease,

96(9), 1303–1308. https://doi.org/10.1094/PDIS-11-11-0998-RE

Polášková, P., Herszage, J., & Ebeler, S. E. (2008). Wine flavor: Chemistry in a glass. Chemical

Society Reviews, 37(11), 2478–2489. https://doi.org/10.1039/b714455p

Poos, M. S., Walker, S. C., & Jackson, D. A. (2009). Functional-diversity indices can be driven by

methodological choices and species richness. Ecology, 90(2), 341–347.

https://doi.org/10.1890/08-1638.1

Pretorius, I. S. (2020). Tasting the terroir of wine yeast innovation. FEMS Yeast Research, 20(1).

https://doi.org/10.1093/femsyr/foz084

PubChem. (2005). cis-3-Hexen-1-ol. National Library of Medicine.

https://pubchem.ncbi.nlm.nih.gov/compound/cis-3-Hexen-1-ol

PubChem 519129. (2020, June 6). Ethyl 2-hexenoate. National Library of Medicine.

https://pubchem.ncbi.nlm.nih.gov/compound/Hexenoic-acid_-ethyl-ester

Quast, C., Pruesse, E., Yilmaz, P., Gerken, J., Schweer, T., Yarza, P., Peplies, J., & Glöckner, F.

O. (2013). The SILVA ribosomal RNA gene database project: Improved data processing and

web-based tools. Nucleic Acids Research, 41(D1). https://doi.org/10.1093/nar/gks1219

R Core Team. (2013). R: A language and environment for statistical computing. R Foundation

for Statistical Computing, Vienna, Austria. http://www.r-project.org/

Reed, G., & Nagodawithana, T. W. (1988). Technology of Yeast Usage in Winemaking. American

Journal of Enology and Viticulture, 39(1).

Reis-Filho, J. S. (2009). Next-generation sequencing. Breast Cancer Research, 11(SUPPL. 3),

1–7. https://doi.org/10.1186/bcr2431

Reisch, B., Pool, R., Peterson, D., Martens, M.-H., & Henick-Kling, T. (1993). Wine and Juice

Grape Varieties for Cool Climates. https://ecommons.cornell.edu/handle/1813/17814

Renouf, V., Claisse, O., & Lonvaud-Funel, A. (2005). Understanding the microbial ecosystem on

the grape berry surface through numeration and identification of yeast and bacteria.

Australian Journal of Grape and Wine Research, 11(3), 316–327.

https://doi.org/10.1111/j.1755-0238.2005.tb00031.x

Ribéreau-Gayon, P. (1985). New Developments In Wine Microbiology. American Journal of

Enology and Viticulture, 36(1), 1–10.

Rivera-Pinto, J., Egozcue, J. J., Pawlowsky-Glahn, V., Paredes, R., Noguera-Julian, M., & Calle,

M. L. (2018). Balances: a New Perspective for Microbiome Analysis. MSystems, 3(4).

https://doi.org/10.1128/msystems.00053-18

Rivers, A. R., Weber, K. C., Gardner, T. G., Liu, S., & Armstrong, S. D. (2018). ITSxpress:

Software to rapidly trim internally transcribed spacer sequences with quality scores for

marker gene analysis. F1000Research, 7, 1418.

https://doi.org/10.12688/f1000research.15704.1

Romano, P., Ciani, M., & Fleet, G. H. (Eds.). (2019). Yeasts in the Production of Wine.

Springer New York. https://doi.org/10.1007/978-1-4939-9782-4

Rood, D. (1997). Gas Chromatography Problem Solving and Troubleshooting. Journal of

Chromatographic Science, 35, 136–137.

Roullier-Gall, C., Boutegrabet, L., Gougeon, R. D., & Schmitt-Kopplin, P. (2014). A grape and

wine chemodiversity comparison of different appellations in Burgundy: Vintage vs terroir

effects. Food Chemistry, 152, 100–107. https://doi.org/10.1016/j.foodchem.2013.11.056

Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster

analysis. Journal of Computational and Applied Mathematics, 20(C), 53–65.

https://doi.org/10.1016/0377-0427(87)90125-7

Saerens, S. M. G., Delvaux, F. R., Verstrepen, K. J., & Thevelein, J. M. (2010). Production and

biological function of volatile esters in Saccharomyces cerevisiae. Microbial Biotechnology,

3(2), 165–177. https://doi.org/10.1111/j.1751-7915.2009.00106.x

Salvetti, E., Campanaro, S., Campedelli, I., Fracchetti, F., Gobbi, A., Tornielli, G. B., Torriani, S.,

& Felis, G. E. (2016). Whole-metagenome-sequencing-based community profiles of Vitis

vinifera L. cv. Corvina berries withered in two post-harvest conditions. Frontiers in

Microbiology, 7(JUN). https://doi.org/10.3389/fmicb.2016.00937

Samarakoon, T., Wang, S. Y., & Alford, M. H. (2013). Enhancing PCR Amplification of DNA

from Recalcitrant Plant Specimens Using a Trehalose-Based Additive. Applications in Plant

Sciences, 1(1), 1200236. https://doi.org/10.3732/apps.1200236

Santos, J. A., Fraga, H., Malheiro, A. C., Moutinho-Pereira, J., Dinis, L.-T., Correia, C., Moriondo,

M., Leolini, L., Dibari, C., Costafreda-Aumedes, S., Kartschall, T., Menz, C., Molitor, D.,

Junk, J., Beyer, M., & Schultz, H. R. (2020). A Review of the Potential Climate Change

Impacts and Adaptation Options for European Viticulture. Applied Sciences, 10(9), 3092.

https://doi.org/10.3390/app10093092

Savitzky, A., & Golay, M. J. E. (1964). Smoothing and Differentiation of Data by Simplified Least

Squares Procedures. Analytical Chemistry, 36(8), 1627–1639.

https://doi.org/10.1021/ac60214a047

Schoch, C. L., Seifert, K. A., Huhndorf, S., Robert, V., Spouge, J. L., Levesque, C. A., Chen, W.,

Bolchacova, E., Voigt, K., Crous, P. W., Miller, A. N., Wingfield, M. J., Aime, M. C., An, K.

D., Bai, F. Y., Barreto, R. W., Begerow, D., Bergeron, M. J., Blackwell, M., … Schindel, D.

(2012). Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA

barcode marker for Fungi. Proceedings of the National Academy of Sciences of the United

States of America, 109(16), 6241–6246. https://doi.org/10.1073/pnas.1117018109

Schueuermann, C., Khakimov, B., Engelsen, S. B., Bremer, P., & Silcock, P. (2016). GC-MS

Metabolite Profiling of Extreme Southern Pinot noir Wines: Effects of Vintage, Barrel

Maturation, and Fermentation Dominate over Vineyard Site and Clone Selection. Journal of

Agricultural and Food Chemistry, 64(11), 2342–2351.

https://doi.org/10.1021/acs.jafc.5b05861

Segata, N., Izard, J., Waldron, L., Gevers, D., Miropolsky, L., Garrett, W. S., & Huttenhower, C.

(2011). Metagenomic biomarker discovery and explanation. Genome Biology, 12(6), R60.

https://doi.org/10.1186/gb-2011-12-6-r60

Sharma, A. K., Singh, P. N., & Sawant, S. D. (2012). Evaluation of Fermentation Efficiency of

Yeast Strains and their Effect on Quality of Young Wines. Indian Journal of Microbiology,

52(3), 495–499. https://doi.org/10.1007/s12088-011-0226-y

Burke, S. (1998). Missing Values, Outliers, Robust Statistics & Non-parametric Methods. In

National Measurement System Valid Analytical Measurement Programme.

https://www.webdepot.umontreal.ca/Usagers/sauves/MonDepotPublic/CHM 3103/LCGC Eur Burke 2001 - 4 de 4.pdf

Sipiczki, M. (2003). Candida zemplinina sp. nov., an osmotolerant and psychrotolerant yeast that

ferments sweet botrytized wines. International Journal of Systematic and Evolutionary

Microbiology, 53(6), 2079–2083. https://doi.org/10.1099/ijs.0.02649-0

Slegers, A., Angers, P., Ouellet, É., Truchon, T., & Pedneault, K. (2015). Volatile

Compounds from Grape Skin, Juice and Wine from Five Interspecific Hybrid Grape Cultivars

Grown in Québec (Canada) for Wine Production. Molecules, 20, 10980–11016.

https://doi.org/10.3390/molecules200610980

Smit, B. A., Engels, W. J. M., & Smit, G. (2009). Branched chain aldehydes: Production and

breakdown pathways and relevance for flavour in foods. In Applied Microbiology and

Biotechnology (Vol. 81, Issue 6, pp. 987–999). Springer.

https://doi.org/10.1007/s00253-008-1758-x

Spano, G., & Torriani, S. (2016). Editorial: Microbiota of grapes: Positive and negative role on

wine quality. In Frontiers in Microbiology (Vol. 7, Issue DEC, p. 2036). Frontiers Research

Foundation. https://doi.org/10.3389/fmicb.2016.02036

Steensels, J., & Verstrepen, K. J. (2014). Taming Wild Yeast: Potential of Conventional and

Nonconventional Yeasts in Industrial Fermentations. Annual Review of Microbiology, 68(1),

61–80. https://doi.org/10.1146/annurev-micro-091213-113025

Styger, G., Prior, B., & Bauer, F. F. (2011). Wine flavor and aroma. In Journal of Industrial

Microbiology and Biotechnology (Vol. 38, Issue 9, pp. 1145–1159). Springer.

https://doi.org/10.1007/s10295-011-1018-4

Swiegers, J. H., Bartowsky, E. J., Henschke, P. A., & Pretorius, I. S. (2005). Yeast and bacterial

modulation of wine aroma and flavour. Australian Journal of Grape and Wine Research,

11(2), 139–173. https://doi.org/10.1111/j.1755-0238.2005.tb00285.x

Teixeira, A., Eiras-Dias, J., Castellarin, S. D., & Gerós, H. (2013). Berry phenolics of grapevine

under challenging environments. In International Journal of Molecular Sciences (Vol. 14,

Issue 9, pp. 18711–18739). Multidisciplinary Digital Publishing Institute (MDPI).

https://doi.org/10.3390/ijms140918711

Tettemer, J. (2017). Pennsylvania Grape Guide. Pennsylvania Winery Association.

https://pennsylvaniawine.com/wp-content/uploads/2017/04/PAWinesGrapeGuide_longform.pdf

Thompson, L., DeVito, C., & Vigna, P. (2019). Pennsylvania Wine: Fifty Years of Progress.

https://www.i-winereview.com/reports/Pennsylvania-i-WineReview-R74-intro.pdf

Ugliano, M., & Moio, L. (2006). The influence of malolactic fermentation and Oenococcus oeni

strain on glycosidic aroma precursors and related volatile compounds of red wine. Journal of

the Science of Food and Agriculture, 86(14), 2468–2476. https://doi.org/10.1002/jsfa.2650

Úrbez-Torres, J. R., Leavitt, G. M., Voegel, T. M., & Gubler, W. D. (2006). Identification and

distribution of Botryosphaeria spp. associated with grapevine cankers in California. Plant

Disease, 90(12), 1490–1503. https://doi.org/10.1094/PD-90-1490

Valero, E., Schuller, D., Cambon, B., Casal, M., & Dequin, S. (2005).

Dissemination and survival of commercial wine yeast in the vineyard: A large-scale,

three-years study. FEMS Yeast Research, 5(10), 959–969.

https://doi.org/10.1016/j.femsyr.2005.04.007

van Leeuwen, C., & Darriet, P. (2016). The Impact of Climate Change on Viticulture and Wine

Quality. Journal of Wine Economics, 11(1), 150–167. https://doi.org/10.1017/jwe.2015.21

van Leeuwen, C., & Seguin, G. (2006). The concept of terroir in viticulture. Journal of Wine

Research, 17(1), 1–10. https://doi.org/10.1080/09571260600633135

Verginer, M., Leitner, E., & Berg, G. (2010). Production of volatile metabolites by

grape-associated microorganisms. Journal of Agricultural and Food Chemistry, 58(14), 8344–8350.

https://doi.org/10.1021/jf100393w

Versari, A., Parpinello, G. P., & Cattaneo, M. (1999). Leuconostoc oenos and malolactic

fermentation in wine: A review. In Journal of Industrial Microbiology and Biotechnology

(Vol. 23, Issue 6, pp. 447–455). Nature Publishing Group.

https://doi.org/10.1038/sj.jim.2900733

Viana, F., Gil, J. V., Vallés, S., & Manzanares, P. (2009). Increasing the levels of 2-phenylethyl

acetate in wine through the use of a mixed culture of Hanseniaspora osmophila and

Saccharomyces cerevisiae. International Journal of Food Microbiology, 135(1), 68–74.

https://doi.org/10.1016/j.ijfoodmicro.2009.07.025

Vilanova, M., Genisheva, Z., Bescansa, L., Masa, A., & Oliveira, J. M. (2012). Changes in free and

bound fractions of aroma compounds of four Vitis vinifera cultivars at the last ripening stages.

Phytochemistry, 74, 196–205. https://doi.org/10.1016/j.phytochem.2011.10.004

Vilanova, M., Genisheva, Z., Masa, A., & Oliveira, J. M. (2010). Correlation between volatile

composition and sensory properties in Spanish Albariño wines. Microchemical Journal, 95(2),

240–246. https://doi.org/10.1016/j.microc.2009.12.007

Vojinovic, D., Radjabzadeh, D., Kurilshikov, A., Amin, N., Wijmenga, C., Franke, L., Ikram, M.

A., Uitterlinden, A. G., Zhernakova, A., Fu, J., Kraaij, R., & van Duijn, C. M. (2019).

Relationship between gut microbiota and circulating metabolites in population-based cohorts.

Nature Communications, 10(1), 1–7. https://doi.org/10.1038/s41467-019-13721-1

Waterhouse, A. L., Sacks, G. L., & Jeffery, D. W. (2016). Understanding Wine Chemistry. In

Understanding Wine Chemistry (1st ed.). John Wiley & Sons, Ltd.

https://doi.org/10.1002/9781118730720

Welke, J. E., Zanus, M., Lazzarotto, M., & Alcaraz Zini, C. (2014). Quantitative analysis of

headspace volatile compounds using comprehensive two-dimensional gas chromatography

and their contribution to the aroma of Chardonnay wine. Food Research International, 59,

85–99. https://doi.org/10.1016/j.foodres.2014.02.002

Wenig, P., & Odermatt, J. (2010). OpenChrom: A cross-platform open source software for the mass

spectrometric analysis of chromatographic data. BMC Bioinformatics, 11.

https://doi.org/10.1186/1471-2105-11-405

Willis, A. D. (2019). Rarefaction, Alpha Diversity, and Statistics. Frontiers in Microbiology,

10(OCT), 2407. https://doi.org/10.3389/fmicb.2019.02407

Xu, C. H., Chen, G. S., Xiong, Z. H., Fan, Y. X., Wang, X. C., & Liu, Y. (2016). Applications of

solid-phase microextraction in food analysis. In TrAC - Trends in Analytical Chemistry (Vol.

80, pp. 12–29). Elsevier B.V. https://doi.org/10.1016/j.trac.2016.02.022

Yang, R.-H., Su, J.-H., Shang, J.-J., Wu, Y.-Y., Li, Y., Bao, D.-P., & Yao, Y.-J. (2018). Evaluation

of the ribosomal DNA internal transcribed spacer (ITS), specifically ITS1 and ITS2, for the

analysis of fungal diversity by deep sequencing. PLOS ONE, 13(10), e0206428.

https://doi.org/10.1371/journal.pone.0206428

Zarraonaindia, I., & Gilbert, J. A. (2015). Understanding grapevine-microbiome interactions:

Implications for viticulture industry. In Microbial Cell (Vol. 2, Issue 5, pp. 171–173). Shared

Science Publishers OG. https://doi.org/10.15698/mic2015.05.204

Zhang, H., Xu, X., Chen, X., Yuan, F., Sun, B., Xu, Y., Yang, J., & Sun, D. (2017). Complete

genome sequence of the cellulose-producing strain Komagataeibacter nataicola RZS01.

Scientific Reports, 7(1). https://doi.org/10.1038/s41598-017-04589-6

Zhu, F., Du, B., & Li, J. (2016). Aroma Compounds in Wine. In Grape and Wine Biotechnology.

InTech. https://doi.org/10.5772/65102

Zott, K., Miot-Sertier, C., Claisse, O., Lonvaud-Funel, A., & Masneuf-Pomarede, I. (2008).

Dynamics and diversity of non-Saccharomyces yeasts during the early stages in winemaking.

International Journal of Food Microbiology, 125, 197–203.

https://doi.org/10.1016/j.ijfoodmicro.2008.04.001
