
ABSTRACT

CONDENSED TANNIN CHARACTERIZATION BY FT-ICR MALDI MASS SPECTROMETRY AND SEPARATION WITH SAW-TOOTH GRADIENT HPLC

by Savanah Gail Reeves

Condensed tannins (CT) are high molecular weight compounds composed of flavan-3-ol monomers linked by C-C interflavan bonds. They are found in plants and affect food astringency, protein availability, and biogeochemical cycles. Tannins are highly sorptive, making them difficult to analyze using existing methods. Recently, we developed methods using Fourier Transform Ion Cyclotron Resonance Matrix Assisted Laser Desorption/Ionization mass spectrometry (FT-ICR MALDI-MS) to obtain exact masses of polymer species. Using CT from Sorghum grain, we obtained spectra that exhibited peaks consistent with the previously proposed (epi)catechin composition. However, the spectra also exhibited peaks that suggested the tannin contained galloyl groups. Using thiolysis, methanolysis, NMR, and Fourier Transform Infrared Spectroscopy (FTIR), we confirmed that the tannin does not contain ester groups. Instead, we propose that the MALDI matrix 2,5-dihydroxybenzoic acid forms adducts with the tannin that survive ionization and yield misleading MS peaks. To further characterize CT, we are developing reversed phase High Performance Liquid Chromatography (RP-HPLC) methods that separate polymers based on molecular weight. This is an improvement over previous methods that give an unresolved “hump” for most tannin extracts. Combining MALDI analysis with better HPLC methods will improve the potential to establish the roles of CT in diverse environments.

CONDENSED TANNIN CHARACTERIZATION BY FT-ICR MALDI MASS SPECTROMETRY AND SEPARATION WITH SAW-TOOTH GRADIENT HPLC

Thesis

Submitted to the

Faculty of Miami University

in partial fulfillment of

the requirements for the degree of

Master of Science

by

Savanah Gail Reeves

Miami University

Oxford, Ohio

2020

Advisor: Dr. Ann E. Hagerman

Reader: Dr. Neil D. Danielson

Reader: Dr. Andrea N. Kravats

Reader: Dr. Heeyoung Tai

©2020 Savanah Gail Reeves

This thesis titled

CONDENSED TANNIN CHARACTERIZATION BY FT-ICR MALDI MASS SPECTROMETRY AND SEPARATION WITH SAW-TOOTH GRADIENT HPLC

by

Savanah Gail Reeves

has been approved for publication by

College of Arts and Science

and

Department of Chemistry and Biochemistry

______Dr. Ann E. Hagerman

______Dr. Neil D. Danielson

______Dr. Andrea N. Kravats

______Dr. Heeyoung Tai

Table of Contents

Chapter 1: Introduction

Chapter 2: Characterization of high molecular weight polyphenols by FT-ICR MALDI-MS and HPLC

Chapter 3: Improving RP-HPLC separation of condensed tannins

Chapter 4: Conclusion


List of Tables

Table 1. Thiolysis and HSQC NMR structural data for condensed tannin from Sorghum bicolor, Neptunia lutea, and cocoa.

Table 2. FT-ICR MALDI-MS data for condensed tannin from Sorghum bicolor, Neptunia lutea, and cocoa.

Table 3. FT-ICR ESI-MS/MS data of the peak at 864.69266 m/z for condensed tannin from Sorghum bicolor.

Table 4. Preliminary analysis of main peaks for cocoa and Sorghum CT separated by three HPLC solvent methods.

Table 5. Thiolysis data of Sorghum CT HPLC fractions.


List of Figures

Figure 1. Condensed and hydrolyzable tannins.

Figure 2. Diversity of condensed tannins.

Figure 3. FTIR spectra of EGCg, Neptunia lutea CT, catechin, cocoa CT, and Sorghum CT.

Figure 4. Methanolysis HPLC chromatogram.

Figure 5. 1H NMR spectra of DHB and DHB-Sorghum CT mixture.

Figure 6. Chromatogram of cocoa CT separated by Method 1 with solvent gradient profile.

Figure 7. Chromatogram of Sorghum CT separated by Method 1 with solvent gradient profile.

Figure 8. Chromatogram of cocoa CT separated by Method 2 with solvent gradient profile.

Figure 9. Chromatogram of Sorghum CT separated by Method 2 with solvent gradient profile.

Figure 10. Chromatograms of reinjected fractions of Sorghum CT, analyzed by Method 2 with solvent gradient profile.

Figure 11. Chromatogram of cocoa CT separated by Method 3 with solvent gradient profile.

Figure 12. Chromatogram of Sorghum CT separated by Method 3 with solvent gradient profile.


Dedication

I dedicate my thesis work to my family and friends who helped me along the way.

I am grateful for the support and encouragement of my loving parents. They always push me to achieve my very best. I want to thank my family, with a special thanks to Macie.

Thank you to Brayden for always believing in me. I also want to thank my wonderful roommates.

I also dedicate this work to my advisor, Dr. Ann E. Hagerman. Without her mentorship, none of this would have been possible. I am so appreciative of her guidance and her patience throughout this process. I’m so lucky to have had the opportunity to learn from her.


Acknowledgements

Dr. Ann E. Hagerman

Dr. Arpad Somogyi

Dr. Wayne Zeller

Dr. Theresa Ramelot

Dr. Neil D. Danielson

Dr. Andrea N. Kravats

Dr. Heeyoung Tai

Dr. Andre J. Sommers

Rosie Magro

Mai Nguyen


Chapter 1: Introduction

As the fourth most abundant biochemical produced by plants, tannins play a large role in plant metabolism and interactions with other organisms1, 2. Tannins are a broad class of “water-soluble phenolic compounds having molecular weights between 500 and 3000 [Da ... with] the ability to precipitate alkaloids, gelatin and other proteins”3. The basic properties of tannins include antioxidant activity, protein binding, metal binding, and astringency. Tannins are highly bioactive compounds with diverse fates in diverse environments. When plants or fruits containing tannins abscise, the tannins end up in the soil or water. In these environments, tannins contribute to plant, animal, and microbial metabolism, as well as biogeochemical cycles1. When tannin-containing foods are eaten, tannins end up in the gut and digestive tract, contributing to human and animal metabolism4. The microbiome plays a major role in the conversion of complex molecules, such as tannins, in both physiological systems and ecosystems5-7. Greater understanding of tannin structure may help to elucidate the functions of tannins in metabolism and biogeochemical cycles.

There are two types of tannins: hydrolyzable and condensed tannins (Figure 1). Hydrolyzable tannins are derived from gallic acid; a core polyol such as glucose is esterified to gallic acid, with additional esterified or cross-linked galloyl groups (Figure 1b,c)8. These molecules are found in plants, including many that we consume, such as blackberries, pomegranates, and raspberries9-11. Hydrolyzable tannins are degraded under acidic or basic conditions, which cleave the ester bonds by hydrolysis or oxidation to yield smaller phenolics11. They are also degraded by enzymes, such as phenol oxidase12 and tannase enzymes4, 13. While hydrolyzable tannin degradation has been thoroughly investigated, very little is known concerning the degradation of condensed tannins14.
Condensed tannins (CTs, proanthocyanidins) are flavan-3-ol polymers with variations in hydroxylation patterns, cis- and trans-stereochemistry of C-ring substituents, interflavan bond connections (A or B type), mean degree of polymerization (mDP), and degree of esterification (Figure 1a, Figure 2)15. Condensed tannins, like hydrolyzable tannins, are present in plants including foods, such as grapes, persimmons, and Sorghum grain16-18. The C-C interflavan bonding between monomers of condensed tannin molecules makes them much less susceptible to degradation than the ester-linked hydrolyzable tannins. Therefore, the degradation of condensed tannins, including the mechanism and products, is currently unclear. While the pathways for transformation of the flavan-3-ol subunits of CT to metabolites are well established19, more research is needed to understand the initial steps of polymer decomposition. Degradation studies can afford greater understanding of the roles fulfilled by condensed tannins in diverse environments, and of how those contributions affect and are affected by other environmental factors.

In order to perform degradation studies, a large quantity of the target substrate is needed. For example, a soil microcosm experiment with one level of tannin amendment and all the appropriate controls requires 750 mg of tannin (McGivern, B. and Wrighton, K., personal communication). Obtaining a large amount of condensed tannin through extraction and purification techniques can be a difficult and time-consuming process. Our lab group, however, has optimized a technique that allows for purification of Sorghum CT with yields as high as 1.25 mg per gram of grain. Using this method, we routinely obtain 250 mg of tannin in just 4 days, making it feasible to obtain enough CT for biological degradation studies.

In order to trace the metabolic fate of tannin, it is necessary to have accurate structural information about the starting material. Hydrolyzable tannins are ideal targets for metabolic studies, because they accumulate in plants as discrete compounds that can be purified to homogeneity20. In contrast, condensed tannins are produced as a series of polymers with closely related structures that cannot easily be resolved into discrete compounds with current separation methods21.
The methods used in previous studies to determine the average composition of condensed tannin preparations include thiolysis, heteronuclear single quantum correlation (HSQC) NMR, Fourier Transform Ion Cyclotron Resonance Electrospray Ionization Mass Spectrometry (FT-ICR ESI-MS), Fourier Transform Infrared spectroscopy (FTIR), and Fourier Transform Ion Cyclotron Resonance Matrix Assisted Laser Desorption/Ionization mass spectrometry (FT-ICR MALDI-MS)18, 22, 23. This thesis describes how those methods were employed to establish that the Sorghum CT prepared in our lab is composed of epicatechin extender units with catechin terminal units and an average chain length of 16 (Figure 1a)18, 22.

Thiolysis relies on acid-catalyzed depolymerization of the CT polymer at the 4→8 (B-type) interflavan linkages, followed by nucleophilic attack by toluene-α-thiol on the carbocations formed from the extender units24. The terminal unit is released as an unmodified flavan-3-ol. High Performance Liquid Chromatography (HPLC) is then used to separate the components, which can be identified by comparison to standards (retention time, UV/Visible absorbance, mass spectrometry, and electrochemical response)25. Mean degree of polymerization (mDP) and procyanidin/prodelphinidin (PC/PD) ratios are calculated using peak areas25, 26. Thiolysis provides valuable structural information that can be verified through NMR and other techniques15. However, thiolysis does not work well for branched (4→6 interflavan linkage) condensed tannin polymers, or for those that contain A-type linkages27. Thiolysis also describes only the overall, average composition of the heterogeneous CT sample, not the exact structures of its individual components.
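The mDP calculation mentioned above can be illustrated with a minimal Python sketch. It assumes the usual definition, mDP = (moles of extender units + moles of terminal units) / moles of terminal units, and it assumes that each peak area has already been converted to moles with a response factor measured from a standard. The function names, dictionary keys, and numerical values are hypothetical and are not taken from the thesis data.

```python
def peak_area_to_moles(area, response_factor):
    """Convert an HPLC peak area to moles.

    response_factor is the peak area produced per mole of the matching
    standard (a hypothetical calibration value supplied by the analyst).
    """
    if response_factor <= 0:
        raise ValueError("response factor must be positive")
    return area / response_factor


def mean_degree_of_polymerization(extender_mol, terminal_mol):
    """Return mDP = (extender + terminal) / terminal, all amounts in moles.

    extender_mol: moles of each thioether (extender-derived) product
    terminal_mol: moles of each free flavan-3-ol (terminal-derived) product
    """
    total_terminal = sum(terminal_mol.values())
    if total_terminal <= 0:
        raise ValueError("no terminal units detected; mDP is undefined")
    total_extender = sum(extender_mol.values())
    return (total_extender + total_terminal) / total_terminal


# Illustrative amounts only: 15 extender units for every terminal unit.
extender = {"epicatechin thioether": 15.0}
terminal = {"catechin": 1.0}
print(mean_degree_of_polymerization(extender, terminal))  # 16.0
```

With 15 moles of extender product per mole of terminal unit, the sketch returns 16, the same form as the average chain length reported for Sorghum CT. Like thiolysis itself, this is an average over the whole sample and says nothing about the distribution of individual chain lengths.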

HSQC NMR determines subunit composition and mean degree of polymerization28. This method is especially useful for observing interflavan linkages, both A-type and B-type. It provides reliable and detailed structural information, and helps to verify results gathered through other methods. NMR, like thiolysis, provides overall, average structural information for the sample mixture. Although very useful, NMR analysis requires a large sample mass; it is difficult and very time-consuming to purify the amount of condensed tannin needed. In addition, HSQC NMR is a very specialized technique. Although a database of 1H and 13C chemical shifts of condensed tannins up to tetramers (in the literature until 2015) exists, interpretation by experts in condensed tannin NMR is necessary for identification28.

FT-ICR MALDI-MS is a high resolution, soft ionization technique that allows for more detailed analysis of exact tannin species composition. Peaks can be assigned as polymers containing specific subunits with identifiable linkage types. High resolution FT-ICR MALDI-MS is a relatively new technique for structural analysis of condensed tannins, and has proven to be extremely valuable, especially when used in tandem with fragmentation experiments. In order to gather high-resolution spectra, the analyst must test many different MALDI matrices, matrix-sample ratios, sample preparation and drying methods, and replicates. This is quite time-consuming. Improving the signal to noise ratio is also a challenge for condensed tannin samples. FT-ICR MALDI-MS is a specialized technique that requires an expert to perform the experiment and interpret the results. Furthermore, our investigation into verifying the structure of our Sorghum CT preparations demonstrated that CT samples can form supramolecular complexes with a commonly used MALDI matrix, 2,5-dihydroxybenzoic acid (DHB), to yield misleading MS peaks and patterns.
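To make the idea of assigning peaks from exact masses concrete, the following Python sketch computes theoretical monoisotopic masses for all-B-type procyanidin oligomers and the corresponding singly charged [M-H]- m/z values. It is an illustration under stated assumptions, not the assignment procedure used in this work: the subunit is taken as (epi)catechin (C15H14O6), each B-type interflavan bond is assumed to remove two hydrogen atoms, and the function and constant names are hypothetical. The mass increments for a galloyl ester (C7H4O4) and for an intact, non-covalently bound DHB molecule (C7H6O4) are included only for comparison.

```python
# Monoisotopic atomic masses in unified atomic mass units (u).
MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
ELECTRON = 0.00054858  # electron mass in u


def formula_mass(c, h, o):
    """Monoisotopic mass of a neutral CcHhOo species."""
    return c * MASS["C"] + h * MASS["H"] + o * MASS["O"]


CATECHIN = formula_mass(15, 14, 6)  # (epi)catechin monomer, C15H14O6
TWO_H = 2 * MASS["H"]               # assumed loss per B-type interflavan bond
GALLOYL = formula_mass(7, 4, 4)     # gallic acid minus water (ester increment)
DHB = formula_mass(7, 6, 4)         # intact 2,5-dihydroxybenzoic acid


def procyanidin_mass(n):
    """Neutral monoisotopic mass of an all-B-type procyanidin n-mer."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n * CATECHIN - (n - 1) * TWO_H


def mz_negative(neutral_mass):
    """m/z of the singly charged [M-H]- ion (lose one H atom, keep its electron)."""
    return neutral_mass - MASS["H"] + ELECTRON


for n in range(2, 6):
    m = procyanidin_mass(n)
    print(f"DP {n}: [M-H]- = {mz_negative(m):.4f}, "
          f"+galloyl = {mz_negative(m + GALLOYL):.4f}, "
          f"+DHB = {mz_negative(m + DHB):.4f}")
```

For the dimer the sketch gives a neutral mass near 578.142 and an [M-H]- value near 577.135. The sketch makes no claim about which species produced the peaks observed for Sorghum CT; it only shows how candidate compositions translate into expected m/z values that can be compared with a high resolution spectrum.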
A combination of several methods, including thiolysis, NMR, and FT-ICR MALDI-MS, must be employed to corroborate data gathered from each technique, such that more complete and consistent information can be used to characterize the structures of condensed tannin polymers15. Our new understanding of FT-ICR MALDI-MS data and our detailed structural information about Sorghum CT are described in Chapter 2 of this thesis.

Once degradation experiments have been performed, the next crucial step is to observe degradative changes in the tannin sample. Normal phase (NP) HPLC is capable of separating CT samples according to increasing degree of polymerization (DP), up to DP 10 to 11, but NP-HPLC techniques face issues with incomplete elution and acidic degradation of the CT sample29. Reversed phase (RP) HPLC is more commonly used to separate the heterogeneous CT sample components21. In preliminary experiments, we have employed RP-HPLC to compare the chromatogram of the original, undegraded CT with the chromatogram of the degraded sample to show changes in peak area distribution toward different chain-length polymers. For this analysis, it would be ideal to use HPLC with MS detection to obtain exact structural composition information for each sample peak fraction, but substantial information can be obtained with more widely available detectors, such as diode array and electrochemical detection. One significant barrier my lab group has encountered with this technique is maintaining PC solubility in samples from the microbial cultures, due to side reactions during sample workup with microbial culture media, minerals, or soil. One of my fellow lab group researchers is currently working to resolve these issues. Achieving highly resolved separation based on molecular weight is another formidable challenge, especially for the longer chain-length polymers30. To try to overcome this challenge, we implemented a modified saw-tooth gradient solvent method that has been shown to be useful for separation of other high molecular weight polymers31. That method is described in the third chapter of this thesis.

Discovering and solving problems associated with qualitative and quantitative analysis of CT will help advance techniques for better understanding of these important molecules. Gaining knowledge of tannin degradation can improve our knowledge of carbon cycling, the role that tannins play in this process, and how they are impacted by global climate change1, 12. This can also improve our understanding of the role of tannins in the metabolism of plants, microbes, animals, and even humans14, 32.
Tannins are widely prevalent in our ecosystems and our diets, so the more we can study them, the more we can comprehend their impacts on the health of our own bodies and our environment.


Figure 1. Condensed and hydrolyzable tannins. Representative structures are provided for (a) Sorghum CT, a typical procyanidin (PC); (b) the small phenolic gallic acid, characteristic of hydrolyzable tannins; and (c) the simple hydrolyzable tannin pentagalloyl glucose.


Figure 2. Diversity of condensed tannins. The B ring can be substituted with 1-3 phenolic groups, yielding chain extender units classified as propelargonidin (1), procyanidin (2), and prodelphinidin (3). The polymer is often esterified by gallic acid on the C ring (4). Interflavan C-C bonds between the A and C rings are most common, but the ether-containing A-type linkage (5) is typical of some plants. Less common modifications are glycosylation (6) or occurrence of a flavone such as myricetin as the terminal group (7).


Bibliography:

1. Kraus, T. E. C.; Dahlgren, R. A.; Zasoski, R. J., Tannins in nutrient dynamics of forest ecosystems - a review. Plant and Soil 2003, 256 (1), 41-66.
2. Kraus, T. E. C.; Zasoski, R. J.; Dahlgren, R. A.; Horwath, W. R.; Preston, C. M., Carbon and nitrogen dynamics in a forest soil amended with purified tannins from different plant species. Soil Biology & Biochemistry 2004, 36 (2), 309-321.
3. Swain, T.; Bate-Smith, E. C., Flavonoid compounds. In Comparative Biochemistry, Florkin, M.; Mason, H. S., Eds. Academic Press: Cambridge, UK, 1962; Vol. 3, pp 755-809.
4. Aguilar-Zarate, P.; Cruz-Hernandez, M. A.; Montanez, J. C.; Belmares-Cerda, R. E.; Aguilar, C. N., Bacterial tannases: production, properties, and applications. Revista Mexicana De Ingenieria Quimica 2014, 13 (1), 63-74.
5. Pennisi, E., No microbiome is an island, survey reveals. Science 2019, 365 (6456), 851.
6. Smith, A. H.; Zoetendal, E.; Mackie, R. I., Bacterial mechanisms to overcome inhibitory effects of dietary tannins. Microbial Ecology 2005, 50 (2), 197-205.
7. Zong, W.; Wang, J.; He, Y.; Qiu, Y.; Guo, D.; Fu, H., Net nitrogen mineralization and enzyme activities in an alpine meadow soil amended with litter tannins. Journal of Plant Nutrition and Soil Science 2018, 181 (6), 954-965.
8. Quideau, S.; Deffieux, D.; Douat-Casassus, C.; Pouysegu, L., Plant polyphenols: chemical properties, biological activities, and synthesis. Angewandte Chemie-International Edition 2011, 50 (3), 586-621.
9. Russo, M.; Cacciola, F.; Arena, K.; Mangraviti, D.; de Gara, L.; Dugo, P.; Mondello, L., Characterization of the polyphenolic fraction of pomegranate samples by comprehensive two-dimensional liquid chromatography coupled to mass spectrometry detection. Natural Product Research 2020, 34 (1), 39-45.
10. Hager, T. J.; Howard, L. R.; Liyanage, R.; Lay, J. O.; Prior, R. L., Ellagitannin composition of blackberry as determined by HPLC-ESI-MS and MALDI-TOF-MS. Journal of Agricultural and Food Chemistry 2008, 56 (3), 661-669.
11. Sojka, M.; Janowski, M.; Grzelak-Blaszczyk, K., Stability and transformations of raspberry (Rubus idaeus L.) ellagitannins in aqueous solutions. European Food Research and Technology 2019, 245 (5).
12. Zak, D.; Roth, C.; Unger, V.; Goldhammer, T.; Fenner, N.; Freeman, C.; Jurasinski, G., Unraveling the importance of polyphenols for microbial carbon mineralization in rewetted riparian peatlands. Frontiers in Environmental Science 2019, 7.
13. de las Rivas, B.; Rodriguez, H.; Anguita, J.; Munoz, R., Bacterial tannases: classification and biochemical properties. Applied Microbiology and Biotechnology 2019, 103 (2), 603-623.
14. Monagas, M.; Urpi-Sarda, M.; Sanchez-Patan, F.; Llorach, R.; Garrido, I.; Gomez-Cordoves, C.; Andres-Lacueva, C.; Bartolome, B., Insights into the metabolism and microbial biotransformation of dietary flavan-3-ols and the bioactivity of their metabolites. Food & Function 2010, 1 (3), 233-253.
15. Naumann, H.; Sepela, R.; Rezaire, A.; Masih, S. E.; Zeller, W. E.; Reinhardt, L. A.; Robe, J. T.; Sullivan, M. L.; Hagerman, A. E., Relationships between structures of condensed tannins from Texas legumes and methane production during rumen digestion. Molecules 2018, 23 (9).
16. Herderich, M. J.; Smith, P. A., Analysis of grape and wine tannins: Methods, applications and challenges. Australian Journal of Grape and Wine Research 2005, 11 (2), 205-214.
17. Li, C. M.; Leverence, R.; Trombley, J. D.; Xu, S. F.; Yang, J.; Tian, Y.; Reed, J. D.; Hagerman, A. E., High molecular weight persimmon (Diospyros kaki L.) proanthocyanidin: a highly galloylated, A-linked tannin with an unusual flavonol terminal unit, myricetin. Journal of Agricultural and Food Chemistry 2010, 58 (16), 9033-9042.
18. Gupta, R. K.; Haslam, E., Plant proanthocyanidins. Part 5. Sorghum polyphenols. Journal of the Chemical Society-Perkin Transactions 1 1978, (8), 892-896.
19. Cifuentes-Gomez, T.; Rodriguez-Mateos, A.; Gonzalez-Salvador, I.; Alanon, M. E.; Spencer, J. P. E., Factors affecting the absorption, metabolism, and excretion of cocoa flavanols in humans. Journal of Agricultural and Food Chemistry 2015, 63 (35), 7615-7623.
20. Moilanen, J.; Sinkkonen, J.; Salminen, J. P., Characterization of bioactive plant ellagitannins by chromatographic, spectroscopic and mass spectrometric methods. Chemoecology 2013, 23 (3), 165-179.
21. Leppa, M. M.; Karonen, M.; Tahtinen, P.; Engstrom, M. T.; Salminen, J. P., Isolation of chemically well-defined semipreparative liquid chromatography fractions from complex mixtures of proanthocyanidin oligomers and polymers. Journal of Chromatography A 2018, 1576, 67-79.
22. Krueger, C. G.; Vestling, M. M.; Reed, J. D., Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry of heteropolyflavan-3-ols and glucosylated heteropolyflavans in sorghum Sorghum bicolor (L.) Moench. Journal of Agricultural and Food Chemistry 2003, 51 (3), 538-543.
23. Chupin, L.; Motillon, C.; Charrier-El Bouhtoury, F.; Pizzi, A.; Charrier, B., Characterisation of maritime pine (Pinus pinaster) bark tannins extracted under different conditions by spectroscopic methods, FTIR and HPLC. Industrial Crops and Products 2013, 49, 897-903.
24. Rigaud, J.; Perezilzarbe, J.; Ricardo-Da-Silva, J. M.; Cheynier, V., Micro method for the identification of proanthocyanidin using thiolysis monitored by high-performance liquid chromatography. Journal of Chromatography 1991, 540 (1-2), 401-405.
25. Scioneaux, A. N.; Schmidt, M. A.; Moore, M. A.; Lindroth, R. L.; Wooley, S. C.; Hagerman, A. E., Qualitative variation in proanthocyanidin composition of Populus species and hybrids: genetics is the key. Journal of Chemical Ecology 2011, 37 (1), 57-70.
26. Schofield, J. A.; Hagerman, A. E.; Harold, A., Loss of tannins and other phenolics from willow leaf litter. Journal of Chemical Ecology 1998, 24 (8), 1409-1421.
27. Gu, L. W.; Kelm, M. A.; Hammerstone, J. F.; Beecher, G.; Holden, J.; Haytowitz, D.; Prior, R. L., Screening of foods containing proanthocyanidins and their structural characterization using LC-MS/MS and thiolytic degradation. Journal of Agricultural and Food Chemistry 2003, 51 (25), 7513-7521.
28. Zeller, W. E., Activity, purification, and analysis of condensed tannins: current state of affairs and future endeavors. Crop Science 2019, 59 (3), 886-904.
29. Hummer, W.; Schreier, P., Analysis of proanthocyanidins. Molecular Nutrition & Food Research 2008, 52 (12), 1381-1398.
30. Putman, L. J.; Butler, L. G., Separation of high molecular weight sorghum procyanidins by high-performance liquid chromatography. Journal of Agricultural and Food Chemistry 1989, 37 (4), 943-946.
31. Durner, B.; Ehmann, T.; Matysik, F.-M., High-resolution polymer high performance liquid chromatography: Application of a saw tooth gradient for the separation of various polymers. Journal of Chromatography A 2019, 1587, 88-100.
32. Piluzza, G.; Sulas, L.; Bullitta, S., Tannins in forage plants and their role in animal husbandry and environmental sustainability: a review. Grass and Forage Science 2014, 69 (1), 32-48.


Chapter 2: Characterization of high molecular weight polyphenols by FT-ICR MALDI-MS and HPLC

Introduction

As the fourth most abundant biochemical produced by plants, tannins play a large role in plant metabolism and interactions with other organisms1. Tannins are a broad class of “water-soluble phenolic compounds having molecular weights between 500 and 3000 [Da ... with] the ability to precipitate alkaloids, gelatin and other proteins”2. The basic properties of tannins include antioxidant activity, protein binding, and astringency. There are two types of tannins: hydrolyzable and condensed tannins. Hydrolyzable tannins are derived from gallic acid; a core polyol such as glucose is esterified to gallic acid, with additional esterified or cross-linked galloyl groups3. These molecules are found in plants, including many that we consume, such as blackberries, pomegranates, and raspberries4-6. Condensed tannins (CTs, proanthocyanidins) are flavan-3-ol polymers with variations in hydroxylation patterns, cis- and trans-stereochemistry of C-ring substituents, interflavan bond connections (A or B type), mean degree of polymerization (mDP), and degree of esterification7. Condensed tannins, like hydrolyzable tannins, are present in food plants, such as grapes, persimmons, and Sorghum grain8-10.

When plants or fruits containing tannins abscise, the tannins end up in the soil and water. In these environments, hydrolyzable tannins are degraded by acidic or basic conditions6 and by enzymes, such as phenol oxidase11. While hydrolyzable tannin degradation has been thoroughly investigated, very little is known concerning the degradation of condensed tannins12. The C-C interflavan bonding between monomers of condensed tannin molecules makes them much less susceptible to degradation than hydrolyzable tannins. Therefore, the degradation of condensed tannins, including the mechanism and products, is currently unclear. More research is needed to understand the fate and degradation of condensed tannin in soil, and its role in metabolic and biogeochemical cycles1. Understanding condensed tannin degradation in soil environments may also provide insight into the fate of proanthocyanidins in the gut.

Previous research has shown that Sorghum CT, which is readily purified in hundred-mg quantities suitable for metabolic studies, is composed of epicatechin extender units with a catechin terminal unit10. Due to its simple structure and convenience, we are using Sorghum CT in large scale soil decomposition experiments to help elucidate CT degradation mechanisms and products in soil (McGivern, B.; Wrighton, K. & Hagerman, A. E., unpublished). An important step towards understanding condensed tannin degradation is to confirm the starting structure and improve techniques for structural analysis.

There are well established methods for the structural analysis of condensed tannins. Thiolysis with HPLC is a depolymerization technique widely used to ascertain the average chain length and molecular weight of the polymer, as well as the composition of extender and terminal units9, 13. HSQC NMR determines subunit composition and mean degree of polymerization (mDP), and is especially useful for observing A-type linkages14. Fourier Transform Ion Cyclotron Resonance Matrix Assisted Laser Desorption Ionization mass spectrometry (FT-ICR MALDI-MS) has proven to be a valuable tool for the characterization of structural features and linkage types of condensed tannin molecules15, 16. While thiolysis and NMR provide overall, average structural information for the mixture of polymers in the heterogeneous sample, high resolution FT-ICR MALDI-MS provides the exact composition of individual tannin species in the mixture.

We had planned to use this combined suite of tools to confirm the structure of our large-scale Sorghum CT preparations in order to assist in our studies of the fate of CT in soil. Thiolysis and NMR provided consistent results, but we were surprised to find that the MALDI-MS results suggested the presence of galloylated subunits. In this paper, we demonstrate the formation of supramolecular complexes between condensed tannins and the common MALDI matrix, 2,5-dihydroxybenzoic acid (DHB), and the misleading MALDI-MS data arising from this complexation. Uncovering this weakness in MALDI analysis improves our ability to interpret MALDI data. Enhanced data interpretation facilitates the use of MALDI in our soil degradation studies and other studies of tannin structure-function relationships.

Methods:

Two genetically distinct, mature lines of high-tannin Sorghum bicolor (L.) Moench grain were obtained and stored at 4 °C: Hi-Tannin Sumac NM03-9905 (Scott Bean, USDA, Manhattan, Kansas) and ICRISAT line IS8260 (Gebisa Ejeta, Purdue University Department of Agronomy). For each of the techniques utilized, analysis was performed on at least two different preparations of CT from each of these two varieties of Sorghum.


Purification. Sorghum tannin was extracted from ground grain with a solvent containing ascorbic acid and purified by ethyl acetate extraction to remove small phenolics, followed by Sephadex LH20 chromatography to isolate the high molecular weight fraction17, 18. The freeze-dried powder was stored at -20 °C. Commercial (Hershey’s Natural Unsweetened) cocoa powder, obtained at a local supermarket, was defatted using Soxhlet extraction with hexane. The sample was then extracted with acetone:water (70:30, v/v) four times, discarding the solid. Acetone was removed from the extract and replaced with ethanol by successive rotary evaporation, and the final ethanol-water mixture was centrifuged to remove insoluble solids before applying the sample to Sephadex LH20 equilibrated in 95% ethanol. After eluting all of the non-adsorbing components (theobromine, sugars, small phenolics) with ethanol, the condensed tannin was eluted with 70% aqueous acetone. Acetone was removed from the solution by rotary evaporation, and the aqueous solution was freeze-dried. The freeze-dried powder was stored at -20 °C. Neptunia lutea CT was a gift from Prof. Harley Naumann, University of Missouri. Tannins were obtained from acetone-water extracts of leaf tissue (Neptunia lutea) followed by purification on a Sephadex LH20 column19. EGCg was a gift from Lipton, Newark, NJ.

Thiolysis. Purified CTs were analyzed by thiolysis according to published procedures9, 20. Briefly, approximately 1 mg of tannin was dissolved in 200 µL of methanol containing 30 µL of the HCl reagent (32% (v/v) HCl in methanol) and 72 µL of the thiol reagent (5% (v/v) toluene-α-thiol in methanol) and incubated at 40 °C for 30 min. The thiolytic degradation products were analyzed on an Agilent 1100 HPLC with ChemStation Rev. A.09.03 software. The column was a Thermofisher Hypersil Gold C8, 160 x 4.6 mm, 3 µm packing. Sample injection volume was 10 µL. The gradient program employed 0.13% (v/v) trifluoroacetic acid (TFA) in nanopure water (A) and 0.10% (v/v) TFA in acetonitrile (B) at a flow rate of 0.5 mL/min at 27.0 °C for a 48 min duration, as follows: 0-3 min, isocratic at 15% B; 8 min, increase to 20% B; 10 min, increase to 30% B; 28-32 min, increase to 70% B; 37-40 min, decrease to 15% B, then hold isocratic at 15% B until the end20. Reaction products were detected at 220 nm and were identified by their retention times and spectral characteristics compared to authentic standards9, 20. Products were quantitated based on peak areas and converted to moles of extender and moles of terminal units. Chromatograms from control samples that did not contain acid or thiol and were not heated were used to confirm that all of the tannin was degraded by thiolysis, and to confirm that the tannin did not contain any flavan-3-ol contamination that would interfere with terminal unit determination.

NMR. 1H, 13C and 1H-13C HSQC NMR spectra were recorded at 27°C on a Bruker Biospin DMX-500 (1H 500.13 MHz, 13C 125.76 MHz) instrument equipped with TopSpin 3.5 software and a cryogenically cooled 5 mm TXI 1H/13C/15N gradient probe in inverse geometry. Spectra were recorded in DMSO-d6 and were referenced to the residual signals of DMSO-d6 (2.49 ppm for 1H and 39.5 ppm for 13C spectra). 13C NMR spectra were obtained using 1K scans (acquisition time 56 min). For 1H-13C HSQC experiments, spectra were obtained using between 200 and 620 scans (depending on sample size and instrument availability) with the standard Bruker pulse program (hsqcetgpsisp.2) and the following parameters: Acquisition: TD 1024 (F2), 256 (F1); SW 16.0 ppm (F2), 165 ppm (F1); O1 2350.61 Hz; O2 9431.83 Hz; D1 = 1.50 s; CNST2 = 145. Acquisition time: F2 channel, 64 ms, F1 channel 6.17 ms. Processing: SI = 1024 (F2, F1), WDW = QSINE, LB = 1.00 Hz (F2), 0.30 Hz (F1); PH_mod = pk; Baseline correction ABSG = 5 (F2, F1), BCFW = 1.00 ppm, BC_mod = quad (F2), no (F1); Linear prediction = no (F2), LPfr (F1). Sample sizes used for these spectra ranged from 5-10 mg, providing NMR sample solutions with concentrations of 10-20 mg/mL.

High Resolution FT-ICR MALDI-MS. Solutions of purified CT samples were prepared (15 mg/mL) using reagent grade methanol. The CT sample (1 µL) was mixed with 10 µL of 2,5-dihydroxybenzoic acid (DHB) matrix solution (0.1 M in acetonitrile:water, 1:1, containing 0.1% formic acid). A 1 µL aliquot of this sample-matrix mixture was deposited on the plate and allowed to dry before inserting in the Bruker 15T FT-ICR MALDI-MS. Calibration was run on a standard peptide mix (Bruker Daltonics, Billerica, MA) in the negative ion mode. Typically, 20% of the Nd:YAG (351 nm) laser power was used for the spectral acquisition.

FT-ICR ESI-MS negative ion analysis. Ultrahigh resolution mass spectra were acquired on a Bruker SolariXR Fourier transform ion cyclotron resonance mass spectrometer (FT-ICR MS) (Bruker Daltonics GmbH, Bremen, Germany) equipped with a 15 Tesla superconducting magnet (Magnex Scientific Inc., Yarnton, GB) and an Apollo II ESI source (Bruker Daltonics GmbH, Bremen, Germany) operated in the negative ionization mode. Negative ionization has already been proven to be the preferred ionization mode in fingerprinting wines by FT-ICR MS21. Tannin samples were dissolved in acetonitrile:water 1:1 solvent mixture in a concentration range of 3-5 µM. These solutions were then introduced into the electrospray source at a flow rate of 120 µL/h. FT-ICR-MS was externally calibrated by using an Agilent calibration standard (Agilent Technologies, Santa Clara, CA, USA) diluted tenfold with acetonitrile:water 1:1. Some well-known background ions (generated, e.g., from alkylbenzene sulfonates) were also used for calibration of the lower m/z range. The mass accuracy was <1 ppm.

Methanolysis/HPLC. Equal volumes of 2 mg/mL Sorghum CT in methanol and 1 M HCl in methanol were mixed and the reaction tubes were evacuated. A control sample was prepared without HCl. The anaerobic tubes were incubated at 70°C for 24 hours. After incubation, the samples were diluted 10-fold with MeOH before injecting 10 µL on the HPLC. Analysis was performed with an Agilent 1100 HPLC with ChemStation Rev. A.09.03 software. The column was a Thermofisher Hypersil Gold C8, 160 x 4.6 mm, 3 µm packing. The gradient program utilized 0.13% (v/v) trifluoroacetic acid in water (A) and 0.10% (v/v) TFA in acetonitrile (B) at a flow rate of 0.5 mL/min at 27°C for a duration of 35 min, as follows: 0-4 min, 2% B; 8 min, increase to 20% B; 10 min, increase to 30% B; 30-35 min, increase to 70% B. Reaction products were detected at 220 nm and were identified by their retention times and spectral characteristics, as compared to authentic standards9, 20. Commercial methyl gallate (>99.5%) was used to determine that the lower limit of detection was 0.3% galloylation, or less than 1 gallate ester group in 100 subunits.

Fourier Transform Infrared Spectroscopy. Condensed tannin from cocoa served as a reference compound for ungalloylated tannin22. Condensed tannin from Neptunia lutea served as a reference compound for galloylated tannin7. Samples of EGCg and catechin acted as positive and negative monomeric controls, respectively. About 1-2 mg of each sample was pressed into a pellet. Infrared spectra were collected with a Harrick Split-pea ATR microscope interfaced to a Perkin-Elmer Spectrum 100 Fourier transform infrared spectrometer. This accessory employed a germanium internal reflection element (IRE) and the standard deuterated triglycine sulfate (DTGS) detector on the Spectrum 100 macro bench. Spectra collected using this device represent the average of 32 individual scans possessing a spectral resolution of 4 cm-1. The samples were brought into intimate contact with the IRE using a loading of 0.5 kg.

NMR of matrix-tannin mixtures (Complexation). 1H NMR spectra were recorded at 298 K on a Bruker AV500 equipped with TopSpin, a 5 mm BBO probe with Z-gradients (1H outer coil, broadband inner coil), and a 5 mm TXI probe with Z-gradients (1H inner coil, 13C, 15N outer coil), VT: ~190 K - 373 K. Spectra were recorded in D2O, using water suppression on the solvent signal at 4.8 ppm, and were obtained using a NOESY method with 8-16 scans (acquisition time less than 2 min). Samples were composed of 4 mg of tannin or matrix dissolved in 5 mL of D2O, and 300 µL of each solution were combined to create the mixture sample.

Results and Discussion

Routine thiolysis of our scaled-up Sorghum condensed tannin preparations indicated that the material consisted of epicatechin extender units with a catechin terminal unit and a mean degree of polymerization (mDP) of about 16 (Table 1), consistent with earlier studies of Sorghum proanthocyanidin structure10. In an attempt to probe the molecular weight distribution of the polymer, we used FT-ICR MALDI-MS23, 24. Surprisingly, the peak patterns from this analysis were consistent with a galloylated condensed tannin structure (Table 2). Galloylation is not consistent with our thiolysis data, and has not previously been reported for CT from Sorghum. In order to resolve this discrepancy, we used a variety of analytical tools to evaluate whether the tannin does contain gallate esters. Our data show that the tannin is not galloylated, but that strong complexes between the DHB matrix and the condensed tannin interfere with accurate structure determination using MALDI-MS.

Matrix-assisted laser desorption/ionization MS (MALDI-MS) is a soft-ionization mass spectrometric method that does not promote fragmentation. This technique is widely used for structural analysis of high molecular weight compounds, specifically peptides24. MALDI-MS with a time of flight detector (MALDI-ToF-MS) has become well-established as a technique for structural analysis of CT15, 16, 24. More recent work has demonstrated the additional power of FT-ICR MALDI-MS for exact structural determination due to its superior mass resolution and accuracy23, 25. Like thiolysis and NMR, FT-ICR MALDI-MS provides structural information, including chain length, linkage types, and subunit composition. However, unlike other techniques that provide overall, average structural information for the polymers in the heterogeneous sample mixture, FT-ICR MALDI-MS provides exact structural composition of individual tannin species in the mixture. Using this high-resolution mass data, we were able to identify Sorghum CT polymers consisting of (epi)catechin subunits up to 13 DP (Table 2). Peaks were also assigned as polymers containing A-type linkages (Table 2). The polymer composition and presence of A-type linkages are consistent with our thiolysis data and are confirmed by HSQC NMR as described below (Table 1).


The FT-ICR MALDI data revealed that the A-type linkages were found preferentially in shorter length polymers, with degrees of polymerization (DP) from 3–6 (Table 2). A similar pattern was revealed in a structural survey of CT from a wide range of fruits, with A-type linkages more prevalent in tetramers through heptamers and less abundant in octamers and larger oligomers26. It has been suggested that A-type linkages are formed through oxidation of B-linked proanthocyanidins. The oxidation may be mediated by polyphenol oxidase (PPO) or laccase27, 28 or by free radical mechanisms29. The preferential existence of A-type linkages in shorter polymers may provide indirect support for the enzymatic mechanism, if there is a tannin size limitation to the oxidizing activity of PPO or laccase. Alternatively, addition of A-type linkages during the normal polymerization process may halt further polymerization of the CT, preventing extension to longer polymers. Additional research into factors dictating numbers of A-type linkages in condensed tannin is warranted based on the many studies suggesting bioactivity of this type of CT30.

Upon further analysis of the FT-ICR MALDI data for our Sorghum CT, an unanticipated peak interval pattern was discovered. Secondary peaks appeared at an interval of +152.011 m/z from the peaks representing polymers consisting solely of (epi)catechin units (Table 2). This +152 m/z interval is typically attributed to gallate esters at the C3 position of the flavan-3-ol ring for some units in the polymer7. The data suggested that each of the tannin species (trimer, tetramer, etc.) existed as both an ungalloylated oligomer and as a monogallate, implying that the 16-mer Sorghum tannin could be about 5% galloylated. This surprising finding contradicts our thiolysis data, as well as published reports on Sorghum tannin composition10, 15.
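The mass arithmetic behind these peak assignments can be sketched in a few lines of Python. This is an illustrative calculation, not the procedure used to build Table 2. It assumes the usual bookkeeping for procyanidins: each (epi)catechin unit is C15H14O6, each interflavan bond removes two hydrogens, each A-type linkage removes two more, and each gallate ester adds C7H4O4. All function and constant names are hypothetical.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
ELECTRON = 0.00054858


def formula_mass(c, h, o):
    """Monoisotopic mass of a CcHhOo composition."""
    return (c * MONOISOTOPIC["C"]
            + h * MONOISOTOPIC["H"]
            + o * MONOISOTOPIC["O"])


CATECHIN = formula_mass(15, 14, 6)   # (epi)catechin, C15H14O6
H2 = formula_mass(0, 2, 0)           # lost per interflavan bond
GALLOYL = formula_mass(7, 4, 4)      # net gain per gallate ester, about 152.011


def oligomer_mass(n_units, a_type_links=0, galloyl_groups=0):
    """Neutral monoisotopic mass of an (epi)catechin oligomer (assumed model)."""
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    return (n_units * CATECHIN
            - (n_units - 1) * H2      # one bond between each pair of units
            - a_type_links * H2       # extra loss for each A-type linkage
            + galloyl_groups * GALLOYL)


def deprotonated_mz(neutral_mass):
    """m/z of the singly charged [M - H]- ion."""
    return neutral_mass - MONOISOTOPIC["H"] + ELECTRON


def ppm_error(measured, theoretical):
    """Mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


for dp in range(3, 7):
    plain = deprotonated_mz(oligomer_mass(dp))
    plus_152 = deprotonated_mz(oligomer_mass(dp, galloyl_groups=1))
    print(f"DP {dp}: [M-H]- {plain:.4f}, apparent monogallate {plus_152:.4f}")
```

Under this model, every oligomer and its apparent monogallate differ by the same fixed increment regardless of chain length, which is why the interval appears as a regular secondary series.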
It prompted further detailed investigation of the Sorghum condensed tannin structure, focused specifically on possible gallate ester groups. We validated our ability to detect gallate esters by including appropriate large or small molecule positive and negative controls in the analyses. For positive controls, we used Neptunia lutea tannin, which is known to contain a high degree of gallate esters7, and epigallocatechin gallate (EGCg). For negative controls, we used cocoa condensed tannin22 and catechin. We confirmed that in our hands, the Neptunia tannin contained about 30% gallate esters (Table 1). The FT-ICR MALDI data showed that the Neptunia condensed tannin existed in multiple galloylated forms of mono-, di-, and tri-esterified polymers (Table 2). This range of galloylation implies a high level of heterogeneity in the Neptunia lutea sample, but is consistent with the average assessment of 30% gallate ester from thiolysis and NMR (Tables 1, 2). We analyzed cocoa CT, which contains no galloylation22, by FT-ICR MALDI-MS. Again, we observed the unexpected +152 m/z peak pattern interval that implied the presence of monogallate oligomers (Table 2).

HSQC NMR verified tannin compositional data gathered from thiolysis and helped to describe the nature of interflavan bonds within the Sorghum condensed tannin polymers. Cross peaks centered around (4.4, 36) ppm and (4.4, 39) ppm were identified as epicatechin and catechin, respectively (Table 1). A ratio of the areas presented a cis isomer composition of 90.4%. Although thiolysis revealed traces of (epi)gallocatechin thiol, in the NMR there were no cross peaks that could be identified as prodelphinidin-type subunits (Table 1). Cross peaks centered at (4.4, 38) ppm and (4.3, 29) ppm were identified as corresponding to A-type linkages and B-type linkages, respectively (Table 1). Based on the relative peak areas, it was determined that the Sorghum condensed tannin contained 5.3% of A-type linkages. Neither the HSQC NMR nor thiolysis data provided any evidence of galloylation. We also analyzed Neptunia lutea CT and cocoa CT by HSQC NMR. The Neptunia lutea CT was determined to contain 34.4% galloylation, while the cocoa CT was found to contain no galloylation (Table 1). The NMR results verified the thiolysis data for these control samples, as well (Table 1).

We implemented Fourier-transform ion cyclotron resonance electrospray ionization (FT-ICR ESI) as a mass spectrometric method which promotes fragmentation. This allowed us to further probe the structure of our scaled-up Sorghum CT preparations. The ESI-MS did not display any evidence of the +152.011 m/z interval, supporting the conclusion that gallate esters were not present in the sample. A tandem MS/MS analysis was performed on the peak at 864.69266 m/z (Table 3).
The fragment ions formed through retro-Diels-Alder (RDA) reaction, heterocyclic ring fission (HRF), and quinone methide fission (QM) suggested predominantly B-type interflavan linkages, and some A-type16 (Table 3).

Infrared spectroscopy was employed to attempt to detect gallate ester groups in the Sorghum condensed tannin. The Fourier Transform (FT) IR spectra of the positive galloyl ester controls, EGCg and Neptunia lutea condensed tannin, displayed ester carbonyl peaks at 1691 cm-1 and 1687 cm-1, respectively (Figure 3 A,B). Neither the catechin nor the cocoa condensed tannin spectra displayed carbonyl peaks at these wavenumbers, or even near 1700 cm-1 (Figure 3 C,D). The FTIR spectrum of Sorghum condensed tannin did not display a carbonyl peak near 1700 cm-1 (Figure 3 E). Thus, the IR spectroscopic technique provides evidence that the Sorghum condensed tannin does not contain gallate ester groups.

We established methanolysis reaction conditions for releasing methyl gallate from esterified condensed tannins with a limit of detection of 0.3% galloylation, or less than one gallate group per 100 subunits. No methyl gallate was released from Sorghum CT under these conditions (Figure 4). HSQC NMR, IR spectroscopy, FT-ICR ESI-MS and methanolysis all verified the results from thiolysis, and supported the conclusion that Sorghum condensed tannin does not contain gallate ester groups.

The FT-ICR MALDI-MS results posed a puzzling contradiction to our other data. Since condensed tannins are known to have good binding properties, we began to consider the possibility of complexation during MALDI analysis as the source for the 152 m/z intervals in the spectra. Supramolecular structures are complexes that form due to strong noncovalent interactions between molecules. These supramolecular complexes can form through several different mechanisms, such as ion pairing, electrostatic interactions, ion-π interactions, hydrogen bonding, solvophobic interactions, dispersive interactions, and stacking interactions31. In the case of π-π stacking interactions, complexation can be evidenced through the use of 1H NMR to observe an increase in shielding experienced by aromatic protons32. In particular, copigmentation is the formation of a supramolecular complex by π-π stacking interaction between pigments and usually colorless copigment molecules32. Copigmentation is also referred to as color stabilization, due to the increase in color intensity and stability attributed to pigment-copigment complexation33. Good copigments typically contain extended π-conjugated systems that favor π-π stacking interactions, and H-bond donor/acceptor groups32.
Flavanols, phenolics, hydrolyzable tannins, and benzoic acids have been found to act as copigments in natural systems such as flowers and in plant-derived beverages such as wine, with anthocyanins as the pigment molecules32, 33. Anthocyanins are structurally similar to other flavonoids, including the flavan-3-ols catechin and epicatechin, although the cationic anthocyanins are more highly oxidized than other flavonoids17. Anthocyanins are produced from condensed tannin molecules during oxidative cleavage in hot alcohols17. Due to the structural similarities between flavan-3-ols, anthocyanins, and condensed tannin, it is reasonable to hypothesize that our Sorghum CT could participate in supramolecular complex formation with other phenolics, such as the DHB MALDI matrix.

Proton Nuclear Magnetic Resonance (1H NMR) provided evidence for changes in the chemical shifts of the aromatic protons in the DHB due to complexation with condensed tannin. The pure DHB 1H NMR spectrum showed three doublets at 7.236 ppm, 6.95 ppm, and 6.79 ppm, as expected (Figure 5). The 1H NMR spectrum of DHB mixed with Sorghum condensed tannin shows that those aromatic proton peaks were shifted upfield by 0.026 - 0.049 ppm (Figure 5). This change indicates that the aromatic DHB protons were more shielded in the mixture with Sorghum condensed tannin, suggesting that a supramolecular complex had formed. This upfield shift is consistent with the complex formation mechanism of π-π stacking interaction32. A similar upfield shift was also observed for a mixture of DHB and Neptunia lutea CT, suggesting that the complexation occurs with galloylated CT samples as well.

We propose that a strong supramolecular complex is formed by the insertion of a DHB molecule between the aromatic rings of the flavan-3-ol subunits of the CT. Our data suggest that the complexes are stable during the “soft” ionization process of MALDI, such that the spectra include discrete mass peaks for the tannin itself and for the tannin-DHB complexes. Previous studies have shown DHB-sample supramolecular complexation with cyclic and linear oligosaccharides during MALDI-ToF analysis34. Their results showed an interval pattern of +152.26 m/z for analyte-DHB complexes, as [M + DHB – H]- structures34. With our FT-ICR MALDI-MS analysis, this mass interval and the size of the tannin-DHB complex are indistinguishable from those of gallate esters. The structure of the tannin-DHB complex still needs to be analytically determined. Future work includes structural modelling of the complex, as well as CD spectroscopy or X-ray crystallography to discern structural details.

Conclusion

Using the techniques of thiolysis with HPLC, NMR, FT-ICR MALDI-MS, FTIR spectroscopy, FT-ICR ESI-MS and methanolysis, we were able to confirm the structure of Sorghum CT prepared in large quantity for metabolism studies. Thiolysis and NMR provided overall average structural data to support the simple, flavan-3-ol structure of the tannin. FT-ICR MALDI-MS provided complementary data to show exact structural composition of individual polymer species. It offered new insights into the positions of the A-type linkages, which were found predominantly in the smaller CT oligomers. However, this MALDI-MS technique incorrectly suggested that our Sorghum CT polymer contained about 5% galloylation. We demonstrated that the misleading MS peak pattern observed in the FT-ICR MALDI-MS data is the result of supramolecular complexation between the tannin sample and the DHB matrix. This complexation was evidenced through the use of 1H NMR. These results suggest that the complexation occurs through π-π stacking interactions. We also show that IR spectroscopy and methanolysis techniques are valuable tools for the detection of galloylation in high molecular weight polyphenols to verify other analytical techniques.

FT-ICR MALDI-MS has proven to be a very useful technique for structural analysis of CT, and it offers important information to complement other analytical techniques. However, the results must be interpreted cautiously, taking into consideration the formation of tannin-matrix complexes, which are indistinguishable from gallate esters by MALDI-MS alone. Awareness of this issue can help to prevent misidentification and subsequent error in structural determination of CT molecules in future research. A suite of analytical tools must be utilized for complete, accurate structural determination of condensed tannins.


Figure 3. FTIR spectra of EGCg (A), Neptunia lutea CT (B), catechin (C), cocoa CT (D), and Sorghum CT (E). EGCg (A) and Neptunia lutea CT (B) are galloylated, and the characteristic ester peak is indicated with an arrow. There is no ester peak in C, D or E.


Figure 4. Methanolysis HPLC Chromatogram. Sorghum CT reaction chromatogram is in black, while methyl gallate (MeG) standard chromatogram is in green, with elution of MeG at 21.4 min.


Figure 5. 1H NMR spectra of DHB and DHB-Sorghum CT mixture. The pure DHB spectrum is shown in red. The spectrum of the mixture of DHB with Sorghum CT is shown in blue.


Table 1. Thiolysis and HSQC NMR structural data for condensed tannin from Sorghum bicolor, Neptunia lutea, and cocoa. Peak areas from thiolysis HPLC chromatograms and HSQC NMR cross peaks were used to calculate mean degree of polymerization (mDP), epicatechin composition (% Cis), hydroxyl (gallo) substitution on the B ring (% Prodelphinidin), gallate esterification, and A-Type Linkage composition.


Table 2. FT-ICR MALDI-MS data for condensed tannin from Sorghum bicolor, Neptunia lutea, and cocoa. Peak composition and calculations for CT samples analyzed with a 2,5-dihydroxybenzoic acid (DHB) matrix. *Gallate ester groups are not actually present in the Sorghum or cocoa condensed tannins. These peaks are misleading and could be misidentified as galloylated polymers, but they are actually the result of sample-matrix complexation.

1 This mass unit value represents the number of (epi)catechin subunits.
2 This mass unit value represents the number of (epi)gallocatechin subunits.
3 This mass unit value represents the apparent number of galloyl ester groups.


Table 3. FT-ICR ESI-MS/MS data of the peak at 864.69266 m/z for condensed tannin from Sorghum bicolor. The fragmentation described includes quinone methide (QM) fission, heterocyclic ring fission (HRF), and retro-Diels-Alder (RDA) reaction.


Bibliography

1. Kraus, T. E. C.; Dahlgren, R. A.; Zasoski, R. J., Tannins in nutrient dynamics of forest ecosystems - a review. Plant and Soil 2003, 256 (1), 41-66.
2. Swain, T.; Bate-Smith, E. C., Flavonoid compounds. In Comparative Biochemistry, Florkin, M.; Mason, H. S., Eds. Academic Press: Cambridge, UK, 1962; Vol. 3, pp 755-809.
3. Quideau, S.; Deffieux, D.; Douat-Casassus, C.; Pouysegu, L., Plant polyphenols: chemical properties, biological activities, and synthesis. Angewandte Chemie-International Edition 2011, 50 (3), 586-621.
4. Russo, M.; Cacciola, F.; Arena, K.; Mangraviti, D.; de Gara, L.; Dugo, P.; Mondello, L., Characterization of the polyphenolic fraction of pomegranate samples by comprehensive two-dimensional liquid chromatography coupled to mass spectrometry detection. Natural Product Research 2020, 34 (1), 39-45.
5. Hager, T. J.; Howard, L. R.; Liyanage, R.; Lay, J. O.; Prior, R. L., Ellagitannin composition of blackberry as determined by HPLC-ESI-MS and MALDI-TOF-MS. Journal of Agricultural and Food Chemistry 2008, 56 (3), 661-669.
6. Sojka, M.; Janowski, M.; Grzelak-Blaszczyk, K., Stability and transformations of raspberry (Rubus idaeus L.) ellagitannins in aqueous solutions. European Food Research and Technology 2019, 245 (5).
7. Naumann, H.; Sepela, R.; Rezaire, A.; Masih, S. E.; Zeller, W. E.; Reinhardt, L. A.; Robe, J. T.; Sullivan, M. L.; Hagerman, A. E., Relationships between structures of condensed tannins from Texas legumes and methane production during in vitro rumen digestion. Molecules 2018, 23 (9).
8. Herderich, M. J.; Smith, P. A., Analysis of grape and wine tannins: Methods, applications and challenges. Australian Journal of Grape and Wine Research 2005, 11 (2), 205-214.
9. Li, C. M.; Leverence, R.; Trombley, J. D.; Xu, S. F.; Yang, J.; Tian, Y.; Reed, J. D.; Hagerman, A. E., High molecular weight persimmon (Diospyros kaki L.) proanthocyanidin: a highly galloylated, A-linked tannin with an unusual flavonol terminal unit, myricetin. Journal of Agricultural and Food Chemistry 2010, 58 (16), 9033-9042.
10. Gupta, R. K.; Haslam, E., Plant proanthocyanidins. Part 5. Sorghum polyphenols. Journal of the Chemical Society-Perkin Transactions 1 1978, (8), 892-896.
11. Zak, D.; Roth, C.; Unger, V.; Goldhammer, T.; Fenner, N.; Freeman, C.; Jurasinski, G., Unraveling the importance of polyphenols for microbial carbon mineralization in rewetted riparian peatlands. Frontiers in Environmental Science 2019, 7.
12. Monagas, M.; Urpi-Sarda, M.; Sanchez-Patan, F.; Llorach, R.; Garrido, I.; Gomez-Cordoves, C.; Andres-Lacueva, C.; Bartolome, B., Insights into the metabolism and microbial biotransformation of dietary flavan-3-ols and the bioactivity of their metabolites. Food & Function 2010, 1 (3), 233-253.
13. Scioneaux, A. N.; Schmidt, M. A.; Moore, M. A.; Lindroth, R. L.; Wooley, S. C.; Hagerman, A. E., Qualitative variation in proanthocyanidin composition of Populus species and hybrids: genetics is the key. Journal of Chemical Ecology 2011, 37 (1), 57-70.
14. Zeller, W. E., Activity, purification, and analysis of condensed tannins: current state of affairs and future endeavors. Crop Science 2019, 59 (3), 886-904.
15. Krueger, C. G.; Vestling, M. M.; Reed, J. D., Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry of heteropolyflavan-3-ols and glucosylated heteropolyflavans in sorghum Sorghum bicolor (L.) Moench. Journal of Agricultural and Food Chemistry 2003, 51 (3), 538-543.
16. Rush, M. D.; Rue, E. A.; Wong, A.; Kowalski, P.; Glinski, J. A.; van Breemen, R. B., Rapid determination of procyanidins using MALDI-ToF/ToF mass spectrometry. Journal of Agricultural and Food Chemistry 2018, 66 (43), 11355-11361.
17. Hagerman, A. E. 2011. Tannin Handbook. Miami University, Oxford OH 45056 www.users.muohio.edu/hagermae/.
18. Hagerman, A. E.; Butler, L. G., Condensed tannin purification and characterization of tannin-associated proteins. Journal of Agricultural and Food Chemistry 1980, 28 (5), 947-952.
19. Naumann, H. D.; Hagerman, A. E.; Lambert, B. D.; Muir, J. P.; Tedeschi, L. O.; Kothmann, M. M., Molecular weight and protein-precipitating ability of condensed tannins from warm-season perennial legumes. Journal of Plant Interactions 2014, 9 (1), 212-219.
20. Scioneaux, A. N.; Schmidt, M. A.; Moore, M. A.; Lindroth, R. L.; Wooley, S. C.; Hagerman, A. E., Qualitative variation in proanthocyanidin composition of Populus species and hybrids: genetics is the key. Journal of Chemical Ecology 2011, 37 (1), 57-70.
21. Roullier-Gall, C.; Witting, M.; Gougeon, R. D.; Schmitt-Kopplin, P., High precision mass measurements for wine metabolomics. Frontiers in Chemistry 2014, 2.
22. Cooper, K. A.; Campos-Gimenez, E.; Alvarez, D. J.; Nagy, K.; Donovan, J. L.; Williamson, G., Rapid reversed phase ultra-performance liquid chromatography analysis of the major cocoa polyphenols and inter-relationships of their concentrations in chocolate. Journal of Agricultural and Food Chemistry 2007, 55 (8), 2841-2847.
23. Spraggins, J. M.; Rizzo, D. G.; Moore, J. L.; Rose, K. L.; Hammer, N. D.; Skaar, E. P.; Caprioli, R. M., MALDI FTICR IMS of intact proteins: using mass accuracy to link protein images with proteomics data. Journal of the American Society for Mass Spectrometry 2015, 26 (6), 974-985.
24. Rue, E. A.; Rush, M. D.; van Breemen, R. B., Procyanidins: a comprehensive review encompassing structure elucidation via mass spectrometry. Phytochemistry Reviews 2018, 17 (1), 1-16.
25. Marshall, A. G.; Hendrickson, C. L.; Jackson, G. S., Fourier transform ion cyclotron resonance mass spectrometry: A primer. Mass Spectrometry Reviews 1998, 17 (1), 1-35.
26. Lin, L. Z.; Sun, J. H.; Chen, P.; Monagas, M. J.; Harnly, J. M., UHPLC-PDA-ESI/HRMSn profiling method to identify and quantify oligomeric proanthocyanidins in plant products. Journal of Agricultural and Food Chemistry 2014, 62 (39), 9387-9400.
27. Dixon, R. A.; Xie, D. Y.; Sharma, S. B., Proanthocyanidins - a final frontier in flavonoid research? New Phytologist 2005, 165 (1), 9-28.
28. Osman, A. M.; Wong, K. K. Y., Laccase (EC 1.10.3.2) catalyses the conversion of procyanidin B-2 (epicatechin dimer) to type A-2. Tetrahedron Letters 2007, 48 (7), 1163-1167.
29. Chen, L.; Yuan, P.; Chen, K.; Jia, Q.; Li, Y., Oxidative conversion of B- to A-type procyanidin trimer: Evidence for quinone methide mechanism. Food Chemistry 2014, 154, 315-322.
30. Liu, H.; Howell, A. B.; Zhang, D. J.; Khoo, C., A randomized, double-blind, placebo-controlled pilot study to assess bacterial anti-adhesive activity in human urine following consumption of a cranberry supplement. Food & Function 2019, 10 (12), 7645-7652.
31. Schneider, H. J., Binding mechanisms in supramolecular complexes. Angewandte Chemie-International Edition 2009, 48 (22), 3924-3977.
32. Trouillas, P.; Sancho-Garcia, J. C.; De Freitas, V.; Gierschner, J.; Otyepka, M.; Dangles, O., Stabilizing and modulating color by copigmentation: insights from theory and experiment. Chemical Reviews 2016, 116 (9), 4937-4982.
33. Fulcrand, H.; Morel-Salmi, C.; Mané, C.; Poncet-Legrand, C.; Vernhet, A.; Cheynier, V., Tannins: from reactions to complex supramolecular structures. Australian Journal of Grape and Wine Research 2006, 12-17.
34. Mele, A.; Malpezzi, L., Noncovalent association phenomena of 2,5-dihydroxybenzoic acid with cyclic and linear oligosaccharides. A matrix-assisted laser desorption/ionization time-of-flight mass spectrometric and X-ray crystallographic study. Journal of the American Society for Mass Spectrometry 2000, 11 (3), 228-236.


Chapter 3: Improving RP-HPLC Separation of Condensed Tannins

Introduction

Limitations in current HPLC separation methods for condensed tannins (CTs) have impeded research into structural analysis. As a result, our understanding of structure-function relationships is lacking, such as in the area of CT degradation. The pathways and mechanisms for the degradation of condensed tannins are unclear. Studies have identified many final degradative products as small phenolics1, and have suggested possible microbial degradation pathways from oligomers to form those products2. However, the mechanism of initial breakdown from long chain CT polymers is still a mystery. In order to elucidate this mechanism, it is necessary to visualize changes in the size distribution from the undegraded tannin sample to the degraded sample (McGivern, B; Wrighton, K & Hagerman, AE unpublished). By observing conversion of area under one peak in a chromatogram to another and identifying the size of the component represented by each peak, we can gather information about how tannin degradation occurs. We could gather evidence to suggest that degradation proceeds, for example, by the removal of one subunit at a time from polymers to directly form small, monomeric phenolics, or perhaps by random fragmentation of long polymers into smaller ones until monomeric phenolics finally result. Being able to separate CT samples by size and chain length is therefore important for degradation studies.

Because condensed tannins are highly sorptive molecules and interact with the stationary phase of the column, size-exclusion chromatography cannot be properly utilized to separate CT samples, especially the larger polymers, by length3, 4. Normal phase (NP) HPLC3 has been proven capable of separating CT samples according to increasing degree of polymerization (DP), up to a DP of 10-11.
However, NP-HPLC techniques face issues with incomplete elution of the CT sample due to irreversible adsorption to the column, and with acidic degradation due to the highly acidic solvents that are used for most methods3.

Reversed phase (RP) HPLC is more commonly implemented for the separation of CT samples, and is the technique utilized by my lab group5. However, it is very difficult to obtain good resolution and separation of CT samples, especially longer chain-length polymers. Longer chain-length polymers have more opportunity to display slight variations, and the similar structure of these variations makes separation extremely difficult3. The chromatograms tend to show a large, unresolved hump of tannin material, rather than individual, distinct peaks3, 6. This lack of separation makes it difficult to make comparisons between tannin polymer size distributions in different samples.

Several techniques have been studied in an attempt to resolve this “unresolved hump”. NP-Hydrophilic Interaction Liquid Chromatography (HILIC) HPLC proved unsuccessful, with the unresolved hump still present in the chromatogram of grape tannins6. Another technique that achieved more success was centrifugal partition chromatography (CPC) with an ethyl acetate/acetone/water (2:1:1, v/v/v) solvent system6. This method was able to separate the “unresolved hump” of grape procyanidins, as observed by RP-HPLC, into distinct peaks based on DP6. Very few labs have CPC available to them, so the method has not been widely adopted.

To overcome these weaknesses in CT separation, studies have been performed to create an RP-HPLC method that is able to separate and resolve CT samples according to their molecular weight7. One such method presented by Putman and Butler utilizes a combination of linear and step gradients to achieve separation based on increasing molecular weight7. However, their method7 achieved separation and resolution of different length polymers only up to a peak with a DP of 13. Our Sorghum CT has a mean degree of polymerization (mDP) of 16, and we have not been able to achieve good separation and resolution of these samples using similar linear and step-wise gradient methods. Putman and Butler also found that CT sample components eluted with the change in solvent composition, at the step up to higher solvent organic content7. This finding suggests that solute-solvent interactions play a role in analyte adsorption/desorption to help achieve better peak resolution.

A study performed by Durner et al. similarly found that solute-solvent interactions for desorption, along with solute-column interactions for adsorption, are important factors in achieving good separation and resolution of polymers via RP-HPLC8. This study utilized principles of gradient polymer elution chromatography (GPEC) and high performance precipitation liquid chromatography (HPPLC) to introduce a novel technique, called high-resolution polymer HPLC (HRP-HPLC). HRP-HPLC is based on a saw-tooth gradient method, which achieves separation of polymers based on molecular weight by using changing solvent compositions to alter adsorption/desorption interactions8. Chromatograms of poly(vinylchloride), poly(methylmethacrylate), poly(dimethylsiloxane), and polypropylene glycol show the ability of this method to achieve molecular weight-based separation with good resolution of peaks for a variety of different polymers8.

Based on the separation results achieved by Durner et al. [8] for several different polymers, we created a modified saw-tooth gradient method for our RP-HPLC system to try to achieve better size distribution visualization of our Sorghum CT samples. We propose that the HRP-HPLC technique, making use of a saw-tooth gradient, improves separation and resolution of CT sample components based on their molecular weights. This advancement in analysis of condensed tannins with RP-HPLC will allow for better understanding of tannin size distributions in a wide variety of systems, including our current studies of tannin biodegradation.

Methods

Mature grain from two genetically distinct lines of high tannin Sorghum bicolor (L.) Moench was obtained and stored at 4 °C (Hi-Tannin Sumac NM03-9905, Scott Bean, USDA Manhattan Kansas; ICRISAT line IS8260, Gebisa Ejeta, Purdue University Department of Agronomy). Sorghum tannin was extracted from ground grain with methanol containing ascorbic acid and purified by ethyl acetate extraction to remove small phenolics, followed by Sephadex LH20 chromatography to isolate the high molecular weight fraction [9, 10]. The freeze-dried powder was stored at -20 °C.

Commercial (Hershey’s Natural Unsweetened) cocoa powder, obtained at a local supermarket, was defatted using Soxhlet extraction with hexane. The sample was then extracted with acetone:water (70:30, v/v) four times, discarding the solid. Acetone was removed from the extract and replaced with ethanol by successive rotary evaporation, and the final ethanol-water mixture was centrifuged to remove insoluble solids before applying the sample to Sephadex LH20 equilibrated in 95% ethanol. After eluting all of the non-adsorbing components (theobromine, sugars, small phenolics) with ethanol, the condensed tannin was eluted with 70% aqueous acetone. Acetone was removed from the solution by rotary evaporation, and the aqueous solution was freeze dried. The freeze-dried powder was stored at -20 °C.

All HPLC samples were analyzed on an Agilent 1050 HPLC with ChemStation Rev. A.09.03 software. The column was a Thermofisher Hypersil Gold C8, 160 x 4.6 mm, 3 µm packing. Sample injection volume was 10 µL. The gradient program employed 0.13% (v/v) trifluoroacetic acid (TFA) in nanopure water (A) and 0.10% (v/v) TFA in acetonitrile (B) at a flow rate of 0.5 mL/min. Samples were analyzed using three different separation methods, each lasting 80 min.

Method 1 was a linear gradient, as follows: 0-4 min, isocratic at 2% B; 55 min, increase to 55% B; 60 min, increase to 85% B; 65-70 min, decrease to 2% B, then hold isocratic until the end.

Method 2 was a more step-wise version of Method 1, as follows: 0-4 min, isocratic at 2% B; 22 min, increase to 19% B; 45-55 min, increase to 55% B; 60 min, increase to 85% B; 65-70 min, decrease to 2% B, then hold isocratic until the end.

Method 3 was a modified saw-tooth gradient, as follows: 0-4 min, isocratic at 2% B; 10 min, increase to 20% B; 15 min, decrease to 5% B; 17 min, increase to 30% B; 22 min, decrease to 10% B; 24 min, increase to 40% B; 29 min, decrease to 20% B; 31 min, increase to 50% B; 36 min, decrease to 30% B; 38 min, increase to 60% B; 43 min, decrease to 40% B; 45 min, increase to 70% B; 55 min, increase to 80% B; 70 min, decrease to 2% B, then hold isocratic until the end.

After separation, the main peaks in each chromatogram were analyzed for peak areas and peak symmetry, using the Agilent software [11].
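The Method 3 program is a piecewise-linear timetable, so the solvent composition at any time can be recovered by interpolating between the listed breakpoints. The following is a minimal Python sketch of that idea, not the ChemStation timetable itself. The names `METHOD_3` and `percent_b` are hypothetical, and the sketch assumes that each listed time is the end point of a linear ramp from the previous breakpoint.

```python
from bisect import bisect_right

# Method 3 breakpoints as (time in min, % B), transcribed from the text.
# Assumption: each entry is reached by a linear ramp from the previous one.
METHOD_3 = [
    (0, 2), (4, 2), (10, 20), (15, 5), (17, 30), (22, 10), (24, 40),
    (29, 20), (31, 50), (36, 30), (38, 60), (43, 40), (45, 70),
    (55, 80), (70, 2), (80, 2),
]


def percent_b(program, t):
    """Return % B at time t (min) by linear interpolation between breakpoints."""
    times = [point[0] for point in program]
    if t <= times[0]:
        return float(program[0][1])
    if t >= times[-1]:
        return float(program[-1][1])
    i = bisect_right(times, t)  # index of the first breakpoint later than t
    (t0, b0), (t1, b1) = program[i - 1], program[i]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


if __name__ == "__main__":
    for minute in (0, 7, 16, 50, 80):
        print(f"{minute:>3} min: {percent_b(METHOD_3, minute):.1f}% B")
```

Methods 1 and 2 could be encoded the same way, but entries such as "45-55 min, increase to 55% B" would first need to be interpreted as a hold followed by a ramp, which is an assumption about the instrument timetable.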

Some samples were analyzed by thiolysis according to published procedures [12, 13]. Fractions were dried and redissolved in 100 µL of methanol containing 30 µL of the HCl reagent (32% (v/v) HCl in methanol) and 72 µL of the thiol reagent (5% (v/v) toluene-α-thiol in methanol) and incubated at 40 °C for 30 min. The thiolytic degradation products were analyzed by HPLC using an Agilent 1100 HPLC with ChemStation Rev. A.09.03 software. The column was a Thermofisher Hypersil Gold C8, 160 x 4.6 mm, 3 µm packing. Sample injection volume was 10 µL. The gradient program employed 0.13% (v/v) trifluoroacetic acid (TFA) in nanopure water (A) and 0.10% (v/v) TFA in acetonitrile (B) at a flow rate of 0.5 mL/min at 27.0 °C for a 48 min duration, as follows: 0-3 min, isocratic at 15% B; 8 min, increase to 20% B; 10 min, increase to 30% B; 28-32 min, increase to 70% B; 37-40 min, decrease to 15% B, then hold isocratic at 15% B until the end [13]. Reaction products were detected at 220 nm and were identified by their retention times and spectral characteristics compared to authentic standards [12, 13]. Products were quantitated based on peak areas and converted to moles of extender and moles of terminal units. The chromatograms from control samples that did not contain acid or thiol and were not heated were used to confirm that all of the tannin was degraded by thiolysis, and to confirm that the tannin did not contain any flavan-3-ol contamination that would interfere with terminal unit determination.
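As a worked illustration of how moles of extender and terminal units translate into a mean degree of polymerization, the sketch below uses the conventional thiolysis definition, mDP = (extender + terminal) / terminal. The function name `mean_dp` and the example mole values are hypothetical and are not data from this study.

```python
def mean_dp(extender_mol, terminal_mol):
    """Mean degree of polymerization from thiolysis products.

    Assumes the conventional definition: total flavan-3-ol units
    (extender thiol adducts plus terminal units) per terminal unit.
    """
    if terminal_mol <= 0:
        raise ValueError("terminal_mol must be positive")
    return (extender_mol + terminal_mol) / terminal_mol


# Hypothetical example values (nmol), not measurements from this work:
# 15 nmol of extender adducts per 1 nmol of terminal units.
example_mdp = mean_dp(extender_mol=15.0, terminal_mol=1.0)
```

Both inputs must be in the same units, since only their ratio matters.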

Results and Discussion

The Sorghum CT and cocoa CT were previously analyzed by HSQC NMR and thiolysis to gather information about relevant structural features (Table 1). These techniques provide the overall, average polymer composition of heterogeneous CT samples. Analysis revealed that the cocoa CT consisted of (epi)catechin subunits with a mean degree of polymerization (mDP) between 1.7 and 4 (Table 1). The cocoa CT shows no evidence of A-type linkages. This information supports the structure proposed by previous studies [14]. The Sorghum CT was found to contain (epi)catechin subunits and 5% A-type linkages, with an mDP of about 16 (Table 1). This structural information is supported by previous studies [15]. The Sorghum CT polymers were, on average, longer than the cocoa polymers, but both preparations were heterodisperse mixtures.

HPLC chromatograms of both the Sorghum and cocoa CT samples analyzed by Method 1 show large, unresolved humps (Figures 6, 7). There are several small spikes towards the beginning of the cocoa CT hump, which we speculated could be the small monomeric, dimeric and possibly trimeric polyphenolic components characteristic of cocoa tannin [14] (Table 1, Figure 6). However, the rest of the hump shows no distinct peaks (Figure 6). The Sorghum CT hump shows no distinct peaks (Figure 7). Based on the assumption that the peaks on the early side of the hump were lower DP oligomers, we postulated that the difference in the shapes of the humps obtained from Sorghum and cocoa CT could result from the difference in average DP between the two CT samples (Table 4). The cocoa CT peak may show tailing to the right because it contains more short chain-length oligomers, while the Sorghum CT peak may show tailing to the left because it contains more long chain-length polymers (Tables 1, 4).

The HPLC chromatogram of cocoa CT analyzed by Method 2 shows two peaks, the first eluting during the isocratic phase of the separation and the second eluting during the second gradient in the solvent program (Figure 8). The first peak appears to include many unresolved compounds, similar to but more complex than the peak in Method 1 (Figures 6, 8). The second peak does not reveal underlying components. The ratio of the area of the first peak to the area of the second peak is about 1.5:1. The HPLC chromatogram of Sorghum CT analyzed by Method 2 shows a similar two-peak separation (Figure 9). There are several very small bumps along the beginning of the first peak. The second peak, as in the cocoa chromatogram, does not reveal any underlying components. The ratio of the area of the first peak to the area of the second peak is about 1.4:1.

Fractions of the eluate from Sorghum CT separated by Method 2 were collected and pooled as follows: 18-24 min (fraction A), 24-27 min (fraction B), 27-30 min (fraction C), 30-33 min (fraction D), 33-37.5 min (fraction E), 37.5-43.5 min (fraction F), 53-57.5 min (fraction G) (Figure 9). Fractions A, B, C, D, E, and F were from the first peak, while fraction G was from the second peak. These fractions were analyzed by thiolysis. Thiolysis results yielded a trend of increasing mDP for the fractions (Table 5). This suggests separation based on increasing molecular weight and DP.

These fractions were also reinjected and reanalyzed by HPLC Method 2. Surprisingly, the reinjected fractions A, B, C, D, E, and F did not elute as single peaks corresponding to their original time of elution. Instead, each of these fractions re-separated into two peaks similar to the original chromatogram (Figure 10). However, in these reinjected fractions, the first peak was asymmetric, skewed towards the time at which the fraction was collected (Figure 10). The relative sizes of the first and second peaks also differ from one chromatogram to the next (Figure 10).

The separation of the fractions into two different components upon re-chromatography suggests that the CT may exist in aggregated and disaggregated forms, and that these forms can be separated chromatographically. The thiolysis data show that the fractions of the first peak have mDP ranging from 4 to 12. Each of these fractions separates to yield two peaks, with the later peak corresponding to a fraction with mDP of 21 (Table 5). These results may suggest that CT polymers have a tendency towards aggregation at a certain equilibrium constant or ratio, and that the second peak could represent this aggregated component. The aggregation equilibrium constant could differ for different types of CT, explaining the difference in peak area ratios for cocoa and Sorghum CT chromatograms.
The slightly lower peak area ratio of the Sorghum CT suggests that it may have a slightly higher tendency towards aggregation, and a higher aggregation equilibrium constant, than cocoa CT. Aggregation tendencies could also be affected by polymer size, as evidenced by the differing relative areas among the reinjected fractions of the Sorghum CT.

The analysis of reinjected fraction G showed only a large second peak, with no evidence of a first peak. This result suggests that aggregation is accompanied by formation of covalent linkages, such that the aggregated CT cannot disaggregate. Irreversible crosslinking of aggregates has been noted for wine tannins under some conditions [16] and is consistent with our thiolysis data. Thiolysis is a degradative method of polymer analysis in which the extender units of the CT are released as thiol adducts, but the terminal unit is not modified during the analysis. The ratio of total units (thiol adducts plus terminal units) to terminal units is the mDP for the polymeric CT. Formation of simple, reversible aggregates will not be reflected in the mDP of a polymer as determined by thiolysis, since the individual polymers in the aggregate will still have their original extender-to-terminal ratio. Aggregation accompanied by covalent linkage formation would increase the mDP determined by thiolysis because the new aggregates would have longer chain lengths and higher extender-to-terminal ratios. We propose that under some abiotic conditions, CT can undergo aggregation and polymerization to yield larger molecules than those found in the original plant extract. In the future, we will need to define the conditions and establish the structures of these higher polymers. Understanding this abiotic pathway for polymerization could provide significant insights into both the fate of tannins in natural systems and the pathway for CT polymer formation in vivo.

The HPLC chromatograms of both the Sorghum and cocoa CT samples analyzed by Method 3 show elution peaks corresponding to the “teeth” of the saw-tooth gradient (Figures 11, 12). As with the other chromatographic methods used here, the first peak of the cocoa preparation appears to be a series of partially resolved components (Figure 11). The remaining peaks for cocoa, and all the peaks for the Sorghum CT, do not reveal individual species (Figures 11, 12). The cocoa CT chromatogram shows a ratio of about 4.5:6.0:4.8:1 for the peak areas, while the Sorghum CT chromatogram shows a ratio of about 2.0:8.0:6.9:1 (Table 4). This difference in distribution could correspond to the difference in mDP of these CT samples (Table 1). Since the Sorghum CT is composed of longer polymers, the peak area distribution is shifted towards the peaks that represent the higher DP components.
Conversely, the cocoa CT peak area distribution is shifted towards the earlier peaks that represent the smaller DP components.

Fractions of the Sorghum CT Method 3 chromatogram were collected as follows: 16.5-18.5 min (fraction H), 23.5-26 min (fraction I), 30-31.5 min (fraction J), 36.5-37.7 min (fraction K) (Figure 12). These fractions were analyzed by thiolysis. Thiolysis results yielded a trend of increasing mDP for the fractions (Table 5). This confirms the postulate that separation is based on molecular weight and mDP. Using the thiolysis-determined mDP of each peak and their relative areas, we calculated the weighted mDP of the Sorghum CT sample to be 17.95. The average mDP we have obtained for our Sorghum CT is 16.0, based on many replicate analyses of many preparations, so a weighted mDP of 17.95 is not unreasonable. This calculation was similarly performed for the cocoa CT to obtain a weighted mDP value of 16.46. This differs substantially from the mDP of 1.7 to 4 determined by thiolysis and NMR (Table 1). However, in gathering the data for the cocoa CT Method 3 peak areas, there was a technical difficulty, and the blank could not be subtracted from the chromatogram signals. More careful study is needed to determine the reliability of the weighted mDP calculation for samples analyzed by Method 3.

Method 3, unlike Method 2, does not reveal aggregation. The two-step gradient of Method 2 allows unaggregated components to elute during the first step’s isocratic period, while the much higher organic content of the second step elutes the more strongly adsorbed aggregated components. Further research could attempt to gain more understanding of this aggregation. Replicate samples could be analyzed following different resting periods to study the effect of time on size distribution and aggregation. Additionally, fractions of the eluate from the second peak of Method 2 chromatograms could be reanalyzed using Method 3 to investigate the size distribution of these aggregates. Method 3 should also be studied to probe the effects of sample concentration and volume on the separation [8].

We propose that Method 3, a modified saw-tooth gradient technique, shows promising results for achieving improved resolution and separation of CT samples based on molecular weight. Method 3 could be very useful in our degradation studies, as well as in other CT structural analysis research. Future work includes further testing of this method, such as reanalysis of reinjected fractions and analysis of other CT samples.
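The weighted mDP described above for the Method 3 fractions combines each peak's thiolysis-determined mDP with its relative peak area. A minimal Python sketch follows, assuming an area-weighted arithmetic mean; the exact weighting used for the reported values is not stated, so this is an assumption. The function name `weighted_mdp` is hypothetical. The area ratio is the Sorghum CT ratio quoted in the text, while the per-fraction mDP values are placeholders, not the Table 5 data, so the result is not expected to reproduce 17.95.

```python
def weighted_mdp(areas, mdps):
    """Area-weighted arithmetic mean of per-peak mDP values."""
    if not areas or len(areas) != len(mdps):
        raise ValueError("areas and mdps must be non-empty and of equal length")
    total_area = sum(areas)
    if total_area <= 0:
        raise ValueError("total peak area must be positive")
    return sum(a * m for a, m in zip(areas, mdps)) / total_area


# Relative peak areas for Sorghum CT, Method 3 (ratio quoted in the text).
SORGHUM_AREA_RATIO = [2.0, 8.0, 6.9, 1.0]

# Placeholder mDP values for fractions H-K (hypothetical, NOT Table 5 data).
PLACEHOLDER_MDPS = [5.0, 10.0, 20.0, 30.0]

estimate = weighted_mdp(SORGHUM_AREA_RATIO, PLACEHOLDER_MDPS)
```

Because UV peak area tracks the mass of flavan-3-ol units rather than the number of polymer chains, an area-weighted mean is only one possible choice, and a number-average calculation would weight the fractions differently.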

Conclusion

The separation of polymers is much more challenging than that of small molecules because of the focus of the separation: the goal of polymer chromatography is to separate the sample based on polymer characteristics, such as molecular weight distribution, size distribution, chemical properties, or chain structure [8]. All of the slight variations in these characteristics among polymer species in a sample impose challenges on the separation. However, the saw-tooth gradient method achieves good separation of both hydrophobic and hydrophilic polymers, based on molecular weight, using HRP-HPLC principles of solubility versus precipitation, and adsorption versus desorption interactions [8]. The changing solvent composition facilitates solute-solvent interactions that improve separation [7]. This method overcomes the weaknesses of RP-HPLC to allow for improved condensed tannin polymer separation. RP-HPLC is much more common, useful, and accessible than the techniques of NP-HPLC [3], HILIC-HPLC [6], CPC-HPLC [6], and ion-mobility MS [17]. This means that more researchers can perform improved analysis of procyanidins to enhance our knowledge of structure, structure-function relationships, and degradation. This advancement will allow for greater insight into the roles of CT in physiological systems and ecosystems.

Figure 6. Chromatogram of cocoa CT separated by Method 1 with solvent gradient profile. Partially resolved components at lower retention times may be dimers, trimers and oligomers.

Figure 7. Chromatogram of Sorghum CT separated by Method 1 with solvent gradient profile. There is no evidence for abundant dimers, trimers or oligomers.

Figure 8. Chromatogram of cocoa CT separated by Method 2 with solvent gradient profile. Partially resolved peaks at low retention times may be dimers, trimers and oligomers.

Figure 9. Chromatogram of Sorghum CT separated by Method 2 with solvent gradient profile. Gathered fractions are labeled with letters (A-G) and are delineated by the yellow boxes for the first peak.

Figure 10. Chromatograms of reinjected fractions of Sorghum CT, analyzed by Method 2 with solvent gradient profile. A) separation of fraction A from Figure 9, B) fraction B from Figure 9, C) fraction C from Figure 9, D) fraction D from Figure 9, E) fraction E from Figure 9, F) fraction F from Figure 9, and G) fraction G from Figure 9.

Figure 11. Chromatogram of cocoa CT separated by Method 3 with solvent gradient profile.

Figure 12. Chromatogram of Sorghum CT separated by Method 3 with solvent gradient profile. Gathered fractions are labeled with letters (H-K).

Table 4. Preliminary analysis of main peaks for Cocoa and Sorghum CT separated by three HPLC solvent methods.

1 For the Cocoa samples with poorly resolved strong initial peaks, the ChemStation software used the time for the highest peak as the retention time, although the peak was clearly broad and poorly resolved with a later average retention time.

2 The area estimates were made on the raw data without subtracting the background, which was negligible for these runs with high analyte concentration. In some cases, the areas are only estimates because the peaks were off scale (max abs > 2 AU).

3 The symmetry was calculated by the ChemStation software as the ratio between the area of the leading half of the peak and the lagging half of the peak. Symmetry = 1, the peak is symmetrical. Symmetry < 1, the back half of the peak comprises a larger area. Symmetry > 1, the front half of the peak comprises a larger area. For the Cocoa samples with poorly resolved strong initial peaks, the symmetry is underestimated because of the incorrect instrumental assignment of retention time.
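Footnote 3 describes symmetry as a ratio of the area of the leading half of a peak to the area of the lagging half. The Python sketch below illustrates that description on a sampled trace by splitting the peak at its apex and integrating each half with the trapezoid rule. It is a simplified reading of the footnote, not the ChemStation algorithm, and the function names and example traces are hypothetical.

```python
def trapezoid_area(times, signal):
    """Integrate a sampled signal over time with the trapezoid rule."""
    return sum(
        (t1 - t0) * (s0 + s1) / 2.0
        for t0, t1, s0, s1 in zip(times, times[1:], signal, signal[1:])
    )


def peak_symmetry(times, signal):
    """Ratio of front-half area to back-half area, split at the apex."""
    apex = max(range(len(signal)), key=lambda i: signal[i])
    front = trapezoid_area(times[: apex + 1], signal[: apex + 1])
    back = trapezoid_area(times[apex:], signal[apex:])
    if back == 0:
        raise ValueError("back half of the peak has zero area")
    return front / back


# Hypothetical traces: a symmetric triangle and a peak that tails to the right.
symmetric = peak_symmetry([0, 1, 2, 3, 4], [0.0, 1.0, 2.0, 1.0, 0.0])
tailing = peak_symmetry([0, 1, 2, 3, 4, 5], [0.0, 2.0, 1.5, 1.0, 0.5, 0.0])
```

The sketch also shows why a mis-assigned apex matters: if the apex is placed too early on a broad peak, the front area shrinks and the computed symmetry falls, in line with the underestimate noted in the footnote.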

Table 5. Thiolysis data of Sorghum CT HPLC fractions. Fractions A through G represent the fractions collected from a Method 2 chromatogram (Figure 9). Fractions H-K represent the peaks from a Method 3 chromatogram (Figure 12). Mean degree of polymerization (mDP) for each fraction was determined by thiolysis.

Bibliography

1. Cifuentes-Gomez, T.; Rodriguez-Mateos, A.; Gonzalez-Salvador, I.; Alanon, M. E.; Spencer, J. P. E., Factors affecting the absorption, metabolism, and excretion of cocoa flavanols in humans. Journal of Agricultural and Food Chemistry 2015, 63 (35), 7615-7623.
2. Monagas, M.; Urpi-Sarda, M.; Sanchez-Patan, F.; Llorach, R.; Garrido, I.; Gomez-Cordoves, C.; Andres-Lacueva, C.; Bartolome, B., Insights into the metabolism and microbial biotransformation of dietary flavan-3-ols and the bioactivity of their metabolites. Food & Function 2010, 1 (3), 233-253.
3. Hummer, W.; Schreier, P., Analysis of proanthocyanidins. Molecular Nutrition & Food Research 2008, 52 (12), 1381-1398.
4. Cheynier, V., Phenolic compounds: from plants to foods. Phytochemistry Reviews 2012, 11 (2-3), 153-177.
5. Khoddami, A.; Wilkes, M. A.; Roberts, T. H., Techniques for analysis of plant phenolic compounds. Molecules 2013, 18 (2), 2328-2375.
6. Ma, W.; Waffo-Teguo, P.; Paissoni, M. A.; Jourdes, M.; Teissedre, P. L., New insight into the unresolved HPLC broad peak of Cabernet Sauvignon grape seed polymeric tannins by combining CPC and Q-ToF approaches. Food Chemistry 2018, 249, 168-175.
7. Putman, L. J.; Butler, L. G., Separation of high molecular weight sorghum procyanidins by high-performance liquid chromatography. Journal of Agricultural and Food Chemistry 1989, 37 (4), 943-946.
8. Durner, B.; Ehmann, T.; Matysik, F.-M., High-resolution polymer high performance liquid chromatography: Application of a saw tooth gradient for the separation of various polymers. Journal of Chromatography A 2019, 1587, 88-100.
9. Hagerman, A. E., Tannin Handbook. Miami University, Oxford OH 45056, 2011. www.users.muohio.edu/hagermae/.
10. Hagerman, A. E.; Butler, L. G., Condensed tannin purification and characterization of tannin-associated proteins. Journal of Agricultural and Food Chemistry 1980, 28 (5), 947-952.
11. Agilent Technologies, Agilent ChemStation: Understanding your ChemStation. 07/09 ed.; Germany, 2009; pp 240-241.
12. Li, C.; Leverence, R.; Trombley, J. D.; Xu, S.; Yang, J.; Tian, Y.; Reed, J. D.; Hagerman, A. E., High molecular weight persimmon (Diospyros kaki L.) proanthocyanidin: A highly galloylated, A-linked tannin with an unusual flavonol terminal unit, myricetin. Journal of Agricultural and Food Chemistry 2010, 58, 9033-9042.
13. Scioneaux, A. N.; Schmidt, M. A.; Moore, M. A.; Lindroth, R. L.; Wooley, S. C.; Hagerman, A. E., Qualitative variation in proanthocyanidin composition of Populus species and hybrids: Genetics is the key. Journal of Chemical Ecology 2011, 37, 57-70.
14. Cooper, K. A.; Campos-Gimenez, E.; Alvarez, D. J.; Nagy, K.; Donovan, J. L.; Williamson, G., Rapid reversed phase ultra-performance liquid chromatography analysis of the major cocoa polyphenols and inter-relationships of their concentrations in chocolate. Journal of Agricultural and Food Chemistry 2007, 55 (8), 2841-2847.
15. Gupta, R. K.; Haslam, E., Plant proanthocyanidins. Part 5. Sorghum polyphenols. Journal of the Chemical Society-Perkin Transactions 1 1978, (8), 892-896.
16. Vernhet, A.; Dubascoux, S.; Cabane, B.; Fulcrand, H.; Dubreucq, E.; Poncet-Legrand, C., Characterization of oxidized tannins: comparison of depolymerization methods, asymmetric flow field-flow fractionation and small-angle X-ray scattering. Analytical and Bioanalytical Chemistry 2011, 401 (5), 1559-1569.
17. Rue, E. A.; Glinski, J. A.; Glinski, V. B.; van Breemen, R. B., Ion mobility-mass spectrometry for the separation and analysis of procyanidins. Journal of Mass Spectrometry 2020, 55 (2).

Chapter 4: Conclusion

In this thesis, we demonstrated that the techniques of thiolysis with HPLC, NMR, FT-ICR ESI-MS, FT-ICR MALDI-MS, FTIR spectroscopy, and methanolysis are useful methods for the structural analysis of condensed tannin samples. Thiolysis and HSQC NMR provided overall, average structural data for condensed tannin samples. FTIR spectroscopy and methanolysis are valuable tools for the detection of galloylation in high molecular weight polyphenols and can be used to verify other analytical results. FT-ICR ESI-MS provides information about interflavan linkage types. FT-ICR MALDI-MS provides complementary data to show the exact structural composition of individual polymer species. It also offers new insights into the positions of the A-type linkages, which were found predominantly in the smaller CT oligomers.

However, this MALDI-MS technique presented a misleading MS peak pattern, found to be the result of supramolecular complexation between the condensed tannin sample and the DHB matrix. This complexation was evidenced through the use of 1H NMR. The results suggested that the complexation may occur through π-π stacking interactions. Therefore, although FT-ICR MALDI-MS has proven to be a very useful technique for structural analysis of CT, the results must be interpreted cautiously. Analyses must take into consideration the formation of tannin-matrix complexes, which are indistinguishable from gallate esters by MALDI-MS alone. Awareness of this issue can help to prevent misidentification and subsequent error in structural determination of CT molecules in future research. A suite of analytical tools must be utilized for complete, accurate structural determination of condensed tannins.

We also demonstrated that a modified saw-tooth gradient RP-HPLC method shows promising results for achieving improved resolution and separation of CT samples based on molecular weight.
It overcame the weaknesses of previous RP-HPLC methods, and is more convenient than other separation methods, such as NP-HPLC [1], HILIC-HPLC [2], CPC-HPLC [2], and ion-mobility MS [3] techniques. Our modified saw-tooth gradient RP-HPLC method could be very useful in our degradation studies, as well as in other CT structural analysis research.

Uncovering and solving problems associated with qualitative and quantitative analysis of condensed tannins will help advance techniques for better understanding of these important molecules. Expanding knowledge of tannin degradation can improve our knowledge of carbon and nitrogen cycling, the role that tannins play in this process, and how they are impacted by global climate change [4, 5]. This can also enhance our understanding of the role of tannins in the metabolisms of plants, microbes, animals, and even humans [6, 7]. Tannins are widely prevalent in our ecosystems and our diets, so the more accurately we can study them, the more we can comprehend their impacts on the health of our bodies and our environment.

Bibliography

1. Hummer, W.; Schreier, P., Analysis of proanthocyanidins. Molecular Nutrition & Food Research 2008, 52 (12), 1381-1398.
2. Ma, W.; Waffo-Teguo, P.; Paissoni, M. A.; Jourdes, M.; Teissedre, P. L., New insight into the unresolved HPLC broad peak of Cabernet Sauvignon grape seed polymeric tannins by combining CPC and Q-ToF approaches. Food Chemistry 2018, 249, 168-175.
3. Rue, E. A.; Glinski, J. A.; Glinski, V. B.; van Breemen, R. B., Ion mobility-mass spectrometry for the separation and analysis of procyanidins. Journal of Mass Spectrometry 2020, 55 (2).
4. Kraus, T. E. C.; Dahlgren, R. A.; Zasoski, R. J., Tannins in nutrient dynamics of forest ecosystems - a review. Plant and Soil 2003, 256 (1), 41-66.
5. Zak, D.; Roth, C.; Unger, V.; Goldhammer, T.; Fenner, N.; Freeman, C.; Jurasinski, G., Unraveling the importance of polyphenols for microbial carbon mineralization in rewetted riparian peatlands. Frontiers in Environmental Science 2019, 7.
6. Piluzza, G.; Sulas, L.; Bullitta, S., Tannins in forage plants and their role in animal husbandry and environmental sustainability: a review. Grass and Forage Science 2014, 69 (1), 32-48.
7. Monagas, M.; Urpi-Sarda, M.; Sanchez-Patan, F.; Llorach, R.; Garrido, I.; Gomez-Cordoves, C.; Andres-Lacueva, C.; Bartolome, B., Insights into the metabolism and microbial biotransformation of dietary flavan-3-ols and the bioactivity of their metabolites. Food & Function 2010, 1 (3), 233-253.
