THESIS
CALIFORNIA STATE UNIVERSITY SAN MARCOS
THESIS SIGNATURE PAGE
THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE
MASTER OF SCIENCE IN CHEMISTRY
TITLE: Investigation into the Secondary Metabolite Chemistry of Rhamnus crocea Inside and Outside Hermes copper Range
AUTHOR(S): Alyssa Dubord
DATE OF SUCCESSFUL DEFENSE: 05/03/2021
THE THESIS HAS BEEN ACCEPTED BY THE THESIS COMMITTEE IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN CHEMISTRY
Jackie Trischman 05/10/2021 COMMITTEE CHAIR SIGNATURE DATE
George Vourlitis 05/12/2021 COMMITTEE MEMBER SIGNATURE DATE
Michael Schmidt 05/17/2021 COMMITTEE MEMBER SIGNATURE DATE
Investigation into the secondary metabolite chemistry of
Rhamnus crocea inside and outside Hermes copper range
Alyssa Dubord
Thesis committee:
Jacqueline Trischman (chair), George Vourlitis, Michael Schmidt
Department of Chemistry and Biochemistry
College of Science, Technology, Engineering and Mathematics
California State University San Marcos
333 S. Twin Oaks Road, San Marcos, CA 92096
1 Dubord
TABLE OF CONTENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
    Hermes Copper (Lycaena hermes) & Spiny Redberry (Rhamnus crocea)
    Secondary metabolites
    Hypotheses
METHODS
    Previous work
    Current work
RESULTS & DISCUSSION
    Previous work
    Current work
ACKNOWLEDGEMENTS
REFERENCES
APPENDICES
ABSTRACT
Larvae of the Hermes copper butterfly, Lycaena hermes, are reared on their single host plant, the spiny redberry, Rhamnus crocea. This plant is rich in flavonoids, two of which are kaempferol and rhamnocrocin, the latter a novel compound. It was hypothesized that rhamnocrocin would exist in higher concentrations in R. crocea inside the L. hermes range than outside that range. In LC-MS/MS analysis, the most abundant compounds in all plant extracts occurred at [M+H]+ values of 741 m/z, 755 m/z and 317 m/z. Compound 741 was tentatively named rhamnocrocin in a separate structure elucidation project, Compound 755 appears to be a methylated version of rhamnocrocin, and Compound 317 has been identified as rhamnetin. A two-sample t-test identified no statistically significant difference between in-range and out-of-range concentrations for any of these compounds. All plant extracts were also analyzed for a kaempferol component, which appeared at 286 m/z. Approximately 22 different kaempferol-containing compounds were found in the plant extracts. A two-sample t-test showed a statistically significant difference in kaempferol concentration inside and outside the L. hermes range, with higher concentrations outside. Principal component analysis showed higher average and minimum monthly temperatures and less precipitation in areas in range. Higher precipitation was observed outside the range, as were higher kaempferol concentrations. L. hermes seems to prefer a range that is lower in precipitation, higher in average and minimum monthly temperature, and lower in kaempferol concentration. These factors may contribute to why L. hermes is found in such a small habitat compared to its host plant.
LIST OF TABLES
Table 1. PCA: Compounds (741, 755, 317)
Table 2. PCA: Compounds & climatic variables
Table 3. PCA: Compounds & soil variables
Table 4. PCA: Compounds & foliage variables
Table 5. PCA: Kaempferol & climatic variables
Table 6. PCA: Kaempferol & soil variables
Table 7. PCA: Kaempferol & foliage variables
Table 8. PCA: Kaempferol & Elevation
Table 9. Site names in and out
LIST OF FIGURES
Figure 1. Images of L. hermes and R. crocea
Figure 2. Mapped ranges of L. hermes and R. crocea habitat
Figure 3. Kaempferol structure
Figure 4. Chromatogram and 1H NMR data from old LC-MS/MS method
Figure 5. Chromatogram and mass spectral data of revised LC-MS/MS method
Figure 6. Proposed structure of rhamnocrocin
Figure 7. Mass spectrum for Rhamnocrocin analogues
Figure 8. Chromatogram and mass spectrum from EF-2018-01-2-3-9-α
Figure 9. Overlaid kaempferol chromatograms
Figure 10. PCA: Compounds (741, 755, 317)
Figure 11. PCA: Compounds & climatic variables
Figure 12. PCA: Compounds & soil variables
Figure 13. PCA: Compounds & foliage variables
Figure 14. PCA: Kaempferol & climatic variables
Figure 15. PCA: Kaempferol & soil variables
Figure 16. PCA: Kaempferol & foliage variables
Figure 17. PCA: Kaempferol & Elevation
Figure 18. Site maps for R. crocea sampling
INTRODUCTION
Hermes Copper (Lycaena hermes) & Spiny Redberry (Rhamnus crocea):
The Hermes copper butterfly (Lycaena hermes) is endemic to a small coastal sage scrub (CSS) region of Southern California and northern Baja California, Mexico (Marschalek et al., 2016; Figure 1A, B). Specifically, this region is located 80 km north of the US-Mexico border and 70 km east of San Diego, and a few records extend 160 km south into Baja California, Mexico (Thorne, 1963; Emmel and Emmel, 1973). Being both a sedentary and a specialist species, L. hermes is extremely vulnerable to extinction as its environment changes (Marschalek and Deutschman, 2008; Thorne, 1963). A large contributor to L. hermes habitat loss is the destruction caused by anthropogenic wildfires. For example, the 2003 Paradise, Cedar, and Mine fires destroyed 39% of remaining L. hermes populations (Hogan, 2006). Urbanization is another threat to natural habitat, as San Diego is projected to gain one million new residents by 2030 (Marschalek and Deutschman, 2008). The fragmentation of CSS habitat by both wildfires and urbanization puts L. hermes at greater risk of extinction. The butterfly is thus classified as vulnerable. Several petitions have been made to list the species under the Endangered Species Act, but all have been denied due to insufficient data on biological vulnerability and threats (staff of Carlsbad Fish and Wildlife, 2006). Today, Hermes copper remains a federal candidate for listing despite its dwindling population.
The spiny redberry (Rhamnus crocea) is the single larval host plant of L. hermes (Figure 1C, D). This plant is found in coastal chaparral and is native to Southern and Northern California as well as Baja California and parts of Arizona. The shrub's distribution far exceeds the habitat of L. hermes, but for unknown reasons the butterfly is sedentary and does not travel outside its range to plants which may seem suitable (Figure 2; Thorne, 1963). Though the Rhamnus genus has been used in various traditional medicines, very little is known about the chemical ecology of R. crocea (Mai et al., 2001; Vourlitis, 2018).
Figure 1. The underside of the L. hermes wings is bright yellow (A), and the upperside is yellow-orange with black spots (B). R. crocea (C) is the single larval host plant of this butterfly, which lays its eggs under new leaf growth (D) (Deutschman et al., 2011).
Figure 2. For unknown reasons, the range of L. hermes, shown in orange (A), is far smaller than the range of its host plant R. crocea, shown in green (B) (Deutschman et al., 2011; Montalvo, 2020).
Secondary metabolites:
Intermediary metabolism consists of enzyme-driven reactions which build and convert the organic compounds an organism needs to store metabolic energy. These reactions proceed through sequences known as metabolic pathways. Primary metabolism combines and synthesizes molecules such as carbohydrates, proteins, fats and nucleic acids into the primary metabolites needed to sustain life (Dewick, 2009).
In contrast to primary metabolites, secondary metabolites, often referred to as natural products, are those which have a limited distribution among organisms. They often give an organism some competitive advantage but are not essential to maintain life. Though the purposes of many natural products are still being discovered, some are known to act as toxins which ward off predators, as volatile compounds that attract the same or other species, or as coloring agents that attract or ward off other species. Natural products are produced from primary metabolites using enzymes. Many different natural products can be constructed from these building blocks (Dewick, 2009).
The synthesis and abundance of many secondary metabolites can be attributed to environmental conditions or to stresses placed on the organism (Ramakrishna and Ravishankar, 2011; Yang et al., 2018). Such environmental conditions include light, temperature, soil salinity and drought. The Rhamnaceae family is known to produce characteristic secondary metabolites such as triterpenes, cyclopeptide alkaloids, benzylisoquinoline alkaloids, and flavonoids (Alarcón and Cespedes, 2015). This last group, the flavonoids, is the focus of this investigation.
Through evolution, insects have come to recognize and respond to chemical signaling from their host plants via differing volatile cues which elicit a specialized response, for instance egg laying (Bruce, 2014). The choice of oviposition site is crucial for Lepidoptera, as it affects offspring survival and thus population survival (Garcia-Barros and Fartmann, 2009; Reisenman et al., 2010). In addition, larvae have been shown to sequester compounds from the leaves of their host plant to use for their own benefit. Butterflies of the Lycaenidae family, to which L. hermes belongs, have been shown to sequester flavonoids (Wiesen et al., 1994). This has been suggested to contribute to wing pattern and thus to species recognition and intraspecific visual communication (Wiesen et al., 1994). Kaempferol 3-O-glucoside was found to be the most abundant flavonoid in larvae, pupae and imagines and accounted for ~83–92% of all soluble flavonoids in adult butterflies (Wiesen et al., 1994). Kaempferol is a flavonoid that is commonly found in edible plants as well as in plants used in traditional medicine (Calderon-Montano et al., 2011) (Figure 3).
Flavonoids are a subdivision of secondary metabolites categorized as polyphenolics with
the carbon framework C6-C3-C6 (Samanta et al., 2011). Flavonoids are biosynthesized through
the shikimate pathway and citric acid cycle with the precursors phenylalanine and malonyl-CoA
(Samanta et al., 2011). They are located in the cell vacuoles of green plants and play an important role in the plant's interaction with its environment and in protecting the plant against various biotic and abiotic factors (Samanta et al., 2011).
Hypotheses:
L. hermes stays sedentary in its small, fragmented habitat despite the presence of seemingly suitable host plants outside that habitat. This fact led to the question: what is causing the sedentary behavior of this butterfly, and could the answer exist in the leaf chemistry of its host plant? To better understand the selective reproductive habits of L. hermes, the chemical composition of R. crocea leaves was investigated (Isbell, 2020). Leaves from R. crocea were harvested from 30 different locations in Southern California (Isbell, 2020). These locations included 15 sites where L. hermes is known to lay its eggs and 15 sites where it is not (Vourlitis, 2018). This collection was used to determine whether R. crocea leaves in range differ in chemical composition or foliage variables from those out of range (Isbell, 2020). Climatic variables and soil variables were also examined to determine whether any of these factors may be influencing attraction or deterrence of L. hermes (Isbell, 2020). From this analysis, a new compound, tentatively named rhamnocrocin, was identified with the molecular formula C33H40O19. Based on its kaempferol backbone, rhamnocrocin is classified as a flavone, a subclass of flavonoids (Samanta et al., 2011). It was predicted that R. crocea within the range of L. hermes would contain different concentrations of rhamnocrocin than R. crocea outside the range. In addition, variation in leaf N, C, lignin, soluble sugar, holocellulose, soil N and C as well as temperature and precipitation data were used to establish correlations with the concentrations of rhamnocrocin and other abundant compounds between sites in and out of range.
Figure 3. Shown here is the chemical structure of kaempferol. This secondary metabolite makes
up the backbone of the rhamnocrocin molecule.
METHODS
Previous work:
Total N, total C, C:N, lignin:N and H2O variables:
In 2018 following L. hermes flight season, ~2-6 R. crocea shrubs were sampled from each of 30 different locations in Southern California with the purpose of measuring variables of total N, total C, C:N, lignin:N and H2O content (Vourlitis, 2018). From each shrub, old and new
growth R. crocea leaves and stems were collected as well as surface soil samples (Vourlitis,
2018).
Chemical analysis:
Dried R. crocea leaf tissue was crushed, weighed and extracted by submersion in 75% methanol:25% dichloromethane for 24 hours in the dark. The supernatant was decanted and the remaining solvent was evaporated using a rotary evaporator. Each extract was then partitioned between isooctane and methanol into nonpolar and polar fractions. The nonpolar isooctane fraction was analyzed via gas chromatography-mass spectrometry (GC/MS) and the polar extracts were analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS).
Current work:
Solvent was removed in vacuo for all purified plant extracts, then the dried extract was weighed and dissolved in sufficient LCMS grade acetonitrile to reach a concentration of 3 mg/mL.
LC-MS/MS:
The electrospray ionization source of the Agilent 6410 triple quadrupole mass spectrometer used in this study was operated in positive ion mode. Nitrogen nebulizing gas temperature was set to 300 °C. An injection volume of 3 µL and a gradient mobile phase composed of acetonitrile with 0.1% formic acid and water with 0.1% formic acid were used for all acquisitions. Data were acquired for 25 min at a flow rate of 0.300 mL/min. For scan mode, an Agilent ZORBAX SB-C18 1.8 µm, 2.1 x 50 mm column was installed. For product ion mode (PIM) and multiple-reaction monitoring (MRM), a new column was installed. Both columns were left at ambient temperature and had flow rates of 0.300 mL/min.
The mass spectrometer parameters varied depending on the acquisition mode. In scan mode, a mass range of 50-1200 m/z was scanned over a 500 ms time period. For PIM scans, a mass range of 100-800 m/z for Plant 741, Plant 755 and Plant 777 and of 100-400 m/z for Plant 317 was scanned over a 500 ms time period; collision energies were set at 30, 10, 0 and 50 for Plant 317, Plant 741, Plant 755 and Plant 777, respectively. The same collision energies were used in MRM mode with 500 ms dwell times for each precursor → product ion transition. In SIM mode, a mass range of 50-400 m/z was scanned over a 500 ms time period for kaempferol at 287 m/z and the kaempferol fragment at 286 m/z.
HPLC:
An Agilent Technologies 1260 Infinity HPLC was attached to the quadrupole. The data analysis in the method used began at 0.1 min. The mobile phase consisted of solvent A (0.1% formic acid in water) and solvent B (0.1% formic acid in acetonitrile). The gradient elution was as follows: 10-20% B for 0.1-2 min; 20-95% B for 2-7 min; 95-99% B for 7-12 min; 99-5% B for 12-14 min; 5-10% B for 14-15 min; and re-equilibration for 15-25 min. UV detection was recorded at 254 nm and 215 nm.
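The gradient program above can be written down as a small lookup, which makes it easy to check the solvent composition at any retention time. The following Python sketch is illustrative only and is not part of the instrument method: it assumes each step is a linear ramp, and it assumes that the re-equilibration step (15-25 min) holds at 10% B, which the text does not state. The names `GRADIENT` and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# (time in min, % solvent B) breakpoints taken from the gradient described above.
# ASSUMPTION: re-equilibration from 15 to 25 min holds at 10% B.
GRADIENT = [
    (0.1, 10.0),
    (2.0, 20.0),
    (7.0, 95.0),
    (12.0, 99.0),
    (14.0, 5.0),
    (15.0, 10.0),
    (25.0, 10.0),
]


def percent_b(t_min):
    """Return % solvent B at time t_min, assuming linear ramps between breakpoints."""
    times = [t for t, _ in GRADIENT]
    if t_min <= times[0]:
        return GRADIENT[0][1]
    if t_min >= times[-1]:
        return GRADIENT[-1][1]
    # Index of the segment whose start time is <= t_min.
    i = bisect_right(times, t_min) - 1
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0.6, 4.5, 7.0, 13.0, 20.0):
        print(f"{t:5.1f} min: {percent_b(t):5.1f}% B")
```

Under these assumptions, a compound eluting before 1 min leaves the column while the mobile phase is still mostly aqueous, whereas compounds eluting near 7 min do so close to 95% B.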
Statistics:
Statistical tests were performed with the statistical package Minitab 18. The assumptions that samples were independent and randomly obtained were met. Using an outlier test, values lying more than 1.5 times the interquartile range beyond the quartiles were treated as outliers and removed from the data set. Tests for normality and homogeneity of variance (HOV) were performed before the test for significance. HOV was assessed with Levene's test. A two-sample t-test was performed to determine whether compound concentrations differed between ranges. The null hypothesis was that compound concentrations in samples within the L. hermes range did not differ from those outside the range. When p<0.05, the null hypothesis was rejected in favor of the alternative hypothesis. When p>0.05, the null hypothesis was retained.
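The outlier screen described above can be sketched in a few lines of Python. This is a minimal illustration rather than the Minitab procedure itself: it assumes the common rule that a value is an outlier when it lies more than 1.5 times the interquartile range below the first quartile or above the third quartile, and the quartile convention of Python's `statistics.quantiles` may differ slightly from the one Minitab uses. The function name and the example values are hypothetical.

```python
from statistics import quantiles


def remove_iqr_outliers(values, k=1.5):
    """Drop values lying more than k * IQR outside the first and third quartiles."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


if __name__ == "__main__":
    # Hypothetical peak areas (arbitrary units), not data from this study.
    areas = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]
    print(remove_iqr_outliers(areas))
```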
To further analyze the data, this study employed principal component analysis (PCA) plots generated using the programming language R and the integrated development environment RStudio. PCA is a dimensionality reduction technique which emphasizes variation and pattern in order to project the multiple dimensions of a data set onto a 2-dimensional plot (Powell, 2021; Karkare, 2021). The axes, called “principal components,” are constructed from eigenvalues and eigenvectors. They have no direct physical meaning, but are combinations of the variables in question (Powell, 2021). The transformation assigns the most variation to the horizontal axis “PC1” and the second-most variation to the vertical axis “PC2” (Powell, 2021). PC3 and higher components typically carry the least variation and are therefore dropped from the graphical representation (Powell, 2021). The farther the tip of a variable's arrow lies from the origin, the greater the contribution of that variable to the principal components (Fei et al., 2017).
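To make the link between eigenvalues and "variation explained" concrete, the sketch below computes the proportion of variance carried by the first two principal components of a small data set. It is written in Python for illustration and is not the R code used in this study. It assumes the variables are standardized (so the PCA is run on the correlation matrix), and it uses simple power iteration with deflation in place of a library eigen-decomposition. All function names and the example values are hypothetical.

```python
import math
from statistics import mean, stdev


def correlation_matrix(columns):
    """Correlation matrix for a list of variables (one list of values each)."""
    n = len(columns[0])
    z = []
    for col in columns:
        m, s = mean(col), stdev(col)
        z.append([(x - m) / s for x in col])  # standardize each variable
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def dominant_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    k = len(matrix)
    # Uneven start vector, so it is unlikely to be orthogonal to the answer.
    v = [1.0 / (i + 1) for i in range(k)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, v
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * matrix[i][j] * v[j] for i in range(k) for j in range(k))
    return eigenvalue, v


def explained_variance(columns, n_components=2):
    """Proportion of total variance explained by the first n_components PCs."""
    m = correlation_matrix(columns)
    k = len(m)
    total = sum(m[i][i] for i in range(k))  # trace = number of variables
    ratios = []
    for _ in range(n_components):
        lam, v = dominant_eigenpair(m)
        ratios.append(lam / total)
        # Deflation: remove the component just found before looking for the next.
        m = [[m[i][j] - lam * v[i] * v[j] for j in range(k)] for i in range(k)]
    return ratios


if __name__ == "__main__":
    # Hypothetical abundances for two compounds across five sites.
    comp_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    comp_b = [2.0, 1.0, 4.0, 3.0, 5.0]
    pc1, pc2 = explained_variance([comp_a, comp_b])
    print(f"PC1: {pc1:.1%}, PC2: {pc2:.1%}")
```

Each ratio is an eigenvalue divided by the sum of all eigenvalues, which is how percentages of explained variability are usually derived from a table of eigenvalues.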
RESULTS & DISCUSSION
Previous work:
Previous extraction work using LC-MS/MS and GC/MS revealed several soluble C-based compounds that were present in large quantities in R. crocea leaves. Further analysis with LC-MS/MS found that of the 49 compounds seen in this study, 10 compounds were present at statistically different concentrations between sites (Isbell, 2020). Three of the 10 compounds
seen may be alkaloids: alkaloid 1, alkaloid 2 and alkaloid 3 (Isbell, 2020). Alkaloid 1 and
alkaloid 3 were found in higher concentrations within L. hermes range with alkaloid 2 found
higher outside range (Isbell, 2020). Compounds that appear to be terpenes and iridoids were seen
in higher concentrations outside the L. hermes habitat (Isbell, 2020). Two sugars were also seen: glu 1 and glu 2. Glu 1 was found in higher concentrations outside range and glu 2 was found in higher concentrations within range. Flavonoids similar to rhamnocrocin were seen in significantly higher quantities outside L. hermes habitat (Isbell, 2020). Tocopherol was found in significantly higher quantities within range (Isbell, 2020). Compounds that appeared to be peptides were also measured in higher concentrations outside of the range (Isbell, 2020).
Current work:
LC-MS/MS:
The previous work contained discrepancies between LC-MS/MS and 1H NMR data. LC-
MS/MS data showed multiple peaks of several different compound classes with no major peak, indicating a sample with many components (Figure 4A). However, 1H NMR data revealed the
same sample to be a near pure compound or small set of similar compounds all containing a
kaempferol backbone (Figure 4B). These results initiated a close examination and revision of the
LC-MS/MS method. Most Rhamnus species contain metabolites with formula weights below 600 amu. However, the 13C NMR data indicated that three glycosides were likely attached to the
kaempferol backbone, rather than the typical two sugar moieties. This called for an expansion of
the m/z range in the LC-MS/MS method. This new method revealed a major peak, very polar in
nature, which eluted from the column with a retention time (Rt) less than 1 minute (Figure 5A).
This major peak exhibited a parent ion at 741 m/z (Figure 5B). This corresponded to a compound with a molecular weight of 740 amu and to the pure structure found by 1H NMR. After an extensive literature search returned no reports of this structure, the identified compound was deemed novel and has been tentatively named rhamnocrocin (Figure 6). Based on its kaempferol backbone, rhamnocrocin is classified as a flavone, which is a subclass of the flavonoids (Samanta et al., 2011; Figure 3B). In this compound, the hydroxyl group in kaempferol is replaced by a β-glycosidic linkage which attaches three consecutive glycoside groups (Figure 6). Flavonoids most commonly exist in glycosylated form and thus contain many hydroxyl groups, which makes them fairly water soluble. They also commonly exist with multiple methyl groups and isopentyl units, which can make them substantially more lipophilic (Samanta et al., 2011).
Further LC-MS/MS with all plant extracts revealed rhamnocrocin and two analogues of rhamnocrocin in amounts large enough to be identified. The actual rhamnocrocin molecule had a tR ≈ 0.598-0.740 min (Figure 7A). This was identified by analysis of a pure sample of rhamnocrocin, named EF-2018-01-2-3-9-α, obtained after a series of 3 purification steps. A second compound, clearly related to rhamnocrocin, had a tR ≈ 6.288-7.466 min (Figure 7B). Because this form has the same molecular weight but a different retention time, it is likely an isomer that differs in polarity, causing it to elute from the column at a later time. The third analog of rhamnocrocin has a parent ion at 755 m/z with a tR ≈ 7.667-8.673 min (Figure 7C). A difference in molecular weight of 14 corresponds to the addition of a methyl group to the molecule, so this compound is the methylated form of rhamnocrocin (Figure 7C). This was confirmed with 1H NMR data of various fractions in the large sample work up.
Figure 4. The chromatogram from the original LC-MS/MS method (A) contained several peaks of interest, indicating the presence of multiple compounds within the plant extract. However, 1H NMR spectra (B) indicated only one pure compound. These instrumental discrepancies led to revision of the LC-MS/MS method.
Figure 5. The chromatogram from the revised method (A) showed one strong peak representative of the single compound or group of closely related compounds that had been suggested by 1H NMR. MS results in scan mode showed a parent ion at 741 m/z corresponding to the rhamnocrocin molecule (B).
Figure 6. The proposed structure of the novel rhamnocrocin molecule determined by 2D NMR.
Figure 7. So far, there have been three analogs of rhamnocrocin discovered: the original, its isomer and a methylated version, all with differing retention times and parent ions at 741 (A.),
741 (B.) and 755 (C), respectively.
Figure 8. To confirm the tR of the original rhamnocrocin, a pure sample obtained by HPLC, called EF-2018-01-2-3-9-α, was run on LC-MS/MS. The resulting chromatogram (A) showed rhamnocrocin to have a tR ≈ 0.598-0.740 min. The corresponding mass spectrum (B) showed a parent ion at 741.3 m/z, indicating a molecular weight of 740 amu.
The mass analyzer of a triple quadrupole MS/MS like the one used in this study can be thought of as two moving belts stacked parallel to each other with a collision cell in the middle. The top belt, representing the first quadrupole, is fixed to select the precursor ions, which travel to the collision cell to be fragmented. The collision cell is a hexapole with six rods and nitrogen as the collision gas. Varying voltages are applied to both the collision cell and the quadrupoles to move the product ions to the third quadrupole. The fragments are filtered through the third quadrupole, resulting in a product-ion scan MS/MS spectrum that represents a compound's fingerprint. Selected reaction monitoring (SRM) mode is used when a specific precursor ion and a specific product ion are monitored. When multiple SRM transitions are run for the same precursor ion, this is called multiple reaction monitoring (MRM) (Agilent technologies, 2012).
Using the mass spectra of all extracts obtained in scan mode, the most common peaks throughout the samples were investigated. The most common compounds throughout all samples had parent ions at 317 m/z, 741 m/z, 755 m/z and 777 m/z. Ions 741 and 755 correspond to the rhamnocrocin molecule and its methylated form, respectively. Each 755 ion peak was accompanied by a 777 ion peak; the 755 m/z peak was always strong, and the 777 m/z peak was always weaker. The fragmentation patterns of 755 and 777 were almost identical. This means that 777 most likely has the rhamnocrocin backbone with something added to it. There is a difference of 22 m/z between 777 and 755, meaning whatever was added would have this mass. With this mass, one possibility is that sodium is present on the 777 form of rhamnocrocin. Sodium is a common contaminant in positive ESI mass spectrometry and typically originates from buffers used in other MS work or from contaminated solvents or glassware. To check this, a scan from 100-1000 m/z was performed in negative ion mode. In the resulting chromatogram, the 777 m/z peak disappeared, which strongly suggested that this peak corresponded to a sodium contaminant. If this is true, all 777 peaks correspond to the same organic constituent as the 755 peaks. The areas of the 777 peaks were therefore added to the 755 peak areas prior to statistical analysis. The 317 compound is most likely rhamnetin, a known compound in the Rhamnaceae family. It is a flavonol with the molecular formula C16H12O7.
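The ion assignments above can be checked with simple mass arithmetic. The Python sketch below computes monoisotopic [M+H]+ and [M+Na]+ values from a molecular formula; it is an illustration, not part of the original analysis. The formula used for the methylated analogue (rhamnocrocin plus CH2, i.e. C34H42O19) is an assumption, since the text gives only the formulas of rhamnocrocin and rhamnetin, and the function names are hypothetical.

```python
# Monoisotopic masses in unified atomic mass units (standard reference values).
ATOM_MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727647          # H+ (hydrogen atom minus one electron)
SODIUM_CATION = 22.98922070  # Na+ (sodium atom minus one electron)


def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(ATOM_MASS[element] * count for element, count in formula.items())


def adduct_mz(formula):
    """m/z of the singly charged protonated and sodiated ions."""
    m = neutral_mass(formula)
    return {"[M+H]+": m + PROTON, "[M+Na]+": m + SODIUM_CATION}


RHAMNOCROCIN = {"C": 33, "H": 40, "O": 19}  # formula given in the text
METHYLATED = {"C": 34, "H": 42, "O": 19}    # ASSUMED: rhamnocrocin + CH2
RHAMNETIN = {"C": 16, "H": 12, "O": 7}      # formula given in the text

if __name__ == "__main__":
    for name, formula in [
        ("rhamnocrocin", RHAMNOCROCIN),
        ("methylated rhamnocrocin (assumed formula)", METHYLATED),
        ("rhamnetin", RHAMNETIN),
    ]:
        ions = adduct_mz(formula)
        print(f"{name}: [M+H]+ = {ions['[M+H]+']:.2f}, [M+Na]+ = {ions['[M+Na]+']:.2f}")
```

Under these assumptions, the protonated ions fall at nominal 741, 755 and 317 m/z, and the sodiated ion of the assumed methylated formula falls at nominal 777 m/z. The gap between a sodiated and a protonated ion of the same molecule is about 21.98, which appears as 22 at unit resolution.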
Using SIM mode, each plant extract was also analyzed for the kaempferol fragment, which appeared at 286 m/z, and the kaempferol molecule at 287 m/z. The resulting chromatograms had 22
different retention times for kaempferol-containing peaks (Figure 9). This indicated that there are at least 22 different components containing kaempferol existing in the R. crocea plant extracts.
These could correspond to a multitude of compounds, which like rhamnocrocin, have kaempferol
as a backbone. Plant biosynthesis is performed through reactions using enzymes which create
new compounds by adding different functional groups onto already existing molecules. Natural
products can be biosynthesized through the combination of several building blocks obtained
from primary metabolites or by using a mixture of different building blocks (Dewick, 2009).
These combinations result in vast amounts of structural diversity which can be seen in the
overlaid chromatograms of the varying kaempferol containing compounds in Figure 9.
Figure 9. Overlaid kaempferol chromatograms, each color representing a different sample
extract. Different retention times among peaks correspond to different kaempferol containing
compounds.
Statistics:
Abundance was measured as the integration of the ion current at the specified mass over
time. The measured abundance of peak 741, corresponding to rhamnocrocin, from all in-range
sample extracts and out-of-range extracts, appeared to be normally distributed (p-value=0.186).
HOV was passed meaning equal variances between in and out regions (p-value=0.984). Two-
sample t-test results showed a p-value>0.05 meaning the null hypothesis was retained (p-
value=0.663).
The measured abundance for compound 755, the methylated rhamnocrocin molecule,
passed the assumption of normality (p-value=0.441) and passed HOV (p=0.595) meaning equal
variances. A two-sample t-test resulted in a p-value>0.05 meaning the null hypothesis was
retained (p-value=0.765).
The measured abundance for compound 317 passed both tests for normality and HOV
(p-value=0.139 and p-value=0.811, respectively). The two-sample t-test gave a p-value>0.05,
meaning the null hypothesis was retained (p-value=0.505).
The kaempferol fragment (at 286 m/z) did not come from a normal distribution (p-
value=0.033). HOV was not passed meaning unequal variances (p-value=0.008). Because of this,
for the two-sample t-test, a Welch’s t-test was performed. This test gave a p-value<0.05 meaning
the null hypothesis was rejected (p-value=0.015).
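Welch's t-test differs from the standard two-sample t-test in that it does not pool the two variances, which is why it is used when HOV fails. The Python sketch below shows how the test statistic and the Welch-Satterthwaite degrees of freedom are computed. It is illustrative only: the statistics in this study were run in Minitab, the sample values are hypothetical, and converting t and the degrees of freedom to a p-value (which needs the t distribution from a statistics package) is left out.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean; variance() uses n - 1.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical peak areas (arbitrary units), not data from this study.
    in_range = [1.0, 2.0, 3.0, 4.0, 5.0]
    out_of_range = [2.0, 4.0, 6.0, 8.0, 10.0]
    t_stat, dof = welch_t(in_range, out_of_range)
    print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

When the two groups have unequal variances, the degrees of freedom fall below the pooled value of n_a + n_b - 2, which makes the test more conservative.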
Compounds 741, 755 and 317 all had p-values>0.05. This means there was no statistically significant difference in rhamnocrocin concentrations (compound 741), methylated rhamnocrocin concentrations (compound 755) or rhamnetin concentrations (compound 317) when comparing R. crocea leaf extracts in and out of L. hermes range. However, rejection of the null hypothesis indicates statistically different kaempferol fragment concentrations inside and outside of L. hermes range. Average total chromatographic peak areas of 2.54 x 10^6 and 6.50 x 10^6 for in and out of range, respectively, show that kaempferol compound concentrations are higher in R. crocea plants outside the range of L. hermes.
PCA:
Previously obtained measurements of climate, soil and foliage variables were examined
to better understand when and why compounds 741, 755, 317 and kaempferol may be
biosynthesized. Climate variables included average monthly temperature, maximum average
monthly temperature, minimum average monthly temperature and total precipitation.
Soil variables included estimated N deposition, pH, total inorganic N, total N, total C and the ratio of C:N. Nitrogen deposition is the input of N from the atmosphere, and because many terrestrial ecosystems are N limited, they may benefit from additional N inputs through increased biomass production (Stevens et al., 2018). However, too much N can cause toxicity (Stevens et al., 2018). Both organic and inorganic forms of N occur in the soil (Li et al., 2014). The main inorganic forms are ammonium N (NH4+-N) and nitrate N (NO3−-N), which are necessary for direct plant uptake and biomass production (Li et al., 2014). Soil organic C and total N are used for estimating soil quality through their C:N ratio (Zhijing and Shaoshan, 2018). Soil pH directly controls factors such as microorganism activity, nutrient solubility and nutrient availability (Gentili et al., 2018). In acidic soils, micronutrients are more available to plants but can become toxic when in excess (Gentili et al., 2018). Alkaline soils increase the availability of macronutrients, but phosphorus and micronutrient availability are reduced (Gentili et al., 2018).
Foliage variables included leaf N, leaf C, leaf C:N, lignin, and holocellulose. Leaf N, C and C:N ratio also tell us about the health and composition of the foliage. Carbon and N are critical for plant functions such as energy flow and nutrient cycling (Zhang et al., 2019). A high C:N is an indicator of N use efficiency, while foliage with a low C:N is typically decomposed faster by microbes, cycling the N back into the ecosystem (Zhang et al., 2019). According to Rowell et al., lignins are amorphous, with a structure typically consisting of aromatic polymers of phenylpropane units. They serve as encrusting agents in the cellulose/hemicellulose matrix and act as a type of adhesive for the plant cell wall. Holocellulose is the combination of cellulose (a glucan polymer of D-glucopyranose units linked by β-glucosidic bonds) and hemicellulose (polysaccharide polymers containing multiple sugar units). Holocellulose makes up the carbohydrate portion of most plants and accounts for ~65-70% of the plant's dry weight. Its chemical structure is made up of sugars with many hydroxyl groups, which aid the plant in absorbing moisture via hydrogen bonding.
Figure 10 contains a PCA of Compounds 741, 755 and 317. The proportion of variation explained in the data by these compounds is 52.36% and 30.37% for PC1 and PC2, respectively, with a cumulative explained proportion of 52.4% and 82.7%, respectively (Figure 10, Table 1).
Pearson correlation coefficients showed PC1 best explains Compound 755 and Compound 317: these values increase with decreasing PC1 (Table 1). PC2 best explains Compound 741: this value increases with increasing PC2 (Table 1). Compound 317 is inversely related to Compound
755 meaning when one concentration goes up, the other goes down. Compound 741 and 317 have a weak but positive correlation. Compound 741 appears to be at a right angle to compound
755 showing no correlation, including no apparent decrease in rhamnocrocin concentration when more methylated rhamnocrocin is found. There is no clear clustering of data points pertaining to in and out of range as shown in red and blue, respectively.
Figure 10. Variation of Compounds 741, 755 and 317 shown through PCA. There is no clear clustering for data points in and out of range. Compound 317 (Comp 317) and Compound 741
(Comp 741) have a weak but positive correlation. Compound 317 is negatively correlated to
Compound 755 (Comp 755). Compound 755 and 741 appear to be at a right angle showing no correlation to each other. PC1 and PC2 combined explain a total of 82.73% of the variation in the data.
Table 1. Eigenvalues and percentages of explained variability (A). Pearson correlation coefficients between compounds for the first two principal components (PC) (B.). PC1 best explains Compound 755 and Compound 317: these values increase with decreasing PC1. PC2 best explains Compound 741: these values increase with increasing PC2.
A.
PC   Eigenvalue   Proportion   Cumulative
1    1.5707       0.524        0.524
2    0.9110       0.304        0.827
3    0.5184       0.173        1.000

B.
Parameter/PC     PC1      PC2
Compound 741     0.406   -0.886
Compound 755    -0.619   -0.446
Compound 317     0.672    0.124
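The relationships described for Figure 10 can be cross-checked numerically. The following Python sketch is an illustration only and not part of the original analysis. It assumes that the PCA was run on the correlation matrix and that the Table 1B entries are eigenvector coefficients (each column has roughly unit length). Under those assumptions, weighting the products of coefficients by the Table 1A eigenvalues gives a two-component approximation of the correlation between each pair of compounds. The names `EIGENVALUES`, `COEFFICIENTS` and `rank2_correlation` are hypothetical.

```python
# Values copied from Table 1 (A: eigenvalues, B: PC1/PC2 coefficients).
EIGENVALUES = (1.5707, 0.9110)
COEFFICIENTS = {
    "Compound 741": (0.406, -0.886),
    "Compound 755": (-0.619, -0.446),
    "Compound 317": (0.672, 0.124),
}


def rank2_correlation(var_a, var_b):
    """Approximate corr(var_a, var_b) using only the first two PCs."""
    a, b = COEFFICIENTS[var_a], COEFFICIENTS[var_b]
    return sum(lam * x * y for lam, x, y in zip(EIGENVALUES, a, b))


pairs = [
    ("Compound 741", "Compound 755"),
    ("Compound 755", "Compound 317"),
    ("Compound 741", "Compound 317"),
]
approx = {pair: rank2_correlation(*pair) for pair in pairs}
for (a, b), r in approx.items():
    print(f"{a} vs {b}: approx. r = {r:+.2f}")
```

Because PC3 (17.3% of the variation) is ignored, these values are approximations. They come out near zero for 741 vs 755, clearly negative for 755 vs 317, and weakly positive for 741 vs 317, in line with the description of Figure 10.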
Figure 11 contains a PCA of Compounds 741, 755, 317 and climatic variables of average
monthly temperature (Ave. T_C (Wt.)), maximum average monthly temperature (Max. T_C
(Wt.)), minimum average monthly temperature (Min. T_C (Wt.)) and total precipitation
(Total_mm (Wt.)). PC1 explains 42.45% of variance extracted from the data set while PC2 explains 24.16% (Figure 11, Table 2). The cumulative explained proportion for PC1 and PC2 were 42.4% and 66.6%, respectively (Table 2). Pearson correlation coefficients showed PC1 best explains Compound 741, Ave T_C (Wt), Min T_C (Wt.) and Total mm (Wt.): these values increase with decreasing PC1 (Table 2A). PC2 best explains Compound 755, Compound 317 and
Max. T_C (Wt.): these values increase with increasing PC2 (Table 2). There is clear clustering of leaf samples obtained from inside L. hermes habitat and those obtained outside the habitat. The
left side of the plot, containing average temperature and minimum temperature, contains mainly
samples obtained from inside L. hermes range. The right side of the plot, containing total precipitation, contains mainly samples obtained from outside the range. This means that many of the samples in range were found in a warmer environment with less precipitation. Compound 755 was not associated with areas containing higher average maximum monthly temperatures, while
Compounds 741 and 317 had a weak but positive correlation with maximum temperature.
Figure 11. PCA with climate variables of Minimum average temperature in °C (Min. T_C
(Wt.)), Average monthly temperature in °C (Ave. T_C (Wt.)), Compound 755 (Comp 755), Total precipitation in mm (Total_mm (Wt.)), Compound 741 (Comp 741), Maximum average monthly temperature in °C (Max. T_C (Wt.)), and Compound 317 (Comp 317). There is clear clustering between plant samples taken from in and out of range. Precipitation is higher outside of L. hermes range while minimum and average monthly temperatures are higher inside the range.
Compounds 741 and 317 are positively correlated with maximum temperatures while compound
755 is negatively correlated. The combined PC1 and PC2 axes represent a total of 66.61% variation in the data.
Table 2. Eigenvalues and percentages of explained variability (A). Pearson correlation coefficients between compounds for the first two principal components (PC) (B.). PC1 best explains Compound 741, Ave T_C (Wt), Min T_C (Wt.) and Total mm (Wt.): these values
increase with decreasing PC1. PC2 best explains Compound 755, Compound 317, Max. T_C
(Wt.): these values increase with increasing PC2.
A.
PC   Eigenvalue   Proportion   Cumulative
1    2.9712       0.424        0.424
2    1.6915       0.242        0.666
3    1.0627       0.152        0.818
4    0.7166       0.102        0.920
5    0.5021       0.072        0.992
6    0.0526       0.008        1.000
7    0.0033       0.000        1.000

B.
Parameter/PC       PC1      PC2
Compound 741       0.042   -0.288
Compound 755      -0.002    0.619
Compound 317       0.084   -0.619
Ave. T_C (Wt.)    -0.568   -0.120
Max. T_C (Wt.)     0.147   -0.362
Min. T_C (Wt.)    -0.572   -0.029
Total_mm (Wt.)     0.566    0.061
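As a worked check of how the Proportion and Cumulative columns follow from the eigenvalues, the sketch below (Python, illustrative only) recomputes them from Table 2A. It assumes the PCA used standardized variables, in which case the eigenvalues sum to the number of variables (seven here: three compounds and four climatic variables). The function name `explained_proportions` is hypothetical.

```python
from itertools import accumulate

# Eigenvalues copied from Table 2A.
eigenvalues = [2.9712, 1.6915, 1.0627, 0.7166, 0.5021, 0.0526, 0.0033]


def explained_proportions(values):
    """Return (proportion, cumulative proportion) lists for a set of eigenvalues."""
    total = sum(values)
    proportions = [v / total for v in values]
    return proportions, list(accumulate(proportions))


proportions, cumulative = explained_proportions(eigenvalues)
print(f"sum of eigenvalues = {sum(eigenvalues):.4f}")
for pc, (p, c) in enumerate(zip(proportions, cumulative), start=1):
    print(f"PC{pc}: proportion = {p:.3f}, cumulative = {c:.3f}")
```

The same calculation applies to the other eigenvalue tables in this section, each with its own number of variables.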
Figure 12 contains a PCA of Compounds 741, 755, 317 and soil variables of estimated N deposition (N_Dep (kgN/ha)), pH, total inorganic nitrogen (TIN_1 (gN/m2)), total N (Total_N
(gN/m2)), total C (Total_C (gN/m2)) and ratio of C:N (Soil C/N). PC1 explains 29.89% of variance extracted from the data set while PC2 explains 20.35% (Figure 12). PC1 and PC2 had a cumulative proportion of 29.9% and 50.2%, respectively (Table 3A). Pearson correlation coefficients showed PC1 best explains N_Dep (kgN/ha), pH, TIN_1 (gN/m2), Total_C (gN/m2) and Soil C:N: these values increase with decreasing PC1. PC2 best explains Compound 741,
Compound 755, Compound 317, and Total_N (gN/m2): these values increase with increasing
PC2 (Table 3). Though there is overlap of the in and out of range data points, a general trend can be deduced for all plant samples. The right side of the PCA lists total inorganic N, ratio of C/N and total C. Data points on this side of the PCA can be assumed to contain higher amounts of
these soil variables. Compound 317 is also on this side of the PCA, meaning that for this compound to be biosynthesized, the plant may need to experience higher total inorganic N, C:N ratio and total C. The left side of the PCA lists the soil variables pH, N deposition and total N, which are higher for the plant samples on that side. Compound 741 and pH are positively correlated, meaning this compound may be more abundant in more alkaline soils. Compound 755 and total N are positively correlated, which means 755 may be biosynthesized in soils with high N content.
Compound 317 is inversely related to Total N meaning it may need soils lower in N content for biosynthesis.
Figure 12. PCA with soil variables of N deposition (N_Dep (kgN/ha)), Total N (Total_N
(gN/m2)), Compound 755 (Comp 755), Total C (Total_C (gN/m2)), Soil C:N ratio (Soil C/N),
Total inorganic N (TIN_1 (gN/m2)), Compound 317 (Comp 317), Compound 741 (Comp 741) and pH (overlapped with Comp 741). There is no clear clustering between samples in and out of
range. Compound 755 is positively correlated with total N while compound 317 is negatively correlated. Compound 741 is positively correlated with soil pH but negatively correlated with total C. Soil C:N is positively correlated with total inorganic N but both are negatively correlated to N deposition. The combined PC1 and PC2 axes represent a total of 50.24% variation in the data.
Table 3. Eigenvalues and percentages of explained variability (A.). Pearson correlation coefficients between compounds and soil variables for the first two principal components (PC)
(B.). PC1 best explains N_Dep (kgN/ha), pH, TIN_1 (gN/m2), Total_C (gN/m2) and
Soil C/N: these values increase with decreasing PC1. PC2 best explains Compound 741,
Compound 755, Compound 317, and Total_N (gN/m2): these values increase with increasing
PC2.
A.
PC   Eigenvalue   Proportion   Cumulative
1    2.6898       0.299        0.299
2    1.8315       0.204        0.502
3    1.3947       0.155        0.657
4    1.1505       0.128        0.785
5    0.7749       0.089        0.871
6    0.5568       0.062        0.933
7    0.3158       0.035        0.968
8    0.2646       0.029        0.998
9    0.0213       0.002        1.000

B.
Parameter/PC        PC1      PC2
Compound 741        0.182   -0.262
Compound 755        0.049    0.475
Compound 317       -0.106   -0.598
N_Dep (kgN/ha)      0.351    0.141
pH                  0.266   -0.260
TIN_1 (gN/m2)      -0.501   -0.129
Total_N (gN/m2)     0.114    0.404
Total_C (gN/m2)    -0.471    0.284
Soil C/N           -0.523   -0.009
Figure 13 contains a PCA of Compounds 741, 755, 317 and foliage variables leaf N
(Leaf_N (%)), leaf C (Leaf_C (%)), leaf C:N (Leaf_CN(%)), lignin (Lignin (%)) and holocellulose (Holo (%)). PC1 explains 27.51% of variance extracted from the data set while
PC2 explains 23.18% (Figure 13). PC1 and PC2 had cumulative proportions of 27.5% and
50.7%, respectively (Table 4A). Pearson correlation coefficients showed PC1 best explains Leaf
N (%), Leaf C (%), Leaf_CN (%): these values increase with decreasing PC1. PC2 best explains
Compound 741, Compound 755, Compound 317, Lignin (%) and Holo (%): these values increase with increasing PC2 (Table 4). This PCA also does not have clear clustering for data points obtained from samples in and out of range but does show some general trends for all plant extracts. The left side of the PCA lists leaf C:N (Leaf_CN (%)) and lignin %. Since Compound
755 and lignin are overlapping, they are positively correlated, and presumably foliage high in lignin favors 755 production. The right side of the PCA indicates plant extracts with foliage containing higher Leaf N, Leaf C and holocellulose. Compounds 317 and 741 are inversely related to lignin and may be biosynthesized when lignin content is low. Compound 755 is inversely related to holocellulose while 741 and 317 are positively correlated to this variable.
Figure 13. PCA with foliage variables of percentage of leaf C:N ratio (Leaf_CN (%)),
Compound 755 (Comp 755), Lignin percentage (overlapped with Comp 755), Leaf C percentage
(Leaf_C (%)), Leaf N percentage (Leaf_N (%)), Holocellulose percentage (Holo (%)),
Compound 741 (Comp 741) and Compound 317 (Comp 317). Lignin and Compound 755 are positively correlated. They are both opposite from % holocellulose meaning they are negatively correlated to this variable. Compound 741 and 317 appear to be negatively correlated with lignin
% and may be positively correlated to holocellulose %. Leaf N % and leaf C % are positively correlated while both are negatively correlated to Leaf C:N %. The combined PC1 and PC2 axes represent a total of 50.69% variation in the data.
Table 4. Eigenvalues and proportions of explained variability (A.). Pearson correlation coefficients between compounds and foliage variables for the first two principal components
(PC) (B.). PC1 best explains Leaf N (%), Leaf C (%), Leaf_CN (%): these values increase with decreasing PC1. PC2 best explains Compound 741, Compound 755, Compound 317, Lignin (%) and Holo (%): these values increase with increasing PC2.
A.
PC   Eigenvalue   Proportion   Cumulative
1    2.2010       0.275        0.275
2    1.8546       0.232        0.507
3    1.2801       0.160        0.667
4    0.9547       0.119        0.786
5    0.8206       0.103        0.889
6    0.5525       0.069        0.958
7    0.3204       0.040        0.998
8    0.0160       0.002        1.000

B.
Parameter/PC     PC1      PC2
Compound 741     0.044   -0.159
Compound 755    -0.282    0.367
Compound 317     0.088   -0.607
Leaf_N (%)       0.598    0.318
Leaf_C (%)       0.277    0.111
Leaf_CN (%)     -0.504   -0.346
Lignin (%)      -0.342    0.353
Holo (%)         0.327   -0.338
PCAs containing kaempferol were done separately from compounds 741, 755 and 317
because kaempferol showed significantly different concentrations in and out of range. Figure 14
contains a PCA of kaempferol and climatic variables. PC1 explains 62.22% of variance
extracted from the data set while PC2 explains 22.56% (Figure 14). PC1 and PC2 have
cumulative proportions of 62.2% and 84.8%, respectively (Table 5A). Pearson correlation coefficients
showed PC1 best explains Ave. T_C (Wt.), Min. T_C (Wt.), total_mm (Wt.): these values
increase with decreasing PC1. PC2 best explains Kaempferol fragment, Max.T_C (Wt.): these
values increase with increasing PC2. These data points do show clustering for in and out of
range leaf extracts. Similar to the PCA of Compounds 741, 755 and 317 with climatic variables, samples within range are found
in areas with a higher average temperature and higher minimum temperature. Samples out of range are found in areas with more precipitation and contain higher concentrations of kaempferol. Kaempferol as well as precipitation are inversely related to average temperature.
This suggests that kaempferol is more commonly synthesized under conditions involving higher precipitation.
Figure 14. PCA with climate variables of Minimum average temperature in °C (Min. T_C
(Wt.)), Average monthly temperature in °C (Ave. T_C (Wt.)), Maximum average monthly temperature in °C (Max. T_C (Wt.)), Total precipitation in mm (Total_mm (Wt.)) and kaempferol. There is clear clustering between samples taken in and out of range. Kaempferol is found in higher abundance outside of range. It is positively correlated with total precipitation but negatively correlated with average minimum temperature. The combined PC1 and PC2 axes represent a total of 84.78% variation in the data.
Table 5. Eigenvalues and percentages of explained variability (A.). Pearson correlation coefficients between kaempferol fragment and climatic variables for the first two principal components (PC) (B.). PC1 best explains Ave. T_C (Wt.), Min. T_C (Wt.), total_mm (Wt.): these values increase with decreasing PC1. PC2 best explains Kaempferol fragment, Max.T_C
(Wt.): these values increase with increasing PC2.
A.
PC   Eigenvalue   Proportion   Cumulative
1    3.1108       0.622        0.622
2    1.1280       0.226        0.848
3    0.6842       0.137        0.985
4    0.0675       0.014        0.998
5    0.0094       0.002        1.000

B.
Parameter/PC           PC1      PC2
Kaempferol fragment   -0.311    0.427
Ave. T_C (Wt.)         0.556   -0.001
Max. T_C (Wt.)         0.012   -0.875
Min. T_C (Wt.)         0.542    0.223
Total_mm (Wt.)        -0.548   -0.042
Figure 15 contains a PCA of kaempferol and soil variables. PC1 explains
40.55% of variance extracted from the data set while PC2 explains 23.04% (Figure 15). PC1 and
PC2 have cumulative proportions of 40.5% and 63.6%, respectively (Table 6A). Pearson correlation coefficients showed PC1 best explains N_Dep (kgN/ha), pH, TIN_1 (gN/m2), Soil
C/N: these values increase with decreasing PC1. PC2 best explains Kaempferol fragment,
Total_N (gN/m2), Total_C (gN/m2), Soil C/N: these values increase with increasing PC2. There is heavy overlap of data points in and out of range. The trend for all plant extracts is that kaempferol is inversely correlated with soil C/N ratio and positively correlated with nitrogen deposition. Thus, it would seem kaempferol is more commonly biosynthesized in areas with high soil N, which implies higher N availability.
Figure 15. PCA with soil variables of Soil C:N ratio (Soil C/N), Total inorganic N (TIN_1
(gN/m2)), Total C (Total_C (gN/m2)), Total N (Total_N (gN/m2)), Kaempferol, pH, and N deposition (N_Dep (kgN/ha)). Kaempferol may be found in higher abundance in areas with higher total N, pH and N deposition. It is negatively correlated to soil C:N but shows no correlation between total inorganic N and total C. The combined PC1 and PC2 axes represent a total of 63.59% variation in the data.
Table 6. Eigenvalues and percentages of explained variability (A.). Pearson correlation coefficients between kaempferol fragment and soil variables for the first two principal components (PC) (B.). PC1 best explains N_Dep (kgN/ha), pH, TIN_1 (gN/m2), Soil C/N: these values increase with decreasing PC1. PC2 best explains Kaempferol fragment, Total_N (gN/m2),
Total_C (gN/m2), Soil C/N: these values increase with increasing PC2.
A.
PC   Eigenvalue   Proportion   Cumulative
1    2.8382       0.405        0.405
2    1.6130       0.230        0.636
3    1.0676       0.153        0.788
4    0.7986       0.114        0.902
5    0.3472       0.050        0.952
6    0.2588       0.037        0.989
7    0.0766       0.011        1.000

B.
Parameter/PC           PC1      PC2
Kaempferol fragment    0.330    0.408
N_Dep (kgN/ha)         0.360   -0.153
pH                     0.286    0.021
TIN_1 (gN/m2)         -0.490    0.260
Total_N (gN/m2)        0.191    0.673
Total_C (gN/m2)       -0.424    0.479
Soil C/N              -0.472   -0.246
Figure 16 contains a PCA of kaempferol and foliage variables. PC1 explains
41.96% of variance extracted from the data set while PC2 explains 25.65% (Figure 16). PC1 and
PC2 have a cumulative proportion of 42.0% and 67.6%, respectively (Table 7A). Pearson
correlation coefficients showed PC1 best explains Leaf N (%), Leaf_C (%), Leaf_CN (%): these
values increase with decreasing PC1. PC2 best explains the kaempferol fragment, Lignin (%)
and Holo (%): these values increase with increasing PC2. Though there is overlap between
samples in and out of range, the general trend shows kaempferol and leaf N % to have a negative correlation to ratio of leaf C/N %. Lignin % is negatively correlated to Holocellulose % but positively correlated to leaf C %.
Figure 16. PCA with foliage variables of Leaf C percentage (Leaf_C (%)), Lignin percentage,
Kaempferol, percentage of leaf C:N ratio (Leaf_CN (%)), Leaf N percentage (Leaf_N (%)) and
Holocellulose percentage (Holo (%)). Kaempferol is negatively correlated to leaf
C:N and may be positively correlated with lignin %. It may also be found in higher abundance with higher leaf N, leaf C and holocellulose %. The combined PC1 and PC2 axes represent a total of 67.61% variation in the data.
Table 7. Eigenvalues and percentages of explained variability (A.). Pearson correlation coefficients between kaempferol fragment and foliage variables for the first two principal components (PC) (B.). PC1 best explains Leaf N (%), Leaf_C (%), Leaf_CN (%): these values increase with decreasing PC1. PC2 best explains the kaempferol fragment, Lignin (%) and Holo
(%): these values increase with increasing PC2.
A.
PC   Eigenvalue   Proportion   Cumulative
1    2.5178       0.420        0.420
2    1.5389       0.256        0.676
3    0.8556       0.143        0.819
4    0.6333       0.106        0.924
5    0.4331       0.072        0.996
6    0.0213       0.004        1.000

B.
Parameter/PC           PC1      PC2
Kaempferol fragment    0.191    0.620
Leaf_N (%)             0.616    0.091
Leaf_C (%)             0.420   -0.290
Leaf_CN (%)           -0.541   -0.255
Lignin (%)            -0.152    0.590
Holo (%)               0.304   -0.331
Lastly, the kaempferol-containing compounds were checked against altitude, as R. crocea plants collected outside of range grew at a higher elevation. Figure 17 shows a PCA with kaempferol and elevation. R. crocea plants outside L. hermes range grew at a significantly higher elevation than those inside (p-value=0.012).
PC1 explains 51.3% of variance extracted from the data set while PC2 explains 48.7% (Figure
17). PC1 and PC2 have a cumulative proportion of 51.3% and 100%, respectively (Table 8A).
Clustering exists between samples in range and samples outside of range. Kaempferol and elevation are both pointed toward the samples outside of L. hermes range which may indicate that these two variables are increased for samples outside the range.
Figure 17. A PCA of kaempferol and elevation. There is clear clustering of data points in range as well as out of range. Both arrows for kaempferol and elevation are pointing toward out of range implying that the R. crocea plants out of range contain higher concentration of kaempferol and grew at a higher elevation. PC1 and PC2 combined explain 100% of the variation in the data.
Table 8. Eigenvalues and percentages of explained variability (A.). Pearson correlation coefficients between kaempferol fragment and elevation for the first two principal components (PC) (B.).
A.
PC   Eigenvalue   Proportion   Cumulative
1    1.0255       0.513        0.513
2    0.9745       0.487        1.000

B.
Parameter/PC           PC1      PC2
Elevation              0.707   -0.707
Kaempferol fragment   -0.707   -0.707
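For a PCA of only two standardized variables, the result is fixed by their correlation r: the eigenvalues are 1 + r and 1 − r, and the coefficients are ±0.707 (1/√2), with equal signs on the 1 + r component and opposite signs on the 1 − r component. The sketch below (Python, illustrative only, assuming a correlation-matrix PCA) inverts that relationship for the Table 8 values. The function name `implied_correlation` is hypothetical.

```python
def implied_correlation(eigenvalue, coefficients):
    """Correlation implied by one component of a two-variable correlation PCA.

    Coefficients with equal signs belong to the eigenvalue 1 + r;
    coefficients with opposite signs belong to the eigenvalue 1 - r.
    """
    first, second = coefficients
    if first * second > 0:
        return eigenvalue - 1.0
    return 1.0 - eigenvalue


# PC1 from Table 8: eigenvalue 1.0255, Elevation 0.707, Kaempferol fragment -0.707.
r = implied_correlation(1.0255, (0.707, -0.707))
# Rebuild both eigenvalues from r as a consistency check against Table 8A.
rebuilt = sorted((1.0 + r, 1.0 - r), reverse=True)
print(f"implied r = {r:+.4f}, eigenvalues = {rebuilt[0]:.4f}, {rebuilt[1]:.4f}")
```

Under these assumptions the implied linear correlation across all samples is small in magnitude, which is why the two components explain nearly equal shares of the variation.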
Points of Interest
A way R. crocea is presumably communicating with L. hermes is through plant volatiles.
Nonconjugated molecules have the ability to cross membranes and evaporate into the atmosphere
(Pichersky, et al., 2006). Free volatiles are most likely to accumulate in membranes and are in
some cases glycosylated then stored in vacuoles (Pichersky, et al., 2006). The structure of
rhamnocrocin contains three glycosides. The methyl groups on the methylated form of
rhamnocrocin cause the molecule to be slightly lipophilic. Lipophilicity and a high vapor pressure
are common characteristics of plant volatiles (Pichersky, et al., 2006). Being glycosylated and
lipophilic, the methylated form of rhamnocrocin slightly fits the description of the common
structure of a plant volatile. However, having such a large molecular weight and being so polar,
it is unlikely to be a plant volatile. Biosynthesis rates of plant volatiles are highest in young
leaves which are not fully developed and need the most protection (Pichersky, et al., 2006).
Interestingly, it is known that L. hermes lays its eggs on branches underneath new leaf growth. It would seem possible that the biosynthesis of some kaempferol containing compound may be
attracting L. hermes to oviposit and thus provide protection to the plant. In addition, plant volatiles containing an aromatic ring follow the pathway leading from shikimate to phenylalanine then to primary and secondary nonvolatile compounds (Pichersky, et al., 2006).
One such primary compound is lignin, which may be why compound 755 is also positively
correlated with lignin.
Plant volatiles are influenced by environmental factors such as light, temperature and
moisture (Pichersky, et al., 2006). Severe water deficiency has been demonstrated to increase
secondary metabolite production (Yang et al., 2018). The flavonoid concentration in Pisum
sativum was shown to increase by 45% in drought conditions compared to a well-watered control
(Yang et al., 2018). Plant extracts collected from inside L. hermes range originated from areas lower in precipitation while plants outside originated from areas with higher precipitation (Figure 11,
12). Based on this, it would be expected that there would be a higher concentration of
Compounds 741, 755, 317 and kaempferol in range of L. hermes; however, this was not the case.
There was no statistically significant difference in the concentrations of Compounds 741, 755 and 317 in and out of L. hermes
range. When it came to kaempferol, the opposite was observed. A t-test and PCA showed that
kaempferol was more likely to be found outside range in areas with higher precipitation.
Photosynthesis, one of the fundamental processes needed to construct secondary
metabolites, begins with light. According to Yang, et al., light is essential in promoting plant growth as well as inducing or regulating plant metabolism. As a response to their environment, the presence of light allows plants to biosynthesize secondary metabolites such as phenolic compounds, triterpenoids and flavonoids. For example, after being subjected to long light irradiation of
16 hours, leaves of Ipomoea batatas responded with a dramatic increase in such flavonoids as flavanols. In Centella asiatica, flavonoids were positively correlated to growth-lighting conditions, and UV-B irradiation was found to produce the highest increase in flavonoids such as kaempferol in Populus trichocarpa. These examples all suggest that flavonoids may be used by plants to protect against light exposure. R. crocea plants outside of L. hermes range were found to grow at significantly higher elevations than those inside the range. UV radiation, especially
UV-B, is increased at higher altitudes (Rana et al., 2020). In addition, increased altitude will result in a decrease in atmospheric temperature, which has been shown to increase the production of phenolic compounds (Rana et al., 2020). In order to protect themselves from the influx of UV-B,
the R. crocea outside the range are presumably synthesizing more kaempferol-containing molecules, which may be acting as a UV filter to protect the plant.
Kaempferol was also positively correlated with total soil N and N deposition. N is important for metabolic processes, which is why plant tissues that have low quantities of N can reduce the quality of plants as food (Fei et al., 2017). N demand and levels exist in the highest concentrations in young growing tissues of plants and decline as the plant ages (Fei et al., 2017).
Flavonoids have been reported to regulate oviposition and feeding (Mierziak et al., 2014).
Compounds 741, 755, 317 and kaempferol are all classified as flavonoids and therefore may influence oviposition. Females typically select plants suitable for oviposition based on visual, olfactory and gustatory information (Fei et al., 2017). The host plant is generally chosen for containing a high concentration of primary metabolites and a low concentration of secondary metabolites (Fei et al., 2017). This is consistent with kaempferol existing in higher concentrations outside the range of L. hermes. A leaf's nutritional quality is typically higher when it is young and developing compared to when it is mature (Fei et al., 2017). This is most likely why L. hermes chooses to lay its eggs under new leaf growth, as insects tend to develop healthier and in higher densities on new plant tissues compared to older tissues (Fei et al., 2017).
Secondary metabolites are often responsible for plant defense against herbivores which may deter oviposition and interfere with an insect’s physiology once ingested (Fei et al., 2017).
Previous studies have shown that butterflies require physical contact with the host plant to recognize it as a potential oviposition site (Fei et al., 2017).
L. hermes seem to prefer a range that is lower in precipitation, higher in average and minimum monthly temperature, and also lower in kaempferol concentration. Kaempferol is necessary for the most abundant flavonoids in R. crocea as it forms many of their backbones. What may be confining L. hermes to their small habitat may in fact be kaempferol concentration. Inside this range the concentration may be suitable, whereas outside the range there may be
too much. This may be determined by volatile phytochemicals or visual, olfactory and/or
gustatory information. The climate also seems to be influencing their sedentary habits, as a
majority of the leaf samples from inside the habitat were found in areas with higher average or
minimum temperature.
The investigation of these ecologically important questions was made possible with LC-
MS/MS. Next steps for this project include running a SIM for 285 m/z to capture cases of
degradation rather than MS processes where there had been sugars in two spots on the
kaempferol molecule. Using a t-test, the kaempferol ion at 287 m/z should be compared to the 22
different kaempferol-containing molecules determined in the plant samples (Figure 9). The glu 1
and glu 2 sugars identified in the previous work will also be identified.
Many insects use phytochemicals for host plant recognition (Fei et al., 2017). Another
interesting experiment would be to test if the kaempferol in the R. crocea leaves can be seen by
UV radiation in differing amounts in and out of range. To do this, leaves would be collected
from one R. crocea plant in range and one out of range. Using reflectance spectroscopy, the leaves would be examined under UV light at 265 nm and 365 nm, the wavelengths the kaempferol molecule
absorbs. Lepidopterans have the capability to sequester flavonoids of which a vast majority end
up in their wings which is presumably used for species identification. Should R. crocea leaves
absorb this frequency, it may be determined to be a visual cue which aids L. hermes in
determining suitable host plants.
ACKNOWLEDGEMENTS
This research and writing would not have been possible without the tremendous support
and assistance I received throughout the past two years.
First, I would like to thank my PI, Dr. Jackie Trischman, whose expertise guided my research questions. Thank you for your unwavering support of my ambitions and mentorship for the past 4.5 years, beginning as an undergraduate and continuing as a graduate student. You have been an enormous inspiration to me and your lab has shown me that there should be no boundaries in
STEM. It has been an honor working in your lab and something I will look back on fondly.
I would like to acknowledge Robyn Araiza for your outstanding mentorship throughout this entire project. Your help and feedback were invaluable and helped me to grow my instrumentation and critical thinking skills. This project would not be what it is today without your help.
I would also like to thank Dr. Vourlitis for beginning such a fascinating project and
Liberty Isbell for providing the research foundation this project was built on. Dr. Vourlitis, thank you for serving on my committee and for such valuable feedback throughout my research and writing process.
In addition, I would like to thank Dr. Schmidt and Dr. Iafe. Dr. Schmidt, thank you for serving on my committee and for your attention to detail throughout my thesis as it has improved my writing and presentation skills. Dr. Iafe, thank you for being such an organized graduate coordinator resulting in my timely conclusion of this program.
Lastly, I would like to acknowledge my parents and boyfriend for your unwavering support and belief in me. To my cohort, thank you for your friendship and being there with me every step of the way.
REFERENCES
1. Agilent Technologies Agilent 6400 Series Triple Quadrupole LC/MS System Concepts
Guide: The Big Picture; Agilent Technologies: Santa Clara, 2012.
2. Alarcón, J., Cespedes, C.L.; Phytochem Rev, 2015, 14, 389–401.
3. Bruce, T. J. A.; J. Exp. Bot., 2014, 66, 2, 455-465.
4. Calderon-Montano, J. M.; Burgos–Moron, E.; Perez–Guerrero, M.; Lopez–Lazaro, M.;
Mini-Rev. Med. Chem., 2011, 11, 4, 298-344.
5. Deutschman, D. H.; Berres, M. E.; Marschalek, D. A.; Strahm, S. L. Two-year evaluation
of Hermes Copper (Lycaena Hermes); 2011; p. 8.
6. Dewick, P. M.; Medicinal Natural Products: A Biosynthetic Approach, 3rd Edition.; John
Wiley & Sons, Ltd: Chichester, 2009.
7. Emmel T. C., Emmel J. F.; The butterflies of Southern California, Nat Hist Museum Los
Angeles County, 1973.
8. Fei, M.; Harvey, J. A.; Yin, Y.; Gols, R.; J. Chem. Ecol., 2017, 43, 6, 617–629.
9. Garcia-Barros, E.; Fartmann, T. Ecology of Butterflies in Europe; Cambridge University
Press: Cambridge, 2009.
10. Gentili, R.; Ambrosini, R.; Montagnani, C.; Caronni, S.; Citterio, S.; Front. Plant Sci.,
2018, 9, 1335.
11. Hogan, D. Petition to List the Hermes Copper Butterfly (Hermelycaena [Lycaena]
hermes) as Endangered Under the Endangered Species Act; 2006.
12. Isbell, L.; Effects of N and Climate on Hermes Copper Butterfly (Lycaena hermes)
Habitat in Southern California, 2020.
13. Karkare, P. Principal Component Analysis – A Brief Introduction. Medium
(https://medium.com/x8-the-ai-community/principal-component-analysis-a-brief-
introduction-dc8cf3e03c71) (accessed Mar 13, 2021).
14. Li, S.; Wang, Z.; Miao, Y.; Li, S. J. Integr. Agric., 2014, 13, 10, 2061-2080.
15. Mai, L. P.; Gueritte, F.; Dumontet, V.; Tri, M. V.; Hill, B.; Thoison, O.; Guenard, D.;
Sevenet, T.; J. Nat. Prod., 2001, 64, 1162-1168.
16. Marschalek, D. A.; Deutschman, D. H.; J. Insect Conserv., 2008, 12, 97-105.
17. Marschalek, D. A.; Deutschman, D. H.; Strahm, S.; Berres, M. E.; Ecol. Entomol., 2016,
41, 327–337.
18. Mierziak, J.; Kostyn, K.; Kulma, A.; Molecules, 2014, 19, 10, 16240-16265.
19. Montalvo, A.; Riordan, E.; Beyers, J; Gellie, N., Plant Profile for Rhamnus crocea and
Rhamnus ilicifolia, 2020.
20. Pichersky, E.; Noel, J. P.; Dudareva, N.; Science, 2006, 311, 5762, 808-811.
21. Powell, V. Principal Component Analysis. Setosa.io (https://setosa.io/ev/principal-
component-analysis/) (accessed Mar 13, 2021).
22. Ramakrishna, A.; Ravishankar, G. A.; Plant Signal., 2011, 6, 11, 1720-1231.
23. Rana, P.S.; Saklani, P.; Chandel, C.; Res. J. Med. Plant, 2020, 14, 43-52.
24. Reisenman, C. E.; Riffell, J. A.; Bernays, E. A.; Hildebrand, J. G.; J. Biol. Sci., 2010,
277, 1692, 2371-2379.
25. Rowell, R. M.; Han, J. S.; Rowell, J. S.; Natural Polymers and Agrofibers Composites,
2000, 115-134.
26. Samanta, A.; Das, S. K.; Das, G.; Int. J. Pharm. Sci. Tech., 2011, 6, 1, 12-35.
27. Staff of Carlsbad Fish and Wildlife Service, Federal Register Volume 71, Number 152,
2006.
28. Stevens, C. J.; David, T. I.; Storkey, J. Funct. Ecol., 2018, 32, 7, 1757-1769.
29. Thorne, F.; J. Res. Lepid., 1963, 2, 143-150.
30. Vourlitis, G.; Hermes Copper Butterfly Chemical Ecology Study (US Fish and Wildlife
Service Grant F17AC00960); 2018.
31. Wiesen, B.; Krug, E.; Fiedler, K.; J. Chem. Ecol., 1994, 20, 2523–2538.
32. Yang, L.; Wen, K.; Ruan, X.; Zhao, Y.; Wei, F.; Wang, Q.; Molecules, 2018, 23, 4, 76.
33. Zhang, J.; He, N.; Liu, C.; Xu, L.; Chen, Z.; Li, Y.; Wang, R.; Yu, G.; Sun, W.; Xiao, C.;
Chen, H. Y. H.; Reich, P. B.; Glob. Change Biol., 2019, 26, 4, 2534-2543.
34. Zhijing, X.; Shaoshan, A.; MDPI, 2018, 10, 4757.
APPENDICES
Figure 18. Maps of Southern California generated through Google Earth with pinned study site areas where samples of R. crocea inside (A) and outside (B) L. hermes habitat were collected (Vourlitis, 2018).
Table 9. Site names, latitude, longitude, elevation, and estimated N deposition for R. crocea study sites inside and outside L. hermes range (Isbell, 2020).
Inside L. hermes habitat

Site                  Latitude   Longitude   Elevation (m)   Est. N Deposition (kg N ha⁻¹ yr⁻¹)
Elfin Forest (EF)     33.074     -117.157    138             11.73
Black Mountain (BM)   32.977     -117.123    281             12.38
Meadowbrook (MB)      32.964     -117.069    187             12.13
Mission Trails (MT)   32.833     -117.038    144             10.77
McGinty Peak (MP)     32.758     -116.851    352             9.87
mg20763               33.063     -117.083    122             11.91
cbo86186              33.025     -117.171    49              12.13
SD195367              32.938     -117.213    49              12.83
cbo29765              32.938     -117.134    72              12.23
oe3104                32.951     -117.017    210             10.43
SD211218              32.869     -116.968    152             10.94
SD208086              32.711     -117.079    29              9.79
cbo37640              32.925     -117.162    109             11.5
SD182404              33.044     -117.153    49              11.73
in:9513993            32.832     -117.104    95              12.15

Outside L. hermes habitat

Site                  Latitude   Longitude   Elevation (m)   Est. N Deposition (kg N ha⁻¹ yr⁻¹)
UCR102732             33.781     -117.056    610             10.84
in:8546144            33.807     -117.354    595             10.45
UCR241774             33.725     -117.392    378             8.37
UCR100298             33.8       -117.061    610             11.2
UCR260565             33.641     -117.226    537             9.47
UCR249823             33.598     -117.142    402             11.19
cbo43336              33.308     -117.232    129             13.36
cbo43271              33.315     -117.234    80              13.36
cbo53916              33.366     -117.153    224             12.34
SD201912              33.168     -117.094    232             13.33
cbo73769              33.171     -117.275    96              12.93
UCR270175             33.466     -117.042    488             10.08
UCR1131               33.386     -116.79     846             6.32
SD163649              33.259     -117.141    N/A             12.48
cbo76139              33.093     -117.298    28              10.44