
University of New Hampshire Scholars' Repository

Master's Theses and Capstones Student Scholarship

Winter 2017

Developing a molecular pipeline to identify Chenopodium species in New England

Erin Neff University of New Hampshire, Durham


Recommended Citation Neff, Erin, "Developing a molecular pipeline to identify Chenopodium species in New England" (2017). Master's Theses and Capstones. 1156. https://scholars.unh.edu/thesis/1156

This Thesis is brought to you for free and open access by the Student Scholarship at University of New Hampshire Scholars' Repository. It has been accepted for inclusion in Master's Theses and Capstones by an authorized administrator of University of New Hampshire Scholars' Repository.

DEVELOPING A MOLECULAR PIPELINE TO IDENTIFY CHENOPODIUM SPECIES IN

NEW ENGLAND

BY

ERIN NEFF B.S. Biochemistry, Grove City College, 2015

THESIS

Submitted to the University of New Hampshire in Partial Fulfillment of the Requirements for the Degree of

Master of Science in Integrative and Organismal Biology

December, 2017

This thesis has been examined and approved in partial fulfillment of the requirements for the degree of Master of Science in Integrative and Organismal Biology by:

Thesis Director, Thomas M. Davis, Ph.D., Biological Sciences

Janet R. Sullivan, Ph.D., Biological Sciences

Cheryl A. Smith, Ph.D., Agriculture, Nutrition, and Food Systems

Richard G. Smith, Ph.D., Natural Resources & the Environment

On November 27, 2017

Original approval signatures are on file with the University of New Hampshire Graduate School.


TABLE OF CONTENTS

DEDICATION

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

CHAPTER

INTRODUCTION

THE GENUS CHENOPODIUM

CHENOPODIUM SPECIES IN NORTHERN NEW ENGLAND

COMPLICATED TAXONOMY: CHENOPODIUM ALBUM AS A SHORT CASE STUDY

CURRENT STATE OF GENETIC RESOURCES AND GERMPLASM AVAILABILITY FOR THE CHENOPODIUM GENUS

GERMPLASM RESOURCES

AIMS OF THIS PROJECT

I. METHODS

a. DETERMINATION OF COLLECTION SITES

b. COLLECTION PROTOCOL

c. STANDARD COMPARATORS USED IN THIS STUDY

d. ANALYSIS OF COLLECTED PLANT MATERIAL: DETERMINATION OF PLOIDY VIA FLOW CYTOMETRY

e. SPECIES IDENTIFICATION AND ASSESSMENT OF WITHIN-SPECIES DIVERSITY VIA RAPD PCR

f. CONFIRMATION OF IDENTITY FOR USDA STANDARDS VIA DNA SEQUENCING

i. CHLOROPLAST DNA SEQUENCING

ii. SALT OVERLY-SENSITIVE 1 (SOS1) DNA SEQUENCING

g. DEVELOPMENT OF DIPLOID MODEL CHENOPOD FOR FUTURE GENETIC STUDY VIA CROSSES

II. RESULTS

a. PLANT COLLECTIONS

b. FLOW CYTOMETRY

c. RAPD PCR

d. CONFIRMATION OF USDA REFERENCE STANDARDS VIA DNA SEQUENCING

i. CHLOROPLAST DNA SEQUENCING AND ALIGNMENT

ii. SOS1 DNA SEQUENCING AND ALIGNMENT

e. DIPLOID SPECIES CROSSES

III. DISCUSSION

a. TAXONOMY

b. DISCUSSION OF USEFULNESS OF MOLECULAR IDENTIFICATION TESTS

i. FLOW CYTOMETRY

ii. RAPD PCR

iii. DNA SEQUENCING

1. CPDNA SEQUENCING

2. SOS1 DNA SEQUENCING

c. CHENOPODIUM BERLANDIERI VAR. MACROCALYCIUM

d. CHENOPODIUM BERLANDIERI VAR. BUSHIANUM

e. CHENOPODIUM BERLANDIERI VAR. ZSCHACKEI

f. CHENOPODIUM STRICTUM

g. CHENOPODIUM ALBUM

h. CHENOPODIUM FICIFOLIUM

i. CHENOPODIUM STANDLEYANUM

j. CHENOPODIUM FOGGII

k. DIPLOID HYBRID ASSESSMENT

IV. CONCLUSIONS

V. LIST OF REFERENCES

VI. APPENDICES

a. APPENDIX 1: PREPARATION OF DE LAAT’S BUFFER FOR FLOW CYTOMETRY

b. APPENDIX 2: PREPARATION OF PROPIDIUM IODIDE STAIN FOR FLOW CYTOMETRY

c. APPENDIX 3: QUANTIFICATION OF DNA USING A QUBIT FLUOROMETER

d. APPENDIX 4: PREPARATION OF 2% ELECTROPHORESIS GEL FOR RAPD PCR

e. APPENDIX 5: PREPARATION OF 1X TBE BUFFER FOR ELECTROPHORESIS GEL

f. APPENDIX 6: CHLOROPLAST REFERENCE SEQUENCE DATA USED IN THIS STUDY

g. APPENDIX 7: CHLOROPLAST PHYLOGENETIC TREE GENERATED FOR THIS STUDY

h. APPENDIX 8: COLLECTION SITES, 2016 – 2017

i. APPENDIX 9: SOS1 INTRON 16 REFERENCE SEQUENCE DATA

j. APPENDIX 10: SOS1 INTRON 16 PHYLOGENETIC TREE GENERATED FOR THIS STUDY

k. APPENDIX 11: COLLECTION DATA, SUMMER 2016 AND 2017

l. APPENDIX 12: FLOW CYTOMETRY DATA, SUMMER 2016 AND 2017

m. APPENDIX 13: CHLOROPLAST DNA SEQUENCE IDENTITY MATRIX

n. APPENDIX 14: SOS1 INTRON 16 SEQUENCE IDENTITY MATRIX

DEDICATION

This work is dedicated to my best friend and husband, Joseph Neff, whose patient support and kind encouragement have made the completion of this research and the writing of this thesis possible.

LIST OF TABLES

TABLE 1: CHENOPOD SPECIES IN THE NORTHERN NEW ENGLAND REGION

TABLE 2: PUBLISHED 1C VALUES FOR NNE CHENOPODS AND QUINOA

TABLE 3: USDA COMPARATORS USED IN THIS STUDY

TABLE 4: RAPD PRIMERS USED IN THIS STUDY

TABLE 5: COLLECTION CODES FOR SPECIMENS SUBMITTED TO UNH HODGDON HERBARIUM AND USDA NORTH CENTRAL REGIONAL PLANT INTRODUCTION STATION

TABLE 6: SPECIES SEQUENCED USING TRNL-F CHLOROPLAST PRIMERS

TABLE 7: SPECIES SEQUENCED USING SOS1 INTRON 16 PRIMERS

TABLE 8: DIPLOID HYBRIDS OBTAINED FROM CROSSES

LIST OF FIGURES

FIGURE 1: CHENOPODIUM PHYLOGENETIC TREE FROM WALSH ET AL. (2015)

FIGURE 2: CHROMOSOME COUNTS OF CHENOPODIUM SPECIES FROM MANDAK ET AL. (2012)

FIGURE 3A: EMASCULATION OF C. FICIFOLIUM

FIGURE 3B: COVERED C. FICIFOLIUM INFLORESCENCES FOLLOWING POLLINATION

FIGURE 4A: CO-CHOP FLOW CYTOMETRY OUTPUT FROM C. FICIFOLIUM

FIGURE 4B: SINGLE-CHOP FLOW CYTOMETRY OUTPUT FROM REFERENCE STANDARD

FIGURE 4C: SINGLE-CHOP FLOW CYTOMETRY OUTPUT FROM C. FICIFOLIUM

FIGURE 5: CALCULATED 1C FLOW CYTOMETRY VALUES OF CHENOPODS AND USDA STANDARDS

FIGURE 6: RAPD PCR OUTPUT FOR IDENTIFYING C. BERLANDIERI SPECIMENS TO THE SUBSPECIES-LEVEL

FIGURE 7: RAPD PCR OUTPUT FOR ASSESSING WITHIN-SPECIES DIVERSITY OF C. BERLANDIERI VAR. MACROCALYCIUM SPECIMENS

FIGURE 8: RAPD PCR OUTPUT COMPARING C. STANDLEYANUM AND C. FOGGII

FIGURE 9: CHOOSING DIPLOID PARENTS VIA RAPD PCR

FIGURES 10 – 12: ASSESSMENT OF DIPLOID HYBRIDS VIA PCR WITH FTL PRIMERS

ABSTRACT

DEVELOPING A MOLECULAR PIPELINE TO IDENTIFY CHENOPODIUM SPECIES IN NEW ENGLAND

By

Erin Neff

University of New Hampshire, December, 2017

Weedy species from the genus Chenopodium may provide useful genetic resources for improving quinoa (C. quinoa), a highly nutritious and economically important crop. Before this can be accomplished, the weedy species in the Northern New England (NNE) region must be enumerated, characterized, and represented in germplasm collections and herbaria. In this study, Chenopodium germplasm was collected from the NNE region and identified via a pipeline consisting of morphological identification, flow cytometric C-value determination, gel-visualization of RAPD and gene-specific PCR products, and comparative DNA sequencing. The collected specimens were compared to comparator accessions obtained from the USDA National Plant Germplasm System. In total, five different species, including the rare C. foggii, were collected and examined, and three of the nine studied USDA comparator accessions were found to be incorrectly identified. Representative specimens were submitted to the Hodgdon Herbarium at the University of New Hampshire and the USDA National Plant Germplasm System. As a step toward developing germplasm needed for constructing a linkage map as a genomic resource for Chenopodium, crosses were performed between genetically distinct representatives of a wild diploid species, C. ficifolium, and four confirmed F1 hybrids were obtained.

INTRODUCTION

THE GENUS CHENOPODIUM

The number of plant species belonging to the genus Chenopodium (Amaranthaceae) has been reported to be between 100 and 150 (Cole 1961; Jellen et al. 2011; Kuhn, 1993, as cited in Fuentes-Bazan et al. 2012a, b). Chenopodium species are distributed across several regions of the world, including North America (Jellen et al. 2011). The grain crop quinoa (C. quinoa) is the most economically important species in the genus (FAO 2011), although other “chenopod” species are consumed by humans as grains and green vegetables (Rana et al. 2012). Most notably, C. album (lambsquarters) is consumed as a green vegetable in several countries, including Turkey (Mandak et al. 2012; Rana et al. 2012; Pradhan and Tamang 2015; Samancioglu et al. 2016), and is used as a traditional medicine (Aziz et al. 2016). Chenopodium album has also been described as “among the worst [weeds] on Earth” (Mandak et al. 2012).

As a prolific agricultural weed, lambsquarters is widely distributed, but the origin of C. album is thought to ultimately trace to Eurasia (Wilson 1980). Although lambsquarters is not the main focus of this work, its familiarity and representative characteristics provide a convenient frame of reference for discussion of other, lesser-known Chenopodium species that have similar life histories. For instance, as a weed, lambsquarters is successful for many reasons: it is adapted to grow well in a diverse array of environments and to out-compete neighboring herbaceous plants, as are other weedy species in this genus. Additionally, Basu et al. (2004) described lambsquarters as having a high seed output; being highly competitive with crop plants; being self-compatible; demonstrating very rapid seedling growth; and being highly environmentally plastic, meaning that C. album can undergo phenotypic change in response to environmental stress and variability. Furthermore, they noted that lambsquarters is allelopathic, meaning that it produces chemicals that inhibit or kill neighboring plants. In the event that a lambsquarters plant experiences a difficult spring season, its indeterminacy (i.e., continuous flowering and vegetative growth, if allowed) ensures that it will complete its life cycle and set seeds by the end of the growing season (Basu et al. 2004).

CHENOPODIUM SPECIES IN NORTHERN NEW ENGLAND

I have compiled a list of the 11 weedy chenopod species that are reported to be either native or naturalized to the Northern New England (NNE) region (Table 1). This list was created by generating a baseline list of 23 Chenopodium species using NNE herbarium records

(NEHerbaria.org), which I then cross-referenced against the 17 chenopod species described in

Arthur Haines’ Flora Nova Angliae (2011, pp. 321 – 325). This new, reduced list was then compared to two more recent phylogenetic studies that reclassified six of these species into other genera (Fuentes-Bazan et al. 2012a and b; Walsh et al. 2015).

| Taxon | Included in ITS phylogeny (Fuentes-Bazan et al. 2012b) | Included in trnL-F & matK/trnK phylogeny (Fuentes-Bazan et al. 2012b) | Included in SOS1 phylogeny (Walsh et al. 2015) | Sub-genome model (Walsh et al. 2015) | Ploidy level (Walsh et al. 2015) |
|---|---|---|---|---|---|
| C. ficifolium | Yes | Yes | Yes | Sister to Clade B | 2x |
| C. berlandieri var. macrocalycium | Yes, no subspecies listed | Yes, no subspecies listed | - | - | - |
| C. berlandieri var. bushianum | - | - | - | - | - |
| C. berlandieri var. zschackei | - | - | Yes | AB | 4x |
| C. album | Yes | Yes | Yes | BCD | 6x |
| C. strictum var. glaucophyllum | - | - | - | BC or CD (putative) | 4x (putative) |
| C. standleyanum | Yes | Yes | Yes | AA | 2x |
| C. foliosum | - | - | - | - | - |
| C. foggii | - | - | - | - | - |
| C. pratericola | Yes | Yes | - | - | - |
| C. leptophyllum | - | - | Yes | AA | 2x |
| C. simplex | No, but in genus Chenopodiastrum (Haines, personal communication) | | | | |
| C. bonus-henricus | Yes, as Blitum | Yes, as Blitum | - | - | - |
| C. glaucum ssp. glaucum | Yes, as Oxybasis | Yes, as Oxybasis | Yes, as Oxybasis | - | - |
| C. rubrum var. humile | Yes, as Oxybasis | Yes, as Oxybasis | - | - | - |
| C. rubrum var. rubrum | - | - | - | - | - |
| C. capitatum var. capitatum | Yes, as Blitum | Yes, as Blitum | - | - | - |
| C. polyspermum var. polyspermum | Yes, as Lipandra | Yes, as Lipandra | - | - | - |
| C. polyspermum var. acutifolium | - | - | - | - | - |
| C. urbicum | Yes, as Oxybasis | Yes, as Oxybasis | - | - | - |
| C. murale | Yes, as Chenopodiastrum | Yes, as Chenopodiastrum | - | - | - |

Table 1: Current and former Chenopodium species that have been recorded in Northern New England. The first eleven taxa (C. ficifolium through C. leptophyllum, shaded in the original table) are currently classified in the genus Chenopodium, while the rest were formerly known as chenopods, but have been recently reclassified into other genera. The ploidy level and subgenome composition of each species are listed if known (Walsh et al. 2015), as well as the presence of sequence data using ITS and trnL-F and matK/trnK primers (Fuentes-Bazan et al. 2012b). If a species was reclassified based on a phylogenetic study, the new genus name is also listed. If a species was not studied, a “-” is listed.

Chenopodium has a convoluted taxonomic history, evidenced in part by the previously mentioned reclassification events and the changing number of chenopod species recognized in the NNE region. Moreover, the goosefoot family, Chenopodiaceae, was recently absorbed into the amaranth family, Amaranthaceae (http://www.mobot.org/MOBOT/research/APweb/, accessed October 10, 2017). Two recent studies have been central to either driving or supporting taxonomic changes at the subfamily and genus level.

The first is the Fuentes-Bazan et al. (2012b) study, “A novel phylogeny-based generic classification for Chenopodium sensu lato, and a tribal rearrangement of Chenopodioideae (Chenopodiaceae)”. This study examined the nuclear internal transcribed spacer region (ITS) and the plastid trnL-F and matK/trnK regions in various species within Chenopodium. This work uncovered “six highly supported lineages of Chenopodium s.lat. within subfamily Chenopodioideae” (Fuentes-Bazan et al. 2012b), and also resulted in a tribal rearrangement so that only Atripliceae, Anserineae, Dysphanieae, and Axyrideae are recognized tribes within the subfamily Chenopodioideae. The genus Chenopodium now resides within the tribe Atripliceae, along with Lipandra, Oxybasis, and Chenopodiastrum, which can be seen in the resulting phylogenetic trees based on the trnL-F and matK/trnK data, the ITS data set, the trnL-F data including coded indels, and the nrITS data including coded indels. Some species have been assigned to new taxa based on these phylogenetic findings and have thus been given new scientific names, but not all literature reflects these changes, which could explain the variability in the total number of reported chenopod species, as well as the reduction in the number of chenopods recognized in NNE.

The second study is the Walsh et al. (2015) paper, “Chenopodium polyploidy inferences from Salt Overly Sensitive 1 (SOS1) data”, in which two introns of the single-copy nuclear locus SOS1 were examined and used to create a phylogeny (Figure 1). This phylogeny features four clades that illuminate the relationships and putative subgenomic compositions of two major reticulate lineages, hexaploid species from the Eastern Hemisphere and American tetraploid species, which the authors infer to have arisen independently (Walsh et al. 2015). The American tetraploids included in the study have sequence representation in both the A and B clades, on which basis their subgenome composition is designated AABB. In contrast, the Eurasian hexaploids, including C. album, have sequence representation in the B, C, and D clades, on which basis their subgenome composition is designated BBCCDD.
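The designation rule described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and it assumes that each clade in which a species has sequence representation corresponds to exactly one diploid subgenome, which is the reasoning the text attributes to Walsh et al. (2015) rather than a general rule for all polyploids.

```python
def subgenome_composition(clades):
    """Return (composition, ploidy) for a collection of SOS1 clade letters.

    Assumption for illustration: every clade with sequence representation
    contributes one diploid subgenome (two copies of that letter).
    """
    letters = sorted(set(clades))
    composition = "".join(letter * 2 for letter in letters)
    ploidy = f"{2 * len(letters)}x"
    return composition, ploidy


# American tetraploids: sequences fall in clades A and B
print(subgenome_composition({"A", "B"}))       # ('AABB', '4x')
# Eurasian hexaploids such as C. album: clades B, C, and D
print(subgenome_composition({"B", "C", "D"}))  # ('BBCCDD', '6x')
```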

Allopolyploid plants have multiple subgenomes, each originating from a different diploid progenitor species. These subgenomes complicate matters where DNA sequencing is concerned, because genomic segments amplified by PCR for sequencing may differ in length due to indels (insertion/deletion polymorphisms). Since nucleotide bases are fluorescently labelled sequentially from the 5’ end during the sequencing reaction, the length discrepancies cause the bases to be labelled out of sync. The resulting chromatogram will feature multiple peaks for most of the bases, which makes it useless for phylogenetic analysis of allopolyploids. In order to obtain a gene sequence from each subgenome individually, PCR-amplified gene sequences from each of the subgenomes need to be cloned and isolated individually, which is how Walsh et al. (2015) were able to use SOS1 amplicons to infer the subgenome composition of these species. The Walsh et al. (2015) study has been of great importance to this thesis, since it provided ploidy and cladistic information for most of the North American Chenopodium species of interest, which are often overlooked in scientific studies in favor of economically important species, such as C. quinoa and C. album.
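The out-of-sync effect of an indel can be illustrated with a toy example. The sequences below are invented for illustration and are not SOS1 data; the sketch simply overlays two homeologous copies position by position, as direct sequencing of a mixed PCR product effectively does, and reports where the two copies would give conflicting base calls.

```python
def mixed_positions(copy_a, copy_b):
    """Positions at which two co-amplified copies disagree when read in parallel."""
    shared_length = min(len(copy_a), len(copy_b))
    return [i for i in range(shared_length) if copy_a[i] != copy_b[i]]


# Hypothetical amplicon from one subgenome
subgenome_a = "ACGTTGCAAGCTTAGGCTA"
# Hypothetical homeolog from the other subgenome: identical except for a 2-bp deletion
subgenome_b = subgenome_a[:4] + subgenome_a[6:]

conflicts = mixed_positions(subgenome_a, subgenome_b)
print("first conflicting position:", conflicts[0])
print("conflicting positions:", len(conflicts), "of", len(subgenome_b))
```

In this toy case the reads agree only up to the indel; every position after it is read out of register, which is why the copies must be cloned and sequenced separately.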

Figure 1: A molecular phylogeny of the genus Chenopodium based on SOS1 (Salt Overly Sensitive 1) sequencing from Walsh et al. 2015. This phylogeny diagram also specifies the ploidy of each species, as well as its inferred subgenome composition.

One consequence of the phylogenetic work done by Fuentes-Bazan et al. (2012a) is that several former chenopod species such as Chenopodium glaucum, C. rubrum, C. urbicum, C. capitatum, and C. bonus-henricus were found to fall outside of the redefined Chenopodium genus, and therefore belong to other genera and have been given names that reflect these changes. Based on herbarium records, each of these species has been recorded as being present in the Northern

New England (NNE) region consisting of New Hampshire, Vermont, and Maine

(NEHerbaria.org). Thus, the reclassification of these species reduces the number of Chenopodium species reported as present in NNE. The herbarium records also feature some specimens with names that do not appear in modern phylogenetic treatments, such as C. botrys, C. lanceolatum, and C. hybridum (NEHerbaria.org). These discrepancies, however, could be simple cases of plant misidentification. These outdated species names were removed from the list of chenopods in NNE, as they were not included in Haines’ taxonomic key (2011). Additionally, species such as C. foliosum, C. leptophyllum, and C. pratericola have been recorded as being present in NNE, but are considered to be “rare introductions”, meaning that they are not native, but that a few seeds may have been brought in from elsewhere and a resulting plant was collected as an herbarium specimen

(Haines, personal communication). These plants do not yet seem to exist as naturalized populations in the NNE states (Haines, personal communication), but they have not been shown to belong to genera outside of Chenopodium, so they are included in Table 1.

The Walsh et al. (2015) paper not only examined some of the American Chenopodium species but also provided important genomic information about them, such as ploidy levels for each species, which agree with the respective chromosome counts provided by Mandak et al. (2012) in Figure 2. For instance, the ploidy information included in the Walsh et al. (2015) phylogeny indicated that C. standleyanum is a diploid species with the AA genome structure, and that C. ficifolium is a diploid species sister to Clade B and thus may possess a genome structure of (or resembling) BB. Additionally, C. berlandieri and C. quinoa were shown to be tetraploid species with AABB subgenome structure, which indicates that C. berlandieri might be successfully crossed with quinoa in a breeding program (Walsh et al. 2015). Two species reported to exist in NNE, C. foggii and C. strictum, were not included in this phylogeny: C. foggii is a rare plant, at least in Northern New England, and has yet to be included in molecular phylogenetic studies, and C. strictum is a tetraploid with a putative BBCC or CCDD subgenome composition (Jellen and Maughan, personal communication).


Figure 2: From Mandak et al. 2012. Published chromosome counts from literature of various Chenopodium species, with the most common chromosome count printed in bold.

Chenopodium foggii has been reported to exist as a rare plant in Northern New England, and herbarium specimens putatively representing this species are maintained at several NNE herbaria: University of New Hampshire, Connecticut Botanical Society, University of

Connecticut, New England Botanical Club, The Gray Herbarium (Harvard), and University of

Maine Herbaria (NEHerbaria.org). Very little is known about C. foggii and it has not been included in any molecular phylogenetic analysis of the genus. The rare status of C. foggii has led to some confusion as to whether it is an endangered species, since it was on the Massachusetts Endangered Species List in 2004 (USDA 2004), but it is not listed on any current endangered species lists produced by the U.S. Fish and Wildlife Service.

COMPLICATED TAXONOMY: CHENOPODIUM ALBUM AS A SHORT CASE STUDY

As mentioned previously, there has long been great confusion surrounding both taxonomic and phylogenetic relationships within Chenopodium (Walsh et al. 2015). Possibly the best example of this lively debate surrounds C. album. For instance, Rahiminejad and Gornall (2004) noted that the confusion surrounding C. album and related taxa is due to the “usual” reasons surrounding this type of taxon, namely “phenotypic plasticity, parallel evolution, and putative hybridization”, and this is echoed by Mandak et al. (2012). Essentially, the main question is this: Does C. album exist as a single species with diploid, tetraploid, and hexaploid cytotypes (Rana et al. 2012, Krak et al. 2016)? Or is C. album uniformly hexaploid, with its purported lower-level cytotypes actually being separate species that have been mistaken for C. album or lumped together with hexaploid C. album (Cole 1962, Mandak et al. 2012)?

In any case, C. album and closely related species are often referred to as “the C. album complex” or “aggregate”, although the species included in these groupings are not always clearly defined, and vary based on the author and the region (Rahiminejad and Gornall 2004). For instance,

Graebner (1919, as cited in Rahiminejad and Gornall 2004) determined that nine species comprised this aggregate, including C. leptophyllum, C. album, C. quinoa, C. amaranticolor, C. striatum, C. opulifolium, C. berlandieri, C. ficifolium, and C. hircinum. Later European descriptions of the aggregate include C. suecicum, C. album, C. opulifolium, C. berlandieri, and

C. strictum (Uotila 1978). The 2012 Mandak et al. study, which utilized chromosome counting and flow cytometry, added C. ficifolium and C. striatiforme to that list.


The species within the C. album aggregate listed above are not treated as cytotypes of C. album by proponents of the single-species-with-3-cytotypes theory (Bera and Mukherjee 1992, Rahiminejad and Gornall 2004, and Rana et al. 2010). The supporters of the 3-cytotypes theory point to flavonoid studies (Rahiminejad and Gornall 2004), randomly-amplified polymorphic DNA (RAPD) and directed amplification of minisatellite DNA (DAMD) studies (Rana et al. 2010), chromosome counting studies (Cole 1962), and morphological studies (Bera and Mukherjee 1992) for support. The overwhelming majority of the authors of these studies, however, assume from the start that C. album has three cytotypes, and do not seem to consider that those cytotypes could be separate species. For instance, Bera and Mukherjee (1992) point out that there are significant morphological differences between the three cytotypes with regard to floral and somatic characteristics. Additionally, they note that “Similarities in the nature of grains of the three cytotypes of C. album on the one hand suggested their closeness and the dissimilarities suggest divergence”, but they do not discuss whether there could be more than one species in the group examined.

A useful perspective on the confusing taxonomic situation of C. album and related species in Asia, particularly India, is that it is peripheral to understanding the spectrum of Chenopodium species in the Western world. As C. album is a species of Eurasian origin, it seems possible, if not likely, that only one cytotype, the hexaploid, was introduced to the Western world, while greater diversity (and more cytotypes) could remain near the species’ center of origin. For instance, Rana et al. (2010) examined the Indian C. album aggregate using RAPD and DAMD markers. The authors note that there are significant differences between Indian diploid cytotypes, but those diploid cytotypes form a group distinct from the tetraploid and hexaploid cytotypes on the resulting dendrogram. Additionally, the three Indian cytotypes of C. album are not sexually compatible with each other, and tetraploid American Chenopodium species included in the study, such as C. berlandieri, grouped more closely to C. quinoa than to the Indian tetraploid cytotypes (Rana et al. 2010).

Rana et al. (2010) point out that “all the material of C. album of British, European, American and Australian origin is uniformly hexaploid”, but the Indian tetraploid cytotype “shows greater resemblance to C. album (sensu stricto) [versus C. berlandieri] in the presence of anthocyanin pigment, nature of [...] and seed coat markings”. This idea that C. album located outside of Asia is uniformly hexaploid is supported by chromosome counting studies (Cole 1962, Rahiminejad and Gornall 2004, Mandak et al. 2012). As mentioned previously, Mandak et al. (2012) published a table of chromosome counts of Chenopodium species from various studies (Figure 2). For some of the species, multiple counts are given, especially for C. album. Cole (1962) proposed that these discrepancies are due to “taxonomic misidentification of the original material used, mistakes which are easily made in this critical genus.” Cole continues by suggesting that the American 4x C. album cytotype is C. berlandieri var. zschackei, which “has been long confused with C. album in the U. S. A.”. To illustrate this point, Cole describes how 18 of the 23 North American C. album specimens housed in the Kew Herbarium were later re-identified as C. berlandieri var. zschackei by Aellen (1929). Interestingly, Aellen had concluded earlier that there were 34 forms and subspecies of lambsquarters in North America, but he apparently later changed his mind.

CURRENT STATE OF GENETIC RESOURCES AND GERMPLASM AVAILABILITY

FOR CHENOPODIUM

Linkage maps are important genetic resources for plant breeding programs, especially in relation to marker-assisted breeding (MAB). Marker-assisted selection (MAS), in which breeding material is chosen based on the presence of particular genetic markers rather than physical traits, is a very efficient method for choosing breeding material (Perez-de-Castro et al. 2012). Additionally, having a reference genome can allow breeders to assess the genetic diversity within germplasm collections and identify polymorphisms in a genome-wide manner (Perez-de-Castro et al. 2012).

The first linkage map for C. quinoa was published in 2004; it contained 255 markers generated using amplified fragment length polymorphism (AFLP), simple sequence repeat (SSR), and randomly-amplified polymorphic DNA (RAPD) methods, arranged in 35 linkage groups (Maughan et al. 2004). Later, in 2012, another quinoa linkage map was generated using the single nucleotide polymorphisms (SNPs) found in two recombinant inbred quinoa line populations, and it consisted of approximately 14,000 putative SNPs and 29 linkage groups (Maughan et al. 2012). Quinoa is an allotetraploid species with a gametophytic chromosome number of n = 2x = 18 (Palomino et al. 2008), so only 18 linkage groups are expected in a fully developed linkage map. In 2017, the full genome of C. quinoa was published (Jarvis et al. 2017), but there do not appear to be any published linkage maps or sequenced genomes for any other chenopod species.

The estimated nuclear DNA content of plants, or C-value, is another important tool for understanding a particular species, since the genome size of plants is correlated with traits such as weediness, dry mass production, seed mass, ecological requirements, cell size, and the length of cell cycle (Dolezal et al. 2007). The C-values of plants can be calculated using flow cytometry, and can be a useful tool for distinguishing between species, since “Genome size within a species is supposed to be exceptionally stable” (Vrit et al. 2016). The C-values of some weedy chenopods have been published, though the only species or counterparts in the NNE region that have been examined are C. album, C. strictum, C. ficifolium, and C. berlandieri subsp. nuttalliae and var. bushianum. A compilation of these C-values is available in Table 2. Provided that the chenopods used in these studies were identified properly, the C-values for these species should be similar to the C-values of NNE chenopods, since intraspecific genome size should be consistent (Vrit et al. 2016).

Table 2: Published C-values in pg and Mbp for NNE chenopod species and quinoa. Mbp = pg x 977 (Dolezal 2002).

| Source | Taxon | Published 1C value (pg) | Published 1C value (Mbp) |
|---|---|---|---|
| Ohri 2002 | C. album | 0.77 | 752 |
| Ohri 2002 | C. album | 1.63 | 1593 |
| Ohri 2002 | C. ficifolium | 0.66 | 645 |
| Ohri 2002 | C. bushianum | 1.59 | 1553 |
| Kubesova et al. 2010 | C. strictum | 0.8 | 782 |
| Bennett et al. 1998 | C. album | 2.33 | 2276 |
| Kolano et al. 2012 | C. quinoa | 1.54 | 1503 |
| Kolano et al. 2012 | C. quinoa | 1.45 | 1419 |
| Palomino et al. 2008 | C. quinoa | 1.48 | 1446 |
| Palomino et al. 2008 | C. berlandieri var. nuttalliae | 1.52 | 1485 |
| Vrit et al. 2016 | C. album | 1.89 | 1844 |
| Vrit et al. 2016 | C. ficifolium | 0.87 | 845 |
| Vrit et al. 2016 | C. strictum | 1.03 | 1004 |
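The conversion stated in the Table 2 caption (Mbp = pg x 977) can be checked with a short Python sketch. The function and constant names are illustrative only; the three picogram values are taken from Table 2.

```python
MBP_PER_PG = 977  # conversion factor as stated in the Table 2 caption


def pg_to_mbp(c_value_pg):
    """Convert a 1C value in picograms to megabase pairs."""
    return c_value_pg * MBP_PER_PG


for taxon, c_value_pg in [("C. album", 2.33), ("C. ficifolium", 0.66), ("C. strictum", 0.8)]:
    print(f"{taxon}: {c_value_pg} pg = {round(pg_to_mbp(c_value_pg))} Mbp")
```

These three rows reproduce the tabulated 2276, 645, and 782 Mbp. A few other Table 2 entries differ from pg x 977 by a few Mbp (for example, 1.54 pg is listed as 1503 Mbp), presumably because those values were taken as published rather than recomputed.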

GERMPLASM RESOURCES

Currently, Chenopodium germplasm is available from only a handful of sources. As of 1998, 1029 quinoa ecotypes were held in the Universidad Nacional del Altiplano-Puno (UNAP) (Ortiz et al. 1998). In 2000, at least 1512 quinoa accessions were housed in the Bolivian national quinoa collection (Rojas et al. 2000). Another resource is the Centro Internacional de la Papa (CIP-FAO) collection, which, in 2007, housed at least 22 publicly available quinoa accessions (Christensen et al. 2007). As of October 2017, there were 297 Chenopodium accessions available from the USDA National Plant Germplasm System (USDA-NPGS), of which 164 are C. quinoa accessions (ars-grin.gov). Other germplasm could be available from private and university-based collectors, such as Rick Jellen from Brigham Young University in Provo, Utah (personal communications).

AIMS OF THIS PROJECT

The goal of this project was to develop a molecular pipeline to identify chenopod species in NNE, with three aims. The first aim was to provide a baseline status of this weedy genus in NNE so that its distribution and diversity can be monitored in the future as a response to environmental change. The second aim was to assemble a collection of local chenopod germplasm that, because of its tetraploidy, is potentially suited to interbreeding with quinoa. The third aim was to establish genetic resources, if possible, for future research focused on downy mildew resistance in Chenopodium species, as it is a formidable issue for quinoa growers worldwide (Gandarillas et al. 2013). In many instances, a diploid model plant system with a well-developed genome assembly is best suited to this type of research, so a collection of local diploid species (C. ficifolium, C. standleyanum, and C. foggii) was needed. In addition to fulfilling these stated aims, this project also focused on gathering specimens for both the University of New Hampshire Hodgdon Herbarium and the USDA North Central Plant Introduction Station in Ames, Iowa. The Hodgdon Herbarium has few accessions of Chenopodium specimens collected in the past decade, and the Plant Introduction Station is lacking in Chenopodium material from the New England region. The specific objectives and approaches for this study were:


1) Collect living and herbarium specimens representative of the Chenopodium species present in the Northern New England region.

a. Examine the collection records of Chenopodium species from the past 10 – 20 years in the herbaria of New England and identify candidate sites for collection visits, including previously listed or ecologically similar sites.

b. Visit sites of potential interest, and gather samples of plants that resemble Chenopodium species, until a range of within-species diversity has been gathered.

2) Develop a molecular pipeline to correctly identify the collected specimens.

a. Identify the plants using morphological and ecological characteristics to the extent possible.

b. Calculate each collected plant’s 1C value (genome size) using flow cytometry and a known cytometry standard.

c. Compare the plant’s visual DNA profile from RAPD PCR to those of standard comparators.

d. Confirm the identities of the standard comparators by DNA sequencing, modeled after previous phylogenetic studies, and comparing the sequence data to their published sequences.

3) Submit Chenopodium specimens to the UNH Hodgdon Herbarium.

4) Submit Chenopodium seed samples that are representative of the Eastern New England region to the USDA North Central Plant Introduction Station in Ames, Iowa.

5) Develop diploid hybrids of potential value for future genetic and genomic studies.

METHODS


DETERMINATION OF COLLECTION SITES

A wealth of herbarium records exists documenting the collection and taxonomic history of Chenopodium species in Northern New England (NNE), thus providing an important resource for this project. Additionally, the morphology of these chenopods is well-described in botanical keys and literature. However, the disadvantages to relying solely on morphology for chenopod species identification are many. For instance, oftentimes the distinguishing characteristic between two species is something minute, such as a texture difference on the seed coat of a species, so that the collector can only make the distinction at a certain stage of the plant’s development. Additionally, as previously described, there is a substantial degree of between-species and within-species diversity present in the chenopods, so that an identification made solely upon morphological characteristics may have a level of uncertainty. The final disadvantage is that, in the pre-molecular era, morphological identification was the only method available, and the morphological diversity of the chenopod genus has led botanists to either lump multiple species together under one name, or to grant each differing form its own species name.

As a result, the herbarium records and other literature are filled with extraneous species and a general sense of confusion. Thus, any person studying this group of plants must decide whether a plant belongs to its given species name, or whether it is something different. In this project, examining the morphological characteristics of each plant provided only a starting point for identification.

Chenopodium album is a well-known agricultural weed. As a hexaploid species it was of limited interest to this project because it cannot be interbred with quinoa, a tetraploid. In order to determine where to look for other weedy chenopods in the NNE area, the records of the most recently collected chenopods in the UNH Hodgdon Herbarium were consulted. Only 10 Chenopodium specimens were submitted to the Hodgdon Herbarium between 2000 and 2017.

These 10 submissions comprise four taxa: C. album, C. berlandieri var. macrocalycium, C. glaucum var. glaucum, and C. rubrum var. rubrum, although as described previously, the latter two taxa are now classified in the genus Oxybasis, and C. album was of limited interest. Thus, the one specimen of particular interest was C. berlandieri var. macrocalycium. This specimen was collected from the Isles of Shoals in the Gulf of Maine in 2006, where it was found on a “Maritime cobble beach” (NHA-553957). The characteristics of this collection site indicated that the seashore in the New Hampshire seacoast region would be a good place to begin a search for living specimens of C. berlandieri var. macrocalycium.

PLANT COLLECTION PROTOCOL

Upon finding a chenopod plant in the field, a photograph of the plant and its surroundings was taken for documentation of site characteristics, and the coordinates of the collection site were recorded using a Garmin etrex Summit Global Positioning System (GPS) device. The plant was then carefully pulled or dug up to minimize damage to the root mass. The root mass of the plant was then wrapped in a wet paper towel and placed into a plastic freezer bag, which was then closed and placed in a backpack or cooler to shield it from the sunlight. Each plant was labeled with a code derived from the GPS coordinates of its collection site. Upon returning to the lab, the plants were either potted immediately in the UNH Greenhouse, or placed in the refrigerator if they could not be attended to right away. Plant specimens that were to be donated to the UNH Hodgdon Herbarium were pressed in the field using a standard plant press.

STANDARD COMPARATORS USED IN THIS STUDY


Because one of the main objectives of this study was to develop a molecular method for definitively identifying Chenopodium species in NNE, a set of standard comparator plants was required as a control against which field-collected specimens could be compared. Comparator plants were obtained from the germplasm collection at the USDA North Central Regional Plant Introduction Station in Ames, Iowa. The accessions were chosen based on their geographical collection sites, which were as close to New England as could be managed, so that they would be representative of the diversity of New England. In some cases, there was only one USDA accession available, so it was used even though it may not have been collected near New England. All of the USDA comparator plants used in this study are listed in Table 3, along with their USDA accession numbers (PI = Plant Introduction Number) and the experiments in which each accession was used.

Table 3: A list of all comparator Chenopodium species obtained from the USDA North Central Regional Plant Introduction Station in Ames, Iowa, including their accession numbers and the studies in which they were used.


Taxon                               USDA Accession(s) Used   Used in RAPD PCR study?   Used in SOS1 DNA sequencing?   Used in Flow Cytometry Study?   Used in chloroplast DNA sequencing?
C. album                            Ames 29961               Yes                       Yes                            Yes                             No
C. album                            PI 666271                Yes                       Yes                            Yes                             Yes
C. berlandieri var. bushianum       PI 608030                Yes                       Yes                            No                              Yes
C. berlandieri var. macrocalycium   PI 666279                Yes                       Yes                            Yes                             Yes
C. berlandieri var. zschackei       PI 666288                Yes                       Yes                            Yes                             Yes
C. ficifolium                       PI 658749                Yes                       Yes                            Yes                             No
C. quinoa                           PI 510533                No                        Yes                            Yes                             No
C. quinoa                           PI 478418                Yes                       No                             No                              No
C. quinoa                           PI 587173                Yes                       No                             No                              No
C. quinoa                           PI 614901                Yes                       No                             No                              No
C. quinoa                           PI 614880                Yes                       No                             No                              No
C. quinoa                           PI 614881                Yes                       No                             No                              No
C. quinoa                           Ames 13734               Yes                       No                             No                              No
C. standleyanum                     PI 666323                Yes                       Yes                            No                              Yes
C. strictum                         PI 666324                Yes                       Yes                            No                              Yes

ANALYSIS OF COLLECTED PLANT MATERIALS

DETERMINATION OF PLOIDY VIA FLOW CYTOMETRY

In flow cytometry, nuclei from the somatic cells of a plant are released via chopping with a razor, stained with a fluorescent dye, and then injected into a flow cytometry instrument where they are pumped at a high speed past lasers and sensors that record the intensity of the cell fluorescence (Dolezel et al. 2007). Cell fluorescence data are presented in graphical form, with two peaks present – one representing cells in the G1 phase of the cell cycle, and the other representing cells in the G2 phase (Dolezel et al. 2007). If a reference standard is included, the position of the G1 peak along the X-axis can be used to infer either ploidy level or the 1C value of the sample, per the following equations (Dolezel et al. 2007):

\[ \text{Sample ploidy} = \text{Reference ploidy} \times \frac{\text{mean position of G1 sample peak}}{\text{mean position of G1 reference peak}} \]

\[ \text{Sample 1C value} = \text{Reference 1C value} \times \frac{\text{sample 1C mean peak position}}{\text{reference 1C mean peak position}} \]
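As a worked illustration of the two equations above, the following minimal sketch computes ploidy and 1C value from peak positions. The function names and the peak positions in the usage example are hypothetical; they are not measurements from this study.

```python
def sample_ploidy(reference_ploidy, sample_g1_mean, reference_g1_mean):
    """Sample ploidy = reference ploidy x (sample G1 mean / reference G1 mean)."""
    if reference_g1_mean <= 0:
        raise ValueError("reference peak position must be positive")
    return reference_ploidy * sample_g1_mean / reference_g1_mean


def sample_1c_value(reference_1c, sample_peak_mean, reference_peak_mean):
    """Sample 1C = reference 1C x (sample peak mean / reference peak mean).

    The result is in the same unit as reference_1c (for example Mbp).
    """
    if reference_peak_mean <= 0:
        raise ValueError("reference peak position must be positive")
    return reference_1c * sample_peak_mean / reference_peak_mean


if __name__ == "__main__":
    # Hypothetical peak positions (arbitrary fluorescence units).
    print(sample_1c_value(4440, 20000, 60000))  # reference 1C given in Mbp
    print(sample_ploidy(2, 300, 100))
```

Because both equations are simple ratios, the sample and reference peak positions only need to be expressed in the same fluorescence units.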

There are several advantages to using flow cytometry to help identify chenopods, the first of which is that this method is relatively inexpensive. Flow cytometric analyses are also fairly quick to perform, and do not require a large amount of plant tissue. Additionally, both ploidy level and subgenome composition can be inferred from flow cytometry data, as long as there is relevant reference information available, such as genome sizes and subgenomic compositions for the species in question. Such information was provided in Jellen et al. (2001).

There are several disadvantages to this method, however. For instance, flow cytometers are expensive machines to purchase, and can be expensive to maintain. Flow cytometry also requires fresh, young tissue, so forethought is needed with regard to growing reference standards so that they are available in a timely fashion for analysis. Lastly, reference data, such as the subgenome sizes and ploidy level of each species, as well as the 1C value of the reference plant, are also required.

In this project, Chenopodium specimens were first compared to diploid Fragaria vesca subsp. vesca ‘Hawaii 4’ strawberry, which has a 1C value of 260 Mbp (Bassil and Davis et al. 2015). Chenopod samples were prepared as single-chops and analyzed separately from the ‘Hawaii 4’ strawberry reference. In a single-chop preparation, the reference material and the unknown Chenopodium material are prepared as separate samples. All of these singly prepared samples are analyzed in the same batch, and the Mean FL2-A value of the single-chopped reference is used in the 1C value calculations of the unknown material.

This single-chop approach was eventually modified so that samples were prepared as co-chops in a 4:1 weight ratio of chenopod to Pisum sativum L. ‘Ctirad’ pea tissue, which has a 1C value of 4440 Mbp, as per Dolezel et al. (2007). In contrast to a single-chop preparation, in a co-chop preparation the reference material is chopped in the same petri dish at the same time as the unknown sample, and this is repeated for every sample in the batch. This way, the reference data are generated at the same time and under the same biochemical conditions as the unknown data. These reference Mean FL2-A values are used in the 1C value calculations.

Following tentative morphology-based identification, flow cytometry was the next identification step taken when Chenopodium species were collected from the field, since it provided immediate insight into the ploidy level and subgenome composition of the plant. By illuminating the ploidy level and subgenome composition of each collected sample, the scope of possible identities for each plant was greatly narrowed, even permitting immediate species identifications in some cases (for instance, hexaploid C. album).

The flow cytometry protocol was as follows: a chopping buffer was made by mixing 1.4mL/sample of De Laat’s buffer (Appendix 1) with 1.05uL β-mercaptoethanol/mL buffer. A total of 70 – 120mg of young tissue was collected and placed in a petri dish with 500uL of the chopping buffer. Leaves were then chopped into tiny pieces using a double-edged razor in a straight up-and-down motion. An additional 500uL aliquot of buffer was added to the dish, and the solution was transferred to a 30um filter inside a clean 1.5-mL microfuge tube. Samples were promptly put on ice after filtering. The filtered samples were pelleted in a centrifuge at 150 x g (500 rpm) for 5 minutes and then returned to the ice. Supernatant was removed so that only about 200uL remained in the tube, to which 400uL of propidium iodide stain (Appendix 2) was added. The stained samples were incubated in the dark, over ice, for 30 minutes before analysis in the BD Accuri C6 Flow Cytometer. Using this method, the collected specimens were split into smaller, ploidy-based groups, providing a way to focus only on the species of interest, such as diploids and tetraploids.
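The chopping buffer quantities in the protocol above scale linearly with the number of samples. The sketch below shows that arithmetic; the function name and the rounding to two decimals are illustrative assumptions, while the per-sample and per-mL amounts are those stated in the protocol.

```python
# Amounts stated in the flow cytometry protocol.
BUFFER_ML_PER_SAMPLE = 1.4   # mL of De Laat's buffer per sample
BME_UL_PER_ML_BUFFER = 1.05  # uL of beta-mercaptoethanol per mL of buffer


def chopping_buffer(n_samples):
    """Return (buffer volume in mL, beta-mercaptoethanol volume in uL)."""
    if n_samples < 1:
        raise ValueError("need at least one sample")
    buffer_ml = n_samples * BUFFER_ML_PER_SAMPLE
    bme_ul = buffer_ml * BME_UL_PER_ML_BUFFER
    # Rounded for pipetting; the precision is an assumption.
    return round(buffer_ml, 2), round(bme_ul, 2)


if __name__ == "__main__":
    buffer_ml, bme_ul = chopping_buffer(10)
    print(f"{buffer_ml} mL buffer + {bme_ul} uL beta-mercaptoethanol")
```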

SPECIES IDENTIFICATION AND ASSESSMENT OF WITHIN-SPECIES DIVERSITY VIA RAPD PCR

Randomly amplified polymorphic DNA (RAPD) PCR creates unique visual profiles for different specimens when run on a gel, based on how the arbitrary primers interact with the sample genomic DNA. One major advantage to RAPD PCR is that it allows for the visualization of the within- and between-species genetic variation at fairly high resolution. RAPD PCR can also potentially be used to establish the parentage of hybrids, provided that the parental visual profiles display diagnostic differences. Lastly, RAPD PCR is easy to perform and the reagents involved are inexpensive.

There are several disadvantages to using RAPD PCR to identify chenopods, one of which is that this method takes multiple days to perform, due to a lengthy thermocycler protocol, a long run-time for the gel, and extensive staining/de-staining procedures. Isolated template DNA is also required for RAPD PCR, which makes it much slower than methods utilizing fresh tissue. RAPD PCR also does not illuminate ploidy, so it unfortunately cannot confirm the ploidy level inferred from flow cytometry. Another disadvantage is that a reference comparator must be included for every different species in the PCR assay, which also means that this method cannot be used to identify an unknown specimen unless its profile visually matches that of a comparator. The final, and most troubling, disadvantage is that there is no way to truly know whether the reference comparators had been correctly identified in the first place, so their identity must be confirmed via DNA sequencing. As long as these limitations are kept in mind, RAPD PCR should provide a method to visually confirm a species’ identity when an appropriate comparator sample is included for reference. Additionally, it is important that some specimens – mainly those roughly identified as Chenopodium berlandieri – be identified to the subspecies level. There are three subspecies of C. berlandieri that are thought to exist in New England: var. macrocalycium and var. bushianum (both native), and var. zschackei (naturalized).

Standards for each of the C. berlandieri subspecies were procured from the USDA North Central Plant Introduction Station in Ames, Iowa (Table 3). Accessions collected in New England were obtained from the USDA as standards, when possible, as those would theoretically be the closest genetically to the field-collected species. Once the seeds of the standards had sprouted, DNA was extracted according to the following protocol (Torres et al. 1993): 1 mL grinding buffer per sample was prepared by adding 4 uL β-mercaptoethanol/mL of 2% CTAB. For each plant, 0.1g of fresh, young leaves was placed in a ceramic mortar, along with enough liquid nitrogen to cover the leaves. Once the liquid nitrogen evaporated, the leaves were slowly ground with a ceramic pestle. Before the leaf tissue thawed, 1 mL of grinding buffer was added, and the tissue was ground into a slurry. This slurry was then added to a clean 1.5-mL microfuge tube. Once all the samples had been ground, the tubes were incubated in a 60C water bath for 30 minutes. After incubation, the tubes were allowed to cool for 10 minutes on the bench top. Chloroform:octanol (24:1) was added to nearly fill each tube, and each tube was then vortexed. The tubes were then centrifuged at 14,000 x g for 5 minutes to separate the phases. Following centrifugation, the upper aqueous phase was transferred to a new 1.5-mL microfuge tube. If the aqueous solution was cloudy, the chloroform extraction steps were repeated. Ice-cold 95% ethanol was added to each tube with the aqueous phase, and the tubes were stored in a -20C freezer overnight to increase the amount of precipitated DNA. On the second day, the tubes were centrifuged at 14,000 x g for 5 minutes, and the supernatant was removed. 1 mL of 70% ethanol was added to each tube, and the tubes were again held in the freezer overnight. On the third day, tubes were centrifuged at 14,000 x g for 5 minutes, and the supernatant was again removed. The DNA was dried by placing the open tubes in a speed vacuum centrifuge at 45C for 1 minute. 25 uL TE buffer was added to each tube, and the tubes were refrigerated overnight. On the fourth day, a 1 uL RNase/1 mL sterile water solution was prepared, and 25 uL of the solution was added to each tube. The tubes were gently mixed and then incubated at 37C for 1 hour. Finally, the DNA was quantified using an Invitrogen Qubit Fluorometer (Appendix 3), and diluted with sterile water to a final DNA concentration of 25 ng/uL.
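The final dilution to 25 ng/uL follows the standard C1V1 = C2V2 relationship. A minimal sketch is given below; the function name, the final volume, and the stock concentration in the usage example are hypothetical and are not values reported in this study.

```python
def dilution_volumes(stock_ng_per_ul, final_volume_ul, target_ng_per_ul=25.0):
    """Return (uL of DNA stock, uL of sterile water) using C1*V1 = C2*V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    water_ul = final_volume_ul - stock_ul
    return stock_ul, water_ul


if __name__ == "__main__":
    # Hypothetical fluorometer reading of 100 ng/uL, diluted to 40 uL total.
    stock_ul, water_ul = dilution_volumes(100.0, 40.0)
    print(f"{stock_ul} uL DNA + {water_ul} uL water")
```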

Following DNA extraction, six RAPD primers were individually tested to identify which ones would give unique profiles for Chenopodium species. Primers (Table 4) were tested by using them in RAPD PCR assays of several Chenopodium species. Primers were chosen for use in subsequent assays if the resulting gel displayed a sufficient amount of within-species and between-species diversity. The standards and collected species that had been identified as a particular species by flow cytometry were then analyzed using RAPD PCR according to the following protocol: for each DNA sample, a master mix containing 10.5 uL sterile water, 5 uL LongAmp Reaction Buffer, 1 uL dNTPs, 2.5 uL of 25 mM MgCl2, and 1 uL LongAmp Taq Polymerase was assembled. Each PCR tube contained a total reaction volume of 25 uL, which comprised 20 uL master mix, 1 uL of 0.5X RAPD PCR primer, and 4 uL of 25 ng/uL template DNA. The tubes were treated using the following thermocycler protocol: 2 minutes at 94C, 1 minute at 94C, 2.5 minutes at 35C, 30 seconds at 44C, 2.5 minutes at 65C, and 7 minutes at 72C for a total of 39 cycles. The tubes were held in the thermocycler at 5C. A 2% agarose:NuSieve electrophoresis gel (Appendix 4) was prepared using 1X TBE buffer (Appendix 5), and the samples and 2 uL tracking dye/sample were loaded into the gel and run at 100V for 3 hours, and then stained for 15 minutes in a 0.5 ug/mL ethidium bromide solution. The gel was de-stained for 30 minutes in two changes of water, and photographed using a Fotodyne Foto/Analyst Luminary FX camera under trans-UV light.
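The per-reaction master mix volumes given in the protocol can be scaled for a batch of reactions, as in the sketch below. The function name and the optional pipetting overage are assumptions added for illustration; the protocol itself does not mention an overage.

```python
# Per-reaction master mix volumes (uL) as listed in the RAPD PCR protocol.
MASTER_MIX_UL = {
    "sterile water": 10.5,
    "LongAmp Reaction Buffer": 5.0,
    "dNTPs": 1.0,
    "25 mM MgCl2": 2.5,
    "LongAmp Taq Polymerase": 1.0,
}


def scale_master_mix(n_reactions, overage=0.0):
    """Scale each component for n reactions.

    overage is a fractional excess (0.1 means 10% extra); it is an assumed
    convenience and is not part of the published protocol.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(ul * factor, 2) for name, ul in MASTER_MIX_UL.items()}


if __name__ == "__main__":
    # One reaction's worth of master mix should total 20 uL.
    print(sum(MASTER_MIX_UL.values()))
    for name, ul in scale_master_mix(12, overage=0.1).items():
        print(f"{name}: {ul} uL")
```

Each tube then receives 20 uL of this mix plus the 1 uL of primer and 4 uL of template described above, for the 25 uL total.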

Table 4: The names and nucleotide sequences of RAPD primers tested for use in this project

Primer Code   Primer Sequence              Chosen for RAPD PCR study
BC106         5’ – CGT CTG CCC G – 3’      Yes
BC104         5’ – GGG CAA TGA T – 3’      Yes
BC190         5’ – AGA ATC CGC C – 3’      Yes
BC191         5’ – CGA TGG CTT T – 3’      Yes
BC123         5’ – GTC TTT CAG G – 3’      Yes
B200          5’ – TCG GGA TAT G – 3’      Yes
OPO-11        5’ – GAC AGG AGG T – 3’      Yes
BC105         5’ – CTC GGG TGG G – 3’      No
BC103         5’ – GTG ACG CCG C – 3’      Yes

CONFIRMATION OF IDENTITY FOR USDA STANDARDS VIA DNA SEQUENCING


Using DNA sequencing to either identify unknown chenopods or confirm the identity of reference comparators is a fairly quick and inexpensive method that can provide definitive results. One major resource for this kind of work is the GenBank DNA sequence database, managed by the National Center for Biotechnology Information (NCBI), which is a searchable database of millions of publicly available DNA sequences. This invaluable resource allows researchers to directly compare their DNA sequence data to reference sequences, provided that reference sequences have been uploaded to GenBank. DNA sequencing was used in this study in order to answer two questions: 1) Can DNA sequencing be used to definitively confirm the identity of the reference comparators utilized in the other identification methods? and 2) Can DNA sequencing be used to definitively identify a new, unknown chenopod?

CHLOROPLAST DNA SEQUENCING

In this experiment, site-specific chloroplast DNA (cpDNA) sequencing was assessed as a method to definitively identify chenopod specimens. DNA from 15 chenopod specimens, both collected samples and USDA comparator species, was amplified via PCR using trnTAC2 (5’- CAT TTT TCG GTA TAG TAA BCC -3’) and trnTf (5’- ATT TGA ACT GGT GAC ACG AG -3’) primers (Fuentes-Bazan et al. 2012a), LongAmp Taq Polymerase, 5X LongAmp Reaction Buffer, and 10mM dNTPs. These samples were amplified using the following thermocycler profile: initial denaturing at 94C for 30 seconds, 30 cycles of denaturing at 94C for 30 seconds, annealing at 52C for 1 minute, and extension at 65C for 45 seconds. There was a final extension at 65C for 10 minutes, and samples were held at 4C indefinitely. The size of the PCR products was examined via gel electrophoresis on a 1% agarose gel, and the products were then purified using a DNA Clean and Concentrator kit (Zymo Research). These purified PCR products were then quantified using a Qubit fluorometer and prepared for sequencing per the service provider’s specifications.
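The trnTAC2 primer sequence given above contains the letter B. Assuming this is the IUPAC degenerate code (C, G, or T) rather than a transcription artifact, the primer represents a small pool of concrete oligonucleotides. The sketch below enumerates that pool; the function name is hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]


if __name__ == "__main__":
    # trnTAC2 as written in the text, with spaces between triplets.
    for seq in expand_primer("CAT TTT TCG GTA TAG TAA BCC"):
        print(seq)
```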

Two tubes per PCR product were prepared for sequencing, so that each product could be sequenced using both the forward and the reverse primer. All of the DNA sequencing for this study was performed by GeneWiz, LLC in South Plainfield, NJ. Sequencing reaction tubes were prepared and shipped to the GeneWiz facility on dry ice, and the samples were sequenced immediately. Sequence data were obtained within 48 hours of shipping the prepared samples.

Sequences from the Chenopodium clade of the Fuentes-Bazan et al. (2012b) phylogeny were obtained from the NCBI database and used as reference sequences (Appendix 9). FASTA files for the Fuentes-Bazan et al. (2012b) reference sequences were downloaded from NCBI and passed through EditSeq (DNAStar) to convert the .txt files to .seq files. These .seq files were then uploaded to a MegAlign project (DNAStar). Consensus sequences for the samples in this study were generated from the raw sequence data using SeqManPro (DNAStar), and they were then added to the MegAlign project. This project was aligned using the Clustal W method, and the “redundant” sequences, or sequences with 100% identity based on the Sequence Distances table, were removed. The sequences were also rearranged to reflect a decreasing sequence identity when compared to USDA quinoa accession (PI 510533). A phylogenetic tree was generated using the remaining sequences (FigTree, Appendix 10).
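The redundancy filtering and identity-based ordering described above were carried out in MegAlign. The sketch below is only a simplified stand-in for those two steps, assuming pre-aligned sequences of equal length and a plain percent-identity measure that skips columns where both sequences have a gap. MegAlign's own distance calculation may differ, and all names and the toy alignment here are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # a gap in both sequences carries no information
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def drop_redundant(aligned):
    """Keep the first of any group of sequences that are 100% identical."""
    kept = {}
    for name, seq in aligned.items():
        if any(percent_identity(seq, other) == 100.0 for other in kept.values()):
            continue
        kept[name] = seq
    return kept


def order_by_identity(aligned, reference_name):
    """Sequence names sorted by decreasing identity to the reference."""
    ref = aligned[reference_name]
    return sorted(aligned, key=lambda n: percent_identity(aligned[n], ref),
                  reverse=True)


if __name__ == "__main__":
    # Toy alignment; these are not sequences from this study.
    toy = {"ref": "ACGT-A", "dup": "ACGT-A", "near": "ACGTTA", "far": "TTGTTA"}
    unique = drop_redundant(toy)
    print(order_by_identity(unique, "ref"))
```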

SALT-OVERLY SENSITIVE 1 (SOS1) SEQUENCING

In this experiment, a TOPO TA Cloning Kit for Sequencing (Invitrogen, Version O) was used to clone Chenopodium PCR amplicons representing the subgenomes of allopolyploid samples (as previously determined by flow cytometry). In a sterile environment, agar plates were prepared by pouring enough LB/agar solution (Appendix 6) into petri dishes to just cover the bottom of the plate. Plates were allowed to solidify, and were stored in a cold room. DNA samples were amplified via PCR using SOS1i16F2 (5’- TGT TAC ATA TGC GCT GCA TTT TTA CG -3’) and SOS1i16R (5’- TTT CAG TGA TGA CTG CAG AAG -3’) primers (Walsh et al. 2015). A control PCR was prepared using 100ng Control DNA Template (Invitrogen), 5X LongAmp Taq Reaction Buffer, 50mM dNTPs (Invitrogen), Control PCR Primers (0.1 ug/ul, Invitrogen), LongAmp Taq Polymerase, and water (Invitrogen). PCR samples were amplified using the following thermocycler profile: initial denaturing at 94C for 30 seconds, then 30 cycles of denaturing at 94C for 30 seconds, annealing at 52C for 1 minute, and extension at 65C for 45 seconds. There was a final extension at 65C for 10 minutes, and samples were held at 4C indefinitely. Successful amplification was confirmed via electrophoresis on a 1% agarose gel.

Immediately following PCR amplification, the DNA content of the products was quantified using a NanoDrop ND-1000 spectrophotometer. The products were diluted using sterile water (Invitrogen) so that they contained 6.45 ng/uL of DNA each, and the kit control was diluted to contain 5.69 ng/uL of DNA. This was done so that a 3:1 insert:vector ratio was achieved, according to the following calculation:

\[ 3 \times \left( \frac{\text{bp of insert}}{\text{bp of vector}} \times \text{ng of vector} \right) \]
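The calculation above can be sketched as a small function, on the assumption that its result is the mass of insert (in ng) needed for the chosen insert:vector ratio. The function name and the insert length, vector length, and vector mass in the usage example are hypothetical round numbers, not values from this study or from the kit documentation.

```python
def insert_ng_for_ratio(insert_bp, vector_bp, vector_ng, ratio=3.0):
    """ratio x (bp of insert / bp of vector x ng of vector)."""
    if vector_bp <= 0:
        raise ValueError("vector length must be positive")
    return ratio * (insert_bp / vector_bp) * vector_ng


if __name__ == "__main__":
    # Hypothetical example: 1000 bp insert, 4000 bp vector, 10 ng of vector.
    print(insert_ng_for_ratio(1000, 4000, 10))
```

Dividing the resulting mass by the concentration of the diluted PCR product gives the volume of product to add to the ligation.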

Immediately following quantification and dilution, the samples were ligated by combining 0.5uL salt solution (Invitrogen), 0.5uL pCR4-TOPO vector (Invitrogen), enough PCR product to achieve the 3:1 insert:vector ratio, and enough sterile water to bring the total reaction volume to 6 uL. These ligation mixtures were stirred gently using a pipette tip, incubated on the benchtop for 10 minutes at room temperature, and incubated for at least 30 minutes at -20C. A 2uL aliquot of each ligated sample was added to a 25uL aliquot of One Shot Mach1-T1 Competent E. coli cells (Invitrogen) that had been thawed on ice. The cells were mixed gently, incubated on ice for 10 minutes, heat-shocked in a 42C water bath for 30 seconds, and then returned to ice. 125uL of room-temperature SOC medium (Invitrogen) was added to each tube of cells, and the tubes were shaken horizontally at 200rpm at 37C for 1 hour. LB/agar plates were pre-warmed at 37C for 30 minutes. In a sterile hood, two plates were prepared for each sample: one with 40ul of the transformation mix, and the other with 100ul of transformation mix. Prepared plates were incubated at 37C for at least 8 hours.

Following incubation, approximately 10 bacterial colonies were collected from each sample and cultured overnight in pre-warmed (37C) LB without ampicillin, in a two-step process. First, 100uL of warm LB was added to a 1.5-mL tube, and a bacterial colony was placed in the solution. The tube was shaken horizontally at 200rpm at 37C for 2 hours. Then 900uL more of the LB media was added to the same tube, and the tube was shaken horizontally under the same conditions for at least 8 hours, or overnight. Following this initial culturing step, 100uL of each liquid culture was plated on LB/agar plates containing 25ug ampicillin and incubated at 37C overnight. One bacterial colony from each plate was picked and cultured according to the steps outlined previously, but in LB media with 25ug ampicillin. The plasmid DNA was isolated using a Zyppy Plasmid Miniprep Kit (Zymo Research). These purified plasmid samples were quantified and then sequenced according to the service provider’s specifications; however, poor sequencing results were obtained, so the rest of the samples were prepared as purified PCR samples with the following protocol:

1uL of the plasmid DNA was added to 1mL of sterile H2O to make a working stock of template plasmid DNA. This template plasmid DNA was amplified via PCR using 1uL of the working stock, and the size of the product was checked using a 1% agarose gel. The PCR product was then purified using a DNA Clean & Concentrator kit (Zymo Research), then quantified and prepared for sequencing according to the service provider’s specifications, using T3 primer (5’- TAA TAC GAC TCA CTA TAG GG -3’) as the sequencing primer.

Diploid chenopod template DNA was amplified via PCR using SOS1i16F2 (5’- TGT TAC ATA TGC GCT GCA TTT TTA CG -3’) and SOS1i16R (5’- TTT CAG TGA TGA CTG CAG AAG -3’) primers (Walsh et al. 2015), and the size of the product was checked using a 1% agarose gel. The PCR product was purified using a DNA Clean & Concentrator kit (Zymo Research), and the purified product was quantified and prepared for sequencing according to the sequencing service provider’s specifications. Diploid chenopods were sequenced using both the forward and reverse primers so that a consensus sequence could be made.

Sequence data featured in the Walsh et al. (2015) phylogeny were needed as reference data (Appendix 12), so FASTA files were downloaded for each sequence and passed through EditSeq (DNAStar) to convert .txt files into .seq files. Reads from the diploid plants were uploaded to SeqMan Pro (DNAStar) in order to make a consensus sequence. It was then noted that the reference data had been sequenced in the opposite direction from the data in this study, meaning that an alignment from the 5’-end, with the reference data in the forward direction, would never succeed because large amounts of data appeared to be missing. The reverse complement of the reference data was taken (www.reverse-complement.com), and each sequence file generated by this study was checked for directionality. If the sequence file appeared to be in the same direction as the original reference data files, the reverse complement of the file was taken. If the sequence file appeared to be in the opposite direction from the original reference data, any messy “N” sections at the ends were trimmed.
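The two sequence operations described above, taking a reverse complement and trimming runs of “N” from the ends of a read, can be sketched in a few lines. The study used an online tool for the reverse complement; the functions below are an illustrative equivalent with hypothetical names, and they handle only A, C, G, T, and N.

```python
# Translation table for complementing A, C, G, T and N (both cases).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def trim_terminal_n(seq):
    """Remove runs of N from both ends of a read; internal Ns are kept."""
    return seq.strip("Nn")


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(trim_terminal_n("NNACGTNN"))
```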


All of these sequence files were uploaded to MegAlign (DNAStar) and aligned using the Clustal W method. The reference sequence data had all been end-trimmed to the same point, so the sequences from this study were subsequently trimmed to that same point and the whole group was re-aligned. The sequences were rearranged to reflect a decreasing sequence identity in comparison to one of the reference quinoa sequences. The phylogenetic tree, generated using FigTree, is available in Appendix 13.

DEVELOPMENT OF DIPLOID MODEL CHENOPOD FOR FUTURE GENETIC STUDY VIA BIDIRECTIONAL CROSSES

Bidirectional crosses between collected specimens of C. ficifolium, one from Portsmouth, NH and one from Quebec City, Quebec, were attempted in the interest of producing a hybrid diploid model system of relevance to tetraploid quinoa. The two “parent” specimens in question were chosen in part because they were collected in NNE, since all of the C. ficifolium germplasm available from the USDA is of Eurasian origin and therefore does not adequately represent the genetic diversity of this species that is available in the New England region. Additionally, these two “parent” specimens were examined via RAPD PCR to ensure that their visual profiles would be different enough so that any putative hybrid progeny could be assessed in this way.

Emasculating flowers of chenopod species, such as quinoa, is notoriously difficult, given the small size of the flowers, the lack of colorful petals, and the gynomonoecious nature of the plants (that is, having both hermaphroditic and female flowers on the same plant), so it is often helpful to perform manual emasculation underneath a dissecting microscope before the flower buds open (Peterson et al. 2015). However, the inflorescence structure of C. ficifolium differs from that of chenopods with higher ploidy levels in that only the terminal flowers appear to be hermaphroditic, in which case emasculation could be as simple as removing all terminal flower buds before the anthers begin to shed pollen.

In this study, seeds from parent C. ficifolium specimens, “Portsmouth” and “QC4”, were grown in the UNH greenhouse. Before the flower buds of each plant opened and the anthers began shedding pollen, the terminal flowers of all seed parent plants were removed using fine-tipped forceps (Figure 3A). The flower buds of the pollen parent plants were allowed to mature normally. The inflorescences of the pollen parents were gently rubbed against those of the seed parents when those plants’ stigmas began to emerge from the closed flower bud. The need for isolating the cross against pollen dissemination by wind and insect activity was not known, so some plants were isolated by loosely wrapping the seed parent’s inflorescence with paper towel and securing it below the inflorescence with a twist-tie (Figure 3B). This wrapping was removed a day after the pollination event. Crosses in both directions between both parents were performed, and DNA from each purported parent was isolated.


Figure 3A (above-left): An emasculated C. ficifolium inflorescence, with all terminal flower buds removed.

Figure 3B (above-right): Pollinated C. ficifolium plants with inflorescences wrapped loosely with paper towel and secured at the base with a twist-tie to isolate them from pollen spread by disruptive wind or insect activity.

Seeds from these crosses were allowed to mature in the greenhouse, collected at the end of the parents’ lifecycle, and dried at room temperature inside the fume hood. Seeds were stored inside paper seed envelopes until planting (< 20 from each cross) in the lab under grow-lights.

DNA from each “F1” seedling was isolated, and RAPD PCR was performed, including each parent, each seedling, and mixed template and mixed product samples. Mixed template and mixed product samples are essentially artificial hybrids which can act as a visual reference in a RAPD PCR gel. Mixed templates were created during the normal RAPD PCR protocol, but instead of only using one type of template DNA, half the normal volume of template DNA from each parent was added to the PCR tube. In contrast, mixed product samples were created by making an extra RAPD PCR sample for each parent, and half of the resulting PCR product in each tube was mixed together and loaded into the gel. In this way, one can see the effects of “hybridizing” template DNA before and after the thermocycler protocol.

Towards the end of the study, PCR amplification of the Flowering Locus T-Like (FTL) gene (Storchova et al. 2015) was substituted for RAPD PCR amplification, since amplification of this locus produced multiple DNA bands when run out on an agarose gel, and the bands were different enough that parental banding patterns could be differentiated. This substitution was also made because locus-specific PCR amplification produces results much faster than RAPD PCR. PCR amplification of the FTL gene was performed using the previously-described thermocycler protocol and the CrFT35For (5’- GGT TGG TGA CTG ATA TTC CAG -3’) and CrFT501Rev (5’- CGC CAC CCT GGT GCA TAC AC -3’) primers (Storchova et al. 2015), LongAmp Taq Polymerase, 5X LongAmp Taq Reaction Buffer, and 10 mM dNTPs. The sizes of the resulting products were examined via gel electrophoresis using a 1% agarose gel and 1kb+ DNA ladder.
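As a quick sanity check on the two FTL primers quoted above, their length and GC content can be tallied directly from the published sequences. The following is only a minimal sketch: the helper names (`clean`, `gc_fraction`) are hypothetical and were not part of the protocol used in this study.

```python
# Primer sequences as quoted in the text (Storchova et al. 2015).
CRFT35_FOR = "GGT TGG TGA CTG ATA TTC CAG"
CRFT501_REV = "CGC CAC CCT GGT GCA TAC AC"


def clean(primer):
    """Strip the spacing used for readability and upper-case the sequence."""
    return primer.replace(" ", "").replace("-", "").upper()


def gc_fraction(primer):
    """Return the fraction of G and C bases in a primer (0.0 to 1.0)."""
    seq = clean(primer)
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, primer in (("CrFT35For", CRFT35_FOR), ("CrFT501Rev", CRFT501_REV)):
    print(f"{name}: {len(clean(primer))} nt, GC = {gc_fraction(primer):.1%}")
```

The forward primer is 21 nt and the reverse primer is 20 nt; the difference in GC content between them is the kind of detail that is worth knowing when troubleshooting annealing conditions.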

Any progeny plant whose RAPD profile or FTL profile matched one of the parental profiles was discarded, and those whose profiles appeared to differ were subjected to continued RAPD testing with different primers. If it was determined that a progeny plant was actually a hybrid, that plant was allowed to grow to maturity so that seeds could be collected.
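The screening logic described above can be sketched in code, under the simplifying assumption that each gel lane is scored as a set of band sizes and that a true hybrid (like the mixed template and mixed product lanes) shows the union of the two parental band sets. The function name and the band sizes in the usage note are hypothetical; in this study, profiles were compared visually.

```python
def classify_progeny(seed_parent, pollen_parent, progeny):
    """Classify a progeny banding profile against two parental profiles.

    Each argument is an iterable of band sizes (in bp) scored from one lane.
    Assumption: a hybrid displays every band present in either parent.
    """
    seed_parent, pollen_parent, progeny = map(
        frozenset, (seed_parent, pollen_parent, progeny)
    )
    if progeny == seed_parent:
        return "matches seed parent"      # likely self-pollination; discard
    if progeny == pollen_parent:
        return "matches pollen parent"
    if progeny == seed_parent | pollen_parent:
        return "putative hybrid"          # same pattern as a mixed lane
    return "unresolved"                   # retest with other primers
```

For example, with hypothetical parental profiles `{500, 900}` and `{500, 1200}`, a seedling scored as `{500, 900, 1200}` would be reported as a putative hybrid, while `{500, 900}` would match the seed parent.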


RESULTS

PLANT COLLECTIONS

In total, 57 plant specimens were collected in the summer of 2016. GPS data, information about collection sites, and the initial morphology-based identification of each specimen are provided in Appendix 11. The collection sites in 2016 were mostly concentrated near the coast, specifically near salt marshes and on sand dunes at or near the tideline, and these sites yielded three taxa: C. album, C. berlandieri var. macrocalycium, and C. strictum. Several specimens of C. album were collected in non-coastal sites, such as in mowed lawns and along sidewalks. A fourth species, C. ficifolium, which was not previously represented in the Hodgdon Herbarium, was found in downtown Portsmouth, NH (initially by Dr. Richard Smith) and along the Saint Lawrence River in Quebec City, Quebec.

The majority of the plant material collected in 2016 was brought back to the UNH greenhouse, where it was planted so that its DNA could be isolated and molecular assays could be performed over time. Some plants were tagged in the field and leaf tissue was collected from them so that they could be analyzed in the lab. At the end of the season, seeds were harvested from the plants in the greenhouse and the tagged plants in the field, either to be submitted to the USDA Germplasm Repository or to provide a source of plant material for future assays. Ten plant specimens from 2016 were pressed and submitted to the UNH Hodgdon Herbarium. In total, two specimens from the Blitum clade, one specimen of C. ficifolium, two specimens of C. strictum, four specimens of C. album, and one specimen of C. berlandieri var. macrocalycium were submitted to the herbarium. Additionally, seeds from two C. ficifolium specimens, two C. berlandieri var. macrocalycium specimens, two C. album specimens, and one C. strictum specimen were submitted to the USDA. Collector codes for the specimens submitted to both the Hodgdon Herbarium and the USDA North Central Plant Introduction Station are listed in Table 5.

Table 5: Collector codes for the specimens submitted to both the UNH Hodgdon Herbarium and the USDA North Central Plant Introduction Station. The unique USDA identifying code is in parentheses following the UNH collector code. GPS and site data are available for these specimens in Appendix 11.

Species                              Collector codes of specimens      Collector codes of specimens
                                     submitted to UNH Hodgdon          submitted to USDA
                                     Herbarium
C. album                             D1, D2, D3, 288-B                 284-E (UNH2016003), 294 (UNH2016006)
C. berlandieri var. macrocalycium    284-I                             279 (UNH2016002), 296-A (UNH2016007)
C. strictum                          284-K, 284-J                      284-A (UNH2016004)
C. ficifolium                        P1                                P1 (UNH2016001), QC6 (UNH2016005)
C. simplex                           302-B                             -
C. foggii                            302-A                             -

In the summer of 2017, two additional species were collected near Shell Pond in Stow, ME (GPS data available in Appendix 11). This site was located at the base of a sheer cliff face rising from the top of a very steep hillside. This site was suggested by Arthur Haines, author of Flora Novae Angliae (2011), who claimed to have seen a population of C. foggii, an elusive chenopod species, at the site in recent years. Another small population of plants resembling Chenopodium was also present at this site. Several plants of each type were both pressed for herbarium submission and also collected for planting at the UNH MacFarlane Greenhouse, and the two species were later identified morphologically as C. foggii and C. simplex by Arthur Haines via video call and in person by Dr. Janet Sullivan. C. simplex is often referenced as being conspecific with the Eurasian C. hybridum (http://plants.usda.gov), which has been found to be a Chenopodiastrum clade species (Fuentes-Bazan et al. 2012b).

FLOW CYTOMETRY

Flow cytometry provided the first evidence for ploidy level and subgenome composition for each plant sample. Initially, all samples were analyzed as single-chop samples, including diploid Fragaria vesca subsp. vesca ‘Hawaii 4’ strawberry, which was used as a reference standard with a 1C value of 260 Mbp (Bassil and Davis et al. 2015). This method of calculating the 1C value of each chenopod sample relative to the 1C value of the standard was not the most reliable, however, because the sample and the standard were not analyzed simultaneously. This was remedied by changing the cytometry method so that all samples were prepared as co-chops with Pisum sativum L. ‘Ctirad’ pea tissue (Dolezel et al. 2007) in a 1:4 weight ratio of pea tissue to chenopod tissue, so that a more accurate comparison could be made. Examples of single-chop and co-chop flow cytometry outputs can be seen in Figure 4.
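The calculation of a sample's 1C value relative to a standard can be illustrated with a short sketch. It assumes the conventional ratio approach, in which genome size scales with the ratio of the mean fluorescence of the sample and standard 2C (G1) peaks; the exact calculation used in this study is not spelled out here, and the function name and peak positions below are hypothetical.

```python
PEA_1C_MBP = 4440.0        # Pisum sativum 'Ctirad', as cited in the text
STRAWBERRY_1C_MBP = 260.0  # Fragaria vesca 'Hawaii 4', as cited in the text


def estimate_1c_mbp(sample_peak, standard_peak, standard_1c_mbp):
    """Estimate a sample's 1C value (Mbp) from 2C peak positions.

    sample_peak and standard_peak are the mean fluorescence values of the
    gated 2C peaks, ideally read from the same co-chop run.
    """
    if sample_peak <= 0 or standard_peak <= 0:
        raise ValueError("peak positions must be positive")
    return standard_1c_mbp * sample_peak / standard_peak


# Hypothetical peak means for a co-chop with pea as the standard.
print(estimate_1c_mbp(sample_peak=85.0, standard_peak=444.0,
                      standard_1c_mbp=PEA_1C_MBP))
```

Because the estimate depends only on the ratio of the two peaks, measuring both in the same run (a co-chop) removes run-to-run instrument drift from the comparison, which is the motivation given above for abandoning single-chop samples.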


Figure 4A: Flow Cytometry Data Output for a Co-Chop: The graphical flow cytometry output for sample QC4 (C. ficifolium collected from Quebec City, Quebec in the summer of 2016) prepared as a co-chop with Pisum sativum L. ‘Ctirad’ pea tissue (1C=4440 Mbp, Dolezel et al. 2007), gating on 2C peaks included. The chenopod 2C and 4C peaks are the leftmost gated peak and the ungated, intermediate peak (left).

Figure 4B: Flow Cytometry Data Output for a Single-Chop reference sample: The graphical flow cytometry data output for the reference sample, Fragaria vesca subsp. vesca ‘Hawaii 4’ strawberry (1C=260 Mbp, Bassil and Davis et al. 2015), gating on 2C peak included.

Figure 4C: Flow Cytometry Data Output for a Single-Chop sample: The graphical output for sample QC4, as seen in Figure 4A, with gating on the 2C peak included.

Initially, the aim was to establish a reference framework for the 1C values for chenopod species by analyzing tissue from plants grown from seeds obtained from the USDA North Central Plant Introduction Station in Ames, Iowa. However, the pattern of C-value variation among USDA accessions was not entirely as expected, based upon previous reports (Table 2). Fitting expectations, the diploid comparators C. standleyanum (~600 Mbp = 1C) and C. ficifolium (~850 Mbp = 1C) had the smallest genome sizes, while the genome sizes of the C. quinoa (PI 510533) and C. berlandieri var. macrocalycium (PI 666279) comparators were in the expected ~1400 Mbp range, and some accessions of hexaploid C. album had the highest 1C values, in the expected 1800 Mbp range. But in contrast to expectations, the comparator accession C. berlandieri var. zschackei (PI 666288), which reportedly shares an AABB subgenome composition with C. quinoa and C. berlandieri var. macrocalycium (Walsh et al. 2015), had markedly lower than expected C values, in the 950-1000 Mbp range. Another USDA accession, C. berlandieri var. bushianum (PI 608030), which should also have an AABB subgenome composition, had greater than expected 1C values in the 1750-1860 Mbp range. Furthermore, some plants within C. album accession PI 666271 had C values in the ~950 Mbp range, about 50% of the value expected for hexaploid C. album. Thus, the suitability of the USDA accessions identified as C. berlandieri var. zschackei (PI 666288), C. berlandieri var. bushianum (PI 608030), and C. album (PI 666271) was brought into question.

The 1C values of locally collected samples segregated into fairly clean categories which, coupled with the rough morphological identification that had been made in the field, made it easy to tentatively identify each collected chenopod sample with a fair amount of confidence. The calculated 1C values for multiple specimens from each species collected, including USDA comparator plants, are shown in Figure 5. Additionally, all flow cytometry data from this study are provided in Appendix 12.


Figure 5: Calculated 1C values for various chenopod specimens, collected in the wild or obtained from USDA germplasm, arranged by ascending value, not by date of assay. Values were calculated using both Fragaria vesca subsp. vesca ‘Hawaii 4’ strawberry (1C=260 Mbp; Bassil and Davis et al. 2015) and Pisum sativum L. ‘Ctirad’ pea (1C=4440 Mbp; Dolezel et al. 2007) as reference standards. Accessions that deviated from expected values are boxed in red.


Chenopodium berlandieri var. macrocalycium is a tetraploid with AABB subgenome composition (Walsh et al. 2015), and the calculated 1C values of the collected samples fell into an expected range (Table 2) of approximately 1280 – 1450 Mbp. Another tetraploid, C. strictum, with a calculated 1C value of 915 – 1000 Mbp, was also collected, and it has a supposed BBCC or CCDD subgenome composition (Jellen and Maughan, personal communication). Chenopodium album, which is a hexaploid species, produced 1C values of approximately 1600 – 1860 Mbp, although one USDA comparator sample (PI 666271) produced 1C values that fell into the ~900 Mbp range, coinciding with the C. strictum range. Two diploid species were collected: C. ficifolium, which had a 1C value of 820 – 890 Mbp, and C. foggii, which had a 1C value of 577 Mbp, most similar to that of C. standleyanum (1C = 580 – 625 Mbp). Chenopodiastrum simplex was also analyzed, even though it is no longer classified in the genus Chenopodium, and it had a 1C value of 1174 Mbp.
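The 1C ranges reported above lend themselves to a simple lookup that returns the taxa whose observed range contains a newly measured value. This is only an illustrative sketch: the ranges are the ones observed in this study rather than species-wide limits, the function name and tolerance parameter are hypothetical, and a match should be treated as a tentative identification to be checked against morphology and molecular assays.

```python
# 1C ranges (Mbp) as reported in this study; single values are stored as (x, x).
OBSERVED_1C_RANGES_MBP = {
    "C. foggii": (577, 577),
    "C. standleyanum": (580, 625),
    "C. ficifolium": (820, 890),
    "C. strictum": (915, 1000),
    "Chenopodiastrum simplex": (1174, 1174),
    "C. berlandieri var. macrocalycium": (1280, 1450),
    "C. album": (1600, 1860),
}


def candidate_taxa(one_c_mbp, tolerance_mbp=0.0):
    """Return taxa whose observed 1C range (widened by tolerance) contains the value."""
    return [
        taxon
        for taxon, (low, high) in OBSERVED_1C_RANGES_MBP.items()
        if low - tolerance_mbp <= one_c_mbp <= high + tolerance_mbp
    ]


print(candidate_taxa(850.0))                     # a value in the C. ficifolium range
print(candidate_taxa(578.0, tolerance_mbp=5.0))  # C. foggii and C. standleyanum overlap
```

The second call shows why flow cytometry alone could not separate C. foggii from C. standleyanum: their values are close enough that a small measurement tolerance returns both.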

RAPD PCR

RAPD PCR was used to assess field-collected specimens. In Figure 6, in which field-collected chenopods were compared to USDA accessions based upon their calculated 1C-values, the collected specimens match most closely to Lane 3, C. berlandieri var. macrocalycium (PI 666279). While the collected specimens in Figure 6 do not appear to be very diverse, Figure 7 shows that there is genetic diversity among the different collection sites, especially between the Rye, NH site and the other sites.

RAPD PCR was also used to qualitatively assess the degree of similarity between a USDA accession of C. standleyanum and the collected chenopod identified as C. foggii (Figure 8). There are some similarities between the two specimens, but the differences are substantial, and these differences persisted across multiple primers, supporting the morphology-based determination that the collected specimen belonged to a taxon distinct from C. standleyanum. Lastly, RAPD PCR was used to assess the genetic diversity of collected C. ficifolium specimens in order to choose parents for the diploid hybrid study. In Figure 9, there is a clear distinction between the “Portsmouth” and “Quebec City” specimens, so these two types were considered distinct enough to use as crossing parents.

Figure 6: Identification of Chenopodium berlandieri to the subspecies level via RAPD PCR using primer BC191 (left).
Lane 1, 1kb+ DNA ladder
Lane 2, C. berlandieri var. zschackei USDA standard (PI 666288)
Lane 3, C. berlandieri var. macrocalycium USDA standard (PI 666279)
Lane 4, C. berlandieri var. bushianum USDA standard (PI 608030)
Lanes 5 – 10, field-collected plants with C-values in the C. berlandieri range, which are the most similar to Lane 3, USDA standard C. berlandieri var. macrocalycium.


Figure 7: Assessing genetic diversity within collected C. berlandieri across geographical locations via RAPD PCR using primer BC190 (left).
Lane 1, 1kb+ DNA ladder
Lane 2, C. quinoa, USDA standard (PI 510533)
Lane 3, C. berlandieri var. macrocalycium USDA standard (PI 666279)
Lane 4, C. berlandieri collected from Fort Foster, ME
Lane 5, C. berlandieri collected from Odiorne State Park, NH
Lanes 6 and 7, C. berlandieri collected from Rye Beach, Rye, NH
Lanes 8 and 9, C. berlandieri collected from Appledore Island, ME

Figure 8: Distinguishing between USDA C. standleyanum (PI 666323) and a collected chenopod identified as C. foggii via RAPD PCR using multiple primers (left).
Lane 1, 1kb+ DNA ladder
Lane 2, C. standleyanum, RAPD Primer BC103
Lane 3, C. foggii, RAPD Primer BC103
Lane 5, C. standleyanum, RAPD Primer BC123
Lane 6, C. foggii, RAPD Primer BC123
Lane 8, C. standleyanum, RAPD Primer BC190
Lane 9, C. foggii, RAPD Primer BC190
Lane 11, C. standleyanum, RAPD Primer B200
Lane 12, C. foggii, RAPD Primer B200


Figure 9: Using RAPD PCR with primer BC106 to identify C. ficifolium parents that are genetically distinct enough for their hybrid progeny to be detectable using PCR-based methods (left). Ultimately, “Portsmouth” and “Quebec City #4” were chosen as crossing parents.
Lane 1, 1kb+ DNA ladder
Lane 2, USDA C. ficifolium (PI 658749)
Lane 3, C. ficifolium “Quebec City #1”
Lane 4, C. ficifolium “Quebec City #4”, chosen as a parent
Lane 5, C. ficifolium “Quebec City #6”
Lane 6, C. ficifolium “Quebec City #7”
Lane 7, C. ficifolium “Quebec City collection site 2”
Lane 8, C. ficifolium “Portsmouth #1”, chosen as a parent
Lane 9, C. ficifolium “Portsmouth #3”

ASSESSMENT OF USDA REFERENCE STANDARDS VIA DNA SEQUENCING

CHLOROPLAST DNA SEQUENCING AND ALIGNMENT

Overall, 15 chenopod specimens were sequenced using trnL-F chloroplast primers, an approach modeled after phylogenetic work performed by Fuentes-Bazan et al. (2012a, b). The taxa used in this portion of the study are listed below in Table 6, which also includes the USDA accession numbers and the collector codes for the wild specimens used. All plant collection data are included in Appendix 11. The phylogenetic tree generated from the alignment of the sequence data is available in Appendix 7.

The C. simplex sequence (302B) acted as an outgroup in this tree, which was expected as C. simplex was shown to belong in the Chenopodiastrum clade (Fuentes-Bazan et al. 2012b; Haines, personal communication). There were two major clades in this phylogenetic tree: one containing allotetraploids C. berlandieri var. macrocalycium and C. quinoa and diploids C. standleyanum and C. foggii; and the other containing two subclades, one of allohexaploid C. album and allotetraploid C. strictum sequences, and one of diploid C. ficifolium and assorted reference sequences. The taxa in the second major clade (C. album, C. strictum, and C. ficifolium) have sequence representation in the SOS1-based B, C, and D clades (Walsh et al. 2015; Jellen, personal communication).

Table 6: Specimens included in the trnL-F chloroplast DNA sequencing portion of this study, including USDA accession numbers and collector codes.

Species                                        Accession number of USDA      Collector code of wild
                                               specimens used                specimens used
Chenopodium album                              PI 666271, Ames 29961         284F, 300
Chenopodium berlandieri var. macrocalycium     PI 666279                     281, 301
Chenopodium berlandieri var. zschackei         PI 666288                     -
Chenopodium ficifolium                         PI 658749                     QC4
Collected chenopod 302A (C. foggii)            -                             302A
Chenopodium quinoa                             PI 510533                     -
Collected chenopod 302B (C. simplex)           -                             302B
Chenopodium strictum                           PI 666324                     284B, 284K


SOS1 DNA SEQUENCING AND ALIGNMENTS

In total, nine plant samples were sequenced at intron 16 of the SOS1 (Salt Overly-Sensitive 1) locus, modeled after the work done by Walsh et al. (2015). The phylogenetic tree generated from the alignment of these sequences is available in Appendix 10. The reference sequences used, available in Appendix 9, were sequences from each clade in the Walsh et al. (2015) phylogenetic tree, and they behaved as expected in the trees generated for this study. The exception to this was that a C. suecicum sequence (KP79900.1) grouped with C. ficifolium (KP799004.1) in a clade sister to Clade B, instead of being in Clade B, as expected (Walsh et al. 2015). However, C. strictum (PI 666324) had sequence representation in the D clade as expected (Jellen, personal communication), and C. standleyanum (PI 658755) had representation in the A clade, which was also expected (Walsh et al. 2015). The previously undocumented chenopod species, C. foggii, had sequence representation in the A clade as well, which was not surprising given that the calculated 1C-value for this specimen was similar to that of C. standleyanum (PI 658755).

There were some deviations from the expected outcome with regard to the sequences generated from USDA comparator plants, however. For instance, C. berlandieri var. bushianum (PI 608030) and C. berlandieri var. macrocalycium (PI 666279) both had sequence representation in both the B and D clades, which is contrary to their expected AABB genome compositions (Walsh et al. 2015). Similarly, C. berlandieri var. zschackei (PI 666288) had sequence representation in both the C and D clades, versus the expected AABB representation (Walsh et al. 2015). Finally, C. album (PI 666271) had sequence representation in the A clade, which is contrary to its expected representation of BBCCDD (Walsh et al. 2015). Two of these sequences, the D-clade C. berlandieri var. macrocalycium (PI 666279) sequence and the C. album (PI 666271) sequence, were provisionally removed due to suspected accidental mislabeling. Restriction sites for the enzymes AluI and HpyCH4IV were found in all sequences generated for this study, and the use of restriction fragments to determine whether there was a mix-up between amplicon clones from C. berlandieri var. macrocalycium (PI 666279) and C. album (PI 666271) is still being investigated.

There was a certain amount of difficulty, as described in Table 7, in getting usable SOS1 sequence data from the sequencing service provider, GeneWiz. It was not uncommon for half of the samples submitted to be returned as failed, either for “Poor quality” of the prepared sample or for an apparent lack of sequencing primer in the prepared sample. The samples were prepared meticulously before shipping, so it is highly unlikely that sequencing primer was simply omitted from the reaction mixture. However, GeneWiz can repeat the sequencing reaction, for a price, which means that only an aliquot of the overall sample mixture is used for the initial sequencing reaction. The most likely explanation is therefore that the sample tubes were not vortexed before sequencing, which would give a “No priming” result, as the components of the reaction mixture tend to stratify when frozen. After this was recognized, a special request of “Please vortex or mix tubes before sequencing” was submitted with each batch of samples, but the problem of poor results continued.


Table 7: All species sequenced using SOS1 (Salt Overly-Sensitive 1) intron 16 primers, including the number of clones successfully sequenced vs. the total number of clones, and each species’ subgenome composition inferred from its placement in the final phylogeny.

Species                                                   Number of clones      Expected subgenome      Subgenome composition
                                                          sequenced / total     composition             based on placement
                                                          number of clones      (Walsh et al. 2015)     in phylogeny
USDA C. berlandieri var. zschackei (PI 666288)            6/8                   AABB                    CCDD
USDA C. berlandieri var. macrocalycium (PI 666279)        4/8                   AABB                    BBDD
USDA C. berlandieri var. bushianum (PI 608030)            2/8                   AABB                    BBDD
USDA C. strictum (PI 666324)                              2/8                   BBCC or CCDD            _DD
USDA C. quinoa (PI 510533)                                0/8                   AABB                    -
USDA C. album (Ames 29961)                                0/16                  BBCCDD                  -
USDA C. album (PI 666271)                                 1/16                  BBCCDD                  AA__
USDA C. standleyanum (PI 666323)                          Diploid species       AA                      AA
Unknown chenopod 302A (C. foggii)                         Diploid species       unknown                 AA

DIPLOID SPECIES CROSSES

As outlined previously, crosses in both directions were performed using two field-collected C. ficifolium specimens, “Portsmouth” and “Quebec City #4”. DNA from the parental plants of each cross and DNA from each putative hybrid plant grown from the resulting seed were amplified via PCR using FTL primers and compared. The resulting PCR products were examined using gel electrophoresis on 1% agarose, and mixed template and mixed product lanes were included to simulate hybrid PCR products. Of the 20 putative hybrid plants assessed from 5 crosses, 4 “hybrid” plants showed a combination of the parental product bands and also matched the mixed product and mixed template lanes (Figures 10 – 12). The details of this assessment are available in Table 8.

Figure 10: Assessing putative hybrid progeny of crosses between C. ficifolium plants using FTL primers (left). “Hybrids” A and E appear to be true hybrids.
Lane 1, 1kb+ DNA ladder
Lane 2, C. ficifolium “Portsmouth” seed parent
Lane 3, C. ficifolium “Quebec City #4” pollen parent
Lane 4, “Hybrid A”, appears to be a hybrid
Lane 5, “Hybrid B”
Lane 6, “Hybrid C”
Lane 7, “Hybrid D”
Lane 8, “Hybrid E”, appears to be a hybrid
Lane 9, “Hybrid F”
Lane 10, Mixed template
Lane 11, Mixed PCR product


Figure 11: Assessing putative hybrid progeny of crosses between C. ficifolium plants using FTL primers (left). “Hybrid” A appears to be a true hybrid.
Lane 1, 1kb+ DNA ladder
Lane 2, C. ficifolium “Quebec City #4” seed parent
Lane 3, C. ficifolium “Portsmouth” pollen parent
Lane 4, “Hybrid A”, appears to be a hybrid
Lane 5, Mixed template
Lane 6, Mixed product

Figure 12: Assessing putative hybrid progeny of crosses between C. ficifolium plants using FTL primers (left). “Hybrid” A appears to be a true hybrid.
Lane 1, 1kb+ DNA ladder
Lane 2, C. ficifolium “Quebec City #4” seed parent
Lane 3, C. ficifolium “Portsmouth” pollen parent
Lane 4, “Hybrid A”, appears to be a hybrid
Lane 5, Mixed template
Lane 6, Mixed product


Table 8: The numbers of true hybrids in a small sample of F1 seed found via PCR amplification using FTL primers.

Cross                                   # Seedlings from      # Hybrids found      % Hybrids in
                                        sample examined       in sample            total sample
“Quebec City #4” x “Portsmouth”         1                     1                    100%
“Quebec City #4” x “Portsmouth”         1                     1                    100%
“Quebec City #4” x “Portsmouth”         6                     0                    0%
“Portsmouth” x “Quebec City #4”         6                     2                    33%
“Portsmouth” x “Quebec City #4”         6                     0                    0%
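The per-cross percentages in Table 8 and the pooled hybrid rate across all five crosses can be reproduced with a few lines of arithmetic. This is a minimal sketch using the counts from Table 8; the variable and function names are illustrative only.

```python
# (cross, seedlings examined, hybrids found), taken from Table 8.
CROSSES = [
    ("Quebec City #4 x Portsmouth", 1, 1),
    ("Quebec City #4 x Portsmouth", 1, 1),
    ("Quebec City #4 x Portsmouth", 6, 0),
    ("Portsmouth x Quebec City #4", 6, 2),
    ("Portsmouth x Quebec City #4", 6, 0),
]


def hybrid_percent(seedlings, hybrids):
    """Percentage of examined seedlings scored as hybrids."""
    return 100.0 * hybrids / seedlings


for cross, seedlings, hybrids in CROSSES:
    print(f"{cross}: {hybrids}/{seedlings} = {hybrid_percent(seedlings, hybrids):.0f}%")

total_seedlings = sum(seedlings for _, seedlings, _ in CROSSES)
total_hybrids = sum(hybrids for _, _, hybrids in CROSSES)
pooled_percent = hybrid_percent(total_seedlings, total_hybrids)
print(f"Pooled: {total_hybrids}/{total_seedlings} = {pooled_percent:.0f}%")
```

Pooling gives 4 hybrids out of 20 seedlings. With samples this small, the per-cross percentages (especially the two 100% values from single seedlings) should be read as counts rather than as estimates of crossing efficiency.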

DISCUSSION

Taxonomy

The consideration of multiple forms of evidence suggests that eleven currently recognized chenopod taxa are native or naturalized to the Northern New England region, based upon cross-referencing herbaria records (NEHerbaria.org), botanical keys (Haines 2011), and modern phylogenetic studies (Fuentes-Bazan et al. 2012a, b; Walsh et al. 2015). Of these eleven taxa, I collected representatives of five in 2016 and 2017: C. album, C. berlandieri var. macrocalycium, C. strictum, C. ficifolium, and C. foggii. Of the remaining six, three are considered “rare introductions”: C. foliosum, C. pratericola, and C. leptophyllum (Haines, personal communication). If these three taxa truly are waifs and not native or naturalized in this region, perhaps they should be removed from the previously described list of NNE chenopod species.


C. berlandieri var. bushianum and C. berlandieri var. zschackei present a different problem, because their USDA comparator accessions did not match genome expectations, so we cannot rule out the possibility that either or both of these C. berlandieri varieties exist in NNE but have not been distinguished from var. macrocalycium. Finally, the genomic resemblance of C. standleyanum to C. foggii suggests that C. standleyanum may be limited to narrowly-defined ecological conditions, as C. foggii appears to be, and that it will only be found when we know where to look, as was the case for C. foggii.

As previously outlined, a great deal of taxonomic confusion surrounds this genus, and mistakes are particularly common when identifications are made on the basis of morphological characteristics alone. Just as Cole (1962) described how more than half of the C. berlandieri var. zschackei specimens in the Kew Herbarium were misidentified as C. album, it was found that three of the nine USDA comparator samples used in this study did not conform to their expected genomic and molecular characteristics, and therefore may have been originally misidentified.

According to David Brenner, Chenopodium curator of the USDA North Central Plant Introduction Station in Ames, Iowa, Chenopodium specimens submitted to the collection have usually been identified by their collectors based on morphology, not on molecular or genetic studies (Brenner, personal communication). Given the apparent incidence of these putative mistakes, molecular and genetic identification methods are very important, especially considering that this germplasm collection is used by researchers all over the country.


Usefulness of molecular identification tests

Flow Cytometry

Flow cytometry was found to be an informative and convenient method for obtaining genome sizes, which can allow researchers to infer taxon identity, as well as subgenome composition and ploidy level. The inference of subgenome composition and ploidy is possible because the mean position of the G1 sample peak of a plant with an AA subgenome composition is smaller than that of one with a BB composition; however, the BB plant is similar to a CCDD plant, as the C and D subgenomes are fairly small. Since the subgenome sizes are additive, an AABB plant will have a distinctly larger 1C value than a plant with a CCDD subgenome composition (Jellen and Maughan, personal communication), and a BBCCDD hexaploid, like C. album, will have the largest 1C value of all (Jellen and Maughan, personal communication). Genome sizes are purportedly very stable within a species (Vrit et al. 2016), which was generally what was observed during the course of this study.
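The additivity argument can be checked roughly against the approximate 1C values reported in the Results. The sketch below rests on several assumptions that should be stated plainly: the A subgenome is approximated by diploid C. standleyanum (~600 Mbp), the B subgenome by diploid C. ficifolium (~850 Mbp), and the combined C and D subgenomes by tetraploid C. strictum (~950 Mbp), which is only valid if C. strictum is CCDD rather than BBCC. It also ignores any genome-size change following polyploidization. The names used are hypothetical.

```python
# Approximate per-subgenome 1C contributions (Mbp), inferred from this study's
# flow cytometry values under the assumptions described above.
SUBGENOME_1C_MBP = {
    "A": 600.0,    # from diploid C. standleyanum (AA)
    "B": 850.0,    # from diploid C. ficifolium (BB)
    "C+D": 950.0,  # from tetraploid C. strictum, ASSUMING it is CCDD
}


def expected_1c_mbp(subgenomes):
    """Sum the approximate 1C contributions of the listed subgenomes."""
    return sum(SUBGENOME_1C_MBP[name] for name in subgenomes)


print("AABB  :", expected_1c_mbp(["A", "B"]))      # compare with ~1400 Mbp observed
print("BBCCDD:", expected_1c_mbp(["B", "C+D"]))    # compare with ~1800 Mbp observed
```

Under these assumptions the sums land close to the ranges observed for C. berlandieri var. macrocalycium and C. album, which is consistent with the ordering described above, although it does not by itself confirm any subgenome assignment.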

Flow cytometry is quick, easy, and relatively inexpensive as long as there are no machine malfunctions and an appropriate standard is available. Flow cytometric determination of C value contributed importantly to the finding that the USDA accessions identified as C. berlandieri var. zschackei (PI 666288), C. berlandieri var. bushianum (PI 608030), and C. album (PI 666271) may have been incorrectly identified in the USDA germplasm collection. This finding points to the potential value of a broader flow cytometry study of the USDA Chenopodium germplasm collection.


RAPD PCR

In this study, RAPD PCR was used to help confirm the identity of a collected specimen by comparing it to the reference comparator that most closely matched the specimen’s putative identity based on morphology and flow cytometric data. This is not to say that the identity of the reference comparator was guaranteed to be correct, but this method successfully provided a way to differentiate between species and also within species. For instance, three subspecies of C. berlandieri are reportedly native or naturalized to the New England region. The visual RAPD profiles of the USDA accessions intended to represent these subspecies were different enough that they could be easily distinguished from one another, and thus might have been useful in identifying the subspecies of the collected C. berlandieri specimens (Figure 6). However, as noted in the previous section on the results of cytometric analysis, the C. berlandieri var. zschackei (PI 666288) and C. berlandieri var. bushianum (PI 608030) cytometric results were not in the range expected for AABB tetraploids, and thus all subsequent molecular analysis of these accessions must be interpreted with caution. More importantly, the RAPD PCR profiles of collected plants identified as C. berlandieri var. macrocalycium most closely matched the C. berlandieri var. macrocalycium USDA comparator. RAPD PCR also allowed me to determine that the collected C. berlandieri var. macrocalycium specimens from the summer of 2016 were genetically diverse, so that the specimens submitted to the UNH Hodgdon Herbarium were representative of the diversity of this region for that species (Figure 7). Additionally, RAPD PCR was used to demonstrate that an unknown species, later identified as C. foggii, was distinct from a known species (C. standleyanum, PI 666323), even though their flow cytometry values were similar. The C. foggii versus C. standleyanum RAPD profile comparison is shown in Figure 8.

Lastly, RAPD PCR was used to assess collected C. ficifolium specimens to determine if their visual profiles were distinct enough to support their selection as crossing parents in a genetic study, such that any putative hybrid progeny could be confirmed using PCR-based methods (Figure 9).

RAPD PCR was found to be an informative tool for tentative species identification, so long as a comparator was included that was the same species as the specimen of interest. In this study, RAPD PCR was used in identifying field-collected specimens; identifying C. berlandieri specimens to the variety level; assessing within-species diversity among specimens collected at various geographical locations; and choosing diploid plants to be parents for making crosses. RAPD PCR was found to be ineffective when assessing putative diploid hybrid progeny, and so was replaced by PCR amplification using FTL primers. RAPD PCR is inexpensive to run and provides clear results, but it is costly in terms of the time needed to run the assay.

DNA sequencing

Each molecular identification method examined in this study was incomplete in the sense that it was not able to absolutely identify an unknown chenopod on its own, and all methods relied heavily on the use of reference standards for comparison. Especially when studying a group that has been so notoriously subject to misidentification, this raises the question: How can it be known if a reference standard itself has been identified correctly? A correctly-identified comparator should provide results that are consistent with its ploidy level and subgenome composition in each step of the identification pipeline. A reference standard’s DNA should also match reference sequence data from the NCBI database, if available. As discussed previously, this study sought to examine the usefulness of DNA sequencing for confirming the identity of both reference comparator species and unknown chenopod species.

There are several disadvantages to relying on DNA sequencing to identify chenopods. One major disadvantage is that the existence of reference sequence data for a given species is not guaranteed, especially for weedy chenopod species, which are rarely studied. Additionally, few large phylogenetic studies on the Chenopodium genus have been undertaken, meaning that even if there is sequence data for these species, little is understood about their within-genus taxonomic relationships. Perhaps the largest disadvantage to using DNA sequences is that the PCR-amplified, subgenome-specific gene sequences of allopolyploid species must be cloned in order to get usable sequence data, which requires expensive cloning kits and at least 5 days to complete the cloning and bacterial culturing steps that prepare the samples for sequencing.

cpDNA sequencing

The sequence groupings in the cpDNA tree indicate that the allotetraploid species contained in the first major clade (C. berlandieri and quinoa) have inherited their chloroplast genomes from the same ancestral, diploid source. Additionally, these species all share an A-clade SOS1 subgenome (Walsh et al. 2015), so their cpDNA genomes could also be derived from that A-clade diploid ancestor. The groupings in the second clade indicate that C. album and C. strictum cpDNA genomes are both derived from either a C-clade or a D-clade ancestor. The presence of C. ficifolium in the other subclade of that major clade indicates that its cpDNA genome is derived from a B-clade ancestor. One reference C. album sequence grouped with the C. ficifolium subclade, indicating either that its cpDNA genome is derived from a B-clade ancestor, which would set it apart from the other C. album accessions tested (Walsh et al. 2015), or that this plant was misidentified.

Based on the Sequence Distances matrix (MegAlign, Appendix 11), the sequences in the first major clade (quinoa, C. berlandieri var. macrocalycium, etc.) differ from one another by approximately 2 nucleotides out of 974, which is not enough to confidently distinguish between these species. Furthermore, the sequences in the C. album/C. strictum subclade are 100% identical to each other, based on the Sequence Distances matrix, so there is no way to distinguish between these species using the cpDNA amplicon. However, the C. ficifolium sequences differ from the C. album/C. strictum sequences by approximately 9/974 nucleotides, and from the quinoa-clade sequences by ≈18/974 nucleotides. Finally, the C. album/C. strictum sequences differ by ≈16/974 nucleotides from the quinoa-clade sequences. This means that cpDNA sequencing using the trnL-F primers is not a useful way to distinguish among the “A-clade” cpDNA species (quinoa, C. berlandieri var. macrocalycium, C. standleyanum, and C. foggii) or between the “CD-clade” cpDNA species (C. album, C. strictum). However, cpDNA sequencing is a useful method for distinguishing the “B-clade” and “CD-clade” cpDNA species from the “A-clade” cpDNA species, which is valuable given that NNE chenopods are often misidentified as C. album.
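The approximate difference counts quoted above can be converted to percent identity over the 974-nucleotide amplicon. The short Python sketch below is illustrative only: the group labels and the function name are assumptions introduced here, and the difference counts are the rounded values from the text rather than the full MegAlign matrix.

```python
# Aligned trnL-F amplicon length reported in the text.
AMPLICON_LENGTH = 974

# Approximate pairwise nucleotide differences quoted in the text.
# The group labels are shorthand introduced for this sketch.
APPROX_DIFFERENCES = {
    ("A-type", "A-type"): 2,
    ("C/D-type", "C/D-type"): 0,
    ("B-type", "C/D-type"): 9,
    ("B-type", "A-type"): 18,
    ("C/D-type", "A-type"): 16,
}


def percent_identity(n_differences, length=AMPLICON_LENGTH):
    """Percent of aligned positions that are identical between two sequences."""
    return 100.0 * (length - n_differences) / length


if __name__ == "__main__":
    for (group_1, group_2), n_diff in APPROX_DIFFERENCES.items():
        identity = percent_identity(n_diff)
        print(f"{group_1} vs {group_2}: {n_diff}/{AMPLICON_LENGTH} differences, "
              f"{identity:.2f}% identity")
```

Seen this way, even the most divergent comparison (about 18 differences) remains above 98% identity, which is consistent with the amplicon separating the major cpDNA types but not the species within them.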

SOS1 DNA Sequencing

Of all of the molecular methods used to identify chenopods in this study, DNA sequencing using the SOS1 primers was the most expensive, most time-consuming, and yielded the smallest amount of data. The Sequence Distance Matrix (MegAlign, Appendix 14) demonstrates a clear distinction in percent sequence identity between the A-, B-, and C-clades, but the distinction between the C- and D-clades was not as definite. However, several restriction sites were located within the sequence data, at which the A-clade sequence might be cut and not the others, and so on. This indicates that perhaps a more useful and inexpensive procedure might be to amplify the DNA, cloning if needed, cut it using restriction enzymes, and then view the DNA fragments on an agarose gel. Although the lengthy and fastidious cloning step could still be necessary if distinctive restriction fragments cannot be obtained from the polyploid sequences, this method would be significantly faster, but it might not provide detailed distinctions between B-, C-, and D-clade species. More work in this area is needed.
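The proposed restriction-digest approach can be previewed in silico before any bench work. The following Python sketch is a minimal illustration under stated assumptions: the two sequences are invented toy sequences, not SOS1 data from this study, and EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example, since no specific enzymes are named here.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths from cutting a linear sequence at every
    occurrence of a recognition site.

    cut_offset is the number of bases into the site at which the top
    strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    seq = sequence.upper()
    cut_positions = []
    start = 0
    while True:
        index = seq.find(site, start)
        if index == -1:
            break
        cut_positions.append(index + cut_offset)
        start = index + 1
    bounds = [0] + cut_positions + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical toy amplicons: one carries an EcoRI site, the other has a
# single-base change (GAGTTC) that abolishes the site.
toy_with_site = "ACGT" * 5 + "GAATTC" + "TTGCA" * 4
toy_without_site = "ACGT" * 5 + "GAGTTC" + "TTGCA" * 4

if __name__ == "__main__":
    print("with site:   ", digest(toy_with_site, "GAATTC", 1))
    print("without site:", digest(toy_without_site, "GAATTC", 1))
```

A sequence that carries the site yields two fragments and one that lacks it stays intact, which is the kind of banding difference that would be read from an agarose gel.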

Chenopodium berlandieri var. macrocalycium

C. berlandieri var. macrocalycium was the chenopod that was collected most often in the summers of 2016 and 2017, as it was found at four distinct collection sites. Using a USDA comparator (PI 666279) and morphological characteristics, flow cytometry, RAPD PCR, and cpDNA sequencing, it was concluded that these collected specimens did indeed belong to C. berlandieri var. macrocalycium. RAPD PCR, especially using RAPD primer BC190, demonstrated that genetic variation exists among the four collection sites in NNE.

As discussed previously, the chloroplast DNA sequence data grouped all specimens thought to be C. berlandieri var. macrocalycium, including accession PI 666279, into a major clade along with species such as quinoa and C. standleyanum. This indicates that all of these species’ cpDNA genomes were derived from the same “A-type” source, denoted due to their shared A-clade subgenome representation (Walsh et al. 2015). However, the USDA C. berlandieri var. macrocalycium comparator used in this study was removed from the SOS1 alignment because it is suspected that it was accidentally mixed up during the cloning process with accession PI 666271. Seeds from collected plants that were identified via molecular comparison to this USDA standard were submitted to the USDA National Plant Germplasm System. Additionally, pressed specimens of this type that represented the scope of the collection sites within this study were submitted to the UNH Hodgdon Herbarium.


Chenopodium berlandieri var. bushianum

No specimens collected during this study were identified as C. berlandieri var. bushianum. However, the identity of the USDA accession (PI 608030) used as the C. berlandieri var. bushianum comparator was brought into question. The calculated 1C values from flow cytometry for this USDA comparator were distinct from quinoa, another AABB species (≈1800 Mbp for C. berlandieri var. bushianum versus ≈1330 Mbp for quinoa), and the RAPD PCR profile was distinct from all field-collected tetraploid AABB specimens. Accession PI 608030 was not included in the chloroplast DNA sequencing study. The SOS1 sequence data from this USDA comparator has placed it in the B and D-clades, which does not match the AABB composition expected of the reference C. berlandieri species (Walsh et al. 2015).
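The flow cytometry comparison above amounts to asking whether a comparator's 1C value falls close to that of a reference with the same expected subgenome composition. The Python sketch below illustrates that check with the approximate values quoted in the text. The 10% tolerance is a hypothetical threshold chosen only for illustration and is not a criterion used in this study.

```python
def relative_difference(value_mbp, reference_mbp):
    """Absolute difference between two 1C values as a fraction of the reference."""
    return abs(value_mbp - reference_mbp) / reference_mbp


def consistent_with_reference(value_mbp, reference_mbp, tolerance=0.10):
    """True if a 1C value lies within the tolerance of the reference value.

    The default tolerance (10%) is an assumed, illustrative threshold.
    """
    return relative_difference(value_mbp, reference_mbp) <= tolerance


if __name__ == "__main__":
    quinoa_1c = 1330.0      # approximate 1C value for quinoa (Mbp), from the text
    pi_608030_1c = 1800.0   # approximate 1C value for PI 608030 (Mbp), from the text
    diff = relative_difference(pi_608030_1c, quinoa_1c)
    print(f"Relative difference: {diff:.1%}")
    print("Consistent with AABB reference:",
          consistent_with_reference(pi_608030_1c, quinoa_1c))
```

With these values the comparator differs from quinoa by roughly 35%, far outside any plausible tolerance for two species expected to share an AABB composition.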

This particular USDA accession (PI 608030), collected in 1992 by Robert Myers from the University of Missouri, was not used in the original Walsh et al. (2015) study, but it was included in the original Fuentes-Bazan et al. (2012a, b) phylogenetic studies as a C. album specimen. The Fuentes-Bazan et al. (2012a) study examined all subgenera from Chenopodium as well as other genera in subfamily Chenopodioideae, so the inclusion of this particular accession should not have had any effect on the resulting phylogeny, as it still belongs to the genus Chenopodium. However, due to the uncertainties surrounding its identity, it is recommended that this particular accession not be used as a reference standard for C. berlandieri var. bushianum in future studies. This is unfortunate because there are currently no other C. berlandieri var. bushianum accessions in the NPGS collection.


Chenopodium berlandieri var. zschackei

No specimens collected during this study were identified as C. berlandieri var. zschackei. However, the identity of accession PI 666288 used as a comparator in this study was brought into question. The observed morphology of this comparator was nearly identical to that of the USDA comparator for C. strictum when the two were grown under the same conditions. Additionally, the calculated flow cytometry values for PI 666288 were very similar to those of C. strictum specimens, and were very different from those of quinoa. All chenopods of AABB subgenome composition should have similar 1C values.

The RAPD PCR profile of PI 666288 was also distinct from our collected AABB specimens, but was visually similar to the profile of C. strictum specimens. The chloroplast DNA sequence data grouped this accession into a subclade with C. album and C. strictum sequences, unlike all macrocalycium specimens, indicating that these species have a shared “C-type” or “D-type” cpDNA genome. Finally, the SOS1 sequence data placed PI 666288 in the C and D clades, which is contrary to the Walsh et al. (2015) zschackei sequences, which were of AB composition. Additionally, this USDA accession (PI 666288) was collected on the same day and in the same place by Dr. Rick Jellen from Brigham Young University as the C. strictum accession used (PI 666324), which indicates that it is probable that a specimen mix-up occurred on the part of the collector or at the recipient germplasm repository. All told, this evidence strongly suggests that PI 666288 was misidentified upon collection or subsequently, and that it is truly a C. strictum specimen instead. Researchers should use caution when using this accession in studies. Fortunately, there are other zschackei accessions in the NPGS collection.


Chenopodium strictum

A USDA comparator (PI 666324) was used in this study, and one collection site (Rye Beach) of C. strictum was found in the summer of 2016. The flow cytometry values and RAPD PCR profile of these specimens were distinct from other chenopods, but as previously discussed, were very similar to the USDA comparator for accession PI 666288 (misidentified as C. berlandieri var. zschackei). Jellen and Maughan (personal communication, 2016) had suggested that C. strictum was of either CCDD or CCBB subgenome composition, and this was consistent with the SOS1 sequence data from the USDA comparator for C. strictum, as the only acquired SOS1 sequences were placed in the D-clade. The chloroplast DNA sequence data from the USDA comparator accession (PI 666288) and field-collected specimens were grouped in a subclade along with C. album sequences, indicating that these species share a “C-type” or “D-type” cpDNA genome. C. strictum was not featured in the original Walsh et al. (2015) phylogeny. Seeds from collected plants that matched this USDA comparator were submitted to the USDA North Central Plant Introduction Station in Ames, Iowa, and pressed specimens of this type were also submitted to the UNH Hodgdon Herbarium.

Chenopodium album

Two USDA comparators (PI 666271 and Ames 29961) were used for C. album in this study, and 21 specimens were collected all over the NNE region in 2016 and 2017, as it is a common species. The first USDA comparator, PI 666271, had flow cytometry and RAPD PCR data that was distinct from other chenopods but was very similar to C. strictum specimens. The chloroplast DNA sequence grouped this accession with other C. album and C. strictum specimens, indicating that they share a “C-type” or “D-type” cpDNA genome. The SOS1 sequence data from this USDA comparator shows that part of its subgenome composition is A-clade, which does not coincide with Walsh et al. (2015)’s C. album sequence data, which features sequences in the B, C, and D-clades, or with that reported for C. strictum (BBCC or CCDD). This evidence suggests that this C. album USDA comparator was also misidentified, a hypothesis that warrants investigation. Meanwhile, any researcher who uses this accession in future studies should be aware of these anomalies.

By contrast, the other USDA comparator (Ames 29961) produced flow cytometry and RAPD PCR results that were consistent with a hexaploid chenopod. Since the only hexaploid chenopod in the NNE region is C. album, this specimen was not sequenced using the SOS1 primers. However, its chloroplast DNA sequence data grouped this accession near a mixture of C. album and C. strictum sequences. Flow cytometry results for C. album specimens are distinct and appear to be stable, but there is a great deal of within-species variation in C. album, and it was difficult to procure a consistent visual RAPD profile for this species. RAPD PCR was therefore mainly used as an exclusionary test, essentially to ensure that a specimen did not match any other collected plants. Seeds from collected plants identified as C. album were submitted to the USDA North Central Plant Introduction Station in Ames, Iowa, and some representative specimens were also submitted to the UNH Hodgdon Herbarium, especially specimens collected from the UNH campus.

Chenopodium ficifolium

A USDA comparator (PI 658749) was used in this study, and there were two collection sites for C. ficifolium specimens in 2016 and 2017. This species has a distinct fig-type leaf shape, which was present in both the USDA comparator and in the collected specimens. The flow cytometry results and the RAPD PCR profiles for specimens of this type were distinct from other chenopods. Due to the distinctive leaf shape and the consistent molecular test results, the USDA C. ficifolium comparator was not sequenced in the SOS1 sequence study; however, the reference C. ficifolium samples in the Walsh et al. (2015) phylogeny are in a sister clade to the B-clade, which made them indispensable for anchoring the combined tree with both the Walsh et al. (2015) sequences and the sequences generated in this study. Additionally, cpDNA sequence data produced a C. ficifolium subclade, which indicates that these plants share a common “B-type” cpDNA genome.

Chenopodium standleyanum

No specimens identified as C. standleyanum were collected during this study, but a USDA comparator (PI 666323) was used. This comparator seed stock had a very low germination rate and a calculated flow cytometry value that was similar to a collected sample later identified as C. foggii, but was distinct from other chenopods. The RAPD PCR profiles for C. standleyanum showed some similarities to that of C. foggii, but ultimately the two profiles were decidedly distinct. Accession PI 666323’s chloroplast sequence data grouped it together with quinoa and C. berlandieri var. macrocalycium sequences, indicating a shared “A-type” cpDNA genome. This USDA comparator’s SOS1 sequence data coincided with the C. standleyanum specimen included in the Walsh et al. (2015) phylogeny, which was of diploid A-clade composition.

Chenopodium foggii

As discussed previously, C. foggii is an elusive species that is documented solely in the herbarium record. As it has not been included in any modern molecular or genetic study, there is no USDA germplasm available and no published sequence data for C. foggii. However, there was one collection site (Shell Pond) in 2017 at which an unknown chenopod matching this species’ morphological description was collected. The morphology of the collected specimens on-site was very similar to that of a young C. album or C. berlandieri var. macrocalycium, or one grown under poor conditions, but unlike those polyploid species, the morphology of this C. foggii did not change under greenhouse conditions. Dr. Janet R. Sullivan confirmed that these collected specimens were morphologically distinct from C. standleyanum (PI 666323) due to the presence of keeled sepals, as described in Flora Novae Angliae (2011, pg. 322).

Additionally, the leaf shape of this C. foggii was very distinct from that of C. standleyanum, but the flow cytometry data from the two specimens was fairly similar (1C = 577 Mbp for C. foggii vs. 1C = 580 – 625 Mbp for C. standleyanum). Similarly, there were a few parallels in the two specimens’ RAPD PCR profiles, but there were too many differences for the two to be considered the same. The chloroplast DNA sequencing data showed that C. foggii grouped together with quinoa, C. standleyanum, and C. berlandieri var. macrocalycium, indicating a shared “A-type” cpDNA genome. Finally, the SOS1 sequence data for C. foggii placed it in the A-clade along with other diploid species, such as C. standleyanum and C. fremontii. In the end, it is possible that C. foggii was rediscovered, but all that can safely be said at this moment is that this unknown chenopod is an A-clade diploid that matches the morphological and habitat description of C. foggii and does not match any other chenopods that were utilized in this study.


Diploid Hybrid Assessment

As discussed previously, linkage maps are critical genetic resources for breeders looking to perform marker-assisted breeding, and as of yet, no such map or set of genetic markers for weedy Chenopodium species exists. Linkage maps of diploid chenopods ancestral to quinoa would be valuable resources for geneticists looking to identify genes of relevance to quinoa improvement, for purposes such as disease resistance, heat tolerance, etc. The first step to developing such a linkage map is to successfully produce an F1 population of a weedy species, to then isolate the F1 plants to ensure that they self-pollinate, and then examine the F2 progeny for segregating traits.

In total, four hybrid plants were identified via PCR amplification using FTL primers. While the emasculation and pollination methods used in this study are perhaps crude and could be refined, the presence of any hybrid progeny is promising. Although these particular F1 plants have already set seeds in the UNH Greenhouse, the crosses from which these hybrid plants were found are the best place to continue looking for other hybrid progeny, which can be isolated and forced to self-pollinate in order to produce an F2 population. Additionally, a second round of C. ficifolium bidirectional crosses, using the same “Portsmouth” and “Quebec City #4” parental types, was performed several months later, and none of the progeny from those crosses have yet been assessed. Once a linkage map is developed from these hybrids, it can be used to tell whether the diploid, weedy species have any downy mildew disease resistance genes, or other genes of interest, which could potentially be bred into quinoa. Having a linkage map of a diploid species will also make it easier to develop linkage maps of species with higher ploidy levels. In the future, PCR-based assessment using FTL primers is recommended over using a RAPD-based method, as it is much faster, uses fewer reagents, and gives easy-to-interpret visual profiles.

CONCLUSION

The germplasm identification pipeline detailed in this study, which incorporates morphology, flow cytometry, RAPD PCR, and DNA sequencing, has been shown to be an effective method for identifying wild chenopods, as well as for illuminating instances of misidentification. As a case in point, the USDA comparator species used in this study were examined, and three out of nine of them were found to be incorrectly identified. These accessions (PI 608030, PI 666271, and PI 666288), are not recommended for use as comparator species or species representatives in future studies involving Chenopodium, and the identity of accession PI 666279 requires further confirmation.

Several diploid hybrids were made during this study, using collected C. ficifolium specimens, which is the important first step toward developing a linkage map and reference genome for weedy chenopod species. In the future, more diploid and AABB crosses should be attempted in order to develop a for linkage mapping purposes, and special effort to identify downy mildew resistance genes should be made. Chenopodium foggii was documented in NNE, which is an important result, as this rare species has not been previously included in any modern phylogenetic studies and is not part of the USDA NPGS germplasm collection. Additionally, five chenopod species were collected but three chenopods (C. berlandieri var. zschackei, C. berlandieri var. bushianum, and C. standleyanum) were not located during the summers of 2016 and 2017, but should be sought out, and their putative existence in NNE confirmed or contested. If confirmed, documentary samples should be submitted to the USDA germplasm collection and the UNH Hodgdon Herbarium.


REFERENCES

Aellen, P. 1929. Beitrag zur Systematik der Chenopodium-Arten Amerikas, vorwiegend auf Grund der Sammlung des United States National Museum in Washington, D.C. I. Rep Spec Nov Regn Veget 26(1): 31 – 64.

Aziz, M.A., Adnan, M., Khan, A.H., Rehman, A.U., Jan, R., and Khan, J. 2016. Ethno-medicinal survey of important plants practiced by indigenous community at Ladha subdivision, South Waziristan agency, Pakistan. J Ethnobiol Ethnomed 12(53): 1 – 12.

Bassett, I. J. and Crompton, C. W. 1978. The Biology of Canadian Weeds: Chenopodium album L. Can. J. Plant Sci. 58: 1061 – 1072.

Bassil, N.V., Davis, T.M., Zhang, H., Ficklin, S., Mittmann, M., Webster, T., Mahoney, L., Wood, D., Alperin, E.S., Rosyara, U.R., Putten, H.K., Monfort, A., Sargent, D.J., Amaya, I., Denoyes, B., Bianco, L., van Dijk, T., Pirani, A., Iezzoni, A., Main, D., Peace, C., Yang, Y., Whitaker, V., Verma, S., Bellon, L., Brew, F., Herrara, R., and van de Weg, E. 2015. Development and preliminary evaluation of a 90 K Axiom SNP array for the allo-octoploid cultivated strawberry Fragaria x ananassa. BMC Genomics 16:155. https://doi.org/10.1186/s12864-015-1310-1.

Basu, C., Halfhill, M.D., Mueller, T.C., and Stewart, C.N. Jr. 2004. Weed genomics: new tools to understand weed biology. Trends Plant Sci 9(8): 391 – 398.

Bennett, M.D., Leitch, I.J., Hanson, L. 1998. DNA amounts in two samples of angiosperm weeds. Ann of Bot 82 (Supplement A): 121 – 134.


Bennett, M.D. and Leitch, I.J. 2012. Plant DNA C-values Database. Royal Botanic Gardens, Kew. Accessed October 2017. Available from: data.kew.org/cvalues/

Bera, B., Das, S., and Mukherjee, K.K. 1992. Morphological studies of three cytotypes of Chenopodium album L. of lower Gangetic Plains, West Bengal, India. Phytomorphology 42(1&2): 93 – 103.

Brenner, D. (USDA North Central Plant Introduction Station, Ames, Iowa). Email with Neff, E. and Davis, T.M. (Dept. of Biological Sciences, University of New Hampshire, Durham, NH). 2017 July 28.

Christensen, S.A., Pratt, D.B., Pratt, C., Nelson, P.T., Stevens, M.R., Jellen, E.N., Coleman, C.E., Fairbanks, D.J., Bonifacio, A., and Maughan, P.J. 2007. Assessment of genetic diversity in the USDA and CIP-FAO international nursery collections of quinoa (Chenopodium quinoa Willd.) using microsatellite markers. Plant Genetic Resources 5(2): 82 – 95.

Cole, M.J. 1961. Interspecific relationships and intraspecific variation of Chenopodium album L. in Britain. I. The Taxonomic Delimitation of the Species. Watsonia 5(2): 47 – 58.

Cole, M.J. 1962. Interspecific relationships and intraspecific variation of Chenopodium album L. in Britain. II. The Chromosome numbers of C. album L. and other species. Watsonia 5(3): 117 – 122.

Dolezal, J. 2002. Nuclear DNA Content and Genome Size of Trout and Human. doi: 10.1002/cyto.a.10013

Dolezal, J., Greilhuber, J., and Suda, J. 2007. Estimation of nuclear DNA content in plants using flow cytometry. Nature Protocols 2(9): 2233 – 2244.


Fuentes-Bazan, S., Mansion, G., and Borsch, T. 2012a. Towards a species level tree of the globally diverse genus Chenopodium (Chenopodiaceae). Mol Phylogenet Evol 62: 359 – 374.

Fuentes-Bazan, S., Uotila, P., and Borsch, T. 2012b. A novel phylogeny-based generic classification for Chenopodium sensu lato, and a tribal rearrangement of Chenopodioideae (Chenopodiaceae). Willdenowia 1: 5 – 24.

Gandarillas, A., Saravia, R., Plata, G., Quispe, R., Ortiz-Romero, R. 2015. Principle Quinoa Pests and Diseases. Chapter 2.6. In FAO & CIRAD. State of the Art Report on Quinoa in the World in 2013, p. 192 – 215. Rome.

Jarvis, D.E., Ho, Y.S., Lightfoot, D.J., Schmockel, S.M., Li, B., Borm, T.J.A., Ohyanagi, H., Mineta, K., Michell, C.T., Saber, N., Kharbatia, N.M., Rupper, R.R., Sharp, A.R., Dally, N., Boughton, B.A., Woo, Y.H., Gao, G., Schijlen, E.G.W.M., Guo, X., Momin, A.A., Negrao, S., Al-Babili, S., Gehring, C., Roessner, U., Jung, C., Murphy, K., Arold, S.T., Gojobori, T., van der Linden, C.G., van Loo, E.N., Jellen, E.N., Maughan, P.J., and Tester, M. 2017. The genome of Chenopodium quinoa. Nature 542: 307 – 312. doi:10.1038/nature21370

Jellen, E.N., Kolano, B.A., Sederberg, M.C., Bonifacio, A., and Maughan, P.J. 2011. Chenopodium. In Wild Crop Relatives: Genomic and Breeding Resources: Legume Crops and Forages. London, Springer Heidelberg Dordrecht. Pp. 35 – 61.

Jellen, E.N. and Maughan, P.J. (Dept. of Plant and Wildlife Sciences, Brigham Young University, Provo, UT). Conversation with: Neff, E. and Davis, T.M. (Dept. of Biological Sciences, University of New Hampshire, Durham, NH). 2016 July 27.


Kolano, B., Siwinska, D., and Maluszynska, J. 2008. Comparative cytogenetic analysis of diploid and hexaploid Chenopodium album Agg. Acta Soc Bot Pol. 7: 293 – 298.

Kubesova, M., Moravcova, L., Suda, J., Jarosik, V., and Pysek, P. 2010. Naturalized plants have smaller genomes than their non-invading relatives: a flow cytometric analysis of the Czech alien flora. Preslia 82: 81 – 96.

Maughan, P.J., Bonifacio, A., Jellen, E.N., Stevens, M.R., Coleman, C.E., Ricks, M., Mason, S.L., Jarvis, D.E., Gardunia, B.W., and Fairbanks, D.J. 2004. A genetic linkage map of quinoa (Chenopodium quinoa) based on AFLP, RAPD, and SSR markers. Theor Appl Genet 109: 1188 – 1195.

Maughan, P.J., Smith, S.M., Rojas-Beltran, J.A., Elzinga, D., Raney, J.A., Jellen, E.N., Bonifacio, A., Udall, J.A., and Fairbanks, D.J. 2012. Single nucleotide polymorphism identification, characterization, and linkage mapping in quinoa. The Plant Genome 5(3): 114 – 125.

Mandak, B., Travnicek, P., Pastova, L., and Korinkova, D. 2012. Is hybridization involved in the evolution of the Chenopodium album aggregate? An analysis based on chromosome counts and genome size estimation. Flora 207: 530 – 540.

Natural Heritage and Endangered Species Program. 2004. Massachusetts list of endangered, threatened, and special concern species (29 January 2005). Massachusetts Division of Fisheries and Wildlife, Massachusetts. Available from: https://plants.usda.gov/java/threat?statelist=states&stateSelect=US25&sort=stateComname

Ohri, D. 2015. The taxonomic riddle of Chenopodium album L. complex (Amaranthaceae). Nucleus 58(2): 131 – 134.

Ortiz, R., Ruiz-Tapia, E.N., and Mujica-Sanchez, A. 1998. Sampling strategy for a core collection of Peruvian quinoa germplasm. Theor Appl Genet 96: 475 – 483.

Palomino, G., Hernandez, L.T., and de la Cruz Torres, E. 2008. Nuclear genome size and chromosome analysis in Chenopodium quinoa and C. berlandieri subsp. nuttalliae. Euphytica 164: 221 – 230.

Perez-de-Castro, A.M., Vilanova, S., Canizares, J., Pascual, L., Blanca, J.M., Diez, M.J., Prohens, J., and Pico, B. 2012. Application of Genomic Tools in Plant Breeding. Curr Genomics 13(3): 179 – 195.

Peterson, A., Jacobsen, S., Bonifacio, A., and Murphy, K. 2015. A crossing method for quinoa. Sustainability. 7: 3230 – 3243. doi: 10.3390/su7033230

Rahiminejad, M.R. and Gornall, R.J. 2004. Flavonoid evidence for allopolyploidy in the Chenopodium album aggregate (Amaranthaceae). Plant Syst Evol. 246: 77 – 87.

Rana, T.S., Narzary, D., and Ohri, D. 2010. Genetic diversity and relationships among some wild and cultivated species of Chenopodium L. (Amaranthaceae) using RAPD and DAMPD methods. Curr Sci 99(6): 840 – 846.

Rhoades, A. F. and Block, T. A. 2007. The Plants of Pennsylvania. Second Edition. University of Pennsylvania Press, Philadelphia.

Rojas, W., Barriga, P., and Figueroa, H. 2000. Multivariate analysis of the genetic diversity of Bolivian quinoa germplasm. Plant Genetic Resources Newsletter 122: 16 – 23.

Stevens, P. F. (2001 onwards). Angiosperm Phylogeny Website. Version 14, July 2017 [and more or less continuously updated since].


Storchova, H., Drabesova, J., Chab, D., Kolar, J., and Jellen, E.N. 2015. The introns in FLOWERING LOCUS T-LIKE (FTL) genes are useful markers for tracking paternity in tetraploid Chenopodium quinoa Willd. Genet Resour Crop Evol 62: 913 – 925.

Sukhorukov, A.P. and Zhang, M. 2013. Fruit and Seed Anatomy of Chenopodium and Related Genera (Chenopodioideae, Chenopodiaceae/Amaranthaceae): Implications for Evolution and Taxonomy. PLOS One 8(4): 1 – 18.

Torres, A.M., Weeden, N.F., and Martin, A. 1993. Linkage among isozyme, RFLP, and RAPD markers in Vicia faba. Theor Appl Genet 85: 937 – 945.

United States Department of Agriculture. 2017. Plants Profile for Chenopodium simplex (2017 October 2). Available from: https://plants.usda.gov/core/profile?symbol=Chsi2.

Vrit, P., Krak, K., Travnicek, P., Douda, J., Lomonosova, M.N., and Mandak, B. 2016. Genome size stability across Eurasian Chenopodium species (Amaranthaceae). Bot J Linnean Soc 182: 637 – 649.

Walsh, B.M., Adhikary, D., Maughan, P.J., Emshwiller, E., and Jellen, E.N. 2015. Chenopodium polyploidy inferences from Salt Overly Sensitive 1 (SOS1) data. Am J Bot 102(4): 533 – 543.

Wilson, H.D. 1980. Artificial Hybridization Among Species of Chenopodium sect. Chenopodium. Syst Bot 5(3): 253 – 263.


APPENDIX 1: PREPARATION OF DE LAAT’S BUFFER FOR CYTOMETRY

To 40 mL distilled H2O in a flask on a stir plate, add:

180 mg HEPES (15 mM)

18.5 mg Na2EDTA.2H2O (1 mM)

100 uL Triton X-100 (0.2% v/v)

298 mg KCl (80 mM)

58 mg NaCl (20 mM)

500 mg PVP-40,000 (0.25 mM)

5.13 g sucrose (300 mM)

8.5 mg spermine.4HCl (0.50 mM)

Adjust to pH 8.0 and raise the volume to 50 mL with distilled H2O. Use 1.4 mL of the buffer per sample, plus 1.05 uL β-mercaptoethanol (15 mM) per mL of buffer used.
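The masses in this recipe follow from mass = concentration x volume x molecular weight. The Python sketch below recomputes them for the 50 mL final volume. The molecular weights are standard values supplied here as assumptions, since the recipe lists only masses and target concentrations, and the dictionary and function names are introduced for illustration.

```python
FINAL_VOLUME_ML = 50.0

# name: (target concentration in mM, assumed molecular weight in g/mol)
BUFFER_COMPONENTS = {
    "HEPES": (15.0, 238.30),
    "Na2EDTA.2H2O": (1.0, 372.24),
    "KCl": (80.0, 74.55),
    "NaCl": (20.0, 58.44),
    "sucrose": (300.0, 342.30),
    "spermine.4HCl": (0.5, 348.18),
}


def mass_mg(conc_mm, volume_ml, mol_weight):
    """Mass in mg needed for a target molarity.

    mM x mL gives micromoles; micromoles x g/mol gives micrograms;
    dividing by 1000 converts to milligrams.
    """
    return conc_mm * volume_ml * mol_weight / 1000.0


if __name__ == "__main__":
    for name, (conc_mm, mol_weight) in BUFFER_COMPONENTS.items():
        mass = mass_mg(conc_mm, FINAL_VOLUME_ML, mol_weight)
        print(f"{name}: {mass:.1f} mg for {conc_mm} mM in {FINAL_VOLUME_ML} mL")
```

With these assumed molecular weights, 300 mM sucrose in 50 mL corresponds to roughly 5.13 g, and the other components come out within a few percent of the listed masses.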


APPENDIX 2: PREPARATION OF PROPIDIUM IODIDE STAIN FOR FLOW CYTOMETRY

n = number of samples + 1 to account for error

Propidium iodide: n x 30 uL (1 mg/mL)

RNase: n x 3.04 uL (10 mg/mL)

β-mercaptoethanol: n x 0.6 uL

De Laat’s buffer: n x 368 uL

Propidium iodide is light-sensitive. This stain can be prepared ahead and kept at -80 °C.
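Scaling the stain recipe to a given run is simple multiplication, sketched below in Python. The per-sample volumes are those listed in this appendix, and the function name and the dictionary keys are labels introduced here for illustration.

```python
# Per-sample reagent volumes in microliters, as listed in Appendix 2.
PER_SAMPLE_UL = {
    "propidium iodide (1 mg/mL)": 30.0,
    "RNase (10 mg/mL)": 3.04,
    "beta-mercaptoethanol": 0.6,
    "De Laat's buffer": 368.0,
}


def stain_master_mix(num_samples):
    """Return total reagent volumes (uL) for num_samples plus one extra
    sample's worth to account for pipetting error."""
    n = num_samples + 1
    return {reagent: round(volume * n, 2) for reagent, volume in PER_SAMPLE_UL.items()}


if __name__ == "__main__":
    for reagent, volume in stain_master_mix(9).items():
        print(f"{reagent}: {volume} uL")
```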


APPENDIX 3: QUANTIFICATION OF DNA USING A QUBIT FLUOROMETER

n = number of DNA samples + 2 standards

Working Solution:

Reagent = 1 x n uL

Buffer = 199 x n uL

For each standard: 190 uL Working Solution + 10 uL standard

For each DNA sample: 198 uL Working Solution + 2 uL DNA sample

The fluorometer is calibrated by analyzing Standard 1 and then Standard 2. Each DNA sample is then analyzed by the fluorometer, which then calculates the DNA concentration based on the volume of DNA solution added to each tube.
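The working-solution volumes above can be tallied for any batch size, as in the Python sketch below. It follows the ratios given in this appendix (1 uL reagent and 199 uL buffer per tube, two standards), and the function name and the returned field names are assumptions introduced for illustration.

```python
def qubit_working_solution(num_dna_samples, num_standards=2):
    """Return reagent and buffer volumes (uL) for the working solution,
    plus the volume actually dispensed into standard and sample tubes."""
    n = num_dna_samples + num_standards
    reagent_ul = 1 * n
    buffer_ul = 199 * n
    # Each standard tube takes 190 uL and each sample tube takes 198 uL.
    dispensed_ul = 190 * num_standards + 198 * num_dna_samples
    return {
        "n": n,
        "reagent_uL": reagent_ul,
        "buffer_uL": buffer_ul,
        "prepared_uL": reagent_ul + buffer_ul,
        "dispensed_uL": dispensed_ul,
    }


if __name__ == "__main__":
    plan = qubit_working_solution(8)
    for key, value in plan.items():
        print(f"{key}: {value}")
```

Because every tube uses less than 200 uL of working solution, the prepared volume always slightly exceeds what is dispensed, which leaves a small margin for pipetting loss.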


APPENDIX 4: PREPARATION OF 2% ELECTROPHORESIS GEL FOR RAPD PCR

In a flask, combine 100 mL of 1X TBE buffer, 1.0 g agarose, and 1.0 g NuSieve agarose. Swirl to mix. Tare the flask, and add approximately 6 g of sterile water to replace the water volume that is lost due to evaporation when heating. Transfer the flask to a microwave, and heat it in 15-second intervals. Swirl and weigh the flask between heating intervals, until the solids are dissolved and the weight of the flask reaches 0 g.

Using potholders, swirl the bottom portion of the flask under running warm water until the flask can comfortably be placed on the palm of a hand. Once cooled, pour the solution into the platform of an OWL gel box with a 20-well comb. Tape the comb upright using labeling tape, if necessary, to avoid tilted wells. Allow the gel to sit for 30 minutes, then transfer to a refrigerator for another 15 minutes before loading. Use more 1X TBE as running buffer.


APPENDIX 5: PREPARATION OF 1X TBE BUFFER FOR ELECTROPHORESIS GEL

To make 1 L of 10X TBE buffer:

In a large beaker on a stir plate, add 600 mL of sterile, filtered water. Add the following and stir until dissolved:

108 g Tris

55 g Boric Acid

40 mL of 0.5 M NaEDTA

Adjust to pH 8.0, and add enough sterile, filtered water to bring the volume to 1000 mL. Transfer the buffer to a large flask, cover the top with aluminum foil, and autoclave.

Once cooled, dilute the buffer to 1X by adding 4500 mL sterile H2O to 500 mL 10X TBE. Store in a covered container.
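The dilution step follows the usual relation C1 x V1 = C2 x V2. The Python sketch below reproduces the 500 mL + 4500 mL split; the function name `dilution_volumes` is hypothetical and used only for illustration.

```python
def dilution_volumes(stock_x: float, final_x: float, final_volume_ml: float) -> tuple[float, float]:
    """Return (stock_ml, water_ml) needed to dilute a concentrated buffer.

    Uses C1 * V1 = C2 * V2, with concentrations expressed as "X" strength.
    """
    if not 0 < final_x <= stock_x:
        raise ValueError("final_x must be positive and no greater than stock_x")
    if final_volume_ml <= 0:
        raise ValueError("final_volume_ml must be positive")
    stock_ml = final_x * final_volume_ml / stock_x
    water_ml = final_volume_ml - stock_ml
    return stock_ml, water_ml


if __name__ == "__main__":
    stock_ml, water_ml = dilution_volumes(stock_x=10, final_x=1, final_volume_ml=5000)
    print(f"{stock_ml} mL 10X TBE + {water_ml} mL sterile water")
```

With a 10X stock and a 5000 mL final volume, this gives the same volumes as the protocol.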


APPENDIX 6: CHLOROPLAST REFERENCE SEQUENCE DATA USED IN THIS STUDY

Species GenBank Accession Number

Chenopodium album HE577559.1

Chenopodium album HE577609.1

Chenopodium album HE577596.1

Chenopodium album HE577556.1

Chenopodium berlandieri var. boscianum HE577564.1

Chenopodium ficifolium HE577606.1

Chenopodium quinoa HE577576.1

Chenopodium quinoa HE577579.1

Chenopodium quinoa HE577578.1

Chenopodium standleyanum HE577560.1

Chenopodium standleyanum HE577603.1


APPENDIX 7: CHLOROPLAST DNA PHYLOGENETIC TREE GENERATED FOR THIS STUDY

[Figure: cpDNA phylogenetic tree, with clades labeled C/D-type, A-type, and B-type]

cpDNA tree generated for this study. Sequences generated for this study are in black, whereas reference sequences are printed in grey. The “A-type” clade is highlighted in light blue; the “B-type” clade is in green; and the “C-” or “D-type” clade is highlighted in purple.


APPENDIX 8: COLLECTION SITES, 2016 – 2017

Appendix 8-A: Overview of plant collection sites in summer 2016 (multicolored) and summer 2017 (blue). These sites are clustered mainly along the New Hampshire Seacoast, but also include two sites in Quebec City. In total, four species were collected in 2016: C. album (purple marker), C. berlandieri (green marker), C. strictum (red marker), and C. ficifolium (yellow marker). One unknown chenopod species was collected in 2017, and the collection site is marked with the blue marker.


Appendix 8-B: Plant collection sites in New Hampshire and Southern Maine in summer 2016. While the majority of sites are concentrated along the Seacoast, there is one site in Durham, NH, two sites in Portsmouth, NH, and many sites on Appledore Island. Four species, C. album (purple marker), C. berlandieri (green marker), C. strictum (red marker), and C. ficifolium (yellow marker), were collected.


Appendix 8-C: Plant collection sites on Appledore Island during summer 2016. These sites are dispersed along the coast of the northern half of the island, as the southern half is a private residential area. Two species, C. berlandieri (green marker) and C. album (purple marker), were collected on the island.


Appendix 8-D: Collection site of the unknown chenopod and C. simplex in summer 2017. This site was on a south-facing cliff near Shell Pond in Stow, ME. GPS coordinates are listed in Appendix 11. The site was sparsely vegetated, supporting only a few types of plants, as the “soil” was hardly more than granite sand. This habitat closely resembles the habitat of C. foggii as recorded in herbarium records and taxonomic keys.


APPENDIX 9: SOS1 INTRON 16 REFERENCE SEQUENCE DATA USED IN THIS STUDY

Species GenBank Accession Number Blitum californicum KP798895.1 KP798897.1

KP798990.1 Chenopodium album KP798988.1 KP798992.1

KP798948.1 Chenopodium berlandieri var. zschackei KP798950.1 KP799006.1 Chenopodium ficifolium KP799004.1 KP798924.1 KP798978.1 KP798980.1 Chenopodium standleyanum KP798934.1 KP798940.1 Chenopodium quinoa KP798942.1 KP798994.1 KP798998.1 KP798996.1 KP799000.1 KP799002.1


APPENDIX 10: SOS1 INTRON 16 PHYLOGENETIC TREE GENERATED FOR THIS STUDY

[Figure: SOS1 intron 16 phylogenetic tree, with clades labeled A-clade, B-clade, C-clade, and D-clade]

SOS1 tree generated for this study. Sequences generated for this study are in black, whereas reference sequences are printed in grey. The A-clade is highlighted in orange; the B-clade is in blue; the C-clade is in green; and the D-clade is in red.


APPENDIX 11: COLLECTION DATA, SUMMER 2016 AND 2017

Assigned Collector Code | Date Collected | Collection Location | Location Details | Notes | Species Identification Based on Morphology

P1 Telephone C. ficifolium pole southeast P2 of theater C. ficifolium 02-Jun-16 125 Bow St, P3 43o 4.717’N Flower bed C. ficifolium Portsmouth, NH 70o 45.270’W southeast of P4 03801 theater C. album Across street P5 C. album from theater 10-Jun-16 Sand above P6 Fort Foster, ME C. album rocky beach

43o 02.552’ N Sand above 279 C. berlandieri 70o 42.075’ W rocky beach

Odiorne State 43o02.529’ N Park, NH Sand above 280 C. berlandieri 70o 42.901’ W rocky beach 24-Jun-16 43o02.511’ N Sand, at 281 C. berlandieri 70o 42.914’ W tideline

282 Rocky beach C. berlandieri 284-A C. berlandieri 43o 00.157’ N Rye Beach, Rye, 284-B Sandy levee, C. berlandieri 70o 44.072’ W NH roadside near 284-C marsh C. berlandieri 284-D C. berlandieri 43o 07.989’ N 285-A York Beach C. album 70o 00.292’ W Harbor, ME 285-B 03-Jul-16 Unknown 43o 14.246’ N 286 Ogunquit, ME C. album 70o 05.417’ W


In a crack 212 Islington St., between the P7 21-Aug-16 Discarded Portsmouth, NH C. berlandieri building & 03801 asphalt

42o 59.269' N 293 C. album 70o 36.869' W

42o 59.283' N 294 C. album 70o 36.945' W

42o 59.285' N 295 C. album 70o 36.946' W

296-A C. berlandieri

296-B 42o 59.299' N C. berlandieri 296-C 70o 36.945' W Sandy levees, C. berlandieri Appledore Island, 296-D 23-Aug-16 pathways, C. berlandieri Gulf of Maine 296-E rocky coves C. berlandieri 42o 59.352' N 297 C. berlandieri 70o 36.823' W 42o 59.308' N 298 C. berlandieri 70o 36.660' W 42o 59.447' N 299 C. berlandieri 70o 37.012' W

42o 59.497' N 300 C. strictum 70o 36.893' W

42o 59.494' N 301 C. berlandieri 70o 36.909' W

Cliff base. 302A Shallow C. foggii Cliff overlooking “soil” 44o 14.984’ N 17-July-17 Shell Pond, Stow, composed of 70o 57.974’ W ME granite sand. 302B Southern C. simplex exposure
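The coordinates in this table are recorded in degrees and decimal minutes. For use in mapping software, they can be converted to signed decimal degrees; the Python sketch below shows one way to do so. The function name `ddm_to_decimal` is hypothetical, and the conversion is an illustration rather than a step from the original workflow.

```python
def ddm_to_decimal(degrees: int, minutes: float, hemisphere: str) -> float:
    """Convert degrees + decimal minutes to signed decimal degrees.

    South and west coordinates are returned as negative values.
    """
    hemi = hemisphere.strip().upper()
    if hemi not in {"N", "S", "E", "W"}:
        raise ValueError("hemisphere must be one of N, S, E, W")
    if degrees < 0 or not 0 <= minutes < 60:
        raise ValueError("degrees must be >= 0 and minutes in [0, 60)")
    value = degrees + minutes / 60.0
    return -value if hemi in {"S", "W"} else value


if __name__ == "__main__":
    # Shell Pond site, Stow, ME (collector codes 302A and 302B).
    lat = ddm_to_decimal(44, 14.984, "N")
    lon = ddm_to_decimal(70, 57.974, "W")
    print(f"{lat:.6f}, {lon:.6f}")
```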

APPENDIX 12: FLOW CYTOMETRY DATA, SUMMER 2016 AND 2017

Taxon | Collector Code | Collection Location or Accession ID | Calculated 1C value (Mbp) | Single Chop | Co-Chop

USDA C. ficifolium PI 658749 851 x C. ficifolium P1 Downtown 821 x Portsmouth P3 831 x QC#1 869 x QC#4 865 x Quebec City, QC#4 878 x Quebec QC#6 869 x QC#7 887 x Shell Pond, ME C. foggii 302-A 577 x site USDA C. PI 658755 581 x C. standleyanum standleyanum USDA C. PI 658755 624 x standleyanum USDA C. berlandieri var. PI 666279 1455 x macrocalycium

C. berlandieri USDA C. var. berlandieri var. PI 666279 1456 x macrocalycium macrocalycium

P6 Fort Foster, ME 1400 x

279 1337 x

280 1374 x Odiorne State Park 281 1388 x 282 1440 x 284-C 1335 x

284-D 1307 x 284-H Rye Beach, Rye, 1333 x 284-I NH 1307 x 284-L 1288 x 284-M 1282 x 288-A 1302 x


296-A 1293 x C. berlandieri var. macrocalycium 296-B 1321 x 296-C 1335 x 296-D 1256 x Appledore Island 296-E 1289 x Not measured; 297 - - mature seeds 298 1282 x

299 1285 x 301 1239 x USDA PI 608030 1866 x bushianum C. berlandieri USDA PI 608030 1758 x var. bushianum bushianum USDA PI 608030 1855 x bushianum USDA zschackei PI 666288 976 x C. berlandieri var. zschackei USDA. zschackei PI 666288 994 x

USDA C. PI 666324 984 x strictum C. strictum USDA C. PI 666324 1000 x strictum

284-A 917 x 284-A Rye Beach, Rye, 946 x 284-B NH 833 x

284-B 916 x

284-B 980 x 284-B 988 x USDA C. album PI 666272 1782 x C. album USDA C. album PI 666272 1772 x USDA C. album PI 666271 969 x USDA C. album PI 666271 951 x USDA C. album Ames 29961 1849 x

USDA C. album Ames 29961 1760 x


USDA C. PI 605700 1867 x album P4 Downtown 1786 x P5 Portsmouth 1770 x Odiorne State 282 1736 x C. album Park

York Beach 285-A 1719 x Harbor, ME

286 Ogunquit, ME 1728 x D1 1704 x D2 UNH Campus 1672 x

D3 1682 x

DK-1 1735 x Collected by Dr. DK-2 1713 x Richard Smith DK-NY 1653 x 284-E 1694 x

284-F 1656 x Rye Beach, Rye, 284-G 1656 x NH 288-B 1627 x 288-C 1724 x

Quebec City, QC#3 1763 x Quebec Islington St, P7 1639 x Portsmouth, NH 293 1703 x 294 1690 x Appledore Island 295 1651 x 300 1597 x C. simplex 302-B Shell Pond, ME 1174 x USDA C. PI 478418 915 x quinoa USDA C. PI 478418 1320 x quinoa USDA C. PI 587173 1093 x quinoa C. quinoa USDA C. PI 587173 1313 x quinoa USDA C. PI 614901 1328 x quinoa USDA C. PI 614901 1203 x quinoa

USDA C. PI 614880 1030 x quinoa USDA C. PI 614880 1335 x quinoa USDA C. C. quinoa PI 614881 1121 x quinoa USDA C. PI 614881 1338 x quinoa USDA C. Ames 13734 1454 x quinoa


APPENDIX 13: CHLOROPLAST DNA SEQUENCE IDENTITY MATRIX


APPENDIX 14: SOS1 INTRON 16 SEQUENCE IDENTITY MATRIX
