The Genetic Basis of Seed Coat Polymorphisms in Lupinus perennis
THE GENETIC BASIS FOR SEED COAT POLYMORPHISMS IN LUPINUS PERENNIS

Rachel E. Wilson

A Thesis Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE

December 2019

Committee: Helen Michaels, Advisor; Paul Morris; Scott Rogers

© 2019 Rachel Wilson. All Rights Reserved.

ABSTRACT

Helen J. Michaels, Advisor

Multigenic traits, specifically seed coat phenotypes, are poorly understood in domesticated plants. This knowledge gap can sometimes be filled by studying wild relatives (Mammadov et al., 2018; von Wettberg, 2018). Pigments responsible for seed colors are deposited in the seed coat, which is composed of the two outermost layers of a seed. Seed coat phenotypes can be polymorphic and typically involve complex pathways with multiple layers of expression controlling pigment compounds such as anthocyanins. The Anthocyanin Biosynthetic Pathway (ABP) plays a central role in the polymorphic seed coat phenotypes of many legumes, a family that includes lupines such as Lupinus perennis (Chalker-Scott, 1999). We hypothesized that the genetics of the ABP correlate with seed coat phenotype. This study aims to identify candidate genes that might be responsible for color differences in the polymorphic seeds of L. perennis. Using RNA sequencing (RNAseq), we produced de novo assemblies of the seed coat transcriptomes of immature white and speckled seeds. Immature seeds were sampled at two stages, a pre-pigment stage and a post-pigment stage. The use of both stages increased the chance of constructing a transcriptome containing pigment transcripts. Putative functional annotations of the seed coat transcripts were assigned using multiple databases. Differential expression analysis revealed 58 candidate genes whose expression patterns differed between the two phenotypes: 36 upregulated and 22 downregulated.
Two of these candidates pertain to the ABP, and several others were previously reported to be involved in plant defense, such as powdery mildew resistance genes. Further work is necessary to verify these results by qPCR and to examine the seed coat transcriptome in greater detail, for example through co-expression analysis. These results have important implications for endangered butterfly habitat restoration and crop breeding in the genus Lupinus and suggest that there is much more to understand about these seed coat phenotypes.

ACKNOWLEDGMENTS

I would like to thank my advisor, Dr. Helen Michaels, for all of her advice and guidance. I would also like to thank the members of my committee, Dr. Scott Rogers and Dr. Paul Morris, for all of the help and input they provided throughout this project. Additionally, I would like to thank my fellow graduate students in the Michaels lab, Haley Meek, Erica Forstater, Meigan Day, and Ian Anderson, for their assistance and support both in the field and in the lab. Lastly, I would like to thank the Bowling Green City Parks Department and Cinda Stutzman for allowing me to collect seeds on their properties.

TABLE OF CONTENTS

INTRODUCTION
METHODOLOGY
    Sample Collection and Preparation
    RNA Extraction
    RNA Quality Control Determination
    Illumina Sequencing
    Transcriptome Reconstruction
    Functional Annotation
    Gene Expression Analysis
    Differential Gene Expression
RESULTS
    Phenotypes of Biological Replicates
    Raw Data
    Data Quality
    Transcriptome Reconstruction
    Gene Functional Annotation
    GO Classification
    KOG Classification
    KEGG Classification
    Gene Expression Analysis
    Sample Correlation
    Summary of Gene Expression Levels
    Gene Expression Difference Analysis
DISCUSSION
REFERENCES
APPENDIX A. SUPPLEMENTARY TABLES
APPENDIX B. SUPPLEMENTARY INFORMATION

LIST OF FIGURES

Figure 1. Pathway of the Anthocyanin Biosynthetic Pathway
Figure 2. Three Developmental Time Points of L. perennis Seed Pods
Figure 3. Graph Interval Lengths
Figure 4. GO Classification
Figure 5. KOG Classification
Figure 6. KEGG Classification
Figure 7. Pearson Correlation
Figure 8. Venn Diagram of Differentially Expressed Genes
Figure 9. Volcano Plots
Figure 10. Heatmap of Differentially Expressed Genes

LIST OF TABLES

Table 1. Sample Purity, Concentration, and Tissue Stage
Table 2. Data Quality Measurements for Library Construction
Table 3. The Mapped Transcriptome
Table 4. The Length Measurements of Transcriptome
Table 5. Interval Lengths of Transcripts
Table 6. The Transcriptome Annotation Summary
Table 7. Sample Names and Approaches

INTRODUCTION

Seed polymorphisms are common in many plants, especially legumes (members of the Fabaceae), which bear their seeds in pods and have high nutritional value. Economically important legumes with polymorphic seeds include soybean, chickpea, common bean, sweet pea, and lupin (Todd and Vodkin, 1996; Tuteja et al., 2009; Zabala and Vodkin, 2014; Mirali et al., 2016). The literature suggests that legumes, which are protein rich, can be used as a substitute for meat (Abraham et al., 2019). Lupines are popular ornamental plants because of their towering, spiraled inflorescences and variety of colors. The most common ornamental variety is the Russell Hybrid, with yellows, reds, purples, and all shades in between (Elmer et al., 2001). While lupin is a common ornamental plant for gardeners, it is starting to gain standing as an important agricultural crop. Lupines have been cultivated for over 3000 years, beginning in the Mediterranean Basin (Gladstone, 1970), and five species are currently domesticated: L. angustifolius, L. albus, L. polyphyllus, L. mutabilis, and L. luteus (Gladstone, 1974). In Western Australia, the world’s largest producer of lupines, the lupin industry has an export sum of $65