University of Nevada, Reno

Toward Understanding the Genetic Basis of Cross-Incompatibility in Sorghum: de novo genome Assembly of Johnsongrass and Resequencing of Iap and BAM1 loci

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Cellular and Molecular Biology

By Julia N. Trowbridge

Melinda Yerka/Thesis Advisor

December 2019

THE GRADUATE SCHOOL

We recommend that the thesis prepared under our supervision by Julia Trowbridge entitled de novo genome Assembly of Johnsongrass and Resequencing of Iap and BAM1 loci be accepted in partial fulfillment of the requirements for the degree of Master of Science.

Dr. Melinda Yerka, Advisor
Dr. Felipe Barrios-Masias, Committee Member
Dr. David Alvarez-Ponce, Committee Member
Dr. Jeff Harper, Graduate School Representative
David W. Zeh, Ph.D., Dean, Graduate School

December 2019

ABSTRACT

Sorghum [Sorghum bicolor (L.) Moench] (referred to as “sorghum” hereafter) is a C4 grain crop in the grass family Poaceae. It is closely related to other members of sub-family Panicoideae, including the staple crops maize [Zea mays L.] and rice [Oryza sativa L.], and is the fifth most-produced cereal crop in the world [31, 33, 72]. The U.S. leads production of sorghum globally [34]. Johnsongrass [Sorghum halepense N. Steud de Wet] is a noxious weed in 46 states in the United States and is often found growing in close proximity to sorghum, where it has been shown to contaminate harvested seed through pollen-mediated gene flow. The risk of gene flow from sorghum to Johnsongrass is the primary reason why genetically engineered (GE) sorghum has not been approved for commercialization by USDA-APHIS (personal communication, Dr. Subray Hegde, USDA Biotechnology Regulatory Services branch chief in the Biotechnology Risk Analysis Program).
Many efforts [71, 73-77] have been made to determine the rate of gene flow between sorghum and Johnsongrass in order to empirically assess the risk of sorghum traits transferring to feral Johnsongrass populations. However, these studies used limited numbers of accessions from both species, and the lack of high-throughput genotyping methods and of a high-quality Johnsongrass reference genome has led to inconsistent results. Given the polyploid history of Johnsongrass (a putative allotetraploid [2n = 4x = 40]) and its close relationship to sorghum (S. bicolor is one of its ancestral genomes), this risk is not insignificant. To determine the frequency of sorghum alleles segregating in regional feral Johnsongrass populations, an assembled and annotated Johnsongrass reference genome is needed to identify species-specific alleles, and their copy number, that may differ from those in the existing, well-annotated sorghum reference genome. Local rates of gene flow are needed because different sorghum genotypes and production methods are used in different geographies, and both factors could impact rates of reproductive success, genetic drift, or the fixation of crop alleles.

This thesis provides the basic genomic framework necessary to assist in NGS-based inquiries into the ancestry, speciation, and comparative genomics of Johnsongrass and sorghum. We completed the first Johnsongrass de novo genome assembly and amplified, through long-read resequencing, the putative reproductive barrier loci [64] (Inhibition of Alien Pollen, Iap, and Barely Any Meristem, BAM1) in Johnsongrass that are known to impact rates of gene flow among Sorghum species and closely related genera (Zea and Saccharum). This new Johnsongrass reference genome and the targeted resequencing data will greatly facilitate population genetic studies aiming to clarify empirical rates of gene flow between sorghum and Johnsongrass specifically, and within the Sorghum species complex generally.
They will additionally assist with genetic and physiological investigations into the roles of key loci involved in processes of speciation and reproductive isolation.

ACKNOWLEDGMENTS

Knowledge comes in many forms and I would like to thank those who have expanded mine through readings, hands-on activities, or conversations. To my new friends Devin Smith and John (Jeep) Baggett, you made grad school bearable and even fun. To my fellow graduate students, especially Haley Toups and Chrystle Weigand, thank you for taking time out of your own research to teach me various lab and greenhouse skills, all while being the most positive and kind people I’ve met. To Jason, thanks for waiting around through countless hours of lab work, reading, and stress, and for listening to my fascination with my project, even when you didn’t quite understand what I was saying. To my Mom, thank you for the belief, support, and pride you have in me. To my advisors, professors, and other researchers, your wealth of knowledge never ceases to amaze and intimidate me in an inspiring way. Your feedback has always been fair and challenging, and a growing experience that I have appreciated as I become more confident in my own research ability and work ethic.

Table of Contents

1. List of Figures
2. Chapter I: Literature Review
   a. Introduction
   b. Zea cross-incompatibility
   c. PMEs modulate the stiffening and loosening of cell walls
   d. Sorghum cross-incompatibility
3. Chapter II: Assembly of the Johnsongrass [Sorghum halepense N. Steud de Wet] genome
   a. Introduction
   b. Materials and Methods
   c. Results
   d. Discussion
4. Supplemental Information
5. References

List of Figures

1. Figure 1. Model of Zea Pectin Methylesterases: Roles of Ga1, Ga2 and Tcb1 Allele Types in Pollen-Pistil Interactions.
2. Figure 2. Synteny between Chromosome 5 in Sorghum bicolor and the Zea mays reference genome, including the Ga1 locus on Chromosome 2 and additional sections of Chromosome 4. Sorghum appears to retain the ancestral Poaceae locus that was divided following an ancient tetraploidization event in Zea. Homologous regions are shown with connecting strands. Repeats are indicated in orange, positive-strand genes in blue, and negative-strand genes in green.
3. Figure 3. Representative Johnsongrass Seedlings after One Month of Growth. Seedlings were grown under control (A) or water deficit (B) conditions, with a meter stick shown for length reference. The control condition was 200 ml ± 20 ml SD of nutrient solution daily, whereas the water deficit condition was 50 ml ± 5 ml SD of nutrient solution daily.
4. Figure 4. 0.4% Agarose Gel with Johnsongrass Amplicons for Iap and BAM1, Putative Cross-Incompatibility Loci. Chr02:2144633..2160696: Iap full-length amplicon containing five candidate genes; Chr02:2144633..2150496: sub-region of Iap containing two candidate genes expressed in floral tissues at anthesis, Sobic.002G023300.1 and Sobic.002G023400; Chr02:2550778..2556242: amplicon containing the full-length BAM1 gene (Sobic.002G027600.1).
5. Figure 5. Heatmap of the Johnsongrass De Novo Genome Assembly. Blue boxes denote scaffolds and green boxes denote contigs. The red diagonal lines next to most pairs of green boxes are the result of synteny between those contigs. The gap in the diagonal red line appears to correspond to one chromosome that is more complex, with more polyploidy, heterozygosity, or both than the other chromosomes. This unresolved heterozygosity is likely the reason that only 36 chromosomes (scaffolds) were detected even though Johnsongrass has 40 chromosomes. Further cytology and advanced computational investigations will be needed to resolve this.
6. Figure 6. Ordering Metrics for the Johnsongrass De Novo Genome Assembly. A total of 36 finished scaffolds (chromosomes) were detected, with N50 = 43,822,357 bp.
7. Figure 7. Phylogenetic Tree of BAM1. Neighbor-joining tree construction and distance corrections were conducted in MUSCLE on the coding region of the gene BAM1 in sorghum against the Poaceae species maize, rice, and Johnsongrass, with Arabidopsis as the outgroup.
8. Figure 8. Phylogenetic Tree of Sobic.002G023300. Neighbor-joining tree construction and distance corrections were conducted in MUSCLE on the coding region of the gene Sobic.002G023300 in sorghum against the Poaceae species maize, rice, and Johnsongrass, with Arabidopsis as the outgroup.

LITERATURE REVIEW

Introduction

Sorghum [Sorghum bicolor (L.) Moench] (referred to as “sorghum” hereafter) is a C4 grain crop in the grass family Poaceae. It is closely related to other members of sub-family Panicoideae, including the staple crops maize [Zea mays L.] and rice [Oryza sativa L.], and is the fifth most-produced cereal crop in the world [31, 33, 72]. The U.S. is the largest global producer [34], with most production going to animal feed, although specialty markets are leading to increased adoption of commercial varieties for gluten-free bread flour, syrup, popping, alcoholic beverages, and biofuels. Sorghum is widely marketed as a gluten-free, non-GMO ancient grain [35].

The genus Sorghum contains 25 species within five clades: Eu-sorghum, Heterosorghum, Parasorghum, Stiposorghum, and Chaetosorghum [11, 12, 13]. Sorghum, S. bicolor, is in the Eu-sorghum clade along with its weedy relatives, which include its conspecific de-domesticated relative, shattercane [S. bicolor (L.) Moench ssp. drummondii (Nees ex Steud.) De Wet ex Davidse]; the noxious weed, Johnsongrass [Sorghum halepense N. Steud de Wet]; and [S. propinquum S. Kunth Hitchc] [16, 17]. Current genetic evidence