ABSTRACT

SAILSBERY, JOSHUA KENT. Comparative Genomic and Transcriptional Analyses of Magnaporthe oryzae and other Eukaryotes. (Under the direction of Dr. Ralph A. Dean).

Magnaporthe oryzae devastates rice crops around the world, destroying enough rice to feed at least 60 million people annually. Herein, I investigate this pathogenic fungus to elucidate the genomic components utilized during infection. Investigated components include regulatory elements, RNA sequences, and genes (especially transcription factors). Chapter 1 contains the background and introduction to my research.

In Chapter 2, I characterize a common transcription factor component, the basic Helix-Loop-Helix (bHLH) domain, in M. oryzae and other fungi. Through phylogenetic analyses, I identified 12 major groupings within Fungi, along with conserved motifs and functions specific to each group. Several classification models were built to distinguish the 12 groups and elucidate the most discerning sites in the domain. These models were highly accurate and led to the identification of 12 highly discerning sites (1, 4, 6, 7, 8, 12, 15, 16, 19, 20, 50, and 53), which were incorporated into a set of rules to classify sequences into one of these 12 groups. Conservation of amino acid sites and phylogenetic analyses established that, like plant bHLH proteins, fungal bHLH-containing proteins are most closely related to animal Group B.

In Chapter 3, I assessed the bHLH domain across Plants, Animals, and Fungi to identify unique sequence characteristics pertaining to each Kingdom. Using classification models, I identified five essential amino acid sites that are highly characteristic of these Kingdoms. Hidden Markov Models, built on expertly aligned domains, were used with the classification models to identify and classify bHLH sequences from a marine environmental sample. Last, I created an online tool that can align, extract, and classify bHLH sequences.

In Chapter 4, next-generation sequencing was used for a detailed examination and characterization of small RNA molecules from mycelia and appressoria. In a collaborative project, my work showed that genomic features contributed differentially to the RNA sequence libraries. Mycelia RNAs were enriched for intergenic and repetitive elements, while a higher proportion of appressoria RNAs were enriched for tRNA loci. Differential mapping of small RNAs to the 5’ and 3’ halves of mature tRNAs was also observed. This led to the identification of sites with post-transcriptional modification within tRNAs and showed a difference in that modification between the two tissues.

In a second collaborative RNA study (Chapter 5), methylguanosine-capped and polyadenylated small RNAs (CPA-sRNAs) were sequenced with 454 technologies. My work showed that CPA-sRNAs mapped to rRNAs, tRNAs, snRNAs, transposable elements, and intergenic regions. Where CPA-sRNAs mapped to protein-coding genes, they were predominantly associated with transcriptional start and termination sites. CPA-sRNA enrichment in protein-coding genes, especially those encoding ribosomal proteins, was positively correlated with gene expression.

Finally, in Chapter 6, I designed a new comparative genomics software package (D-SynD) that can detect regions of syntenic DNA between multiple large genomes simultaneously. D-SynD requires no gene models and makes no assumptions with regard to gene order or orientation. Additionally, detected syntenic regions are statistically evaluated for significance. The software allows many user options, such as defining the preferred syntenic region size and complexity. D-SynD is released as an open-source software package for use in comparative genomic studies.

Comparative Genomic and Transcriptional Analyses of Magnaporthe oryzae and other Eukaryotes

by
Joshua Kent Sailsbery

A dissertation submitted to the Graduate Faculty of North Carolina State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Bioinformatics

Raleigh, North Carolina
2011

APPROVED BY:

Dr. Ralph A. Dean, Co-Chair of Advisory Committee
Dr. Ignazio Carbone, Co-Chair of Advisory Committee
Dr. Gary A. Payne
Dr. Eric A. Stone
Dr. Jeffrey Thorne

DEDICATION

Without question, this work is dedicated to my beloved wife Stacy Dawn Sailsbery and our two boys, Aiden Kent and Ryan James Sailsbery.

BIOGRAPHY

Joshua was born on May 31st, 1979 and raised in the Pacific Northwest. In high school, with the support of his entire family, he participated in the Washington State funded Running Start program. As a result, Joshua graduated with his two-year degree a day prior to obtaining his high school diploma. In 1998, Joshua moved to Provo, UT and started his undergraduate research at Brigham Young University. After a year working toward his Bachelor’s degree, Joshua put his academic career on hold to serve the wonderful people struggling along the Rio Grande. In 2001, after completing two years of service in southeast Texas, he returned to BYU to complete his studies. During this time, he met and married his sweetheart, Stacy Searle. While planning his professional career, Joshua overheard a Bioinformatics student explain the field to a career counselor, and Joshua was hooked. From there, he was employed by Dr. David A. McClellan, an Integrated Biology professor, to create two Bioinformatics software packages (CDM and TreeSAAP).
After graduating from BYU with a Bachelor’s degree in Computer Science, Computational Biology, and Bioinformatics in 2005, Joshua pursued graduate school at North Carolina State University. With funding provided by the NIH endowed IGERT grant, Joshua joined the Bioinformatics Resource Center. During the next three years, Joshua had the opportunity to learn from and work with many talented professionals in his field.

Following his third year of study, Joshua joined Dr. Ralph A. Dean’s lab at the Center for Integrated Fungal Research. While there, he had the opportunity to make large contributions to two projects, and led three projects of his own. He was also fortunate, over the course of three summers, to mentor junior researchers recruited through the NSF supported Research Experience for Undergraduates program.

ACKNOWLEDGMENTS

First, I would like to thank my advisor, Dr. Ralph A. Dean, for the opportunity to work in the Fungal Genomics Lab. He has provided immeasurable support in critical thinking, manuscript editing, and project direction. In many ways he embodies scientific excellence. I will be ever grateful to have been a part of his team at the Center for Integrated Fungal Research (CIFR). I am also grateful to my committee members Eric A. Stone and Jeffrey Thorne for their wonderful lectures, and for supporting my academic career since my arrival at NCSU. I would like to thank my other committee members, Gary Payne and Ignazio Carbone, for their suggestions in several projects and their support in finishing the requirements for this degree.

To the people behind the IGERT, NSF, and NIH grants that have funded my academic career, thank you. The funding has allowed me to participate in many exciting fields of study, support my family financially, and complete my Ph.D. work.

I am sincerely grateful to Douglas E. Brown for his example of professionalism in the workplace. His optimistic outlook on life was especially appreciated.
Also, I would like to thank Minfeng Xue, who showed me the finer art of Perl scripting. Many thanks to all the current and former members of CIFR for the expertise, professionalism, and great working environment they provided, including Vickie Randleman, Greg C. Bernard, Junhyun Jeon, William Sharpee, Dr. Malali Gowda, and Dr. Yeon Yee Oh.

I would like to thank all the Research Experience for Undergraduates students I had the pleasure of working with. It’s hard not to succeed when you have such talented and bright people working with you. I’m particularly grateful to Brent Clay, who toiled for more than a year, applying his massive skills to the progression of our work.

Finally, I would like to thank my supportive family: both of my parents, who have nothing but the utmost confidence in me; my sister Tawnie and her family, who moved to North Carolina and helped Stacy and me in innumerable ways; my two wonderful boys, who have sacrificed so much time with their father so he could “write his paper”; and most importantly, my wife Stacy, without whom none of this would even be possible. Thank you, Dear.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
CHAPTER 1-Insights to rice blast disease provided by the Magnaporthe oryzae genome
  Background
  References
CHAPTER 2-Phylogenetic Analysis and Classification