Honors Program in Partial Fulfillment of the Requirements for University Honors
Phylogenetic Analysis of a Group of Enteric Bacteria Based on 16S rDNA Gene Sequencing

A thesis submitted to the Miami University Honors Program in partial fulfillment of the requirements for University Honors

by Katie Lynn Ziegler

May 2004
Oxford, Ohio

ABSTRACT

PHYLOGENETIC ANALYSIS OF A GROUP OF ENTERIC BACTERIA BASED ON 16S rDNA GENE SEQUENCING

By Katie Lynn Ziegler

It has been suggested that phylogenies derived specifically from 16S rDNA gene sequencing provide reasonable evidence for describing major evolutionary lineages (Brown, 1997). Because of the critical role they play in protein synthesis and thus survival, rDNA sequences are highly conserved and tend to resist significant change over time. Thus, they are of prime interest when studying evolutionary relatedness. In this study, we sequenced a portion of the 16S rDNA gene from the amplified PCR products of enteric bacterial samples. The resulting DNA sequences were edited and aligned to develop a phylogeny based on our comparative data. We developed phylogenies using parsimony analysis and bootstrapping and compared our results to expected data. We hypothesized that species within the same genus would contain the most shared derived traits and therefore be most closely related. However, our phylogeny provided limited resolution of some of our strains. While a few strains resolved well (i.e. Serratia and some Enterobacter strains), weak bootstrap support characterized much of the phylogeny. Ribosomal DNA contains conserved, variable, and highly variable regions. The highly variable regions are most prone to mutation, and thus may not provide good phylogenetic signal. For this reason, the hyper-variable regions not only fail to provide meaningful data, but can also destroy signal that is already present.

Phylogenetic Analysis of a Group of Enteric Bacteria Based on 16S rDNA Gene Sequencing

by Katie Lynn Ziegler

Approved by:

_________________________, Advisor
Dr. Gary Janssen, Ph.D.
_________________________, Reader
Dr. Luis Actis, Ph.D.

__Dr. Francisco B.-G. Moore__, Reader
Dr. Francisco B.-G. Moore, Ph.D.

Accepted by:

__________________________, Director, University Honors Program

Acknowledgements

I would first like to thank Dr. Francisco Moore (Paco) and the University of Akron Biology Department for their immense help with this project. Paco, thank you for taking a chance on the random undergrad who emailed you in October about summer work! Thank you for designing this project for me to work on and for keeping the lab stocked with everything I needed throughout the process. Thank you for your guidance when I ran out of my own troubleshooting ideas. Thank you for serving as one of my readers. And thank you for responding so quickly to all of my questions as I wrote this thesis. I don’t know what I would have done without your input.

I would also like to thank Dr. R. Joel Duff from the University of Akron for allowing me to use his lab when I needed it and for answering some of my queries along the way. Thanks to Traci Branch for running sequencing reactions while I was in Oxford. Thank you for staring aimlessly at the alignments with me until we figured out what the heck to do with them! I really appreciate that you dedicated so much of your time to helping me finish “on time”. Thanks also to Lauren Smith for her support and for sending me files to include in my thesis.

I would like to thank Dr. Gary Janssen from Miami University for serving as my advisor throughout this process. Thank you for putting up with my ever-shrinking timetable and for continuing to support me throughout the process. Thanks to Dr. Luis Actis for serving as my other thesis reader and for helping me with my proposal. Thanks also to Dr. Joe Carlin for granting me computer access and letting me download phylogeny software onto the departmental computer.

Finally, thanks to my family and friends for putting up with me when I was stressed out (aka the last month!)
I don’t know what I would have done without all of you!

Table of Contents

Introduction
Materials and Methods
Results
Discussion
References Cited
Appendix

List of Figures

Figure 1. Schematic illustration of primer binding sites on the 16S rDNA gene
Figure 2. Agarose gel electrophoresis illustrating a representative gel of PCR-amplified 16S rDNA gene products
Figure 3. New England BioLabs TriDye 1 kb DNA ladder used to estimate the size of the 16S rDNA gene after PCR amplification
Figure 4. Most parsimonious phylogeny detailing the relatedness of the bacterial species under study
Figure 5. Bootstrap 50% majority-rule consensus tree of 2000 replicates

List of Tables

Table 1. Bacterial strains used for rDNA sequence determination (by LAB Reference Number)
Table 2. Bacterial rDNA sequences obtained from the Ribosomal Database Project (by corresponding GenBank entry number)
Table 3. Primer sequences used for PCR and sequencing reactions

Introduction

Developments in the field of molecular systematics have expanded the realm of knowledge available to us regarding bacterial classification and relatedness. Since DNA is the cellular component responsible for bacterial inheritance, and evolution is the change in the genetic composition of a population over time, variation in DNA sequences should provide hints toward solving evolutionary questions (Mindell & Honeycutt, 1990). It has been suggested that phylogenies derived specifically from 16S rDNA gene sequencing provide reasonable evidence for describing major evolutionary lineages (Brown et al., 1997).
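To make concrete the idea that sequence variation carries information about relatedness, the following minimal sketch counts the proportion of aligned sites at which two sequences differ (the p-distance). This is only an illustration under stated assumptions: it is not the parsimony analysis used in this study, the sequences and strain names are made up, and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences differ.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped, so only positions with a called base in both are compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def pairwise_distances(alignment: dict) -> dict:
    """p-distance for every unordered pair of names in the alignment."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Made-up aligned fragments, for illustration only (not data from this study).
toy_alignment = {
    "strain_A": "ACGTACGTAC",
    "strain_B": "ACGTACGTAT",
    "strain_C": "ACGAACCTAT",
}

distances = pairwise_distances(toy_alignment)
for pair, d in distances.items():
    print(pair, round(d, 2))
```

In this toy alignment, strain_A and strain_B differ at fewer sites than either does from strain_C, which is the kind of pattern a phylogenetic method would interpret as closer relatedness.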
Ribosomal RNA genes encode the rRNAs, which bind ribosomal proteins and associate to form complete ribosome complexes. Ribosomal RNAs are present in all species, and ribosomes are essential for cellular function and survival. Because of the critical role they play, rDNA sequences are highly conserved and tend to resist significant change over time (Wertz et al., 2003). Thus, they are of prime interest when studying evolutionary relatedness.

In addition to obtaining phylogenies from rDNA sequence data, research has indicated a number of other areas in which such data are useful. For example, the core genome hypothesis proposes the existence of core genes and auxiliary genes. Core genes include regulatory genes present in a full (or almost full) copy in each isolate of a species. They rarely undergo lateral gene transfer because, theoretically, there should be no selective advantage conferred by doing so (Lan & Reeves, 2001). Auxiliary genes, on the other hand, may include pathogenicity islands, resistance genes, novel metabolic functions, toxin genes, etc. These are significantly more likely to undergo lateral gene transfer in response to environmental or selective pressures (Lan & Reeves, 2001).

In an attempt to define a bacterial species using a molecular approach, the core genome hypothesis acknowledges that because there is little or no selective advantage to acquiring core genes from other species, the core genes’ sequences tend to diverge between species. As a result, altered core genes pose a barrier to interspecies recombination (Lan & Reeves, 2001). If these core genes truly do not recombine, we may be able to more concretely define bacterial species based on conserved molecular sequence data from the core genes (e.g., 16S rDNA). The benefits of classifying bacteria by genotype rather than phenotype include the more rapid and precise development of an objective and accurate species concept.
Thus, genotyping may be useful for effectively identifying pathogens, in particular those that grow slowly or those that cannot be cultivated in vitro (Harmsen & Karch, 2004). While sequence databanks facilitate this endeavor, none are perfect. As a result, sequence data might be most effective in distinguishing between species when they are too similar to be differentiated otherwise. For example, in the case of bioterrorist attacks, rapid identification of Bacillus anthracis could prove critical in differentiating it from other similar Bacillus species before rendering appropriate treatment (Sacchi et al., 2002).

This study involves sequencing a portion of the rDNA gene from a subset of Gram-negative enteric bacteria and generating a phylogenetic tree based on the sequence data obtained. I will first provide a brief description of the organisms included in the study.

Serratia marcescens is a rod-shaped facultative anaerobe of the family Enterobacteriaceae. Once considered nonpathogenic, it is now known to cause some opportunistic infections, especially nosocomial infections in immunocompromised patients (Hume et al., 1999). Members of the genus Serratia can be distinguished from others belonging to the Enterobacteriaceae by their production of three special enzymes: DNase, lipase, and gelatinase (http://medic.med.uth.tmc.edu/path/00001521.htm).

Enterobacter aerogenes is associated with nosocomial infections (http://www.diseasesdatabase.com) and is considered to be closely related to Klebsiella pneumoniae, another Gram-negative facultative anaerobe. Klebsiella are also members of the Enterobacteriaceae. Enterobacter taylorae has also caused severe nosocomial infections in several patients (Rubinstein et al., 1993). Citrobacter