Phylogeny Based on 16S rRNA/DNA
Secondary article

Erko Stackebrandt, DSMZ-German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany

ENCYCLOPEDIA OF LIFE SCIENCES / © 2001 Nature Publishing Group / www.els.net

Article Contents
Introduction
Semantic Macromolecules: a Basis for Phylogenetic Studies
Sequence Determination, Sequence Alignment and Determination of Sequence Similarities
Recognition of the Higher Taxa of Prokaryotes
Polyphasic Approach to Bacterial Systematics
The Taxonomic Rank ‘Species’ in Bacteriology

Modern systematics of prokaryotes is based on comparative analysis of the evolutionarily conserved genes coding for 16S ribosomal RNA. Dendrograms of phylogenetic relatedness show the order in which organisms evolved in time, thus providing a basis for their classification.

Introduction

In contrast to more highly evolved eukaryotes, in which complex morphologies visibly reflect their evolutionary history, the microscopic and ultrastructural features of microorganisms cannot be used to deduce the way in which the prokaryotes and the morphologically simple eukaryotic forms evolved. Before 1960 taxonomists were unable to appreciate the complexity of microbial systematics and to recognize that groups based on superficial properties alone did not necessarily reflect those which arose due to evolutionary processes. However, at this same time (when most microbiologists resigned themselves to the belief that the true course of prokaryote phylogeny could never be unravelled) concepts were developed that turned out to change our view about the inter-relatedness of living organisms.

Semantic Macromolecules: a Basis for Phylogenetic Studies

Biological molecules can be classified into three categories (semantides, episemantides and asemantides) according to the information they carry (Zuckerkandl and Pauling, 1965). At the highest level are the semantides, which are information-carrying molecules. From the evolutionary origin of the first cell to contemporary cells, information on reproduction, behaviour, survival, maintenance, etc. has been laid down in blueprints and passed in semantides from one generation to the other.

Three semantides have been defined according to their role in the cell. Following the biological dogma of information flow, DNA is the primary semantide, RNA is the secondary semantide, and proteins are the tertiary semantides. As changes in the primary structure of DNA occur by an ongoing process of random mutation and selection, the composition of this molecule is constantly changing and this information is passed through messenger RNA to the proteins. The sequences of these molecules are the historical record of evolution, and the determination of their primary structure provides a powerful means by which evolutionary relationships can be measured. In essence, two organisms possessing a given stretch of semantides which differ in only a few changes (mutations, nucleotide order or amino acid positions) are more closely related to each other than those organisms in which a higher number of changes have accumulated. Thus these molecules can be considered as chronometers. As different genes are subject to different rates of change (same frequency of mutation but different level of manifestation of changes), each cell possesses a variety of chronometers from which different evolutionary events can be determined. Slow running clocks will reflect early evolutionary events such as the separation of the main lines of descent, while fast running clocks will reflect more recent events, such as the more precise separation of genera and species. Comparative analysis of one particular homologous semantide from many organisms will allow the determination of the order in which this gene or protein evolved, thus unravelling its family tree.

Episemantic molecules, which are synthesized under the control of tertiary semantides, include adenosine triphosphate (ATP), carotenoids and chemotaxonomic markers. These molecules are not phylogenetic markers per se, but experience has shown that groups of organisms that form a phylogenetic cluster can, in most cases, be recognized by the chemical composition of markers such as peptidoglycan, lipids, fatty acids, isoprenoid quinones, polyamines and mycolic acids.

Asemantic molecules are molecules that are not produced by the organisms themselves and therefore do not express any of the information that this organism contains. For example, molecules such as exogenously supplied vitamins, phosphate ions, oxygen, viruses, etc. cannot be used in the reconstruction of evolutionary events.

16S rRNA and the gene coding for it are reliable homologous molecules

The most useful molecular chronometers used today for the determination of phylogenetic relationships are the 16S ribosomal (r)RNA genes (rDNA) and their gene products, the 16S rRNAs (Figure 1) (Woese, 1987). In most organisms these genes occur in multiple copies per cell, but very slow-growing species may have only a single copy. The 16S rRNA gene is part of an rrn operon, located at the 5' terminus, followed by the larger 23S rRNA gene and the small 5S rRNA gene. These genes are separated by spacers, some of which contain genes for transfer RNA. During transcription, the pre-rRNA is folded under the influence of the ribosomal proteins into tertiary and quaternary structures. This is the basis for the maturation process, in which the 5' and 3' flanking regions of the rRNA genes are digested by specific enzymes. The mature RNA and the ribosomal proteins are assembled to form the two ribosomal subunits of which each ribosome is composed. The small 30S ribosomal subunit contains the 16S rRNA and 21 proteins, while the larger 50S subunit contains the 23S rRNA, the 5S rRNA and 32 proteins. Fully assembled ribosomes are part of the translation process, in which the genetic information, channelled through the ribosomes via transcribed messenger RNA, is translated into polypeptides and proteins. It can be deduced from the universality of the biological code that such a translation process, although primitive in early evolution, had already been functioning during the early stages of life. Thus, the 16S rRNA genes, as well as the genes coding for other components of the ribosomes, are presumably derived from a common ancestor and are homologous molecules.

Several criteria have been identified which make the 16S rRNA and their genes the most widely studied phylogenetic markers: (1) the function of ribosomes has not changed for about 3.8 billion years, (2) the 16S rDNA genes are universally present among all cellular life forms, (3) their size of about 1540 nucleotides makes them easy to analyse, (4) the primary structure is an alternating sequence of invariant, more or less conserved, and highly variable regions, and (5) lateral gene transfer has not (yet) been observed among organisms. Other homologous molecules have been sequenced, e.g. the 23S rDNA, the 5S rRNA, genes coding for enzymes, and ribosomal proteins, but the database is not nearly as extensive for these molecules as for 16S rDNA, for which about 8000 sequences have been deposited.

Figure 1 Secondary structure of a 16S rRNA molecule based on the E. coli structure (Maidak et al., 1994; available in the public domain Ribosomal Database Project). Highly variable regions are red; highly conserved stretches are green. Binding sites of primers used in PCR amplification of the rRNA gene are blue, with the direction of amplification indicated by arrows. The other nucleotides are black. Primer labels shown in the figure, between the 5' and 3' ends: 27f, 342r, 357f, 519f, 536f, 907r, 926f, 1100r, 1114f, 1392r.

Sequence Determination, Sequence Alignment and Determination of Sequence Similarities

Progress in the elucidation of phylogenetic relationships parallels the development of sequencing methods. Comparative sequence analysis of 16S rRNA was introduced by Carl Woese and collaborators about 20 years ago (Woese and Fox, 1977). The demanding and expensive sequence analysis of T1-generated 16S rRNA oligonucleotides [...] sequence analysis is most conveniently carried out as a linear PCR cycle sequencing reaction. The sequences are then aligned, which means that homologous nucleotides derived from a common position within the ancestral sequence are arranged in columns, and consequently are recognized as being identical or different (Stackebrandt and Rainey, 1995).

Phylogenetic relationships can be assessed by pairwise similarities. One hundred per cent similarity found between a pair of 16S rDNA sequences using different methods indicates very high relatedness, if not identity, of the investigated organisms. The lower the value, the less closely related the compared organisms. If, however, the number of organisms is too large, the respective similarity matrix cannot be interpreted meaningfully. In this case phylogenetic relationships can be visualized graphically by using algorithms that transform the similarity values into dissimilarity values to compensate for superimposed (multiple) substitutions. These phylogenetic distances form the basis for phylogenetic trees or dendrograms. The most widely applied treeing methods are distance methods, but other approaches, such as maximum parsimony and maximum likelihood methods, are frequently used as well. Tree topologies are best tested by comparing the evolution of different phylogenetic markers with a similar degree of sequence conservatism. The results
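As a rough illustration of the two steps described in this section (computing a pairwise similarity from aligned sequences, then transforming it into a distance that compensates for superimposed substitutions), the following Python sketch may help. The article does not name a specific correction; the Jukes-Cantor formula used here is one commonly used choice and is an assumption of this example, as are the function names and the toy sequences.

```python
import math


def pairwise_similarity(seq_a: str, seq_b: str) -> float:
    """Fraction of identical nucleotides over aligned columns.

    Both sequences must already be aligned (same length); columns in
    which either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore alignment gaps
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return identical / compared


def jukes_cantor_distance(similarity: float) -> float:
    """Convert a similarity value into a corrected evolutionary distance.

    Assumption: the Jukes-Cantor model (equal base frequencies, equal
    substitution rates), d = -3/4 * ln(1 - 4/3 * p), with p = 1 - similarity.
    """
    p = 1.0 - similarity
    if p >= 0.75:
        raise ValueError("sequences too divergent for this correction")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


if __name__ == "__main__":
    # Hypothetical, already aligned toy fragments (not real 16S rRNA data).
    a = "ACGUACGUAC"
    b = "ACGUACGAAC"
    s = pairwise_similarity(a, b)
    print("similarity:", s)
    print("corrected distance:", jukes_cantor_distance(s))
```

The corrected distance is always at least as large as the raw dissimilarity (1 minus the similarity), because some observed identities and single differences hide more than one substitution. A matrix of such distances is what distance-based treeing methods take as input.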