The Evolution of Mycobacterial Pathogenicity: Clues from Comparative Genomics
Total Page:16
File Type:pdf, Size:1020Kb
452 Review TRENDS in Microbiology Vol.9 No.9 September 2001 The evolution of mycobacterial pathogenicity: clues from comparative genomics Roland Brosch, Alexander S. Pym, Stephen V.Gordon and Stewart T.Cole Comparative genomics, and related technologies, are helping to unravel the times (1–14 days) and include the principal human molecular basis of the pathogenesis, host range, evolution and phenotypic pathogens, the tubercle and leprosy bacilli. The differences of the slow-growing mycobacteria. In the highly conserved M. tuberculosis complex comprises M. tuberculosis Mycobacterium tuberculosis complex, where single-nucleotide together with Mycobacterium africanum, polymorphisms are rare, insertion and deletion events (InDels) are the principal Mycobacterium bovis, Mycobacterium canettii and source of genome plasticity. InDels result from recombinational or insertion Mycobacterium microti and is defined as a single sequence (IS)-mediated events, expansion of repetitive DNA sequences, or species by DNA–DNA hybridization studies3, with replication errors based on repetitive motifs that remove blocks of genes or exceptionally little sequence variation4,5 (Fig. 1). contract coding sequences. Comparative genomic analyses also suggest that The subspecies can only be distinguished by a loss of genes is part of the ongoing evolution of the slow-growing limited number of phenotypic or, more recently, mycobacterial pathogens and might also explain how the vaccine strain BCG genotypic characteristics, but differ remarkably became attenuated. with respect to their host range and pathogenicity. M. microti, for example, is almost exclusively a Genomics has the potential to provide a complete rodent pathogen and has been used successfully as understanding of the genetics, biochemistry, a live vaccine against TB, whereas M. bovis infects physiology and pathogenesis of microorganisms. a wide variety of mammalian species, including With the genetic blueprint of an organism in hand, humans. BCG (bacille Calmette–Guérin), a researchers should be able to unravel its biology laboratory-attenuated variant of M. bovis, has been rapidly, despite the fact that for many fully used extensively since the 1920s as a vaccine against sequenced genomes no functional information is human TB and, more recently, against leprosy. available for up to 50% of the open reading frames Phylogenetically, the closest relatives of the (ORFs). These knowledge gaps are being filled by M. tuberculosis complex are the environmental extensive data-sets generated by functional and species Mycobacterium marinum and comparative genomics, new disciplines employing Mycobacterium ulcerans (Fig. 1). Whereas powerful techniques encompassing proteomics, M. marinum is principally an ectothermic pathogen bioinformatics, structural biology and microarrays. that rarely causes human infections, M. ulcerans Such approaches are being applied with success to is responsible for a debilitating cutaneous disease the genus Mycobacterium, which includes a wide that is reaching epidemic proportions in parts of variety of organisms of medical, veterinary and West Africa. The Mycobacterium avium complex, environmental importance, notably the human another group of slow-growing mycobacterial pathogens Mycobacterium tuberculosis and subspecies, includes both veterinary and Mycobacterium leprae, which are responsible for opportunistic human pathogens. tuberculosis (TB) and leprosy, respectively. In this review, we will briefly describe the various genomic Methods for genome-wide comparisons Roland Brosch approaches that have been used to study the Physical and integrated maps Alexander S. Pym Stewart T.Cole* mycobacteria, all of which have been catalyzed by the Pulsed-field gel electrophoresis of macro-restriction Unité de Génétique availability of the genome sequence of the H37Rv fragments, the first method enabling whole-genome Moléculaire Bactérienne, strain of M. tuberculosis1, and summarize the comparisons, has been used as a tool for molecular Institut Pasteur, findings to date. epidemiological and population-genetic studies of 28 rue du Dr Roux, 6,7 75724 Paris Cedex 15, the M. tuberculosis complex , and for the France. Slow-growing mycobacteria construction of physical and integrated maps of *e-mail: All mycobacteria possess a complex cell envelope M. tuberculosis8 and M. bovis BCG Pasteur9. The [email protected] that is rich in lipids and glycolipids, and the genus, physical maps were useful for estimating genome Stephen V.Gordon whose phylogeny is based on analysis of 16S rRNA size and structure (e.g. circular chromosome and Veterinary Laboratories sequences (Fig. 1), can be subdivided into fast- or absence of plasmids), and provided a scaffold for Agency, Woodham Lane, 2 New Haw, Addlestone, slow-growing species . The slow-growing establishing integrated maps of ordered cosmid and Surrey, UK KT15 3NB. mycobacteria have remarkably long generation bacterial artificial chromosome (BAC) libraries10,11, http://tim.trends.com 0966-842X/01/$ – see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0966-842X(01)02131-X Review TRENDS in Microbiology Vol.9 No.9 September 2001 453 PE and PPE gene families, which often comprise M. smegmatis Fast-growing mycobacteria multiple tandem repeats. Genetic rearrangements, insertions, inversions and duplications are also Slow-growing mycobacteria difficult to detect using microarrays. M. asiaticum M. gordonae M. canattil H37Rv M. marinum Subtractive hybridization M. tuberculosis CDC1551 M. ulcerans Genomic subtraction, another powerful technique M. africanum Strain 210 M. tuberculosis complex M. microti for whole-genome comparisons, has been used to M. leprae M. szulgai M. bovis compare the genomes of M. bovis and M. bovis BCG M. malmoense BCG Connaught; three regions (RD1–3) have been M. haemophilum identified that were deleted during the attenuation M. gastri/kansasii M. scrofulaceum of M. bovis BCG (Ref. 17). Unlike microarrays, the M. 'paraffinicum' strength of this technique is its ability to identify M. intracellulare regions that are present in some members of a species M. paratuberculosis but absent from the genome of the sequenced strain. M. avium TRENDS in Microbiology Bioinformatics Fig. 1. Phylogenetic tree of selected mycobacteria, based on 16S rRNA sequences2, including those The alignment of complete genome sequences in silico species whose genome sequences are in progress (green) or completed (red). is the ultimate DNA-based comparative strategy. With the growing number of completed genome which is essential for the rapid completion of sequences, and advances in bioinformatics, highly genome sequences1. refined comparisons of sequence variation between A minimal overlapping set of BAC clones two strains are possible using genome-alignment spanning the whole genome can also be used to tools such as MUMmer (Ref. 18) or ACT construct comparative genome-hybridization arrays (http://www.sanger.ac.uk/Software/ACT/). This is by (BAC arrays) or to compare individual BAC clones far the most informative approach and its power is using libraries from different strains. This combined increasing as more sequences become available. At approach identified several deleted regions the time of writing, complete genome sequences exist (RD1–10) in the genome of M. bovis BCG Pasteur for M. tuberculosis and M. leprae1,19 and the genome- compared with M. tuberculosis H37Rv, as well as sequencing projects for M. bovis, M. bovis BCG, loci (RvD1 and RvD2) that were deleted from the M. microti, M. avium, M. paratuberculosis, sequenced reference strain M. tuberculosis H37Rv M. smegmatis and M. ulcerans are at various stages relative to other strains of the M. tuberculosis of completion (Box 1). complex10,11. BAC-to-BAC comparison also allowed the detection of two large tandem duplications Proteomics (DU1 and DU2) in the genome of M. bovis BCG Comparative proteome analysis is based on two- Pasteur12. The construction of BAC libraries from dimensional electrophoresis (2DE) of proteins, mass M. tuberculosis, BCG, M. bovis, M. microti and spectrometry and database comparisons. Jungblut M. ulcerans has provided a basis for further et al. used this technique to compare the proteome of comparative studies, as well as a resource for M. tuberculosis H37Rv with that of BCG (Ref. 20). functional mycobacterial genomics. More recently, 2DE gel comparisons combined with in silico analysis of the genome sequences were Microarrays carried out for M. tuberculosis strains H37Rv and DNA microarrays are also attractive tools for CDC1551 (Ref. 21). Comparative proteomics has comparing genomes between closely related strains or the potential to detect very subtle differences, for species. Behr et al.13 were able to identify 14 regions example changes in abundance or pI, as well as post- (RD1–14) that were absent from BCG Pasteur relative translational modifications such as methylation, to M. tuberculosis H37Rv, and two deletions glycosylation or phosphorylation. (RD15 and 16) specific for particular BCG substrains, by hybridizing whole-genome probes to a microarray Results of genome-wide comparisons of spotted PCR products representing most of the M. tuberculosis isolates ~4000 M. tuberculosis H37Rv ORFs. An The clinical course and pattern of TB is highly oligonucleotide-based Affymetrix GeneChip in variable, ranging from life-long asymptomatic conjunction with powerful informatics can accurately infection to rapidly progressive pulmonary or identify deletions as small as