Deciphering Bacterial Epigenomes Using Modern Sequencing Technologies
REVIEWS

John Beaulaurier, Eric E. Schadt and Gang Fang*

Abstract | Prokaryotic DNA contains three types of methylation: N6-methyladenine, N4-methylcytosine and 5-methylcytosine. The lack of tools to analyse the frequency and distribution of methylated residues in bacterial genomes has prevented a full understanding of their functions. Now, advances in DNA sequencing technology, including single-molecule, real-time sequencing and nanopore-based sequencing, have provided new opportunities for systematic detection of all three forms of methylated DNA at a genome-wide scale and offer unprecedented opportunities for achieving a more complete understanding of bacterial epigenomes. Indeed, as the number of mapped bacterial methylomes approaches 2,000, increasing evidence supports roles for methylation in regulation of gene expression, virulence and pathogen–host interactions.

Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
*e-mail: gang.fang@mssm.edu
https://doi.org/10.1038/s41576-018-0081-3
NATURE REVIEWS | GENETICS VOLUME 20 | MARCH 2019 | 157

DNA methylation was discovered in bacteria more than a half century ago1. It is now known that modification of the four canonical DNA bases by methylation can act as an epigenetic regulator; that is, it can impart distinct and reversible regulatory states to identical genetic sequences. In eukaryotes, epigenetic regulation can occur at multiple levels: DNA methylation, nucleosome positioning, histone variants and histone modifications. By contrast, bacteria lack histones and nucleosomes; therefore, DNA methylation is their primary means of epigenetic gene regulation.

Three different forms of DNA methylation exist in bacterial genomes: N6-methyladenine (6mA), which is the most prevalent form; N4-methylcytosine (4mC); and 5-methylcytosine (5mC). Although 5mC is the dominant form in eukaryotes, 6mA is the most prevalent form in prokaryotes. DNA is methylated by methyltransferase (MTase) enzymes, which transfer a methyl group from S-adenosyl-L-methionine (SAM) to the appropriate position on target bases (Fig. 1). Importantly, only a select few sequence motifs in each bacterial genome are targeted by MTases; for example, in Escherichia coli, 5ʹ-GATC-3ʹ is targeted by DNA adenine methylase (Dam) and 5ʹ-CCWGG-3ʹ by DNA-cytosine methyltransferase (Dcm). However, nearly every occurrence of the target motifs is methylated2. The MTase specificity domain that determines the target motif varies widely across species, resulting in a large diversity of methylated motifs across the bacterial kingdom.

MTases function either alongside a cognate restriction enzyme as part of a restriction–modification (RM) system or as ‘orphans’ that lack a cognate restriction enzyme. DNA methylation mediated by both types of MTases has been found to play important regulatory roles in bacteria2–12. RM systems protect cells from invading DNA by methylating endogenous DNA and cleaving non-methylated foreign DNA2,4. RM systems are divided into three main categories based on the subunits involved and the precise site of DNA restriction13–16 (Fig. 2). Orphan MTases, such as Dam in Gammaproteobacteria, are thought to regulate DNA replication and gene expression, among other functions2. There is also emerging evidence that heterogeneity in methylation patterns within bacterial populations (often caused by phase variation of MTases) can promote adaptive selection by generating heterogeneity in gene expression and cellular phenotypes beyond those provided by genetic variation alone17,18. The vast majority of the >6,000 bacterial genomes sequenced to date have been found to encode MTases and are, therefore, likely to be subject to DNA methylation19,20. Nonetheless, the precise sequence targets and biological roles of most MTases remain unknown21, largely owing to the historical lack of high-throughput tools for detecting 6mA and 4mC. Indeed, while method development for detecting eukaryotic 5mC has flourished over the past few decades22–25, only modest methodological advances for detecting the principal forms of DNA methylation in bacterial genomes have been made over the same period.

The recent introduction of new sequencing technologies is beginning to address this problem (Table 1). In particular, single-molecule, real-time (SMRT) sequencing has enabled all three major forms of bacterial DNA methylation to be detected simultaneously for the first time, and this technology has been used to generate most of the >2,000 mapped bacterial and archaeal methylomes3,20,26–31 that are currently available in the centralized REBASE database19. These methylomes represent a wide variety of isolates from more than 750 distinct species, including common human pathogens such as Salmonella enterica (n = 150), E. coli (n = 123), Klebsiella pneumoniae (n = 93) and Staphylococcus aureus (n = 47).

Here, we review the currently available methods for mapping bacterial methylomes, with a focus on cutting-edge technologies such as SMRT sequencing and Oxford Nanopore sequencing32. We discuss their potential to provide us with fully characterized methylomes that contain not only the full set of methylated positions and targeted motifs but also a complete map of the MTases and RM systems responsible for each methylated motif. We provide an overview of the insights into bacterial epigenomes that these new technologies […] history and foundations of bacterial epigenetics, which have been thoroughly reviewed elsewhere2,4,21,33,34.

[Figure 1: cytosine is converted to 4mC or 5mC, and adenine to 6mA, by the corresponding MTase, with SAM converted to SAH in each reaction. SAM, S-adenosyl-L-methionine; SAH, S-adenosyl-homocysteine.]
Fig. 1 | Primary types of DNA methylation in bacteria. Chemical structures are shown for the most common forms of DNA methylation in bacteria, including N4-methylcytosine (4mC), 5-methylcytosine (5mC) and N6-methyladenine (6mA). In each instance, a methyltransferase (MTase) transfers a methyl group (CH3) from S-adenosyl-L-methionine to the unmodified nucleotide, producing a methylated nucleotide and S-adenosyl-homocysteine.

Early methods for mapping methylomes

The bulk of methodological development for DNA methylation detection has historically been devoted to characterizing 5mC in higher eukaryotes, largely because the biological importance of 5mC in mammalian cells has been recognized for more than half a century35–37. However, characterization of bacterial methylomes requires alternative methods that can detect the more prevalent 6mA and 4mC in addition to 5mC. A number of different approaches have traditionally been used to characterize DNA methylation in bacterial genomes (Table 1).

Restriction enzyme-based mapping. Prokaryotic MTases are known to primarily target specific sequence motifs for methylation. The genome-wide methylation status of these motifs can often be deduced by digesting genomic DNA with one or more methyl-sensitive restriction enzymes of known specificities and analysing the pattern of cut and uncut restriction sites by next-generation sequencing (NGS)38,39. This approach is robust, reliable and accurate but is limited to the study of methylation motifs that perfectly or partially match the known specificities of available restriction enzymes. Thus, although still useful for assessing methylation events within known sequence motifs, it is generally not suitable for discovering new motifs.

Sanger sequencing-based mapping. Theoretically, the most common forms of bacterial DNA methylation can be detected as a by-product of Sanger sequencing because the presence of 4mC, 5mC and 6mA in the DNA template affects the amplitude of peaks in the sequencing trace. Although several studies have used this method to investigate the methylomes of pathogenic bacteria40–44, technical limitations, including subtle peak signatures and the low throughput of Sanger sequencing, have prevented it from achieving wider usage32.

Bisulfite sequencing-based mapping. Despite its reputation as the gold standard for characterizing 5mC in eukaryotic genomes (owing to its high sensitivity, accuracy and compatibility with NGS technologies), bisulfite sequencing has only quite recently been applied to the study of 5mC in bacteria45,46. More recently, it has been shown that treating genomic DNA with ten–eleven translocation (TET) enzymes before bisulfite treatment makes it possible to characterize both 5mC and 4mC bacterial methylomes47. Before bisulfite treatment, TET enzymes oxidize 5mC to 5-carboxylcytosine (5caC), which is subsequently read as a thymine in bisulfite sequencing data. By using a combination of standard bisulfite sequencing and 4mC-TET-assisted bisulfite sequencing (4mC-TAB-seq), it becomes pos-

Phase variation
A means by which reversible variation of protein expression is achieved in bacteria, often in an ON/OFF manner. The process creates phenotypic diversity in a clonally expanded population and allows the colony to survive in rapidly changing environments.

Adaptive selection
An evolutionary process through which surviving organisms accumulate genetic changes that lead to a fitness advantage over their progenitors.

Methylomes
The entirety of DNA methylation marks across genomes.

Bisulfite sequencing
The treatment of DNA with bisulfite chemically converts unmethylated