Principles and Challenges of Genome-Wide DNA Methylation
Total Page:16
File Type:pdf, Size:1020Kb
REVIEWS APPLICATIONS OF NEXT-GENERATION SEQUENCING Principles and challenges of genome- wide DNA methylation analysis Peter W. Laird Abstract | Methylation of cytosine bases in DNA provides a layer of epigenetic control in many eukaryotes that has important implications for normal biology and disease. Therefore, profiling DNA methylation across the genome is vital to understanding the influence of epigenetics. There has been a revolution in DNA methylation analysis technology over the past decade: analyses that previously were restricted to specific loci can now be performed on a genome-scale and entire methylomes can be characterized at single-base-pair resolution. However, there is such a diversity of DNA methylation profiling techniques that it can be challenging to select one. This Review discusses the different approaches and their relative merits and introduces considerations for data analysis. Transposons The phenotype of a cell is primarily determined by its DNA methylation information is erased by standard Mobile DNA elements that can expression profile and its response to environmental molecular biology techniques, such as cloning in bacteria relocate within the genome of cues. Epigenetics provides stability and diversity to the and PCR, and it is not revealed by hybridization as the their hosts. cellular phenotype through chromatin marks that affect methyl group is located in the major groove of DNA rather Restriction–modification local transcriptional potential and that are preserved or than at the hydrogen bonds. Therefore, methylation- system regenerated during cell division. Methylation of DNA dependent pretreatments of DNA were developed to A set of enzymes found in many cytosine residues at the carbon 5 position (5meC) is a reveal the presence or absence of the methyl group at bacteria and archaea that common epigenetic mark in many eukaryotes and is cytosine residues. The techniques based on these pre- protects the host genome from often found in the sequence context CpG or CpHpG treatments were initially restricted to relatively local- genomic parasites. Restriction– modification systems consist of (H = A, T, C). When located at gene promoters, DNA ized regions of the genome, but many have now been sequence-specific restriction methylation is usually a repressive mark. However, CpG scaled up to enable DNA methylation analysis on a endonucleases, which target DNA methylation is increased in the gene bodies of genome-wide scale. invading DNA, and associated actively transcribed genes in plants1–3 and mammals4–6. Although initial forays into broad DNA methylation DNA methyltransferases with similar recognition sequences, Plant CpHpG methylation is found in non-expressed profiling were made with two-dimensional gel electro- 7 me 8–13 which protect the host transposons . Bacteria and archaea also have 5 C, along phoresis , the era of epigenomics truly took off with the genome from the action of with N-4-methylcytosine and N-6-methyladenine, and adaptation of microarray hybridization techniques from the endonucleases. these modified bases participate in restriction–modification the gene expression and genomic fields to the profiling of systems and mismatch repair strand discrimination, histone modifications and DNA methylation patterns14–21. Mismatch repair A DNA-repair pathway that among other roles. The current revolution in sequencing technology has removes mismatched bases DNA methylation is laid down by dedicated DNA recently opened the door to single-base-pair resolution and corrects the insertion or methyltransferases with highly conserved catalytic whole-genome DNA methylation analysis22–24. Many deletion of short stretches of motifs. In eukaryotes, usually only a subset of potential of the high-resolution genome-wide DNA methylation (repeated) DNA. target sequences in the genome are methylated, therefore profiling techniques were pioneered in model organisms the distribution of methyl marks can convey epigenetic with small genomes, such as Arabidopsis thaliana22,24, but USC Epigenome Center, University of Southern information by demarcating regions of transcriptional are now being applied to organisms with larger genomes, 6,23,25,26 California, Keck School of silence or transcriptional potential. The delineation including mammals . These recent developments Medicine, 1450 Biggy Street, of regional DNA methylation patterns, and broader in the rapidly evolving panoply of DNA methylation Room G511B, Los Angeles, DNA methylation profiles, has important implications analysis techniques demand an updated synthesis of the California 90089-9601, USA. for understanding why certain regions of the genome main principles and an overview of the advantages and e-mail: [email protected] doi:10.1038/nrg2732 can be expressed in specific developmental contexts disadvantages of the various approaches. Published online and how epigenetic changes might enable aberrant This Review covers the general principles of DNA 2 February 2010 expression patterns and disease. methylation analysis, with a particular emphasis on NATURE REVIEWS | GENETICS VOLUME 11 | MARCH 2010 | 191 © 2010 Macmillan Publishers Limited. All rights reserved REVIEWS CpG islands genome-scale DNA methylation analysis technologies. PCR reaction. This approach would require a thermosta- In eukaryotic genomes, regions The relative merits of the different techniques are dis- ble DNA methyltransferase with very high efficiency of several hundred base pairs cussed, and bioinformatic challenges that are unique and maintenance fidelity and a complete lack of de novo that are not depleted of to DNA methylation analysis are outlined. Although methyltransferase activity, and so has not been realized CpGs by 5-methylcytosine deamination owing to them some of the methods covered in this Review rely on to date. Therefore, almost all sequence-specific DNA being unmethylated in the species-specific arrays, most of the principles could methylation analysis techniques rely on a methylation- germ line. They often overlap be applied to any organism with 5meC, including bac- dependent treatment of the DNA before amplification transcription start sites. Most teria and archaea. Some of the methods described in or hybridization29–32. There are three main approaches: definitions of CpG islands set a this Review may be applicable to 5-hydroxymethylcy- endonuclease digestion, affinity enrichment and bisul- minimum length (for example, 200 or 500 bp), a minimum tosine, which was recently confirmed to be present in phite conversion. After genomic DNA has been treated 27,28 observed:expected CpG ratio mammalian cells . with one of the methylation-dependent steps, various (for example, greater than molecular biology techniques, including probe hybridi- 0.6 or 0.65) and a minimum Distribution and detection of DNA methylation zation and sequencing, can be used to reveal the loca- GC content (for example, me 50% or 55%). As DNA samples are usually derived from a collection tion of the 5 C residues. The combination of different of cells, which may vary in their DNA methylation types of pretreatment followed by different analytical patterns, the distribution of 5meC in a DNA sample is steps has resulted in a plethora of techniques for deter- complex. Measurements can be made either of the pat- mining DNA methylation patterns and profiles29,32–42. In tern of methylated target sequences along individual the following sections I discuss techniques based on the DNA molecules or as an average methylation level at three main approaches, and a summary is provided in a single genomic locus across many DNA molecules29. TABLE 1. Expression profiling of cells treated with DNA Analysis of DNA methylation is further complicated by methyltransferase inhibitors and/or histone deacetylase the uneven distribution of methylation target sequences, inhibitors has also been used as a discovery tool for epi- such as CpG, across the genome. As a consequence of genetically silenced genes. However, it is prone to false- the frequent mutation of 5meC to thymine, these targets positive and false-negative results and is not considered are depleted throughout most of the genome but are to be a reliable gauge of DNA methylation at a given maintained in specific regions, such as CpG islands. This locus, and is therefore not discussed in this Review. non-uniform distribution is an important considera- tion in DNA methylation analysis, as discussed below. Endonuclease digestion As noted above, 5meC is not readily distinguished from Restriction endonucleases are such powerful tools in unmethylated cytosine by hybridization-based methods molecular biology that their biological role in restric- and, as DNA methyltransferases are not present during tion–modification systems in bacteria and archaea is PCR or in biological cloning systems, DNA methyla- sometimes overlooked. Each sequence-specific restric- tion information is erased during amplification. Some tion enzyme has an accompanying DNA methyl- investigators have suggested that it could be feasible to transferase that protects the endogenous DNA from maintain the pattern of methylation during PCR if an the restriction defence system by methylating bases in the appropriate DNA methyltransferase were present in the recognition site. Some restriction enzymes are inhibited Table 1 | Main principles of DNA methylation analysis Pretreatment Analytical step Locus-specific analysis Gel-based analysis Array-based analysis NGS-based analysis Enzyme • HpaII-PCR • Southern blot • DMH