How to Design a Whole-Genome Bisulfite Sequencing Experiment
Total Page:16
File Type:pdf, Size:1020Kb
epigenomes Review How to Design a Whole-Genome Bisulfite Sequencing Experiment Claudius Grehl 1,2,*, Markus Kuhlmann 3 , Claude Becker 4 , Bruno Glaser 2 and Ivo Grosse 1,5 1 Institute of Computer Science, Martin Luther University Halle-Wittenberg, Von Seckendorff-Platz 1, 06120 Halle (Saale), Germany; [email protected] 2 Institute of Agronomy and Nutritional Sciences, Martin Luther University Halle-Wittenberg, Soil Biogeochemistry, von-Seckendorff-Platz 3, 06120 Halle (Saale), Germany; [email protected] 3 Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, 06466 Gatersleben, Germany; [email protected] 4 Gregor Mendel Institute of Molecular Plant Biology, Austrian Academy of Sciences, Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria; [email protected] 5 German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany * Correspondence: [email protected]; Tel.: +49-(0)345-5527116 Received: 5 November 2018; Accepted: 3 December 2018; Published: 11 December 2018 Abstract: Aside from post-translational histone modifications and small RNA populations, the epigenome of an organism is defined by the level and spectrum of DNA methylation. Methyl groups can be covalently bound to the carbon-5 of cytosines or the carbon-6 of adenine bases. DNA methylation can be found in both prokaryotes and eukaryotes. In the latter, dynamic variation is shown across species, along development, and by cell type. DNA methylation usually leads to a lower binding affinity of DNA-interacting proteins and often results in a lower expression rate of the subsequent genome region, a process also referred to as transcriptional gene silencing. We give an overview of the current state of research facilitating the planning and implementation of whole-genome bisulfite-sequencing (WGBS) experiments. We refrain from discussing alternative methods for DNA methylation analysis, such as reduced representation bisulfite sequencing (rrBS) and methylated DNA immunoprecipitation sequencing (MeDIPSeq), which have value in specific experimental contexts but are generally disadvantageous compared to WGBS. Keywords: WGBS; coverage; library; DNA methylation; 5mC; epigenome; epigenetics 1. Introduction Aside from post-translational histone modifications and small RNA populations, the epigenome of an organism is defined by the level and spectrum of DNA methylation [1]. Cytosine methylation can occur as 5-methylcytosine (5mC) or 5-hydroxymethylcytosine (5hmC) [2]. Its key roles are in embryonic development regulation, genome imprinting, silencing of transposons, cell differentiation, X-chromosome inactivation, and general transcriptional gene regulation [3,4]. The addition of methyl groups onto DNA bases generally represses gene expression and therefore acts as one of several transcription control mechanisms. Whole genome bisulfite-sequencing has become the gold standard to detect DNA methylation patterns because of its single-base resolution and the possibility to cover entire genomes. Other approaches use either pre-selection and single-base resolution [5] or region-based approaches in whole genomes [6,7] and, therefore, do not deliver the full spectrum and detail of DNA methylation patterns. Epigenomes 2018, 2, 21; doi:10.3390/epigenomes2040021 www.mdpi.com/journal/epigenomes Epigenomes 2018, 2, 21 2 of 11 EpigenomesEpigenomes 2018 2018, 4, ,4 x, x 2 of2 11 of 11 Since the development of bisulfite-treated DNA sequencing in 1992 by Frommer et al. [8], SinceSince the the development development of of bisulfite-treated bisulfite-treated DNA DNA se sequencingquencing in in1992 1992 by by Frommer Frommer et al.et al.[8], [8], the the the application of high-throughput sequencing has been especially useful in facilitating the reliable applicationapplication ofof high-throughputhigh-throughput sequencing sequencing has has been been especially especially useful useful in infacilitating facilitating the the reliable reliable detection and analysis of methylation patterns in several organisms, tissues, and cell types. Huge steps detectiondetection and and analysis analysis of of methylation methylation patterns patterns in in se severalveral organisms, organisms, tissues, tissues, and and cell cell types. types. Huge Huge in the field of DNA methylation analysis were the first WGBS of Arabidopsis thaliana [9,10], and the stepssteps in in the the field field of of DNA DNA methylation methylation analysis analysis were were the the first first WGBS WGBS of ofArabidopsis Arabidopsis thaliana thaliana [9,10], [9,10], and and firstthethe humanfirst first human human methylome methylome methylome sequencing sequencing sequencing in 2009in in 2009 2009 by by Lister by List Lister eter et al. etal. [ 11al. [11].]. [11]. During During During the th eth bisulfite bisulfitee bisulfite reactionreaction reaction ofof DNA,of unmethylatedDNA,DNA, unmethylated unmethylated cytosine cytosine cytosine is converted is is converted converted to uracil to to uracil (Figureuracil (Figure (Figure1). This 1). 1). This occurrs This occurrs occurrs via cytosinsulfonate,via via cytosinsulfonate, cytosinsulfonate, a furthera a hydrolyticfurtherfurther hydrolytichydrolytic deamination deamination deamination step to stepuracilsulfonate, step to to uracilsulfonate, uracilsulfonate, and,finally, and, and, finally, afinally, desulfonation a desulfonationa desulfonation step tostep uracilstep to touracil (Figure uracil 1). However,(Figure(Figure 1). 1). 5mC However, However, and 5hmC 5mC 5mC and areand 5hmC inert 5hmC toare are this inert inert chemical to to this this chemical conversion. chemical conversion. conversion. FigureFigure 1. 1.1. Principle PrinciplePrinciple of of the the bisulfite-mediated bisulfite-mediated bisulfite-mediated conversion conversion conversion of ofcytosine ofcytosine cytosine to uracil.to touracil. uracil. TheTheThe subsequentsubsequentsubsequent polymerase polymerasepolymerase chain chain reaction reaction (PCR) (PCR) (PCR) translates translates translates uracil uracil uracil into into into thymine. thymine. thymine. During During During sequencing,sequencing,sequencing, thisthis this base base pair pair pair shift shiftshift causes causes cytosine/thymine-polymorphism, cytosine/thymine-polymorphism, which which which can can be can bequantified bequantified quantified and and and interpretedinterpretedinterpreted asas as proportionproportion proportion ofof of thethe the originaloriginal original methylationmethylation methylation atat at aa specificaspecific specific sitesite site throughthrough through thethe the comparisoncomparison comparison of readsof withreadsreads the with with original the the original original strand strand strand or areference or or a areference reference genome genome genome (Figure (Figure (Figure2). 2). 2). Figure 2. Exemplary DNA double strand with methylated (red) and unmethylated (blue) CpG-site Figure 2. Exemplary DNA double strand with methylated (red) and unmethylated (blue) CpG-site (cytosine-phosphate-guanine-dinucleotide)Figure 2. Exemplary DNA double strand with before methylated and after ( bisulfitered) and applicationunmethylated and (blue polymerase) CpG-site chain (cytosine-phosphate-guanine-dinucleotide) before and after bisulfite application and polymerase reaction(cytosine-phosphate-guanine-dinucl (PCR). Methylated cytosineeotide) is not before affected and by after bisulfite, bisulfite whereas application unmethylated and polymerase cytosine is chain reaction (PCR). Methylated cytosine is not affected by bisulfite, whereas unmethylated cytosine chain reaction (PCR). Methylated cytosine is not affected by bisulfite, whereas unmethylated cytosine convertedis converted to to uracil uracil and and further further on on toto thyminethymine duri duringng PCR PCR in in the the original original top top strand, strand, and and to adenine to adenine in is converted to uracil and further on to thymine during PCR in the original top strand, and to adenine thein the complementary complementary top top strand. strand. in the complementary top strand. AA disadvantagedisadvantage ofof WGBSWGBS isis thatthat bisulfitebisulfite sequencingsequencing cannot distinguish between between 5hmC 5hmC and and 5mC. A disadvantage of WGBS is that bisulfite sequencing cannot distinguish between 5hmC and The5mC. 5hmC The is5hmC described is described as the dynamic as the dynami DNA modificationc DNA modification associated associated with the DNAwith the demethylation DNA 5mC. The 5hmC is described as the dynamic DNA modification associated with the DNA processdemethylation [12] in animals.process [12] This in animals. has to be This taken has to into be accounttaken into for account organisms/cell for organisms/cell types withtypes substantial with demethylation process [12] in animals. This has to be taken into account for organisms/cell types with amountssubstantial of 5hmC,amounts such of as5hmC, the cerebellumsuch as the cells cerebell of Parkinsonum cells of disease Parkinson patients disease [13]. patients In plants, [13]. 5hmC In is of minorplants,substantial relevance 5hmC amounts is of [14 minor], asof itrelevance5hmC, occurs such only [14], as inas the veryit occurs cerebell low only amountsum in cells very and of low Parkinson may amounts merely anddisease serve may patients asmerely an intermediate serve[13]. In plants, 5hmC is of minor relevance [14], as it occurs only in very low amounts and