Assessing the Feasibility of Genome-Wide DNA Methylation Profiling
Total Page:16
File Type:pdf, Size:1020Kb
bioRxiv preprint doi: https://doi.org/10.1101/2021.08.04.455090; this version posted August 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. Extraction and high-throughput sequencing of oak heartwood DNA: assessing the feasibility of genome-wide DNA methylation profiling Federico Rossi1,13*, Alessandro Crnjar2, Federico Comitani3,4, Rodrigo Feliciano5,6, Leonie Jahn1,7, George Malim2, Laura Southgate8, Emily Kay1,9, Rebecca Oakey1, Richard Buggs8.10, Andy Moir11, Logan Kistler12, Ana Rodriguez Mateos5, Carla Molteni2*, Reiner Schulz1*, 1 King's College London, Department of Medical and Molecular Genetics, London, UK 2 King's College London, Department of Physics, London, UK 3 University College London, Department of Chemistry, London, UK 4 The Hospital for Sick Children, Toronto, Ontario, Canada 5 King's College London, Department of Nutrition, London, UK 6 University of Dusseldorf, Division of Cardiology, Pulmonology and Vascular Medicine, Dusseldorf, Germany 7 Technical University of Denmark, Novo Nordisk Foundation Center for Biosustainability, Kemitorvet, Denmark 8 Royal Botanical Gardens, Department of Natural Capital and Plant Health, Richmond, UK 9 CRUK Beatson Institute, Glasgow, UK 10 Queen Mary University of London, School of Biological and Chemical Sciences, London, UK 11 Tree-Ring Services Limited, Mitcheldean, UK 12 National Museum Of Natural History, Smithsonian Institution, Department of Anthropology, Washington, DC, USA 13 IEO European Institute of Oncology IRCCS, Department of Experimental Oncology, Via Adamello 16, 20139 Milan, Italy * Corresponding author Abstract Tree ring features are affected by environmental factors and therefore are the basis for dendrochronological studies to reconstruct past environmental conditions. Oak wood often provides the data for these studies because of the durability of oak heartwood and hence the availability of samples spanning long time periods of the distant past. Wood formation is regulated in part by epigenetic mechanisms such as DNA methylation. Studies of the methylation state of DNA preserved in oak heartwood thus could identify epigenetic tree ring features informing on past environmental conditions. In this study, we aimed to establish protocols for the extraction of DNA, the high-throughput sequencing of whole-genome DNA libraries (WGS) and the profiling of DNA methylation by whole-genome bisulfite sequencing (WGBS) for oak (Quercus spp.) heartwood drill cores taken from the trunks of living standing trees spanning the AD 1776-2014 time period. Heartwood contains little DNA, and large amounts of phenolic compounds known to hinder the preparation of high-throughput sequencing libraries. Whole-genome and DNA methylome library preparation and sequencing consistently failed for oak heartwood samples more than 100 and 50 years of age, respectively. DNA July 4, 2021 1/22 bioRxiv preprint doi: https://doi.org/10.1101/2021.08.04.455090; this version posted August 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. fragmentation increased with sample age and was exacerbated by the additional bisulfite treatment step during methylome library preparation. Relative coverage of the non-repetitive portion of the oak genome was sparse. These results suggest that quantitative methylome studies of oak hardwood will likely be limited to relatively recent samples and will require a high sequencing depth to achieve sufficient genome coverage. Introduction 1 Heartwood corresponds to the inner layers of wood that do not contain living cells [1]. 2 Anatomical and physiological changes in sapwood, the external layers of wood, lead to 3 the formation of heartwood. Carbon dioxide enrichment, formation of tylosis, 4 accumulation of gums and phenolic compounds have been proposed as factors 5 determining the death of sapwood residual living cells and the formation of 6 heartwood [2]. Oak heartwood in particular has been reported as a viable source of 7 archival DNA [3]. Such DNA would provide an opportunity to expand 8 dendroclimatology inferences on tree response to past environmental conditions by 9 including epigenomic features that may convey additional and/or more specific 10 information compared to the macroscopic features used so far [4]. Specifically, 11 genome-wide DNA methylation profiling followed by epigenomic association studies 12 could identify genomic regions (including genes) where DNA methylation changes 13 systematically relate to variations of a particular environmental factor [5]. Moreover, 14 the characterisation of oak heartwood DNA could be applied into processes that require 15 the identification of timber such as the assessment of the provenance of the wood during 16 the trading process, or the identification of timber used in buildings [6, 7]. However, to 17 what extent DNA methylation profiling by sequencing of oak heartwood DNA is 18 feasible, given the amounts and quality of residual DNA and the presence of 19 enzyme-inhibiting compounds, is unknown. 20 Previous attempts to explain failures of PCR when applied to the amplification of 21 DNA extracted from wood typically have focused on phenolic compounds, which are 22 abundant in heartwood [3, 8, 9]. Phenolic compounds are supposed to irreversibly bind 23 to DNA thus causing amplification impairment through three distinct non-covalent 24 modes: electrostatic interactions, intercalative binding and groove binding [10{13]. 25 DNA polymerase inhibition can be prevented by pre- or post-extraction removal of 26 inhibitors, inactivation of inhibitors and the addition of enzyme activity 27 facilitators [14{19]. Heartwood PCR success is also affected by template length, 28 suggesting DNA fragmentation can impede DNA amplification [3, 20, 21]. Moreover, 29 wood storage conditions and duration can impact DNA amplification by reducing the 30 sample exposure to aerobic conditions and limiting DNA degradation [3, 20, 22]. Despite 31 being challenging, others have reported the successful PCR-amplification of DNA 32 extracted from recent as well as ancient oak heartwood, though only chloroplast DNA 33 could be amplified from ancient samples [3, 23]. 34 In addition to PCR, there have also been reports on the high-throughput sequencing 35 of DNA extracted from ancient wood, specifically waterlogged oak wood (sapwood and 36 heartwood) and pine sapwood [24,25]. The analysis of the wood DNA reads showed 37 features typical of ancient samples such as deamination at overhanging single strands 38 and depurination-driven fragmentation [24, 25]. Despite being authenticated by the 39 presence of features specific to ancient DNA, the low amount of endogenous DNA reads 40 retrieved from oak heartwood renders questionable whether this tissue can be used for 41 genome-wide genetic and epigenetic studies [24]. 42 In this study, we aimed to establish protocols for the extraction of DNA, the 43 high-throughput sequencing of whole-genome DNA libraries (WGS) and the profiling of 44 July 4, 2021 2/22 bioRxiv preprint doi: https://doi.org/10.1101/2021.08.04.455090; this version posted August 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. DNA methylation by whole-genome bisulfite sequencing (WGBS) for oak (Quercus spp.) 45 heartwood drill cores taken from the trunks of living standing trees spanning the AD 46 1776-2014 time period. We optimised the DNA extraction protocol to limit PCR 47 inhibition, maximise DNA yield and extend the range of heartwood sample age from 48 which a DNA library can be obtained. We also studied the interaction of DNA with a 49 potential PCR inhibitor by using molecular dynamics (MD) simulations to elucidate 50 DNA amplification impairment and rationalise how it could be prevented. 51 Materials and methods 52 Wood core sampling and preparation 53 The wood (sapwood plus heartwood) samples analysed in the present study were 54 collected from four oak (Quercus spp.) trees located at the Horsepool Bottom Nature 55 Reserve (Gloucester, England) using a 2-thread increment borer and a core cross-section 56 of an oak tree trunk collected in the Dulwich Woods (London, England) (Tab. 1). 57 ◦ Samples were left to dry for subsequent dendrochronological analysis or stored at -20 C 58 without being dated (Tab. 1). 59 All of the samples were used for the test of DNA extraction. The GLOR drill cores 60 were divided into 5-15 year-long segments using ethanol-wiped scissors. For the cores 61 that were not dated, the division into 5-15 year segments was accomplished using as 62 reference the plot reporting the radial growth over time obtained from cores extracted 63 from the same tree and dated (S1 FigA). Unless stated otherwise, the wood segments 64 were decontaminated either by successive 5 minute applications of a 5% bleach solution, 65 a 60% EtOH solution, and a pure water bath, or by the mechanical removal of the 66 segment surface with a scalpel, dried and homogenised to obtain wood powder. The 67 wood powder was washed with TNE (200 mM Tris, 250 mM NaCl, 50 mM 68 EDTA) [26, 27] and methanol (absolute methanol + 0.05 M CaCl2 + 40 mM DTT) 69 buffers to reduce the content of phenolic compounds of the sample. 70 Sequence Rings + Germination Tree ID Storage Date Range Unmeasured Rings (circa, AD) Dulwich Wood n.d. ca. 50 ca. 1970 RT GLOR01 AD1797-AD2014 216 1781 RT GLOR02 AD1797-AD2014 216 1781 RT, -20 ◦C GLOR03 AD1879-AD2013 135 + hollow n.d. RT GLOR07 AD1792-AD2014 221 1758 RT, -20 ◦C Table 1. Report of dendrochronological analysis performed on oak (Quercus spp.) wood samples collected at Horsepool Bottom Nature Reserve (Gloucestershire, England) and used in the present study.