F1000Research 2021, 10:204 Last updated: 15 SEP 2021

RESEARCH ARTICLE

Genome-wide regulation of CpG methylation by ecCEBP

α in acute myeloid leukemia [version 2; peer review: 2 approved]

Adewale J. Ogunleye 1, Ekaterina Romanova2, Yulia A. Medvedeva1,2

1Department of Biological and Medical Physics, Moscow Institute of Physics and Technology, Moscow, Russian Federation 2Research Center of Biotechnology, Institute of Bioengineering, Russian Academy of Sciences, Moscow, Russian Federation

v2 First published: 11 Mar 2021, 10:204 Open Peer Review https://doi.org/10.12688/f1000research.28146.1 Latest published: 31 Aug 2021, 10:204 https://doi.org/10.12688/f1000research.28146.2 Reviewer Status

Invited Reviewers Abstract Background: Acute myeloid leukemia (AML) is a hematopoietic 1 2 malignancy characterized by genetic and epigenetic aberrations that alter the differentiation capacity of myeloid progenitor cells. The version 2 transcription factor CEBPα is frequently mutated in AML patients (revision) report leading to an increase in DNA methylation in many genomic locations. 31 Aug 2021 Previously, it has been shown that ecCEBPα (extra coding CEBPα) - a lncRNA transcribed in the same direction as CEBPα - regulates version 1 DNA methylation of CEBPα promoter in cis. Here, we hypothesize that 11 Mar 2021 report report ecCEBPα could participate in the regulation of DNA methylation in trans. Method: First, we retrieved the methylation profile of AML patients 1. Amrita Singh , CSIR Institute of Genomics with mutated CEBPα from The Cancer Genome Atlas (TCGA). We and Integrative Biology (CSIR-IGIB), New then predicted the ecCEBPα secondary structure in order to check the Delhi, India potential of ecCEBPα to form triplexes around CpG loci and checked if triplex formation influenced CpG methylation, genome-wide. 2. Oleg N. Demidov , University of Results: Using DNA methylation profiles of AML patients with a mutated CEBPα locus, we show that ecCEBPα could interact with DNA Bourgogne-Franche Comté, Dijon, France by forming DNA:RNA triple helices and protect regions near its Any reports and responses or comments on the binding sites from global DNA methylation. Further analysis revealed that triplex-forming oligonucleotides in ecCEBPα are structurally article can be found at the end of the article. unpaired supporting the DNA-binding potential of these regions. ecCEBPα triplexes supported with the RNA-chromatin co-localization data are located in the promoters of leukemia-linked transcriptional factors such as MLF2. Discussion: Overall, these results suggest a novel regulatory mechanism for ecCEBPα as a genome-wide epigenetic modulator through triple-helix formation which may provide a foundation for sequence-specific engineering of RNA for regulating methylation of specific .

Page 1 of 16 F1000Research 2021, 10:204 Last updated: 15 SEP 2021

Keywords Acute myeloid leukemia, Triplex, DNA methylation, long non coding RNA, extra-coding CEBPα

Corresponding author: Adewale J. Ogunleye ([email protected]) Author roles: Ogunleye AJ: Formal Analysis, Investigation, Methodology, Validation, Visualization, Writing – Original Draft Preparation, Writing – Review & Editing; Romanova E: Conceptualization, Data Curation, Supervision; Medvedeva YA: Conceptualization, Funding Acquisition, Methodology, Resources, Software, Supervision, Validation, Visualization, Writing – Review & Editing Competing interests: No competing interests were disclosed. Grant information: The work has been partially supported by Russian Science Foundation (grant 18-14-00240) to YAM and the Ministry of the Sciences and Higher Education of the Russian Federation. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Copyright: © 2021 Ogunleye AJ et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. How to cite this article: Ogunleye AJ, Romanova E and Medvedeva YA. Genome-wide regulation of CpG methylation by ecCEBPα in acute myeloid leukemia [version 2; peer review: 2 approved] F1000Research 2021, 10:204 https://doi.org/10.12688/f1000research.28146.2 First published: 11 Mar 2021, 10:204 https://doi.org/10.12688/f1000research.28146.1

Page 2 of 16 F1000Research 2021, 10:204 Last updated: 15 SEP 2021

mutations and chromosomal alterations are crucial for initiating REVISED Amendments from Version 1 AML, they are not sufficient to explain AML progression, 8 Figure 1b was updated. (Details: This was done to clarify heterogeneity, and relapse . the abundance of hypermethylated sites versus non- hypermethylated sites.) Recently, studies have identified the role of non-coding Figure 1c was updated. (details: The x-axis was confusing and RNAs, especially long non-coding RNAs (lncRNA) in the ini- 9–11 unclear, so we had to provide more explicit labels.) tiation and progression of cancers . LncRNAs are emerging functional transcriptional products of at least 200 nucleotides A new image was added to Figure 1 as Figure 1d. (details: Using external data, we showed the relationship between ecCEBPA lacking an open reading frame. Although they account for a and CpG methylation on the predicted binding and non-binding large proportion of transcriptional products in mammals (about sites.) 58,000 loci)12, only a small number of lncRNAs have been Since Figure 1d was not present before now, the previous well characterized. Although lncRNAs are mostly not con- Figure 1d and Figure 1e were updated to Figure 1E and served evolutionarily, they are heavily regulated suggesting Figure 1F respectively. their functional role12,13. They may function either as signal Figure 2a was updated: (details: we provided annotations on the transducers, guides, or molecular scaffolds to regulate image to show the single-stranded region that was rich in triplex- transcriptional and epigenetic events14–16. Some lncRNAs per- forming oligonucleotides.) form these functions in cis- by modulating transcription of A new image was added to Figure 2 as Figure 2c. (Details: nearby genes (Dum)17 or act in trans-, by modulating genes Through this, we showed the methylation difference in only the at multiple distant loci (MALAT1)18, while some can do both experimentally validated ecCEPBA contacts.) (HOTAIR)9,19. Since Figure 2c was not present before now, the previous Figure 2c and Figure 2d were updated to Figure 2D and Recent studies have identified lncRNAs for their remark- Figure 2E respectively. able role in regulating major epigenetic processes such as DNA We also provided a table of some manually curated functional methylation and chromatin remodeling. DNA methylation in annotations for genes that are close to the experimentally mammals is coordinated by one of the three DNA methyl- validated CpGs transferases (DNMT); DNMT1, DNMT3A, and DNMT3B7,17,20. Discussion: LncRNAs have been identified in recent studies as important We discussed the new findings presented inFigure 1d in the agents that can modulate DNA methylation, either by activating or context of Di Ruscio’s paper on ecCEBPA’s expression and repressing DNMTs. For example, the lncRNA Dum was dis- demethylation activities. covered to repress a nearby gene Dppa2 by recruiting multiple Underlying Data: DNMTs leading to methylation of a promoter region, thus pro- We provided the link to ENCODE data showing the normal moting myoblast differentiation17. Conversely, the lncRNA H19 methylome of HL-60 cells using RRBS. represses the activity of DNMT3B by interacting with SAHH Any further responses from the reviewers can be found at which hydrolyzes SAH, a step required for DNMT3B activation21. the end of the article ecCEBPα, which is transcribed from the CEBPα locus, directly blocks DNMT1 to prevent methylation of proximal and distal located promoters, thus promoting CEBPα-mediated granulocyte differentiation20. Mechanistic studies via reduced Introduction representation bisulfite sequencing and RNA immunoprecipita- Acute myeloid leukemia (AML) is a malignant tumor charac- tion sequencing shows that ecCEBPα suppresses DNA meth- 1,2 terized by the proliferation of undifferentiated myeloblasts . ylation in cis- by acting as a shield that sequesters DNMT1 It is the most prevalent form of leukemia in older adults from the CEBPα promoter. We speculate that ecCEBPα could (>60 years) with an annual mortality rate of 50% and a 5-year regulate DNMT1 activities in distant DNA regions (in trans-) 2,3 survival rate of 24% . With the combined effects of the global as well. The mechanism of this potential interaction is to be increase in average life expectancy and AML drug inefficiency, determined. the number of patients is expected to significantly increase 4,5 in the coming years . Methods ecCEBPα sequence The current understanding of the molecular interplay in AML In the recent version of GENCODE, ecCEBPα is not anno- has been defined under two distinct categories; (i) genetic tated most likely due to an overlap with a protein-coding gene abnormalities and (ii) non-random chromosomal rearrangements. CEBPα on the same strand. We retrieved the complete ecCEBPα Cases of AML with chromosomal rearrangements as t(15;17) sequence from the (hg19, chr19: 33298573- [PML-RARA], t(9;22) [BCR-ABL], inv(16) [CBFB-MYH11], 33303358) based on information reported in the work of t(8;21) [RUNX1-ETO] are called cytogenetically abnormal Di Ruscio et al.20. ecCEBPα is approximately 4.8kb and it over- (CA-AML), while cases with genetic abnormalities (including laps with the intronless CEBPα gene (~2.6kb) on the same frequent mutations in DNMT3A, NPM1, CEBPα, IDH1/2, TET2, strand. ecCEBPα does not share either the same transcription FLT3-ITD) are called cytogenetically normal (CN-AML)1,4. start site (TSS) or transcription end site (TES) with CEBPα, start- The former accounts for 50–55%, while the latter accounts for ing ~0.89kb upstream and ending ~1.46kb downstream of the 45–50% of diagnosed AML cases6,7. Even though these CEBPα gene.

Page 3 of 16 F1000Research 2021, 10:204 Last updated: 15 SEP 2021

DNA methylation data processing (GSM3478205). The iMARGI dataset was mapped to the CpG methylation (Illumina 450K array) and CEBPα muta- hg38 genome assembly. We used UCSC Liftover to convert tion data for 186 AML patients were retrieved from the Cancer ecCEBPα sequence coordinates from hg19 to hg38 sequences28. Genome Atlas (TCGA: http://firebrowse.org). CpG methyla- We expanded the DNA coordinates of CpGs by 3.0kb nucle- tion levels were measured in 307796 unique loci. We split all the otides upstream and downstream. IntersectBed from BEDTools AML patients into two groups based on CEBPα mutation status was used to check the co-location of predicted triplexes and (13 patients with a CEBPα mutation and 173 patients without a experimentally validated interactions of ecCEBPα29. Fisher’s mutation). We classified a CpG position as hypermethylated exact test was calculated for the number of confirmed ecCEBPα (HM) in patients with a CEBPα locus mutation if DNA meth- interactions between TDF and iMARGI data. ylation level was significantly increased in the case ofa CEBPα mutation (t-test, FDR ≤0.05 and Δ-value ≥0.1) and all GO enrichment analysis non-hypermethylated (NHM) CpG in the case of a CEBPα Finally, since Illumina 450K array probes are located close mutation were classified as non-hypermethylated CpGs (t-test, to genes, we performed functional enrichment using BiNGO FDR >0.05 and |Δ-value| <0.1). As a result, we obtained 11955 (v 3.0.3) (binomial test)30 to infer the biological significance HM and 261433 NHM CpGs. of the genes potentially affected by ecCEBPα binding.

Secondary structure and triplex prediction All statistical analyses were performed using R 4.0 or SciPy As suggested in a previous study22, unpaired RNA nucleotides v1.5.1 library. Visualization was done in Cytoscape 3.2.031 and are more likely to form triplexes with DNA. We predicted Python 3.7. Code is available at https://zenodo.org/record/ RNA secondary structure using RNAplfold (V 2.4.14), from 438525932. the Vienna suite using a cut-off for pairing probability (-c) of 0.123,24. To search for potential interactions between ecCEBPα Results and DNA target regions we used only unpaired nucleotides, ecCEBPα forms triplexes with promoter regions and while the nucleotides predicted to pair were replaced with ‘N’. affects promoter methylation The current study explores the potential of the lncRNA ecCEBPα DNA target regions were defined as 100 nucleotides centered at in the modulation of global CpG methylation status in trans each CpG. To predict ecCEBPα triplex formation with DNA via direct interaction with DNA regions. ecCEBPα (extra cod- target regions, we used Triplexator (V 1.3.2)25, since it has ing CEBPα), reported in work by Di Russio et al.20, is located higher accuracy of prediction14, with the following optimization on 19 and transcribed from the CEBPα locus parameters suggested in 22: minimum length = 10 nucleotides, (Figure 1a). error rate = 20%, G-C content = 70%, and filter-repeats = off. Using these parameters out of 307796 unique CpG loci, we Mutations in CEBPα locus are a common feature of AML lead- predicted 272131 loci with at least one triplex and 35715 with- ing to whole genome hypermethylation (Figure 1b). Since out any. Among them, 10351 and 222105 potential triplex TCGA is focused on protein-coding genes, all reported muta- targets were predicted in the HM and NHM regions respectively. tions are located within the CEBPα gene and could affect both CEBPα and ecCEBPα. To investigate if ecCEBPα could affect To estimate the statistical significance of predicted triplexes DNA methylation in trans, first we checked if it is capable of we used Triplex domain finder (TDF v 0.12.3), which clusters interaction via forming triple helices (triplexes) with its bind- RNA triplex-forming oligonucleotide (TFO) into DNA binding ing sites and if such interactions affect DNA methylation. domains (DBD)26. Briefly, all 272131 CpG loci with at least We observed that regions capable of forming triplexes with one predicted triplex were taken as input target regions. By pre- ecCEBPα remain protected from global DNA hypermethyla- dicting triplexes in the background regions, TDF is capable of tion observed in case of a mutation in a CEBPα locus (Figure 1c, estimating the statistical significance of ecCEBPα binding Fisher’s exact test, p-value <0.001). Furthermore, the overall between target regions and other non-target CpGs regions. Since methylation profile of HL-60 cells with overexpressed ecCEBPα TDF allows only to mask regions in the genomic background (Wang et al.) also have a protective effect on ecCEBPA’s bind- rather than to select the background explicitly we had to prepare ing sites in comparison to the rest of the genomic CpGs a special mask for the non-target regions. To do so we removed (Figure 1d, Fisher Test: p-val = 1.24E-09). This result sug- 35715 CpG loci with zero triplex predictions from the human gests a negative relationship between DNMT access to promoter genome using BEDtools subtract (BEDTools v2.29.2). TDF sites and ecCEBPα binding. was implemented with a minimum triplex length (-l) of 10 nucleotides, an error rate (-e) of 20%, and (-f) to mask ecCEBPα binding is not affected by the mutation in the background loci in 100 random samplings (-n). CEBPα locus To investigate deeper the potential of ecCEBPα to form triplexes RNA:chromatin colocalization analysis we used Triplex Domain Finder (TDF) - a triplex prediction To validate the predicted interactions we used RNA:chro- tool that refines the resolution of predicted TFOs in RNA into matin interactome obtained with iMARGI method capturing DNA binding domains (DBD) and calculates the significance chromatin-associated RNA (caRNA) and their genomic of the number of predicted triplexes for each DBD. Overall, interaction loci27. The data was downloaded from GEO 17 significant DBDs were identified within ecCEBPα, interspaced

Page 4 of 16 F1000Research 2021, 10:204 Last updated: 15 SEP 2021

Figure 1. (a) Schematic diagram for transcriptional products in the CEBPα locus; CEBPα is represented by an orange arrow and ecCEBPα is represented by the blue line. (b) Global change in DNA methylation. (c) Difference in DNA methylation levels between patients with and without CEBPα mutation in the regions of ecCEBPα predicted binding (non parametric t-test, p-value >1E-10). (d) Number of DNA Triplex Target Site (y-axis) and a location of the corresponding TFO on the ecCEBPα (TFO: RNA; x-axis). (e) Number of Triplex Target sites per TFO predicted for NHM and HM CpG regions.

between sequences 4086 and 4968 towards the 3’ end of ecCEBPα interacts with predicted binding sites the lncRNA (Figure 1e). These DBDs form triplexes with Since we use relatively relaxed thresholds for triplex the majority of regions protected from hypermethylation in prediction, we decided to validate the predicted RNA:DNA tri- patients with a CEBPα mutation (Figure 1f, Extended data: plexes using experimental data obtained with the iMARGI Supplementary Table 1). The DBD region is located down- method, allowing detection of RNA-chromatin interactions. We stream from the CEBPα gene suggesting that ecCEBPα binding identified 157 ecCEBPα contacts within the iMARGI dataset region is not affected by the mutation in the CEBPα locus. The and 29 of them contained predicted triplexes. Altogether, these predicted secondary structure of the ecCEBPα sequence showed 29 iMARGI interactions were made up of 182 predicted TTS that more than 95% of sequence positions from 4087-4987 (Fisher’s exact test, p-value < 2.2E-16) located in cis and (~0.5kb from CEBPα TES) (Figure 1e) were unpaired and in trans on 14 (Figure 2b). Chromosomes 19 potentially capable of forming triple helices with the target (the native chromosome for ecCEBPα) are accounted for by all DNA region (Figure 2a). predictions. Since these ecCEBPA contacts have been

Page 5 of 16 F1000Research 2021, 10:204 Last updated: 15 SEP 2021

Figure 2. (a) Predicted secondary structure of ecCEBPα. Nucleotide color represents base pairing probabilities as predicted by RNAplfold. (b) Experimentally validated ecCEBPα triplexes per chromosome. The x-axis represents the chromosome and the y-axis represents the triplex potential relative to TTS. Blue points represent all TDF predictions that are present in the iMARGI dataset. (c) Functional enrichment of genes located nearby CpG with predicted triplexes. (d) Schematic representation of ecCEBPα:DNA interactions in trans and its implication on DNA methylation. The presence of ecCEBPα inhibits DNA methylation process.

experimentally validated, we suggested that they are the most targets of ecCEBPA, 16 genes (ICAM1, PDXK, etc.) are clearly reliable regions where ecCEBPA might have a protective related to hematopoiesis and various leukemias. A summary effect from DNA methylation. Overall, a mean methylation of the genes and their function is provided in Table 1. delta score of experimentally validated binding sites is signifi- cantly lower than that of the non-binding sites (p-value < 1E-09) Discussion (Figure 2c). Unlike other forms of cancers, AML progression is often mutation-independent but may be explained by altered epige- Genes protected from methylation by ecCEBPα are netic regulation, DNA methylation specifically1,8. In this study, involved in transcription factor activities we elucidate a putative mechanism for the regulation of DNA We performed analysis on the genes located methylation by ecCEBPα. In a previous study, ecCEBPα, which nearby 182 ecCEBPα triplexes supported by iMARGI. Key accompanies the transcription of CEBPα on the same locus, gene ontology categories such as nucleic acid binding and was shown to protect the promoter of CEBPα from DNA meth- transcription factor activities were enriched among putative ylation leading to active expression20. We speculated that ecCEBPα targets (Figure 2d). Representative genes among ecCEBPα might perform a similar function in trans. We dem- transcription factors include MLF2, SUV39H2, RBM5, UBTF, onstrated that ecCEBPα-DNA triplex formation might pro- and among sequence-specific DNA binding include vide the molecular basis of this interaction. ecCEBPα binding POU2F2, MED12L, and DNASE1L1. The enrichment in tran- presumably protects the region from genome-wide hypermeth- scription factors (TF) suggests that triplex formation may ylation induced by CEBPα mutation in AML patients. Diruscio represent a possible mechanism employed by ecCEBPα to et al. confirmed that the most mutations in CEBPα do not influ- regulate TF methylation and as a consequence, their ence the expression of ecCEBPα but rather, its ability of the expression. Furthermore, out of the 33 genes we identified to be RNA to fold properly. We suggest that the inability to fold

Page 6 of 16 Table 1. AML specific function of nearby genes that are protected by ecCEBPA.

Gene Role Reference

FLVCR1 Alternative splicing disrupts of FLVCR1 disrupts erythropoiesis in Diamond-Blackfan https://haematologica.org/article/view/5065 anemia. (Rey et al. 2008)

PSMC6 Identified as part of a novel cluster for classifying AML risk and predicting outcomes. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1895492/ (Wilson et al. 2006)

CMIP Identified as one of the top genes expressed after rhIGFBP7 treatment (to suppress https://www.cell.com/cell-reports/pdfExtended/S2211-1247(18)31838-2 clonal growth) of AML cases. (Verhagen et al. 2018)

GNA13 A crucial signalling component of the GPR84/Beta-Catenin Signaling Axis in AML Stem https://ashpublications.org/blood/article/124/21/3577/97803/GNA13-a- Cells. (Dietrich et al. 2014) Novel-Component-of-the-GPR84-Beta-Catenin

ICAM1 Responsible for migration and adhesion of myeloid cells in hyperleukocytic AML. https://onlinelibrary.wiley.com/doi/full/10.1111/j.1365-2257.2006.00784.x (Zhang et al. 2006)

PDE4C Its mutation/absence leads to suppression of apoptotitc response in myelogenous https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1383661/ symptoms. (Lerner and Epstein 2006)

ANKRD27 Identified as part of a novel cluster of competing endogenous RNA that predict AML https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7432272/ survival and prognosis. (Wang et al. 2020)

PDXK Responsible for Vitamin B6 addiction of AML cells. (Chen et al. 2020) https://pubmed.ncbi.nlm.nih.gov/31935373/

NLRP3 The NLRP3 inflammasome is upregulated as part of the stress response in AML cells. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5754918/ (Jia et al. 2017)

PNPLA3 Responsible for elevated transaminases in lymphoblastic anaemia. (Bruschi et al. 2017) https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5683790/

CELSR1 Potentially contains mutations that contribute to patient specific mutations that are https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3407563/ F1000Research 2021,10:204Lastupdated:15SEP2021 responsible for the origination of AML.(Skoczen et al. 2019)

PFKFB4 Responsible for survival of acute monocytic myeloid leukemia by suppression of https://pubmed.ncbi.nlm.nih.gov/32299611/ apoptosis. (Wang et al. 2020)

COL7A1 Contains somatic variations that are described to be potentially pathogenic in ALL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6978700/ patients. (Skoczen et al. 2019)

MED12L Regulates hematopoietic stem cell (HSC) specific enhancers. It loss leads to loss of HSC https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5268820/ stemness. (Aranda-Orgilles et al. 2016)

GALNTL5 Contains deleterious deletions that lead to copy number alteration in core binding https://ashpublications.org/blood/article/119/10/e67/29576/High- factors of both adult and pediatric AMLs. (Kühn et al. 2012) resolution-genomic-profiling-of-adult-and

TAZ Controls AML stemness and differentiation by modulating toll-like receptor (TLR) https://pubmed.ncbi.nlm.nih.gov/30930145/ signaling. (Seneviratne et al. 2019) Page 7of16 F1000Research 2021, 10:204 Last updated: 15 SEP 2021

properly even in overexpressed cases may affect its structural genome-wide CpG methylation, and its dysregulation could ability to bind its targets. ecCEBPα contains a TFO/DBD-rich contribute to aberrant methylation profile in AML patients. These region at its 3’ end, with low pairing probability, suggesting results suggest a novel regulatory mechanism for ecCEBPα that it is capable of triplex formation. Several of the predicted as a modulator of DNA methylation through triplex forma- ecCEBPα binding sites, which include transcription factors such tion providing a foundation for sequence-specific engineering as MLF2, SUV39H2, RBM5, UBTF, and sequence-specific of RNA for regulating methylation of specific genes. DNA binding proteins include POU2F2, MED12L, and DNASE1L1, were supported by experimental iMARGI Data availability RNA:chromatin interactions data. Underlying data Complete ecCEBPα sequence retrieved from the human genome Currently, the understanding of lncRNA function and mecha- (hg19, chr19: 33298573-33303358): https://www.ncbi.nlm.nih. nisms of action is limited to a few dozen well-annotated lncRNA gov/assembly/GCF_000001405.13/ transcripts. A few functional characterization attempts are based on the ‘guilt by association’ hypothesis, which may not reso- CEBPα mutation data for 186 AML patients retrieved from the 33 nate well with the ability of lncRNA to interact in trans . As Cancer Genome Atlas (TCGA): http://firebrowse.org. thoroughly reviewed previously, lncRNAs such as ecCEBPα, Dum, Dali, Dacor1, and LINCRNA-P21 interact with DNA GEO: Embryonic kidney that expresses SV40 large T antigen, 34 in trans to regulate DNA methylation . The results presented Accession number GSM3478205: https://www.ncbi.nlm.nih.gov/ herein further demonstrate that triplex formation between geo/query/acc.cgi?acc=GSM3478205 ecCEBPα and CpG containing DNA regions could indeed be regulatory and protect CpG sites from DNMT activity. GEO: DNA Methylation by Reduced Representation Bisulfite Seq from ENCODE/HudsonAlpha GSM980576: https://www.ncbi. Unfortunately, RNA:chromatin interaction protocols are rela- nlm.nih.gov/geo/query/acc.cgi?acc=GSM980576 tively new and the data is available only for a few cell types. Since RNA:chromatin interactions are highly cell-type Zenodo: josoga2/eccebp-alpha-project: F1000 Code Release, specific35 and lowly expressed, it is not surprising that we could https://doi.org/10.5281/zenodo.438525932 validate only a few of the predicted interactions. Nevertheless, based on our results, we suggest a model of potential ecCEBPα This project contains the following underlying data: chromatin interaction in trans (Figure 2e). In this model, ecCEBPα uses its unpaired regions to directly bind to specific - DNA_BINDING_DOMAINS_ID.tsv DNA sequences by forming triplexes and in this way prevents DNA methylation in the region of binding. ecCEBPα bind- - predicted_secondary_structure_of_ecCEBPA.fa ing to distant regions could be mediated either by 3-dimensional - probes.csv (main data) chromatin organization17 which brings them close to ecCEBPα. Code used for analysis available from: https://github.com/josoga2/ Recent studies have observed that promoter or transcription eccebp-alpha-project/tree/f1000 start sites (TSS) regions, which tend to be rich in CpG dinucle- otides, are TTS-rich and potential triplex-forming hotspots36,37. Archived code as at time of publication: https://doi.org/10.5281/ Through functional enrichment analysis, we observed that zenodo.438525932 transcription factors might be preferential targets of ecCEBPα. Interestingly, previous studies have shown that the suppression of a myeloid leukemia factor (MLF2), an oncogene in breast can- Extended data cer and myeloid leukemia38,39 as well as UBTF which controls Zenodo: Supplementary Data for Secondary structure and DNA rDNA expression40,41 contributes significantly to cancers upon binding domain prediction, http://doi.org/10.5281/zenodo. 44 promoter hypermethylation40,42. The suppressor of variegation 4433222 . 3-9 homolog 2 (SUV39H2), a histone-lysine-N-methyltransferase which regulates the hypermethylation H3K9 has also been This project contains the following extended data: reported to indirectly influence over 450 promoters in AML43. • Supplementary Table 1: Summary table of DNA bind- Having in mind that ecCEBP is transcribed from CEBP locus α α ing domains (DBD), the counts of target regions - a key transcription factor of hematopoiesis - this lncRNA within the genome and statistical analysis. could participate in the formation of a hub in the hematopoiesis regulatory network. • ecCEBPα secondary structure prediction with RNAplfold Conclusion In conclusion, we have shown that ecCEBPα could serve as a Data and code are available under the terms of the Creative trans-acting regulatory agent protecting its binding sites from Commons Attribution 4.0 International license (CC-BY 4.0).

Page 8 of 16 F1000Research 2021, 10:204 Last updated: 15 SEP 2021

References

1. Goldman SL, Hassan C, Khunte M, et al.: Epigenetic Modifications in Acute 2.0. Algorithms Mol Biol. 2011; 6: 26. Myeloid Leukemia: Prognosis, Treatment, and Heterogeneity. Front Genet. PubMed Abstract | Publisher Full Text | Free Full Text 2019; 10: 133. 24. Bernhart SH, Hofacker IL, Stadler PF: Local RNA base pairing probabilities in PubMed Abstract | Publisher Full Text | Free Full Text large sequences. Bioinformatics. 2006; 22(5): 614–615. 2. Gambacorta V, Gnani D, Vago L, et al.: Epigenetic Therapies for Acute Myeloid PubMed Abstract | Publisher Full Text Leukemia and Their Immune-Related Effects. Front Cell Dev Biol. 2019; 7: 207. 25. Buske FA, Bauer DC, Mattick JS, et al.: Triplexator: detecting nucleic acid PubMed Abstract | Publisher Full Text | Free Full Text triple helices in genomic and transcriptomic data. Genome Res. 2012; 22(7): 3. van Oosterwijk JG, Buelow DR, Drenberg CD, et al.: Hypoxia-induced 1372–1381. upregulation of BMX kinase mediates therapeutic resistance in acute PubMed Abstract | Publisher Full Text | Free Full Text myeloid leukemia. J Clin Invest. 2018; 128(1): 369–380. 26. Kuo CC, Hänzelmann S, Sentürk Cetin N, et al.: Detection of RNA-DNA binding PubMed Abstract | Publisher Full Text | Free Full Text sites in long noncoding RNAs. Nucleic Acids Res. 2019; 47(6): e32. 4. Zhang J, Gu Y, Chen B: Mechanisms of drug resistance in acute myeloid PubMed Abstract | Publisher Full Text | Free Full Text leukemia. Onco Targets Ther. 2019; 12: 1937–1945. 27. Wu W, Yan Z, Nguyen TC, et al.: Mapping RNA-chromatin interactions by PubMed Abstract | Publisher Full Text | Free Full Text sequencing with iMARGI. Nat Protoc. 2019; 14(11): 3243–3272. 5. Zueva I, Dias J, Lushchekina S, et al.: New evidence for dual binding site PubMed Abstract | Publisher Full Text | Free Full Text inhibitors of acetylcholinesterase as improved drugs for treatment of 28. Kuhn RM, Haussler D, Kent WJ: The UCSC genome browser and associated Alzheimer’s disease. Neuropharmacology. 2019; 155: 131–141. tools. Brief Bioinform. 2013; 14(2): 144–161. PubMed Abstract | Publisher Full Text PubMed Abstract | Publisher Full Text | Free Full Text 6. Hackl H, Astanina K, Wieser R: Molecular and genetic alterations associated 29. Quinlan AR, Hall IM: BEDTools: a flexible suite of utilities for comparing with therapy resistance and relapse of acute myeloid leukemia. J Hematol genomic features. Bioinformatics. 2010; 26(6): 841–842. Oncol. 2017; 10(1): 51. PubMed Abstract | Publisher Full Text | Free Full Text PubMed Abstract | Publisher Full Text | Free Full Text 30. Maere S, Heymans K, Kuiper M: BiNGO: a Cytoscape plugin to assess 7. O'Brien EC, Prideaux S, Chevassut T: The epigenetic landscape of acute overrepresentation of gene ontology categories in biological networks. myeloid leukemia. Adv Hematol. 2014; 2014: 103175. Bioinformatics. 2005; 21(16): 3448–3449. PubMed Abstract | Publisher Full Text | Free Full Text PubMed Abstract | Publisher Full Text 8. Mehdipour P, Santoro F, Minucci S: Epigenetic alterations in acute myeloid 31. Shannon P, Markiel A, Ozier O, et al.: Cytoscape: a software environment for leukemias. FEBS J. 2015; 282(9): 1786–1800. integrated models of biomolecular interaction networks. Genome Res. 2003; PubMed Abstract | Publisher Full Text 13(11): 2498–2504. 9. Hajjari M, Salavaty A: HOTAIR: an oncogenic long non-coding RNA in PubMed Abstract | Publisher Full Text | Free Full Text different cancers. Cancer Biol Med. 2015; 12(1): 1–9. 32. Ogunleye AJ: josoga2/eccebp-alpha-project: F1000 Code Release (Version PubMed Abstract | Publisher Full Text | Free Full Text f1000). Zenodo. 2020. 10. Gao Y, Wang JW, Ren JY, et al.: Long noncoding RNAs in gastric cancer: From http://www.doi.org/10.5281/zenodo.4385259 molecular dissection to clinical application. World J Gastroenterol. 2020; 33. Jalali S, Singh A, Maiti S, et al.: Genome-wide computational analysis of 26(24): 3401–3412. potential long noncoding RNA mediated DNA:DNA:RNA triplexes in the PubMed Abstract | Publisher Full Text | Free Full Text human genome. J Transl Med. 2017; 15(1): 186. 11. Zhang T, Hu H, Yan G, et al.: Long Non-Coding RNA and Breast Cancer. PubMed Abstract | Publisher Full Text | Free Full Text Technol Cancer Res Treat. 2019; 18: 1533033819843889. 34. Zhao Y, Sun H, Wang H: Long noncoding RNAs in DNA methylation: new PubMed Abstract | Publisher Full Text | Free Full Text players stepping into the old game. Cell Biosci. 2016; 6: 45. 12. Hon CC, Ramilowski JA, Harshbarger J, et al.: An atlas of human long PubMed Abstract | Publisher Full Text | Free Full Text non-coding RNAs with accurate 5’ ends. Nature. 2017; 543(7644): 199–204. PubMed Abstract | Publisher Full Text | Free Full Text 35. Bonetti A, Agostini F, Suzuki AM, et al.: RADICL-seq identifies general and cell type-specific principles of genome-wide RNA-chromatin interactions. 13. Alam T, Medvedeva YA, Jia H, et al.: Promoter Analysis Reveals Globally Nat Commun. 2020; 11(1): 1018. Differential Regulation of Human Long Non-Coding RNA and Protein- PubMed Abstract | Publisher Full Text | Free Full Text Coding Genes. PLoS One. 2014; 9(10): e109443. PubMed Abstract | Publisher Full Text | Free Full Text 36. Sridhar B, Rivas-Astroza M, Nguyen TC, et al.: Systematic Mapping of RNA- Chromatin Interactions In Vivo. Curr Biol. 2017; 27(4): 602–609. 14. Antonov IV, Mazurov E, Borodovsky M, et al.: Prediction of lncRNAs and their PubMed Abstract | Publisher Full Text | Free Full Text interactions with nucleic acids: benchmarking bioinformatics tools. Brief Bioinform. 2019; 20(2): 551–564. 37. Antonov I, Medvedeva YA: Purine-rich low complexity regions are potential PubMed Abstract | Publisher Full Text RNA binding hubs in the human genome [version 2; peer review: 3 15. Cruz-Miranda GM, Hidalgo-Miranda A, Bárcenas-López DA, et al.: Long Non- approved]. F1000Res. 2018; 7: 76. Coding RNA and Acute Leukemia. Int J Mol Sci. 2019; 20(3): 735. PubMed Abstract | Publisher Full Text | Free Full Text PubMed Abstract | Publisher Full Text | Free Full Text 38. Dave B, Granados-Principal S, Zhu R, et al.: Targeting RPL39 and MLF2 reduces 16. Marchese FP, Raimondi I, Huarte M: The multidimensional mechanisms of tumor initiation and metastasis in breast cancer by inhibiting nitric long noncoding RNA function. Genome Biol. 2017; 18(1): 206. oxide synthase signaling. Proc Natl Acad Sci U S A. 2014; 111(24): 8838–8843. PubMed Abstract | Publisher Full Text | Free Full Text PubMed Abstract | Publisher Full Text | Free Full Text 17. Wang L, Zhao Y, Bao X, et al.: LncRNA Dum interacts with Dnmts to 39. Gobert V, Haenlin M, Waltzer L: Myeloid leukemia factor: a return ticket regulate Dppa2 expression during myogenic differentiation and muscle from human leukemia to fly hematopoiesis. Transcription. 2012; 3(5): regeneration. Cell Res. 2015; 25(3): 335–350. 250–254. PubMed Abstract | Publisher Full Text | Free Full Text PubMed Abstract | Publisher Full Text | Free Full Text 18. He B, Peng F, Li W, et al.: Interaction of lncRNA-MALAT1 and miR-124 40. Nguyen le XT, Raval A, Garcia JS, et al.: Regulation of ribosomal gene regulates HBx-induced cancer stem cell properties in HepG2 through expression in cancer. J Cell Physiol. 2015; 230(6): 1181–1188. PI3K/Akt signaling. J Cell Biochem. 2019; 120(3): 2908–2918. PubMed Abstract | Publisher Full Text PubMed Abstract | Publisher Full Text 41. Santoro R, Grummt I: Molecular mechanisms mediating methylation- 19. Botti G, Scognamiglio G, Aquino G, et al.: LncRNA HOTAIR in Tumor dependent silencing of ribosomal gene transcription. Mol Cell. 2001; 8(3): Microenvironment: What Role? Int J Mol Sci. 2019; 20(9): 2279. 719–725. PubMed Abstract | Publisher Full Text | Free Full Text PubMed Abstract | Publisher Full Text 20. Di Ruscio A, Ebralidze AK, Benoukraf T, et al.: DNMT1-interacting RNAs block 42. Leshchenko VV, Kuo PY, Shaknovich R, et al.: Genomewide DNA methylation gene-specific DNA methylation. Nature. 2013; 503(7476): 371–376. analysis reveals novel targets for drug development in mantle cell PubMed Abstract | Publisher Full Text | Free Full Text lymphoma. Blood. 2010; 116(7): 1025–1034. 21. Zhou J, Yang L, Zhong T, et al.: H19 lncRNA alters DNA methylation genome PubMed Abstract | Publisher Full Text | Free Full Text wide by regulating S-adenosylhomocysteine hydrolase. Nat Commun. 2015; 43. Monaghan L, Massett ME, Bunschoten RP, et al.: The Emerging Role of 6: 10221. H3K9me3 as a Potential Therapeutic Target in Acute Myeloid Leukemia. PubMed Abstract | Publisher Full Text | Free Full Text Front Oncol. 2019; 9: 705. 22. Matveishina E, Antonov I, Medvedeva YA: Practical Guidance in Genome-Wide PubMed Abstract | Publisher Full Text | Free Full Text RNA:DNA Triple Helix Prediction. Int J Mol Sci. 2020; 21(3): 830. 44. Ogunleye AJ: Supplementary Data for Secondary structure and DNA PubMed Abstract | Publisher Full Text | Free Full Text binding domain prediction [Data set]. Zenodo. 2021. 23. Lorenz R, Bernhart SH, Höner Zu Siederdissen C, et al.: ViennaRNA Package http://www.doi.org/10.5281/zenodo.4433222

Page 9 of 16 F1000Research 2021, 10:204 Last updated: 15 SEP 2021

Open Peer Review

Current Peer Review Status:

Version 2

Reviewer Report 15 September 2021 https://doi.org/10.5256/f1000research.65190.r93095

© 2021 Singh A. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Amrita Singh CSIR Institute of Genomics and Integrative Biology (CSIR-IGIB), New Delhi, Delhi, India

All the major comments raised by the reviewer have been answered thoroughly by the authors. The additional methylation analysis of HL-60 datasets in both normal and overexpressed ecCEPBα is highly appreciated. Moreover, re-labelling of the plots as highlighted previously has made it easier to comprehend them. However, there are few minor technical remarks, both figures 1 and 2 do not have the legend for the updated figure 1(d) and 2 (c), and accordingly, the subsequent legends labelling needs to be modified.

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Structure function relationship of noncoding RNA, with major focus on LncRNA function via triple helical structures.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Version 1

Reviewer Report 10 June 2021 https://doi.org/10.5256/f1000research.31133.r81320

© 2021 Demidov O. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Oleg N. Demidov

Page 10 of 16 F1000Research 2021, 10:204 Last updated: 15 SEP 2021

INSERM U1231, Faculty of Medicine and Pharmacy, University of Bourgogne-Franche Comté, Dijon, France

The manuscript by Ogunleye AF and co-authors describes a novel phenomenon of lncRNA ecCEBP α -dependent regulation of methylation in several genes. The authors analyzed the open datasets and showed that mutations in CEBPA gene (coding both, CEBPA transcriptional factor and lncRNA ecCEBPα) changed the global CpG methylation status in AML cells. The manuscript is nicely written, conclusions are logically justified. This work will be interesting to the readers in the epigenetic regulation and cancer research field.

Still, I have several concerns about the analysis. There are a lot of predicted triplex target regions for ecCEBPA genome-wide. Some of them could be false-positive predictions. Is there any way to confirm the observed effect only of experimentally validated contacts? Or in regions where predictions are most reliable?

Also, it is unclear other genes, not only transcription factors, predicted to be regulated by ecCEBPA are related to hematopoiesis. It would be helpful to have at least some hypotheses in the discussion. If there is some link found between ecCEBPA targets and regulation of hematopoiesis, it would really strengthen the conclusions of the work. Probably, KEGG pathway enrichment analysis may help with that.

Minor comments:

It is unclear on the figure with a secondary structure of ecCEBPA where the triplex-forming region is located.

Is the work clearly and accurately presented and does it cite the current literature? Yes

Is the study design appropriate and is the work technically sound? Yes

Are sufficient details of methods and analysis provided to allow replication by others? Yes

If applicable, is the statistical analysis and its interpretation appropriate? Yes

Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Models of human diseases, cancer research, fibrosis, paging

Page 11 of 16 F1000Research 2021, 10:204 Last updated: 15 SEP 2021

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Author Response 05 Aug 2021 ADEWALE OGUNLEYE, Moscow Institute of Physics and Technology, Moscow, Russian Federation

Response to Reviewer Two Comment one: There are a lot of predicted triplex target regions for ecCEBPA genome- wide. Some of them could be false-positive predictions. Is there any way to confirm the observed effect only of experimentally validated contacts? Or in regions where predictions are most reliable?

Response one: We are grateful to the reviewer for this suggestion. We performed a validation for the most reliable predicted triplexes using only experimentally validated ecCEBPA:DNA contacts. Overall, a mean methylation delta score of only 0.01 (p-value < 1E- 09) was observed which further emphasizes the protective effect of ecCEBPA. The following text and images were added to the main body of the paper: “Since these ecCEBPA contacts have been experimentally validated, we suggested that they are the most reliable regions where ecCEBPA might have a protective effect from DNA methylation. Overall, a mean methylation delta score of experimentally validated binding sites is significantly lower than that of the non-binding sites (p-value < 1E-09) (Figure 2C).”

New figure panels (1&2) here: https://drive.google.com/file/d/1iynm- t0DfGkxLF5vsumG4iCcfo2u2xCJ/view?usp=sharing

Comment two: Also, it is unclear if other genes, not only transcription factors, predicted to be regulated by ecCEBPA are related to hematopoiesis. It would be helpful to have at least some hypotheses in the discussion. If there is some link found between ecCEBPA targets and regulation of hematopoiesis, it would really strengthen the conclusions of the work. Probably, KEGG pathway enrichment analysis may help with that.

Response two: We thank the reviewer for this important comment. Indeed, we discovered that other non-transcription factor genes have links to acute myeloid leukemia via literature mining. Unfortunately, this could not be enriched through KEGG. We suggest including a supplementary table that details the different roles of the identified genes.

In the main text, we added: “Out of the 33 genes we identified to be targets of ecCEBPA, 16 genes (ICAM1, PDXK, etc.) are clearly related to hematopoiesis and various leukemias. A summary of the genes and their function is provided in Table 1.”

Table 1: AML specific function of nearby genes that are protected by ecCEBPA (added to the paper).

Competing Interests: No competing interests were disclosed.

Page 12 of 16 F1000Research 2021, 10:204 Last updated: 15 SEP 2021

Reviewer Report 15 April 2021 https://doi.org/10.5256/f1000research.31133.r81314

© 2021 Singh A. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Amrita Singh CSIR Institute of Genomics and Integrative Biology (CSIR-IGIB), New Delhi, Delhi, India

The article has addressed an important question of the ability of a lncRNA to act in trans and regulate a multitude of genes, which might be affecting the progression of acute myeloid leukemia. The authors have also highlighted the possibility of a triplex structure for defining the binding of lncRNA at its target gene CpG region. However, there are certain parts wherein the authors haven't been able to convey their points properly. 1. In Figure 1(b), a global DNA hypermethylation has been represented in AML patients with CEBPα mutation, but the numbers of the HM and NHM sites in the methods section do not reflect the same. It shows higher sites for NHM, than the HM sites, kindly check if some mislabeling is there at the authors part.

2. Also, the reviewer has failed to understand Figure 1(c). In both the x-axis and the side legend, binding and non-binding have been depicted, but which one is the AML patients with and without mutation isn't clear from the figure.

3. There are few other key points that were not clear;

(a) what is the level of ecCEBPα expression in AML patients with and without mutation.

(b) Additionally, if the expression of the lncRNA remains the same in both cases, how does one explain the increase in binding of the lncRNA and subsequent higher NHM sites, in the case of AML patients with CEBPα mutation?

(c) Moreover, a previous report (Di Ruscio et.al1) on ecCEBPα had also analyzed genome- wide methylation, in which they report that methylation levels remain unchanged even when ecCEBPα was overexpressed. This is in contrast with the major theme of the paper i.e., methylation of genes in trans is affected by the ecCEBPα. The authors should comment on this in the discussion part.

References 1. Di Ruscio A, Ebralidze AK, Benoukraf T, Amabile G, et al.: DNMT1-interacting RNAs block gene- specific DNA methylation.Nature. 2013; 503 (7476): 371-6 PubMed Abstract | Publisher Full Text

Is the work clearly and accurately presented and does it cite the current literature? Partly

Page 13 of 16 F1000Research 2021, 10:204 Last updated: 15 SEP 2021

Is the study design appropriate and is the work technically sound? Yes

Are sufficient details of methods and analysis provided to allow replication by others? Partly

If applicable, is the statistical analysis and its interpretation appropriate? I cannot comment. A qualified statistician is required.

Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Structure function relationship of noncoding RNA, with major focus on LncRNA function via triple helical structures.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Author Response 05 Aug 2021 ADEWALE OGUNLEYE, Moscow Institute of Physics and Technology, Moscow, Russian Federation

Response to Reviewer One Comment One: In Figure 1(b), a global DNA hypermethylation has been represented in AML patients with CEBPα mutation, but the numbers of the HM and NHM sites in the methods section do not reflect the same. It shows higher sites for NHM, than the HM sites, kindly check if some mislabeling is there at the authors part.

Response Two: We believe that there is a confusion here. The plot was correct and reflected the change in DNA methylation in all CpG positions including those that were not statistically differentially methylated. We counted hypermethylated CpG (t-test, FDR ≤0.05 and Δ-value ≥0.1) and those that were not hypermethylated (t-test, FDR >0.05 and |Δ-value| <0.1). In this way, we refer to non-hypermethylated (NHM) to any CpG with DNA methylation change less than 0.1, which includes either not changing or hypomethylated and not only hypomethylated. To avoid this confusion we suggest replacing the plot with the one below. We specified a 0.1 mid-point to aid visualization. (Figure 1b).

Comment Two: Also, the reviewer has failed to understand Figure 1(c). In both the x-axis and the side legend, binding and non-binding have been depicted, but which one is the AML patients with and without mutation isn't clear from the figure.

Page 14 of 16 F1000Research 2021, 10:204 Last updated: 15 SEP 2021

Response Two: Each violin in Figure 1(c) is a representation of the differential methylation score between CpGs of AML patients with and without CEBPa mutation. The blue violin represents all CpGs that are located in ecCEBPA binding sites, while the orange violin represents CpGs that do not bind with ecCEBPA. To provide clarity, we corrected the labels on the X and Y axes and added explicit explanation in the legend of the figure (Figure 1c). Difference in methylation = Mean methylation of CpGs in AML patients with CEBPA mutation - Mean methylation of CpGs in AML patients without CEBPA mutation.

Comment 3: There are few other key points that were not clear; (a) what is the level of ecCEBPα expression in AML patients with and without mutation. (b) Additionally, if the expression of the lncRNA remains the same in both cases, how does one explain the increase in binding of the lncRNA and subsequent higher NHM sites, in the case of AML patients with CEBPα mutation? (c) Moreover, a previous report (Di Ruscio et.al1) on ecCEBPα had also analyzed genome- wide methylation, in which they report that methylation levels remain unchanged even when ecCEBPα was overexpressed. This is in contrast with the major theme of the paper i.e., methylation of genes in trans is affected by the ecCEBPα. The authors should comment on this in the discussion part.

Response: (a&b): It is relatively difficult to estimate the expression level of ecCEBPA since this gene is not in the annotation used by TCGA. The raw data is not freely available in TCGA. ecCEBPA gene also overlaps with the CEBPA gene making the estimation of expression of each of the genes even more complicated. Since the mutation in AML patients happens in the body of CEBPA gene, we do not expect a significant change in the expression of either CEBPA or ecCEBPA.

We explained the relationship between ecCEBPA binding and methylation in the first result section (Figures 1c,d&e). We suggested that the mutation(s) in CEBPA does not affect the expression of the lncRNA, but rather the ability to bind its targets.

We further emphasize this point in the paper with this line (discussion): “Di Ruscio et al. confirmed that most mutations in CEBPα do not influence the expression of ecCEBPα but rather, its ability of the RNA to fold properly. We suggest that the inability to fold properly even in overexpressed cases may affect its structural ability to bind its targets.”

(c): We are very grateful to the reviewer for this valuable comment. In response to this, we retrieved the DNA methylation data (RRBS) for wild-type HL-60 cells (ENCODE Dataset) and overexpressed HL-60 cells (Di Ruscio et al). We then calculated the methylation difference between the two cell states and compared it between the ecCEBPA binding sites versus the non-binding sites. Non ecCEBPA binding sites were more methylated in comparison to ecCEBPA binding sites (Fisher Test: p-val = 1.24E-09). Bearing in mind that ecCEBPA targets are located genome-wide, it is tempting to suggest that “enforced overexpression” (which was achieved with the R1 variant of ecCEBPA; comprise of downstream ecCEBPA sequences) of ecCEBPA strongly protects local CpGs from methylation while the rest of the genome gain some methylation.

Page 15 of 16 F1000Research 2021, 10:204 Last updated: 15 SEP 2021

We added the following statement to the results in the main text: "Furthermore, the overexpression of ecCEBPA in HL-60 cells (Wang et al) lead to ecCEBPA binding sites stay unmethylated while non ecCEBPA binding sites gain methylation (Fisher Test: p-val = 1.24E-09) , suggesting a protective effect on ecCEBPA’s binding to its target locations (Fig 1d)."

Competing Interests: No competing interests were disclosed.

Comments on this article Version 1

Reader Comment 14 Apr 2021 Olumide Inyang, moscow institute of physics and technology (MIPT), Russian Federation

This report gave a clear understanding of ecCEBPα as a genome-wide epigenetic modulator via triple-helix stacks for RNA(engineered) in regulating methylation of specific genes...great work from the authors.

Competing Interests: No competing interests were disclosed.

The benefits of publishing with F1000Research:

• Your article is published within days, with no editorial bias

• You can publish traditional articles, null/negative results, case reports, data notes and more

• The peer review process is transparent and collaborative

• Your article is indexed in PubMed after passing peer review

• Dedicated customer support at every stage

For pre-submission enquiries, contact [email protected]

Page 16 of 16