University of Southern Denmark

Epigenetic association analysis of clinical sub-phenotypes in patients with polycystic ovary syndrome (PCOS)

Jacobsen, Vibe Maria; Li, Shuxia; Wang, Ancong; Zhu, Dongyi; Liu, Min; Thomassen, Mads; Kruse, Torben A; Tan, Qihua

Published in: Gynecological Endocrinology

DOI: 10.1080/09513590.2019.1576617

Publication date: 2019

Document version: Accepted manuscript

Citation for published version (APA): Jacobsen, V. M., Li, S., Wang, A., Zhu, D., Liu, M., Thomassen, M., Kruse, T. A., & Tan, Q. (2019). Epigenetic association analysis of clinical sub-phenotypes in patients with polycystic ovary syndrome (PCOS). Gynecological Endocrinology, 35(8), 691-694. https://doi.org/10.1080/09513590.2019.1576617


Epigenetic regulation of reproductive hormone levels and metabolic phenotypes in patients with polycystic ovarian syndrome

Vibe Maria Jacobsen1, Shuxia Li1, Dongyi Zhu2, Ancong Wang2, Hongmei Duan3, Torben Kruse1, Mads Thomassen1, Qihua Tan1,4,*

1. Unit of Human Genetics, Department of Clinical Research, University of Southern Denmark, Odense, Denmark

2. Center for Reproductive Medicine, Linyi People’s Hospital, Linyi, China

3. Institute for Clinical Medicine, Copenhagen University, Copenhagen, Denmark

4. Epidemiology and Biostatistics, Department of Public Health, University of Southern Denmark, Odense, Denmark

* Corresponding author:

Professor Qihua Tan, MD, PhD Unit of Human Genetics Department of Clinical Research University of Southern Denmark, Odense, Denmark Sdr. Boulevard 29, DK-5000, Odense C, Denmark Tel. 0045 65503536 e-mail: [email protected]

Abstract

Like other complex diseases, polycystic ovary syndrome (PCOS) has recently been investigated using genomic analyses such as genome-wide and epigenome-wide association studies, which have revealed important genetic mutations and DNA methylation sites associated with the syndrome. Because PCOS is a highly heterogeneous clinical condition, studying the molecular basis of its differential clinical manifestation is not only meaningful for individualized management of the disease but also important for understanding the biology of PCOS manifestation, a topic that has rarely been investigated. Using genome-wide DNA methylation data collected on PCOS patients, we performed a genomic region based association study to detect genomic regions that are differentially methylated across levels of PCOS sub-phenotypes. Our analysis identified regions on chromosome 19 (12876846-12877188 bp, FWER: 0.012) and chromosome 6 (MHC region, FWERs: 0.056, 0.071) where DNA methylation is associated with prolactin level. In conclusion, our genomic region based epigenetic association analysis of PCOS heterogeneity in multiple sub-phenotypes revealed significant DNA methylation patterns linked to functional genes of metabolic disorders and immunity, as well as novel associations that can serve as targets for validation and replication.

Key words: PCOS, clinical heterogeneity, DNA methylation, DMR

Introduction

Polycystic ovarian syndrome (PCOS) is a complex condition that affects females of reproductive age, with a prevalence as high as 15-20% (1). The Rotterdam criteria define PCOS as the presence of at least two of three features: polycystic ovaries measured by ultrasound, anovulation or oligo-ovulation, and hyperandrogenism (clinical and/or biochemical). Apart from these primary features, many secondary features can be found in women with PCOS, e.g. insulin resistance, abdominal obesity and luteinizing hormone (LH) abnormalities (2, 3).

Research to date has not been able to determine the etiology of PCOS, though several hypotheses on its origin have been proposed. The most widely accepted is the developmental origin, as studies suggest that a high stimulus of androgens in fetal life can cause features that resemble those of women with PCOS (1, 4, 5). There is also evidence for a hereditary component: the disorder has been observed to cluster in families, with the highest prevalence in first degree relatives. This genetic component has been demonstrated by twin and family studies, with estimated heritability of over 60% (Vink et al. 2006). It is generally believed that PCOS has a complex mode of inheritance in which genomic variants interact with important environmental factors, including lifestyle factors such as diet and physical inactivity, leading to heterogeneous expression of the syndrome (Li et al. 2016).

Epigenetics is the study of molecular mechanisms that regulate gene activity without changes in the DNA sequence. As a new frontier in functional genomics, epigenetics serves as a molecular bridge linking the environment (nurture) to the genetic material (nature) (6). DNA methylation is one of the most commonly measured forms of epigenetic variation (4, 7). In this study, we perform genome-wide DNA methylation profiling to detect genomic sites that are significantly associated with sub-phenotypes of PCOS, in order to explore the molecular basis of the observed clinical heterogeneity of PCOS. Results from the analysis could not only deepen our understanding of PCOS etiology but also help promote individualized management of the syndrome.

Materials and methods

The patient samples

Thirty PCOS patients aged from 22 to 33 years were recruited from the outpatients diagnosed according to the 2003 revised diagnostic criteria of the Rotterdam consensus at the Center of Reproductive Medicine, Linyi People’s Hospital, Shandong, China. All participants were free from medication and hormone therapy. This study followed the principles of the Declaration of Helsinki. Every patient signed a written informed consent. The research was approved by the Reproductive Ethics Committee of Linyi People’s Hospital.

Blood samples

Peripheral blood was taken from each patient for DNA methylation analysis and for laboratory tests. Immediately after the blood was drawn from the antecubital vein, it was stored at -80 °C at the central laboratory of Linyi People’s Hospital. The following methods were used to measure the biochemical features and reproductive hormone levels:

• Direct chemiluminescence immunoassay was used to measure the reproductive hormones: luteinizing hormone (LH), follicle stimulating hormone (FSH), estradiol (E2), progesterone (P), prolactin (PRL), total testosterone (TST) and thyroid stimulating hormone (TSH).

• Radioimmunoassay was used to determine fasting immunoreactive insulin (IRI) and immunoreactive insulin 2 hours after ingestion of 75 g dextrose (IRI2).

• The oxygen electrode method was used to assay the serum fasting blood glucose (GLU) and blood glucose 2 hours after ingestion of 75 g dextrose (GLU2).

• Homeostatic model assessment of insulin resistance (HOMA-IR) was calculated by the equation HOMA-IR = GLU*IRI/22.5.
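The HOMA-IR equation above can be illustrated with a short Python sketch. The function name `homa_ir` and its argument names are hypothetical, and the inputs are the cohort medians of GLU and IRI from Table 1, used purely as example numbers rather than as any single patient's values.

```python
def homa_ir(glu_mmol_per_l: float, iri_uiu_per_ml: float) -> float:
    """HOMA-IR = fasting glucose (mmol/l) * fasting insulin (uIU/ml) / 22.5."""
    return glu_mmol_per_l * iri_uiu_per_ml / 22.5


# Illustrative inputs only: the Table 1 medians of GLU (5.2) and IRI (15.3).
example = homa_ir(5.2, 15.3)
print(round(example, 2))
```

With these inputs the result is close to the median HOMA-IR reported in Table 1, although the median of a ratio need not equal the ratio computed from medians.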

DNA methylation profile

The Illumina Infinium HumanMethylation450 BeadChip assay, which interrogates over 480,000 CpG sites and 96% of CpG islands across the genome (9), was used to measure the DNA methylation profile of the samples. The data were quantile normalized using the R package minfi (Aryee et al. 2014). The β-value, which represents the methylation level, was calculated with Illumina’s formula β=M/(M+U+100), where M and U are the methylated and unmethylated signal intensities at a CpG site. For quality control (QC), each β-value was assigned a detection p-value calculated by minfi, and CpGs with a detection p-value > 0.01 were treated as missing. A CpG was dropped if more than 5% of its values were missing across the patient samples.
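The β-value formula and the two QC rules described above can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the minfi implementation: the function names, the array layout (CpGs in rows, samples in columns) and the toy intensities are all hypothetical.

```python
import numpy as np


def beta_values(meth, unmeth, offset=100.0):
    """Illumina beta value: M / (M + U + offset), element-wise."""
    meth = np.asarray(meth, dtype=float)
    unmeth = np.asarray(unmeth, dtype=float)
    return meth / (meth + unmeth + offset)


def apply_qc(beta, detection_p, p_cutoff=0.01, max_missing=0.05):
    """Mask unreliable values, then drop CpGs with too many missing values.

    beta, detection_p: arrays of shape (n_cpgs, n_samples).
    Returns the filtered beta matrix and a boolean mask of kept CpGs.
    """
    beta = np.asarray(beta, dtype=float)
    detection_p = np.asarray(detection_p, dtype=float)
    # Values with detection p-value above the cutoff are treated as missing.
    masked = np.where(detection_p > p_cutoff, np.nan, beta)
    # A CpG is dropped when more than max_missing of its samples are missing.
    missing_fraction = np.isnan(masked).mean(axis=1)
    keep = missing_fraction <= max_missing
    return masked[keep], keep


# Toy data: 2 CpGs measured in 4 samples (hypothetical intensities).
toy_beta = beta_values([[900] * 4, [100] * 4], [[0] * 4, [300] * 4])
toy_p = [[0.001] * 4, [0.001, 0.5, 0.001, 0.001]]
filtered, kept = apply_qc(toy_beta, toy_p)
print(filtered.shape, kept)
```

In the toy data the second CpG has one unreliable value out of four (25% missing), so it is removed by the 5% rule.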

A total of 485512 CpG sites across the genome were measured for methylation level, and 728 CpGs were filtered out because of >5% missing data across the samples. Y-linked CpGs were dropped but X-linked CpGs were included, since all the participants are female. A total of 484637 CpG sites were analyzed further. Before fitting statistical models, the methylation β-value was transformed into an M-value using the logit transformation to improve the statistical properties of the methylation data.
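The logit transformation from β-value to M-value can be sketched as follows. The sketch assumes the base-2 logit that is conventional for methylation arrays; the text itself only says "logit transformation". The clipping constant `eps` is an added safeguard against β-values of exactly 0 or 1 and is not part of the described method.

```python
import numpy as np


def beta_to_m(beta, eps=1e-6):
    """M-value as the base-2 logit of the beta value: log2(beta / (1 - beta))."""
    # eps is a hypothetical guard that keeps the logarithm finite at 0 and 1.
    beta = np.clip(np.asarray(beta, dtype=float), eps, 1.0 - eps)
    return np.log2(beta / (1.0 - beta))


print(beta_to_m([0.2, 0.5, 0.8]))
```

A β-value of 0.5 maps to an M-value of 0, and values symmetric around 0.5 map to M-values of equal magnitude and opposite sign.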

Data analysis

Instead of the conventional approach of analyzing each CpG across the genome separately, we deployed a region based analysis that detects and tests differentially methylated regions (DMRs), i.e. genomic regions enriched with CpGs showing the same direction of effect. We used the bumphunting approach proposed by Jaffe et al. (2012) and implemented in the R package minfi as the bumphunter() function. The bumphunting approach starts by regressing the DNA methylation M-value at each CpG site on the level of a PCOS sub-phenotype (dichotomized at its median). The method implemented in bumphunter() assumes that, at the population level, the locus-specific slope estimates (βs) for a sub-phenotype vary smoothly along a chromosome, and it applies a smoothing technique to the βs estimated for CpGs within a pre-defined region. After smoothing, we calculated the 99th percentile of the smoothed βs to obtain upper and lower thresholds. These thresholds were then used to define hyper- or hypo-methylated DMRs as regions with smoothed peaks above or below the thresholds.
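The steps of the bumphunting procedure described above can be sketched in Python. This is a simplified illustration and not the bumphunter() implementation: it treats all CpGs as one position-ordered cluster, replaces the smoother with a plain moving average, and uses a symmetric cutoff of plus or minus one value, all of which are assumptions. Every function name is hypothetical.

```python
import numpy as np


def slopes_for_binary_group(m_values, phenotype):
    """Per-CpG slope from regressing M-values on a median-split phenotype.

    With a 0/1 covariate, the least-squares slope equals the difference
    between the two group means. m_values has shape (n_cpgs, n_samples).
    """
    m_values = np.asarray(m_values, dtype=float)
    phenotype = np.asarray(phenotype, dtype=float)
    high = phenotype > np.median(phenotype)
    return m_values[:, high].mean(axis=1) - m_values[:, ~high].mean(axis=1)


def moving_average(x, window=3):
    """Moving average along the CpG order (assumes window <= len(x))."""
    x = np.asarray(x, dtype=float)
    kernel = np.ones(window)
    sums = np.convolve(x, kernel, mode="same")
    # Divide by the number of points actually inside the window at the edges.
    counts = np.convolve(np.ones_like(x), kernel, mode="same")
    return sums / counts


def find_bumps(smoothed, cutoff):
    """Return (first_index, last_index, area) for runs beyond +/- cutoff."""
    smoothed = np.asarray(smoothed, dtype=float)
    direction = np.where(smoothed > cutoff, 1,
                         np.where(smoothed < -cutoff, -1, 0))
    bumps, start = [], None
    for i, d in enumerate(direction):
        # Close the current run when the direction changes.
        if start is not None and d != direction[start]:
            area = float(np.abs(smoothed[start:i]).sum())
            bumps.append((start, i - 1, area))
            start = None
        if start is None and d != 0:
            start = i
    if start is not None:
        area = float(np.abs(smoothed[start:]).sum())
        bumps.append((start, len(direction) - 1, area))
    return bumps


# Hypothetical smoothed slopes for 8 neighbouring CpGs.
toy = np.array([0.0, 0.1, 0.5, 0.6, 0.1, -0.7, -0.8, 0.0])
print(find_bumps(toy, cutoff=0.4))
```

In a fuller analysis, the cutoff might be taken as `np.quantile(np.abs(smoothed), 0.99)`. The area returned for each run is the sum of absolute smoothed slopes, which is the quantity used to rank candidate regions.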

For each DMR identified, bumphunter() calculates a sum statistic (the area) by summing the absolute values of all the smoothed βs within that region. The sum statistic is then used to rank all DMRs, with the most important DMRs having the highest values. To determine the statistical significance of each DMR, we performed 1000 permutations and estimated random DMRs for each permutation using permuted βs. Empirical genome-wide p values were calculated based on the family-wise error rate (FWER), computed for each observed DMR area as the proportion of permutations whose maximum area is larger than the observed area. A significant DMR is defined as one with an empirical p value < 0.05. In addition to the empirical genome-wide p value, we also estimated the empirical uncorrected p value for a single DMR as the proportion of all random DMRs from the 1000 permutations whose area is larger than that of the observed DMR.
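The two empirical p values defined above can be computed from region areas alone. The sketch below is a minimal illustration under that assumption: it takes the observed areas and, for each permutation, the list of areas of the random DMRs found in that permutation. The function names and toy numbers are hypothetical.

```python
import numpy as np


def fwer_p_values(observed_areas, null_areas_per_permutation):
    """Proportion of permutations whose maximum area exceeds each observed area."""
    # A permutation that produced no region contributes a maximum area of 0.
    max_null = np.array([max(areas) if len(areas) else 0.0
                         for areas in null_areas_per_permutation])
    return np.array([(max_null > area).mean() for area in observed_areas])


def uncorrected_p_values(observed_areas, null_areas_per_permutation):
    """Proportion of all random regions, pooled over permutations, that exceed
    each observed area."""
    pooled = np.concatenate([np.asarray(areas, dtype=float)
                             for areas in null_areas_per_permutation])
    if pooled.size == 0:
        return np.zeros(len(observed_areas))
    return np.array([(pooled > area).mean() for area in observed_areas])


# Toy example with 4 permutations instead of 1000.
null = [[0.2, 0.5], [1.2], [], [0.9, 0.1]]
observed = [1.0, 0.4]
print(fwer_p_values(observed, null))
print(uncorrected_p_values(observed, null))
```

Because the genome-wide version compares each observed area with the largest random area per permutation, it is more conservative than the pooled, uncorrected version.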

For annotation and functional interpretation, the UCSC Genome Browser (human assembly GRCh37/hg19) was used to investigate which genes were linked to a differentially methylated region. Only the DMRs with a family-wise error rate (FWER) < 0.1 were analyzed, with a maximum expansion of 30,000 base pairs up- and down-stream if necessary.

Results

In Table 1, we show the descriptive statistics of all measured sub-phenotypes in the 30 PCOS patients. For the 19 measurements, substantial differences were observed across the patient samples, as indicated by the large differences between the 2.5% and 97.5% percentiles of their distributions (the central 95% range), suggesting wide clinical heterogeneity in the manifestation of PCOS sub-phenotypes in the diagnosed patients.

After applying the bumphunter() procedure as described in the Methods section, 7 significant DMRs were identified with FWER<0.1 (Table 2): for PRL (3 DMRs on chromosomes 19 and 6 with FWER 0.012, 0.056 and 0.071), HOMA (1 DMR on chromosome 11 with FWER 0.042), IRI (1 DMR on chromosome 11 with FWER 0.056), P (1 DMR on chromosome 6 with FWER 0.067), and WHR (1 DMR on chromosome 2 with FWER 0.087). In Figure 1, we show the methylation differences between the high (>median) and low (<median) PRL groups at the 3 DMRs identified for PRL. The most significant of these DMRs lies in the gene body of the HOOK2 gene on chromosome 19 (Figure 2). The other 2 DMRs for PRL reside in the major histocompatibility complex (MHC) region on chromosome 6.

In Table 3, we show the detailed annotation and test statistics for each CpG under the 3 identified DMRs for PRL in Figure 1. These CpGs are characterized by (1) low p values although not at genome-wide significance and (2) same direction of effect (increased or hypermethylation in the high PRL group for all 3 DMRs). The red lines in Figure 1 are the moving average of methylation differences within each DMR. The enrichment of methylation patterns at each region is clearly demonstrated in Figure 1.

Discussion

PCOS is among the many complex diseases or traits that have recently been investigated using genomic analyses such as genome-wide and epigenome-wide association studies, which compare differences in genetic variants (e.g. Chen et al. 2011) or in epigenetic regulation (e.g. Li et al. 2017) between PCOS patients and healthy controls. These studies have revealed important genetic mutations and DNA methylation sites associated with the syndrome. Because PCOS is a highly heterogeneous clinical condition, studying the molecular basis of its differential clinical manifestation is not only meaningful for individualized management of the disease but also important for understanding the biology of PCOS development. Unfortunately, the molecular basis of PCOS heterogeneity has rarely been addressed in the current literature.

In a recent EWAS on PCOS, Li et al. (2017) reported a highly significant association of DNA methylation with the PRL level of PCOS patients in the MHC region on chromosome 6, suggesting that the different levels of PRL in PCOS patients can have an immune background mediated by epigenetic mechanisms. Although based on regions rather than single CpGs, the results of our association analysis on the same patient samples support the findings of Li et al. (2017) and Shen et al. (2017) on the MHC region of chromosome 6 (third and fifth DMRs in Table 2). Most importantly, our analysis detected a highly significant DMR on chromosome 19 (first DMR in Table 2) in the gene body of HOOK2. This is interesting because a very recent study by Rodríguez-Rodero et al. (2017) reported that intragenic hyper-methylation of the HOOK2 gene in adipose tissue is associated with increased risk of obesity and type 2 diabetes. Our study found, for the first time, that hyper-methylation in the body of the HOOK2 gene is associated with high PRL level in PCOS patients. In the literature, the serum level of prolactin has been correlated with insulin resistance (Daimon et al. 2017), body weight and obesity (Pereira-Lima et al. 2013) and metabolic syndrome (Chirico et al. 2013).

Considering that PCOS is a common endocrine-metabolic disorder, the association we identified between HOOK2 gene body methylation and the serum level of PRL could reflect the metabolic dysfunction in PCOS, while at the same time suggesting a potentially central role of the HOOK2 gene in the development and manifestation of PCOS.

The BDNF gene on chromosome 11 encodes a protein, found in regions of the brain that control eating, drinking, and body weight, which promotes the survival of nerve cells (neurons). The influence of BDNF on food intake and the control of body weight has been established (Kernie et al. 2000). The significant correlation of BDNF gene methylation with HOMA (second DMR in Table 2) and IRI (fourth DMR in Table 2), both indicators of insulin resistance, provides epigenetic support for the role of BDNF in metabolic abnormality (Marchelek-Myśliwiec et al. 2013) and PCOS.

In Table 2, DNA methylation levels at the DMRs harboring the WDR27 and SNTG2 genes are correlated with progesterone (P) and waist-hip ratio (WHR), respectively. The roles of these two genes in P and WHR or in metabolic disorders have not been characterized. However, our novel findings may serve as a reference for future studies.

In summary, our genomic region based epigenetic association analysis of PCOS heterogeneity in multiple sub-phenotypes revealed significant DNA methylation patterns linked to functional genes of metabolic disorders and immunity or novel associations to serve as targets for validation and replication.

References

1. Sirmans SM, Pate KA. Epidemiology, diagnosis, and management of polycystic ovary syndrome. Clin Epidemiol. 2013;6:1-13.

2. Bani Mohammad M, Majdi Seghinsara A. Polycystic Ovary Syndrome (PCOS), Diagnostic Criteria, and AMH. Asian Pac J Cancer Prev. 2017;18(1):17-21.

3. Goodarzi MO, Dumesic DA, Chazenbalk G, Azziz R. Polycystic ovary syndrome: etiology, pathogenesis and diagnosis. Nat Rev Endocrinol. 2011;7(4):219-31.

4. Li S, Zhu D, Duan H, Tan Q. The epigenomics of polycystic ovarian syndrome: from pathogenesis to clinical manifestations. Gynecol Endocrinol. 2016;32(12):942-6.

5. Xita N, Tsatsoulis A. Review: fetal programming of polycystic ovary syndrome by androgen excess: evidence from experimental, clinical, and genetic association studies. J Clin Endocrinol Metab. 2006;91(5):1660-6.

6. Handel AE, Ebers GC, Ramagopalan SV. Epigenetics: molecular mechanisms and implications for disease. Trends Mol Med. 2010;16(1):7-16.

7. Rakyan VK, Down TA, Balding DJ, Beck S. Epigenome-wide association studies for common human diseases. Nat Rev Genet. 2011;12(8):529-41.

8. Li S, Zhu D, Duan H, Ren A, Glintborg D, Andersen M, et al. Differential DNA methylation patterns of polycystic ovarian syndrome in whole blood of Chinese women. Oncotarget. 2017;8(13):20656-66.

9. Bibikova M, Barnes B, Tsan C, Ho V, Klotzle B, Le JM, et al. High density DNA methylation array with single CpG site resolution. Genomics. 2011;98(4):288-95.

10. Di Blasi C, Morandi L, Barresi R, Blasevich F, Cornelio F, Mora M. Dystrophin-associated protein abnormalities in dystrophin-deficient muscle fibers from symptomatic and asymptomatic Duchenne/Becker muscular dystrophy carriers. Acta Neuropathol. 1996;92(4):369-77.

11. Honne K, Hallgrimsdottir I, Wu C, Sebro R, Jewell NP, Sakurai T, et al. A longitudinal genome-wide association study of anti-tumor necrosis factor response among Japanese patients with rheumatoid arthritis. Arthritis Res Ther. 2016;18:12.

12. Shen H, Liang Z, Zheng S, Li X. Pathway and network-based analysis of genome-wide association studies and RT-PCR validation in polycystic ovary syndrome. Int J Mol Med. 2017;40(5):1385-96.

13. Rodriguez-Rodero S, Menendez-Torre E, Fernandez-Bayon G, Morales-Sanchez P, Sanz L, Turienzo E, et al. Altered intragenic DNA methylation of HOOK2 gene in adipose tissue from individuals with obesity and type 2 diabetes. PLoS One. 2017;12(12):e0189153.

Figure captions:

Figure 1. Significant DMRs identified for PRL on chromosome 19 (top) and chromosome 6 (middle and bottom). The figure displays methylation difference for each CpG site in the region between high and low PRL level groups as indicated by the black curve. The red curve is the moving average of the observed methylation differences.

Figure 2. Annotation plot for the most significant DMR on chromosome 19 for PRL. The figure shows that the identified DMR resides within the gene body of HOOK2.

Table 1. Descriptive statistics of all measured sub-phenotypes in 30 PCOS patients*

Sub-phenotypes   Median   2.5%    97.5%   Normal values
BMI, kg/m2       23       19.4    38
Waist, cm        80       56.1    114.8
Hip, cm          97       84.8    123.3
WHR, %           83.7     75.7    102.8
SBP, mmHg        110      100     130
DBP, mmHg        70       60      83.8
MC, day          75       28      407
E2, pg/ml        49.9     13.8    151.5
LH, mIU/ml       14.1     3       25
FSH, mIU/ml      6.4      4.8     8.9
P, ng/ml         0.4      0.2     1.1
TSH, uIU/ml      1.8      0.5     5.4
PRL, ng/ml       10.4     4       55
TST, ng/dl       50.6     6.2     94.7
IRI, uIU/ml      15.3     2.1     51.6
IRI2, uIU/ml     66.4     10.4    300
GLU, mmol/l      5.2      4.8     5.7
GLU2, mmol/l     6.7      4.5     8.8
HOMA-IR          3.5      0.5     12.4

*Modified from Li et al. (2017)

Table 2. The 7 identified DMRs with FWER<0.1

Feature  chr    start      end        value        area        p.value     fwer   location       gene      function
PRL      chr19  12876846   12877188   0.27554248   1.10216994  0.00014375  0.012  body           HOOK2
HOMA     chr11  27528498   27528498   0.4470645    0.4470645   0.00062262  0.042  body/promoter  BDNF-AS   antisense RNA
PRL      chr6   29894050   29894152   0.24368827   0.73106481  0.00067087  0.056  body           HLA-H     pseudogene
                                                                                  body           HLA-G
                                                                                  body           HLA-J     pseudogene
                                                                                  body           AK097625  associated with HLA
                                                                                  body           HCG4B     non-coding RNA
                                                                                  body           BC035647  associated with HLA
IRI      chr11  27528498   27528498   0.4470645    0.4470645   0.00087714  0.056  body/promoter  BDNF-AS   antisense RNA
PRL      chr6   30039374   30039380   0.22511487   0.67534461  0.00089849  0.071  body           RNF39
P        chr6   170058205  170058205  0.40318258   0.40318258  0.00160700  0.067  body           WDR27     protein [...]
WHR      chr2   1165351    1165351    -0.40482273  0.40482273  0.00181099  0.087  body           SNTG2     encodes a [...]

Table 3. Annotation and statistics of CpGs under the 3 significant DMRs for PRL.

Chromosome  CpGs        Position  Mean, high PRL  Mean, low PRL  Difference   t value      Pr(>|t|)
Chr19       cg06417478  12876846  0.542658156     0.255813218    0.286844938  2.949107983  0.006368581
            cg04657146  12876947  0.552623493     0.315949287    0.236674206  3.123980447  0.004124229
            cg11738485  12877000  0.692792003     0.362368473    0.33042353   2.912734462  0.006962808
            cg23899408  12877188  0.571065178     0.322837907    0.248227271  3.075007873  0.004662085
Chr6        cg18786623  29894050  0.596800519     0.349959311    0.246841208  2.931576113  0.006648732
            cg24751894  29894141  0.515614817     0.287366607    0.22824821   2.837059746  0.008371224
            cg25644740  29894152  0.578848905     0.322873509    0.255975397  2.448267456  0.020878631
Chr6        cg20249327  30039374  0.444087449     0.242053939    0.202033509  3.28465375   0.002745993
            cg15877520  30039376  0.510191498     0.282966079    0.227225419  2.728165584  0.010875001
            cg13918754  30039380  0.489818759     0.243733072    0.246085687  2.666558749  0.012586514