bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

1 Chromatin accessibility landscape of articular knee cartilage reveals aberrant

2 enhancer regulation in osteoarthritis

3

4 Ye Liu1,2+, Jen-Chien Chang1+, Chung-Chau Hon3, Naoshi Fukui4,5, Nobuho Tanaka4,

5 Zhenya Zhang2, Ming Ta Michael Lee6,7,8*, Aki Minoda1*

6

7 1. Epigenome Technology Exploration Unit, Division of Genomic Technologies,

8 RIKEN Center for Life Science Technologies (CLST), Yokohama, Japan.

9 2. Graduate School of Life and Environmental Sciences, University of Tsukuba,

10 Tsukuba, Japan.

11 3. Genome Information Analysis Team, Division of Genomic Technologies, RIKEN

12 Center for Life Science Technologies (CLST), Yokohama, Japan.

13 4. Clinical Research Center, National Hospital Organization Sagamihara Hospital,

14 Kanagawa, Japan.

15 5. Department of Life Sciences, Graduate School of Arts and Sciences, The University

16 of Tokyo, Tokyo, Japan.

17 6. Genomic Medicine Institute, Geisinger, Danville, PA, USA.

18 7. Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan, ROC.

19 8. Laboratory for International Alliance on Genomic Research, RIKEN Center for

20 Integrative Medical Sciences, Yokohama, Japan.

21

22 + Authors contributed equally to this work

23 * Co-corresponding. Address correspondence to Aki Minoda, Epigenome Technology

24 Exploration Unit, Division of Genomic Technologies, RIKEN Center for Life Science

1 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

25 Technologies (CLST), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan.

26 E-mail: [email protected]; Tel: 81-45-503-9309.

27 Ming Ta Michael Lee, Genomic Medicine Institute, Geisinger Health System, 100 N.

28 Academy Ave. Danville, PA, 17822, USA. E-mail: [email protected]; Tel: 1-570-

29 214-1005.

30

31 ABSTRACT

32 Background: Osteoarthritis (OA) is a common joint disorder with increasing impact

33 in an aging society; however, there is no cure or effective treatments so far due to lack

34 of sufficient understanding of its pathogenesis. While genome-wide association studies

35 (GWAS) and DNA methylation profiling identified many non-coding loci associated

36 to OA, the interpretation of them remains challenging.

37 Methods: Here, we employed Assay for Transposase-Accessible Chromatin with high

38 throughput sequencing (ATAC-seq) to map the accessible chromatin landscape in

39 articular knee cartilage of OA patients and to identify the chromatin signatures relevant

40 to OA.

41 Results: We identified 109,215 accessible chromatin regions in cartilage and 71% of

42 these regions were annotated as enhancers. We found these accessible chromatin

43 regions are enriched for OA GWAS single nucleotide polymorphisms (SNPs) and OA

44 differentially methylated loci, implying their relevance to OA. By linking these

45 enhancers to their potential target , we have identified a list of candidate enhancers

46 that may be relevant to OA. Through integration of ATAC-seq data with RNA-seq data,

47 we identified genes that are altered both at epigenomic and transcriptomic levels. These

48 genes are enriched in pathways regulating ossification and mesenchymal stem cell

49 (MSC) differentiation. Consistently, the differentially accessible regions in OA are

2 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

50 enriched for mesenchymal stem cell-specific enhancers and motifs of transcription

51 factor families involved in osteoblast differentiation (e.g. bZIP and ETS).

52 Conclusions: This study marks the first investigation of accessible chromatin

53 landscape on clinically relevant hard tissues and demonstrates how accessible

54 chromatin profiling can provide comprehensive epigenetic information of a disease.

55 Our analyses provide supportive evidence towards the model of endochondral

56 ossification-like cartilage-to-bone conversion in OA knee cartilage, which is consistent

57 with the OA characteristic of thicker subchondral bone. The identified OA-relevant

58 genes and their enhancers may have a translational potential for diagnosis or drug

59 targets.

60 61 Keywords

62 osteoarthritis, epigenomics, enhancer, ATAC-seq, endochondral ossification.

63

64 Background

65 Osteoarthritis (OA) is a degenerative joint disease [1,2] that is one of the most common

66 causes of chronic disability in the world [3,4], of which the knee OA is the most

67 common. Main features of OA include cartilage degradation, subchondral bone

68 thickening, joint space narrowing and osteophytes formation [5], resulting in stiffness,

69 swelling, and pain in the joint. Currently available treatments are either pain relief or

70 joint function improvement by strengthening the supporting muscles. However, OA

71 progression ultimately leads to costly total joint replacement surgery, making it a

72 growing global health burden.

73 Although the causes of OA are not well understood, risk factors such as age,

74 weight, gender, and genetic factors have been identified [4]. Several models for OA

75 initiation, such as mechanical injury, inflammatory mediators from synovium, defects 3 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

76 in metabolism and endochondral ossification, have been proposed to explain

77 pathogenesis of this disease [6–12]. To date, GWAS have identified more than 20 loci

78 to be associated with the risk of developing OA [13]. While next generation sequencing

79 data is being generated to discover rare variants with larger effect size, the identified

80 variants are often located in the non-coding regions of the genome [14], complicating

81 the identification of the causal genes. Transcriptomic analyses of cartilage in diseased

82 joints of OA patients (taken from the replacement surgeries) provided an opportunity

83 to pinpoint transcriptionally dysregulated genes and pathways relevant to OA [15,16].

84 However, such studies have yet to fully reveal the underlying molecular mechanism of

85 how the transcription of these genes are dysregulated.

86 Recently, epigenetic tools have been applied to gain further insight into the

87 pathogenesis of OA. There have been several reports of DNA methylation status [17–

88 19] in cartilage of diseased joints. However, relatively few DNA methylation

89 alterations were found in OA degenerated tissues, despite many genes being

90 differentially expressed. Consistent with these findings, it was also found that change

91 of expression is rarely associated with DNA methylation alterations at promoters

92 [16,19,20]. Many of the identified differentially methylated sites in fact fall in enhancer

93 regions, which are non-coding regulatory elements, disruption of which may lead to

94 dysregulated transcription, and many are cell type-specific [21,22]. Recent large-scale

95 studies, such as FANTOM5 [23], Roadmap Epigenomics Project [24] and GTEx [25]

96 have enabled the prediction of regulatory networks between enhancers and their

97 potential target genes (e.g. JEME [26]), which could be applied in a clinical context to

98 explore the roles of enhancers in disease pathogenesis.

99 Here, we set out to investigate alterations of enhancers associated with OA by

100 applying ATAC-seq [27] on the knee joint cartilages from OA patients, using an

4 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

101 optimized protocol for cartilage sample preparation. ATAC-seq maps the accessible

102 chromatin regions, which are often regulatory regions such as promoters and enhancers

103 that play roles in regulation of gene expression. By integrating our ATAC-seq data with

104 the publicly available genetic, transcriptomic and epigenomic data, we identified

105 dysregulated enhancers and their potential target genes. Our data highlights a number

106 of OA risk loci and differentially methylated loci that potentially play roles in cartilage

107 degradation during OA development.

108

109 Methods

110 Knee joint tissues collection

111 Human knee joint tibial plates were collected from patients who undertook joint

112 replacement surgery due to severe primary OA at the National Hospital Organization

113 Sagamihara Hospital (Kanagawa, Japan). Demographic information for patients is

114 listed in Additional file 1: Table S1. Diagnosis of OA was based on the criteria of the

115 American College of Rheumatology [28], and all the knees were medially involved in

116 the disease. Tissues were stored in 4°C cultured medium after removal. Informed

117 consent was obtained from each patient enrolled in this study. This study was approved

118 by the institutional review board of all the participating institutions (Sagamihara

119 National Hospital and RIKEN).

120

121 Processing of cartilage samples

122 Fresh cartilage was separated from the subchondral bone by a scalpel immediately after

123 surgery and stored in 4°C Dulbecco's Modified Eagle Medium until they were

124 processed. 100 mg cartilage from both outer region of lateral tibial plateau (oLT) and

125 inner region of medial tibial plateau (iMT) (Fig. 1a and Additional file 2: Figure S1a)

5 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

126 for each patient was digested with 0.2% type II collagenase (Sigma-Aldrich) in DMEM

127 at 37°C with rotation for 12 hours to fully remove the collagen matrix debris, and

128 stopped by washing with chilled PBS. Immediately after digestion, chondrocytes were

129 counted by a cell counting chamber Incyto C-Chip (VWR) and collected by

130 centrifugation at 500g for 5 minutes at 4°C.

131

132 ATAC-seq of the cartilage

133 Approximately 50,000 cells were taken after the cartilage processing and used for

134 ATAC-seq library preparation (Fig. 1a) according to the original protocol [27,29].

135 Briefly, transposase reaction was carried out as previously described [27] followed by

136 9 to 12 cycles of PCR amplification. Amplified DNA fragments were purified with

137 QIAGEN MinElute PCR Purification Kit and size selected twice with Agencourt

138 AMPure XP (1:1.5 and 1:0.5 sample to beads; Beckman Coulter). Libraries were

139 quantified by KAPA Library Quantification Kit for Illumina Sequencing Platforms

140 (KAPA Biosystems), and size distribution was inspected by Bioanalyzer (Agilent High

141 Sensitivity DNA chip, Agilent Technologies). Library quality was assessed before

142 sequencing by qPCR enrichment of a housekeeping gene promoter (GAPDH) over a

143 gene desert region [29]. ATAC-seq libraries were sequenced on an Illumina HiSeq

144 2500 (50bp, paired-end) by GeNAS (Genome Network Analysis Support Facility,

145 RIKEN, Yokohama, Japan).

146

147 ATAC-seq data processing, peak calling and quality assessment

148 An ATAC-seq data processing pipeline for read mapping, peak calling, signal track

149 generation, and quality control was implemented [30]. Briefly, fastq files for all patients

150 were grouped by tissue compartment (oLT or iMT) and input into the pipeline

6 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

151 separately, with parameters –true_rep –no_idr. Reads were mapped to the hg38

152 reference genome. Peaks were called by macs2 with default parameters in the pipeline.

153 Basic sequencing information and library quality metrics are listed in Additional file 1:

154 Table S2. NucleoATAC was applied to infer genome-wide nucleosome positions and

155 occupancy from the ATAC-seq data [31]. Briefly, NucleoATAC were run for oLT and

156 iMT separately with default parameters, using bam files merging from all patients as

157 input, and the outputted nucleoatac_signal.bedgraph were used for aggregated plot

158 around TSS.

159 Previously annotated DNase I Hypersensitive Sites (DHS) were used to assess

160 the quality of our libraries. Briefly, a set of DHS defined in Roadmap Epigenomics

161 Project [32] were downloaded [33] and lifted over from hg19 to hg38 using UCSC

162 liftOver tool [34] (refer as Roadmap DHS thereafter). For each library, we calculated

163 the ratio of reads that fall within Roadmap DHS versus randomly chosen genomic

164 regions of the same total size (DHS enrichment score). It is noted that the DHS

165 enrichment score is in good correlation with enrichment qPCR for GAPDH (Additional

166 file 2: Figure S1b).

167

168 Defining a set of unified accessible chromatin regions

169 Peaks from all 16 libraries were pooled and those that are within 300 bp were merged,

170 resulting in 615,454 merged raw peaks. Reads fall in raw peak regions were counted

171 for individual samples using bedtools v2.27.0 [35]. Read counts were normalized as

172 count-per-million (CPM) based on relative log expression normalization implemented

173 in edgeR v3.18.1 [36,37]. In the end, we defined a set robust peaks (n=109,215) with ³

174 2 CPM in at least 6 oLT or 6 iMT samples for all downstream analyses.

175

7 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

176 Peak annotation and differentially accessible peak identification

177 Peaks were annotated as promoters or enhancers based on their intersection with

178 promoter or enhancer as defined in Roadmap DHS [24]. Noted that a peak was

179 preferentially annotated as a promoter if it intersects both a promoter and an enhancer

180 DHS. Differentially accessible peaks between damaged (i.e. iMT) and intact (i.e. oLT)

181 tissues were identified using edgeR [36,37]. Briefly, DHS enrichment score of each

182 sample were incorporated as continuous covariates in the design matrix of the

183 generalized linear model implemented in edgeR [36,37]. The Benjamini-Hochberg

184 adjusted p-values (i.e. false discovery rate, FDR) was taken to measure the extent of

185 differential accessibility as used in Fig. 3b. A cutoff of FDR £ 0.05 was used to define

186 differentially accessible peaks.

187

188 Processing of OA associated SNPs

189 GWAS lead SNPs of OA were obtained from GWASdb v2 (as of 16 Sep 2017) [38],

190 NHGRI-EBI GWAS Catalog (as of 16 Sep 2017) [39], and a recent study [40]. SNPs

191 in linkage disequilibrium with these lead SNPs (i.e. proxy SNPs with r2 > 0.5 and

192 distance limit of 500kb in any three population panels of the 1000 Genomes Project

193 pilot 1 data) were obtained from SNAP database [41], resulting in 8,973 GWAS SNPs

194 associated with OA. As a negative control, GWAS SNPs associated with Parkinson’s

195 disease were obtained the same way from NHGRI-EBI GWAS Catalog. Coordinates

196 for these SNPs in hg38 were obtained from dbSNP Build 150.

197

198 Definition of cell type-specific enhancers

199 Bed files for enhancer clusters with coordinated activity in 127 epigenomes, as well as

200 the density of clusters per cell type, were obtained from online database [33] (DNase I

8 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

201 regions selected with p < 1 × 10–2, lifted over from hg19 to hg38 using UCSC liftOver

202 tool). For each cell type, clusters with two-fold density than average (across all cell

203 types) were defined as specific and pooled.

204

205 Enrichment analysis for differentially accessible peaks

206 Fisher’s exact test were applied to assess the enrichment for the following features in

207 the selected peaks: (1) GWAS SNPs associated with OA (n=8,973), (2) differentially

208 methylated loci (DML) associated with OA (n=9,265) [19,42], and (3) cell type-

209 specific enhancers. Briefly, for the enrichment of (1) GWAS SNPs and (2) DML, peaks

210 were ranked by the extent of differential accessibility (as measured by FDR) and each

211 cumulative fraction of peaks was compared to all peaks. GWAS SNPs and DML [43]

212 associated with Parkinson’s disease was used as a negative control. For enrichment of

213 (3) cell type-specific enhancers, differentially accessible peaks (FDR ≤ 0.05) were

214 compared to all peaks, and same number of randomly selected peaks were used as a

215 control. Permutation of peaks as control were done 25 times.

216

217 Linking enhancer peaks to their potential target genes

218 A promoter peak is assigned to a gene if it intersects with the transcription start sites

219 (TSS) of its transcript as defined in the FANTOM CAGE associated transcriptome [44].

220 We noted that a promoter peak might be associated with multiple genes. An enhancer

221 peak is defined as linked to a target gene if it overlaps an expression quantitative trait

222 loci (eQTL) of the corresponding gene (GTEx V7, in any tissues with p < 1 × 10–5) [45],

223 or is supported by putative enhancer-promoter linkage predicted by JEME method [46]

224 on FANTOM5 and Roadmap Epigenomics Project data [26].

225

9 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

226 Integration with publicly available transcriptome data

227 We reanalyzed an RNA-seq dataset (ArrayExpress E-MTAB-4304) from an

228 independent OA patient cohort, in which the RNA was extracted from cartilage tissues

229 of both iMT and oLT for 8 patients [15]. Read counts on transcripts were estimated by

230 Kallisto v0.43.1 [47] using default parameters on FANTOM CAGE associated

231 transcriptome [44]. Estimated read counts of a gene, defined as the sum of the estimated

232 read counts of its associated transcripts, were used as the input for differential gene

233 expression analysis using edgeR v3.18.1. In total, 3,293 genes were defined as

234 significantly differentially expressed between iMT and oLT (FDR ≤ 0.05). A gene is

235 defined as “consistently dysregulated both at the epigenomic and transcriptomic levels”

236 when it is upregulated (or downregulated) in RNA-seq with more (or less) accessible

237 promoters or enhancers in ATAC-seq (n= 371).

238

239 analysis

240 Enrichr [48] was used to identify gene sets enriched in genes implicated in OA. The

241 selected gene lists were input to query enrichment for gene ontology (Biological

242 Process 2017b) in the database. P-value ≤ 0.05 was considered significant.

243

244 Transcription factor binding motif analysis

245 DNA motif analysis for differentially accessible regions was performed using HOMER

246 v4.9 with default parameters [49]. The enriched de novo and known motifs as well as

247 its matching transcription factor are searched and scanned in more- and less-accessible

248 peaks separately (parameters: -mask -size given), with all robust peaks as background

249 region. P-value < 1 × 10–5 was considered significant.

250

10 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

251 Results

252 Mapping chromatin accessibility of chondrocytes in OA knee cartilages

253 To investigate chromatin signatures in articular cartilage associated with OA, we

254 performed ATAC-seq on the chondrocytes isolated from the knee joints of patients. We

255 have previously shown that the oLT region (outer region of the lateral tibial plateau,

256 representing the intact cartilage) is a good control for comparing the iMT region (inner

257 region of medial tibial plateau, representing the damaged cartilage) as a model for OA

258 disease progression [16], and the transcriptome and methylome of this model have been

259 characterized by us and the others [19,20,50,51]. In this study, we performed ATAC-

260 seq on the chondrocytes isolated from of the oLT and iMT regions of 8 patients

261 (Additional file 2: Figure S1a), generating 16 ATAC-seq libraries (paired, from 8

262 patients) on clinically relevant OA cartilage tissues (Fig. 1a).

263 Overall, the libraries are of high quality, showing two- to four-fold enrichment

264 in the Roadmap DHS (DHS enrichment score, Methods), with no substantial difference

265 between oLT and iMT libraries (Additional file 1: Table S2). In addition, the expected

266 nucleosome banding patterns were observed in the fragment size distribution for both

267 oLT and iMT libraries (Fig. 1b). When we applied nucleoATAC to infer genome-wide

268 nucleosome occupancy and positioning from ATAC-seq data [31], we found similar

269 aggregated signals around TSS for both oLT and iMT libraries, corresponding to –2, –

270 1, +1, +2, +3 nucleosomes as well as nucleosome depletion region at upstream of TSS

271 (Fig. 1c). Thus, we concluded that our ATAC-seq libraries had good and

272 indistinguishable quality between oLT and iMT regions.

273

274 Accessible chromatin landscape highlights potential enhancers and their target

275 genes relevant to OA

11 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

276 Based on the 16 ATAC-seq libraries, we identified a set of unified accessible chromatin

277 regions across all samples (n=109,215 robust peaks, Methods, Additional file 1: Table

278 S3); 77,655 (71.1%) of which were annotated as enhancers and 18,410 (16.9%) as

279 promoters (Fig. 2a) based on Roadmap DHS annotations (Methods). To assess the

280 relevance of these peaks to OA, we intersected these peaks against Roadmap DHS, OA

281 GWAS SNPs and OA DML datasets. We found that both OA GWAS SNPs and OA

282 DML are significantly enriched in these peaks compared to all Roadmap DHS (GWAS

283 SNP: Odds ratio = 4.3, p < 2 × 10–16; DML: Odds ratio = 1.8, p < 2 × 10–16, Fisher’s

284 exact test), suggesting the peaks identified in this study are relevant to OA. In fact,

285 majority of the OA GWAS SNPs (75.49%) that overlap our ATAC-seq peaks fall

286 within annotated enhancers, which is consistent with the fact that GWAS SNPs reside

287 in enhancers [14,52]. In this way, we identified 140 enhancers overlapping the OA

288 GWAS SNPs (Additional file 1: Table S4) and 727 enhancers overlapping the OA

289 DML (Additional file 1: Table S5). To further characterize these enhancers potentially

290 relevant to OA, we utilized public resources [25,26] to predict their target genes

291 (Methods). These enhancers and their predicted target genes represent candidates of

292 dysregulated transcriptional network during OA pathogenesis. This included the

293 previously identified OA associated genes, such as FGFR3 [53,54] (its enhancer

294 overlaps an OA GWAS SNP, Fig. 2b) and PTEN [55] (its predicted enhancer overlaps

295 with an OA DML, Fig. 2c), thus validating our approach. In addition, rs4775006 is a

296 proxy SNP of rs4646626 (r2 = 0.816), which has previously been identified as a

297 suggestive OA risk locus (p = 9 × 10–6, [56]). We found this SNP hitting the accessible

298 enhancer region (chr15:57922910-57924140) in our samples, and predicted to target

299 the ALDH1A2 gene. Therefore, this study provides an accessible chromatin landscape

12 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

300 of cartilage tissue, which allows better interpretation of other genetic and epigenomic

301 data relevant to OA and other skeletomuscular disease related to cartilage tissue.

302

303 Identification of differentially accessible enhancers in OA

304 To compare the accessible chromatin landscape between the intact (oLT) and the

305 damaged (iMT) tissues, we performed principal component analysis on the peak signals

306 across the 16 samples. The first principal component (41.37% of variance) can be

307 attributed to tissue damage variations (i.e. oLT versus iMT), while the second principal

308 component (16.03% of variance) can be attributed to patient-to-patient variations (Fig.

309 3a). This observation suggests the accessible chromatin landscapes of damaged and

310 intact tissues are readily distinguishable from each other, despite the variations among

311 individual patients.

312 To identify the chromatin signatures relevant to OA, we performed differential

313 accessibility analysis between oLT and iMT using edgeR [37]. First and foremost, we

314 assessed the global relevance of the differential accessibility of peaks to OA. In Fig. 3b,

315 we investigated the relationship between the extent of differential accessibility of peaks

316 (measured by FDR) and their 1) compositions, 2) enrichment in OA GWAS SNPs, and

317 3) enrichment in OA DML. We found that the peaks that are more differentially

318 accessible (i.e. smaller FDR) contain substantially higher fraction of enhancers (Fig.

319 3b, top panel), suggesting enhancers are more likely to be dysregulated than promoters

320 during OA disease progression. Moreover, peaks that are more differentially accessible

321 tend to be more enriched (i.e. higher odds ratio) in both OA GWAS SNPs and DML

322 (Methods, Fig. 3b, middle and lower panels). The differentially accessible enhancers

323 overlapping either OA GWAS SNP or OA DML are summarized in Table 1 and 2,

324 respectively. As a negative control, we also examined the GWAS SNPs and DML

13 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

325 associated with Parkinson’s disease [43], which is a neurodegenerative disease

326 pathologically unrelated to cartilage, and did not observe any substantial enrichment.

327 Taken together, these observations suggest the differential chromatin accessibility

328 measured in this study is relevant to OA disease progression in a specific manner.

329 We next determined the significantly differentially accessible peaks with FDR

330 £ 0.05. Out of the 4,450 differentially accessible peaks, 1,565 are more accessible and

331 2,885 are less accessible in the damaged tissues compared to the intact tissues (Fig. 3c,

332 Fig. 3d, and Additional file 2: Figure S2a). The identified differentially accessible peaks

333 cluster in two groups (Additional file 2: Figure S2b), and are generally consistent

334 among patients (Additional file 2: Figure S2c). Majority (85.3%) of these differentially

335 accessible peaks are enhancers and only 7.6 % are promoters. It is noted that the

336 promoter accessibility of SOX11 and RGR are altered in the OA damaged tissues (more

337 accessible for SOX11 and less accessible for RGR, Fig. 3d), which are consistent with

338 the previous findings that they are up- and down-regulated in OA damaged tissues,

339 respectively [20].

340 Majority of the differentially accessible peaks are enhancers, which are known

341 to regulate transcription. Cell type-specific enhancers largely drive the transcriptional

342 program required to carry out specific functions for each cell type [57] and their

343 dysregulation may lead to diseases [58]. To assess the cell type specificity of the

344 differentially accessible enhancers identified in this study, we examined their

345 enrichment of cell type-specific enhancers of 125 cell types defined by the Roadmap

346 Epigenomics Project [24]. We found these differential accessible enhancers are highly

347 enriched for enhancers that are specific to bone-related cell types in damaged tissues

348 (e.g. chondrocyte and osteoblast), as well as mesenchymal stem cells (Fig. 3e). This

349 observation is consistent with the hypothesis that inappropriate activation of

14 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

350 endochondral ossification process might play a role in OA progression [59], which is

351 thought to occur through differentiation of mesenchymal stem cells to chondrocytes or

352 osteoblasts, as well as hypertrophic differentiation of chondrocytes [60].

353 In summary, we demonstrate that the differential accessible chromatin regions

354 identified in this study are enriched in OA-relevant enhancers supported by multiple

355 genetic and epigenome evidence. These differentially accessible enhancers and their

356 predicted target genes could be used for prioritization of candidate genes to be tested

357 for studying OA disease progression.

358

359 Motif enrichment analysis reveals transcription factors relevant to OA

360 In order to gain more insights into which regulatory pathways may be dysregulated in

361 OA, we next examined the enrichment of transcription factor binding motifs in the

362 differentially accessible regions. Enrichment analyses were performed separately for

363 the regions that are significantly more or less accessible in the damaged tissues, using

364 all accessible regions as background. Transcription factors with significantly enriched

365 motifs are summarized (Additional file 2: Figure S3b), and their binding prediction in

366 robust peaks are listed (Additional file 1: Table S3). We note most of these transcription

367 factors belong to ETS and bZIP family; many of which are known to regulate genes

368 involved in bone or cartilage development, including AP-1 [61], CEBP [62], MafK [63],

369 STAT3 [64], and ERG [65,66]. Moreover, we have previously proposed that ETS-1

370 may be involved in OA based on our DML study [19]. Taken together, the motif

371 enrichment analysis of the differentially accessible regions in the damaged tissues is

372 consistent with the hypothesis that the transcriptional program for chondrocyte

373 differentiation may be disrupted during OA progression, and suggests that cell type-

374 specific enhancers may be dysregulated through the ETS and bZIP family transcription

15 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

375 factors.

376

377 Integrative transcriptomics and epigenomics analysis reveals pathways involved

378 in OA

379 To evaluate the effects of the dysregulated regulatory regions on gene expression in

380 OA, we reanalyzed the RNA-seq dataset from a different cohort that used the same

381 disease model (i.e. oLT vs. iMT) [15] and integrated it with the differentially accessible

382 regions identified in this study (Fig. 4a). We first verified that the dysregulated

383 regulatory regions have detectable effects on the gene expression; the genes with more

384 accessible promoters or enhancers have significantly higher expression fold-changes

385 between oLT and iMT (p < 0.001, Student’s t-test) than the ones with less accessible

386 promoters or enhancers (Fig. 4b). To further investigate the congruence between these

387 transcriptomic and epigenomic changes, we overlapped the significant differentially

388 expressed genes (n=3,293, FDR £ 0.05) onto the genes with differentially accessible

389 promoters (n=255) or enhancers (n=2,406) (Fig. 4c). We find that their overlaps are

390 statistically significant (p < 0.001 in both promoters and enhancers, Fisher’s exact test),

391 which further demonstrate the chromatin accessibility dataset from this study is

392 generally consistent with the transcriptomic dataset. As a result, we identified 371 genes

393 that are consistently dysregulated both at the epigenomic and transcriptomic levels,

394 representing a shortlist of OA-related genes candidates supported by multiple lines of

395 evidence (Fig. 4c, Additional file 1: Table S6).

396 To elucidate the biological pathways dysregulated in OA, we examined the

397 enrichment of gene ontology (GO) terms of the 371 OA-related candidate genes using

398 enrichR [48] (Figure 4d, top 30 terms listed). Overall, we observe the enrichment of

399 GO terms related to cell fate and differentiation, including MSC differentiation,

16 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

400 ossification and bone development (Fig. 4d, dysregulated genes in each GO terms are

401 listed in Additional file 1: Table S7). In the ‘positive regulation of ossification pathway’,

402 two genes were identified to be more accessible and upregulated: SOX11 (only the

403 promoter is more accessible) and WNT5A (both the promoter and the enhancer are more

404 accessible) (Additional file 2: Figure S3b). It has been shown that WNT5A can

405 induce matrix metalloproteinase production and cartilage destruction [67], and its

406 upregulation is consistent with ossification being an important process and signature of

407 OA progression. Other susceptible genes and pathways that support the ossification

408 during OA from this analysis include LRP5, FGFR2 and BMPR1B in the endochondral

409 bone morphogenesis pathway, which have been reported as OA associated genes

410 previously [68–70]. In conclusion, our integrative analysis of ATAC-seq and publicly

411 available RNA-seq datasets indicates that dysregulated chondrocyte differentiation and

412 endochondral ossification are central to OA progression.

413

414 Discussion

415 Conventional epigenomic profiling at the chromatin level, such as chromatin

416 immunoprecipitation sequencing (ChIP-seq) and DNase-seq, is informative in

417 providing insights into the molecular mechanism underlying the regulation of gene

418 expression. However, applying these methods to clinically relevant tissue is less

419 feasible due to the requirement of large number of cells. Especially with the cartilage

420 tissues, the limited tissue sampling size and the extracellular matrix make collection of

421 sufficient cells difficult. One major advantage of ATAC-seq is that it can be achieved

422 with only thousands of cells, making the direct chromatin profiling of clinical samples

423 feasible. In this study, we applied ATAC-seq on OA samples to obtain a chromatin

424 accessibility map in articular cartilage, and identified regulatory regions associated with

17 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

425 OA. The lack of normal knee tissues due to difficulties associated with collecting them

426 is a limitation for studying OA. However, our previous studies of the pathology and

427 transcriptome showed that the oLT regions are very similar to normal [16,20]. Thus,

428 the oLT regions could serve as a suitable alternative to normal control, which could

429 also reduce the inter-individual variations.

430 Our strategy of integrating the epigenomic data of clinically relevant tissues

431 with the publicly available genetic and transcriptomic data allowed us to better

432 understand how the identified loci may contribute to OA pathogenesis. Most of these

433 accessible chromatin regions are annotated enhancers and we linked them to their

434 putative target genes using public datasets. With this enhancer-gene map in

435 chondrocyte, we can now better interpret the previously identified OA GWAS SNPs or

436 OA differential methylated loci located lie outside of the coding regions. For example,

437 we have verified that the previously reported OA associated SNP rs4775006 is indeed

438 accessible in chondrocytes, that lies within the enhancer region of the predicted target

439 gene ALDH1A2. Since ALDH1A2 is inactivated in prechondrogenic mesenchyme

440 during the cartilage development [71], this SNP may contribute to OA through

441 disrupting the enhancer of ALDH1A2 and inappropriately activating a cell

442 differentiation pathway. In addition, we identified several aberrantly methylated

443 enhancers that may be associated with OA. One example is cg09221159 within the

444 enhancer for PTEN gene which is hypo-methylated in damaged cartilage [19]. It is

445 involved in the positive regulation of the apoptotic signaling pathway and has been

446 proposed that chondrocyte apoptosis contributes to the failure in appropriately

447 maintaining the cartilage. Thus, our analyses show that the chromatin accessibility map

448 can provide an additional layer of evidence for determining which loci, especially those

18 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

449 in the non-coding regions, are associated with OA, which may have been ignored in

450 previous studies.

451 In general, our differential enhancer analysis shows MSC, chondrocyte and

452 osteoblast-specific enhancers are dysregulated in the damaged tissues. Furthermore,

453 motif enrichment analysis of differentially accessible loci has identified many

454 dysregulated transcription factors, the functions of which are known to be in

455 chondrocyte development regulation. For example, the transcription factor Jun-AP1,

456 which is enriched in the more accessible regions of the damaged tissues, is known to

457 regulate chondrocyte hypertrophic morphology, which contributes to longitudinal bone

458 growth, consistent with the notion of endochondral ossification [72]. A recent study

459 showed injection of adipose-derived stromal cells overexpressing an AP-1 family

460 transcription factor Fra-1 can inhibit OA progression in mice [73]. In our ATAC-seq

461 analysis, other members of AP-1 family (e.g. BATF, FOSL2) also showed enrichment

462 in differentially accessible regions, which could be candidate targets for such

463 experiments.

464 In the integrative analysis of ATAC-seq and RNA-seq, many dysregulated

465 genes related to lineage differentiation of MSC pathways were observed. For example,

466 we found that BMPR1B (bone morphogenetic protein receptor type 1B) is upregulated

467 and both its promoter and enhancer are more accessible in the damaged tissue. Its

468 activation is consistent with the ossification pathway activation, since it encodes a

469 transmembrane serine kinase that binds to BMP ligands that positively regulate

470 endochondral ossification and abnormal chondrogenesis [74,75]. Consistently, an

471 osteoblast marker gene MSX2 (Msh Homeobox 2) involved inpromoting osteoblast

472 differentiation [76,77], is upregulated and both its promoter and enhancer are more

473 accessible in the damaged tissue, suggesting the osteoblast differentiation may be

19 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

474 activated in OA. Furthermore, we found ROR2 (receptor tyrosine kinase like orphan

475 receptor 2) is down-regulated and its enhancers are less accessible in the damaged

476 samples. Since it is required for cartilage development [78,79], it suggests that the

477 normal chondrocyte development and cartilage formation may be compromised in OA.

478 Consistently, we have also determined FGFR2 and STAT1, which are known to inhibit

479 chondrocyte proliferation [80], are upregulated. In summary, our ATAC-seq data and

480 integrative analyses support the OA model of abnormal MSC differentiation and

481 endochondral ossification over other models, such as inflammatory mediums from the

482 synovium.

483 The pathogenesis for OA is not yet fully understood, despite multiple genes and

484 pathways that have been characterized to be dysregulated [13,15,42,81,82]. In this

485 study, by integrating clinically relevant epigenomic data with genetic and

486 transcriptomic data, we provide multiple lines of evidence supporting a number OA

487 candidate genes and pathways that may be crucial to OA pathogenesis, which could

488 potentially be used for clinical diagnostic or as therapeutic targets.

489

490 Conclusions

491 We present in this work an application of ATAC-seq in OA in a clinical relevant setting.

492 The chromatin accessibility map in cartilage will be a resource for future GWAS and

493 DNA methylation studies in OA and other musculoskeletal diseases. We identified

494 altered promoters and enhancers of genes that might be involved in the pathogenesis of

495 OA. Our analyses suggest aberrant enhancer usage associated with MSC differentiation

496 and chondrogenesis in OA. Understanding these molecular basis of OA is necessary for

497 future therapeutic intervention.

498

20 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

499 Abbreviations

500 ATAC-seq: Assay for Transposase-Accessible Chromatin with high throughput

501 sequencing

502 CPM: count-per-million

503 DHS: DNase I Hypersensitive Sites

504 DML: Differentially Methylated Loci

505 eQTL: expression Quantitative Trait Loci

506 FDR: False Discovery Rate

507 GWAS: Genome-wide association studies

508 iMT: inner region of medial tibial plateau

509 MSC: Mesenchymal Stem Cell

510 OA: Osteoarthritis

511 oLT: outer region of lateral tibial plateau

512 SNP: Single Nucleotide Polymorphisms

513 TSS: Transcription Start Sites

514

515 Ethics, consent and permissions

516 This study was approved by the ethics review board of all the participating institutions

517 (Sagamihara National Hospital and RIKEN).

518

519 Consent to publish

520 Informed consent was obtained from each patient enrolled in this study.

521

522 Availability of data and materials

21 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

523 Raw and processed data generated are available on the Gene Expression Omnibus

524 under the accession code GSE108301. [Reviewer access code: cpynocyaxnsvfsr]

525

526 Competing interests

527 All authors declare that they have no conflict of interest.

528

529 Funding

530 This work was supported by a Research Grant from the Japanese Ministry of Education,

531 Culture, Sports, Science and Technology (MEXT) to the RIKEN Center for Life

532 Science Technologies, RIKEN Epigenetics Program, and Ye Liu was supported by the

533 RIKEN Junior Research Associate Program.

534

535 Authors’ contributions

536 The study was conceived and designed by AM and MTML. Sample collection were

537 done by NF and NT. Experiments were performed by YL. Analysis and interpretation

538 of data were carried out by YL, JC and CH. The study was supervised by AM, MTMKL

539 and ZZ. YL and JC drafted and AM, CH and MTML revised the manuscript. All authors

540 read and approved the final version of the manuscript.

541

542 Acknowledgements

543 We thank Dr. Shiro Ikegawa (RIKEN & Tokyo University) for his support on the

544 project. We thank Yanfei Zhang (Geisinger), Tommy Terooatea (RIKEN), Mitsunori

545 Yahata (RIKEN), Haruka Yabukami (RIKEN) and Prashanti Jeyamohan (RIKEN) for

546 active discussions, technical assistance and support. We thank GeNAS for the

547 sequencing service.

22

bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

548

Table 1. Differentially accessible enhancers overlapping with OA GWAS SNPs

Proxy SNP Lead SNP GWAS Log (fold Predicted Enhancer peak ID r2 2 FDR ID ID literature change) target gene(s) chr15:57922910-57924140 rs4775006 rs4646626 0.816 [56] -0.546 0.050 ALDH1A2

chr3:46705071-46706482 rs7426919 rs6442021 0.835 [40] 0.581 0.023 ALS2CL; PRSS45

chr7:32616344-32617105 rs10951345 rs7805536 0.901 [56] -0.769 0.028 AVL9

chr7:32611048-32612186 rs10807862 rs7805536 0.901 [56] -0.957 0.006 AVL9; LSM5

chr8:8680144-8681063 rs7837620 rs1876836 0.908 [56] -0.804 0.037 CLDN23; ERI1; MFHAS1

chr8:8680144-8681063 rs7837620 rs1876836 0.908 [56] -0.804 0.037 SGK223

chr3:15275408-15276316 rs4685243 rs3773475 0.581 [83] 1.060 0.009 SH3BP5

chr15:34841146-34841782 rs17237191 rs717941 1 [84] -0.696 0.041 ZNF770

549

Table 2. Differentially accessible enhancers overlapping with OA Differentially Methylated Loci

Log (fold Predicted Enhancer peak ID DML Delta beta 2 FDR change) target gene(s) AKR7A2; AKR7A3; CAPZB; chr1:19398124-19398841 cg06360604 -0.177 1.194 0.003 PQLC2 chr16:66511371-66512170 cg02934719 -0.168 0.827 0.021 BEAN1; TK2

chr11:74025533-74026295 cg01117339 -0.164 0.623 0.037 COA4; RAB6A; UCP2

chr1:85606965-85607684 cg00418071 -0.157 1.214 0.001 CYR61; DDAH1; ZNHIT6; SYDE2 ETFDH; FAM198B; FNIP2; chr4:158809913-158811140 cg11637968 -0.218 0.878 0.032 PPID; C4orf46 chr5:32767956-32768626 cg26647771 -0.198 0.924 0.010 GOLPH3; NPR3; TARS

chr1:226708432-226710048 cg04503570 -0.166 0.500 0.046 PSEN2

chr2:1720779-1722443 cg08216099 -0.193 0.732 0.016 PXDN

chr10:119523073-119524950 cg18591136 -0.152 0.508 0.040 RGS10

chr6:168412123-168412714 cg26379705 -0.174 0.736 0.009 SMOC2

chr6:146850220-146851275 cg18291422 -0.184 0.960 0.012 STXBP5

chr8:81688679-81689538 cg04491064 0.164 -0.959 0.046 CHMP4C

chr12:124700740-124701544 cg04658679 0.160 -0.514 0.030 FAM101A; UBC

chr4:26412447-26413348 cg22992279 0.159 -0.737 0.028 RBPJ

chr11:35344856-35345647 cg13971030 0.157 -0.656 0.038 SLC1A2 ZNF10; ZNF268; ZNF605; chr12:132846887-132848225 cg21232015 0.151 -0.615 0.038 ZNF84; GOLGA3; ANKLE2; CHFR 550

Table 3. OA associated genes enriched in top 30 GO terms

Differentially Differentially accessible Differentially accessible Genes expressed promoter enhancer BMPR1B √ √ √

23 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

WNT5A √ √ √ ZMIZ1 √ √ CHRD √ √ DCC √ √ HMGA2 √ √ PMAIP1 √ √ PRDM1 √ √ SOX11 √ √ TENM3 √ √ WNT5B √ √ BDNF √ √ CHDH √ √ ETS1 √ √ FGFR2 √ √ FOXF1 √ √ IFT122 √ √ LRP5 √ √ MAF √ √ PTPRS √ √ RPS19 √ √ RTN4RL1 √ √ SLC2A12 √ √ SLC2A5 √ √ SLC44A1 √ √ TBX4 √ √ THBS1 √ √ ZNF516 √ √ 551

552 References

553 1. Findlay DM, Kuliwaba JS. Bone–cartilage crosstalk: a conversation for

554 understanding osteoarthritis. Bone Res. 2016;4:16028.

555 2. Goldring MB, Goldring SR. Articular cartilage and subchondral bone in the

556 pathogenesis of osteoarthritis. Ann N Y Acad Sci. 2010. p. 230–7.

557 3. Kwoh CK. Epidemiology of osteoarthritis. Epidemiol Aging. 2012. p. 523–36.

558 4. Martel-pelletier J, Barr AJ, Cicuttini FM, Conaghan PG, Cooper C, Goldring MB,

559 et al. Osteoarthritis. Nat Rev Dis Prim. 2016;2:16072.

560 5. Krane SM, Quigley J, Krane S. Petulant cellular acts: destroying the ECM rather

561 than creating it. J Clin Invest. American Society for Clinical Investigation;

562 2001;107:31–2.

563 6. Kapoor M. Pathogenesis of Osteoarthritis. Osteoarthritis. Cham: Springer

24 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

564 International Publishing; 2015. p. 1–28.

565 7. Man GS, Mologhianu G. Osteoarthritis pathogenesis - a complex process that

566 involves the entire joint. J Med Life. 2014;7:37–41.

567 8. Kuyinu EL, Narayanan G, Nair LS, Laurencin CT. Animal models of osteoarthritis:

568 Classification, update, and measurement of outcomes. J. Orthop. Surg. Res. 2016.

569 9. Kawaguchi H. The canonical Wnt signal paradoxically regulates osteoarthritis

570 development through the endochondral ossification process. Integr Mol Med.

571 2016;3:672–4.

572 10. Dreier R. Hypertrophic differentiation of chondrocytes in osteoarthritis: the

573 developmental aspect of degenerative joint disorders. Arthritis Res Ther.

574 2010;12:216.

575 11. Kawaguchi H. Endochondral ossification signals in cartilage degradation during

576 osteoarthritis progression in experimental mouse models. Mol Cells. 2008;25:1–6.

577 12. Cox LGE, Van Donkelaar CC, van Rietbergen B, Emans PJ, Ito K. Alterations to

578 the subchondral bone architecture during osteoarthritis: Bone adaptation vs

579 endochondral bone formation. Osteoarthr Cartil. 2013;21:331–8.

580 13. Uhalte EC, Wilkinson JM, Southam L, Zeggini E. Pathways to understanding the

581 genomic aetiology of osteoarthritis. Hum. Mol. Genet. 2017. p. R193–201.

582 14. Corradin O, Scacheri PC. Enhancer variants: Evaluating functions in common

583 disease. Genome Med. 2014;6:85.

584 15. Dunn SL, Soul J, Anand S, Schwartz JM, Boot-Handford RP, Hardingham TE.

585 Gene expression changes in damaged osteoarthritic cartilage identify a signature of

586 non-chondrogenic and mechanical responses. Osteoarthr Cartil. 2016;24:1431–40.

587 16. Chou CH, Lee CH, Lu LS, Song IW, Chuang HP, Kuo SY, et al. Direct

588 assessment of articular cartilage and underlying subchondral bone reveals a

25 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

589 progressive gene expression change in human osteoarthritic knees. Osteoarthr Cartil.

590 2013;21:450–61.

591 17. Alvarez-Garcia O, Fisch KM, Wineinger NE, Akagi R, Saito M, Sasho T, et al.

592 Increased DNA Methylation and Reduced Expression of Transcription Factors in

593 Human Osteoarthritis Cartilage. Arthritis Rheumatol. 2016;68:1876–86.

594 18. Jeffries MA, Donica M, Baker LW, Stevenson ME, Annan AC, Humphrey MB, et

595 al. Genome-Wide DNA methylation study identifies significant epigenomic changes

596 in osteoarthritic cartilage. Arthritis Rheumatol. 2014;66:2804–15.

597 19. Zhang Y, Fukui N, Yahata M, Katsuragawa Y, Tashiro T, Ikegawa S, et al.

598 Genome-wide DNA methylation profile implicates potential cartilage regeneration at

599 the late stage of knee osteoarthritis. Osteoarthr Cartil. Elsevier Ltd; 2016;24:835–43.

600 20. Chou CH, Lee MTM, Song IW, Lu LS, Shen HC, Lee CH, et al. Insights into

601 osteoarthritis progression revealed by analyses of both knee tibiofemoral

602 compartments. Osteoarthr Cartil. 2015;23:571–80.

603 21. Bernstein BE, Birney E, Dunham I, Green ED, Gunter C, Snyder M. An

604 integrated encyclopedia of DNA elements in the . Nature.

605 2012;489:57–74.

606 22. Herz HM. Enhancer deregulation in cancer and other diseases. BioEssays. 2016.

607 p. 1003–15.

608 23. Forrest ARR, Kawaji H, Rehli M, Baillie JK, De Hoon MJL, Haberle V, et al. A

609 promoter-level mammalian expression atlas. Nature. 2014;507:462–70.

610 24. Bernstein BE, Stamatoyannopoulos JA, Costello JF, Ren B, Milosavljevic A,

611 Meissner A, et al. The NIH roadmap epigenomics mapping consortium. Nat.

612 Biotechnol. 2010. p. 1045–8.

613 25. Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, et al. The Genotype-

26 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

614 Tissue Expression (GTEx) project. Nat. Genet. 2013. p. 580–5.

615 26. Cao Q, Anyansi C, Hu X, Xu L, Xiong L, Tang W, et al. Reconstruction of

616 enhancer-target networks in 935 samples of human primary cells, tissues and cell

617 lines. Nat Genet. 2017;49:1428–36.

618 27. Buenrostro JD, Wu B, Chang HY, Greenleaf WJ. ATAC-seq: A method for

619 assaying chromatin accessibility genome-wide. Curr Protoc Mol Biol.

620 2015;2015:21.29.1-21.29.9.

621 28. Altman R, Asch E, Bloch D, Bole G, Borenstein D, Brandt K, et al. Development

622 of criteria for the classification and reporting of osetoarthritis: classification of

623 osteoarthritis of the knee. Arthritis Rheum. 1986;29:1039–49.

624 29. Milani P, Escalante-Chong R, Shelley BC, Patel-Murray NL, Xin X, Adam M, et

625 al. Cell freezing protocol suitable for ATAC-Seq on motor neurons derived from

626 human induced pluripotent stem cells. Sci Rep. 2016;6:25474.

627 30. Lee J, Christoforo G, Christoforo G, Foo C, Probert C, Kundaje A, et al.

628 Kundajelab/Atac_Dnase_Pipelines: 0.3.0. 2016;

629 31. Schep AN, Buenrostro JD, Denny SK, Schwartz K, Sherlock G, Greenleaf WJ.

630 Structured nucleosome fingerprints enable high-resolution mapping of chromatin

631 architecture within regulatory regions. Genome Res. 2015;25:1757–70.

632 32. Boyle AP, Davis S, Shulha HP, Meltzer P, Margulies EH, Weng Z, et al. High-

633 Resolution Mapping and Characterization of Open Chromatin across the Genome.

634 Cell. 2008;132:311–22.

635 33. Wook LJ, Wouter M, Anshul K. Delineation of DNaseI-accessible regulatory

636 regions. 2015. http://egg2.wustl.edu/roadmap/web_portal/DNase_reg.html

637 34. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, et al. The

638 human genome browser at UCSC. Genome Res. Cold Spring Harbor Laboratory

27 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

639 Press; 2002;12:996–1006.

640 35. Quinlan AR, Hall IM. BEDTools: A flexible suite of utilities for comparing

641 genomic features. Bioinformatics. 2010;26:841–2.

642 36. McCarthy DJ, Chen Y, Smyth GK. Differential expression analysis of multifactor

643 RNA-Seq experiments with respect to biological variation. Nucleic Acids Res.

644 Oxford University Press; 2012;40:4288–97.

645 37. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for

646 differential expression analysis of digital gene expression data. Bioinformatics.

647 2010;26:139–40.

648 38. Li MJ, Wang P, Liu X, Lim EL, Wang Z, Yeager M, et al. GWASdb v2: an

649 update database for human genetic variants identified by genome-wide association

650 studies. Nucleic Acids Res. 2016;44:869–76.

651 39. MacArthur J, Bowler E, Cerezo M, Gil L, Hall P, Hastings E, et al. The new

652 NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog).

653 Nucleic Acids Res. 2017;45:D896–901.

654 40. Zengini E, Hatzikotoulas K, Tachmazidou I, Steinberg J, Hartwig FP, Southam L,

655 et al. The genetic architecture of osteoarthritis: insights from UK Biobank. bioRxiv.

656 Cold Spring Harbor Laboratory; 2017;174755.

657 41. Johnson AD, Handsaker RE, Pulit SL, Nizzari MM, O’Donnell CJ, De Bakker

658 PIW. SNAP: A web-based tool for identification and annotation of proxy SNPs using

659 HapMap. Bioinformatics. 2008;24:2938–9.

660 42. Steinberg J, Ritchie GRS, Roumeliotis TI, Jayasuriya RL, Clark MJ, Brooks RA,

661 et al. Integrative epigenomics, transcriptomics and proteomics of patient chondrocytes

662 reveal genes and pathways involved in osteoarthritis. Sci Rep. 2017;7.

663 43. Fernandez-Santiago R, Carballo-Carbajal I, Castellano G, Torrent R, Richaud Y,

28 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

664 Sanchez-Danes A, et al. Aberrant epigenome in iPSC-derived dopaminergic neurons

665 from Parkinson’s disease patients. EMBO Mol Med. 2015;7:1529–46.

666 44. Hon C, Ramilowski J, Harshbarger J, Bertin N, Rackham O, Gough J, et al. An

667 atlas of human long non-coding RNAs with accurate 5‘ ends. Nature. 2017;543:199–

668 204.

669 45. Consortium GTe. Human genomics. The Genotype-Tissue Expression (GTEx)

670 pilot analysis: multitissue gene regulation in humans. Science (80- ). 2015;348:648–

671 60.

672 46. Cao Q, Anyansi C, Hu X, Xu L, Xiong L, Tang W, et al. Reconstruction of

673 enhancer–target networks in 935 samples of human primary cells, tissues and cell

674 lines. 2017. http://www.nature.com/doifinder/10.1038/ng.3950

675 47. Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq

676 quantification. Nat Biotechnol. 2016;34:525–7.

677 48. Kuleshov M V., Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, et al.

678 Enrichr: a comprehensive gene set enrichment analysis web server 2016 update.

679 Nucleic Acids Res. 2016;44:W90–7.

680 49. Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, et al. Simple

681 combinations of lineage-determining transcription factors prime cis-regulatory

682 elements required for macrophage and B cell identities. Mol Cell. 2010;38:576–89.

683 50. Chou C-H, Wu C-C, Song I-W, Chuang H-P, Lu L-S, Chang J-H, et al. Genome-

684 wide expression profiles of subchondral bone in osteoarthritis. Arthritis Res Ther.

685 2013;15:R190.

686 51. Zhang Y, Fukui N, Yahata M, Katsuragawa Y, Tashiro T, Ikegawa S, et al.

687 Identification of DNA methylation changes associated with disease progression in

688 subchondral bone with site-matched cartilage in knee osteoarthritis. Sci Rep. Nature

29 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

689 Publishing Group; 2016;6:34460.

690 52. Farh KKH, Marson A, Zhu J, Kleinewietfeld M, Housley WJ, Beik S, et al.

691 Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature.

692 2015;518:337–43.

693 53. Zhou S, Xie Y, Li W, Huang J, Wang Z, Tang J, et al. Conditional Deletion of

694 Fgfr3 in Chondrocytes leads to Osteoarthritis-like Defects in Temporomandibular

695 Joint of Adult Mice. Sci Rep. 2016;6.

696 54. Tang J, Su N, Zhou S, Xie Y, Huang J, Wen X, et al. Fibroblast Growth Factor

697 Receptor 3 Inhibits Osteoarthritis Progression in the Knee Joints of Adult Mice.

698 Arthritis Rheumatol. 2016;68:2432–43.

699 55. Iwasa K, Hayashi S, Fujishiro T, Kanzaki N, Hashimoto S, Sakata S, et al. PTEN

700 regulates matrix synthesis in adult human chondrocytes under oxidative stress. J

701 Orthop Res. 2014;32:231–7.

702 56. Zeggini E, Panoutsopoulou K, Southam L, Rayner NW, Day-Williams AG, Lopes

703 MC, et al. Identification of new susceptibility loci for osteoarthritis (arcOGEN): a

704 genome-wide association study. Lancet. 2012;380:815–23.

705 57. Heinz S, Romanoski CE, Benner C, Glass CK. The selection and function of cell

706 type-specific enhancers. Nat Rev Mol Cell Biol. Nature Publishing Group;

707 2015;16:144–54.

708 58. Kron KJ, Bailey SD, Lupien M. Enhancer alterations in cancer: a source for a cell

709 identity crisis. Genome Med. 2014;6:77.

710 59. Saito T, Fukai A, Mabuchi A, Ikeda T, Yano F, Ohba S, et al. Transcriptional

711 regulation of endochondral ossification by HIF-2α during skeletal growth and

712 osteoarthritis development. Nat Med. Nature Publishing Group; 2010;16:678–86.

713 60. Mackie EJ, Ahmed YA, Tatarczuch L, Chen KS, Mirams M. Endochondral

30 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

714 ossification: How cartilage is converted into bone in the developing skeleton. Int J

715 Biochem Cell Biol. 2008;40:46–62.

716 61. Vincenti MP, Brinckerhoff CE. Transcriptional regulation of collagenase (MMP-

717 1, MMP-13) genes in arthritis: integration of complex signaling pathways for the

718 recruitment of gene-specific transcription factors. Arthritis Res. BioMed Central;

719 2002;4:157.

720 62. Okuma T, Hirata M, Yano F, Mori D, Kawaguchi H, Chung UI, et al. Regulation

721 of mouse chondrocyte differentiation by CCAAT/enhancer-binding . Biomed

722 Res. Biomedical Research Press; 2015;36:21–9.

723 63. Hong E, Di Cesare PE, Haudenschild DR. Role of c-Maf in Chondrocyte

724 Differentiation. Cartilage. SAGE PublicationsSage CA: Los Angeles, CA; 2011;2:27–

725 35.

726 64. Hayashi S, Fujishiro T, Hashimoto S, Kanzaki N, Chinzei N, Kihara S, et al. p21

727 deficiency is susceptible to osteoarthritis through STAT3 phosphorylation. Arthritis

728 Res Ther. BioMed Central; 2015;17:314.

729 65. Iwamoto M, Tamamura Y, Koyama E, Komori T, Takeshita N, Williams JA, et al.

730 Transcription factor ERG and joint and articular cartilage formation during mouse

731 limb and spine skeletogenesis. Dev Biol. 2007;305:40–51.

732 66. Iwamoto M, Higuchi Y, Enomoto-Iwamoto M, Kurisu K, Koyama E, Yeh H, et

733 al. The role of ERG (ets related gene) in cartilage development. Osteoarthr Cartil.

734 W.B. Saunders; 2001;9:S41–7.

735 67. Huang G, Chubinskaya S, Liao W, Loeser RF. Wnt5a induces catabolic signaling

736 and matrix metalloproteinase production in human articular chondrocytes. Osteoarthr

737 Cartil. 2017;25:1505–15.

738 68. Papathanasiou I, Malizos KN, Tsezou A. Low-density lipoprotein receptor-related

31 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

739 protein 5 (LRP5) expression in human osteoarthritic chondrocytes. J Orthop Res.

740 2010;28:348–53.

741 69. Ellman MB, Yan D, Ahmadinia K, Chen D, An HS, Im HJ. Fibroblast growth

742 factor control of cartilage homeostasis. J. Cell. Biochem. 2013. p. 735–42.

743 70. Wu M, Chen G, Li Y-P. TGF-β and BMP signaling in osteoblast, skeletal

744 development, and bone formation, homeostasis and disease. Bone Res. 2016;4:16009.

745 71. Hoffman LM, Garcha K, Karamboulas K, Cowan MF, Drysdale LM, Horton WA,

746 et al. BMP action in skeletogenesis involves attenuation of retinoid signaling. J Cell

747 Biol. 2006;174:101–13.

748 72. He X, Ohba S, Hojo H, McMahon AP. AP-1 family members act with Sox9 to

749 promote chondrocyte hypertrophy. Development. 2016;1:dev.134502.

750 73. Schwabe K, Garcia M, Ubieta K, Hannemann N, Herbort B, Luther J, et al.

751 Inhibition of Osteoarthritis by Adipose-Derived Stromal Cells Overexpressing Fra-1

752 in Mice. Arthritis Rheumatol. 2016;68:138–51.

753 74. Yoon BS, Ovchinnikov DA, Yoshii I, Mishina Y, Behringer RR, Lyons KM.

754 Bmpr1a and Bmpr1b have overlapping functions and are essential for chondrogenesis

755 in vivo. Proc Natl Acad Sci. 2005;102:5062–7.

756 75. Li J, Dong S. The signaling pathways involved in chondrocyte differentiation and

757 hypertrophic differentiation. Stem Cells Int. 2016;2016.

758 76. Matsubara T, Kida K, Yamaguchi A, Hata K, Ichida F, Meguro H, et al. BMP2

759 regulates osterix through Msx2 and Runx2 during osteoblast differentiation. J Biol

760 Chem. 2008;283:29119–25.

761 77. Cheng SL, Shao JS, Charlton-Kachigian N, Loewy AP, Towler DA. Msx2

762 Promotes Osteogenesis and Suppresses Adipogenic Differentiation of Multipotent

763 Mesenchymal Progenitors. J Biol Chem. 2003;278:45969–77.

32 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

764 78. Dickinson S, Hollander A. The Wnt5a Receptor, Receptor Tyrosine Kinase-Like

765 Orphan Receptor 2, Is a Predictive Cell Surface Marker of Human Mesenchymal

766 Stem Cells with an Enhanced Capacity for Chondrogenic Differentiation. Stem Cells.

767 2017;26:2399–407.

768 79. Schille C, Bayerlová M, Bleckmann A, Schambony A. Ror2 signaling is required

769 for local upregulation of GDF6 and activation of BMP signaling at the neural plate

770 border. Development. 2016;143:3182–94.

771 80. Karuppaiah K, Yu K, Lim J, Chen J, Smith C, Long F, et al. FGF signaling in the

772 osteoprogenitor lineage non-autonomously regulates postnatal chondrocyte

773 proliferation and skeletal growth. Development. 2016;143:1811–22.

774 81. Shen J, Abu-Amer Y, O’Keefe RJ, McAlinden A. Inflammation and epigenetic

775 regulation in osteoarthritis. Connect. Tissue Res. 2017. p. 49–63.

776 82. Usami Y, Gunawardena AT, Iwamoto M, Enomoto-Iwamoto M. Wnt signaling in

777 cartilage development and diseases: Lessons from animal studies. Lab Investig.

778 2016;96:186–96.

779 83. Miyamoto Y, Shi D, Nakajima M, Ozaki K, Sudo A, Kotani A, et al. Common

780 variants in DVWA on 3p24.3 are associated with susceptibility to knee

781 osteoarthritis. Nat Genet. Nature Publishing Group; 2008;40:994–8.

782 84. Zhai G, van Meurs JBJ, Livshits G, Meulenbelt I, Valdes AM, Soranzo N, et al. A

783 genome-wide association study suggests that a locus within the ataxin 2 binding

784 protein 1 gene is associated with hand osteoarthritis: the Treat-OA consortium. J Med

785 Genet. 2009;46:614–6.

786

787 Additional files

788 Additional file 1: Supplemental tables S1-S7. (XLSX, 12.8 MB)

33 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

789 Additional file 2: Supplemental figures S1-S3. (PDF, 870 KB)

790

791 Figure legends

792 Fig. 1. ATAC-seq of OA cartilage tissues. (a) Schematic diagram of the experimental

793 flow. Chondrocyte from two regions of the articular cartilages were isolated for ATAC-

794 seq. oLT (outer Libial Tibial): intact tissue. iMT (inner Medial Tibial): damaged tissue.

795 (b) ATAC-seq sequencing fragment length distribution for pooled oLT (n=8) and iMT

796 (n=8) libraries. The banding pattern, representing fragments from nucleosome free

797 regions, mono-, di-, and tri-nucleosome, indicates a successful ATAC-seq experiment.

798 (c) Normalized nucleoATAC signal aggregated over all genes shows distinct

799 nucleosome positioning around transcription start sites (TSS). Positive and negative

800 numbers indicate the TSS upstream and downstream nucleosomes.

801

802 Fig. 2. Identification of accessible chromatin regions with OA susceptible GWAS

803 SNPs and differentially methylated loci. (a) Annotation of identified accessible

804 chromatin regions. (b-c) An example of an OA GWAS SNP rs3752746 (b) and

805 differentially methylated locus cg09221159 (c) overlapping with an open enhancer in

806 cartilages from OA patients. The arrow points to the predicted target genes.

807

808 Fig. 3. Bone- and chondrocyte-specific enhancers are dysregulated in OA cartilage.

809 (a) Principal component analysis using ATAC-seq peaks shows general separation

810 between oLT and iMT. (b) The extent of differential accessibility between iMT and

811 oLT are correlated with enhancers composition (top), enrichment of OA differential

812 methylated loci (DML, middle), and enrichment of OA GWAS SNPs (bottom). CTL:

813 Control trait (Parkinson’s disease). Circles represent the mean and error bars represent

34 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

814 the standard error for 100 times permutation. (c) Pie chart shows proportions of the

815 ATAC-seq peaks that are more accessible in iMT (red) and oLT (blue). (d) Genome

816 browser views of example loci (all patients pooled) showing differential accessible

817 regions at promoters (top) and enhancers (bottom) with more accessible in iMT (left),

818 more accessible in oLT (middle), and not significantly altered between oLT and iMT

819 (right). (e) Enrichment of cell type-specific enhancers in differentially accessible peaks.

820 Top 10 are listed (inset), which includes bone- and chondrocyte-related cell types (bold).

821 Error bars represent 95% confidence interval.

822

823 Fig. 4. Integrative transcriptomic and epigenomic analysis reveals dysregulated

824 genes and pathways in OA. (a) A scheme of ATAC-seq and RNA-seq integration

825 analysis. (b) Differential chromatin accessibility (ATAC-seq) at both promoter (left)

826 and enhancer (right) in OA is generally consistent with differential expression (RNA-

827 seq [15]). Fold change between iMT and oLT is plotted. Box plots show the median,

828 quartiles, and Tukey whiskers. (c) Venn diagram summarizing protein coding genes

829 that are dysregulated both at the transcriptomic (RNA-seq) and epigenomic (ATAC-

830 seq) levels in OA. (d) GO enrichment analysis of 371 genes from (c). Top 30 terms in

831 GO biological process ranked by the level of enrichment and overlapping dysregulated

832 genes are listed. Terms related to MSC, bones, and chondrocytes are highlight in bold.

833

834 Additional file 2: Figure S1. ATAC-seq of human cartilage. (a) Images of the

835 dissected cartilage tissue from fresh articular tibial bone for all patients. Section regions

836 of tibial plateau: outer region of lateral tibial plateau (oLT) exhibited macroscopically

837 normal cartilage with a visibly smooth cartilage surface, and inner region of medial

838 tibial plateau (iMT) had visible severe erosion of cartilage. (b) Enrichment qPCR of a

35 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

839 housekeeping gene (GAPDH) over a heterochromatin region is correlated with

840 enrichment of ATAC-seq signal in DHS regions.

841

842 Additional file 2: Figure S2. Differential ATAC-seq peaks. (a) Mean-difference plot

843 of all ATAC-seq peaks. Red and blue dots represent peaks that are more- and less-

844 accessible in iMT, respectively. (b) Principal Component Analysis of differential

845 ATAC peaks between oLT (blue) and iMT (red). (c) Genome browser views showing

846 consistency of ATAC-seq signals across patients at example loci.

847

848 Additional file 2: Figure S3. Data integration. (a) Top de novo motif (top) and top

849 predicted known transcription factors (bottom) enriched in differentially accessible

850 regions. (b) An example gene (WNT5A) that is dysregulated at all three criteria

851 (promoter accessibility, enhancer accessibility and expression).

36 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Fig. 1

a Posterior

oLT iMT

Cartilage Collagenase II 50,000 ATAC-seq Anterior oLT or iMT digestion Chondrocytes Fresh tibial bone tissue 12 hours

b c +1 oLT 10–2 oLT 0.4 iMT l iMT

0.3 –1 +2 10–4 –2 +3 0.2

NucleoATAC signa NucleoATAC 0.1 Normalized read density read Normalized 0 250 500 750 1000 –1000 0 1000 Fragment length (bp) Distance to TSS (bp) bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Fig. 2

a ATAC-seq peaks

Unannotated Promoter 13,150 (12%) 18,410 (16.9%)

Enhancer 77,655 (71.1%)

b ATAC-seq + OA GWAS SNPs chr4:1727018-1809018

82 kb ATAC-seq peaks

OA GWAS SNPs rs3752746

Enhancers link to genes

Genes TACC3 FGFR3

c ATAC-seq + OA DMLs chr10:8701566-88472566

771 kb ATAC-seq peaks

cg09221159 OA DMLs

Enhancers link to genes

Genes PTEN RNLS bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

[0 - 31]

[0 - 31] Fig. 3

[0 - 31]

More accessible More accessible a ● Tissue c oLT01 [0 - 31] ● iMT in iMT in oLT

● 0.50 ● oLT (n=1,565, 1.4%) (n=2,885, 2.6%) iMT01

[0 - 31] 0.25 Not significantly oLT04 oLT03 [0 - 31] altered oLT02 ● ● ●

PC2[16.03%] 0.00 oLT06 (n=104,765, 96%) ● iMT03

● ● iMT06 oLT07 ● iMT07 ● ● ● ● ● oLT08 iMT02 [0 - 31]iMT08 iMT04 -0.25 ● d chr2:5689561-5695050 chr10:84243895-84245569 chr12:6533278-6536109 oLT05 ● iMT05 SOX11 RGR GAPDH -0.25 0.00 0.25 0.50 [0 - 63] [0 - 73] [0 - 302] PC1[41.37%] b [0 - 63] [0 - 73] [0 - 302] 100 Annotation Promoters Unannotated 5,489 bp 1,674 bp 2,831 bp Promoter 75

e Enhancer g chr4:5730356-5732322 chr19:7739088-7740288 chr10:78970684-78976907 50 EVC CD209 ZMIZ1-AS1 [0 - 200] [0 - 174] [0 - 684]

Percenta 25

[0 - 200] [0 - 174] [0 - 684] 0 Enhancers Trait 1,966 bp 1,200 bp 6,223 bp OA 4.0 CTL

3.0 e Top 10 enriched cell types Adipocytes (E023) 3.0 2.0 Dermal fibroblasts (E126) Enrichment of IMR90 cell line (E017) DML (odds ratio) Osteoblasts (E129) 1.0 MSC (E026) Chondrocytes (E049) Lung fibroblasts (E128) 2.5 Trait Muscle satellite cells e-spcific atio) r OA p (E052) CTL 2.0 2.0 Astrocytes (E125) Myoblasts (E120)

1.5 enhancer (odds Enrichment of 1.0

1.0 Enrichment of cell ty GWAS SNP (odds ratio) 0.1 0.3 0.5 0.7 0.9 Cumulative fraction of peaks Differential peaks Permutated peaks Extent of differential accessibility 125 cell types bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Fig. 4

a Predicted link

enhancer promoter Gene ATAC-seq RNA-seq

b Promoters Enhancers

-16 <2.2x10 <2.2x10-16 3

2 2.5 1 ) 2 0 (log 0 –1 RNA-seq –2 –2.5 Fold change in expression in change Fold More accessible More accessible More accessible More accessible in oLT in iMT in oLT in iMT ATAC-seq

c Total genes (n=17,438)

Genes with Genes with differentially accessible differentially accessible promoters enhancers 27 (ATAC-seq) 165 2059 (ATAC-seq) 12 51 308 Genes dysrgulated 2922 in OA (n=371) Genes with differential expression (RNA-seq)

Level of Genes dysregulated d Gene ontology terms enrichment in OA

positive regulation of TRAIL−activated apoptotic signaling pathway PMAIP1;THBS1 positive regulation of mesenchymal stem cell proliferation HMGA2;SOX11 negative regulation of collateral sprouting of intact axon in response to injury DCC;PTPRS;RTN4RL1 mesenchymal cell fate commitment HMGA2;FGFR2 mesenchymal cell differentiation involved in salivary gland development HMGA2;FGFR2 mesenchymal cell differentiation HMGA2;FGFR2 embryonic ectodermal digestive tract morphogenesis SOX11;FOXF1;FGFR2 collateral sprouting of intact axon in response to injury RTN4RL1;BDNF choline catabolic process SLC44A1;CHDH cardiac endothelial to mesenchymal transition HMGA2;FGFR2 positive regulation of metanephric cap mesenchymal cell proliferation CHRD;LRP5;FGFR2 positive regulation of mesenchymal cell proliferation involved in lung development CHRD;LRP5;FGFR2 post−embryonic limb morphogenesis BMPR1B;TBX4 negative regulation of optical nerve axon regeneration PTPRS;RTN4RL1 lens fiber cell development WNT5B;MAF positive regulation of mesenchymal cell proliferation involved in ureter development CHRD;LRP5;FGFR2 positive regulation of ossification SOX11;WNT5A positive regulation of cilium movement RPS19;ETS1 positive regulation of actin filament−based movement RPS19;ETS1 hindlimb morphogenesis BMPR1B;TBX4 hexose transmembrane transport SLC2A12;SLC2A5 fat body development ZNF516;LRP5 endochondral bone morphogenesis BMPR1B;LRP5;FGFR2 post−embryonic camera−type eye morphogenesis TENM3;IFT122 positive regulation of extrathymic T cell differentiation ZMIZ1;PRDM1 positive regulation of cellular component movement RPS19;ETS1 negative regulation of dendritic spine development DCC;PTPRS fat pad development ZNF516;LRP5 establishment of epithelial cell apical/basal polarity involved in camera−type eye morphogenesis TENM3;IFT122 adipose tissue development ZNF516;LRP5 0 10 20 30 (% of overlapping genes) bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Figure S1

a #1 Posterior #2 #3 Posterior #4 Posterior Anterior

oLT iMT iMT iMT iMT oLT oLT oLT Anterior Anterior Posterior Anterior #8 #5 Anterior #6 Posterior #7 Anterior Posterior oLT oLT iMT oLT iMT oLT iMT Anterior Posterior Anterior Posterior

b 7

6 oLT iMT 5

4

3 R² = 0.48578 (qPCR dCT) (qPCR 2 Abundance of GAPDH of Abundance 1

0 1 2 3 4 5

DHS Enrichment Score bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Figure S2

a c chr10:85918514-85921724 chr17:48541410-48545154

1.5 3,210 bp 3,744 bp

[0 - 60] [0 - 77] Pooled oLT oLT01 CF g o l oLT02 FC 2 oLT03 log oLT04 oLT05 oLT06

–1.5 –1.0 –0.5 0 0.5 1.0 oLT07

234567 oLT08 AverageAverage logCPM [0 - 60] [0 - 77] b Differential ATAC peaks Pooled oLT iMT01 iMT02 iMT02 oLT04 iMT03 iMT07 oLT02oLT03 oLT06 iMT04

iMT08 2 CP 2 oLT07 oLT01 iMT05 iMT06 oLT08 iMT05 iMT06

iMT04 iMT07 iMT01 iMT03 iMT08 Differential ATAC peaks –0.4 –0.2 0 0.2

oLT05 Genes –0.4 –0.2 0 0.2 0.4 0.6 GRID1 HOXB2 PC 1 bioRxiv preprint doi: https://doi.org/10.1101/274043; this version posted March 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Figure S3

a Top de novo Motif in oLT more accessible region (P value: 1e-22)

CEBPE/MA0837.1/Jaspar(0.936) Top de novo Motif in iMT more accessible region (P value: 1e-83)

Jun-AP1 (bZIP)/ Homer (0.977)

More accessible More accessible region in oLT region in iMT

ETS bZIP FLI1 ETS1 CEBP BATF FOSL2 AP-1 ETV2 CEBPE NFE2L2 EWS:ERG BACH2 ATF4 GABPA FOSL1 MafK ETV1 ERG NF1:FOXA1 CHOP STAT3

b 172 kb

ATAC-seq peaks

Differential ATAC-seq peaks

Enhancers link to genes

DEGs WNT5A