Investigating the role of epigenetics in scar maintenance

Andrew William Stevenson BSc. (Hons.) School of Surgery

This thesis is presented for the degree of Doctor of Philosophy at the University of Western Australia 2016


DECLARATION FOR THESES CONTAINING PUBLISHED WORK AND/OR WORK PREPARED FOR PUBLICATION The examination of the thesis is an examination of the work of the student. The work must have been substantially conducted by the student during enrolment in the degree. Where the thesis includes work to which others have contributed, the thesis must include a statement that makes the student’s contribution clear to the examiners. This may be in the form of a description of the precise contribution of the student to the work presented for examination and/or a statement of the percentage of the work that was done by the student. In addition, in the case of co-authored publications included in the thesis, each author must give their signed permission for the work to be included. If signatures from all the authors cannot be obtained, the statement detailing the student’s contribution to the work must be signed by the coordinating supervisor. Please sign one of the statements below.

1. This thesis does not contain work that I have published, nor work under review for publication.

Student Signature ......

2. This thesis contains only sole-authored work, some of which has been published and/or prepared for publication under sole authorship. The bibliographical details of the work and where it appears in the thesis are outlined below. Student Signature ......


3. This thesis contains published work prepared for publication, some of which has been co-authored. The bibliographical details of the work and where it appears in the thesis are outlined below. The student must attach to this declaration a statement for each publication that clarifies the contribution of the student to the work. This may be in the form of a description of the precise contributions of the student to the published work and/or a statement of percent contribution by the student. This statement must be signed by all authors. If signatures from all the authors cannot be obtained, the statement detailing the student’s contribution to the published work must be signed by the coordinating supervisor. Manuscripts submitted for review Student’s contribution Contribution of the co-authors Student Signature: …………………………… Date ………………… Coordinating Supervisor Signature: ……………………. Date: …………


Dedication

I dedicate this to my family – to my parents who supported, encouraged and pushed me to be the best that I can be, my brother who has always been there, and my beautiful wife Kara who encouraged me to do this PhD and patiently supported me throughout.


Abstract

The reparative response to skin injury in mammals results in the development of scar, underpinned by changes in the dermal matrix structure. Scarring is a significant clinical problem and leads to aesthetic, functional and psychological impacts in patients.

Scars are maintained for life and, in the case of children, increase in size during periods of growth. This suggests that the cells producing the scar matrix remain different from the cells producing normal skin dermal matrix.

Epigenetic modification is a heritable alteration to DNA that regulates transcription but does not involve changes to the DNA sequence. DNA methylation, an important and stable regulator of gene transcription, is an example of epigenetic modification. In DNA methylation, methyl groups are added to specific DNA bases and alter DNA structure. This affects the transcriptome and cell phenotype. Epigenetic changes are known to be critical in tissue differentiation during development and in cancer.

In this study, the central hypothesis is that epigenetic modification in fibroblasts during healing, specifically DNA methylation, is the mechanism responsible for long-term changes in collagen metabolism in normotrophic scar fibroblasts and subsequently the dermal matrix. Therefore the aim of this study was to characterise changes in DNA methylation and in the transcriptome of scar fibroblasts compared to normal skin fibroblasts and identify which underpin the maintenance of scar dermal matrix. This could lead to the identification of novel therapeutic targets to ameliorate scarring.

To identify changes in the epigenome and transcriptome of scar fibroblasts, matched 3 mm skin biopsies were taken from both forearms of 6 male burn patients aged 18-34 years who had sustained a unilateral forearm burn injury at least one year prior to biopsy, giving 6 normotrophic scar samples and 6 patient-matched normal skin samples. Fibroblasts were cultured from these biopsies using the explant technique and these cells were then used for whole genome methylation and transcriptome profiling.

The first study assessed the methylation profile of scar and normal skin fibroblasts. Whole genome methylation data was obtained using the Illumina Infinium 450K array, which measures 485 000 methylation sites across the genome. The data was analysed to identify site-specific changes as well as gene and region changes in DNA methylation. Using a pairwise comparison and defining significance as p<0.05 with a Benjamini-Hochberg correction for multiple testing, 0.7% of the Cytosine-phosphate-Guanine (CpG) sites tested were differentially methylated in the scar fibroblasts, with 63% hypomethylated and 37% hypermethylated. Gene and region analysis showed 836/19 076 genes (4.38%) were differentially methylated within the gene body region. A smaller number of genes were differentially methylated in intragenic regions. Finally, the promoter regions were analysed. This showed that nearly 2% of the promoter regions were differentially methylated, with 44% hypermethylated and 56% hypomethylated. Enrichment analysis identified many genes within this differentially methylated set as having important roles in extracellular matrix metabolism.
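The Benjamini-Hochberg correction named above can be illustrated with a minimal Python sketch. This is not the analysis code used in this thesis (the methylation analysis was carried out in R); the function name, the site labels and the p-values are hypothetical and chosen only to show how adjusted p-values are derived and thresholded at 0.05.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-site p-values, for illustration only (not thesis data).
site_p = {
    "site_a": 0.001,
    "site_b": 0.008,
    "site_c": 0.039,
    "site_d": 0.041,
    "site_e": 0.60,
}
adj = benjamini_hochberg(list(site_p.values()))
# Keep only sites whose adjusted p-value is below the 0.05 threshold.
significant = [site for site, q in zip(site_p, adj) if q < 0.05]
print(significant)
```

In this toy example, sites c and d have raw p-values below 0.05 but do not survive the correction, which is the behaviour that makes the procedure suitable when hundreds of thousands of sites are tested at once.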

The next part of the study involved comparative transcriptome analysis of scar and normal skin fibroblasts using the Affymetrix human gene ST 2.0 array. Using a nominal significance of p<0.05 and a fold change of ±1.5, 0.8% of genes were found to be significantly differentially expressed, with 47% increased and 53% decreased in expression in scar fibroblasts. Gene set enrichment analysis (GSEA) was carried out and identified 507 gene sets that were significantly differentially expressed, using a Mann-Whitney U-test with a p<0.05. The most significant changes were in gene sets related to extracellular matrix production and cell adhesion.
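The combined significance and fold change filter described above can be sketched as follows. This is an illustrative Python example rather than the code used for the analysis; it assumes expression differences are held as log2 fold changes, and the gene labels and values are hypothetical.

```python
import math

FOLD_CHANGE_CUTOFF = 1.5  # fold change threshold stated in the text
P_CUTOFF = 0.05           # nominal significance threshold stated in the text


def classify(log2_fold_change, p_value):
    """Label a gene as increased, decreased or unchanged in scar fibroblasts."""
    # A 1.5-fold change corresponds to log2(1.5) on the log2 scale (assumed scale).
    threshold = math.log2(FOLD_CHANGE_CUTOFF)
    if p_value >= P_CUTOFF or abs(log2_fold_change) < threshold:
        return "unchanged"
    return "increased" if log2_fold_change > 0 else "decreased"


# Hypothetical (log2 fold change, nominal p-value) pairs, not thesis data.
genes = {
    "GENE_A": (1.10, 0.003),
    "GENE_B": (-0.90, 0.020),
    "GENE_C": (0.30, 0.001),   # significant but below the fold change cutoff
    "GENE_D": (-1.40, 0.200),  # large change but not significant
}
calls = {gene: classify(fc, p) for gene, (fc, p) in genes.items()}
print(calls)
```

Requiring both criteria excludes genes with a statistically significant but small change, as well as genes with a large but unreliable change.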

Further bioinformatics analysis involved the integration of the methylome and transcriptome data to identify potential gene targets involved in maintaining the scar fibroblast phenotype. Datasets were combined into a single database, and a list of 16 genes of interest was generated. These 16 genes had significantly differentially methylated promoter regions and were differentially expressed. This list of 16 genes was then linked to gene ontologies using the UCSC table browser, and genes with DNA-binding or transcriptional activity were selected. These criteria were met by 4 targets. After extensive review, 2 were selected for further modulation – Forkhead Box F2 (FOXF2), and Mohawk Homeobox (MKX). Both genes have been implicated in collagen production and in fibrosis, although they had not previously been investigated in the skin.
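The integration step described above amounts to intersecting two gene lists and then filtering by ontology. A minimal Python sketch of that logic follows. It is not the database or code used in this thesis; the gene labels, Δβ values, fold changes and ontology annotations are all hypothetical placeholders.

```python
# Hypothetical genes with differentially methylated promoters (value = Δβ).
promoter_methylation = {"GENE_1": -0.12, "GENE_2": 0.09, "GENE_3": 0.15}

# Hypothetical differentially expressed genes (value = fold change).
expression = {"GENE_2": 1.8, "GENE_3": -2.1, "GENE_4": 1.6}

# Hypothetical gene ontology annotations.
ontology = {
    "GENE_2": {"DNA binding"},
    "GENE_3": {"cell adhesion"},
}

# Ontology terms treated as indicating DNA-binding or transcriptional activity
# (an assumption made for this sketch).
TERMS_OF_INTEREST = {"DNA binding", "transcription factor activity"}

# Step 1: genes that are both differentially methylated and expressed.
shared = sorted(promoter_methylation.keys() & expression.keys())

# Step 2: restrict to genes annotated with at least one term of interest.
candidates = [g for g in shared if ontology.get(g, set()) & TERMS_OF_INTEREST]

print(shared)
print(candidates)
```

The final prioritisation by literature review is a manual step and is not represented in the sketch.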

The final stage of this work involved using phenotypic assays to validate the pro-fibrotic functions of the FOXF2 and MKX genes. Initially a ‘scar-in-a-jar’ model was modified to assess both collagen quantity and alignment (coherency) in collagen matrices produced by scar and normal skin fibroblasts. This modified assay was then used to compare untreated scar and normal skin fibroblasts. In this assay, scar fibroblasts from only 1 of 4 patients secreted a greater quantity of collagen and a more aligned collagen matrix than the matched normal skin fibroblasts. This is most likely due to the requirement for stimulation of cells with TGFβ for this assay.

The assay was then used to determine whether FOXF2 or MKX had an impact on collagen matrix deposition in scar fibroblasts. Using siRNA knockdown of the two target genes and the modified scar-in-a-jar assay, it was found that the double knockdown of MKX and FOXF2 significantly decreased the quantity and coherence of the collagen matrix. This suggests these genes are important in scar matrix maintenance and are therefore potential targets to ameliorate scarring.

This study provides the first evidence for epigenetic and transcriptome changes in fibroblasts isolated from established normotrophic scars that may underpin long-term maintenance of the aberrant dermal matrix. Two genes identified using an integrative genomic approach affected collagen matrix deposition in vitro. These findings add important new knowledge to the understanding of why scars persist throughout the life of the patient. Further work exploring the role of the target genes in scarring and the mechanisms underlying the changes is required to translate these findings into therapeutic intervention.


Table of Contents

Abstract

Table of Contents

List of Figures

List of Tables

Acknowledgements

List of Abbreviations

CHAPTER 1 - INTRODUCTION

1.1 Skin and the problem of scarring
1.1.1 Normal skin structure and function
1.1.4 Wound Healing
1.1.5 Scar phenotype at a tissue level
1.1.6 Scar phenotype – cell level
1.1.8 Collagen turnover
1.1.9 Maintenance of scars

1.2 Genetic control of scarring
1.2.1 Gene expression studies of scarring
1.2.2 Epigenetics
1.2.6 Histone remodelling

1.3 Reprogramming cells
1.3.1 Cellular reprogramming and regeneration in vivo
1.3.2 Cellular reprogramming in vitro
1.3.3 Therapeutic cellular reprogramming

1.4 Summary

Hypotheses

Aims

CHAPTER 2 – INVESTIGATION OF THE METHYLATION STATUS OF SCAR FIBROBLASTS

Introduction

2.1. Methods
2.1.1. Patient recruitment and biopsy procedure
2.1.2. Tissue culture
2.1.3. DNA extraction and processing
2.1.4. Methylation Arrays
2.1.5. Bioinformatics

2.2. Results
2.2.1. Differential methylation of CpG sites in scar fibroblasts
2.2.2 Differentially methylated genes by genomic region
2.2.3 Methylation profile of promoter regions (within 1500 base pairs of the transcription start site)
2.2.4. Top 40 genes with differentially methylated promoter regions in scar fibroblasts
2.2.5. Top 40 ranked genes with differentially methylated promoter regions in scar fibroblasts using RnBeads analysis
2.2.6. Enrichment analysis of differentially methylated gene promoter regions in scar fibroblasts

2.3. Discussion
2.3.1. Significant changes in individual CpG site methylation patterns
2.3.2. Regional and genomic distribution of methylation changes in scar fibroblasts
2.3.3. Changes in promoter methylation – gene level
2.3.4 Alternative analysis of promoter regions using RnBeads
2.3.5 Summary
2.3.5 Limitations of study

2.4. Conclusion

CHAPTER 3 - INVESTIGATION OF THE TRANSCRIPTOME OF SCAR FIBROBLASTS

Introduction

3.1. Methods
3.1.1. RNA extraction and processing
3.1.2. Affymetrix Human Genechip 2.0 ST preparation and processing
3.1.3. Processing of expression data
3.1.4. Gene Set Enrichment Analysis

3.2. Results
3.2.1. Overview of differentially expressed genes

3.3. Discussion
3.3.1. Gene expression
3.3.2. GSEA
3.3.3. Limitations of this Study

3.4. Conclusion

CHAPTER 4 - INTEGRATIVE GENOMIC APPROACH USING METHYLOME AND TRANSCRIPTOME DATA TO IDENTIFY TARGETS INVOLVED IN SCAR MAINTENANCE

Introduction

4.1. Methods
4.1.1. Integration of methylation and gene expression datasets
4.1.2. Restriction of results by gene ontology
4.1.3. Prioritization of targets by literature analysis

4.2. Results
4.2.1. DNA Binding Targets

4.3. Discussion
4.3.1. Strengths of integrative analysis
4.3.1. Limitations of study

4.4. Conclusion

CHAPTER 5 – PHENOTYPIC ASSAYS AND TARGET VALIDATION

Aims:

5.1. Methods
5.1.1. Experimental design
5.1.2. Scar-in-a-jar
5.1.3. siRNA knockdown of MKX and FOXF2 genes
5.1.4. Quantitative real time PCR
5.1.5. Statistical analysis

5.2. Results
5.2.1. Scar cell fibroblasts vs. normal skin fibroblasts
5.2.2. siRNA knockdown of MKX and FOXF2

5.3. Discussion
5.3.1. Differences identified between normotrophic scar and normal skin fibroblasts
5.3.2. Validation of target gene involvement in extracellular matrix synthesis
5.3.3. Limitations of study

5.4. Conclusions

CHAPTER 6 - GENERAL DISCUSSION

6.1. Overview of research
6.1.1. Summary of results

6.2 Significance of these findings and relevance to the field
6.2.1. Scar cell origin as a possible explanation for the observed differences in the epigenome of scar fibroblasts
6.2.2. Pathway analyses of expression data and their importance to scar maintenance
6.2.4. The role of MKX and FOXF2 and potential for therapeutic modulation
6.2.5. Limitations and strengths of this study

6.3. Future work
6.3.1. Expanding the data set to increase confidence in current targets and identify additional genes involved in scar fibroblast phenotype
6.3.2. Validation for FOXF2 and MKX in vivo

6.3. Conclusions

REFERENCES

APPENDICES

Appendix I – patient information sheet and consent forms
Appendix II - IMA code for analysis of 450k human genechip using R
Appendix III - Full list of promoter regions of genes with p<0.05 from differential methylation analysis
Appendix IV – R code for analysis of HuGene 2.0 expression data
Appendix V - Full list of significantly differentially expressed genes
Appendix VI - 507 differentially expressed gene sets using a Mann-Whitney U test with a p<0.05 revealed in GSEA
Appendix VII - Full list of gene ontologies associated with 16 differentially methylated and expressed genes
Appendix VIII - Full list of gene set enrichment results for 16 differentially methylated and differentially expressed genes

List of Figures

Chapter 1
Figure 1.1: Schematic of normal skin architecture.
Figure 1.2: Normal fibroblasts produce ECM
Figure 1.3: Temporal diagram of the three phases of wound healing.
Figure 1.4: The four main subtypes of scar.
Figure 1.5: The steps involved in collagen synthesis.
Figure 1.6: The steps involved in collagen degradation.
Figure 1.7: Examples of scar growth over time
Figure 1.8: An illustration of the two main types of epigenetic regulation.
Figure 1.9: Regeneration of human digit tip in a young child with conservative treatment.
Figure 1.10: Regeneration of axolotl limb after amputation

Chapter 2
Figure 2.1: Differential methylation profile of scar and normal skin fibroblasts at CpG site level.
Figure 2.2: Differential methylation in gene regions in scar fibroblasts.
Figure 2.3: Differential methylation by location
Figure 2.4: Differential methylation of promoter regions in scar fibroblasts compared to normal skin fibroblasts.

Chapter 3
Figure 3.1: Overview of expression data.
Figure 3.2: Heat map of expression patterns of all the genes with greater than 0.7 Log Change (1.68 Fold Change) increase.
Figure 3.3: Heat map showing the expression patterns of all the genes with greater than 0.7 Log Change (1.68 Fold Change) decrease.
Figure 3.4: Example of network generated from top 20 entities from the ‘extracellular matrix’ group from GSEA.

Chapter 4
Figure 4.1: Gene selection was carried out first by differential methylation and expression analysis, then targets were screened for transcriptional activity and then screened for known or likely relevance to scarring
Figure 4.2: 16 genes of interest are both differentially expressed and differentially methylated
Figure 4.3: Involvement between MKX and FOXF2 and cell processes.

Chapter 5
Figure 5.1: Example ROIs showing quantification process for collagen/µm.
Figure 5.2: No change in amount of collagen produced per cell in scar and control fibroblasts in 3 out of 4 patients.
Figure 5.3: Collagen per cell levels are unchanged in scar compared to matched controls in 3 of 4 patients.
Figure 5.4: No change in coherence of collagen between scar and control fibroblasts in 3 out of 4 patients.
Figure 5.5: Confocal images of collagen matrix. No significant difference in collagen coherence was observed, with the exception of Patient 2.
Figure 5.6: qRT-PCR results match expression levels on expression array.
Figure 5.7: Collagen per cell levels decreased in siRNA knockdowns in both patients 2 and 4.
Figure 5.8: Patient 2 collagen per cell levels were significantly decreased in the FOXF2/MKX siRNA treated cells compared to the scrambled control.
Figure 5.9: Patient 4 collagen per cell levels were significantly decreased in all siRNA treated groups compared to the scrambled control.
Figure 5.10: Collagen coherence is significantly decreased in single MKX knockdown and double FOXF2/MKX knockdown cells from patient 2, and significantly increased in FOXF2 knockdown in patient 3.
Figure 5.11: Patient 2 showed a significant decrease in collagen orientation in cells treated with siRNA knockdown compared to scrambled siRNA.
Figure 5.12: Patient 4 showed no significant decrease in the coherence of collagen in the siRNA knockdowns compared to the scrambled siRNA.


List of Tables

Chapter 2
Table 2.1: Top 20 differentially methylated genes with largest increase in Δβ in scar fibroblasts (p<0.05).
Table 2.2: Top 20 differentially methylated genes with largest decrease in Δβ in scar fibroblasts (p<0.05).
Table 2.3: Top 40 differentially methylated gene promoters from RnBeads analysis sorted by rank score
Table 2.3: Top 20 enriched groups/pathways from the differentially methylated gene set data.

Chapter 3
Table 3.1: Top 20 genes with increased expression in scar fibroblasts (≥1.5 Fold Change, nominal p<0.05)
Table 3.2: Top 20 genes with decreased expression in scar fibroblasts (≥ -1.5 Fold Change, nominal p<0.05)
Table 3.3: Top 20 significantly differentially expressed gene sets identified through Gene Set Enrichment Analysis (GSEA).
Table 3.4: Top 20 up and downregulated genes within the ‘extracellular matrix’ group, which was the top hit in the GSEA.

Chapter 4
Table 4.1: 16 Differentially expressed and differentially methylated genes sorted by relative gene expression level.
Table 4.2: Selected gene ontology terms for 16 target genes feature many fibrosis related ontologies.
Table 4.3: Gene set enrichment analysis of the 16 genes differentially methylated and expressed sorted by p-value.
Table 4.4: 9 differentially expressed and differentially methylated genes in alternate RnBeads methylation analysis sorted by relative gene expression level


Acknowledgements

There are so many people to acknowledge, because so many people helped me along the way!

Firstly, to my supervisors. Huge thanks to Mark, who gave me my first grown up job, provided a great working environment, gave me a great PhD project in an area that fascinated me, and was so enthusiastic and positive about my results. Huge thanks to Hilary, who was an excellent principal supervisor, was fantastic at recruiting patients and easing me into the hospital side of things, was always available and helpful, and tempered Mark’s and my (over)enthusiasm with proper scientific rigor. Many thanks to Phil, who had endless patience with a very novice R programmer, greatly expanded my knowledge of statistics and genetics, and had similar taste in music to mine, making me jealous of all the cool bands he’s seen. Many thanks to Fiona, who inspired me whenever we spoke or I saw her speak, was instrumental in getting funding for the project and the conferences I attended, and whose input and interest in my project were always appreciated.

Next, to the many, many people who helped along the way. To the original McComb crew – Katharine Adcroft, for being so lovely when I first started as an RA, Natalie Morellini, for all the tea (and lasagne) and Leigh Parkinson, for being such an awesome dude inside and outside the lab. Thanks to Patricia Danielsen, who was amazingly helpful with the patient recruitment at the start, helped out with the cell culture when I was away, and pushed us to really look at our collagen quantitation methods, which resulted in us using the scar-in-a-jar technique. Thanks to Tomas O’Neill, for starting my project off and being my kebab buddy. Thanks to Emily O’Halloran for being so bubbly and enthusiastic and such a great person to have around the lab. Likewise to Samantha Valvis, who was so lively and energetic, and was an awesome person to have around the lab. Thanks to Janine Duke, who was great to have a chat to, even though I struggled to follow some of the statistics sometimes. Lastly, a huge thank you to Mitali Manzur, who was so helpful in the last year of my PhD with all the rush of phenotypic assays, shared all her knowledge in a ‘big sister’ role, and without whom I don’t think I could have obtained as much data as I did.

Thanks to the PhD students who shared my journey. Thanks to Dulharie Wijeratne, who is one of the kindest and sweetest people I’ve ever met, and who was always there to laugh at my crappy jokes. Thanks to Uliya Gankande, who was my CRC conference buddy and always great to talk to. Thanks to Vetrichevvel Palanivelu, who is such a nice guy and who was always willing to help out. Thanks to Mansour Alghamdi, who was a very patient student, is so generous, and helped me out with my cell culture when I was away. Thanks to the nano crew Michael Bradshaw, Vipul Agarwal, Priyanka Toshniwal and Tristan Clemons for being fun to hang out with in the lab, and without Bradshaw’s help with the coherency analysis I may have had far fewer results to write about!

Many thanks to the neuroscience crew who I shared the lab with – Marissa Penrose-Menz, Lindy Fitzgerald, Jenny Rodger, Stephanie Grehl, Tenelle Wilks, Michael Archer, Sophie Payne, Cameron Evans, Paula Fuller, Marcus Giacci, Ryan Doig, Tony Kuhar, Vince Clark, Zara Samani, Jez Supreme, Kalina Makowiecki, Ivan Lozic, Andrew Garrett, Michael Challenor, Alexia Drozdova, Nicholas Nagloo, Alex Tang, Alesha Heath, Jamie Beros, Darren Clarke, Kat Hankinson and all the Bath students – Jon Wells, Dan Smith, Matt Sykes, Charis Syzmanski, Emma Stephens, Ivana Gachulincová, Bethany Ashworth and Charlotte Bailey. You helped make an amazing working environment, it was great to have those lunchtime conversations with all of you, and I will always remember the morning teas, Melbourne Cup days and Christmas parties we had.

Thanks to my family for supporting and encouraging me through these 4 years. Thanks to Mum, for pushing me to do this PhD, always making sure I was giving it my all, and letting me know how you supported me. Thanks to Dad, who gave me my love of science and technology, was always positive about what I was doing and always took an interest in discussing the latest scientific finding with me. Thanks to Joe, for being a brotherly support and being there to hang out. Thanks to my dog Soko, for being my companion while I stayed up late writing this thesis on those cold winter nights. Lastly, all my love and thanks to my wife Kara. I remember talking to you on a rocky beach in Croatia about whether I should do my PhD, and I was unsure. You said something like “do you feel like you want to do it? Then just go on and do it!”, and that was the point I made my mind up to undertake this PhD. You supported me the whole way through, helped keep me motivated, and had patience when things didn’t go to plan. Without you this would not have been possible.


List of Abbreviations

αSMA Alpha smooth muscle actin
ABCB4 ATP-binding cassette sub-family B member 4
ACE Angiotensin converting enzyme
ADAMTS5 A disintegrin and metalloproteinase with thrombospondin type 1 motif 5
AIM2 Absent in melanoma 2
AML Acute myeloid leukaemia
B-H Benjamini-Hochberg
∆β Beta difference
BMMSC Bone marrow derived mesenchymal stem cells
BSA Bovine serum albumin
CADM1 Cell adhesion molecule 1
CCLN1 Cyclin L1
CD300E CD300e molecule
CD34 CD34 Molecule
cDNA complementary DNA
CNTFR Ciliary Neurotrophic Factor Receptor
CLEC3B C-Type Lectin Domain Family 3 Member B
CMCA Centre for Microscopy, Characterisation and Analysis
COL4A1 Collagen IV α-1
COMP Cartilage oligomeric matrix protein
CpG C-phosphate-G
CS Cockayne syndrome
CSA Cockayne Syndrome 1
CSB Cockayne Syndrome B Protein
CTGF Connective tissue growth factor
DALYs Disability adjusted life-years
DEFB119 Defensin beta 119
DMEM Dulbecco’s modified Eagle medium
DNA Deoxyribonucleic acid
DNMTs DNA methyltransferases
DPT Dermatopontin
ECM Extracellular matrix


EDTA Ethylenediaminetetraacetic acid
ELISA Enzyme-linked immunosorbent assays
EMT Epithelial-mesenchymal transition
ES Embryonic stem
eset expression set
ESR1 Estrogen receptor 1
FBN2 Fibrillin 2
FBS Foetal bovine serum
FDR false discovery rate
FFT Fast Fourier transform
FIJI Fiji is just ImageJ
FOXF2 Forkhead box F2
FSP Fibroblast specific protein 1
FWER family wise error rate
GAG Glycosaminoglycan
GalT-II Galactosyltransferase II
GAPDH Glyceraldehyde 3-phosphate dehydrogenase
GC-MS Gas chromatography-mass spectrometry
GO Gene ontology
GRIA1 Glutamate receptor ionotropic α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid 1
GRIK2 Glutamate receptor ionotropic kainate 2
GRWD1 Glutamate-rich WD repeat containing 1
GSEA Gene set enrichment analysis
GSTA3 Glutathione S-transferase alpha 3
H3K4 H3 lysine 4
H3K9me1 H3 methylation at lysine 9
HDAC Histone deacetylase
HPLC High-performance liquid chromatography
HSD17B2 Hydroxysteroid 17-beta dehydrogenase 2
Hsp47 Heat shock protein 47
IFNβ Interferon-beta
IMA Illumina methylation analyser
INHBA Inhibin beta A
iPS Induced pluripotent stem cells

KRTAP22-2 Keratin associated protein 22-2
KRTAP6-3 Keratin associated protein 6-3
LAMA4 Laminin Alpha 4
LASIK Laser-assisted in situ keratomileusis
LCS Leica confocal software
Limma Linear models for microarray data
NMT2 N-Myristoyltransferase 2
MBD Methyl-CpG binding domain proteins
MDS Myelodysplastic syndrome
MIR424 MicroRNA 424
MMP Matrix metalloproteinase
mRNA messenger RNA
MSTN Myostatin
Msx1 Msh homeobox 1
NAPG N-ethylmaleimide-sensitive factor attachment protein gamma
p passage
PANX3 Pannexin 3
PBS Phosphate buffered saline
PcGs Polycomb group proteins
PDGFD Platelet derived growth factor D
PDGFRL Platelet derived growth factor receptor like
pen/strep Penicillin/streptomycin
qRT-PCR Quantitative real time polymerase chain reaction
RIMS1 Regulating synaptic membrane exocytosis 1
RIN RNA integrity number
RNA Ribonucleic acid
ROIs Regions of interest
RPA3 Replication protein A3
RPE Retinal pigment epithelial
RPH Royal Perth Hospital
SAM Significance analysis of microarrays
SEMA3A Sema domain, immunoglobulin domain (Ig), short basic domain, secreted, semaphorin 3A
siRNA Small interfering RNA
SKI SKI proto-oncogene

SLC6A6 Solute carrier family 6 member 6
SNIP1 SMAD nuclear interacting protein 1
SSc Systemic sclerosis
STAT3 Signal transducer and activator of transcription 3
TGF-β Transforming growth factor beta
TGIF TGFβ induced factor homeobox 1
TIMPs Tissue inhibitors of metalloproteinases
TNF Tumour necrosis factor
TNXB Tenascin XB
TSA Trichostatin A
TSS1500 within 1500 base pairs of the transcription start site
TSS200 within 200 base pairs of the transcription start site
UTR3 3’ untranslated region
UTR5 5’ untranslated region
UV ultraviolet
Wnt Wingless-related integration site



Chapter 1 - Introduction

1.1 Skin and the problem of scarring

“There is no magician’s mantle to compare with the skin in its diverse roles of waterproof overcoat, sunshade, suit of armour and refrigerator, sensitive to the touch of a feather, to temperature and to pain, withstanding the wear and tear of three score years and ten, and executing its own running repairs” (Lockhart, 1965).

The skin is essential for life and loss of the skin due to injury or illness can have disastrous and permanent effects, from major disability to death. Skin primarily acts as a protective barrier to the environment but also performs secondary functions such as vitamin D synthesis, thermoregulation and sensation (Saladin, 2001). Skin has an average surface area of 1.8 m² and accounts for around 12-15% of the total body weight of the average adult male (Bender). After injury to the skin in humans, the repair process that follows leads to scar formation, and the resulting scar will remain for life. Scars can have a serious psychological and physical impact on a person’s life, changing their appearance and causing disability. The mechanisms underlying the persistence of scars are poorly understood and there are currently few effective treatment options for established scars.

Scarring remains a significant clinical problem even with state-of-the-art care (Sheridan and Tompkins, 2004). Every year, there are over 48 million operations in the USA alone that result in scars (Buie et al., 2010). In addition to these surgical scars there are approximately 11 million burn injuries globally, leading to 300 000 deaths per year and 19 million disability adjusted life-years (DALYs) (Peck, 2011) largely due to scarring. In low and middle-income countries, burn injury is one of the leading causes of both death and DALYs, and globally ranks 34th of all causes of DALYs (Murray et al., 2012). The ability to transform scar appearance and texture back to normal skin would have an enormous positive impact on the quality of life for the vast numbers of people with scars from injury or surgery.

Genomics is the study of deoxyribonucleic acid (DNA), the genetic material that encodes the instructions for development and functioning of all living organisms. Genomics studies the sequence of DNA and the way it is modified and expressed (NIH, 2014). Many diseases are caused by alterations to the sequence or the structure of DNA. Very little genomic work has focused on scar cells and any changes to the scar cells at a genomic level that may underlie the formation and persistence of scar. The aim of this research is to use an integrative genomic approach to better understand the molecular changes underlying the persistence of scars in skin, with the goal of identifying ways to alter scar back to a more normal skin phenotype.

1.1.1 Normal skin structure and function

Skin consists of two main layers: the outer epithelial tissue (epidermis), which forms the barrier between the organism and the environment, and the lower connective tissue layer (dermis), which primarily provides support.

Figure 1.1: Schematic of normal skin architecture. The image shows the main layers of the skin, cell types present and adnexal structures (Kendall and Nicolaou, 2013).

1.1.2 The epidermis

The epidermis is made up of five distinct layers: the stratum corneum, the stratum lucidum, the stratum granulosum, the stratum spinosum and the stratum basale. The main type of cell in the epidermis is the keratinocyte, with melanocytes, Merkel cells and Langerhans cells also present (McGrath and Uitto, 2010). Keratinocytes undergo mitosis to replace the dead cells that desquamate from the top layer, gradually ceasing to divide and producing more and more keratin fibres as they get closer to the outer layer of the epidermis. Keratinocytes also produce glycolipids that are exocytosed to the outer layer of the epidermis and give skin its waterproof properties (Saladin, 2001). Keratinocytes are not homogeneous and have a unique phenotype and keratin profile dependent on body site location (Knapp et al., 1986). They also retain the phenotype of their original body site when transplanted to different body sites, suggesting a possible epigenetic mechanism for maintenance of keratinocyte phenotype (Compton et al., 1998).

Melanocytes produce and secrete the pigment melanin that protects the DNA from harmful ultraviolet radiation, and Merkel cells are part of the sensory apparatus and are sensitive to touch. Immune cells, including Langerhans cells (dendritic cells) and resident macrophages, are part of the extensive defence mechanisms in the skin (Saladin, 2001).

1.1.3 The dermis

The dermis is the thicker of the two layers, and most of the differences in skin thickness between anatomical locations, such as the eyelids (0.6 mm) and the soles of the feet (3 mm), are due to differing dermal thickness. The dermis is mainly made of thick bundles of collagen and other components of the extracellular matrix (ECM), but also contains blood vessels, sweat glands, sebaceous (sebum, oil secreting) glands, nerve endings, hair follicles and muscular tissue. The two main layers of the dermis are the papillary and reticular layers. The papillary dermis makes up the top one-fifth of the dermis and forms wavy projections into the epidermis called dermal papillae that function as the dermal/epidermal junction. Dermal papillae serve as anchors to keep the two layers together, preventing slippage between them, and also allow nerve fibres and capillaries to reach close to the surface of the skin to provide sensory input and blood supply. The reticular layer is the lower four-fifths of the dermis and is made up of thick collagen bundles with fewer cells present. The major cell type in the dermis is the fibroblast, which produces not only collagen but all the major components of the extracellular matrix, including fibronectins, integrins, glycosaminoglycans and glycoproteins (Fig. 1.2). Cells that form the adnexal structures of the skin, such as the hair follicles, sweat glands and sebaceous glands, are also present.

Figure 1.2: Normal fibroblasts produce ECM proteins. Fibroblast specific proteins include fibroblast specific protein 1 (FSP) and vimentin. Fibroblasts produce many of the ECM proteins in the dermis including collagen I and fibronectin (Adapted from Kalluri and Zeisberg, 2006).

In scar tissue it is the substantial changes in the extracellular matrix of this dermal layer that underpin the aesthetic differences and functional deficits (see Scar phenotype, Section 1.4). Striated, densely-packed parallel collagen fibres are prevalent in scar, as opposed to the normal basket-weave lattice of collagen (Verhaegen et al., 2009). This aberrant matrix is produced by fibroblasts.

1.1.3.1 Fibroblasts

Fibroblasts are cells of mesenchymal origin that are present in not just the skin but in all connective tissues in the body. Fibroblasts are involved in the synthesis of the ECM during development and wound healing, as well as for the homeostasis of normal ECM structure and connective tissue function (Chang et al., 2002). In the context of scarring, fibroblasts are the key cell in the production and maintenance of the abnormal scar matrix. Fibroblasts interact with epithelial cells and these interactions are critical for the development of the skin and many other organs. They also have a crucial role in determining the epidermal appendages that develop and a key role in establishing positional identity of the epithelium (Chodankar et al., 2003). Although one of the most numerous cell types in the body, fibroblast phenotype is relatively poorly characterised, mainly due to a lack of homogeneity between tissues. Fibroblasts can be derived not just from other resident fibroblasts but from many different cell types, including adipocytes and bone marrow derived cells. Fibroblasts produce and maintain the ECM in all the tissues of the body, not just the skin (Werner et al., 2007). Epithelial keratinocytes can also dedifferentiate into fibroblasts in a process called epithelial-mesenchymal transition (EMT) (Yan et al., 2010). A particular subset of EMT, type 2, is associated with wound healing, inflammation, tissue regeneration and fibrosis (Kalluri and Weinberg, 2009).

1.1.4 Wound Healing

To deal with a loss of integrity of the skin, the body undergoes a dynamic, interactive wound healing process that has three phases – inflammation, proliferation and remodelling (Fig. 1.3).

Figure 1.3: Temporal diagram of the three phases of wound healing. Wound healing occurs in 3 overlapping phases, with acute inflammation followed by proliferative and remodelling phases (Li et al., 2007).

The inflammation phase is initiated immediately after an injury occurs with the onset of haemostasis. This is followed by the release of cytokines and chemokines to promote phagocyte migration to the wound site. The proliferation phase occurs 2-10 days after injury, during which time new blood vessels are formed (angiogenesis), collagen and other extracellular matrix proteins are deposited and granulation tissue is formed as a framework to allow keratinocytes and other cells to migrate across the wound (re-epithelialisation). Wound contraction then occurs, whereby myofibroblasts attach to the wound edges and contract using alpha smooth muscle actin (αSMA), a mechanism similar to that of smooth muscle cells (Darby et al., 1990). The remodelling phase is the final phase and occurs 1-12 months after injury, when disorganised collagen and the ECM are remodelled to improve tensile strength and tissue integrity (Fig. 1.3).

1.1.5 Scar phenotype at a tissue level

A scar is the result of the replacement of damaged skin tissue with fibrotic tissue, composed mainly of collagen produced by fibroblasts, with increased dermal thickness and reduced pliability (Saladin, 2001). Scars are the end point of the normal mammalian tissue repair response. It is hypothesised that wound healing was evolutionarily optimised for quick healing under dirty conditions, meaning that scars are the price we pay for a reduced risk of infection and increased chance of survival after wounding (Bayat et al., 2003). Scar tissue is dysfunctional – it has decreased sensation, altered pigmentation and increased vascularity, has no sweat glands, and is less resistant to ultraviolet radiation (O'Sullivan et al., 1996). Scars can cause severe discomfort through itchiness, tenderness and pain and are often aesthetically unpleasant, causing emotional stress, loss of self-esteem and depression and diminishing quality of life (Bayat et al., 2003). Physical deformity as a result of scar contractures can also be disabling, limiting movement of joints such as knees, elbows, shoulders, jaw and neck.

Figure 1.4: The four main subtypes of scar. A) Normotrophic. B) Atrophic. C) Hypertrophic. D) Keloid (Fabbrocini et al., 2010; Brown and Bayat, 2009).

Scars are classified into four main categories – normal, atrophic, hypertrophic, and keloid (Brown and Bayat, 2009). Normal or normotrophic scars are usually categorised as flat, pale, soft, symptomless scars with no elevation or nodularity (Fig 1.4A).

Atrophic scars are flat and depressed below the skin, usually small and often round with an inverted centre (Fig 1.4B). They commonly arise after acne or chickenpox. Hypertrophic scars are raised scars that remain within the boundaries of the original lesion (Fig 1.4C). They are often red, inflamed, itchy, and even painful. Keloid scars (Fig 1.4D) share many characteristics of hypertrophic scars, the key difference being that they spread beyond the margins of the original wound and invade the normal skin surrounding the initial lesion (O'Sullivan et al., 1996).

1.1.6 Scar phenotype – cell level

Many of the differences in scar tissue compared to normal skin are related to the excess and aberrant structure of the dermal ECM. Not only is there more abundant collagen in scars, but collagen bundles within the dermis are aligned in a more parallel orientation compared to the ‘basket weave’ pattern observed in normal skin (Berthod et al., 2001). Collagen synthesis is increased in normal scar compared to normal skin (Muir, 1990), is estimated to be doubled in hypertrophic scar compared to normal scar (Linares and Larson, 1974), and ranges from slightly increased to very highly increased in keloid scar (Abergel et al., 1985). The distance between collagen bundles within scar is also different, with normal skin featuring spread out bundles and normal and hypertrophic scar having more tightly packed bundles. Interestingly, keloid scars display collagen bundles that are even more loosely packed than normal skin (Verhaegen et al., 2009), which may reflect the constant growth associated with keloid pathophysiology. In addition, there is a difference in the type of collagen present in scar tissue, with an increased amount of collagen III and a decreased collagen I/III ratio (Klinge et al., 2000) compared to normal skin. This likely reflects changes initiated during wound healing. The abnormal collagen ratio, collagen orientation, bundle size and distance between bundles leads to a decrease in tensile strength and abnormal pliability of the scar compared to normal skin (Klinge et al., 2000).

Fibroblast phenotype is also altered in scar, with dermal scar fibroblasts having altered growth kinetics and collagen synthesis (Diegelmann et al., 1979), apoptotic properties (Linge et al., 2005), cytokine responsiveness (Garner et al., 1993), increased angiotensin converting enzyme (ACE) activity (Morihara et al., 2006), decreased collagenase enzyme activity (Arakawa et al., 1996), increased connective tissue growth factor (CTGF) expression (Igarashi et al., 1996) and increased responsiveness to transforming growth factor beta (TGF-β). Some studies have shown that epidermal scar cells, long thought to be passive in scar formation, have an altered phenotype and an active role in scarring, with differential expression of keratin filaments and secretion of growth factors influencing fibroblasts and the inflammatory response (Machesney et al., 1998). These differences in scar tissue properties, both at a whole tissue level and at the cellular level, give rise to a distinct scar phenotype.

1.1.7 Factors influencing scar outcome

Wound healing is a complex process and many factors can delay healing, leading to a poor scar outcome (Deitch, 1984). Wound size (Cass et al., 1997), necrosis (Liesegang, 1997), hypoxia (Aarabi et al., 2007), oedema, infection (Hardy, 1989), age (Lavker, 1979), stress (Kiecolt-Glaser et al.) and genetics (Bayat et al., 2003) have all been found to influence wound healing. Burn injuries have many risk factors for poor scar outcome such as large size of wound, oedema and increased risk of infection (Klein, 2007). The complete picture of the molecular mechanisms that underpin wound healing and that can alter scar outcome is not fully understood (Ferguson and O'Kane, 2004). The mechanism through which damage sustained during wound healing results in the continued expression of scar phenotype after healing is also yet to be ascertained.

Delayed wound healing has a major effect on scar formation. In one study, burn wounds that healed in less than 10 days had only a 4 percent risk of developing scar hypertrophy, whereas wounds that took 21 days or more to heal had an 80 percent risk of hypertrophy (Deitch, 1984). Inflammation may also be associated with increased scarring by delaying healing, as inflammatory cells induce apoptosis in the migrating epithelium (Brown et al., 1997). Injuries to developing embryos do not result in scar formation, thought to be due to the absent/diminished inflammatory response, and induction of a pro-inflammatory response results in foetal scar formation in mice (Liechty et al., 2000). Therefore significant focus on therapeutic efforts to diminish scarring has been on reducing time to heal (usually re-epithelialisation) and also on the role of the inflammatory response, although this appears to be potentially both beneficial and detrimental to scar outcome and is therefore more difficult to modulate. However, to date little research has focused on the final scar and the potential to modulate scar cells to improve patient outcome and reduce the appearance of scar tissue.

1.1.8 Collagen turnover

Collagen is a triple helical protein consisting of three 1000 amino acid chains, and is the most abundant protein in the dermis, making up 70-80% of the dermis’ dry weight (Waller and Maibach, 2006) and 20-30% of total body protein (Eastoe, 1955). There are 29 different types of collagen identified, differentiated from each other by different amino acid chains and structural features, of which six (I, III, IV, V, VI and VII) are known to be present in the skin. In normal adult skin, 70% of the collagen is type I, 10% is type III and the rest is present in trace amounts (Uitto et al., 1989).

Figure 1.5 shows the steps involved in collagen synthesis, which are outlined below with reference to this figure. First, collagen production is stimulated by ECM inducers such as TGF-β to induce transcription of DNA to messenger ribonucleic acid (mRNA) (Step 1). Then, collagen mRNA is imported into the endoplasmic reticulum and translated (Step 2). Next, triple helix formation is initiated via the aggregation of monomers into trimers facilitated by the heat shock protein 47 (Hsp47) chaperone (Step 3). Hsp47 prevents tangling and reduces the speed of aggregation to allow for hydroxylase processing of residues, resulting in stabilised intramolecular cross links. The trimer is then translocated to the Golgi apparatus for aggregation into fibril stacks (Step 4). Next, the fibril stacks are deposited into the ECM and aggregation of collagen tripeptides at the cell surface occurs (Step 5). Cleavage of N and C terminal propeptides also occurs in the extracellular space (Step 6). Finally, collagen fibrils are stabilised by intermolecular crosslinking, predominantly through lysyl oxidase, resulting in mature insoluble collagen (Step 7) (Adapted from Chen and Raghunath, 2009; Eaton, 2012).

Figure 1.5: The steps involved in collagen synthesis. 1) Transcription of mRNA by transcription factors; 2) Translation of mRNA into procollagen; 3) Triple helix formation; 4) Collagen trimer translocation to Golgi apparatus; 5) Deposition of collagen into ECM; 6) Cleavage of terminal peptides; 7) Crosslinking of collagen peptides (Adapted from Chen and Raghunath, 2009; Eaton, 2012).

Collagen degradation is part of the ECM remodelling process, and Figure 1.6 shows the steps involved. Degradation of collagen fibres involves cleavage of fibrils by collagenolytic enzymes and uptake of collagen fragments by macrophages and fibroblasts or further cleavage by matrix metalloproteinases (MMPs) (McKleroy et al., 2013).

Figure 1.6: The steps involved in collagen degradation. Collagen in the ECM is constantly degraded as well as synthesised. Degradation involves MMPs, macrophages and myofibroblasts (McKleroy et al., 2013).

Each type of collagen is synthesised, deposited and degraded by endogenous enzymes including MMPs in a dynamic process balancing synthesis and degradation in normal skin to maintain the normal ECM. After injury, this balance is altered to increase synthesis and repair the matrix, whilst in scar tissue the balance appears to be altered such that excess collagen is not degraded but rather the scar matrix remains for life.

The rate of collagen turnover has been difficult to elucidate. This is in part due to the variable methods used to measure collagen synthesis and degradation, as well as the difficulty in obtaining accurate in vivo estimates of protein turnover. The original procedure was to introduce radiolabelled [³H]proline (Madden and Peacock, 1971), which is incorporated into the collagen as hydroxyproline, and measure changes in specific radioactivity over time. Although the method works reasonably well over a short period, it is highly inaccurate over longer time periods due to recycling of the proline label (Sodek, 1976) and it is also no longer considered safe for use in humans. Non-radioactive isotopes such as deuterium (heavy water, ²H₂O) and ¹⁸O, measurements of atmospheric ¹⁴C from nuclear tests in the 1950s, and gas chromatography-mass spectrometry (GC-MS) have all been used to assess collagen turnover (Tredget et al., 2000; Babraj et al., 2005; Rucklidge et al., 1992; Heinemeier et al., 2013). Another method measures the racemisation of aspartic acid. Amino acids are all incorporated into proteins in the L-enantiomer form. During aging, racemisation slowly converts the L-enantiomers into the D-enantiomer form. Measuring the amount of racemisation therefore gives a relative age of the tissue – the more racemisation the older the tissue (Verzijl et al., 2000).

Quantification using the popular and commercially available Picrosirius red dye assay has been used to determine collagen deposition, and works by selective binding of the Sirius Red F3BA to collagen (Xu et al., 2007). However, doubts have been raised over the dye overestimating the amount of collagen and an essential modification must be made to the standard assay in order to accurately quantify collagen content (Lareu et al., 2010). A method that takes a different approach measures small propeptides cleaved during collagen synthesis (Oikarinen et al., 1992) and degradation (Sassi et al., 2001). This method is collagen type specific, as different collagens have different cleavage peptides that can be measured using enzyme linked-immunosorbent assays (ELISAs) as well as high-performance liquid chromatography (HPLC). Measuring cleaved propeptides and telopeptides of each collagen as well as MMP levels and tissue inhibitors of metalloproteinases (TIMPs) provides a snapshot of the balance of collagen synthesis and degradation as well as enzyme activity.

Lack of consensus concerning basal collagen turnover has been further compounded by inconsistency in disease or injury status, skin sampling location, use of human or animal models and in vivo or in vitro experimentation. Estimates of basal collagen turnover in normal skin are extremely variable, ranging from a half-life of 14.8 years in adult humans (using racemisation of aspartic acid) (Verzijl et al., 2000) and a fractional synthesis rate of 2% per day in humans (El-harake et al., 1998) to a half-life of 73.6 days in juvenile rats (Rucklidge et al., 1992) and a 3-5% turnover per day in adult rats (Laurent, 1987). Hence, despite these efforts the rate of collagen turnover remains unknown. The higher the turnover rate of collagen in scar tissue, the greater the potential to modulate scar collagen deposition over time and improve scar appearance.
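The estimates above are reported in different units (half-lives and fractional rates per day), which makes direct comparison difficult. As a rough illustration only, the sketch below converts between the two, assuming simple first-order (exponential) turnover of a single collagen pool at steady state. That assumption, the function names and the year length are illustrative and are not taken from the cited studies.

```python
import math

DAYS_PER_YEAR = 365.25  # assumed average year length


def half_life_to_daily_fraction(half_life_days: float) -> float:
    """Fraction of the pool replaced per day, assuming first-order turnover."""
    rate_constant = math.log(2) / half_life_days  # k = ln(2) / t_half
    return 1.0 - math.exp(-rate_constant)


def daily_fraction_to_half_life(fraction_per_day: float) -> float:
    """Half-life in days implied by a fractional turnover per day."""
    rate_constant = -math.log(1.0 - fraction_per_day)
    return math.log(2) / rate_constant


# Values quoted in the text, expressed on a common scale.
human_racemisation = half_life_to_daily_fraction(14.8 * DAYS_PER_YEAR)
juvenile_rat = half_life_to_daily_fraction(73.6)
human_fsr_half_life = daily_fraction_to_half_life(0.02)

print(f"14.8-year half-life: {human_racemisation:.4%} per day")
print(f"73.6-day half-life:  {juvenile_rat:.4%} per day")
print(f"2% per day:          half-life of {human_fsr_half_life:.1f} days")
```

Under these assumptions the quoted human estimates differ by more than a hundredfold when expressed per day, which illustrates the lack of consensus described above. A fractional synthesis rate equals a turnover rate only at steady state, so the conversion should be read as indicative rather than exact.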

1.1.9 Maintenance of scars

Scars are permanent modifications to the normal architecture of the skin that persist throughout the lifetime of the individual. These scar tissues are not static, and a scar formed when an individual is young will grow with that individual into adulthood (Fig. 1.7). An example of this can be observed after the surgical procedure, paediatric pyloromyotomy, in which the pylorus (between the stomach and duodenum) is incised to alleviate pyloric stenosis in infants. This surgery causes a scar on the abdomen that significantly increases in length and breadth as the infant grows to adult size, and often requires scar revision as an adult (Fig. 1.7A). The increase in size of the scar is not solely due to stretching of the tissue. The scar phenotype is retained either by cells within the original scar and their progeny or by cells migrating into the scar tissue. The cellular mechanism of scar maintenance is not known, and may be caused by a host of mechanisms - a ‘scar’ ECM affecting cells migrating in, wound cells having ‘damaged’ genetics or epigenetics in the wound replicating and forming a scar phenotype, or another unknown cause.

Figure 1.7: Examples of scar growth over time. A) Growth of infant pyloromyotomy scar in a young adult (Harmon, 2011). B) A patient with a 55% total body surface area burn at age 8 is pictured here at age 12 and C) at age 21.

1.2 Genetic control of scarring

The phenotype of a cell is largely determined by gene expression and thus the maintenance of scar matrix can be understood by studying the control of gene expression in scar fibroblasts. The traditional way of measuring expression levels of genes was to use northern blots or quantitative real time polymerase chain reaction (qRT-PCR) experiments, but this requires a priori knowledge of the gene(s) to be investigated and limits experimental design to a small number of genes. Gene expression microarrays provide a method of analysing the expression levels of thousands of genes at a time by using short fluorescently labelled probes that are specific to complementary DNA (cDNA) generated from the mRNA of the target cell or tissues (Schena et al., 1995). Complementary binding of the DNA to the probes causes them to fluoresce, and measuring the level of intensity of the fluorescence allows a measure of gene expression. Gene expression levels can then be compared to normal tissue, and differences in expression levels used to identify novel targets or better understand pathologies or phenotypes of interest.

1.2.1 Gene expression studies of scarring

Gene expression studies have been carried out on keloid and hypertrophic scars for the purpose of increasing knowledge of keloid and hypertrophic scar pathology as well as to identify possible target genes to modulate and reduce severity, prevent or even reverse formation of these scars.

Keloid scarring has a genetic predisposition and is characterized by aggressive growth of scar tissue beyond the wound boundary. Microarray studies on keloid scars have reported a large number of differentially expressed genes. Seifert et al. found 578 differentially expressed genes (at least 2-fold change) between the deepest part of the keloid and a control skin sample (Seifert et al., 2008). This study used cultured cells from skin punch biopsies from 3 excised keloid scars, cultured the cells to passage 2-4 before analysing, and compared to normal skin from an age, sex, ethnicity and anatomically matched control patient. Smith et al. found 511 differentially expressed genes (at least 2-fold change) between keloid scar cells cultured from 5 subjects (4 female, 1 male) and normal scar cells cultured from 5 females. These cells were controlled for ethnicity, but came from a variety of anatomical locations and were very old, with some of the original cells isolated in 1976 (Smith et al., 2007).

There are fewer microarray based gene expression studies comparing hypertrophic scars, normotrophic scars and normal skin. Examples of these include a study by Dasu et al., who found 26 genes differentially expressed, with 12 increased in expression and 14 decreased compared to controls, between fibroblasts cultured from 5 hypertrophic burn scars in a paediatric population and matched normal skin from the same patients, using a significance cut off of 0.05 and a fold change of 1.2 (Dasu et al., 2004). While matched control skin was obtained from the patients, a paired statistical analysis was not performed, and the samples were not controlled for ethnicity, sex or anatomical site, which limits the conclusions that can be drawn from the results. The chip used was the Affymetrix HG-U95 Av2 gene chip which only covers 12,000 genes in total (Dasu et al., 2004).

Paddock et al. conducted a small study with frozen tissue from two paediatric and two adult hypertrophic burn scars and normal skin samples from each patient (Paddock et al., 2003). The analysis used a pairwise comparison between the scar and normal skin from each patient and a biologically relevant change in expression was determined by a significance analysis of microarrays (SAM) analysis and a 2-fold change. This analysis resulted in 31 upregulated and 4 downregulated genes between the hypertrophic scar and normal skin. The data were subjected to a paired analysis and the age, sex and race of the paediatric patients were matched; however, the adult patients had an age difference of 22 years and were different sexes and ethnicities. The anatomical locations of the scars were not matched in any of the patients, the site that the normal skin was sampled from was not recorded and the microarrays used were the Affymetrix U95Av2 GeneChips which only cover 13,000 genes (Paddock et al., 2003).
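A paired fold-change filter of the kind described above can be sketched as follows. This is a minimal illustration and not the SAM procedure used by Paddock et al.; the gene labels and intensity values are hypothetical placeholders, and the 2-fold threshold is the only parameter taken from the text.

```python
import math
from statistics import mean

FOLD_CHANGE_CUTOFF = 2.0  # 2-fold threshold quoted in the text

# Hypothetical intensities: gene -> list of (scar, normal skin) pairs,
# one pair per patient. Placeholder values, not real data.
paired_intensities = {
    "GENE_A": [(820.0, 200.0), (640.0, 180.0), (900.0, 210.0), (700.0, 190.0)],
    "GENE_B": [(150.0, 160.0), (140.0, 155.0), (170.0, 150.0), (165.0, 170.0)],
    "GENE_C": [(60.0, 250.0), (75.0, 300.0), (50.0, 220.0), (80.0, 310.0)],
}


def mean_paired_log2_ratio(pairs):
    """Average the within-patient log2(scar / normal) ratios."""
    return mean(math.log2(scar / normal) for scar, normal in pairs)


def classify(pairs, cutoff=FOLD_CHANGE_CUTOFF):
    """Label a gene as up, down or unchanged relative to the cutoff."""
    log_ratio = mean_paired_log2_ratio(pairs)
    threshold = math.log2(cutoff)
    if log_ratio >= threshold:
        return "up"
    if log_ratio <= -threshold:
        return "down"
    return "unchanged"


calls = {gene: classify(pairs) for gene, pairs in paired_intensities.items()}
print(calls)
```

Averaging within-patient ratios, rather than comparing group means, is what makes the comparison paired. A real analysis would also require a significance test and correction for multiple testing.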

One more example of a hypertrophic scar gene expression study is a study by Tsou et al., which compared gene expression in fresh biopsies of uninjured skin, normotrophic scars and hypertrophic scars (Tsou et al., 2000). This study found 192 genes differentially expressed in normal scars compared to normal skin, 178 between hypertrophic scars and uninjured skin and 168 between normal scars and hypertrophic scars (Tsou et al., 2000). This study also had significant limitations, however, as tissue samples were obtained from only two normal scars, three areas of normal skin and three hypertrophic scars. The samples were not matched according to the individual, age, sex, ethnicity or anatomical site, and the study used an older chip design that only covered 4,000 transcripts.

Interestingly, there were no genes in common between the published lists of differentially expressed genes from the three studies on hypertrophic scar. Not all studies could be comprehensively analysed using a meta-analysis, as these earlier studies did not all upload entire datasets to publicly available databases and authors only published selected changes without the provision of comprehensive supplementary data. However, the lack of overlap still strongly suggests these studies are inadequately powered to understand changes at the transcriptional level in scar fibroblasts. Although there were no genes in common, there were a limited number of close matches. For example, one of the two proteins that make up the collagen I molecule (α-chain I and II) was identified in two of the studies.
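The overlap check described above amounts to intersecting the published gene lists. A minimal sketch follows; the study keys and gene identifiers are hypothetical placeholders rather than the genes reported by the three studies.

```python
from itertools import combinations

# Hypothetical published gene lists, one set per study (placeholders only).
published_lists = {
    "study_1": {"GENE_A", "GENE_B", "GENE_C"},
    "study_2": {"GENE_D", "GENE_E"},
    "study_3": {"GENE_F", "GENE_G", "GENE_H"},
}

# Genes reported by every study.
common_to_all = set.intersection(*published_lists.values())

# Genes shared by each pair of studies.
pairwise_overlap = {
    (first, second): published_lists[first] & published_lists[second]
    for first, second in combinations(published_lists, 2)
}

print("Common to all studies:", sorted(common_to_all))
for pair, shared in pairwise_overlap.items():
    print(pair, sorted(shared))
```

Exact matching of identifiers will miss near matches such as related collagen chains, so in practice gene symbols would need to be harmonised across array platforms before comparison.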

The use of tissue biopsies has both advantages and disadvantages. Mixed tissue biopsies allow a snapshot of the ‘real’ gene expression of the tissue in vivo, but even with stringent controls on the tissue used there is a significant problem of tissue heterogeneity. This is caused by the numerous different cell types in the skin and/or dermis, including myofibroblasts, Langerhans cells, Merkel cells, melanocytes, keratinocytes, endothelial cells and all the cell types in adnexal structures. The use of cultured cells ensures a more homogeneous sample but ‘noise’ is introduced to the transcriptome by exposure to culture conditions, and cultured cells have been shown to be less similar to tissues the longer they are left in culture (Boess et al., 2003).

These studies provide some information on changes in the transcriptome in scar fibroblasts but their significant limitations reduce confidence in these datasets. These include small sample size, inadequate controls and variation due to age, sex, and ethnicity. The use of older arrays also restricts the extent of coverage of the transcriptome. New studies to supplement these findings and overcome many of the study limitations are required to better understand the genes that influence scar phenotype.

1.2.2 Epigenetics

Epigenetics is defined as information heritable during cell division other than the DNA sequence itself (Feinberg, 2007), and is a key mechanism of control of gene expression and phenotype. Developmental processes are largely regulated by epigenetics because different cell types maintain their fate during cell division even though their DNA sequences are the same (Feinberg, 2007). Two of the main types of epigenetic processes are illustrated in Figure 1.8. At the top of Figure 1.8 is an example of DNA modification, DNA methylation, where methyl groups added to DNA most commonly inhibit transcription by blocking transcription factor binding at transcription start sites. This can be caused directly by structural changes or through subsequent chromatin remodelling. Histone modification is a second epigenetic modification (Fig. 2.1), where chemical groups such as acetyl groups bind to the tails of histone proteins, causing them to unwind the DNA wrapped around them, modulating gene transcription. Histone modifications can vary and have different effects on gene transcription dependent on the type of modification.

In this thesis, epigenetic changes are postulated as a potential mechanism for the maintenance of the scar phenotype. This may occur by scar fibroblasts having their epigenetic imprint altered during the wound healing process, or by migratory cells moving into the wound and differentiating into fibroblasts with a distinct epigenetic pattern different to that of the resident fibroblasts. These scar fibroblasts will then pass their altered epigenome to subsequent daughter cells when they divide, maintaining scar phenotype and enabling the scar to grow over time with the skin. Epigenetic changes would also account for the stability of scar once it is mature, resulting in the abnormal matrix being maintained for life.

Figure 1.8: An illustration of the two main types of epigenetic regulation. DNA methylation of cytosine bases (top) and histone modifications affecting the packaging and therefore activity of the DNA (bottom) (Qiu, 2006).

1.2.3 DNA methylation

The traditional understanding of epigenetic regulation through DNA methylation is the attachment of chemical marks in the form of methyl groups to the DNA by specialised enzymes called methyltransferases. The methyl group binds to the 5 position of the base cytosine in CpG sites, a linear sequence of cytosine followed by a guanine with the phosphate backbone in between (“Cytosine-phosphate-Guanine”). These CpG sites are clustered in regions called CpG islands, usually defined as an area more than 500bp long with a GC content greater than 55% and an observed CpG/expected CpG ratio of over 0.65 (Takai and Jones, 2002). These CpG islands are highly conserved in the genome and are more common than expected in the promoter regions of the genome (4-6%), while in the rest of the genome, outside the CpG islands, the CG dinucleotide is suppressed – it is the least common dinucleotide, making up less than 1% of the genome (IHGSC, 2001). This pattern of high frequency in promoters and low frequency outside of them suggests their importance to regulation of gene expression.
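The CpG island criteria quoted above can be checked for a single candidate sequence as sketched below. The thresholds come from the text and are applied as strict inequalities, following its wording. The observed/expected ratio is computed as (number of CpG × length) / (number of C × number of G), which is the commonly used definition but is an assumption here, since the text does not spell the formula out. The function names are illustrative, and the sketch tests one whole sequence rather than scanning a genome with a sliding window.

```python
def cpg_island_metrics(sequence: str) -> dict:
    """Return length, GC content and observed/expected CpG ratio."""
    seq = sequence.upper()
    length = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # CpG dinucleotides read 5' to 3'
    gc_content = (c_count + g_count) / length if length else 0.0
    expected_cpg = (c_count * g_count) / length if length else 0.0
    obs_exp = cpg_count / expected_cpg if expected_cpg else 0.0
    return {"length": length, "gc_content": gc_content, "obs_exp": obs_exp}


def meets_island_criteria(sequence: str,
                          min_length: int = 500,
                          min_gc: float = 0.55,
                          min_obs_exp: float = 0.65) -> bool:
    """Apply the three thresholds quoted in the text to one sequence."""
    metrics = cpg_island_metrics(sequence)
    return (metrics["length"] > min_length
            and metrics["gc_content"] > min_gc
            and metrics["obs_exp"] > min_obs_exp)


# Synthetic test sequences (not real genomic DNA).
cpg_rich = "CG" * 300      # 600 bp, every dinucleotide step is CpG or GpC
cpg_poor = "ATGCAT" * 100  # 600 bp, contains GpC but no CpG

print(meets_island_criteria(cpg_rich))
print(meets_island_criteria(cpg_poor))
```

The second synthetic sequence shows why the dinucleotide order matters: it contains G followed by C, but never C followed by G, so its observed CpG count is zero.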

The traditional understanding of methylation, to which there are now well known exceptions, is that the methyl groups bound to the DNA cause suppression of gene expression, either through the methyl group itself physically impeding transcriptional protein binding (Choy et al., 2010) or through the recruitment of methyl-CpG binding domain proteins (MBD), which themselves can inhibit transcriptional apparatus, as well as recruiting chromatin remodelling proteins that can modify histones (Valinluck et al., 2004). More recently a broader understanding of methylation as a tool used for the regulation of DNA expression has been developing. Not only can CpG sites be methylated, but methylated Cytosine-phosphate-Adenine (CpA) and Cytosine-phosphate-Thymine (CpT) sites are also common in embryonic stem cells (Ramsahoye et al., 2000). However, the effect of methylation on repressing gene expression is thought to be the same in these ‘non-traditional’ sites as in CpG sites (Lister et al., 2009). Methylation is usually a stable and long-term change in cell programming (it was originally believed to be irreversible, although this is not the case), and any alterations are only expressed after the cell undergoes cell division (Reik, 2007). This makes it suitable for developmental patterning (tissue differentiation) but also an ideal candidate as the mechanism for the maintenance of scar phenotype.

1.2.4 DNA methylation in skin

Aberrant methylation has been associated with pathology, including in the skin. Various cancers, including basal and squamous cell carcinomas, melanoma and skin lymphomas, and auto-immune diseases, including psoriasis, atopic dermatitis, vitiligo and lupus erythematosus, are all known to be associated with aberrant methylation patterns (Li et al., 2009; Roberson et al., 2012). In skin cancers (in common with many cancers), both hypermethylation of tumour suppressor genes and hypomethylation of oncogenes are observed (Cheng and Cho, 2012). Changes in methylation are also believed to be a cause of age-related changes in the skin, with some regions of key genes becoming hypomethylated and others hypermethylated with age (Koch et al., 2011). The role of methylation in scarring is less well known, with work to date focusing on changes in methylation in keloid scars.

Identifying whether there are changes in methylation that persist in scar fibroblasts and contribute to the maintenance of scar phenotype may identify a possible therapeutic approach to improving scar outcome.

1.2.5 DNA methylation in wound healing and fibrotic disease

The modulation of gene regulation by methylation has been implicated in fibrotic disease in the liver (Yoshida et al., 2004), lung (Rabinovich et al., 2012) and kidney (Bechtel et al., 2010). The influence of DNA methylation in wound healing has predominantly been investigated in cases of aberrant wound healing such as diabetes and keloid scars (Russell et al., 2010; Mann et al., 2007). To date, very little work has been done on human tissue. In the human keloid studies to date, cultured keloid scar cells have shown differential methylation in several homeobox (HOX) genes, which are involved in body patterning. When an inhibitor of methylation (5-aza-2’-deoxycytidine) was added, expression of collagen, connective tissue growth factor (CTGF) and other pro-fibrotic compounds was decreased and the expression of the body patterning gene HOXA10 increased (Russell et al., 2010). The effect of the methylation inhibitor suggests that methylation may be important in the control of collagen production and therefore important in scarring.

In mice, demethylation of polycomb group proteins (PcGs), which regulate wound repair genes, has been found in vivo during wound healing (Shaw and Martin, 2009b). The demethylation of these PcGs, together with the upregulation of the demethylases Lysine (K)-Specific Demethylase 6B (JMJD3) and Lysine (K)-Specific Demethylase 6A (UTX), induces the repair genes in cells at the wound edge, activating the wound repair process (Shaw and Martin, 2009a). Problems with the regulation of this gene expression have been hypothesised to be the cause of hyper-proliferation and excessive scar formation, which may lead to fibrotic pathology (Shaw and Martin, 2009b).

1.2.6 Histone remodelling

The second type of epigenetic modification that is well understood is that of remodelling the histone proteins, around which DNA is wrapped. Histones are proteins whose function is to compress the DNA (approx. 1.8 m long if stretched out in a single piece) down into its functional size that fits inside the cell nucleus (around 6 µm). They are octameric proteins with a long tail protruding from the nucleosome (centre of the protein) that can be covalently modified. The tails can be methylated, acetylated, ribosylated, phosphorylated, sumoylated or ubiquitinated (Shiio and Eisenman, 2003). Modification of the histone tail can cause the histones to change their conformation, either unwinding the DNA to allow transcriptional proteins access or winding the DNA up more tightly to deny transcriptional proteins access to promoter regions. The pattern of chemical modification of the histones is called the ‘histone code’ (Strahl and Allis, 2000).

Histone modification is generally more transient than physical DNA modification (e.g. methylation) and the cells do not have to undergo cell division to change phenotype if a change in histone conformation occurs (Reik, 2007). This transient nature means that it may not be as attractive a target for the maintenance of scar phenotype as DNA methylation, as we know that scar phenotype is maintained indefinitely. However, it may still be of interest as epigenetic traits such as histone conformation and methylation are often closely related (Cedar and Bergman, 2009).

1.2.6.1 Histone remodelling in skin

In a similar way to methylation, aberrant histone remodelling is thought to be a cause of pathology in the skin, as well as a factor in aging (Li et al., 2011). Many cancers, including skin cancers, have disruptions to histone remodelling processes, which are implicated as an important factor in their formation (Wang et al., 2007). However, very little work has been done on histone remodelling in human skin in vivo. Histone remodelling is important for repairing ultraviolet (UV) damaged DNA in skin, and has been shown to be important in the excision and repair of the DNA of human melanoma cells in vitro (Wang et al., 2006). Cockayne Syndrome (CS), a photosensitive disorder and cause of skin cancer predisposition, is also associated with disrupted histone remodelling. Although the precise mechanism is not understood, a mutation in either the Cockayne syndrome 1 (CSA) or Cockayne syndrome B protein (CSB) gene is thought to stop the chromatin remodelling that normally takes place, preventing DNA repair enzymes from accessing and repairing DNA lesions caused by UV exposure (van der Horst et al., 1997). Histone remodelling is difficult to measure in the skin when compared to methylation, due to the more transient nature of the remodelling. Histone deacetylase (HDAC) inhibitors, which modulate histone remodelling by preventing histones from being deacetylated and going into a ‘closed’ conformation (and which usually increase gene transcription levels), have been touted as a method to treat cancer (Witt et al., 2009), but little work has been completed so far on their effects on cancer or scar formation and maintenance in skin.

1.2.6.2 Histone remodelling in wound healing

Histone remodelling is important in wound healing, extracellular matrix remodelling and fibrosis (Schiller et al., 2004). It is important in gene transcription as SMAD complexes (which can both activate and repress gene expression) recruit transcriptional co-repressors such as TGF-β induced factor homeobox 1 (TGIF), SKI proto-oncogene (SKI) and SMAD nuclear interacting protein 1 (SNIP1), which recruit histone deacetylases. These histone deacetylases oppose histone acetyltransferase activity, repressing transcription. These co-repressors inhibit gene responses to TGF-β, which is a major protein controlling cellular proliferation and differentiation in wound healing. Expression levels of these transcriptional co-repressors have to be carefully balanced, as even slight reductions in the level of TGIF expression have been shown to have devastating developmental consequences (Schiller et al., 2004).

1.2.6.3 Histone remodelling in scarring and fibrotic disease

Histone remodelling has been associated with fibrotic pathology, and modulation of histones has been touted as a possible treatment. In normal skin in mice, cutaneous radiation syndrome symptoms similar to those of patients undergoing cancer radiation therapy were reduced by the topical application of phenylbutyrate, an HDAC inhibitor, in vivo. Late skin necrosis, inflammation and fibrosis were all reduced, suggesting that a hyperacetylated histone conformation may be partially responsible for fibrosis in normal skin (Chung et al., 2004). In liver cells in vitro, the methyltransferase gene absent, small, or homeotic (ASH1), which controls histone H3 lysine 4 (H3K4) conformation, was found to be a positive regulator of several pro-fibrotic genes in both rat and human cells. Depletion of ASH1 caused broad suppression of pro-fibrotic genes, suggesting H3K4 conformation is important in liver fibrosis (Perugorria et al., 2012). Fibroblasts from patients with systemic sclerosis (SSc), an autoimmune disease that causes progressive fibrosis of the skin and internal organs, have been shown to decrease their collagen and fibronectin output when treated with HDAC inhibitors in vitro, suggesting that an aberrant histone conformation is contributing to fibrosis in SSc patients. Treatment with these same HDAC inhibitors in a bleomycin-induced mouse model of fibrosis prevented dermal accumulation of extracellular matrix (Hemmatazad et al., 2009), also suggesting histone conformation is important in fibrosis.

Another HDAC inhibitor, trichostatin A (TSA), has been used to prevent corneal haze, a type of corneal fibrosis, after laser-assisted in situ keratomileusis (LASIK) surgery on human cells in vitro and rabbit corneas in vivo (Sharma et al., 2009). The HDAC inhibitors used to date are non-specific and there is limited understanding of their effects. However, further development is of great interest to the pharmaceutical industry, and the positive results to date suggest the importance of histone conformation in scar maintenance and fibrosis.

1.2.6.4 Relationship between DNA methylation and histone remodelling

Histones can undergo a variety of chemical modifications, including acetylation, methylation, ubiquitination, sumoylation, ribosylation and phosphorylation (Shiio and Eisenman, 2003). The methylation of histones is usually found in heavily condensed heterochromatin, in which the DNA is transcriptionally silenced. This heavy condensation also coincides with a high level of CpG methylation of the DNA itself. Conversely, hyperacetylated histone conformation is associated with unmethylated CpG sites in DNA (Bird and Wolffe, 1999). Of the histone modifications, the best characterised to date are histone deacetylation and methylation of histone H3 at lysine 9 (H3K9me1). The existence of an epigenetic ‘conversation’ between histones and DNA, involving cytosine methylation, histone deacetylation and H3K9 methylation, leading to transcriptional silencing, is now well established (Fuks, 2005). The mechanisms by which this occurs are not clear, as there is evidence both that DNA methylation can induce histone modification and that histone modification can induce DNA methylation.

The evidence that DNA methylation drives histone modification is that components of the DNA methylation machinery such as DNA methyltransferases (DNMTs) and methyl-CpG-binding domain (MBD) proteins are able to bind repressor complexes containing HDACs (Bird, 2002). The evidence that histone modification influences DNA methylation is that methylation of lysine 9 of histone H3 acts as a ‘beacon’ for DNA methylation, and DNMTs interact with Suv39h H3K9 methyltransferases. Loss of H3K9 methylation in Suv39h-knockout embryonic stem cells decreases DNMT-dependent CpG methylation in these cells (Lehnertz et al., 2003). This relationship may mean that both play a role in the formation and maintenance of scar phenotype. A better understanding of the dynamic relationship between these two key epigenetic regulatory mechanisms and the key drivers will be important in future studies of epigenetic control and for therapeutic applications.

1.3 Reprogramming cells

While scarring always occurs after injury in adult mammals, regeneration of damaged tissue without scarring can occur in humans, albeit only in certain time periods and in certain tissues. Other vertebrates closely related to humans share our propensity for scar formation after injury. However, there are vertebrates such as salamanders and axolotls that are able to regenerate without scarring (Roy and Gatien, 2008). They are able to do this by reprogramming mature cells into a less mature state (dedifferentiation), after which the cells migrate to the wound and redifferentiate into the required tissue (Kragl et al., 2009). Successful regeneration through reprogramming in other vertebrates suggests it may be feasible in humans with the right tools.

1.3.1 Cellular reprogramming and regeneration in vivo

1.3.1.1 In humans

After an early stage in embryogenesis, when early gestation foetal skin wounds repair rapidly and in the absence of scar formation, no whole organs can completely regenerate in either animal or human foetuses (Larson et al., 2010). However, tissues such as the digit tips, endometrium and liver are capable of regenerating large parts of themselves (Illingworth, 1974). Loss of up to 75% of liver tissue can result in regeneration of a functioning organ back to the same size and function as the original. This is accomplished by a hyperplastic response that involves replication of virtually all the mature functioning cells in the remnant liver (Taub, 2004). This is not true ‘regeneration’: although the function of the liver is restored, the lobes of the liver do not reform, so the form is different. It can more accurately be defined as compensatory hyperplasia (Michalopoulos and DeFrances, 1997).

In the case of the endometrium, the columnar epithelial tissue grows to a thick, blood vessel-rich, glandular tissue layer, which is then shed during menstruation, with the cycle repeated over a 28 day time period (Masuda et al., 2007). The endometrium is thought to contain a pool of multipotent stem cells within the deep basalis layer, capable of cyclically producing progenitor cells that further differentiate into each endometrial cell component, allowing the regeneration of the endometrium over 400 times during a woman’s reproductive years (Masuda et al., 2007). A combination of resident adult epithelial stem cells and stromal stem cells is thought to be the origin of regeneration in the endometrium (Gargett et al., 2008). The tips of the digits are also able to be regenerated, complete with fingernail and fingerprint, but only the terminal phalanx and only if treated conservatively (Fig. 1.9) (Gardiner, 2005). In mice, this occurs by dynamic expression of the msh homeobox 1 (Msx1) gene during digit tip regeneration, and formation of fate restricted progenitor cells from the endoderm, mesoderm and ectoderm, and the same mechanism has been proposed in humans (Lehoczky et al., 2011).

Figure 1.9: Regeneration of human digit tip in a young child with conservative treatment (Illingworth, 1974).

1.3.1.2 Regeneration in other vertebrates

In other organisms close to humans on the evolutionary tree, regeneration of whole organs is rare. As with humans, other mammals are able to regenerate digit tips, endometrium and liver tissue, but have scarring and fibrotic tendencies similar to humans in many other tissues (Han et al., 2005). Although slightly further away on the evolutionary tree, urodele amphibians such as the axolotl and the salamander have the ability to regenerate lost limbs at any stage in life (Roy and Gatien, 2008). Axolotls are particularly adept at regenerating tissues and are able to regenerate limbs, jaws, tail, spinal cord, heart and brain as well as parts of many other organs (Roy and Lévesque, 2006).

Limb regeneration has been the focus of the majority of the scientific research on axolotls, and the regeneration has been found to occur by de-differentiation of mature cells. There is a bi-phasic response in limb wound healing, with a preparation phase and a redevelopment phase (Gardiner et al., 1999). The preparation phase involves formation of epithelium to cover the wound, reorganisation of the ECM, de-differentiation of the cells within a few millimetres of the amputation plane, migration and proliferation of the de-differentiated cells and formation of the blastema (Brockes, 1997). The cells in the blastema are not pluripotent, however; rather, the blastema is a heterogeneous collection of restricted progenitor cells (Kragl et al., 2009). From this blastema, the cells re-differentiate and form all the tissues required for a functional limb. Figure 1.10 shows the restricted pattern of dedifferentiation that underpins limb regeneration in the axolotl. Cells from specific lineages can only partly dedifferentiate and subsequently their redifferentiation is also restricted (Fig. 1.10b).

Figure 1.10: Regeneration of axolotl limb after amputation. Each cell lineage dedifferentiates but the ability of these cells to redifferentiate is restricted dependent on the lineage of origin (a) (Kragl et al., 2009).

The main differences between human and axolotl healing are in the epithelial coverage time and the de-differentiation, direction and migration speed of the fibroblasts. While the epithelium will cover the injured area in humans, it does so far more slowly, and although the fibroblasts do migrate, their migration pattern appears random; they do not de-differentiate, and many of them actually differentiate even further into myofibroblasts that contract the wound (Han et al., 2005). Although the last common ancestor of axolotls and humans lived around 395 million years ago, most of the genes and signalling pathways involved in regeneration are the same in both species. Therefore, modulation of the human healing process from a fibrotic reparative process to an axolotl-like regenerative process is a reasonable and desired possibility (Roy and Gatien, 2008).

1.3.2 Cellular reprogramming in vitro

Reprogramming mature cells back to a more ‘immature’ multipotent or pluripotent state has been the holy grail of regenerative medicine research, as pluripotent cells have the ability to differentiate into any tissue in the body, and have the potential to regenerate tissues lost to disease or injury without scarring (Nelson et al., 2009). Reprogramming of mature cells was first achieved in 1962, when Sir John Gurdon reprogrammed mature Xenopus (clawed frog) intestinal epithelial cells back into a pluripotent state by transferring the mature cell nuclei into enucleated eggs (somatic cell nuclear transfer), from which he obtained viable tadpoles (Gurdon, 1962). It was first demonstrated in adult mammals by Wilmut and Campbell with the successful cloning of Dolly the sheep by a similar method in 1997 (Campbell et al., 1996).

Somatic cell nuclear transfer in humans suffered from many issues, such as the scarcity of cells and the ethical issues of using and discarding viable human oocytes, so alternative methods of producing pluripotent cells were sought. One such method was discovered in 2006 by Yamanaka, who found that transfection of four transcription factors, Oct-4, Sox2, Klf4 and c-Myc, was able to mimic the effects of the oocyte based transfer method in mature cells and induce pluripotency, first in mouse cells and then in human cells (Takahashi and Yamanaka, 2006; Takahashi et al., 2007). These induced pluripotent stem (iPS) cells are able to differentiate into many different mature cell types, including neurons, cardiac cells, bone and blood precursor cells and many others (Chambers et al., 2009; Burridge et al., 2011; Grigoriadis et al., 2010). iPS cells are very similar to embryonic stem (ES) cells in morphology, gene expression and in histone conformation, but have been shown to retain some of the methylation pattern of the original cell type (Lister et al., 2011). Therefore they do not represent truly ‘reprogrammed’ cells but rather an intermediate cell which has lost much of the epigenetic modification associated with differentiation.

1.3.3 Therapeutic cellular reprogramming

Using cellular reprogramming in therapeutics has been a very recent development, with only a handful of animal and clinical trials and no cellular reprogramming therapeutics currently a part of standard medical practice.

1.3.3.1 Human clinical trials

There have been a number of clinical trials using ES cells, with Geron Corporation initiating the first human ES (hES) cell therapeutic trials in 2009, injecting allogeneic hES cell derived oligodendrocyte precursor cells (OPCs) into patients with thoracic spinal cord injuries (Lebkowski, 2011). Unfortunately, this trial was discontinued in 2011 for financial reasons, although the patients that were treated will continue to be monitored. Another trial used hES cells for macular degeneration, injecting allogeneic hES cell derived retinal pigment epithelial (RPE) cells into patients’ eyes, and showed impressive preliminary results (Schwartz et al.). The use of iPS cells in clinical trials is even rarer, with only one trial in the process of being performed in Japan. In this trial, the patient’s skin cells were induced to form iPS cells, then differentiated into a sheet of RPE cells and implanted into her eye to prevent the worsening of her macular degeneration (Reardon and Cyranoski, 2014). Despite the lack of translation to date, interest and research into the potential of reprogrammed cells for therapeutic use remains high.

1.3.3.2 Pharmacologic epigenetic reprogramming

Generation of both ES cells by somatic cell nuclear transfer and iPS cells through use of specific transcription factors can be seen as forms of extreme epigenetic reprogramming. Both strip almost all the epigenetic marks from the DNA and both involve treatment of cells in vitro before they are used in vivo. There are also pharmacologic interventions for both methylation and histone modification in vivo, and a handful of studies using each to treat diseases linked to abnormal epigenetic status.

1.3.3.2.1 Methylation modifying drugs

The most successful and well known methylation modifying drugs are 5-aza-cytidine and its derivative decitabine. Originally developed as chemotherapy drugs, both demethylate DNA by preventing methyltransferases from adding methyl groups to DNA during cell division at specific loci. They are used to treat myelodysplastic syndrome (MDS), a preleukemic bone marrow disorder, and are now part of the recommended treatment (Hagemann et al., 2011). Responses to these drugs are most apparent in MDS patients who have not previously been treated with drugs, and thus have not had a chance to develop drug resistance after several cycles of therapy. MDS is a slowly evolving disease, which provides the most efficient testing conditions for hypomethylating drugs, and the use of these drugs in other slowly evolving cancers with similar characteristics, such as acute myeloid leukaemia (AML), is being investigated (Issa and Kantarjian, 2009). Hypomethylating drugs are often used in combined therapy with other epigenetic modifiers, such as HDAC inhibitors including valproic acid (Soriano et al., 2007). Following on from the success of these methyltransferase inhibitors, other inhibitors such as zebularine and RG108 have been tested for possible anticancer effects, with promising results in vitro and in animal models (Stresemann et al., 2006; Chen et al., 2012b; Graca et al., 2014).

Reprogramming cells back to an embryonic-like state may be seen as extreme for treatment of scarring, but advances in using stem cells or iPS cells to treat other pathologies will have flow-on effects for other diseases, including their use in scarring. A targeted de-differentiation of mature scar fibroblasts into a less mature (but not embryonic) cell type that would then be re-differentiated to form new tissue, similar to what occurs naturally in the axolotl, would be an ideal treatment for scarring and indeed for regeneration of other tissue lost to disease or injury.

1.4 Summary

Scarring remains a significant clinical problem even with state-of-the-art care. The change in physical appearance as a result of scarring causes emotional stress, loss of self-esteem, depression and diminished quality of life. Functional deficits as a result of scar contractures can also be physically disabling. Fibroblasts are the key cells that maintain the abnormal scar matrix. The mechanism for the maintenance of this scar phenotype is unknown and it is possible that epigenetic reprogramming is at least in part responsible for the stable maintenance of scar matrix for life. Recent advances in the understanding of cell differentiation and programming demonstrate the feasibility of reprogramming cells to alter their phenotype. Therefore, identification of epigenetic modifications that may underpin scar fibroblast phenotype has the potential to lead to novel therapeutic strategies targeted at scar cell reprogramming to alter scar matrix and regenerate a more ‘normal skin’ phenotype. Understanding the role of scar fibroblasts and how they maintain and even expand scar matrix during periods of growth has the potential to identify novel interventions to ameliorate even mature scars.

The work presented in this thesis focuses on identifying epigenetic changes in scar fibroblasts that underpin altered matrix production and/or maintenance. Integrated epigenetic and transcriptome data sets are used to identify novel targets that may be important in the maintenance of the scar fibroblast phenotype. Finally, in vitro validation of identified targets is presented.

Hypotheses

1. Aberrant dermal matrix in scar tissue is maintained post-healing by epigenetically modified fibroblasts.

2. Targeting a subset of genes that are differentially epigenetically regulated in scar fibroblasts will modulate dermal matrix production and ameliorate scar.

Aims

Chapter 2: Investigation of the methylation status of scar fibroblasts

1. To compare the genome-wide methylation patterns in normotrophic scar and normal skin fibroblasts to identify significant differences in the epigenomes of these two cell populations.

Chapter 3: Investigation of the transcriptome of scar fibroblasts

1. To compare the transcriptomes of normotrophic scar and normal skin fibroblasts to identify differentially expressed genes.

Chapter 4: Integrative genomic approach using methylome and transcriptome data to identify targets involved in scar maintenance

1. To identify the genes that are both differentially methylated and expressed in scar fibroblasts compared to normal skin fibroblasts.

2. To identify candidate ‘controller’ genes critical to maintaining scar matrix by incorporating information from functional annotation and published literature.
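The first aim of Chapter 4 amounts to intersecting two gene lists. A minimal Python sketch of that step follows, purely as an illustration under stated assumptions: the function name is hypothetical, the gene identifiers are placeholders rather than results, and the analysis actually performed in Chapter 4 may differ.

```python
def genes_in_both(diff_methylated, diff_expressed):
    """Return, in sorted order, the genes present in both input collections."""
    return sorted(set(diff_methylated) & set(diff_expressed))


# Placeholder identifiers only; these are not findings from this thesis
methylation_hits = ["GENE_A", "GENE_B", "GENE_C"]
expression_hits = ["GENE_B", "GENE_C", "GENE_D"]

print(genes_in_both(methylation_hits, expression_hits))
```

The overlapping genes would then be prioritised using functional annotation and published literature, as described in the second aim.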

Chapter 5: In vitro validation of candidate genes predicted to regulate scar matrix production

1. To explore collagen orientation and quantitation in scar and normal fibroblasts.

2. To confirm increased gene expression of target genes, MKX and FOXF2, in scar fibroblasts using quantitative real time polymerase chain reaction (qRT-PCR).

3. To validate the role of MKX and FOXF2 as pro-fibrotic regulators in scar fibroblasts using siRNA gene knockdown.

Chapter 2

Chapter 2 – Investigation of the methylation status of scar fibroblasts

Introduction

Permanent change to the skin fibroblast epigenome is a plausible mechanism for the maintenance of scar matrix after healing. It is possible that during the repair process epigenetic changes are sustained by the fibroblasts and these changes, as previously observed in other systems, are heritable and maintained for life. This leads to a population of fibroblasts within the scar that perpetuate a sustained and differential program of collagen production and matrix maintenance. If this is indeed the case then there is the potential for a reprogramming approach to modify scar fibroblast phenotype and change the dynamics of ECM production, leading to scar amelioration and a more ‘normal’ appearance of the skin. In this chapter, in order to determine whether scar and normal skin fibroblasts differ in their epigenomes, a genome-wide comparison of DNA methylation was conducted using an array approach in matched patient samples.

Aim: To compare the genome wide methylation patterns in normotrophic scar and normal skin fibroblasts to identify any significant differences in the epigenomes of these two cell populations.

2.1. Methods

All research was conducted in accordance with the National Statement on Ethical Conduct in Human Research 2007 (NHMRC) and approved by the Royal Perth Hospital Human Research Ethics Committee (EC number 2009/114; amendment to include epigenetics component approved 9/4/2010), and was recognised by the UWA human ethics board (recognition of existing ethics reference RA/4/1/5604). All patients gave written informed consent and the study was performed in accordance with the relevant NHMRC ethical statements and guidelines. Patient information sheet and consent forms are available in Appendix I.

2.1.1. Patient recruitment and biopsy procedure

Patients were recruited from the State Adult Burns Service at Royal Perth Hospital, Western Australia. Medical records were examined to identify patients who had sustained previous unilateral burn injury and who fulfilled all inclusion criteria (listed below). A phone call was made to potential participants to explain the procedure and if verbal consent was obtained an appointment at the outpatient burns clinic at Royal Perth Hospital (RPH) was made.

Inclusion criteria for participation in the study were:

• Male gender
• 18-35 years of age
• Injury a flame or scald burn
• Injury over 12 months old
• Injury included one forearm area
• Contralateral forearm uninjured
• Vancouver Scar Score (height score) of burn injured forearm = 0

Exclusion criteria for the study were:

• Unable to give informed consent
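The inclusion and exclusion criteria above can be expressed as a simple screening check. The following Python sketch is illustrative only and was not part of the study procedure: the `Candidate` class, its field names and the `is_eligible` function are hypothetical, and "18-35 years" is assumed to be inclusive at both ends.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical record of what would be read from the medical notes
    male: bool
    age_years: int
    burn_type: str                  # e.g. "flame" or "scald"
    months_since_injury: float
    forearm_injured: bool
    contralateral_forearm_uninjured: bool
    vss_height_score: int           # Vancouver Scar Scale height score
    can_give_informed_consent: bool


def is_eligible(c):
    """True only if every inclusion criterion holds and the exclusion criterion does not."""
    return (
        c.male
        and 18 <= c.age_years <= 35
        and c.burn_type in {"flame", "scald"}
        and c.months_since_injury > 12
        and c.forearm_injured
        and c.contralateral_forearm_uninjured
        and c.vss_height_score == 0
        and c.can_give_informed_consent
    )


# Made-up example record, not patient data
example = Candidate(True, 27, "scald", 30, True, True, 0, True)
print(is_eligible(example))
```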

On the day of the biopsy collection the procedure was again explained to the participant and they were given time to ask questions. If the participant still agreed to participate, and a scar assessment by an experienced assessor rated the height of the forearm scar as 0 by the Vancouver Scar Scale (normotrophic scar (Thompson et al., 2015)), then the consent form was signed and a brief medical history was taken. The biopsy sites were then marked and photographed. Biopsy sites were located on the forearm containing the scar tissue, and matched biopsies were taken from the contralateral uninjured forearm. All biopsy sites were on the ventral surface of the forearm (less sun exposed and fewer hair follicles) to reduce the potential influence of sun exposure on the changes observed. Biopsy sites were cleaned with ethanol and chlorhexidine and a local anaesthetic (mepivacaine and adrenalin) was injected by the attending surgeon. A single 3 mm full-thickness punch biopsy was then taken from each site and the resulting piece of biopsy tissue placed in a 5 mL sterile polypropylene tube and submerged in Dulbecco’s Modified Eagle Medium (DMEM) with 10% foetal bovine serum (FBS) and 5% penicillin/streptomycin (pen/strep) (Life Technologies, USA). Biopsy sites on the patient were then dressed by the surgeon.

2.1.2. Tissue culture

2.1.2.1. Culture of human fibroblasts from 3mm punch biopsies

Fibroblasts from the skin biopsies were cultured using a standard explant method (Schneider and Mitsui, 1976) detailed below.

Tissue was transported to the lab where it was placed in a petri dish and sliced into three equal sized portions. Pieces of the biopsy were then placed dermis side down in a T-25 (25 cm²) (Greiner Bio-One, Germany) culture flask without any media, and a drop of 100% foetal bovine serum (FBS) added to the surface of each piece. Tissue pieces were left to incubate at 37°C for 30 minutes to allow them to stick to the base of the flask. After 30 minutes, the flask was tilted to an upright position to check the tissue pieces had adhered to the flask. If not sufficiently attached, another drop of FBS was added and the biopsy pieces were incubated for a further 30 minutes at 37°C. Once the biopsies had adhered, the flask was rotated to an upright position and 5 mL of DMEM with 10% FBS and 5% pen/strep added (normal cell culture medium). The flask was then slowly rotated to a horizontal position, taking care to make sure the biopsies remained adhered to the base of the flask.

The tissue pieces were then incubated at normal cell culture conditions (37°C, 5% CO2) until fibroblasts began to migrate and divide. Media was changed every 48 hrs. Once cells reached confluence in the T-25 cell culture flask, they were trypsinised following the protocol outlined in section 2.1.2.2 and placed into a T-75 (75 cm²) flask (Greiner Bio-One, Germany) (1:3 split) marked as passage 1 (p1), and the biopsy tissue was discarded. Once cells were confluent, cells were transferred to three T-75s (1:3 split) marked as passage 2 (p2). Once the p2 cells were confluent, two T-75s were frozen down and the remaining confluent viable cells (one T-75) were trypsinised following the protocol outlined in section 2.1.2.2, and isolated cells divided so that half the cells were used for DNA extraction as outlined in sections 2.1.3.1 and 2.1.3.2 and half were used for RNA extraction as outlined in section 3.1.1.1.

2.1.2.2. Cell passage protocol

Cells were passaged when at 90-95% confluence. Prior to addition of trypsin, the media was removed and 5mL of sterile phosphate buffered saline (PBS) (Sigma-Aldrich, USA) was added to wash remaining media from the flask. The PBS was then aspirated and 0.05% trypsin with ethylenediaminetetraacetic acid (EDTA) (Life Technologies, USA) added at 1mL per 25cm2. The cells were then incubated for 10 minutes at 37°C, 5% CO2 and flasks examined to ensure cells were detached. Media and cells were removed and placed in a 15mL sterile polypropylene tube, along with an equal amount of media containing 10% FBS to inactivate the trypsin. Cells were then centrifuged in a benchtop centrifuge (Eppendorf 5810, Germany) and spun at 1500rpm for 3 minutes. The supernatant was aspirated and cells resuspended in 3 mL normal cell culture media. If required, cells were counted using a haemocytometer prior to further processing.

2.1.3. DNA extraction and processing

2.1.3.1. Wizard SV genomic DNA system

DNA extraction for the first two patients was carried out using the Promega Wizard SV Genomic DNA system (Cat. No. A2360, Promega, USA) according to manufacturer’s instructions. All procedures were carried out at room temperature unless otherwise stated. Briefly, cells for DNA extraction were centrifuged (1500g, 5 minutes at room temperature) in a benchtop centrifuge (Biofuge 13, Heraeus, Germany) and 150µL lysis buffer added to the cells and mixed by pipetting. Lysed cell solution was added to the SV mini-column assembly and spun at 13000g for 3 minutes. The flow through was discarded and 650µL of the wash solution added then centrifuged at 13000g for 1 minute. The flow through was again discarded, and the wash procedure repeated for a total of 4 washes. After the fourth wash, the flow through was discarded and the column dried by centrifugation for 2 minutes at 13000g. The mini-column was then transferred to a new 1.5mL tube and 250µL nuclease free water was added. The column was incubated for 2 minutes at room temperature and centrifuged at 13000g for 1 minute. DNA was then stored at -20°C.

2.1.3.2. QIAamp DNA mini kit system

Subsequent patient samples (4 patients) used the QIAamp DNA mini kit system (Cat. No. 51304, Qiagen, Netherlands) as per the manufacturer’s instructions. All procedures were at room temperature unless otherwise stated. 200µL AL buffer was added to prepared cell pellets and the samples vortexed for 15 seconds, then transferred to a 1.5mL tube and incubated for 10 minutes at 56°C. The tubes were then briefly centrifuged at full speed (14,000 g) in the benchtop centrifuge (Biofuge 13, Heraeus, Germany). Two hundred (200) µL of 100% ethanol (Sigma-Aldrich, USA) was then added to the samples, after which they were vortexed and spun briefly in the microfuge. The samples were then added to QIAamp mini spin columns and centrifuged at 6000 g for 1 minute and the flow through was then discarded. The columns were then placed in new 2 mL collection tubes and 500µL of buffer AW1 was added. Samples were then centrifuged again at 6000 g for 1 min. The tube and flow through were then discarded and the columns placed in new 2 mL tubes.

Five hundred (500) µL of buffer AW2 was then added to the columns and centrifuged at 20000 g for 3 minutes. The flow through and tubes were discarded and the columns placed in new 2mL collection tubes and centrifuged at 20000 g for 1 minute. Columns were then placed in 1.5mL centrifuge tubes and 200µL buffer AE was added. The samples were then incubated at room temperature for 1 minute and then centrifuged at 6000 g for 1 minute. A further 50µL of buffer AE was added and the samples were spun again at 6000 g for 1 minute. Samples were then stored at -20°C.

2.1.3.3. Bisulfite Conversion

Bisulfite conversion, which leaves methylcytosines intact but converts unmethylated cytosine bases to uracil, is required for methylation analysis. For these studies DNA was subjected to bisulfite conversion using the EZ DNA Methylation kit (Cat. No. D5001, Zymo, USA) according to the manufacturer’s instructions. All steps were conducted at room temperature unless otherwise specified. As a minimum of 500ng DNA was required for the Illumina methylation array and yield after bisulfite conversion is approximately 80%, 700ng of sample DNA was added to 5µL M-dilution buffer. Sample DNA was then made up to 45µL total volume with nuclease free water and incubated at 37°C for 15 minutes in an MJ Research PTC-200 thermocycler (Bio-Rad, USA). After this incubation 100µL of the CT conversion reagent was added. Samples were then incubated overnight in the thermocycler in alternating cycles of 95°C for 30 seconds and 50°C for 60 minutes, for a total of 16 cycles, then set to a hold temperature of 4°C. After this cycle was completed, 400µL of M-binding buffer was placed in the Zymo-spin IC columns and the columns placed in a collection tube. Samples were added to the columns and mixed by inversion. The columns were then centrifuged at 10000 g for 30 seconds in the benchtop centrifuge (Biofuge 13, Heraeus, Germany) and the flow through discarded. After this first spin, 100µL of M-wash buffer was added to the columns and the columns spun again at 10000 g for 30 seconds. Then, 200µL of M-desulphonation buffer was added to the columns, incubated for 20 minutes and then centrifuged at 10000 g for 30 seconds. After this step, 200µL of M-wash buffer was added and the columns spun for 30 seconds, and this procedure repeated. The columns were then placed in a new 1.5mL centrifuge tube, 10µL of M-elution buffer was added and the columns spun for 30 seconds to elute the DNA. Bisulfite converted DNA was then stored at -20°C before being analysed using the Illumina Infinium array.
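
The input calculation in this section can be checked with a short sketch. The figures (700ng input, approximately 80% yield, 500ng array minimum, 16 cycles of 30 seconds plus 60 minutes) are taken from the text; the function and variable names are hypothetical and the result is only as good as the approximate yield figure.

```python
# Sketch: check the DNA input against the array requirement.
# Numbers come from the protocol text; names are illustrative only.

def expected_recovery_ng(input_ng: float, yield_fraction: float) -> float:
    """Expected DNA mass remaining after bisulfite conversion."""
    return input_ng * yield_fraction

INPUT_NG = 700.0          # DNA loaded per sample
YIELD_FRACTION = 0.80     # approximate recovery after conversion
ARRAY_MINIMUM_NG = 500.0  # minimum required by the methylation array

recovered = expected_recovery_ng(INPUT_NG, YIELD_FRACTION)
margin = recovered - ARRAY_MINIMUM_NG

# Programmed incubation time: 16 cycles of (0.5 min at 95 C + 60 min at 50 C)
cycle_minutes = 0.5 + 60.0
total_hours = 16 * cycle_minutes / 60.0

print(f"recovered ~{recovered:.0f} ng, margin ~{margin:.0f} ng")
print(f"programmed incubation ~{total_hours:.1f} h")
```

The margin of roughly 60ng is what makes 700ng a sufficient input, and the programmed time of about 16 hours is consistent with an overnight incubation.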

2.1.4. Methylation Arrays

The Illumina Infinium HumanMethylation 450K array (Cat. No. WG-314-1003, Illumina, USA) was used for these studies. Bisulfite converted DNA samples were provided to PathWest Laboratories (Nedlands, Western Australia) and arrays were run at this service facility according to the Illumina protocol.

2.1.5. Bioinformatics

2.1.5.1. Processing of methylation data

Methylation data was imported using Illumina® GenomeStudio and analysed using R statistics software (version 2.15.3) (R Core Team, 2015) and the Illumina Methylation Analyser (IMA) package (Wang et al., 2012). Data was imported from the raw files into GenomeStudio and then exported to a text file. This text file was then imported into R using the IMA package and processed according to the instructional vignette available on rforge.net (https://www.rforge.net/IMA/). The IMA package is designed to automate the pipeline for exploratory analysis and summarization of site-level and region-level methylation changes in epigenetic studies utilizing the 450K DNA methylation array. Code is available in Appendix II.

During pre-processing, IMA used beta values as input, representing the methylation level of individual sites reported by Illumina BeadStudio or GenomeStudio software. IMA then filtered out loci with missing beta values and with median detection P-value greater than 0.05. As per the IMA instructions, the raw beta values were analysed without quantile normalisation, as it has been shown that quantile normalization is not sufficient for removing all unwanted technical variation across samples (Teschendorff et al., 2009). Normalisation of methylation arrays is the subject of ongoing research, as there is no gold standard method (Aryee et al., 2011). In addition to this, the primary concern of normalisation, batch effects, did not apply here as all samples tested were run on a single array. Differential statistical analysis was carried out using a pairwise comparison at a CpG site level, using a paired Linear Models for Microarray Data (Limma) test on each site and using a Benjamini-Hochberg (B-H) 5% false discovery rate (FDR) to correct for multiple testing (Benjamini and Hochberg, 1995).
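
The two locus filters described above can be sketched as follows. This is an illustration in Python, not the IMA code itself (the analysis was run in R, see Appendix II); the function name, probe names and example values are hypothetical, and treating any missing β value as grounds for removal is an assumption about how the filter was configured.

```python
import math
from statistics import median

def keep_locus(betas, detection_pvalues, p_cutoff=0.05):
    """Return True if a locus passes both filters described in the text."""
    # Filter 1: drop loci with a missing beta value (None or NaN).
    if any(b is None or math.isnan(b) for b in betas):
        return False
    # Filter 2: drop loci whose median detection P-value exceeds the cutoff.
    return median(detection_pvalues) <= p_cutoff

# Hypothetical data: probe -> (beta per sample, detection P-value per sample)
loci = {
    "cg_example_1": ([0.81, 0.79, 0.85], [0.001, 0.002, 0.001]),
    "cg_example_2": ([0.40, float("nan"), 0.42], [0.001, 0.001, 0.001]),
    "cg_example_3": ([0.10, 0.12, 0.11], [0.20, 0.09, 0.01]),
}

kept = [name for name, (b, p) in loci.items() if keep_locus(b, p)]
print(kept)
```

In this example the second probe is removed for a missing β value and the third because its median detection P-value (0.09) exceeds 0.05.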

Although originally designed for use in expression arrays, Limma has been adapted by the IMA package for use in methylation arrays (Wang et al., 2012). Limma uses linear models to assess differential expression in the context of multifactor designed experiments, providing the ability to analyse comparisons between many RNA/DNA targets simultaneously. It has features that make the analyses stable for experiments with small numbers of arrays, which is achieved by borrowing information across genes. The linear model and differential expression functions are applicable to data from any quantitative gene expression technology including microarrays, RNA-seq and quantitative PCR, and were used for both methylation and expression analysis (Smyth, 2004). Samples were compared pairwise: a moderated paired t-test can be computed by allowing for pair effects (here, the patient from whom each scar and control sample was taken) in the linear model. The B-H 5% FDR correction for multiple testing is widely used in array data as it is less stringent than family-wise error rate (FWER) corrections such as Bonferroni (Benjamini and Hochberg, 1995). FWER techniques control the probability of making any false discovery, whereas the B-H 5% FDR controls the expected proportion of false positives among the sites called significant, which allows more statistical power for very large datasets such as those generated using array technologies (Benjamini and Hochberg, 1995). The CpG site level data was then exported to a CSV file.
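
The difference between the B-H and Bonferroni corrections can be illustrated with a minimal sketch. The five raw P-values are invented for the example and are not data from this study; the functions follow the standard step-up definition of the B-H adjustment rather than any particular package's code.

```python
def benjamini_hochberg(pvalues):
    """Return B-H adjusted P-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def bonferroni(pvalues):
    """Return Bonferroni adjusted P-values (an FWER correction)."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

raw = [0.001, 0.012, 0.025, 0.038, 0.20]  # hypothetical raw P-values
bh = benjamini_hochberg(raw)
bf = bonferroni(raw)

sig_bh = sum(q <= 0.05 for q in bh)
sig_bf = sum(q <= 0.05 for q in bf)
print("significant at 5% with B-H:", sig_bh)
print("significant at 5% with Bonferroni:", sig_bf)
```

With these values four of the five tests pass the B-H threshold but only one passes Bonferroni, which is the gain in power referred to above.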

Samples compared pairwise at a CpG site level used the beta value (β) to estimate the methylation level of the CpG locus: the ratio of the methylated allele intensity to the combined intensity of the methylated and unmethylated alleles. A β of 1 indicates a fully methylated site, whereas a β value of 0 indicates no methylation at that site. The β value for the control data was then subtracted from the β value for the scar data from the same individual, giving a value for β difference (∆β value) (Bibikova and Fan, 2009). A positive ∆β value indicated that the scar fibroblast CpG site was more methylated than the control skin fibroblast, and a negative ∆β value meant that the scar skin fibroblast CpG site was less methylated than the same site in the control skin fibroblasts. The magnitude of the difference indicated the size of the change between the two cell types.
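
A minimal sketch of the β and ∆β calculation follows. The intensities are hypothetical, and the constant offset of 100 in the denominator is the value commonly used with Illumina arrays; it is an assumption here, as the text does not state it. The sign convention (scar minus control) matches the interpretation of positive and negative ∆β given above.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Beta = M / (M + U + offset).

    The offset stabilises the ratio at low intensities; 100 is an
    assumed default, not a value stated in the text.
    """
    return methylated / (methylated + unmethylated + offset)

def delta_beta(beta_scar, beta_control):
    """Positive result: site more methylated in scar than in control."""
    return beta_scar - beta_control

# Hypothetical signal intensities for one CpG site in one patient
scar = beta_value(methylated=6000.0, unmethylated=3900.0)
control = beta_value(methylated=3000.0, unmethylated=6900.0)
d = delta_beta(scar, control)

print(f"scar beta {scar:.2f}, control beta {control:.2f}, delta beta {d:+.2f}")
```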

The samples were then compared at a gene region level using the ‘regionswrapper’ function (Wang et al., 2012), again using a Limma test with a 5% FDR and using the median β value as the region methylation index. This compared 11 different regions: 6 UCSC RefSeq gene regions and 5 UCSC CpG regions. The 11 regions were:

• The promoter regions TSS1500 and TSS200, referring to the region within 1500 base pairs and 200 base pairs of the transcription start site respectively (2 regions).
• 5’ and 3’ untranslated regions of the gene (UTR5 and UTR3) (2 regions).
• The first exon of the gene.
• The body of the gene.
• The CpG islands and surrounding areas. These are defined by a set of criteria and are independent of their proximity to genes, although they are usually close. The criteria are: GC content above 50%, ratio of observed versus expected number of CpG dinucleotides above 0.6, and more than 200 base pairs in length.
• North and south shores, and north and south shelves. These are defined as regions flanking these CpG islands, with the shores being up to 2kb and the shelves being 2-4kb from the CpG islands (4 regions).

This region level data was then exported to a CSV file.
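
The CpG island criteria and the shore/shelf distances listed above can be expressed as a short sketch. The observed/expected ratio is computed with the usual formula (CpG count × length) / (C count × G count); the function names, the test sequences and the "open sea" label for sites beyond 4kb are illustrative assumptions, and the north/south orientation of the flanks is omitted for brevity.

```python
def cpg_island_metrics(seq):
    """Return (GC content, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc_content = (c + g) / n if n else 0.0
    # Observed/expected CpG = (CpG count * length) / (C count * G count)
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp

def is_cpg_island(seq):
    """Apply the three criteria: length > 200 bp, GC > 50%, obs/exp > 0.6."""
    gc_content, obs_exp = cpg_island_metrics(seq)
    return len(seq) > 200 and gc_content > 0.5 and obs_exp > 0.6

def flank_class(distance_bp):
    """Classify a site by its distance from the nearest island edge."""
    if distance_bp <= 0:
        return "island"
    if distance_bp <= 2000:
        return "shore"
    if distance_bp <= 4000:
        return "shelf"
    return "open sea"  # label assumed; not one of the 11 regions in the text

print(is_cpg_island("CG" * 150))   # CpG-dense 300 bp sequence
print(is_cpg_island("AT" * 150))   # no C or G at all
print(flank_class(1500), flank_class(3000))
```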

2.1.5.2. Alternative analysis using RnBeads package

An alternative analysis, using M-values instead of the β-values, was performed using the RnBeads package (Assenov et al., 2014). RnBeads analysis involved a preprocessing step, which removed 46061 sites that overlapped with SNPs, subtracted background using the methylumi package (method "noob") (Triche et al., 2013), and normalized the methylation β values using the BMIQ normalization method (Teschendorff et al., 2013). A further 2774 non-CpG probes were removed, as well as 67195 probes whose beta values exhibited a standard deviation lower than 0.005. Together, these two filters removed 69969 probes in preprocessing.

Differential methylation analysis was conducted at the methylation site and region level, comparing scar to control and using a paired analysis. P-values were computed using the limma method: hierarchical linear models from the limma package were used and fitted using an empirical Bayes approach on derived M-values (Smyth, 2004). Use of the M-statistic rather than the β-value for the analysis has been recommended to reduce the problem of heteroscedasticity at the high and low ends of the methylation values, although once analysed the M-statistic is recommended to be converted back to the β-value due to its more intuitive interpretation (Du et al., 2010). A metric called the combined rank was used to determine whether a CpG site or region was significantly differentially methylated. This took into account relative and absolute effect sizes, as well as statistical significance between the sample groups, assigning a score based on the worst of these three measures. This avoided the problem of consistent but minimally different values, which are not biologically relevant, appearing in the list of significantly differentially methylated sites and regions. An automatic rank cutoff was generated by RnBeads for both site and region levels by analysing the best and worst ranking sites or regions, quantifying what proportion of the worst ranking loci have a better p-value than the best ranking loci, and setting a cutoff for that proportion not to be too high (Assenov et al., 2014). This was then used to define differential methylation.
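
The β to M-value conversion and the idea behind the combined rank can be sketched as below. The conversion uses the logit relationship M = log2(β / (1 − β)) described by Du et al. (2010). The ranking function is a simplified illustration of "worst of three ranks", not the RnBeads implementation: tie handling is naive and the three example sites are invented.

```python
import math

def beta_to_m(beta):
    """M = log2(beta / (1 - beta)); undefined at exactly 0 or 1."""
    return math.log2(beta / (1.0 - beta))

def m_to_beta(m):
    """Inverse transform, used to report results on the beta scale."""
    return 2.0 ** m / (2.0 ** m + 1.0)

def ranks(values, largest_is_best=False):
    """Rank 1 = best. Ties are broken by input order (a simplification)."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=largest_is_best)
    out = [0] * len(values)
    for r, i in enumerate(order, start=1):
        out[i] = r
    return out

def combined_rank(abs_diff, abs_log_quotient, pvalues):
    """Worst (largest) of the three per-measure ranks for each site."""
    r_diff = ranks(abs_diff, largest_is_best=True)
    r_quot = ranks(abs_log_quotient, largest_is_best=True)
    r_pval = ranks(pvalues)
    return [max(t) for t in zip(r_diff, r_quot, r_pval)]

# Three hypothetical sites; the second has the best P-value but a tiny effect.
abs_diff = [0.30, 0.02, 0.15]
abs_log_quotient = [1.5, 0.1, 0.9]
pvalues = [0.01, 0.0001, 0.03]
scores = combined_rank(abs_diff, abs_log_quotient, pvalues)
print(scores)
```

The second site, which is consistent but minimally different, receives the worst possible score despite having the smallest P-value, which is the behaviour the combined rank is intended to produce.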

2.1.5.3. Enrichment analysis of differentially methylated genes

An enrichment analysis of the differentially methylated genes was carried out using Pathway Studio Mammal® (Elsevier, Netherlands). Differentially methylated genes were uploaded as entities and analysed with the ‘find pathways/groups enriched with selected entities’ tool, which uses Fisher's exact test to identify pre-defined groups, ontologies, and pathways that are statistically significantly enriched for the uploaded entities. Gene set categories were set to the following Gene Ontology (GO) categories: ‘biological_process’, ‘cellular_component’ and ‘molecular_function’. The analysis was then run and the data exported.
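
For readers unfamiliar with the test, a one-sided Fisher's exact test for enrichment is equivalent to the upper tail of a hypergeometric distribution and can be sketched with the Python standard library. This is not the Pathway Studio implementation; the function name is hypothetical, and the background of 20 000 entities in the example is an assumed figure, since the size of the software's reference set is not stated here.

```python
from math import comb

def enrichment_p(universe, group_size, selected, overlap):
    """One-sided Fisher's exact test: P(X >= overlap).

    universe   - total number of entities in the background
    group_size - entities annotated to the group or pathway
    selected   - entities in the uploaded list
    overlap    - uploaded entities that fall in the group
    """
    max_k = min(group_size, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(group_size, k) * comb(universe - group_size, selected - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total

# Illustrative call: a group of 116 entities, 357 uploaded genes, 9 in common,
# against an ASSUMED background of 20 000 entities.
p_example = enrichment_p(universe=20000, group_size=116, selected=357, overlap=9)
print(f"p = {p_example:.2e}")
```

Because the background size is assumed, the printed value should not be expected to reproduce the p-values reported by the software.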

2.2. Results

2.2.1. Differential methylation of CpG sites in scar fibroblasts

There were 3298 CpG sites (0.7% of all sites tested) that were designated significantly differentially methylated (Fig. 2.1A), of which 2049 were proximal to a known gene and 1249 were in intergenic regions (Fig. 2.1B). Of the 3298 CpG sites, 1227 (37%) had increased methylation in scar fibroblasts while 2071 (63%) had decreased methylation in scar fibroblasts (Fig. 2.1C). When sorted by the size of the Δβ between the scar and normal fibroblasts, there was a bimodal, roughly normal distribution, with the most populated group among the positive Δβ values being 0.1 to 0.2 and the most populated group among the negative Δβ values being -0.1 to -0.2, indicating few large changes in the measured ∆β between scar and normal fibroblasts (Fig. 2.1D).

Figure 2. 1: Differential methylation profile of scar and normal skin fibroblasts at CpG site level. 0.7% of individual CpG sites are significantly differentially methylated in scar fibroblasts compared to normal skin fibroblasts (A), the majority of which are proximal to or within known genes (B), and have a small decrease in β value (hypomethylation) (C, D).

2.2.2 Differentially methylated genes by genomic region

Sorting of the differentially methylated sites by region (Fig. 2.2A) shows that a large number of genes are differentially methylated within the gene body region, with relatively even numbers distributed between the TSS1500 and north and south shelves and shores. When controlling for the differential number of sites within each region on the Illumina array, the gene body remains the category with the highest number of changes in methylation (Fig. 2.2B). On examination of the directional change in methylation in each region, the TSS1500, gene body and the north and south shelf regions all have far larger numbers of significantly hypomethylated genes, while the south shore is the only region to have a larger number of hypermethylated genes when comparing scar fibroblast to normal skin fibroblast methylation (Fig. 2.2C). All other groups have a similar number of significantly differentially hypo and hyper methylated genes (Fig. 2.2C). Assessing differential methylation according to chromosomal location identifies the largest number of changes occurring on chromosomes 1, 7 and 19 (Fig. 2.3).

Figure 2. 2: Differential methylation in gene regions in scar fibroblasts. A) Differentially methylated genes sorted by region. B) Percentage of differentially methylated regions of total regions in each group. C) Direction of methylation in differentially methylated regions.

[Figure 2.3 plot. x-axis: Chromosome (1 to 22, X, Y); y-axis: No. of differentially methylated regions (0 to 60); series: Island, Nshelf, Nshore, Sshelf, Sshore.]

Figure 2. 3: Differential methylation by chromosome location. The highest number of significantly differentially methylated regions are seen on chromosomes 1, 7 and 19 when the number of significantly differentially methylated regions is sorted by chromosome.

2.2.3 Methylation profile of promoter regions (within 1500 base pairs of the transcription start site).

There were 398 genes (1.95%) of the 20,387 total tested that were significantly differentially methylated in the promoter region (Fig. 2.4A). Of these, 174 (43.72%) had increased methylation in scar fibroblasts and 224 (56.28%) had decreased methylation in scar fibroblasts (Fig. 2.4B). When looking at the number of genes by β difference (Fig. 2.4C), there was a roughly normal distribution for both the positive and negative β change, with 0.1 to 0.2 and -0.1 to -0.2 being the most abundant levels of change in their respective positive and negative groups.

Figure 2. 4: Differential methylation of promoter regions in scar fibroblasts compared to normal skin fibroblasts.

A) 398 (1.95%) of the 20,387 total gene promoter regions were significantly differentially methylated. B) Direction of methylation change in scar fibroblasts compared to normal fibroblasts in significantly differentially methylated promoter regions of genes. C) Significantly differentially methylated promoter regions of genes grouped by size of Δβ between scar and normal fibroblasts.

2.2.4. Top 40 genes with differentially methylated promoter regions in scar fibroblasts

The genes with significantly differentially methylated promoter regions were then analysed for potential relevant functions in scarring. This was accomplished by reviewing gene ontology of the 40 genes with the largest Δβ in their promoter regions in the differential methylation data sets (20 hyper and 20 hypomethylated compared to control). Table 2.1 lists the 20 genes with the largest positive Δβ in the promoter region in scar fibroblasts (hypermethylated in scar fibroblasts compared to normal skin fibroblasts). The list includes genes involved in the production of the extracellular matrix (ECM), including Laminin Alpha 4 (LAMA4), CD34 Molecule (CD34), Forkhead box F2 (FOXF2) and Tenascin XB (TNXB); genes involved in epigenetic processes, like Glutamate-Rich WD Repeat Containing 1 (GRWD1); transport proteins, including Solute Carrier Family 6 Member 6 (SLC6A6) and ATP-Binding Cassette Sub-Family B Member 4 (ABCB4); and steroid hormone related genes, including Estrogen Receptor 1 (ESR1) and Hydroxysteroid 17-Beta Dehydrogenase 2 (HSD17B2).

Table 2.2 lists those genes with the largest negative change in beta value (-Δβ) in their promoter region in scar fibroblasts (hypomethylated genes in scar fibroblasts). This list includes transcriptional regulators Cyclin L1 (CCNL1) and Replication Protein A3 (RPA3), cell-cell interaction genes Pannexin 3 (PANX3) and N-Ethylmaleimide-Sensitive Factor Attachment Protein Gamma (NAPG), keratin associated genes Keratin Associated Protein 22-2 (KRTAP22-2) and Keratin Associated Protein 6-3 (KRTAP6-3), and immune system related genes Absent in Melanoma 2 (AIM2), CD300e Molecule (CD300E), Defensin Beta 119 (DEFB119) and Glutathione S-Transferase Alpha 3 (GSTA3). The full list of genes with differentially methylated promoters is available in Appendix III.

Table 2.1: Top 20 genes with largest increase in Δβ in the promoter region in scar fibroblasts (p<0.05).

| Gene Symbol | Gene Ontology Term | Adjusted P value | Δβ |
|---|---|---|---|
| LAMA4 | extracellular matrix constituent | 0.022 | 0.298 |
| SLC6A6 | neurotransmitter:sodium symporter activity | 0.011 | 0.249 |
| NMT2 | N-terminal protein myristoylation | 0.045 | 0.228 |
| MIR575 | miRNA | 0.028 | 0.223 |
| PTN | growth factor, ossification | 0.048 | 0.220 |
| FOXF2 | transcription factor, EMT | 0.023 | 0.212 |
| CLEC3B | extracellular matrix protein | 0.030 | 0.193 |
| CNTFR | cytokine receptor activity | 0.046 | 0.190 |
| GIPC2 | extracellular vesicular exosome | 0.030 | 0.189 |
| CEL | lipid metabolic process | 0.021 | 0.188 |
| CD34 | cell-matrix adhesion | 0.022 | 0.181 |
| ESR1 | steroid hormone receptor activity | 0.049 | 0.174 |
| MIR505 | miRNA | 0.022 | 0.171 |
| GRWD1 | Histone methylation | 0.041 | 0.169 |
| ABCB4 | transmembrane transport | 0.002 | 0.166 |
| HSD17B2 | steroid biosynthetic process | 0.040 | 0.165 |
| C6orf186 | methyltransferase activity, methylation | 0.037 | 0.162 |
| TNXB | extracellular matrix glycoprotein | 0.010 | 0.160 |
| CRIP1 | cell proliferation | 0.045 | 0.159 |
| LOC399959 | miRNA | 0.032 | 0.158 |

Table 2.2: Top 20 genes with largest decrease in Δβ in the promoter region in scar fibroblasts (p<0.05).

| Gene Symbol | Gene Ontology Term | Adjusted P value | Δβ |
|---|---|---|---|
| AIM2 | activation of innate immune response | 0.011 | -0.341 |
| MIR424 | miRNA | 0.046 | -0.318 |
| CD300E | immune system process | 0.032 | -0.307 |
| GSTA3 | xenobiotic metabolic process | 0.024 | -0.279 |
| KRTAP22-2 | intermediate filament | 0.046 | -0.264 |
| KRTAP6-3 | intermediate filament | 0.046 | -0.264 |
| C12orf77 | unknown function | 0.011 | -0.259 |
| LCTL | activity | 0.045 | -0.248 |
| DEFB119 | defense response to bacterium protein | 0.032 | -0.238 |
| CCDC86 | protein binding | 0.011 | -0.237 |
| C1orf68 | epidermis development | 0.046 | -0.236 |
| PANX3 | cell-cell signalling gap junction | 0.024 | -0.234 |
| RPA3 | DNA replication | 0.042 | -0.228 |
| SMPDL3A | metabolic process | 0.046 | -0.208 |
| CCNL1 | regulation of transcription by RNA processing | 0.035 | -0.199 |
| C2CD4D | Unknown function | 0.046 | -0.199 |
| LRRTM4 | neurotransmitter-gated ion channel | 0.014 | -0.199 |
| SPAG7 | nucleic acid binding | 0.014 | -0.190 |
| NAPG | intracellular membrane fusion | 0.036 | -0.189 |
| MIR518B | miRNA | 0.030 | -0.187 |

2.2.5. Top 40 ranked genes with differentially methylated promoter regions in scar fibroblasts using RnBeads analysis

Genes with significantly differentially methylated promoter regions identified using the RnBeads analysis were also analysed for relevant functions in scarring. This was accomplished by reviewing gene ontology and literature for the promoter regions of the 40 genes with the best combined rank score in the differential methylation data sets. Table 2.3 lists these 40 genes, which have the lowest (best) combined rank scores in scar fibroblasts compared to normal skin fibroblasts. The table includes both hypermethylated and hypomethylated promoters.

Genes in this list include those involved in ECM production and structure like WW Domain Containing Transcription Regulator 1 (WWTR1), Laminin Alpha 4 (LAMA4) and Forkhead Box F2 (FOXF2), membrane components and adhesion proteins like Protocadherin Gamma Subfamily C 4 (PCDHGC4) and Transmembrane Protein 184A (TMEM184A), and transport proteins like Importin 5 (IPO5). The genes LAMA4, FOXF2, C12orf77, defensin beta 119 (DEFB119), pannexin 3 (PANX3) and SPAG7 were common with the original IMA analysis.

Table 2.3: Top 40 differentially methylated gene promoters from RnBeads analysis sorted by rank score

| Gene Symbol | Gene Ontology Term Name | Δβ | log2 mean methylation quotient | P value adjusted FDR | Rank Score |
|---|---|---|---|---|---|
| RNU6-411P | snRNA class | -0.67 | -1.983 | 0.037 | 29 |
| PCDHGC4 | homophilic cell adhesion via plasma membrane adhesion molecules | 0.31 | 1.984 | 0.094 | 57 |
| IPO5 | protein transporter activity | -0.39 | -1.600 | 0.077 | 62 |
| CYB5R2 | lipid metabolic process | -0.31 | -2.160 | 0.094 | 63 |
| ATP6V0CP1 | Pseudogene | 0.56 | 2.030 | 0.094 | 88 |
| RNU6-839P | RNA pseudogene | -0.53 | -1.546 | 0.109 | 127 |
| C1orf158 | unknown function | -0.25 | -1.461 | 0.094 | 129 |
| MIR4761 | miRNA | 0.25 | 1.478 | 0.108 | 137 |
| TMEM184A | integral component of membrane | -0.37 | -1.264 | 0.108 | 152 |
| IRS1 | positive regulation of mesenchymal cell proliferation | 0.30 | 2.472 | 0.114 | 159 |
| OR7E47P | pseudogene | 0.26 | 1.246 | 0.109 | 161 |
| C12orf77 | unknown function | -0.34 | -1.223 | 0.102 | 166 |
| RN7SL467P | unknown function | 0.26 | 2.179 | 0.119 | 177 |
| WWTR1 | transforming growth factor beta receptor signaling pathway | -0.34 | -1.180 | 0.080 | 186 |
| LAMA4 | extracellular matrix structural constituent | 0.24 | 1.500 | 0.121 | 201 |
| SERTAD2 | transcription coactivator activity | -0.23 | -1.393 | 0.109 | 202 |
| REC8 | meiotic nuclear division | 0.25 | 1.127 | 0.087 | 213 |
| HIF1A-AS2 | non-coding RNA | 0.22 | 1.697 | 0.094 | 215 |
| SEMA3D | cell differentiation | 0.37 | 1.109 | 0.119 | 225 |
| RPL5P24 | RNA pseudogene | -0.26 | -1.946 | 0.127 | 226 |
| CLIC6 | integral component of membrane | 0.22 | 1.429 | 0.124 | 230 |
| COX17P1 | pseudogene | 0.22 | 1.608 | 0.124 | 237 |
| LINC00163 | non-coding RNA | 0.29 | 1.848 | 0.129 | 243 |
| IL15RA | cytokine receptor activity | 0.23 | 1.575 | 0.129 | 245 |
| AP2A2 | vesicle-mediated transport | 0.22 | 1.246 | 0.131 | 250 |
| RASGRP1 | cell differentiation | 0.22 | 1.036 | 0.127 | 271 |
| CXCR6 | C-X-C chemokine receptor activity | 0.34 | 1.167 | 0.134 | 297 |
| AZIN1 | catalytic activity | 0.24 | 1.141 | 0.136 | 309 |
| MIR4696 | miRNA | 0.38 | 1.343 | 0.136 | 312 |
| LRRC18 | cytoplasm | 0.21 | 1.607 | 0.109 | 319 |
| NAGK | nucleotide binding | -0.27 | -1.308 | 0.142 | 327 |
| FOXF2 | Transcription factor, EMT | 0.21 | 0.984 | 0.108 | 330 |
| PCK2 | glucose metabolic process | 0.21 | 0.980 | 0.136 | 337 |
| PANX3 | cell-cell signaling gap junction | -0.20 | -1.077 | 0.094 | 347 |
| IGHV3-74 | Immunoglobulin heavy chain | -0.20 | -1.079 | 0.094 | 349 |
| MIR661 | microRNA | 0.21 | 0.962 | 0.133 | 354 |
| OR8B4 | olfactory receptor activity | -0.33 | -0.962 | 0.087 | 356 |
| EGOT | non-coding RNA | -0.43 | -0.953 | 0.144 | 371 |
| SPAG7 | nucleic acid binding | -0.29 | -0.953 | 0.108 | 375 |
| DEFB119 | defense response to bacterium | -0.23 | -0.943 | 0.134 | 389 |

2.2.6. Enrichment analysis of differentially methylated gene promoter regions in scar fibroblasts

Of the 398 genes with differentially methylated promoter regions in scar fibroblasts, 357 had mapped entities in the Pathway Studio software, from which there were 2 050 pathways/groups enriched with at least one member of the selection. The software calculated the p-value as a ratio of a number of common objects between two pathways to the total number of objects in them. Table 2.4 shows the top 20 enriched groups/pathways, which include many gene sets likely related to scarring, including cell junction, extracellular region, extracellular space, cell adhesion and epidermis development.

Table 2.4: Top 20 enriched groups/pathways from the differentially methylated gene set data.

| Name | # of Entities | Overlap | p-value | Hit type |
|---|---|---|---|---|
| plasma membrane | 5701 | 111 | 1.7E-06 | cellular_component |
| heparin binding | 166 | 11 | 8.8E-06 | molecular_function |
| extracellular region | 2319 | 55 | 1E-05 | cellular_component |
| component of plasma | 1292 | 36 | 1.7E-05 | cellular_component |
| epidermis development | 116 | 9 | 2E-05 | biological_process |
| extracellular space | 1166 | 33 | 2.8E-05 | cellular_component |
| glutamate metabolic process | 15 | 4 | 3.3E-05 | biological_process |
| Keratinization | 62 | 6 | 0.00015 | biological_process |
| Membrane | 7556 | 129 | 0.00016 | cellular_component |
| adenosine receptor binding | 2 | 2 | 0.00016 | molecular_function |
| nerve growth factor receptor | 2 | 2 | 0.00016 | molecular_function |
| cell junction | 664 | 21 | 0.0002 | cellular_component |
| cell adhesion | 658 | 20 | 0.00038 | biological_process |
| regulation of protein | 3 | 2 | 0.0005 | biological_process |
| apical plasma membrane | 301 | 12 | 0.00067 | cellular_component |
| leukocyte migration | 115 | 7 | 0.00075 | biological_process |
| endocytic vesicle membrane | 56 | 5 | 0.00083 | cellular_component |
| cell surface | 508 | 16 | 0.00119 | cellular_component |
| voltage-gated sodium | 17 | 3 | 0.00119 | molecular_function |
| receptor complex | 123 | 7 | 0.00121 | cellular_component |

2.3. Discussion

The data demonstrate there are a small number of significant changes in the epigenome of fibroblasts isolated from established normotrophic scars when compared to normal skin fibroblasts isolated from the same body site and from the same patients. This supports the hypothesis that changes in the epigenome are the mechanism responsible for long-term changes in collagen metabolism in normotrophic scar fibroblasts and subsequently the dermal matrix. The number of changes identified was much smaller than often reported in other studies of epigenetic changes (Sandoval et al., 2011; Zouridis et al., 2012). However, this is to be expected since comparative studies of methylation have predominantly focused on more severe disease states including cancer. In these pathologies a much greater disruption to the epigenome would be expected than between scar and skin fibroblasts, where the phenotypic change is mild. The finding that many of the identified changes occur in the promoter regions of genes known to be biologically relevant to matrix deposition suggests the epigenetic changes identified are likely to be important in scar maintenance. Therefore, it is possible that future work aimed at modulating these epigenetic changes could improve scar outcome by altering scar fibroblast phenotype.

2.3.1. Significant changes in individual CpG site methylation patterns

At a CpG site level, there was a small but significant number of differentially methylated CpG sites, with 3 298 out of the possible 485 000 (0.732%) sites significantly differentially methylated. Studies on other pathologies with a similar study design are rare, but a study on psoriasis, a chronic inflammatory skin disease, identified 1 108 differentially methylated CpG sites (Roberson et al., 2012) in disease samples compared to control. This is only a third of the differences observed in this study between normal skin and normal scar. However, the study on psoriasis used an older array with only 27 578 CpG sites compared to the 485 000 tested here. Therefore, when looking at the percentage change, the psoriasis study had a much higher rate, with 4% of the total CpG sites studied being differentially methylated compared to 0.732% in our normotrophic scar fibroblasts. In cancer studies, the number of differentially methylated CpG sites is far higher, with a study on colorectal cancer showing 6% of all CpG sites were differentially methylated (Sandoval et al., 2011) and a study on gastric cancer showing 42% of CpG sites were differentially methylated (Zouridis et al., 2012). A similar study on keloid cells in our own group has also shown a much greater proportion of differentially methylated genes (M. Alghamdi, pers. comm.). The higher proportion of differentially methylated genes in keloid cells may reflect the ongoing disease state and the likelihood that many epigenetic changes identified are a consequence of continued disease and loss of control of epigenetic and genetic regulation rather than causative changes. This is in contrast to the study here, which compared two very stable cell types with minor phenotypic differences related to ECM maintenance.
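
The proportions compared in this section can be recomputed from the counts quoted in the text with a short sketch (the dictionary keys and function name are illustrative). Note that 3 298 of 485 000 evaluates to approximately 0.68%; the 0.732% quoted above would correspond to a denominator of roughly 450 000 sites, for example the number of sites remaining after filtering, but that reading is an assumption and is not stated in the text.

```python
def percent(count, total):
    """Percentage of sites called differentially methylated."""
    return 100.0 * count / total

# (differentially methylated sites, sites on the array) as quoted in the text
studies = {
    "normotrophic scar (this study)": (3298, 485000),
    "psoriasis (Roberson et al., 2012)": (1108, 27578),
}

rates = {name: percent(n, total) for name, (n, total) in studies.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.2f}%")

# Ratio of absolute counts: psoriasis sites relative to scar sites
count_ratio = 1108 / 3298
print(f"psoriasis/scar count ratio: {count_ratio:.2f}")
```

The count ratio of about one third and the psoriasis rate of about 4% both agree with the figures given above.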

Of the 3 298 CpG sites, 1 227 (37%) had increased methylation in scar fibroblasts while 2 071 (63%) had decreased methylation in scar fibroblasts. Methylation in promoter regions is usually inversely associated with expression, such that decreased methylation is associated with an increase in gene expression and increased methylation results in decreased expression (Choy et al., 2010). However, methylation in other gene regions such as the gene body can have the opposite effect (Yang et al.), and there are exceptions to the rule whereby methylation and expression can be directly associated. Interestingly, genome-wide hypomethylation is present in many cancers (Esteller, 2007), where it causes chromosomal instability (Feinberg, 2004).

The pattern of differential methylation of the CpG sites globally favours hypomethylation (63% compared to 37% hypermethylated) and outside the promoter regions this proportion is even more pronounced, with more hypomethylated sites in the gene body and the south and north shelves. Widespread hypomethylation is common in neurodegenerative disease and has also been linked to ageing, with increasing loss of methylation occurring over the mammalian lifespan (Lu et al., 2013). The data here suggest that changes in the scar fibroblasts are similar to those seen in many other studies of disease, with greater global hypomethylation and hypermethylation only observed in a smaller number of specific genes. However, the balance is not as skewed towards hypomethylation in these scar fibroblasts as it is in many diseases. This is likely because the widespread hypomethylation commonly observed in severe disease reflects a loss of genomic control over time. The more even distribution of changes in methylation found in scar fibroblasts suggests this loss of control has not occurred, as would be expected given the very mild phenotypic changes observed. However, the increased level of hypomethylation in the scar fibroblasts may reflect a limited loss of genomic control. It could be hypothesised that the traumatic initial injury environment after the burn injury triggered a loss of tight epigenetic regulation and that the loss of methylation observed here reflects this period of acute insult, which is at least partially rectified in the mature scar fibroblasts. Alternatively, the similarity with ageing-related changes may reflect the period of rapid replication during acute wound healing and remodelling, such that scar fibroblasts represent a prematurely ‘aged’ population when compared to those from normal skin.

Interestingly, other studies have suggested that it is the hypermethylated genes that may be more biologically relevant whilst the hypomethylation may reflect a loss of tight control of epigenetic regulation in disease (Sharma et al., 2010). In this study the ontology of those genes which were found to be hypermethylated does appear to show that many hypermethylated genes have roles relevant to ECM production, which is notably absent in the list of genes most extensively hypomethylated. However, the importance of each of these classes of genes in maintaining scar phenotype still needs to be investigated.

When sorted by the size of the beta difference (∆β) between the scar and control fibroblasts, there was an approximately normal distribution for each of the positive and negative categories of change. The largest beta change group was the 0.1 to 0.2 group in the positive ∆β group and the -0.1 to -0.2 group in the negative ∆β group, indicating small changes in methylation of the CpG site were the most common difference observed.

The average β value is the ratio of the methylated signal intensity to the total (methylated plus unmethylated) signal intensity, with an average β of close to 1 meaning that all copies of the CpG site are methylated in all DNA samples and an average β of close to 0 meaning that all copies of the CpG site are unmethylated in all DNA samples. Biologically, DNA methylation at a single CpG on a single DNA strand is binary: the site is either methylated or not, and cannot be partially methylated. The continuous value obtained for the average β represents the biological variation within and between the samples and chip binding properties. In this study the differentially methylated sites were sorted by p value, the probability of obtaining the observed sample results, or "more extreme" results, when the null hypothesis is actually true. A large ∆β is potentially more biologically relevant than a small one, as it suggests that the site is strongly differentially methylated, or rather is differentially methylated in a greater proportion of DNA strands in cells from which the DNA was isolated (Song et al., 2013).
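The relationship between probe intensities, β and ∆β can be illustrated with a minimal Python sketch. This is not the code used in this study; the function names, the intensities and the stabilising offset of 100 (a value commonly quoted for Illumina arrays) are assumptions made purely for illustration.

```python
# Illustrative sketch only: intensities are invented, not thesis data.

def beta_value(methylated, unmethylated, offset=100.0):
    """Beta value for one probe from its methylated and unmethylated intensities.

    The offset keeps the ratio stable when both intensities are low.
    The default of 100 is an assumption, not a value taken from this study.
    """
    return methylated / (methylated + unmethylated + offset)


def delta_beta(scar_betas, control_betas):
    """Mean beta across scar samples minus mean beta across control samples."""
    mean_scar = sum(scar_betas) / len(scar_betas)
    mean_control = sum(control_betas) / len(control_betas)
    return mean_scar - mean_control


# Hypothetical (methylated, unmethylated) intensities for one CpG site.
scar = [beta_value(m, u) for m, u in [(9000, 900), (8500, 1400), (8800, 1100)]]
control = [beta_value(m, u) for m, u in [(4900, 5000), (5200, 4700), (5000, 4900)]]

print(f"delta beta = {delta_beta(scar, control):+.3f}")
```

A positive ∆β in this sketch corresponds to a site that is more methylated in scar than in control fibroblasts.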

Other studies, including one on obesity by Benton et al., showed similar or smaller ∆β, with the ∆β range between 0.020-0.273, using the same Illumina 450k array (Benton et al., 2015). In a psychological stress study by Naumova et al., using the lower resolution 27k array, the findings were very similar to those presented here with a bimodal normal distribution of the ∆β, but again with smaller ∆β values than obtained in this dataset (Naumova et al., 2012).

Less common CpG sites with large ∆β values (upper and lower edges of figure 2.1D) are of greater interest as targets for the mechanisms responsible for scar formation and maintenance, and are the priority for further investigation. Their consistently large ∆β between normotrophic scar and control skin fibroblasts suggests a consistent methylation change across the scar fibroblast population. Smaller changes may reflect changes due to individual variation or ‘noise’ introduced during the cell culture process.

2.3.2. Regional and genomic distribution of methylation changes in scar fibroblasts

The classification of CpG methylation according to University of California Santa Cruz (UCSC) RefSeq and CpG categories provides a broader view of the differences in methylation patterns between scar and control fibroblasts. The most striking feature of this analysis is the large number of differentially methylated genes in the GENEBODY category compared to all other categories. This remains the case when adjusting for the number of sites tested in each region. Differential methylation of CpG islands in south and north shores also stands out in the analysis with a large number of hypomethylated genes in each of these groups.

Whilst the effect of methylation in the promoter is the most understood of all regions, and will be discussed in Section 2.3.3, the effect of methylation status on other regions is less clear (Lister et al., 2009; Jones, 2012). Gene body methylation, the largest category by both number and percentage, has been shown to have a positive correlation with gene expression, in that greater methylation appears to be associated with greater levels of expression (Yang et al.). This is the opposite of the accepted dogma for promoter region methylation (Simpkins et al., 1999). There may be a number of different mechanisms by which DNA methylation in transcribed regions could potentially affect (and in particular increase) gene expression. It is possible that this methylation represses alternative promoters, retrotransposon elements or other functional elements to maintain the efficiency of transcription from the gene start site (Yang et al.). Alternatively, these intragenic methylation sites could cause other structural changes that increase expression, in contrast to promoter methylation which in general reduces DNA accessibility to transcription factors. CpG shore methylation is thought to be directly correlated with gene expression (Irizarry et al., 2009), and it would be expected that these genes would be upregulated in hypermethylated shores and downregulated in hypomethylated shores associated with the genes (and therefore many of these genes would show decreased expression). However, even promoter methylation, the most characterised methylation change, does not always repress expression (Bahar Halpern et al., 2014). Therefore, whilst the data provide an interesting snapshot of increased and decreased methylation in specific genes and regions, without further empirical data it is not possible to infer what effects these changes in methylation are exerting on gene expression. It is therefore important to integrate this methylation data with transcriptome data to provide empirical evidence of how these methylation changes affect cell phenotype.

Finally, assessment of genomic location demonstrates that chromosomes 1, 7 and 19 have the largest number of changes in CpG site methylation across the genome (Fig. 2.3). This may be expected for chromosome 1, as it is the largest chromosome and contains the largest number of genes. However, chromosome 7 has 100 million fewer base pairs and half the number of genes, yet it has a similar number of differentially methylated regions. Chromosome 19 is also much smaller than chromosome 1, although it is relatively gene dense. A comparison with other studies was made to see if these patterns were unique to scar fibroblasts. In a study on vestibular schwannoma, a peripheral nerve tumour, using the same chip, the same IMA package in R and the same ‘regionswrapper’ function, the chromosomes with the greatest number of differentially methylated regions were chromosomes 1, 2 and 6 (Torres-Martín et al., 2015). This indicates that an abundance of differentially methylated regions in chromosome 1 may be common in this type of analysis given these findings in very different ‘disease’ states (and as previously mentioned the size and gene density of this chromosome). However, the abundance of change in regions at chromosome 19 and chromosome 7 may be more closely related to the scar fibroblast phenotype.

2.3.3. Changes in promoter methylation – gene level

The promoter region (TSS1500) was focused on as promoter methylation is generally understood to repress gene expression and is the most well understood of the regulatory mechanisms of DNA methylation. Repression is accomplished by the methyl group either physically preventing transcription factors from binding or by recruiting methyl binding domain (MBD) proteins that can interact with histones and restrict access to the DNA by transcription factors (Choy et al., 2010).

Similar to the CpG site data, a small percentage of genes (1.95%) were significantly differentially methylated in the promoter region. The direction of the methylation was more equal between groups in the promoter region compared to the global CpG site level, with 174 genes (43.72%) with increased methylation in the promoter region in scar fibroblasts and 224 genes (56.28%) with decreased methylation in the promoter region in scar fibroblasts when compared to normal skin control fibroblasts.

The overall β differences were smaller than seen in the individual CpG site data, with none of the genes having ∆β >0.35. This may be due to the fact that the IMA package uses the median β value of all the probes for the gene or region. Another study on a set of chronic fatigue patients, using the same chip, IMA package in R and same ‘regionswrapper’ function, showed a similar narrowing of the Δβ range when comparing CpG site data to gene level data. The Δβ values narrowed from a range of 0.32 to -0.36 at the CpG site level to a range of 0.26 to -0.22 at the gene level (de Vega et al., 2014).
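The narrowing effect of summarising a region by its median can be shown with a short Python sketch. This is a simplified illustration rather than the IMA implementation: the function name and the probe β values are hypothetical, and the per-sample handling performed by the package is not reproduced.

```python
from statistics import median


def region_delta_beta(scar_probe_betas, control_probe_betas):
    """Delta beta for a region, summarising each group by its median probe beta."""
    return median(scar_probe_betas) - median(control_probe_betas)


# Hypothetical beta values for five probes in one promoter region.
scar_probes = [0.30, 0.35, 0.80, 0.32, 0.31]
control_probes = [0.28, 0.33, 0.40, 0.30, 0.30]

# Largest single-site difference versus the region-level difference.
site_deltas = [s - c for s, c in zip(scar_probes, control_probes)]
largest_site_delta = max(site_deltas, key=abs)
region_delta = region_delta_beta(scar_probes, control_probes)

print(f"largest site-level delta beta: {largest_site_delta:+.2f}")
print(f"region-level delta beta:       {region_delta:+.2f}")
```

In this invented example a single strongly altered probe dominates the site-level view but barely moves the regional median, which is the behaviour suggested above as a reason for the smaller gene-level ∆β range.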

As with the site level data, a large Δβ value is likely to be more biologically relevant than a small change, representing a more homogenous change in the DNA of the cell population rather than changes in methylation in a small subset of the cells from which the DNA was isolated. Therefore targets with larger Δβ will be of most interest in further investigating the role of epigenetics in controlling scar fibroblast phenotype.

The top 20 hypermethylated and 20 hypomethylated promoter regions were linked to multiple genes known to be important in scarring and fibrosis.

Most of the genes with the hypermethylated promoter regions have been shown to have some function relevant to scarring and fibrosis. Pleiotrophin (PTN) is an ECM-associated protein which has been shown to be activated in cutaneous wound healing and which has been found to have increased expression in hypertrophic scars in vivo (Zhang et al., 2013; Paddock et al., 2003). Laminin subunit alpha 4 (LAMA4) is a key non-collagenous component of the basement membrane, and has been shown to be upregulated in renal fibrosis in rats (Eddy et al., 1995). Solute Carrier Family 6 Member 6 (SLC6A6) is a transporter of taurine and beta-alanine, and has been shown to be upregulated in a mouse model of idiopathic pulmonary fibrosis (IPF) (Tzouvelekis et al., 2007). N-Myristoyltransferase 2 (NMT2) is a protein that facilitates addition of fatty acid to signalling proteins, and deletion of this gene in the kidney cells of mice causes focal segmental glomerulosclerosis, a kidney dysfunction caused by scarring of the glomerulus (Porubsky, 2014). Forkhead Box F2 (FOXF2) is a mesenchyme specific gene involved in lung development and a tumour suppressor, and has been shown to be upregulated in fibrotic lesions in lung transplant tissue (Walker et al., 2011). C-Type Lectin Domain Family 3 Member B (CLEC3B) is involved in platelet activation and has been shown to be downregulated in human lung fibrosis (Estany et al., 2014). Ciliary Neurotrophic Factor Receptor (CNTFR) is a receptor that regulates immune response, is involved in liver fibrosis in mice (Stefanovic and Stefanovic, 2012) and has been shown to be upregulated in inactive hepatic stellate cells, the non-fibrotic subtype of hepatic stellate cells (Liu et al., 2013).

Many of the top 20 genes with hypomethylated promoter regions have also been shown to be important in scarring and fibrosis. Absent in melanoma 2 (AIM2) binds to double stranded DNA in the cytosol and activates an immune response, and has been shown to be involved in many inflammation-related fibrotic pathologies such as systemic sclerosis (Artlett et al., 2011) and liver fibrosis (Boaru et al., 2012), as well as other skin abnormalities including psoriasis, atopic dermatitis, venous ulcers and contact dermatitis (de Koning et al., 2012). MicroRNA 424 (MIR424) is a non-coding RNA involved in post-transcriptional regulation of messenger RNA (mRNA), and has been shown to be downregulated in hypertrophic scar fibroblasts treated with tetrandrine, an antifibrotic drug (Ning et al., 2016). CD300e Molecule (CD300E) is a member of the CD300 family that functions as an activating receptor capable of regulating the innate immune response (Brckalo et al., 2010), and has been linked to persistent inflammation and lung fibrosis (Grabiec and Hussell, 2016). Glutathione S-Transferase Alpha 3 (GSTA3) is another immune enzyme, involved in cellular defence against toxic, carcinogenic, and pharmacologically active electrophilic compounds, and has been shown to be downregulated in human radiation induced vocal fold fibrosis (Johns et al., 2012). Defensin Beta 119 (DEFB119) is a member of the defensin family, which are small cationic peptides with antimicrobial function and are part of the innate immune system. While DEFB119 has not been specifically implicated in fibrosis, other beta defensins such as beta defensin 2 have been shown to be upregulated in pleural mesothelial cells in patients with pleural empyema, an accumulation of pus during pneumonia that leads to pleural scarring (Ashitani et al., 1998). Pannexin 3 (PANX3) is a member of the channel forming pannexin family of genes, which form mechanosensitive ATP-release channels, and are widely expressed in the skin (Cogliati et al., 2016). While PANX3 has not specifically been implicated in scarring, depletion of pannexin 1 (PANX1) in the mouse skin delayed wound healing and increased fibrosis in panx1 knockout mice, suggesting PANX3 may also have a role in this process (Penuela et al., 2014).

The genes with the most differentially methylated promoter regions revealed some expected and unexpected ontologies. Expected changes include the differentially methylated ECM genes in the scar fibroblasts including LAMA4, CD34, FOXF2 and TNXB. Partly unexpected was that these were all hypermethylated, which traditionally correlates with decreased gene expression (Choy et al., 2010), although as previously outlined methylation is not simple to correlate to expression without transcription level data. It may be that the epigenetic changes in these genes lead indirectly to increased collagen production. For example, laminin (LAMA4) is a key non-collagenous component of the basement membrane (Hahn et al., 1980), and hypermethylation accompanied by decreased gene expression may influence or cause abnormal collagen formation and expression in scar tissue. This could arise due to increased fragility or other changes caused by changes to the basement membrane. A similar case can be made for tenascin XB (TNXB), a key structural protein in the ECM which is important in the organisation of collagen and has anti-adhesive properties (Chiquet-Ehrismann, 2004). Therefore hypermethylation accompanied by decreased gene expression may cause the abnormal collagen orientation seen in scar (Verhaegen et al., 2009).

Within the top 20 genes with hypomethylated promoter regions, there were several genes involved in immune processes, including AIM2, CD300E, DEFB119 and GSTA3. Hypomethylation of immune genes suggests sustained activation of expression, and as inflammation is a key contributing factor to scarring (Larson et al., 2010), this sustained activation of immune related genes after injury may contribute to maintenance of the scar phenotype. Alternatively, this change could reflect the origin of these fibroblasts, such that if their lineage is from differentiated monocyte/fibrocyte precursors during healing then the relative absence of methylation at these sites could be due to their cellular origin. Ultimately, empirical data on the transcriptome will be important to unravel the significance of these changes in methylation on gene expression.

2.3.4. Alternative analysis of promoter regions using RnBeads

Using the M-statistic rather than the β-value for the analysis has been recommended to reduce the problem of heteroscedasticity at the high and low ends of the methylation values, although once analysed the M-statistic is recommended to be converted back to the β-value due to its more intuitive interpretation (Du et al., 2010). This was carried out using the RnBeads package (Assenov et al., 2014), and this alternative analysis showed a larger number of differentially methylated promoter regions, with 725 compared to 398 in the original IMA analysis. Of these, 106 promoters were common to both analyses. The range of the Δβ was also much wider, from +0.55 to -0.66 compared to +0.29 to -0.34 in the IMA analysis, although the FDR adjusted p-values were also higher, at 0.036 to 0.384 compared to 0.002 to 0.049 in the IMA analysis. Common genes in both the IMA and RnBeads lists included genes involved in fibrotic processes, such as laminin (LAMA4), Forkhead Box F2 (FOXF2) and Tissue Inhibitor Of Metalloproteinases 4 (TIMP4). This list of common genes is very likely to contain some genes driving scar formation and maintenance. The greater number of differentially methylated targets in the RnBeads analysis, using the package recommended cut off for significance, indicates that the original IMA analysis was not too lenient. There are likely target genes in both individual lists as well as the list of common genes, but further integration with the transcriptomic data is required.
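The conversion between the β-value and the M-statistic described by Du et al. (2010) is a logit transform in base 2. The following Python sketch is illustrative only and is not the RnBeads code; the function names and example values are assumptions.

```python
import math


def beta_to_m(beta):
    """M-statistic from a beta value: log2 of methylated over unmethylated proportion."""
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m):
    """Inverse transform, returning a beta value between 0 and 1."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# The same beta difference of 0.05 is stretched near the extremes on the M scale,
# which is how the transform counteracts compression of variance near 0 and 1.
mid_range_shift = beta_to_m(0.55) - beta_to_m(0.50)
high_end_shift = beta_to_m(0.95) - beta_to_m(0.90)

print(f"M shift for beta 0.50 -> 0.55: {mid_range_shift:.2f}")
print(f"M shift for beta 0.90 -> 0.95: {high_end_shift:.2f}")
```

Because the transform is invertible, statistics can be computed on the M scale and the effect size reported as Δβ, which is the approach described above.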

2.3.5. Summary

This comparative study of the genome-wide methylation patterns in primary human scar fibroblasts versus matched normal skin fibroblasts demonstrated significant changes in the methylation of a subset of genes likely to be important in the biological process of ECM maintenance. This suggests that epigenetic regulation could be an important element in the control of scar matrix deposition and turnover. The origin of these changes, whether due to changes during acute healing as part of the repair response, or whether it is due to altered matrix composition affecting cell biology, or even the origin of cells that form the scar matrix (e.g. epithelial-mesenchymal transition) is not clear from these studies. However, the fact these changes are sustained in scars long after healing is complete has potentially important implications for scar biology.

2.3.6. Limitations of study

Limitations of this study are predominantly due to small sample size, the necessity of cell culture to generate sufficient cell number for DNA isolation and the resolution of the Infinium array. This study only used samples from 6 individuals for comparison of scar and normal skin fibroblasts. This limits the statistical power, although many similar studies also use small sample size due to tissue availability and costs. In this study, the use of matched samples from the 6 individuals to allow for paired analysis of the data significantly increased the statistical power in comparison to samples taken from affected versus unaffected groups. The use of matched samples also reduces the variability caused by other factors such as age, gender, biopsy site and sun exposure. Nevertheless the study would be improved by a greater sample number to increase confidence in the validity of identified targets.
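The gain in power from a paired design can be illustrated with a small Python sketch. The β values below are invented for six hypothetical patients and the functions are a simplified illustration, not the analysis code of this study: between-patient variation is large, while the scar-minus-control difference within each patient is small but consistent.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """t statistic computed on within-patient differences (a minus b)."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def welch_t(a, b):
    """t statistic treating the two groups as independent (Welch form)."""
    standard_error = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical beta values at one CpG site; position i is the same patient in both lists.
control = [0.20, 0.35, 0.50, 0.65, 0.80, 0.40]
scar = [0.27, 0.41, 0.58, 0.70, 0.88, 0.46]

print(f"paired t   = {paired_t(scar, control):.1f}")
print(f"unpaired t = {welch_t(scar, control):.1f}")
```

With these invented numbers the paired statistic is far larger than the unpaired one, because pairing removes the between-patient variation from the error term.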

A second issue is the possible effects of cell culture on the DNA methylation profile. Whilst all samples (scar and control) were treated the same and maintained in culture for the same duration to limit variability introduced by cell culture, changes in methylation during this process cannot be discounted. It has been shown that methylation profiles do change with cell culture, both in early passages (Nestor et al., 2015) and with extended time in culture (Boess et al., 2003). However, in mouse embryonic fibroblasts, most of the early methylation alterations are in the recently discovered hydroxymethylated cytosines, rather than the methylated cytosines analysed in this study. The methylated cytosines in these mouse fibroblasts stayed relatively stable up to passage 2 (Nestor et al., 2015), which was the same passage as the cells used in this study. Human normotrophic and normal skin fibroblast culture time was limited to the minimum possible to extract DNA, RNA and save some cells for further in vitro analysis to reduce variability. In addition, targets of interest are those with large ∆β between scar and control cells. It would be expected that stochastic changes occurring over time in cell culture would be less likely to generate large changes in methylation at individual sites, as a smaller proportion of cells would be affected by each change. Therefore, by focusing on those large changes in beta values, it is likely any changes due to cell culture will have been diminished. A validation of this would be interesting to carry out with further time and resources, analysing both the pre-culture cells and higher passage cells and comparing the changes in epigenetic markers.

The use of the Infinium 450K array is a significant limitation. This array tests 485 000 CpG sites across the genome, with 99% of RefSeq genes and 96% of CpG islands represented. However, this is still only ~1.39% of the roughly 32 000 000 CpG sites in the human genome, the CpG content of which is about 1% (Law and Jacobsen, 2010). More modern technologies using bisulfite whole genome sequencing give a much higher resolution and therefore greater ability to identify areas of significant change. However, these strategies also generate a much larger dataset which is prone to the same difficulties when investigating small sample numbers.

Finally, with large datasets such as arrays there are issues with an increased chance of false positives due to multiple testing. The 5% B-H correction for multiple testing is widely accepted for use in arrays but it is a compromise. It increases power compared to other methods of multiple testing correction by accepting that an expected 5% of the results called significant are false positives, with the remaining 95% expected to be true positives (Benjamini and Hochberg, 1995). This correction increases the rate of false positives compared to other multiple testing corrections, with the risk that some genes identified by the analysis may not actually be involved in maintaining scar.
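The B-H procedure itself is simple enough to sketch in a few lines of Python. The version below is a generic illustration of the published procedure (Benjamini and Hochberg, 1995), not the code used by the IMA or RnBeads packages, and the p-values are invented.

```python
def benjamini_hochberg(p_values):
    """Return B-H adjusted p-values in the same order as the input."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is never larger than the one ranked above it.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values for eight CpG sites.
raw = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
adjusted = benjamini_hochberg(raw)
significant = [p for p, q in zip(raw, adjusted) if q < 0.05]

print(significant)
```

In this invented example five sites have a raw p<0.05 but only two remain significant at a 5% false discovery rate, which illustrates the trade-off between power and false positives discussed above.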

2.4. Conclusion

Despite these limitations, this study provides the first evidence for genome-wide methylation changes in normotrophic scar fibroblasts long after healing has occurred, in both genes and groups of genes related to scarring. This provides evidence to support the hypothesis that epigenetic regulation is the mechanism responsible for long-term changes in collagen metabolism in normotrophic scar fibroblasts and subsequently the dermal matrix. Further study of scar fibroblast phenotype and investigation of these target genes may identify new key targets involved in scarring.

Chapter 3

Chapter 3 - Investigation of the transcriptome of scar fibroblasts

Introduction

In the previous chapter changes in the epigenome were identified in scar fibroblasts when compared to fibroblasts from normal skin. It is therefore reasonable to hypothesise that these changes in methylation will lead to changes in expression of a specific subset of genes and that this will influence cell phenotype.

The regulation of gene transcription is a key step in the control of protein synthesis and the subsequent control of cell phenotype. Whilst there are many other post-transcriptional mechanisms for control of cell function, the transcriptome of any cell is closely related to cell type and function.

Current microarray platforms enable simultaneous measurement of large numbers of transcripts, including the entire protein-coding transcriptome. This includes not only measurements at the level of changes in gene expression but also measures of alternative transcript expression (transcripts generated by alternative splicing of an individual gene). In this chapter, the Affymetrix platform was used to assess comparative expression levels of the transcriptome of scar and normal skin fibroblasts. The cells used were the same as for the genome-wide methylation analysis described in Chapter 2 enabling expression and methylation data to be matched for individuals. The gene expression profile data will be integrated with epigenetic data in Chapter 4.

Aim: To compare normotrophic scar and normal skin fibroblast transcriptomes to identify differentially expressed genes.

3.1. Methods

Patient recruitment and tissue culture procedures were identical to those described in Sections 2.1.1 and 2.1.2. Briefly, 6 male patients aged 18-35 years with a burn injury over 12 months old due to flame or scald were recruited from the State Adult Burns Service of Western Australia, Royal Perth Hospital. All subjects had previously been admitted to hospital for treatment of the acute injury. The burn injury had to include one forearm area and a contralateral uninjured forearm. Two 3 mm full thickness punch biopsies were taken; one from the forearm containing the scar tissue and one from the contralateral uninjured forearm. Fibroblasts from the skin biopsies were cultured to passage 2 in DMEM with 10% FBS and 5% pen/strep using a standard explant method (Section 2.1.2.1) (n=6 scar samples and n=6 matched normal control samples).

3.1.1. RNA extraction and processing

RNA was prepared using the RNeasy mini kit (Cat. No. 74104, QIAGEN, Netherlands) according to manufacturer’s instructions. Cells from one T-75 flask (Greiner Bio-One, Germany) were detached using the standard trypsinisation protocol (section 2.1.2.2). The cells were resuspended in sterile PBS (Sigma-Aldrich, USA), counted and half the total cell count pelleted by centrifugation and used for RNA extraction. First, 350 µL of buffer RLT was added to the cell pellet and mixed by pipetting. Seventy percent ethanol (350 µL) was then added and mixed by pipetting up and down. The samples were then transferred to RNeasy spin columns placed in a 2 mL collection tube, and spun at 8000 g for 15 s. The flow-through was discarded. On-column DNase digestion was then performed by adding 350 µL of buffer RW1 to the samples and then centrifuging at 8000 g for 15 s. The flow-through was then discarded and 10 µL DNase I stock solution was added to 70 µL Buffer RDD, which was then added directly to the RNeasy spin column and incubated at room temperature for 15 min. Following incubation, 350 µL Buffer RW1 was added to the RNeasy spin column and centrifuged for 15 s at 8000 g. The flow-through was discarded, and 700 µL of buffer RW1 was added and centrifuged at 8000 g for 15 s. The flow-through was again discarded, and the procedure repeated twice, this time adding 500 µL of buffer RPE, and discarding the flow-through each time. RNeasy columns were then placed in fresh 2 mL collection tubes and centrifuged at 10000 g for 1 min. The columns were then placed in new 1.5 mL Eppendorf tubes and 30 µL of RNase free water was added, then the columns were centrifuged at 8000 g for 1 min. This step was repeated with another 30 µL of RNase free water added. The RNA was then stored at -20°C.

3.1.2. Affymetrix Human Genechip 2.0 ST preparation and processing

Samples were checked for quantity and quality on a NanoDrop™ 1000 Spectrophotometer (Thermo Fisher Scientific, USA). Any samples with A260/A280 ratios <1.8 were re-purified from the cells using the RNeasy mini kit (QIAGEN, Netherlands). The samples were then diluted to 50ng/µL in a volume of 20µL to give a total RNA amount of 1µg. Aliquots of each sample were then run on the Agilent Bioanalyser (Agilent, USA) at the Lotteries West State Biomedical Facility, Western Australia for quality control. RNA Integrity Numbers (RIN) were >9 and passed standard quality control and the samples were sent to the Ramaciotti Centre for Genomics at the University of New South Wales for processing on the Affymetrix Human Genechip 2.0 ST (Affymetrix, USA) according to the manufacturer’s instructions.

3.1.3. Processing of expression data

Processing of the data was carried out using multiple steps in the statistical software R (R Core Team, 2015), using the Oligo and Limma packages (Carvalho and Irizarry, 2010; Smyth, 2004). The code used can be found in Appendix IV. Oligo provides tools to pre-process different oligonucleotide array types, including gene expression arrays, supported by Affymetrix. Limma uses linear models to assess differential expression in the context of multifactor designed experiments, providing the ability to analyse comparisons between many RNA/DNA targets simultaneously (Carvalho and Irizarry, 2010).

Data was imported from .CEL files using Oligo and then normalised and corrected for background. Gene expression levels were calculated using the Robust Multichip Average (RMA) function of the Oligo package. Normalisation of microarray data is required to make adjustments for systematic errors introduced by differences in procedures and dye intensity effects. Background correction was required to allow for background fluorescence of the dye (Carvalho and Irizarry, 2010).

An expression set (eset) was created for the transcripts, then the samples were compared pairwise using a moderated paired t-test from the linear model, using a 5% false discovery rate (FDR) with a Benjamini-Hochberg (B-H) correction for multiple testing (Benjamini and Hochberg, 1995). The pairs comprised the scar sample and matched normal skin control sample for each patient. The fold change (Fold Change) and the log of the fold change (base 2) (Log Change) were calculated for each gene. Hereafter the Log Change refers to the log of the fold change (base 2). Data was finally exported to a CSV file, where the anti-log of the Log Change data was calculated to determine relative expression. Genes were originally designated as significantly differentially expressed if they had a fold change of >1.5 and a p<0.05 after B-H correction for multiple testing. However, due to the underpowered nature of this study, this p-value cut off was found to be too stringent. Other papers have used a nominal liberal significance value as their cutoff (Rodriguez et al., 2014), and a nominal p-value of p<0.05 uncorrected for multiple testing was used in this study.
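The relationship between the Log Change and relative expression, and the filtering step described above, can be sketched as follows. This Python snippet is an illustration only and is not the R code in Appendix IV; the gene names and values are hypothetical, and treating the >1.5 fold change cut off as applying in both directions is an assumption.

```python
import math

FOLD_CHANGE_CUTOFF = 1.5  # cut off quoted in the text
P_VALUE_CUTOFF = 0.05     # nominal, uncorrected p-value cut off


def relative_expression(log_change):
    """Anti-log (base 2) of the Log Change, i.e. scar over control expression."""
    return 2.0 ** log_change


def is_differentially_expressed(log_change, p_value):
    """Apply the fold change and nominal p-value cut offs.

    Assumption: a 1.5-fold change in either direction counts, which on the
    log2 scale is |Log Change| > log2(1.5).
    """
    passes_fold = abs(log_change) > math.log2(FOLD_CHANGE_CUTOFF)
    return passes_fold and p_value < P_VALUE_CUTOFF


# Hypothetical rows of (gene name, Log Change, nominal p-value).
rows = [
    ("GENE_A", 1.0, 0.010),
    ("GENE_B", -0.8, 0.030),
    ("GENE_C", 0.3, 0.001),
    ("GENE_D", 1.2, 0.200),
]
hits = [name for name, log_change, p in rows if is_differentially_expressed(log_change, p)]

print(hits)
```

A Log Change of 1.0 corresponds to a doubling of expression and a Log Change of -1.0 to a halving, so the anti-log gives relative expression values of 2.0 and 0.5 respectively.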

3.1.4. Gene Set Enrichment Analysis

The Gene Set Enrichment Analysis (GSEA) algorithm in Pathway Studio Mammal® (Elsevier) was used to explore the biological pathway changes in the microarray data. GSEA ranks microarray results by the absolute value of the fold change in experimental results and identifies known gene sets that are statistically enriched based on this ranking (Subramanian et al., 2005).

Gene expression data processed using the Limma and Oligo packages in R (section 3.1.3) was reformatted prior to input into the Pathway Studio Web Mammal® software (Elsevier). This reformatting involved removal of all the data except for the Gene Name and the Log Change columns. The import experiment function was selected, with the experiment type set to ‘gene expression’ and the file type set to ‘Excel’ file. The sheet with the data was then selected, as well as the header row, first data row, column with probe identity and the first sample column. After this step, the number of columns per sample and the expression value column position were both set to 1. The last column of the sample was then set, and in the experiment properties panel the sample type was input as Log-ratio and the experiment named. In the mapping panel following the experiment properties panel, the type of identifier was set to ‘name’ and the probeset gene limit set to 1. As differential expression had already been calculated in R, the differential expression panel was skipped.

Once the experiment had been entered, the experiment viewer was opened and the ‘analyse experiment’ function selected from the tools section. The analysis type was then set to ‘gene set enrichment analysis’, the sample column name to Log Change, the p-value to 0.05, the maximum number of networks to 1000, the enrichment test to the Mann-Whitney U test, and finally the gene set categories to biological_process, cellular_component and molecular_function in the GO (gene ontology) section. The analysis was then run and the resulting significant gene set data exported to Excel.
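The principle behind the enrichment test can be sketched with a generic Mann-Whitney U test comparing the Log Change values of genes in a set with those of all other genes. This is a simplified illustration only: how Pathway Studio ranks genes and computes its p-values internally is not described here, the data are hypothetical, and the p-value uses a normal approximation without tie or continuity correction.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(in_set, background):
    """Return (U for the gene set, two-sided p-value by normal approximation)."""
    n1, n2 = len(in_set), len(background)
    ranks = average_ranks(list(in_set) + list(background))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mu) / sigma
    return u1, math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical Log Change values for a gene set and for the remaining genes.
gene_set = [1.1, 0.9, 1.4, 0.8]
other_genes = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2]
u_stat, p_value = mann_whitney_u(gene_set, other_genes)
```

Because the test uses ranks rather than the size of each change, a set can be significant even when every individual gene in it changes only modestly.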


3.2. Results

3.2.1. Overview of differentially expressed genes

Of the 19 845 genes tested, 163 (0.8%) were found to be significantly differentially expressed (fold change >1.5, p<0.05; Fig. 3.1 A). Forty-seven percent of the differentially expressed genes increased in expression in scar fibroblasts and fifty-three percent decreased in expression (Fig. 3.1 B). The vast majority of genes had a small fold change, between 1.5 and 2 fold, for both the increased and decreased gene expression groups (Fig. 3.1 C). There was an even spread of differentially expressed genes across chromosomes 1-14, with few differentially expressed genes on chromosomes 15-22 and on the X and Y chromosomes (Fig. 3.1 D). This pattern was similar when the percentage of differentially expressed genes on each chromosome was analysed, except for chromosomes 18 and 21, which had a higher percentage of their genes differentially expressed (Fig. 3.1 E).


Figure 3.1: Overview of expression data. A) Differentially expressed genes as a proportion of all expressed genes (fold change >1.5, p<0.05). B) Direction of differential expression in differentially expressed genes. C) Differentially expressed genes sorted by size of fold change. D) Differentially expressed genes sorted by chromosomal location. E) Percentage of differentially expressed genes on each chromosome.

Of the genes that were significantly differentially expressed, the 20 genes with the greatest increase in expression and the 20 genes with the greatest decrease in expression in scar fibroblasts when compared to normal skin fibroblasts were identified (Table 3.1 and Table 3.2). Gene ontology demonstrated that many of the genes in both sets are involved in or related to the ECM or ECM production (Tables 3.1, 3.2). The full list of genes is available in Appendix V.

Table 3.1: Top 20 genes with increased expression in scar fibroblasts (≥1.5 Fold Change, nominal p<0.05)

Gene Name   Gene Ontology Term                            Relative Expression   p-value
PLXDC2      multicellular organismal development          4.56                  0.000
FBN2        extracellular matrix microfibril              3.40                  0.000
ANKRD1      transcription factor complex                  3.35                  0.002
EDIL3       integrin                                      2.96                  0.001
FNDC1       regulation of protein transport               2.85                  0.008
GRIK2       extracellular-glutamate-gated ion channel     2.57                  0.001
FOXF2       transcription factor activity                 2.43                  0.003
COMP        extracellular matrix structural constituent   2.38                  0.001
INHBA       mesoderm formation, growth factor activity    2.35                  0.009
FLG         intermediate filament                         2.33                  0.005
CADM1       cell-cell adhesion                            2.30                  0.000
F2RL2       blood coagulation                             2.17                  0.036
RIMS1       regulation of neurotransmitter secretion      2.14                  0.002
EYA2        transcription factor                          2.05                  0.000
CSTA        peptidase inhibitor                           2.04                  0.001
PTPRD       heterophilic cell-cell adhesion               2.03                  0.005
DDX43       RNA helicase                                  1.97                  0.003
PRICKLE1    negative regulation of Wnt signaling          1.96                  0.001
HHIP        dorsal/ventral pattern formation              1.92                  0.036
PPAPDC1A    phosphatidate phosphatase activity            1.91                  0.004


Table 3.2: Top 20 genes with decreased expression in scar fibroblasts (≥1.5 Fold Change decrease, nominal p<0.05)

Gene Name   Gene Ontology Term                                     Relative Expression   p-value
MSTN        transforming growth factor beta activity               0.23                  0.002
PDGFD       platelet-derived growth factor activity                0.29                  0.000
IL13RA2     cytokine receptor                                      0.32                  0.000
CCRL1       chemokine-mediated signaling pathway                   0.39                  0.000
GRPR        regulation of cell proliferation                       0.39                  0.004
ASPA        hydrolase activity involved in CNS myelination         0.41                  0.001
CCL2        transforming growth factor beta receptor signaling     0.43                  0.000
IGSF10      ossification                                           0.43                  0.002
MASP1       immune system process                                  0.46                  0.000
GABBR2      gamma-aminobutyric acid signaling pathway              0.46                  0.013
CDCA7       DNA-templated regulation of transcription              0.47                  0.003
ADAMTS5     extracellular matrix disassembly and organization      0.47                  0.000
DPT         collagen fibril organization in extracellular matrix   0.48                  0.038
SFRP1       Wnt signaling pathway                                  0.48                  0.001
SEMA3A      neuron migration and axon guidance                     0.48                  0.034
BEX1        transcription factor                                   0.49                  0.044
GSTM5       xenobiotic metabolic process                           0.50                  0.003
GRIA1       extracellular-glutamate-gated ion channel              0.50                  0.011
APOD        lipid binding and transport                            0.51                  0.006
GALNTL2     cellular protein metabolic process                     0.51                  0.000


Differentially expressed genes are displayed as heat maps below (Figs. 3.2 and 3.3). The heat maps show relative expression levels of all the genes with greater than 0.7 Log Change (increase and decrease). This is equivalent to greater than 1.68 Fold Change (increase and decrease). Using these criteria, the scar fibroblast data sets cluster together, as do the control normal fibroblast data sets, indicating that expression profiles are more similar between cells of the same phenotype than between cells from the same patient. This suggests the existence of a scar fibroblast transcriptome.

Figure 3.2: Heat map of expression patterns of all the genes with greater than 0.7 Log Change (1.68 Fold Change) increase. Yellow and orange correspond to an increase in gene expression, while green is neutral and blue is decreased expression. Each patient sample is on the x-axis; control samples on the left and scar samples on the right.

Figure 3.3: Heat map showing the expression patterns of all the genes with greater than 0.7 Log Change (1.68 Fold Change) decrease. Yellow and orange correspond to an increase in gene expression, while green is neutral and blue is decreased expression. Each patient sample is on the x-axis; scar samples on the left and control samples on the right.

The GSEA revealed 507 differentially expressed gene sets using a Mann-Whitney U test with p<0.05. The full list is available in Appendix VI. The top 20 significantly differentially expressed gene sets in scar relative to control, ranked by p-value, are displayed in Table 3.3. The median change is the median Fold Change of the entities in the set, the p-value is the result returned from the Mann-Whitney U test and the hit type is the gene ontology class of the gene set. The most significantly altered gene sets related to the extracellular matrix, extracellular space and extracellular region (Table 3.3).

Table 3.3: Top 20 significantly differentially expressed gene sets identified through Gene Set Enrichment Analysis (GSEA).

Name                                                               Median fold change   p-value    Hit type
extracellular matrix                                               -1.014               3.28E-08   cellular_component
extracellular space                                                -1.007               4.4E-07    cellular_component
cell adhesion                                                      1.000                2.26E-06   biological_process
blood circulation                                                  -1.047               7.05E-06   biological_process
extracellular region                                               -1.004               1.46E-05   cellular_component
chemotaxis                                                         -1.009               1.58E-05   biological_process
ureteric bud development                                           -1.041               3.01E-05   biological_process
response to corticosterone                                         -1.058               4.16E-05   biological_process
heparin binding                                                    -1.007               8.17E-05   molecular_function
positive regulation of tyrosine phosphorylation of Stat3 protein   1.000                8.2E-05    biological_process
plasma membrane                                                    -1.002               9.88E-05   cellular_component
cellular response to tumor necrosis factor                         -1.040               0.000126   biological_process
female gonad development                                           -1.009               0.000176   biological_process
embryonic digestive tract development                              -1.059               0.000185   biological_process
immune response                                                    -1.009               0.000198   biological_process
cellular response to interferon-beta                               -1.063               0.000242   biological_process
G-protein coupled receptor activity                                -1.001               0.000304   molecular_function
positive regulation of cardiac muscle hypertrophy                  -1.054               0.000328   biological_process
negative chemotaxis                                                -1.009               0.000337   biological_process
chemorepellent activity                                            -1.193               0.000343   molecular_function


An example of the top hit from the GSEA, the extracellular matrix group, is shown below, with the top 20 up and down genes within the group displayed.

Table 3.4: Top 20 up and downregulated genes within the ‘extracellular matrix’ group, which was the top hit in the GSEA.

Upregulated                         Downregulated
Gene Name   Relative Expression     Gene Name   Relative Expression
COMP        2.38                    ADAMTS5     0.47
TGFB2       1.67                    DPT         0.48
COL11A1     1.62                    SFRP1       0.48
WNT2        1.56                    MMP1        0.52
THBS2       1.40                    CCBE1       0.57
MFGE8       1.39                    ECM2        0.65
MMP16       1.36                    COL4A1      0.65
LAMA1       1.35                    CLEC3B      0.66
HAPLN1      1.33                    NID2        0.66
HMCN1       1.32                    CPXM2       0.67
VCAN        1.32                    SFRP2       0.71
LTBP2       1.31                    TGFBR3      0.72
LAMA2       1.27                    SERPINF1    0.72
ADAMTS12    1.26                    LAMA4       0.72
IGFBP7      1.21                    PRELP       0.73
COL8A2      1.20                    MFAP4       0.75
LGALS3BP    1.20                    LMCD1       0.75
ACAN        1.18                    COL15A1     0.75
MMP13       1.16                    LAMB1       0.76
SERAC1      1.16                    MMP27       0.77


Figure 3.4: Example of network generated from top 20 entities from the ‘extracellular matrix’ group from GSEA.

Red indicates positive fold change, blue indicates negative fold change, and the darkness of the shade indicates the degree of change.


3.3. Discussion

This study investigated the differences in the transcriptome between normotrophic scar fibroblasts and normal skin fibroblasts using a transcriptome microarray approach and paired patient samples. Only one gene reached statistical significance for differential expression after correction for multiple testing, probably because of the small sample number and the limited differences expected between two closely matched samples. Comparative expression data using a Fold Change cut-off of ±1.5 and a nominal p-value of p<0.05 identified 163 genes that were differentially expressed. GSEA, due to the increased power of analysing pathways in place of single genes, identified 507 pathways that were significantly different between the two groups. The most significant changes were in pathways known to be relevant to the extracellular matrix and therefore highly likely to be important in the maintenance of the scar matrix.

3.3.1. Gene expression

Using a nominal p-value of p<0.05 and a cut-off of ≥1.5 Fold Change as significant, 163 of the 19 845 genes (0.8%) were found to be significantly differentially expressed between the normal skin and normal scar fibroblasts. This number is very similar to that reported by Tsou et al. (2000), who found 192 genes differentially expressed between normal skin and normal scar using a >2 Fold Change for significance. Tsou et al., however, used tissue biopsies rather than isolated cells for their analysis, and only investigated the expression of 4000 genes. The much higher percentage of differentially expressed genes in this previous study (192 of 4000, or 4.8%, compared with 0.8% here) may relate to the use of biopsies with mixed cell populations rather than the cultured and relatively homogeneous cell samples used here.

Of the 163 genes significantly differentially expressed in scar fibroblasts, 47% were increased in expression and 53% decreased in expression. This also differs from the findings of Tsou et al. (2000), in which overexpression represented 75% of genes and reduced expression only 25%. This may be due to differences in scar type, array used, bias in array coverage, tissue location or the lack of age, sex and ethnicity controls in the previous study. Expression array technology was in the early stages of adoption in 2000, as the first array was only published in 1995 (Schena et al., 1995), and many of the array formats have subsequently been optimised together with the analytical techniques for the large datasets.

The vast majority of genes in this study had a small Fold Change, between 1.5 and 2 fold, for both the increased and decreased gene expression groups. Unlike the β difference in the methylation analysis, the size of the Fold Change is a less valid indicator of the importance of a gene in maintenance of the scar phenotype. This is because many ‘control’ type genes, including transcription factors, typically require only very small changes in expression level to dramatically change their activity (Brewster et al., 2014; Seidman and Seidman, 2002). Small changes in gene expression can therefore lead to much greater changes in phenotype. Conversely, in cancer cells for example, large fold changes in expression are commonly found for many of the key metabolic genes, reflecting the increased rate of growth and nutrient need rather than indicating their importance in tumorigenesis.

When examining the chromosomal location of the significantly differentially expressed genes, there is an even spread within chromosomes 1-15, with few differentially expressed genes identified from chromosomes 16-22 and from the X and Y chromosomes. Chromosomes 1 and 6 appear to have a large number of differentially expressed genes, whereas chromosome 15 has a very small number of differentially expressed genes. Chromosome 6 has roughly half the number of genes of chromosome 1, but a similar number of differentially expressed genes. In the percentage of differentially expressed genes per chromosome, there is an even spread within chromosomes 1-14. However, unlike the count of genes per chromosome, there are spikes in chromosomes 13, 18 and 21, which have a greater proportion of significantly differentially expressed genes for their size. These chromosomes may be hotspots for differential gene expression in scar fibroblasts.

This pattern would be expected to be similar to the methylation data, with differentially methylated genes on the same chromosome appearing as differentially expressed genes in this dataset. This appears to be the case for chromosome 1, which contains large numbers of differentially methylated and expressed genes. However, as discussed in Chapter 2, this is more likely due to the size and gene density of this chromosome, which appears to have many changes in other diseases not related to skin or scar formation (Torres-Martín et al., 2015; Rodriguez et al., 2014). Other than chromosome 1, the effects do not appear to match those found in the methylation data, with no apparent spikes in gene expression changes on chromosomes 7 or 19 and much higher changes as a percentage on chromosomes 13, 18 and 21, which did not appear significant in the methylation data. This contrast may be related to the limited resolution of the methylation dataset or the limited power resulting from the small sample size used for both analyses.

The top 20 genes with the largest increase/decrease in expression in scar fibroblasts compared to normal skin fibroblasts present some expected and some unexpected results. The list of upregulated genes contains many genes with known function related to ECM production including fibrillin 2 (FBN2), forkhead box F2 (FOXF2), cartilage oligomeric matrix protein (COMP), inhibin beta A (INHBA) and cell adhesion molecule 1 (CADM1), all associated with fibrosis (Olivieri et al., 2010; Ormestad et al., 2006; Kim et al., 2006; Forrester et al., 2013; Moiseeva et al., 2014). The set of genes with decreased expression included the ECM degradation protein a disintegrin and metalloproteinase with thrombospondin type 1 motif 5 (ADAMTS5) and the collagen fibril arrangement protein dermatopontin (DPT). Downregulation of these genes would be expected to cause slower collagen degradation and aberrant arrangement of the fibres. Less expected changes included the significant downregulation of myostatin (MSTN), a TGF-β mediated growth factor, which might be expected to be increased in scar, and the number of ion channel and neurotransmitter genes found in both the increased and decreased expression groups, including glutamate receptor ionotropic kainate 2 (GRIK2), regulating synaptic membrane exocytosis 1 (RIMS1), sema domain, immunoglobulin domain (Ig), short basic domain, secreted semaphorin 3A (SEMA3A) and glutamate receptor ionotropic α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid 1 (GRIA1). The relevance of these to scar formation and fibrosis is not clear, although the importance of ion channels in epithelia is well described in disease, for example cystic fibrosis (Dalemans et al., 1991). Whether these are related to scarring is yet to be established, but the data here suggests a possible relationship.

Comparing the lists from this study to other studies in normal scars, there are few genes in common, but collagen IV α-1 (COL4A1) was found to be consistently differentially expressed. However, it was significantly downregulated in this study, whereas in the Tsou et al. (2000) paper COL4A1 was significantly upregulated. The significant downregulation may be a consequence of growing the scar fibroblasts in culture, as this was not done in the previous study, which used RNA from whole tissue biopsies. It is also possible that, since multiple cell types express collagen IV, the observed increase in the tissue biopsies originated from other cells that were in the preparation, for example endothelial cells.

Heat maps generated using the differentially expressed genes with greater than 0.7 Log Change (>1.68 Fold Change) showed that samples of the same tissue type (scar or normal skin) clustered together more closely than samples from the same patient. This is suggestive of a distinct scar expression profile with some genes consistently differentially expressed between normal scar and normal skin fibroblasts. Although normal scar is a different phenotype from normal skin, the cells are not severely aberrant (e.g. hyper-metabolic or hyper-proliferative), and thus would only be expected to have a small number of differences from the normal skin cells, as found in this study. As a comparison, breast cancer cells have been reported to have 700 differentially expressed genes (Martin et al., 2000), and colon cancer cells 548 (Zhang et al., 1997).

3.3.2. GSEA

GSEA derives its power by focusing on gene sets, groups of genes that share common biological function, chromosomal location, or regulation (Subramanian et al., 2005). GSEA is used to overcome the limitations of gene lists that focus on the largest and smallest differences in expression levels of individual genes in different experimental groups. Problems with the gene list method include no individual gene meeting the threshold for statistical significance after correction for multiple testing, because the relevant biological differences are modest relative to the noise inherent to the microarray technology, or a long list of statistically significant genes without any unifying biological theme. The list method may also miss important effects on pathways, as cellular processes often affect sets of genes acting in concert, and an increase of 10% in all genes in a metabolic pathway may dramatically alter the flux through the pathway and may be more important than a 10-fold increase in a single gene (Subramanian et al., 2005). Interpretation can be daunting and dependent on a biologist's area of expertise, and important features may be overlooked and minor features overemphasised. GSEA is used to ameliorate these issues.

There were 507 significantly differentially expressed gene sets (Appendix VI). As expected from a pathology defined by an abnormal ECM, the top 3 gene sets are all directly ECM related: extracellular matrix, extracellular space and cell adhesion. Others are seemingly unrelated to ECM but feature a large number of ECM genes, such as developmental gene sets including embryonic digestive tract development, ureteric bud development and female gonad development. The blood circulation set is an interesting gene set, as scars are known to have abnormal vascularity and blood flow, and the heparin binding gene set may also be related to this (Ehrlich and Kelley, 1992). The signal transducer and activator of transcription 3 (STAT3) phosphorylation pathway gene set is also interesting, as STAT3 is involved in keloid (Lim et al., 2006), lung (Knight et al., 2011), cardiac (Jacoby et al., 2003), liver (Ogata et al., 2006) and kidney (Pang et al., 2010) fibrosis. Finally, a number of gene sets related to immune pathways are also present in the list, including response to corticosterone, cellular response to tumour necrosis factor (TNF), immune response and cellular response to interferon-beta (IFNβ). The altered expression of these gene sets may be caused by changes in gene expression persisting after the acute inflammatory response to injury that led to the scars in the patients from whom biopsies were obtained.

The size and direction of the median change in the GSEA results, while an indicator of the overall direction of the gene set and useful for indicating functional changes in small groups, is not a good indicator of functional changes in large groups. The median Fold Change column in the GSEA results is the median Fold Change of all the genes in the group. In the ‘extracellular matrix’ group, this gives a value of -1.014, which at first glance suggests a very minor downregulation of this group. However, this value does not take into account the function of the genes in the group. An example of this, again within the ‘extracellular matrix’ group, is the subset of matrix metalloproteinase (MMP) genes, which digest parts of the ECM as part of normal collagen turnover (Coussens et al., 2002). The MMP16 gene is upregulated (Fold Change >1.2), most of the other MMPs are unchanged (Fold Change near 1), while MMP1 and MMP27 are significantly downregulated (Fold Change <-1.2). The appearance of these three in the group increases the confidence that there is an imbalance in the collagen turnover of scar cells, but the median Fold Change value of the MMPs is 1.01, indicating a very minor increase in expression of the group. Compounding this, these MMPs are only a subset of the genes in the group, and other genes in this group act as promoters and repressors of these MMPs, yet their Fold Change values still contribute to the overall median Fold Change value. The extracellular matrix group is not truly downregulated in scar fibroblasts; it simply has more genes with a slight negative Fold Change than with a positive Fold Change, and further interrogation of this group must be undertaken in order to determine the true direction of the change.

When interpreting GSEA results, unless the group is small, the median Fold Change is a poor indicator of functional effect: although there may be a few key differentially expressed genes causing disruption to the group, many of the genes will show no change, giving an ‘unchanged’ median Fold Change close to 1. Indeed, this was observed in the results of this study, with the largest median Fold Change values occurring in the smaller groups such as ‘positive regulation of developmental growth’ (median change -1.446) and ‘positive regulation of epithelial cell proliferation involved in lung morphogenesis’ (median change 1.41), which each had only 5 genes in the group. Further interrogation of large gene sets that are significantly differentially expressed in the GSEA must be carried out in order to ascertain functional effect.
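The dilution of a few strongly changed genes by many unchanged genes can be shown with a short numerical sketch. The relative expression values below are hypothetical, and the signed Fold Change convention (ratios below 1 reported as negative reciprocals) and the use of the median of the ratios are assumptions made for illustration, not a description of how Pathway Studio calculates its median change.

```python
from statistics import median


def signed_fold_change(ratio):
    """Report ratios below 1 as negative reciprocals (e.g. 0.5 becomes -2.0)."""
    return ratio if ratio >= 1.0 else -1.0 / ratio


def median_signed_fold_change(ratios):
    return signed_fold_change(median(ratios))


def strongly_changed(ratios, cutoff=1.2):
    """Count genes changed by more than `cutoff` fold in either direction."""
    return sum(1 for r in ratios if abs(signed_fold_change(r)) > cutoff)


# Hypothetical large group: four clearly changed genes among eleven unchanged ones.
large_group = [0.47, 0.52, 2.38, 1.36,
               0.96, 0.97, 0.98, 0.98, 0.99, 0.99, 0.99, 1.00, 1.01, 1.02, 1.03]
# Hypothetical small group in which every gene is moderately decreased.
small_group = [0.60, 0.65, 0.70, 0.75, 0.80]

large_median = median_signed_fold_change(large_group)  # close to -1.01
small_median = median_signed_fold_change(small_group)  # close to -1.43
```

The large group contains four genes changed by more than 1.2 fold, yet its median barely departs from 1, whereas the small group shows a clear median shift.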

3.3.3. Limitations of this Study

The key limitation of this comparative study of the transcriptome in primary human scar fibroblasts versus matched normal skin fibroblasts was the use of cell culture to obtain the starting material for analysis.

The need to culture the fibroblasts from the isolated skin biopsies, in order to obtain sufficient RNA from fibroblasts rather than whole tissue, may alter gene expression through exposure to in vitro culture conditions (Boess et al., 2003). A comparison between cultured liver tissue and primary and cell line hepatocytes found that the primary cells were much more similar to the tissue than the cell lines, but became less similar the longer they were cultured (Boess et al., 2003). Culture time was kept to a minimum in our study and all cell samples were isolated at the same early passage number. However, it is possible that the culture time will have had an effect as the cells adapt to the culture conditions. Whether this effect is equivalent on both scar and skin fibroblasts is not known. Another limitation of using cell culture to prepare samples is the potential ‘founder effect’. This is discussed in detail in Chapter 2 as it also has a significant potential impact on the methylation array analysis.

The other important consideration with respect to the use of cell culture is the homogeneity of the cell populations. Even a small contaminating cell population could have a large effect on the changes in expression observed, in particular because small changes in expression are considered significant in this context. The possibility of a lack of homogeneity in the fibroblasts is a greater problem for analysis of expression changes than for methylation. In the case of methylation, the larger changes in beta values provide an indication of those likely to be present in the majority of cells tested and therefore confirmation that these changes are important. However, in the case of expression data, as previously stated, small changes in expression can have very large effects on phenotype and therefore the scale of change is no longer a good indicator of relevance to the pathology being investigated. This means the expression analysis is more susceptible to possible contamination, and this effect cannot be excluded. The GSEA should partially mitigate this effect, especially in pathways with many entities, but even this may still be susceptible.

It was postulated earlier that the altered expression of the immune-related gene sets in the GSEA results may be caused by changes in gene expression persisting after the acute inflammatory response to injury. An alternative hypothesis is that there is a level of contamination in the scar fibroblast cell culture populations that is not present in the normal skin fibroblast samples, possibly due to increased or altered macrophage-derived cells resident in the dermis of the scars. Whilst cell culture is expected to produce a largely homogeneous population of fibroblasts, this has not been verified in these samples. Cellular markers of fibroblasts such as FSP and/or Vimentin (although the specificity of these markers is limited) and FACS analysis prior to RNA or DNA preparation could have been used to assess the cell population. However, cell isolation and separation would involve greater cell manipulation and ultimately longer culture times to obtain sufficient RNA, which may in itself significantly affect expression.

Another significant issue, as with the methylation study, is the small sample number reducing the statistical power, particularly when generating large datasets for analysis per sample. One study comparing mouse and human gene expression in leukocytes after burn injury comprised 244 human burn injury subjects, 35 human controls and 32 mouse burn and 32 mouse control arrays (Seok et al., 2013). Additional recruitment, increased funds for research and the reducing costs of analysis will facilitate analyses of larger cohorts in the future and allow more robust identification of changes in gene expression.

A further limitation is that although the Affymetrix 2.0 ST chip covers >30 000 coding transcripts and >11 000 long intergenic non-coding transcripts, more modern technologies including RNA-seq give a much higher resolution of all the transcripts in the human genome (Djebali et al., 2012). Microarray chips provide good transcriptome coverage of protein coding transcripts, but rely on knowledge of an organism’s transcriptome, and only previously identified transcripts can be investigated. Known protein coding gene exons compose less than 3% of the human genome and the remaining 97% is largely uncharted territory, with only a small fraction characterised (Hangauer et al., 2013). However, the ENCODE project has recently demonstrated expression from the vast majority of the DNA (Djebali et al., 2012), and arrays will therefore miss many of these transcripts, including some aspects of the transcriptome that may be important in the formation and maintenance of scar. RNA-seq technology is becoming cheaper, and would provide a more detailed overview of the expression patterns in the scar cells. However, obtaining large data sets from a small number of samples could be troublesome, as testing for so many transcripts will reduce statistical power, and with so few non-coding RNAs characterised much of the data may be meaningless. Focusing on known protein coding transcripts is a reasonable strategy, especially in the context of looking for targets for potential therapeutic modulation.

Finally, there were also some limitations in the statistical analysis. Though widely used in the analysis of arrays, the use of Fold Change cut-offs for gene expression is problematic. A simple, static Fold Change threshold is biased at both low and high intensities, as it is too stringent at high intensities and not stringent enough at low intensities (Mariani et al., 2003). The Fold Change cut-off was originally coupled with a 5% B-H correction for multiple testing, which itself has some problems as discussed in Chapter 2. However, this correction was found to be too stringent for the small sample size in this study, and a more liberal nominal p<0.05 was used in conjunction with the Fold Change. The combination of both methods helped to increase confidence in the results, but both have flaws.


3.4. Conclusion

The aim of this chapter was to identify differences between the transcriptomes of scar and normal skin fibroblasts. One hundred and sixty three genes were differentially expressed between the normal skin and normotrophic scar fibroblasts, of which 47% were increased in expression and 53% decreased in expression.

Heat maps of the most differentially expressed genes clearly clustered the scar and normal skin fibroblasts in their own groups, suggesting that the similarity in expression for these genes was higher between fibroblasts of the same phenotype than between fibroblasts from normal skin and scar from an individual patient. This strongly suggests these genes play a role in the maintenance of scar phenotype.

Finally, the GSEA found 507 differentially expressed gene sets, with the most significant gene sets related to the ECM. Though the level of change was small, many of the top hits were in large groups, so further investigation of the functional effects of these changes must be undertaken.

The robust study design of this experiment using paired samples for the analysis of differential gene expression has identified individual target genes for the possible formation and maintenance of normal scarring and adds significant value to previous studies through the use of carefully matched samples and the ability to match this transcriptome data to genome-wide methylation data from the same cells.

In the next chapter an integrative genomic approach will be used, combining the two data sets from Chapters 2 and 3 to identify genes where alterations in gene transcription in scar fibroblasts are associated with epigenetic changes in DNA methylation and therefore may be strong candidates to underlie the maintenance of scar phenotype.


Chapter 4: Integrative genomic approach using methylome and transcriptome data to identify targets involved in scar maintenance

Introduction

Changes in the methylation of DNA of fibroblasts during wound repair were hypothesised to be the mechanism responsible for long-term changes in collagen metabolism in normotrophic scar fibroblasts and subsequently the dermal matrix after healing. Changes in DNA methylation alter gene transcription and hence cell phenotype. Changes in DNA methylation are also sustained and heritable, suggesting they will be maintained by scar fibroblasts and lead to a sustained alteration to matrix production. Therefore identification of genes that are both differentially methylated and differentially expressed in scar fibroblasts when compared to normal skin fibroblasts may identify novel targets to reduce scarring.

In this chapter an integrated analysis of the methylation and transcriptome data will be presented. In this analysis the DNA methylation dataset was restricted to the promoter region, specifically within 1500 base pairs of the transcription start site (TSS1500), which contained 77 379 CpG sites (Sandoval et al., 2011). These sites were grouped by gene and averaged using the ‘regionswrapper’ function in the IMA package in R (Wang et al., 2012). The promoter region was chosen because the function of methylation in this region is the most studied. The integrated data analysis provides a list of potential gene targets which can be modulated in vitro to confirm their role in collagen matrix production (Chapter 5). This method of target discovery is similar to that used in other pathologies including Atopic Dermatitis (AD) and Idiopathic Pulmonary Fibrosis (IPF) (Sanders et al., 2012; Rodriguez et al., 2014) and has the potential to increase our understanding of genes involved in scar maintenance as well as identify putative therapeutic targets to ameliorate scarring.

Aims:

1. To identify the genes that are both differentially methylated and expressed in scar fibroblasts compared to normal skin fibroblasts.

2. To identify candidate ‘controller’ genes critical to maintaining scar matrix by incorporating information from functional annotation and published literature on identified candidate genes.


4.1. Methods

A systematic methodology for data integration was followed comprising three steps:

1) Comparison of the differentially expressed gene dataset and differentially methylated promoter region dataset to identify genes in common.

2) Restriction of this list to putative ‘regulators’ of matrix production using existing functional annotations (gene ontology) to identify upstream effector genes and transcription factors.

3) Prioritisation of potential target genes for modulation in vitro via detailed analysis of any previous characterisation studies.

4.1.1. Integration of methylation and gene expression datasets

The integration of the differentially methylated promoters and the differentially expressed genes was carried out by combining the two datasets generated in R from the DNA methylation analysis (Chapter 2) and gene expression analysis (Chapter 3) in a relational database. This database linked the full list of gene level results from the analysis for the TSS1500 region (promoter methylation) and the differential gene expression values into a master sheet containing the following variables: gene name, Fold Change of gene expression, DNA methylation average β difference and the adjusted p-value for the difference in DNA methylation. Genes were then sorted by DNA methylation p-value, and all genes that had a p-value > 0.05 were excluded. From this list, the genes were sorted by gene expression level in scar relative to control, and genes with a Fold Change ≥ ±1.50 were included (relative gene expression level ≥ 1.50 or ≤ 0.67) if the unadjusted p-value was < 0.05.
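The join-and-filter logic described above can be sketched as follows. This is a minimal Python illustration rather than the relational database actually used; the function name `integrate`, the dictionary layout and the `GENE_*` records are hypothetical, and only the thresholds (methylation p-value no greater than 0.05, relative expression ≥ 1.50 or ≤ 1/1.50 ≈ 0.67, unadjusted expression p < 0.05) are taken from the text.

```python
def integrate(methylation, expression, fold_change=1.5, alpha=0.05):
    """Join two gene-level result tables on gene symbol and apply both filters.

    methylation: {symbol: (adjusted_p_value, delta_beta)}
    expression:  {symbol: (relative_expression, unadjusted_p_value)}
    """
    candidates = []
    # Only genes present in both datasets can be compared.
    for symbol in sorted(methylation.keys() & expression.keys()):
        meth_p, delta_beta = methylation[symbol]
        rel_expr, expr_p = expression[symbol]
        # Exclude genes whose methylation p-value exceeds the cut-off.
        if meth_p > alpha:
            continue
        # A symmetric fold-change filter: up by 1.5x or down to 1/1.5.
        changed = rel_expr >= fold_change or rel_expr <= 1.0 / fold_change
        if changed and expr_p < alpha:
            candidates.append((symbol, rel_expr, delta_beta, meth_p))
    # Sort by relative expression in scar, highest first.
    candidates.sort(key=lambda row: row[1], reverse=True)
    return candidates


# Hypothetical records (placeholder gene names and values, illustration only).
methylation = {"GENE_A": (0.02, 0.21), "GENE_B": (0.20, 0.30),
               "GENE_C": (0.04, 0.10), "GENE_D": (0.01, -0.34)}
expression = {"GENE_A": (2.43, 0.001), "GENE_B": (3.00, 0.001),
              "GENE_C": (0.65, 0.004), "GENE_D": (1.20, 0.010),
              "GENE_E": (5.00, 0.001)}
selected = [row[0] for row in integrate(methylation, expression)]
```

In this toy input only `GENE_A` and `GENE_C` survive: `GENE_B` fails the methylation cut-off, `GENE_D` fails the fold-change cut-off, and `GENE_E` has no methylation record.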

Integration of the differentially methylated promoters from the alternative analysis using the M-statistic in RnBeads was also carried out, using the same method of creating a relational database. This linked the significantly differentially methylated promoters, as defined by the automatic cutoff of the combined rank score in RnBeads, with the same significantly differentially expressed gene list from Chapter 3, that is, genes which had a Fold Change ≥ ±1.50 and an unadjusted p-value of <0.05.


4.1.2. Restriction of results by gene ontology

The resulting list of genes was then linked to the gene ontology terms associated with each gene using the UCSC table browser (Karolchik et al., 2004) and selecting the gene ontology function. A gene set enrichment analysis was also carried out on the ‘entity list’ of differentially methylated and differentially expressed genes in Pathway Studio Mammal® (Elsevier), using the “Find Pathways/Groups Enriched with Selected Entities” function. Genes with DNA binding and/or transcription factor activity were selected.
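The idea behind an enrichment p-value of this kind can be sketched with a one-sided hypergeometric (over-representation) test. This is an assumption for illustration: the exact statistic and background gene universe used by Pathway Studio are not stated here, so the function below is a generic Python sketch and its inputs are hypothetical, not a reproduction of the values reported in this chapter.

```python
from math import comb


def enrichment_p_value(universe, group_size, selected, overlap):
    """One-sided hypergeometric test.

    Probability of seeing at least `overlap` group members when `selected`
    genes are drawn without replacement from `universe` genes, of which
    `group_size` belong to the annotation group.
    """
    upper = min(group_size, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(group_size, k) * comb(universe - group_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 10 genes in total, 4 in the group, 3 selected, 2 overlapping.
p_small = enrichment_p_value(universe=10, group_size=4, selected=3, overlap=2)
```

For a fixed overlap, a smaller annotation group gives a smaller p-value, which is why small groups with two overlapping genes can rank above much larger groups with more overlapping genes.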

4.1.3. Prioritization of targets by literature analysis

The final step involved a literature analysis of the genes identified through the workflow described above, and targets were prioritised for further in vitro modulation. Transcription factors and proteins with DNA binding affinity were preferentially targeted, as these were more likely to be drivers of scar formation and maintenance rather than ‘bystander’ genes affected by the ‘controllers’.


Figure 4.1: Gene selection was carried out first by differential methylation and expression analysis; targets were then screened for transcriptional activity and then for known or likely relevance to scarring.


4.2. Results

Sixteen (16) genes were identified that met the criteria for being differentially methylated (p<0.05) and differentially expressed (≥ ±1.50 Fold Change) (Fig. 4.1, Table 4.1). A selection of gene ontology terms associated with these genes is shown in Table 4.2, with the full list of terms available in Appendix VII. From the alternative RnBeads analysis, nine (9) genes were identified as being differentially methylated (combined rank score <3253) and differentially expressed (≥ ±1.50 Fold Change). Eight (8) of these 9 genes are common to the original IMA analysis, and are shown in Table 4.4.

These genes can be categorised into two main groups: 1) those with DNA binding and transcriptional activity and associated functions; and 2) those without DNA binding activity and associated functions. Genes with transcriptional activity and associated functions are AIM2, AR, FOXF2 and MKX. The remaining genes, CDCP1, CLIC6, COLEC12, CSTA, GRIA1, IL1R1, KRT7, MEDAG, RASGRP1, SAA1 and TENM3, are without transcriptional activity. AIM2, AR, FOXF2 and MKX were therefore selected for more in-depth analysis.

Figure 4.2: 16 genes of interest are both differentially expressed and differentially methylated


Table 4.1: 16 differentially expressed and differentially methylated genes sorted by relative gene expression level.

Gene symbol | Gene name | Relative gene expression level in scar | Differential expression (p-value, unadjusted) | Differential DNA methylation (p-value) | Differential DNA methylation (∆β: scar – control)
FOXF2 | Forkhead Box F2 | 2.43 | 0.00 | 0.02 | 0.21
CSTA | Cystatin A (Stefin A) | 2.04 | 0.00 | 0.05 | -0.17
TENM3 | Teneurin Transmembrane Protein 3 | 1.91 | 0.01 | 0.05 | 0.06
MKX | Mohawk Homeobox | 1.78 | 0.00 | 0.05 | 0.15
CLIC6 | Chloride intracellular channel protein 6 | 1.71 | 0.00 | 0.02 | 0.14
AIM2 | Absent in Melanoma 2 | 1.70 | 0.00 | 0.01 | -0.34
SAA1 | Serum Amyloid A1 | 1.64 | 0.01 | 0.05 | -0.05
LOC401097 | Chromosome 3 Open Reading Frame 80 | 1.59 | 0.01 | 0.04 | 0.12
AR | Androgen receptor | 0.65 | 0.00 | 0.04 | 0.10
CDCP1 | CUB Domain Containing Protein 1 | 0.65 | 0.00 | 0.03 | 0.10
C13orf33 | MEDAG (mesenteric estrogen-dependent adipogenesis) | 0.63 | 0.01 | 0.05 | 0.09
KRT7 | Keratin 7 | 0.60 | 0.01 | 0.03 | 0.16
COLEC12 | Collectin 12 | 0.59 | 0.05 | 0.01 | 0.13
IL1R1 | Interleukin 1 Receptor, Type I | 0.58 | 0.00 | 0.04 | -0.10
RASGRP1 | RAS Guanyl Releasing Protein 1 (Calcium And DAG-Regulated) | 0.57 | 0.02 | 0.04 | 0.06
GRIA1 | Glutamate Receptor, Ionotropic, AMPA 1 | 0.50 | 0.01 | 0.04 | 0.13


Table 4.2: Selected gene ontology terms for 16 target genes feature many fibrosis related ontologies.

Gene name | Gene Ontology terms relevant to transcription factors, other upstream effects or extracellular matrix
FOXF2 | DNA-templated extracellular matrix organization, sequence-specific DNA binding, RNA DNA-templated regulation of transcription, DNA-templated negative regulation of transcription, positive regulation of transcription from RNA polymerase II promoter
CSTA | cornified envelope, cell adhesion, nucleus, cytoplasm, nucleoplasm, single organismal cell-cell adhesion peptidase inhibitor activity, peptide, keratinocyte differentiation, cross-linking, negative regulation of proteolysis
TENM3 | cell adhesion, signal transduction, homophilic cell adhesion via plasma membrane adhesion molecules, cell differentiation, protein homodimerization activity, cell projection
MKX | positive regulation of gene expression, collagen fibril organization, positive regulation of collagen, sequence-specific DNA binding, DNA binding, transcription from RNA polymerase II, promoter regulation of transcription, DNA-templated multicellular organismal development, regulation of gene expression
CLIC6 | integral component of membrane, cytoplasm, ion transport, chloride transport, protein C-terminus binding, chloride channel, complex regulation of ion chloride, transmembrane transport, vesicular exosome
AIM2 | DNA binding, inflammatory response, immune response, negative regulation of NF-kappaB transcription factor activity, nucleotide-binding domain, positive regulation of NF-kappaB, transcription factor, pyroptosis
SAA1 | extracellular region, structural molecule activity, nucleus, cytoplasm
LOC401097 | integral component of membrane
AR | transcription factor activity, RNA polymerase II transcription factor activity, transcription factor activity involved in positive regulation of transcription, sequence-specific DNA binding, transcription factor activity involved in negative regulation of transcription DNA-templated zinc ion binding, steroid hormone mediated signaling pathway, sequence-specific DNA binding
MEDAG | positive regulation of fat cell differentiation, intracellular calcium ion binding
KRT7 | protein binding, intermediate filament, viral process, keratin filament
MEDAG | positive regulation of fat cell differentiation, intracellular calcium ion binding
COLEC12 | voltage-gated ion channel activity, chloride channel activity, scavenger receptor activity, collagen trimer, metal ion binding, immune response, low-density lipoprotein particle binding, innate immune response
IL1R1 | protein binding, integral component of membrane, protease binding, immune response, cell surface receptor signaling pathway, cell surface cytokine-mediated signaling pathway, response to interleukin-1
RASGRP1 | cell differentiation, innate immune response
GRIA1 | extracellular-glutamate-gated ion channel activity, epithelial to mesenchymal transition, DNA binding, transcription factor activity, neuron projection, cell junction, ion transmembrane transport, ionotropic glutamate receptor


Table 4.3: Gene set enrichment analysis of the 16 genes differentially methylated and expressed, sorted by p-value, using the “Find Pathways/Groups Enriched with Selected Entities” function. This table only displays groups with at least 2 entities (genes) within the group. Full list available in Appendix VIII.

Name | Total # of entities in group | Overlapping entities | p-value
RNA polymerase II transcription factor binding | 45 | MKX;AR | 0.0004
endocytic vesicle membrane | 56 | GRIA1;COLEC12 | 0.0007
innate immune response | 669 | AIM2;SAA1;COLEC12;RASGRP1 | 0.0009
protease binding | 90 | CSTA;IL1R1 | 0.0016
positive regulation of NF-kappaB transcription factor activity | 126 | AIM2;AR | 0.0034
immune response | 449 | AIM2;COLEC12;IL1R1 | 0.0034
protein complex | 497 | AR;IL1R1;GRIA1 | 0.0043
postsynaptic density | 153 | GRIA1;IL1R1 | 0.0048
platelet activation | 211 | SAA1;RASGRP1 | 0.0093
regulation of gene expression | 218 | MKX;AR | 0.0099
sequence-specific DNA binding | 693 | AR;MKX;FOXF2 | 0.0101
lipid binding | 262 | RASGRP1;AR | 0.0130
structural molecule activity | 264 | CSTA;KRT7 | 0.0132
protein homodimerization activity | 765 | TENM3;CLIC6;GRIA1 | 0.0132
protein domain specific binding | 265 | GRIA1;AR | 0.0133
Axon | 284 | TENM3;AR | 0.0158
transcription factor complex | 309 | MKX;FOXF2 | 0.0185
transcription factor binding | 334 | FOXF2;AR | 0.0205
Dendrite | 352 | GRIA1;AR | 0.0236
sequence-specific DNA binding transcription factor activity | 1135 | AR;MKX;FOXF2 | 0.0372
signal transduction | 2955 | AR;GRIA1;TENM3;RASGRP1;IL1R1 | 0.0440
cell surface | 508 | IL1R1;GRIA1 | 0.0463


Table 4.4: 9 differentially expressed and differentially methylated genes in alternate RnBeads methylation analysis, sorted by relative gene expression level.

Gene symbol | Gene name | Relative gene expression level in scar | Differential DNA methylation (∆β: scar – control) | Differential DNA methylation p-value (FDR adjusted) | DNA methylation combined rank score | Gene present in original analysis
GRIK2 | Glutamate Ionotropic Receptor Kainate Type Subunit 2 | 2.573 | 0.126 | 0.221 | 1631 | No
FOXF2 | Forkhead Box F2 | 2.429 | 0.212 | 0.108 | 330 | Yes
MKX | Mohawk Homeobox | 1.783 | 0.126 | 0.374 | 3114 | Yes
CLIC6 | Chloride intracellular channel protein 6 | 1.713 | 0.223 | 0.124 | 230 | Yes
SAA1 | Serum Amyloid A1 | 1.642 | -0.148 | 0.242 | 2300 | Yes
CDCP1 | CUB Domain Containing Protein 1 | 0.653 | 0.097 | 0.213 | 2716 | Yes
KRT7 | Keratin 7 | 0.605 | 0.169 | 0.163 | 1002 | Yes
COLEC12 | Collectin 12 | 0.593 | 0.135 | 0.191 | 1372 | Yes
RASGRP1 | RAS Guanyl Releasing Protein 1 (Calcium And DAG-Regulated) | 0.569 | 0.221 | 0.127 | 271 | Yes


4.2.1. DNA Binding Targets

The four genes identified as being differentially methylated, expressed and having DNA binding activity from the gene ontology data are AIM2, AR, FOXF2 and MKX. These were designated potential targets for driving the formation and maintenance of scars, and were subjected to an extensive literature review.

4.2.1.1. AIM2

The gene ‘absent in melanoma 2’ (AIM2) had an expression fold change of 1.7, differential methylation p-value of 0.01 and a ∆β of -0.34. This represents the smallest methylation p-value and the largest ∆β of all the differentially methylated and expressed genes, meaning that it is strongly and consistently hypomethylated in scar fibroblasts. The expression fold change of 1.7 showed that the promoter demethylation is associated with an increased level of gene expression, fitting the classical understanding of promoter methylation (Jones and Takai, 2001).

Gene ontology terms associated with AIM2 related to double stranded DNA (dsDNA) binding and the immune response, and functional studies in the literature focus on these properties. AIM2 is part of the innate immune response against viral and microbial pathogens, forming part of a multiprotein complex known as the inflammasome (Schroder et al., 2009). AIM2 recognises host or pathogen associated dsDNA and interacts with apoptosis-associated speck-like protein containing a C-terminal caspase-recruitment domain (ASC) to induce pyroptosis in cells containing caspase-1, a pro-inflammatory enzyme (Fernandes-Alnemri et al., 2009). Pyroptosis is a process of programmed cell death that is dependent on caspase-1 and is pro-inflammatory, as opposed to apoptosis, which actively inhibits inflammation. It occurs when macrophages sense infection within themselves and undergo programmed cell death, releasing cytokines and attracting more immune cells to fight the infection (Fink and Cookson, 2005). AIM2 is also a tumour suppressor involved in carcinogenesis of colorectal, endometrial and gastric tumours, and epigenetic inactivation of the AIM2 promoter by hypermethylation is common in a subtype of colorectal cancer (Woerner et al., 2007).

In skin, AIM2 is strongly upregulated in the epidermis in chronic skin conditions including psoriasis, atopic dermatitis, venous ulcers and contact dermatitis, as well as in experimental wounds, and is expressed strongly in melanocytes and Langerhans cells in normal skin (de Koning et al., 2012). Also, despite its name, AIM2 is strongly expressed in primary melanomas and cutaneous squamous cell carcinomas (SCC). However, AIM2 is not expressed in secondary metastases of either tumour type, suggesting loss of expression is associated with metastasis. AIM2 suppresses proliferation, and therefore its constant expression in melanocytes and primary melanomas may be limiting tumourigenesis or tumour progression, whereas its activation in keratinocytes only occurs once they become hyperproliferative due to an underlying pathology (de Koning et al., 2014).

From in vitro work on lung fibroblasts, AIM2 expression has been shown to be associated with ageing, cellular senescence and growth arrest: AIM2 expression decreases with age but increases with senescence (Duan et al., 2011). The silencing of AIM2 by methylation has also been shown to be associated with immortalisation of fibroblasts (Kulaeva et al., 2003). AIM2 has also been implicated in fibrosis, with inflammasomes implicated in the progression of liver fibrosis (Boaru et al., 2012). AIM2 was also identified as having a fold change in expression of 11.79 in a study comparing normal fibroblasts with fibroblasts from systemic sclerosis (SSc), a chronic idiopathic fibrotic disease characterised by fibrosis of the skin and visceral organs and mediated by activated myofibroblasts (Artlett et al., 2011). Therefore there is extensive, albeit somewhat conflicting, evidence that AIM2 may be important in fibrosis. However, although it is an interesting target, the wide-ranging functions of AIM2 and the lack of a clearly defined role in fibroblasts led to AIM2 not being selected for in vitro modulation in the work for this thesis.

4.2.1.2. AR

Androgen receptor (AR) had a differential methylation p-value of 0.04, a ∆β of 0.10 and an expression fold change of 0.65. Although the difference in DNA methylation between the scar and normal fibroblasts is consistent, the size of the difference is small. Promoter hypermethylation of AR corresponds to a decrease in expression, fitting the classical model of promoter methylation. Gene ontology terms associated with AR include many related to transcriptional activity and steroid hormone activity.

AR is the main receptor for the male sex hormone testosterone, and the interaction of these two molecules is essential in the development and maintenance of the male sexual phenotype. During normal function, testosterone or its converted form 5α-dihydrotestosterone (DHT) binds to AR, which then undergoes a conformational change and releases heat shock proteins. Phosphorylation of AR occurs before translocation to the nucleus, either before or after androgen binding. After nuclear translocation, AR dimerises, binds to the DNA and recruits cofactors, after which target gene mRNA transcription occurs (Mooradian et al., 1987).

In normal skin function, AR is involved in hair follicle growth, sebum production and sweat production (Gilliver et al., 2003). In wound healing, AR is expressed in inflammatory cells, keratinocytes and fibroblasts, and reduction of androgens by either reduction of testosterone or blockade of AR results in accelerated wound healing and an increase in the deposition of collagen 1 (Ashcroft and Mills, 2002). However, other studies suggest androgens (and thus AR) are pro-fibrotic (Gilliver et al., 2003), examples of which include keloid scars which have been shown to have increased levels of androgen binding (Ford et al., 1983), testosterone treated rats undergoing kidney transplant suffering kidney fibrosis (Antus et al., 2001) and non-transplant testosterone treated rats suffering cardiac hypertrophy (Papamitsou et al., 2011). The increased wound healing rates in androgen depleted mice may be due to a dampened inflammatory response, which would be expected to reduce the rate of matrix degradation, as well as allowing a greater effect of endogenous estrogen, a wound healing accelerator (Gilliver et al., 2003). Although an interesting target, AR was not selected for further in vitro modulation as it was deemed too broad a target, and a poor target for siRNA knockdown, as the receptors are too stable.

4.2.1.3. FOXF2

Forkhead box F2 (FOXF2) had a differential methylation p-value of 0.02 and a ∆β of 0.21, indicating strong and consistent hypermethylation. However, it had an increased expression of 2.43-fold, the highest of any of the 16 differentially expressed and differentially methylated genes. This does not match the classical model of methylation, where increased promoter methylation leads to decreased gene expression. However, DNA methylation of promoter regions can, in some cases, be associated with transcription activation, for example through blocking repressor protein binding to the promoter region (Niesen et al., 2005). Indeed, one of the key studies demonstrating this reversal of the classical understanding of promoter methylation concerns a gene closely related to FOXF2, FOXA2 (Bahar Halpern et al., 2014). Therefore it appears in this study that FOXF2 also follows this non-classical regulation, whereby increased methylation has resulted in increased gene expression.

Gene ontology terms associated with FOXF2 include many functions related to DNA transcription, epithelial to mesenchymal transition (EMT) and importantly for this study, extracellular matrix organisation.

Forkhead box (FOX) genes are a subtype of transcription factors with a common forkhead box DNA binding motif that are important in the establishment of the body axis and the development of tissues from all three germ layers (Lehmann et al.). FOXF2 has been identified as being expressed during fetal development in many organs - mainly in organs that have endodermal or ectodermal derived epithelia surrounded by mesoderm derived mesenchyme (Aitola et al., 2000). In mice, FOXF2 is expressed in the adult tissue of the eye, lung, intestines and stomach (skin not tested) (Aitola et al., 2000). FOXF2 null mice are perinatal lethal. However, conditional knockout mice in intestinal smooth muscle have shown that FOXF2 is essential in normal extracellular matrix production and maintenance. In particular, collagens are severely reduced in FOXF2 mutant intestine, which causes epithelial depolarization and tissue disintegration (Ormestad et al., 2006).

Little data is available on the effects of FOXF2 in skin and fibrosis. However, the closely related FOXF1 gene has been found to be upregulated in fibroblasts in idiopathic pulmonary fibrosis (IPF) (Melboucy-Belkhir et al., 2014). The relationship of FOXF2 with EMT is also of interest, as the origin of scar fibroblasts, whether they migrate in from the edge of the wound or come from the circulation, is currently unknown. EMT occurs when epithelial cells transform into a mesenchymal phenotype, mediated by the loss of epithelial cell adhesion, increased expression of α-smooth muscle actin, basement membrane disruption and the cells becoming migratory and invasive (Liu, 2004). It is possible that epithelial cells that transform into mesenchymal cells may have an epigenetic ‘memory’, and therefore if EMT is a source of scar fibroblasts after an injury this may underlie at least some changes in methylation and gene expression. Therefore the origins of fibroblasts in scars could explain the differences in methylation and expression observed. Current published work on FOXF2 and its relationship to EMT has only investigated this role in prostate tissue (van der Heul-Nieuwenhuijsen et al., 2009b), where it has been suggested to regulate not only EMT but also the production of many ECM proteins (van der Heul-Nieuwenhuijsen et al., 2009a). Figure 4.3 displays this array of cell processes.

The combination of transcription factor activity controlling many downstream genes, extracellular matrix modulation and EMT activity plus highly upregulated expression in scar fibroblasts make FOXF2 a high-priority target, and it was selected for further modulation in scar fibroblasts as part of this thesis (Chapter 5).

4.2.1.4. MKX

Mohawk Homeobox (MKX) had a methylation p-value of 0.05 and a ∆β of 0.15; this was the least significant p-value of the four DNA binding targets. As with FOXF2, MKX had a positive expression fold change as well as an increased level of promoter methylation, suggesting this gene is regulated differently by methylation than most other genes. The fold change in expression of 1.78 was also high. Gene ontology terms associated with MKX included a large number of DNA binding and transcriptional ontologies, as well as two functions very relevant for scarring: positive regulation of collagen and collagen fibril organisation.

MKX is the sole member of a relatively newly characterised class within the three amino acid loop extension (TALE) superclass of atypical homeobox genes (Anderson et al., 2006). These homeobox genes are a superfamily of transcription factors that regulate the spatial organisation of the embryonic body plan, cellular identity, proliferation, and differentiation during organogenesis (Duboule, 1995). In embryogenesis in mice, MKX is important in tendon, skeletal, limb bud, testis and kidney development (Anderson et al., 2006). It is expressed in mesoderm-derived tissues, and is important in the differentiation of bone marrow-derived mesenchymal stem cells (BMMSC) into tendon tissue, upregulating ECM and collagen production (Otabe et al., 2015). In osteoarthritis patients, MKX is downregulated, and it was also found to be downregulated by IL-1β, suggesting that MKX expression is reduced by inflammatory stimuli (Otabe et al.). However, another study showed that IL-1β induced expression of MKX in vitro (Zhang et al., 2015), so the relationship between MKX and inflammatory stimuli is still unclear. Most of the work on MKX has been done on tendon tissue with little work on skin or fibrosis, and the effect of overexpression of MKX in skin is unknown. In skin, MKX was found to be downregulated in fibroblasts from two sisters with a mutation causing deficiency in galactosyltransferase II (GalT-II), an enzyme involved in the synthesis of glycosaminoglycans (GAGs), a major ECM component; this deficiency causes many connective tissue problems (Ritelli et al., 2015). In a mouse model of tendon repair with fibrotic adhesions, MKX was shown to be upregulated, and was thought to play a part in the formation of fibrotic adhesions (Juneja, 2013). Figure 4.3 displays this array of cell processes.

MKX was selected for further modulation in scar cells, as it is significantly upregulated in scar fibroblasts and has the characteristics of an ‘arsonist’ gene, controlling many downstream genes including many ECM proteins. As described above for FOXF2, the fact that the gene expression and promoter methylation patterns do not have the classical inverse relationship does not rule MKX out, as the relationship between gene promoter methylation and expression is complex (Niesen et al., 2005; Bahar Halpern et al., 2014).

Figure 4.3: Involvement of MKX and FOXF2 in cell processes. Cell differentiation and embryonal development are the two processes in common, and are highlighted.


4.3. Discussion

This study is the first to combine methylome and transcriptome datasets of human scar fibroblasts in the hunt for genes which maintain the scar phenotype, using higher resolution arrays than previous studies for each category and with a robust study design using matched site and patient controls, as well as controlling for age, sex and anatomical site.

4.3.1. Strengths of integrative analysis

A pragmatic target identification strategy was used that aimed to identify targets according to a ‘fireman/arsonist’ analogy: to identify the ‘arsonist’ lighting the fires, rather than the ‘firemen’ who were on the scene responding to the fire. With this strategy it is also important to be able to identify bystanders, neither responding to the scene nor actively engaged in the pathological process.

Potential targets were narrowed using an integrative genomic approach combining multiple lines of evidence – this reduced 77 379 CpG sites in 20 496 genes and 24 838 expressed genes to 16 targets, potentially making it easier to find the ‘arsonists’. Searching for ‘arsonists’ from the sixteen genes that were differentially methylated and expressed, four possible candidate genes were identified.

Integration of the alternative RnBeads analysis with the transcriptome data identified 9 targets, of which 8 were common with the original integrated data. Of the four DNA binding targets identified from the original analysis, only the two selected for further validation, MKX and FOXF2, were present in the alternative analysis. This further validates their choice for further investigation, as the method used in the alternate analysis can be taken as being more statistically valid (Du et al., 2010).
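The statistical argument attributed to Du et al. (2010) rests on the relationship between the β-value and the M-value, M = log2(β / (1 − β)). The Python sketch below assumes that the M-statistic used in the alternative analysis is this M-value; the function names and the `eps` clamp are hypothetical conveniences for illustration and are not RnBeads functions.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a methylation beta-value (0..1) to an M-value (logit, base 2)."""
    # Clamp away from 0 and 1 so the logarithm stays finite.
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m):
    """Inverse transform: recover the beta-value from an M-value."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# The same beta difference of 0.10 at two positions on the beta scale.
low_shift = beta_to_m(0.15) - beta_to_m(0.05)   # near the unmethylated extreme
mid_shift = beta_to_m(0.55) - beta_to_m(0.45)   # near 50% methylation
```

An identical ∆β therefore corresponds to a larger change on the M scale near the extremes than near the middle of the range, which may explain why the two analyses rank some promoters differently.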

Of the non-transcription factor genes, the presence of keratin 7 (KRT7) is of particular interest. KRT7 is not expressed in fibroblasts, but is expressed in keratinocytes, melanocytes and Langerhans cells in the skin (Uhlén et al., 2015). Its differential methylation and expression suggest either contamination by keratinocytes or other non-fibroblast cells, or that the scar fibroblasts may have undergone some kind of trans-differentiation. In this study, KRT7 is hypermethylated and downregulated in the scar fibroblasts. This is very surprising: if the presence of KRT7 in the list of differentially expressed genes were due to an abnormal cell lineage of the scar fibroblasts (for example due to EMT during wound repair), expression would be expected to be increased in the scar fibroblasts as a vestigial mark of that lineage. The fact that KRT7 expression is decreased in scar fibroblasts suggests this may be due to a contamination effect, with the normal skin fibroblast cultures potentially containing a small population of immune cells that are absent in the scar fibroblast cultures and therefore skew the comparative expression data. An alternative explanation would be that the process of EMT switches off KRT7 expression so effectively through hypermethylation that it appears to be reduced in expression even though the two fibroblast populations are very similar. Whilst an interesting observation, KRT7 is of limited functional interest for the purposes of this study, and therefore this result was not pursued further.

Although four genes were found to be possible ‘arsonist’ genes, ultimately only two were selected for modulation in vitro. These were prioritised because previous work in the literature gave them the strongest links to both DNA transcription and ECM function and regulation. This does not exclude the other two genes, AR and AIM2, from contributing to the formation and maintenance of scars, and these should be investigated further.

The two genes selected, MKX and FOXF2, have many features in common with one another. Both are important in spatial determination during embryonic development and in cell differentiation, both are predominantly expressed in mesenchymal tissue, and both have a large number of downstream effector genes. Both are upregulated in scar fibroblasts and upregulation of these genes has been implicated in fibrotic pathologies in other tissues, but little is known about the effect of either gene in skin.

To date, two other studies have integrated analysis of the methylome and transcriptome in skin disease or fibrosis, which provides an opportunity to compare analytical approaches and findings. One study was on Idiopathic Pulmonary Fibrosis (IPF) by Sanders et al. (2012) and the other on Atopic Dermatitis (AD) by Rodriguez et al. (2014). The study by Rodriguez et al. (2014) compared lesional and non-lesional skin in 16 patients (DNA methylation) and 6 patients (gene expression). The analysis of the data was similar in key respects. Paired t-tests were applied and the following cut-offs were used: a ‘liberal significance threshold’ of p<0.001, a DNA methylation ∆β of 0.10 and a gene expression fold change of >1.3 (Rodriguez et al., 2014). In the study conducted here, a cut-off of 1.5 for fold change and a similarly liberal nominal significance of p<0.05 were used for gene expression. A cut-off for DNA methylation ∆β was not specified in this study; however, only 2 of the 16 genes that were differentially methylated and differentially expressed had a ∆β of <0.10, and both of the targets selected for further modulation (MKX and FOXF2) fit the criteria of Rodriguez et al.
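The expression cut-offs described above can be illustrated with a minimal sketch. It assumes that fold change is held on a log2 scale (as Limma reports it) and that the 1.5-fold cut-off is applied symmetrically to increases and decreases; the function and parameter names are hypothetical and are not taken from the analysis scripts used in this study.

```python
import math


def passes_expression_cutoffs(log2_fold_change, p_value,
                              min_fold_change=1.5, alpha=0.05):
    """Return True if a gene meets both the fold-change and p-value cut-offs.

    Assumption: the fold-change cut-off is symmetric, so a 1.5-fold decrease
    passes as well as a 1.5-fold increase.
    """
    # A 1.5-fold change corresponds to |log2 FC| >= log2(1.5) (about 0.585).
    large_enough = abs(log2_fold_change) >= math.log2(min_fold_change)
    significant = p_value < alpha
    return large_enough and significant
```

Under these assumptions a gene with a log2 fold change of 0.6 and p = 0.01 would be retained, whereas a gene with a log2 fold change of 0.5 would not, whatever its p-value.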

There are some methodological differences between Rodriguez et al. (2014) and the study undertaken for this thesis. Rodriguez et al. compared individual CpG sites with RNA transcripts, rather than whole promoter regions with RNA transcripts, and used the older 27k array, which has only 27 000 CpG sites, many of which are cancer related, compared to the 485 000 CpG sites on the array used here. In the study conducted here, promoter region methylation and transcripts were analysed at the gene level using the ‘Regionswrapper’ function in the package IMA in R (Wang et al., 2012) and Limma (Smyth, 2004). It can be argued that statistical comparisons carried out on larger genomic regions rather than on single CpGs give rise to more significant results, as neighbouring CpGs with similar differences in DNA methylation reinforce each other (Bock, 2012). Analysing 485 000 individual CpG sites on a 450k chip would incur a large reduction in statistical power because of the number of comparisons made. The same argument for increased statistical power applies to using sets of transcripts for a gene rather than all possible transcript variants, as fewer comparisons are made, reducing the multiple testing penalty. No genes were common to the 47 target genes identified in the study of atopic dermatitis and the genes identified in this study, but this is not surprising given the well-known differences in underlying pathology (immune disease vs. wound healing) and tissue type (epidermis vs. dermis) of the two conditions.

Sanders et al. carried out a similar study on tissue samples from 12 patients with IPF and 7 controls and again used slightly more stringent cut-offs (Sanders et al., 2012). They initially used a Diff Score of ±13 for the DNA methylation data, which is a transformation of the p-value that also provides direction and is equivalent to p<0.05. For the expression data they used a fold change of ±2 and p<0.05, and for selection of targets they used a B-H correction for multiple testing for both DNA methylation and RNA expression data. They then ran an Ingenuity Pathway Analysis (IPA) similar to the GSEA that was used in this study. Sanders et al. did not take into account the size of ∆β, selecting possible targets only by the p-values of the methylation change. Another difference is that Sanders et al. only retained genes that showed an inverse relationship between DNA methylation and gene expression, although, as previously discussed, this relationship has been demonstrated to be more bidirectional than previously thought, and excluding genes without the inverse relationship may therefore unnecessarily exclude important targets. In the study here, the directionality of the relationship between methylation and expression changes was not considered as a factor for target elimination. This was done to maximise the search for target genes, as methylation of promoters has been shown to go against the traditional model and increase expression in select cases (Bahar Halpern et al., 2014).
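The correspondence between a Diff Score of ±13 and p<0.05 can be checked with a short sketch. It assumes the commonly quoted form of the score, in which the magnitude is 10 × |log10(p)| and the sign follows the direction of the difference between groups. The sign convention and the function name are assumptions made for illustration and are not taken from Sanders et al.

```python
import math


def diff_score(p_value, delta):
    """Signed transformation of a p-value (illustrative form only).

    Magnitude: 10 * |log10(p)|.
    Sign: taken from `delta` (condition minus reference), which is an
    assumed convention for this sketch.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    sign = (delta > 0) - (delta < 0)
    return sign * -10.0 * math.log10(p_value)
```

With this form, p = 0.05 gives a magnitude of approximately 13, p = 0.01 gives 20 and p = 0.001 gives 30, which is consistent with the ±13 threshold quoted above.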

Using their more stringent cut-offs for significance, and only using genes with an inverse correlation between methylation and expression, Sanders et al. found 16 genes that were differentially methylated and expressed (Sanders et al., 2012). There were no genes in common with the 16 genes in this study, which is again most likely due to the differences in tissue type and in pathology (idiopathic pulmonary fibrosis is a progressive fibrotic condition whereas scar maintenance does not progress).

4.3.1. Limitations of study

This study has several limitations. Despite the use of an integrated strategy bringing together the DNA methylation and gene expression data, small patient numbers still limit the statistical power of the analysis, and increase the risk of a type II error (false negatives). That is, genes which are involved in maintaining scar may not be identified in the analysis. This is the most significant limitation of the study as it means important biological regulators of scar maintenance may not be identified.

With array data there is an increased chance of false positives due to multiple testing. Although widely accepted for use in arrays, a 5% B-H correction for multiple testing is a compromise. It increases power compared to other methods of multiple testing correction by accepting that, on average, up to 5% of the results called significant will be false positives (Benjamini and Hochberg, 1995). The correction therefore allows more false positives than more conservative multiple testing corrections, with the risk that genes identified by the analysis may not actually be involved in maintaining scar. This is not as significant a risk as that of false negatives, since false positives can be removed through further review and/or current knowledge of gene function, provided that at least some of the targets identified are true positives.
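The B-H step-up procedure referred to above can be sketched as follows. This is a minimal illustration of the published procedure (Benjamini and Hochberg, 1995) and not the implementation used in the R packages in this study; the function name is hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where a test is called significant.

    Step-up rule: sort the m p-values, find the largest rank k such that
    p_(k) <= (k / m) * fdr, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank whose p-value falls under its own threshold.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank

    # Everything ranked at or below max_rank is rejected, even if an
    # individual p-value missed its own threshold (the "step-up" property).
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected
```

For example, with ten p-values of which the two smallest are 0.001 and 0.008, the thresholds for ranks 1 and 2 are 0.005 and 0.010, so both tests are called significant, whereas a third p-value of 0.039 would fail its threshold of 0.015.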


The use of fold-change cut-offs for gene expression, although also widely used in the analysis of arrays, is problematic. A simple, static fold-change threshold is biased at both low and high intensities, as it is too stringent at high intensities and not stringent enough at low intensities (Mariani et al., 2003).

A further consequence of the small sample size is uncertainty over whether the sample is representative of the normal scar population. These patients were controlled for sex, age, body site and scar type, which helped to reduce variation due to other variables, but other scar types or locations may differ from those tested here. In addition, this study used only male samples and therefore the relevance of these findings to female patients cannot be confirmed at this point.

As previously discussed, the use of cultured skin cells rather than fresh skin biopsies creates the possibility of phenotypic change during culture (Boess et al., 2003). Also previously discussed is the risk of founder population bias.


4.4. Conclusion

The aim of this chapter was to identify genes involved in maintaining the scar phenotype, and in particular those genes that may regulate scar phenotype and therefore be potential therapeutic targets. Genes that were both differentially expressed and methylated in scar fibroblasts compared to normal fibroblasts were examined by gene function, and upstream effector genes and transcription factors were selected for further investigation. These potential target genes were analysed in detail and prioritised as targets for modulation in vitro.

Four transcription factor genes were selected for evaluation against the literature. Of these, two genes, FOXF2 and MKX, were chosen as possible ‘arsonist’ genes that drive scar formation and maintenance and were used for subsequent validation, although the other genes are expected to be tested in the future.

The next chapter will focus on in vitro validation of the fibrotic activity of MKX and FOXF2 in scar fibroblasts and the effects of reducing MKX and FOXF2 expression on ECM production.



Chapter 5 – Phenotypic Assays and Target Validation

Introduction

In this thesis, changes to the epigenetic programming of normal skin fibroblasts are hypothesised to be the mechanism responsible for long-term changes in collagen metabolism in normotrophic scar fibroblasts and subsequently the dermal matrix. Epigenetic changes alter the transcriptome and cell phenotype and it is proposed that this will lead to sustained differential collagen production and matrix maintenance. To test this hypothesis, two genes identified through the whole genome methylation and transcriptome studies (Chapters 2-4) as being differentially methylated and expressed in normotrophic scar fibroblasts compared to normal skin fibroblasts were selected for in vitro validation of their effects on ECM metabolism. These two genes are both transcription factors with biological roles suggesting they are plausible regulators of matrix production in skin. In the experiments in this chapter, the two genes were targeted using small interfering RNA (siRNA) to reduce the level of expression in an in vitro model of scarring, in order to determine whether reducing the expression of these genes altered cell phenotype with respect to extracellular matrix production. If the hypothesis is supported, reprogramming of scar fibroblast phenotype through epigenetic modification may be able to ameliorate scarring by switching the dynamics of ECM production back to the ‘normal’ phenotype.

Aims:

1. To explore collagen orientation and quantitation in scar and normal fibroblasts.

2. To confirm increased gene expression of target genes, MKX and FOXF2, in scar fibroblasts using quantitative real time polymerase chain reaction (qRT-PCR).

3. To validate the role of MKX and FOXF2 as pro-fibrotic regulators in scar fibroblasts using siRNA gene knockdown.


5.1. Methods

5.1.1. Experimental design

The first experiment compared the phenotypic properties of the normal skin fibroblasts and the normotrophic scar fibroblasts from the 6 burn injury patients using the ‘scar-in-a-jar’ assay (Section 5.1.2), analysing collagen orientation (Section 5.1.2.4.1) and amount of collagen I produced per cell (Section 5.1.2.4.2). For each patient, three wells of scar fibroblasts and three wells of normal fibroblasts were cultured, and collagen orientation (coherence) and collagen production per cell were measured in each of the wells.

The second experiment used the same RNA as the gene expression microarrays (Chapter 3), and expression levels of MKX and FOXF2 were analysed in matched scar and control fibroblasts by qRT-PCR (Section 5.1.4). RNA was available from scar and control fibroblasts from 5 of the 6 patients (insufficient sample for Patient 1). All primer pairs were run in duplicate.

The third experiment knocked down the two target genes, MKX and FOXF2, in scar fibroblasts cultured from two of the burn injury patients using siRNA transfection. Genes were knocked down individually (single knockdown) and together (double knockdown). For each patient’s cells, two wells each of untransfected fibroblasts, scrambled RNA transfected fibroblasts, FOXF2 siRNA transfected fibroblasts, MKX siRNA transfected fibroblasts, and FOXF2/MKX siRNA transfected fibroblasts were prepared. Collagen orientation (coherence) (Section 5.1.2.4.1) and collagen per cell (Section 5.1.2.4.2) were measured in each well.

5.1.2. Scar-in-a-jar

A procedure entitled ‘scar-in-a-jar’, developed by Chen and Raghunath (Chen and Raghunath, 2009) to measure collagen per cell, was adapted for use with the primary skin cells and to measure collagen orientation (M. Bradshaw, unpublished data). Normal skin and scar fibroblasts produce very little collagen in vitro without stimulation, whilst western blots and other protein assays only detect the total amount of collagen, not other important structural parameters related to scarring (Chen and Raghunath, 2009). Other models commonly require long culture times, omit vital co-factors such as ascorbic acid, and destroy the cell layer and matrix when measuring the collagen produced. The ‘scar-in-a-jar’ method allows the quantification and visualisation of collagen structure and other extracellular components in vitro, providing a good model of scar matrix production. By using immunohistochemistry and multiple analytical procedures on the images generated, the ‘scar-in-a-jar’ model allows assessment of the true fibrillar, cross-linked matrix produced by fibroblasts in vitro, and it was therefore chosen as the methodology for the validation of putative pro-fibrotic genes.

5.1.2.1. Scar-in-a-jar culture protocol

Fibroblasts were seeded at 50 000 cells/well in 4-chamber slides (Lab-Tek, Thermo Fisher Scientific, USA) and cultured in normal media (DMEM-Glutamax™ with 10% FBS and 1% penicillin/streptomycin, Life Technologies, USA) for 14 hours, after which the media was changed to stimulated media. This media contained DMEM-Glutamax™ with 0.5% FBS (Life Technologies, USA), 100 mM of L-ascorbic acid 2-phosphate (as the magnesium salt hexahydrate, a more stable form of ascorbate, Sigma Aldrich, USA), a mixture of 37.5 mg/mL Ficoll PM 70 (Fc 70, GE Healthcare, UK) with 25 mg/mL Ficoll PM 400 (Fc 400, GE Healthcare, UK) and 5 ng/mL TGFβ1 (R & D Systems, USA). Fibroblasts were then cultured in stimulated media for 6 days.

5.1.2.2. Scar-in-a-jar staining protocol

After 6 days of stimulated culture, media was removed and stored at -20°C for later analysis. Cells were washed in sterile PBS (Sigma-Aldrich, USA) to remove excess media. Cells were then blocked in 3% Bovine Serum Albumin (BSA) in PBS (Sigma-Aldrich, USA) for 10 minutes. After 10 minutes, blocking solution was removed and primary antibody solution was added. Primary antibody solution contained primary collagen 1 antibody (mouse monoclonal IgG1, Cat. No. sc-59772, Santa Cruz Biotech, USA) at a 1:1000 dilution in blocking solution (3% BSA in PBS). Cells were then incubated at 37°C and 5% CO2 for 90 minutes in a cell culture incubator (Heraeus, Germany). After 90 minutes, the primary antibody was removed and the cells washed three times in PBS (2 minutes per wash).

The cells were then fixed in 4% paraformaldehyde (Sigma Aldrich, USA) solution for 10 minutes at room temperature. After this, the cells were again washed three times in PBS (2 minutes per wash). The cells were then blocked in 3% BSA in PBS for 10 minutes at room temperature. The secondary antibody solution was then prepared by diluting the secondary antibody (goat anti-mouse Alexa Fluor 488, Cat. No. A-11001, Life Technologies, USA) 1:500 in PBS. Blocking solution was then removed and secondary antibody solution added, and the cells were incubated at 37°C for 30 minutes. The secondary antibody was removed and cells washed in PBS three times (2 minutes per wash). Hoechst® staining solution (Cat. No. H3570, Life Technologies) was made up to a 1:1000 dilution in PBS and then added to the cells, which were then incubated for 20 minutes at room temperature. Hoechst® solution was then removed and the cells washed in PBS three times (2 minutes per wash). PBS was then removed and coverslips mounted on the slides using Prolong® Gold anti-fade mounting solution (Life Technologies, USA). Nail polish was added around the outside of the coverslips to fix them and the slides were stored in a light-proof box at 4°C prior to imaging.

5.1.2.3. Scar-in-a-jar imaging

5.1.2.3.1. Assessment and blinding process

In order to remove potential bias when assessing the coherency and collagen per cell, the image assessor was blinded to the identity of each of the samples. Cells were seeded into the wells by a person other than the assessor, who recorded the identity and/or treatment of each well. The staining, imaging and analysis were then carried out by the assessor, who was blinded to the identity of each well. After the analysis was complete, the experiment was unblinded. This blinding method was used for the comparison of collagen orientation and quantitation in scar and normal fibroblasts and for the siRNA experiments.

5.1.2.3.2. Confocal imaging for collagen orientation

Slides were imaged using the Leica TCS SP2 multiphoton confocal microscope using Leica Confocal Software (LCS), located at the Centre for Microscopy, Characterisation and Analysis (CMCA) at UWA. Slides were imaged using a 40x oil objective and a 488 nm laser wavelength. Each slide was imaged in 3 separate areas, randomly chosen by the assessor, who was blinded to the identity of the cell type and treatments. A z-stack of each area was taken, as determined by the top and the bottom of the collagen layer. Once the top and bottom of the z plane were set, images were taken at 0.05 µm intervals and then condensed into a single frame using the maximum projection function.
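The maximum projection step can be sketched as follows. This is an illustration of a maximum intensity projection in general, in which each pixel of the output takes the brightest value found at that position anywhere in the z-stack; it is not the Leica software’s implementation, and the function name and nested-list image representation are assumptions made for the example.

```python
def max_projection(stack):
    """Collapse a z-stack into one frame by taking the per-pixel maximum.

    `stack` is a list of z-slices; each slice is a list of rows, and each
    row is a list of pixel intensities. All slices must share one shape.
    """
    # zip(*stack) groups the same row from every slice; zip(*rows) then
    # groups the same pixel from every slice within that row.
    return [[max(pixels) for pixels in zip(*rows)] for rows in zip(*stack)]
```

For two 2 x 2 slices `[[1, 5], [3, 0]]` and `[[2, 4], [1, 9]]`, the projection is `[[2, 5], [3, 9]]`.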


5.1.2.3.3. Whole chamber slide imaging for collagen quantitation

Slides were imaged using a Nikon TE 300 microscope connected to a PC running NIS-Elements software, using the 4x objective and the B2-A (488nm) and DAPI (358nm) filter blocks. A composite image of the whole chamber was created using the ‘scan large image’ function, marking the boundaries of the image and focusing every 9-13 images. This was carried out for both the blue (nuclear) and the green (collagen) stains, with exposure time set at 300ms for the Hoechst® stain and 1s for the 488nm excitation. The assessor was blinded to the identity of the cell phenotype and treatment.

5.1.2.4. Scar-in-a-jar image analysis

5.1.2.4.1. Collagen orientation

Confocal images were analysed using ‘Fiji Is Just ImageJ’ (FIJI) (Schindelin et al., 2012) and the OrientationJ plug-in, specifically its coherency feature, using a method recently developed within the group (M. Bradshaw et al., paper submitted for publication, personal communication). Six regions of interest (ROIs) were chosen by the blinded assessor so as to maximise the area of collagen analysed. Ideally, the entire sample area would be analysed, but due to the nature of the primary human fibroblasts the collagen deposition was not uniform across the experimental area. A methodology described by Kador et al. was employed, with 6 ROIs chosen at random, excluding those areas with deformations or holes (Kador et al., 2013). The ROIs were kept as large as possible to examine the orientations of the collagen bundles, as opposed to the local variations within the bundles. The ‘measure’ feature was then used to measure the coherency of each ROI, which was then copied to an Excel worksheet.

The coherency function draws an ellipse that indicates the coherency of the ROI. The coherence formula takes into account the largest eigenvalue (major axis) and the smallest eigenvalue (minor axis). When the coherency = 0, the ellipse becomes a circle, with no elongation and no elongated structures present in this position of the image. When the coherency = 1, the ellipse becomes a line segment, with very high elongation and elongated structures present in this position of the image. Within the context of measuring collagen orientation, a coherence of 0 would be the most random orientation and a coherence of 1 would be the least random orientation.
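The relationship between the two eigenvalues and the coherency value can be illustrated with a minimal sketch. It assumes the standard structure-tensor definition, coherency = (λmax − λmin) / (λmax + λmin), where λmax and λmin are the largest and smallest eigenvalues of the structure tensor of the ROI. The gradient scheme (central differences, no weighting window) and the function name are simplifications chosen for the example and do not reproduce the plug-in’s own code.

```python
import math


def coherency(image):
    """Coherency of a 2-D greyscale ROI given as a list of equal-length rows.

    Returns a value between 0 (no preferred orientation) and 1 (all
    intensity gradients share a single direction).
    """
    rows, cols = len(image), len(image[0])
    jxx = jyy = jxy = 0.0
    # Accumulate the structure-tensor entries from central-difference
    # gradients over the interior pixels of the ROI.
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            jxx += gx * gx
            jyy += gy * gy
            jxy += gx * gy
    trace = jxx + jyy
    if trace == 0.0:
        return 0.0  # flat region: no gradients, so no orientation
    # Eigenvalues of the symmetric 2 x 2 tensor [[jxx, jxy], [jxy, jyy]].
    half_diff = math.sqrt(((jxx - jyy) / 2.0) ** 2 + jxy ** 2)
    lam_max = trace / 2.0 + half_diff
    lam_min = trace / 2.0 - half_diff
    return (lam_max - lam_min) / (lam_max + lam_min)
```

An image of parallel stripes has gradients in one direction only, so λmin is 0 and the coherency is 1. An image with equal gradient energy in two perpendicular directions has λmax = λmin and a coherency of 0.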


5.1.2.4.2. Collagen quantitation

Whole chamber slide images were analysed using the NIS-elements software (Nikon, Japan). Hoechst® stained images were analysed for each chamber followed by the corresponding 488nm green image. Six ROIs were selected, covering the representative areas of the chamber, being moved if they were in an area absent of collagen or cell nuclei, and ensuring that ROIs were consistent in size between different chambers. The exact position of each ROI in the Hoechst® image was recorded to ensure the same position in the 488nm image. Each ROI had a binary threshold applied to mark either the cell nuclei or the collagen fibres, depending on image. Once the threshold was adjusted to minimise background and ensure an accurate read, the ‘measure’ function was used to measure object count and binary area covered, and the data exported to Excel. To obtain a value for area of collagen per cell for each ROI, the ‘binary area’ of the 488nm image was divided by the ‘object count’ from the blue image.
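The final calculation described above can be sketched as follows. The function names and the example values are hypothetical, and the units depend on how the ‘binary area’ is calibrated in the imaging software, which is not restated here.

```python
from statistics import median


def collagen_area_per_cell(collagen_binary_area, nuclei_count):
    """Divide the thresholded collagen area of one ROI by its nuclei count."""
    if nuclei_count <= 0:
        # ROIs without nuclei were repositioned in the protocol, so a zero
        # count is treated as an error here rather than silently skipped.
        raise ValueError("ROI contains no nuclei")
    return collagen_binary_area / nuclei_count


def summarise_chamber(rois):
    """Return per-ROI values and their median.

    `rois` is a list of (collagen_binary_area, nuclei_count) pairs,
    one pair per ROI in a chamber.
    """
    values = [collagen_area_per_cell(area, count) for area, count in rois]
    return values, median(values)
```

For three hypothetical ROIs with areas 1200, 900 and 1000 and nuclei counts 40, 30 and 50, the per-ROI values are 30, 30 and 20 and the median is 30.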


Figure 5.2: Example regions of interest (ROIs) showing quantification process for collagen/µm (panels A to H). The whole chamber was imaged with the 4x objective, in blue (358nm) and green (488nm) filters (A and B), to image Hoechst stained cell nuclei and Alexa Fluor 488 stained collagen 1. Six ROIs were chosen (C and D). For each ROI, the number of cell nuclei was counted using a ‘binary threshold’ and ‘object count’ (E and F). A similar process was applied to the collagen I image, with the area of collagen coverage measured in place of the number of cells (G and H). Combining the data from these two measures gave a measure of collagen I secreted per cell.


5.1.3. siRNA knockdown of MKX and FOXF2 genes

Knockdowns of MKX and FOXF2 using siRNA were carried out using the FlexiTube GeneSolution siRNA system (QIAGEN, Netherlands), which is a gene-specific package of 4 preselected siRNAs for a target gene. FlexiTube siRNA sets for both FOXF2 (Cat. No. GS2295) and MKX (Cat. No. GS2830780) were obtained, as well as AllStars Hs Cell Death siRNA (Cat. No. SI04381048) as a positive control and AllStars Negative Control siRNA (Cat. No. SI03650318) as a scrambled negative control. Transfection was carried out using the HiPerFect transfection system (QIAGEN, Netherlands) according to the manufacturer’s instructions.

Scar cells were seeded at a density of 2.4 × 10⁵ cells per well in a 6-well plate (Corning, USA) with 2200 µl cell culture medium (DMEM with 10% FBS and 1% penicillin/streptomycin, Life Technologies, USA). The cells were then left for 3 hours to settle and attach to the bottom of the well. siRNA was then diluted in 200 µl serum-free cell culture medium (DMEM with 1% penicillin/streptomycin), to which 12 µl of HiPerFect transfection reagent was added. This mixture was then vortexed and left to incubate for 10 minutes at room temperature to allow formation of transfection complexes. This diluted mixture was then added to the cells and left to incubate at 37°C and 5% CO2 for 72 hours. After incubation, cells transfected with AllStars Hs Cell Death siRNA were monitored for cell death, and if >90% of cells were dead then cells in the remaining wells were trypsinised, seeded, imaged and analysed using the ‘scar-in-a-jar’ protocol outlined in Section 5.1.2. Cells from patient 2 and patient 4 were selected for siRNA knockdown.

5.1.4. Quantitative real time PCR

Quantitative real time polymerase chain reaction (qRT-PCR) was performed on aliquots of RNA extracted for the original expression array in Section 3.1.1. qRT-PCR was carried out using the QuantiTect Reverse Transcription kit (Cat. No. 205311), QuantiTect SYBR Green PCR kit (Cat. No. 204143) and the QuantiTect Primer Assay (Cat. No. 249900) using pre-validated primers for MKX (Cat. No. Hs_MKX_1_SG) and FOXF2 (Cat. No. Hs_FOXF2_1_SG). This was run on the Qiagen Rotor-Gene Q, according to the settings specified in the QuantiTect SYBR Green assay instructions (Qiagen, Netherlands). Pre-validated QuantiTect primers for β-actin and glyceraldehyde 3-phosphate dehydrogenase (GAPDH) were used as housekeeping genes, and genes of interest were normalised to the geometric mean of the two housekeeping genes (Vandesompele et al., 2002).
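Normalisation to the geometric mean of two housekeeping genes can be illustrated with a minimal sketch. It assumes that threshold cycle (Ct) values are converted to relative quantities as 2^(−Ct), i.e. perfect doubling per cycle. The amplification efficiencies and the exact calculation used by the instrument software are not restated here, and the function name is hypothetical.

```python
from statistics import geometric_mean


def normalised_expression(ct_target, ct_housekeeping):
    """Target quantity divided by the geometric mean of reference quantities.

    Assumption: 100% amplification efficiency, so quantity = 2 ** (-Ct).
    `ct_housekeeping` is a list of Ct values, one per housekeeping gene.
    """
    target_quantity = 2.0 ** (-ct_target)
    reference_quantities = [2.0 ** (-ct) for ct in ct_housekeeping]
    return target_quantity / geometric_mean(reference_quantities)
```

Under this assumption, taking the geometric mean of the reference quantities is equivalent to taking the arithmetic mean of the reference Ct values. A target Ct of 25 with housekeeping Ct values of 20 and 22 gives 2^(−(25 − 21)) = 0.0625.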

5.1.5. Statistical analysis

All statistical analysis was carried out using GraphPad Prism (Version 6.0) software.

For comparison of the amount of collagen per cell in scar and normal control fibroblasts, the collagen produced per cell in each ROI was calculated and compared between scar and control fibroblasts for each patient using a Mann-Whitney U-test. For comparison of collagen coherence in scar and normal control fibroblasts, the median coherency value of 6 ROIs from each image was calculated and compared between scar and control for each patient using a Mann-Whitney U-test.

In the siRNA experiment, the collagen per cell of each ROI was compared between the scrambled RNA transfected fibroblasts and the FOXF2, MKX and FOXF2/MKX knockdowns (pair-wise) for each patient using a Mann-Whitney U-test. For comparison of collagen coherence, the median coherency value of 6 ROIs from each image was calculated and was compared between the scrambled RNA transfected fibroblasts and the FOXF2, MKX and FOXF2/MKX knockdowns (pair-wise) for each patient using a Mann-Whitney U-test.
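The Mann-Whitney U-test used for these pair-wise comparisons can be sketched for small samples as follows. This is an independent illustration that computes the U statistic and a two-sided exact p-value by enumerating every possible split of the pooled values; it is not the Prism implementation, and the function names are hypothetical.

```python
from itertools import combinations


def u_statistic(x, y):
    """U for sample x: pairs (x_i, y_j) with x_i > y_j, ties counted as 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def mann_whitney_exact(x, y):
    """Return (U, two-sided exact p-value). Only feasible for small samples."""
    pooled = list(x) + list(y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2.0  # expected U under the null hypothesis
    observed = u_statistic(x, y)
    observed_dev = abs(observed - mean_u)

    extreme = total = 0
    # Enumerate every way of assigning n1 of the pooled values to group x.
    for chosen in combinations(range(len(pooled)), n1):
        chosen_set = set(chosen)
        group_x = [pooled[i] for i in chosen]
        group_y = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        # Count splits at least as far from the expected U as the data.
        if abs(u_statistic(group_x, group_y) - mean_u) >= observed_dev - 1e-12:
            extreme += 1
    return observed, extreme / total
```

With two groups of three values that do not overlap, only 2 of the 20 possible splits are as extreme as the data, giving p = 0.1. Small groups therefore limit the smallest p-value the test can return.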


5.2. Results

5.2.1. Scar cell fibroblasts vs. normal skin fibroblasts

5.2.1.1. Collagen produced per cell between scar and control fibroblasts

The amount of collagen per cell was significantly higher in scar fibroblasts compared to normal skin fibroblasts from patient 2 (p<0.0023) (Fig. 5.3). In all other patients tested there was no significant difference in the collagen produced per cell (Fig. 5.3). Experiments with cells from patients 1 and 6 were not completed due to technical issues. Representative images are shown in Figure 5.4.


Figure 5.3: No change in amount of collagen produced per cell in scar and control fibroblasts in 3 out of 4 patients. Data points represent technical replicates (ROIs) (**=p<0.01) in patients 2-5 (A, B, C and D). Data points represent the median of all ROIs in the “All patients” graph (E).


Figure 5.4: Collagen per cell levels are unchanged in scar compared to matched controls in 3 of 4 patients. Example ROIs from collagen per cell experiment at 4x magnification with Hoechst stained nuclei and Alexa Fluor 488 stained collagen 1. A) Patient 2 control. B) Patient 2 scar. C) Patient 3 control. D) Patient 3 scar. E) Patient 4 control. F) Patient 4 scar. G) Patient 5 control. H) Patient 5 scar. Collagen per cell levels are significantly increased in Patient 2 scar (B) compared to the matched control cells (A).

5.2.1.2. Collagen coherence between scar and control fibroblasts

The collagen matrix produced by scar fibroblasts from patient 2 was significantly more coherent (aligned), measured as described in Section 5.1.2.4.1, than the collagen matrix produced by the normal skin fibroblasts from the same patient (Fig. 5.5, p=0.0023, Mann-Whitney U-test). In all other patients tested there was no significant difference in coherence between matched scar and normal skin fibroblast collagen matrices. The appearance of the collagen matrix in scar and matched normal control cells is shown in representative confocal images (Fig. 5.6).


Figure 5.5: No change in coherence of collagen between scar and control fibroblasts in 3 out of 4 patients. Data points represent technical replicates (median of 6 ROIs) (*=p<0.05) in patients 2-5 (A, B, C and D). In the “All patients” graph, data points represent the median of all ROIs of each patient (E).


Figure 5.6: Confocal images of collagen 1 matrix stained with Alexa Fluor 488 at 40x magnification. No significant difference in collagen coherence was observed, with the exception of Patient 2. A) Patient 2 control. B) Patient 2 scar. C) Patient 3 control. D) Patient 3 scar. E) Patient 4 control. F) Patient 4 scar. G) Patient 5 control. H) Patient 5 scar.

5.2.1.3. Relative gene expression of FOXF2 and MKX in scar fibroblasts by qRT-PCR

Gene expression measured by qRT-PCR showed that expression levels of FOXF2 and MKX were higher in scar fibroblasts in 3 of the 5 patients (Figs. 5.7A and 5.7C). In the gene expression microarray (Chapter 3), FOXF2 and MKX had increased expression in all 6 patients (Figs. 5.7B and 5.7D).

[Figure 5.7 panels: A) Relative FOXF2 Expression (y-axis: normalized relative expression; x-axis: patients 2-6). B) FOXF2 Expression (y-axis: expression level; x-axis: patients 1-6). C) Relative MKX Expression (y-axis: normalized relative expression; x-axis: patients 2-6). D) MKX Expression (y-axis: expression level; x-axis: patients 1-6). Legend: Control, Scar.]

Figure 5.7: qRT-PCR results match expression levels on expression array. A) Relative FOXF2 expression assessed by qRT-PCR. B) FOXF2 expression levels from expression microarray (individual numbers refer to each individual patient). C) Relative MKX expression by qRT-PCR. D) MKX expression levels from expression microarray (individual numbers refer to each individual patient).


5.2.2. siRNA knockdown of MKX and FOXF2

5.2.2.1. Collagen quantitation.

There was a significant decrease in the level of collagen I per cell in the FOXF2/MKX siRNA treated (double knockdown) scar fibroblasts compared to scrambled and untreated controls in both patient 2 (Fig. 5.8A; p<0.001) and patient 4 (Fig. 5.8B; p<0.001). Additionally, the single knockdowns for FOXF2 and MKX showed a significantly decreased amount of collagen per cell in patient 4 scar fibroblasts (Fig. 5.8B; FOXF2 p<0.001, MKX p<0.001).

Figure 5.8: Collagen per cell levels decreased in siRNA knockdowns in both patients 2 (A) and 4 (B). Both the single and double gene knockdowns cause a significant decrease in the collagen per cell (***=p<0.001).


Figure 5.9: Patient 2 collagen per cell levels were significantly decreased in the FOXF2/MKX siRNA treated cells compared to the scrambled control. Examples of collagen per cell levels for patient 2 at 4x magnification with Hoechst stained cell nuclei and Alexa Fluor 488 stained collagen 1: A) Scrambled siRNA. B) FOXF2/MKX siRNA.

Figure 5.10: Patient 4 collagen per cell levels were significantly decreased in all siRNA treated groups compared to the scrambled control. Examples of collagen per cell levels for patient 4 at 4x magnification with Hoechst stained cell nuclei and Alexa Fluor 488 stained collagen 1: A) Scrambled siRNA. B) FOXF2 siRNA. C) MKX siRNA. D) FOXF2/MKX siRNA.

5.2.2.2. Collagen coherence of cells treated with siRNA

When treated with siRNA for MKX and FOXF2/MKX, Patient 2 scar fibroblasts produced a collagen I matrix with significantly reduced coherence compared to the scrambled control (Fig. 5.11A: single MKX knockdown p<0.05; double FOXF2/MKX knockdown p<0.01). Patient 4 scar fibroblasts treated with siRNA for FOXF2, MKX and FOXF2/MKX showed no significant difference in coherence when compared to controls (Fig. 5.11C).

Figure 5.11: Collagen coherence is significantly decreased in single MKX knockdown and double FOXF2/MKX knockdown cells from patient 2, and significantly increased in FOXF2 knockdown cells from patient 3. A) Patient 2 scar fibroblasts treated with a siRNA single knockdown of MKX and FOXF2 and double knockdown of MKX/FOXF2. B) Patient 3 scar fibroblasts treated with a siRNA single knockdown of MKX and FOXF2 and double knockdown of MKX/FOXF2. C) Patient 4 scar fibroblasts treated with a siRNA single knockdown of MKX and FOXF2 and double knockdown of MKX/FOXF2. D) Combined medians of all three patients’ collagen coherence, split by treatment group (*=p<0.05, **=p<0.01).

Figure 5.12: Patient 2 showed a significant decrease in collagen orientation in cells treated with siRNA knockdown compared to scrambled siRNA. 40x confocal images of Alexa Fluor 488 stained collagen 1 of A) Scrambled siRNA. B) FOXF2 siRNA. C) MKX siRNA. D) FOXF2/MKX double siRNA.


Figure 5.13: Patient 4 showed no significant decrease in the coherence of collagen in the siRNA knockdowns compared to the scrambled siRNA. 40x confocal images of Alexa Fluor 488 stained collagen 1 of A) Scrambled siRNA. B) FOXF2 siRNA. C) MKX siRNA. D) FOXF2/MKX siRNA.

5.3. Discussion

5.3.1. Differences identified between normotrophic scar and normal skin fibroblasts

In this study, the collagen coherence and the amount of collagen per cell were found to be unchanged in the scar fibroblasts of three of the four burn patients compared to their normal skin fibroblasts, while the remaining patient had a statistically significant increase in both the coherence (figure 5.2) and the amount of collagen produced (figure 5.4). In previous studies, collagen orientation and synthesis in normotrophic scar have been shown to differ from those of normal skin in vivo (Muir, 1990; Verhaegen et al., 2009). Scar collagen is organised in parallel bundles, aligned with the epidermis, compared with the more random ‘basket weave’ orientation of normal skin (van Zuijlen et al., 2003). In human tissue analyses, orientation has been shown to differ between normal skin, normal scar, hypertrophic scar and keloid scar using a fast Fourier transform (FFT) analysis, which calculates the randomness of the collagen orientation. Collagen orientation was found to be the most random in normal skin, with increasingly parallel orientation with increasing severity of scar phenotype (Verhaegen et al., 2009).
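
The idea behind an FFT-based orientation measure can be sketched as follows. This is a minimal illustration only, and is not the analysis of Verhaegen et al. (2009) nor the coherence measurement used in this thesis. The function name `fft_coherence` and the anisotropy index it returns (close to 0 for randomly oriented texture, close to 1 for parallel texture) are assumptions chosen for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def fft_coherence(image):
    """Anisotropy index (0 to 1) from the 2D power spectrum of an image.

    The second moments of the power spectrum form a 2x2 matrix whose
    eigenvalues describe how strongly the texture varies along its
    principal directions. The index is (l1 - l2) / (l1 + l2).
    """
    img = np.asarray(image, dtype=float)
    img = img - img.mean()  # remove the constant (DC) component
    power = np.abs(np.fft.fft2(img)) ** 2

    # Spatial frequency coordinates for rows (ky) and columns (kx).
    ky = np.fft.fftfreq(img.shape[0])[:, None]
    kx = np.fft.fftfreq(img.shape[1])[None, :]

    jxx = np.sum(power * kx * kx)
    jyy = np.sum(power * ky * ky)
    jxy = np.sum(power * kx * ky)

    total = jxx + jyy
    if total == 0:
        return 0.0  # a flat image has no orientation
    return float(np.sqrt((jxx - jyy) ** 2 + 4.0 * jxy ** 2) / total)


# Usage on two synthetic textures (not real collagen images).
x = np.arange(64)
stripes = np.tile(np.sin(2 * np.pi * 8 * x / 64), (64, 1))  # parallel bands
noise = np.random.default_rng(0).normal(size=(128, 128))    # random texture
print(fft_coherence(stripes), fft_coherence(noise))
```

In this sketch the striped image should score close to 1 and the noise image close to 0, mirroring the contrast between parallel scar collagen and the more random ‘basket weave’ of normal skin.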

This pattern of increased alignment of the collagen matrix was expected to be replicated in this in vitro model, with the scar fibroblasts predicted to produce a denser and more aligned collagen matrix than that produced by the matched normal skin fibroblasts from each patient. However, in this study, only the scar fibroblasts from one patient (Patient 2) produced a denser and more aligned collagen matrix than the matched normal skin fibroblasts. In the remaining three patients from whom data were obtained, no significant difference in either collagen quantity or coherency was observed.

The ‘scar-in-a-jar’ method developed by Chen and Raghunath was initially used to screen large numbers of potentially anti-fibrotic drugs (Chen and Raghunath, 2009) and only measured the amount of collagen produced per cell. In our laboratory we have adapted the technique to also measure collagen coherence (M. Bradshaw, unpublished data). Results obtained with keloid fibroblasts using the modified methodology described here have shown that keloid fibroblasts secrete more collagen than normal scar or normal skin fibroblasts and that the collagen is organised with greater coherency (more aligned) (M. Bradshaw, unpublished data, personal communication). In addition, a drug targeting collagen cross-linking significantly reduced coherence of the collagen matrix in scar and keloid fibroblasts using the model (M. Bradshaw, personal communication).

Therefore the in vitro model appears to work well both for assessing drug efficacy and for monitoring differences between disease states such as keloid and normal skin.

There are a number of possible explanations for the lack of significant difference in the collagen matrix produced by scar and normal skin fibroblasts in the experiments in this thesis. Under artificially stimulated conditions in vitro, such as those used for the ‘scar-in-a-jar’ model, both scar and control fibroblasts are stimulated to produce increased collagen using exogenous TGFβ and crowding molecules. This may have the effect of ‘normalising’ the matrix production to be more scar-like, and therefore diminish any difference detected between skin and scar fibroblast cultures. The comparison between non-stimulated normal skin and normotrophic scar fibroblasts cannot be performed in vitro using this model, as both normotrophic scar and normal fibroblasts produce very little ECM under unstimulated conditions. Other studies using different models have found a similar variability in collagen production of scar fibroblasts when tested in vitro. A study comparing non-stimulated, patient-matched hypertrophic scar and normotrophic scar fibroblasts found that only half of the hypertrophic scar fibroblast samples showed increased collagen production compared to their normotrophic counterparts (Tredget et al., 1997). In another study comparing hypertrophic scar to normal skin (whereas this study compared normotrophic scar to normal skin), no difference between the collagen secretion of non-stimulated hypertrophic scar and normal skin fibroblasts was observed (Harrop et al., 1995).

In addition to the use of exogenous stimuli, the model requires de novo ECM production in a flask devoid of existing ECM, which may stimulate all cells to produce a similar amount of matrix. In vivo studies on tissue samples measured coherency over time in a longitudinal study of scar maturation (Verhaegen et al., 2009). This may have facilitated the detection of changes between scar phenotype and over time in comparison to the in vitro model used here.

5.3.2. Validation of target gene involvement in extracellular matrix synthesis

Upregulation of gene expression of MKX and FOXF2 in scar fibroblasts was confirmed in most patient samples by qRT-PCR. Although all the data from the microarray showed that all the genes were upregulated in normotrophic scar fibroblasts compared to the controls, the qPCR did not match this exactly. This suggests that although shown to be quite robust, microarray results still require validation by qRT-PCR. The siRNA gene knockdown experiments provide preliminary evidence that both of the genes MKX and FOXF2 are involved in extracellular matrix production.

Knockdown of target genes caused a significant decrease in the quantity of collagen produced by scar fibroblasts from patient 2 in the double knockdown, and in both the single knockdowns and the double knockdown in patient 4. These results provide preliminary evidence that MKX and FOXF2 are involved in the control of matrix production in human fibroblasts. The data also provide preliminary validation of the value of the integrated genomic approach used in this thesis to identify novel anti-fibrotic genes.

Knockdowns of the individual genes also had a varying effect on the coherency of the scar cell collagen 1 in the 3 patients tested. In patient 2, collagen 1 coherence with MKX knockdown was significantly decreased and with FOXF2 knockdown there was a trend towards a decrease (p=0.11). Conversely, in patient 3 the collagen coherence increased in the FOXF2 knockdown (p=0.04), with no change in the MKX and FOXF2/MKX knockdowns. The data for patient 4 were less conclusive, with a high variability of the coherence values in the controls.

The highly significant decrease in the amount of collagen per cell with single knockdowns of MKX and FOXF2 in patient 4 (p<0.0001 for each) suggests that each gene has an individual role in the production of collagen 1. Single knockdowns were not available for patient 2 for collagen production levels due to experimental issues. It is interesting to note that the qPCR results for patient 2 suggest minimal change in the levels of FOXF2 and MKX expression between scar and normal skin fibroblasts, whereas patient 4 had a large difference in expression between the two samples. Nevertheless, the siRNA approach appears to be effective in reducing collagen production in both patients’ scar fibroblasts, suggesting that even in cells expressing lower levels of these genes collagen production can be reduced.

The decrease in collagen coherence and collagen production with knockdown of MKX is compatible with the current known literature on the function of MKX. MKX is involved in collagen fibril organisation in tendons, observed in MKX -/- mice which have a decreased amount and abnormal organisation of collagen in their limb and tail tendons, where MKX appears to be required for the regulation of postnatal growth and maturation of collagen fibrils (Liu et al., 2010; Berthet et al., 2013). Interestingly, no abnormal skin phenotype was reported in these MKX null mice, so the role of MKX in collagen organisation and maintenance in normal skin may be minimal (Ito et al., 2010). Alternatively the skin may not have been adequately investigated. Further analysis of these mice and in particular a wound healing and scar formation study will be important in establishing the potential role of MKX in scar formation and maintenance and collagen production in the skin.

FOXF2 has also been implicated in collagen fibril organisation and production and therefore the trend observed with FOXF2 siRNA (and in particular in the double knockdown FOXF2/MKX siRNA) reducing collagen quantity and coherence is supported by its known function. In the intestinal tissues of Foxf2 null mice there is a significant decrease in the amount of collagen deposited, and FOXF2 null mice suffer spontaneous disintegration of the epithelium, the mesenchyme and the two muscular layers in the intestine (Ormestad et al., 2006). Therefore FOXF2 appears to be important in matrix production and maintenance in the gut. However, as with the MKX null mice, no abnormal skin phenotype was reported in the FOXF2 null mice. Again, this may be because the skin was not assessed and the mice had not been exposed to an injury. Therefore the use of MKX and FOXF2 null mice also provides a potentially useful strategy for in vivo validation of these new anti-fibrotic targets.

5.3.3. Limitations of study

The small number of samples isolated from patients is a key limitation of this study. In addition, complete analysis of all 6 patients’ fibroblasts would help to provide more information on the variation between individuals.

The use of the ‘scar-in-a-jar’ assay, whilst a very useful in vitro model, does have some limitations. This assay was selected as the best option to measure in vitro collagen as it resolves many of the problems commonly associated with other in vitro fibroplasia models, models all parts of the biosynthetic pathway of collagen and is flexible in allowing multiple measures to be made of the same sample, generating a large amount of data from a single sample (Chen and Raghunath, 2009). However, the use of imaging does not provide a measure of total collagen protein produced and the focus on only one component of the matrix (collagen I) does not provide a complete picture of the changes that may be occurring. The method is also complicated, requiring multiple steps and a lengthy process to obtain data. Optimisation was also difficult. This leads to inter-experiment variation and can limit the ability to compare samples that are run at different times.

Another limitation of this method is the inability to study collagen production of normal skin and scar fibroblasts in vitro without the use of TGFβ stimulation. Other more sensitive methods of collagen quantitation, such as mass spectrometry, could have been used to detect collagen produced without stimulation, but these do not provide the additional coherency information which is important in understanding the structural ECM changes.

Finally, the use of siRNA to investigate the function of the genes also has limitations. Further work assessing the extent of knockdown of expression using these siRNAs is required, and there is the additional possibility of off-target effects of siRNAs (Jackson and Linsley, 2010), which whilst minimised in these oligonucleotides by the manufacturers, could still have occurred in this experiment.

5.4. Conclusions

In this thesis, changes in epigenetic programming of normal skin fibroblasts, which alter the transcriptome and cell phenotype, were hypothesised to be the mechanism responsible for long-term changes in collagen metabolism in normotrophic scar fibroblasts and subsequently the dermal matrix maintenance of scar. Two genes identified through the whole genome methylation and transcriptome studies (Chapters 2-4) as being differentially methylated and expressed in normotrophic scar fibroblasts compared to normal skin fibroblasts were selected for in vitro validation of their effects on ECM metabolism. The purpose of this chapter was to explore whether the genes identified from the integrative methylome and transcriptome studies, MKX and FOXF2, influence collagen production in scar fibroblasts. Modulation of the expression of these genes caused a decrease in the fibrotic properties of the scar fibroblasts, demonstrated by decreased collagen production per cell and decreased coherence of the collagen produced in the scar-in-a-jar model. These results suggest that both genes are potential targets for anti-fibrotic therapies. In addition, the results help to validate the integrated genomic approach to identifying novel anti-fibrotic targets. Reprogramming of scar fibroblast phenotype through epigenetic modification may be able to ameliorate scarring by switching the dynamics of ECM production back to the ‘normal’ phenotype.

This study is the first time MKX and FOXF2 have been shown to be important in scarring. Furthermore, the use of patient cells rather than animal models for all these experiments provides strong support that these genes have the potential to be important therapeutic targets in patients with scars. Further research to validate these targets in vivo, potentially using transgenic mice, will be important to understand the role of these genes in scar formation and maintenance.

Chapter 6

6. Chapter 6 - General Discussion

6.1. Overview of research

Scars are permanent modifications to the normal architecture of the skin that persist throughout the lifetime of the individual. These scar tissues are not static, and a scar formed when an individual is young will grow with that individual into adulthood. Epigenetic changes have been shown to be important in fibrosis in other tissues (Sun et al., 2010), and in this thesis were hypothesised as the mechanism responsible for long-term changes in collagen metabolism in normotrophic scar fibroblasts and subsequently the dermal matrix formation and maintenance. The scar fibroblasts pass this altered epigenome to all subsequent daughter cells, maintaining the mature scar phenotype for life and resulting in growth of a scar during periods of physical growth.

This study aimed to test the hypothesis that differences in the epigenome of scar fibroblasts compared to normal skin fibroblasts are responsible for long-term changes in collagen metabolism in normotrophic scar fibroblasts and subsequently the dermal matrix.

Six male patients with a mature normal burn scar on one forearm and a contralateral non-injured forearm were recruited. Skin biopsies were collected and fibroblasts were cultured from tissue explants. DNA, RNA and cells from these biopsies were used for the experiments in the thesis, including DNA methylation and gene expression microarrays. Integrated bioinformatic analysis was undertaken to identify potential target genes involved in skin fibrosis and in vitro validation experiments were performed with two selected genes. These genes appear to be important in regulating collagen deposition by fibroblasts and may be targeted therapeutically to ameliorate scarring in the future.

6.1.1. Summary of results

In this study, an integrated methylome and transcriptome analysis of human normotrophic scar fibroblast RNA and DNA found 398 differentially methylated genes and 163 differentially expressed genes, with an overlap of 16 genes both differentially methylated and expressed. Of these 16 genes, 2 were found to have gene ontologies suggesting a role in regulating collagen production and organisation. Modulating the expression of these genes using siRNA in vitro caused a significant decrease in the amount of collagen produced by the fibroblasts from the two patients tested, and a significant decrease in coherence of the collagen matrix produced by the fibroblasts from one of those two patients. This evidence supports a role of both genes in collagen production in scar fibroblasts, suggesting that intervention to modulate expression of these genes could offer therapeutic potential to ameliorate scarring.
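
The overlap step of the integrated analysis amounts to intersecting the two gene lists. The sketch below is illustrative only and uses toy lists: only MKX and FOXF2 are taken from the text, the `GENE_*` identifiers are hypothetical placeholders, and the helper name `overlap_genes` is an assumption rather than part of the pipeline used in this thesis.

```python
def overlap_genes(methylated, expressed):
    """Return the genes present in both lists, sorted for stable output."""
    return sorted(set(methylated) & set(expressed))


# Toy lists: MKX and FOXF2 come from the text, GENE_* are placeholders.
differentially_methylated = ["MKX", "FOXF2", "GENE_A", "GENE_B", "GENE_C"]
differentially_expressed = ["FOXF2", "GENE_C", "GENE_D", "MKX"]

shared = overlap_genes(differentially_methylated, differentially_expressed)
print(shared)
```

Applied to the full lists (398 and 163 genes), the same intersection would give the set of genes that are both differentially methylated and differentially expressed.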

6.2. Significance of these findings and relevance to the field

This is the first study to present an integrated snapshot of both the methylome and transcriptome of normal scar fibroblasts isolated from mature burn scars. It is also one of very few studies to present an integrated analysis of methylome and transcriptome data in skin or skin pathology, and the first on normotrophic scar. It is also the first to implicate the MKX and FOXF2 transcription factors, known to be involved in collagen production, orientation and fibrosis in other tissues, in the maintenance of normal scars in the skin. These genes are now potential targets for the development of therapeutics to prevent or ameliorate scar formation and maintenance, and this study has provided a basis on which further work can build to identify and subsequently validate more target genes.

6.2.1. Scar cell origin as a possible explanation for the observed differences in the epigenome of scar fibroblasts

The methylation profile of the scar fibroblasts (chapter 2) revealed 398 promoter regions of genes that were classified as significantly differentially methylated, with 174 hypermethylated and 225 hypomethylated genes (Fig. 2.1). Whilst this represents only approximately 1.5% of the protein coding genes it is nevertheless potentially sufficient to underpin changes in matrix biology. Two key questions arise from these findings. The first is what is the origin of these changes in the epigenome? The second is what are the functional consequences of these epigenomic differences?

One possible explanation for the origin of the epigenomic changes may lie in the origin of the cells involved in healing and scar formation themselves. The origin of scar fibroblasts is currently unclear, but there are three sources that have been postulated. The first is that after injury, wound fibroblasts are stimulated from the edge of the wound, where they migrate in to form new dermis, in a similar mechanism to epithelial cells (Linares, 1996). The second is that scar fibroblasts originate from the bone marrow, and that these circulating bone marrow derived cells, or fibrocytes, migrate to the wound and differentiate into fibroblasts to form the new dermis (Rea et al., 2009; Reilkoff et al., 2011). The third is epithelial cells in the wound margin are stimulated to undergo epithelial to mesenchymal transformation (EMT), and these former epithelial cells contribute to the formation of the new dermis (Savagner et al., 2005). It is also possible all of these mechanisms contribute during wound healing and in different amounts or in response to injuries of different extent.

If the dermal fibroblasts in the scars are primarily those that have migrated from the edge of the wound, the differences in methylation observed between the scar and control fibroblasts may be caused by the injury itself. The fibroblasts at the wound edge may suffer epigenetic ‘damage’ by reactive oxygen and nitrogen species, which have been shown to induce damage such as abnormal activation of DNA methyltransferase (Ohshima, 2003). This may be caused acutely by the injury itself, or possibly by inflammatory mediators that alter the epigenome not in a directed mechanistic fashion, but simply as a consequence of the injury. This could mean that the observed changes do not serve a particular function during wound healing but rather are a by-product of the injury. Alternatively the activation of the healing process may induce changes in the epigenome to stimulate a healing response and promote repair, and these changes may in part remain after healing is complete, leaving an epigenetic ‘fingerprint’ that persists in the scar tissue. In this case the changes in the epigenome could be responsible for maintaining the aberrant matrix in the mature scar. Evidence for the injury causing epigenetic damage is found in that wounds cause damage at a cellular level, with emergency plasma membrane fusion (repair of the torn plasma membrane of the cell) occurring after disruption of the plasma membrane and influx of Ca2+ in many cell types (McNeil and Kirchhausen, 2005), including fibroblasts (Togo et al., 2000). This insult to the plasma membrane may cause aberrant epigenetic patterning, including methylation changes that may persist after the wound has healed. The finding that the ‘plasma membrane’ pathway was the top hit in the entity list of differentially methylated promoter regions may support this hypothesis. Reactive oxygen and nitrogen species, which are known to cause both genetic and epigenetic damage (Ohshima, 2003), are also increased in the acute wound (auf dem Keller et al., 2006; Soneja et al., 2005).

Separate from the ‘epigenetic damage’ hypothesis, the activation of the ‘repair’ transcriptome has been shown to be at least partially regulated epigenetically (Shaw and Martin, 2009a), and persistence of this epigenetic fingerprint may also contribute to scar formation and maintenance. Either of these hypotheses, or a combination of both, may be responsible for the changes observed. Earlier and longitudinal sampling and epigenetic profiling of wound fibroblasts could potentially help in unravelling whether these mechanisms contribute to the epigenetic changes, as well as possible cell labelling experiments to track cell fate after injury.

The second theory, that bone marrow derived cells called fibrocytes migrate to the wound and undergo differentiation to form dermal fibroblasts, may explain the epigenetic differences between the scar and normal fibroblasts by residual epigenetic patterns ‘left over’ from the bone marrow cell origin. This residual epigenetic pattern after differentiation is seen in other types of cells, where induced pluripotent cells successfully differentiated maintain the epigenetic signature of the cell type they were induced from (Lister et al., 2011). This different epigenetic signature may result in abnormal gene expression, and thus the scar phenotype. The presence of the differentially methylated CD34 gene in the scar fibroblasts in the results of this study, a common marker of hematopoietic cell origin also expressed in a subset of mesenchymal stem cells (Most, 1992), may be evidence supporting a bone marrow origin of dermal scar fibroblasts.

The main function of bone marrow derived fibrocytes is to migrate into the wound and differentiate into myofibroblasts – fibroblast-like cells that express alpha smooth muscle actin (αSMA) (Xu et al., 2015). Myofibroblasts are found transiently in the wound, where their function is to promote contraction of the wound, as well as secrete ECM proteins (Mori et al., 2005). Bone marrow cells are thought to make up at least some proportion of the cells in normal and wounded skin (Wu et al., 2010), but more work must be done to elucidate their epigenetic profile and contribution to the normotrophic scar phenotype.

The third theory, that epithelial keratinocyte cells undergo EMT to differentiate into dermal fibroblasts, has a similar explanation for the epigenetic differences between scar and normal fibroblasts as the bone marrow theory. Like the bone marrow stem cell theory, there may be some residual epigenetic patterns that persist after dedifferentiation from an epithelial to mesenchymal lineage that underpin the epigenetic changes observed. Multiple epigenetic mechanisms have been shown to play a key role in controlling epithelial–mesenchymal transition (Wang and Shang, 2013), and residual epigenetic effects from EMT cells in the dermis may cause or contribute to scar formation and maintenance. The presence of several differentially methylated keratin genes in the data from this study, which are expressed in epidermal keratinocytes but not in dermal fibroblasts, may provide some evidence for this. A methylation profile of EMT derived dermal fibroblasts is required to determine if this is the case.

All three theories are plausible explanations for the origin of epigenetic changes in scar fibroblasts, and scar dermis may contain some populations of fibroblasts from the wound edge, some derived from bone marrow and some derived from epithelial cells via EMT. The extent of the injury may also play a role in where the cells are sourced from, as severe burn injuries raise many issues that differ from those of smaller, non-severe injuries (<20% body surface area) (Herndon, 2007). An example of this might be that in major injuries, the circulating reservoir of fibrocytes could be depleted, leading to an increase in EMT from the edges of the wound to compensate and leading to a different epigenetic profile of normotrophic scar fibroblasts in severe as compared to non-severe burn scars. This could be in contrast to small wounds which heal using predominantly resident cells, or intermediate injuries that have contributions both from resident and bone marrow cell populations. It is also possible that the etiology of injury is important. In this study, all the scar samples obtained were post-burn injury. Recent evidence for a differential immune response to burn and excisional injury (Valvis et al., 2015), as well as clinical observations of differences between male and female recovery from trauma and burns (Kerby et al., 2006), suggest that etiology of injury may have an impact on the repair process. Therefore the differences observed in this study may reflect injury etiology, and a comparative study using a different trauma model (for example excisional injury) would be interesting to determine if the epigenetic changes are the same.

Finally, it is important to consider the clinical treatment and the impact this may have on the epigenetic profiles of scar fibroblasts. Patients are commonly treated with autologous cells to promote healing after burn injury, both from a similar body site and in cases of more severe burn injury from more distal sites. The cells may also be prepared either in a split-thickness skin graft or as a cell suspension. Whilst it is not clear that these cells persist in the injury site, if the cells do remain it is likely they will retain some of the epigenetic profile of the site of origin, which may differ from the injury site (Johansson and Headon, 2014).

Further characterisation of human dermal fibroblasts, including potentially sorting cells with markers that differentiate between anatomical origin will be important in trying to delineate the origin of the changes in DNA methylation observed. Analysis of different wound etiologies and severity of injury, as well as the impact of clinical treatment will all add significantly to understanding the changes in the epigenome of scar fibroblasts and how this may be modulated to improve scar outcome in the future.

The origin of the changes in methylation may also underpin the functional importance of the changes observed. Whilst the focus of this thesis has been on the potential functional relevance of these changes, as previously stated the changes could instead be a residual marker of cell origin or direct cell damage with no direct functional relevance to scar formation and maintenance. It may also be that the epigenetic changes have no meaningful impact on the gene expression of the scar fibroblasts. Of the 398 differentially methylated promoter regions of genes, only 16 were also significantly differentially expressed (chapter 4, table 4.1). This suggests that of the many epigenetic changes observed, only a few have an impact on gene expression. However, this may be due to the resolution of the methylation chip vs. the expression chip. There were multiple sites within each gene for the methylation chip, which has the effect of increasing statistical power with the analytical algorithm used. Similarly, the pathway analyses conducted for the transcriptome (chapter 3, methods 3.1.4, table 3.3) have the same effect of increasing power by clustering groups of genes to identify altered pathways in the cells (by reducing the number of tests). However, individual gene analysis in the transcriptome data results in a large number of individual tests being performed, and as a consequence many of the significant changes in the transcriptome may be missed due to lack of statistical power. This is the most likely reason for the mismatch between epigenetic changes and the limited number of transcriptional changes observed in this study.
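
The effect of the number of tests on power can be illustrated with two standard multiple-testing corrections. The sketch below is a generic illustration and does not reproduce the analytical algorithm used for the microarray data in this thesis; the function names, the example p-values and the test counts (20 and 20,000) are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which tests are rejected
    by the Benjamini-Hochberg step-up procedure at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k with p_(k) <= alpha * k / m.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            max_rank = rank

    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# The same nominal p-value passes with few tests but not with many.
p = 0.001
print(p <= bonferroni_threshold(0.05, 20))      # e.g. a few gene sets
print(p <= bonferroni_threshold(0.05, 20000))   # e.g. many individual genes
```

Grouping genes into pathways reduces the number of tests, so a given effect is more likely to remain significant after correction than when every gene is tested individually.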

The use of the pathway analysis suggests that the epigenetic changes do have an effect and are at least partially responsible for driving scar formation. From the entity list of the differentially methylated genes, there are 435 pathways with a p<0.05 (top 20 table 2.3), in many wound healing and scarring related pathways such as cell adhesion, ion transport and proteinaceous extracellular matrix. Many of these same groups appear in the pathway analysis of the differentially expressed genes (chapter 3, top 20 table 3.3, full list appendix VI), including the three mentioned earlier, suggesting that the differential methylation of pathways affects the expression of these pathways. This leads to the idea that the methylation differences are functional, not vestigial, affecting the gene expression and thus scar phenotype in scar fibroblasts.

6.2.2. Pathway analyses of expression data and their importance to scar maintenance

The transcription profile of human scar fibroblasts revealed 163 significantly differentially expressed genes with a fold change of >1.5 and a nominal p value of p<0.05 (chapter 3, top 20 upregulated table 3.1, top 20 downregulated table 3.2, full list appendix V). A gene set enrichment analysis revealed 507 differentially expressed gene sets, many of which were associated with wound healing and extracellular matrix production (chapter 3, top 20 table 3.3, full list appendix VI). The data provide novel insight into the changes that occur and persist in normotrophic scar fibroblasts long after healing is complete.

One gene set that is very interesting in this group is the wingless-related integration site (Wnt) signalling pathways, of which 4 groups were significantly differentially expressed according to the GSEA: Wnt protein binding, positive regulation of Wnt signalling pathway: planar cell polarity pathway, positive regulation of Wnt signalling pathway and Wnt-activated receptor activity (appendix VI). The Wnt signalling pathway is important in embryogenesis, EMT, cell migration and proliferation, cell fate specification, is involved in carcinogenesis and is also important in wound healing and regeneration across many species (Nusse and Varmus, 2012). Wnt signalling is also critical to dermal development and the determination of dermal thickness as well as induction of adnexal structures in skin (Chen et al., 2012a). In previous studies, the Wnt pathway has been shown to be vital in tissue regeneration, and overexpression of Wnt has been observed in hypertrophic and keloid scars (Profyris et al., 2012). This makes the Wnt pathway of significant interest in the maintenance of scar phenotype. Interestingly, in the results of this study, although the individual WNT2 gene is upregulated, all the Wnt signalling groups have slightly decreased expression in the scar fibroblasts compared to the controls. Although the median change value may not be indicative of functional change in large groups (as stated in Chapter 3), the small size of the Wnt groups (6, 22, 28 and 33 members) may mean that median change value is relevant here. This slight decrease in expression may be explained by the downregulation of Wnt signalling in mature wounds to prevent growth of scar tissue in the case of normotrophic scars, and a defect in this process causing their continued expression in hypertrophic and keloid scars contributing to these pathologies. Therefore the identification of this pathway suggests an important role for Wnt signalling in scar maintenance that could be a potential target in aberrant (hypertrophic and keloid) scars although is less likely to be suitable for targeting in stable normotrophic scars.
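
How a single member of a gene set can be upregulated while the median change of the set is slightly negative can be shown with a small worked example. The values below are hypothetical toy log2 fold changes, not data from this study: only the gene name WNT2 is taken from the text, the `GENE_W*` identifiers are placeholders, and the helper `gene_set_median_change` is an assumed name.

```python
from statistics import median


def gene_set_median_change(log2_fold_changes, gene_set):
    """Median log2 fold change over the measured members of a gene set."""
    values = [log2_fold_changes[g] for g in gene_set if g in log2_fold_changes]
    if not values:
        raise ValueError("no measured genes in gene set")
    return median(values)


# Hypothetical log2 fold changes (scar vs normal skin fibroblasts).
toy_changes = {
    "WNT2": 0.9,        # individually upregulated
    "GENE_W1": -0.2,
    "GENE_W2": -0.1,
    "GENE_W3": -0.3,
    "GENE_W4": -0.15,
    "GENE_W5": 0.05,
}
toy_set = ["WNT2", "GENE_W1", "GENE_W2", "GENE_W3", "GENE_W4", "GENE_W5"]

print(gene_set_median_change(toy_changes, toy_set))
```

Because the median is insensitive to a single large value, one strongly upregulated gene does not prevent the set as a whole from showing a small overall decrease, which is the pattern described for the Wnt groups above.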

Supporting the conclusion from the GSEA and pathway analysis that many small changes are important for scar phenotype, the scar fibroblasts and normal skin fibroblasts clustered hierarchically into separate groups rather than into individual patient clusters (Fig. 3.2 and 3.3). This suggests that there are common pathways that are important to the scar phenotype, and that these can be detected above the noise of individual variation within the population. Although two genes were knocked down in the functional assays, there may be many genes contributing to the scar phenotype, and further studies modulating specific pathways may therefore be more effective than individual knockdown experiments. Further studies with increased power may elucidate which of these pathways are the most important in scar maintenance and further refine the target list.

The two genes that were most dramatically decreased in expression in the transcriptome analysis are also of interest, as both showed differential methylation at specific sites and both are important growth factors with roles likely to be relevant in scarring. Myostatin (MSTN) was the most downregulated gene in the scar fibroblasts (table 3.2), with a 4.4-fold decrease in expression compared to normal skin fibroblasts. MSTN is a growth factor that is part of the TGF-β superfamily, and its main role is the inhibition of myogenesis (muscle cell growth) (Sharma et al., 1999). Although once thought to be expressed only in skeletal muscle, it has been shown to be expressed in heart, adipose tissue (Sharma et al., 1999) and, more recently, epidermal and dermal tissue (Zhang et al., 2012). MSTN null mice showed delayed wound healing, due to an inhibition of TGF-β signalling, and reduced fibrosis in muscle after injury (McCroskery et al., 2005). Whether expression levels are increased in other scar types is currently unknown. However, overexpression of MSTN has been reported in Peyronie’s disease, a fibrotic disease characterised by a distinct circumscribed lesion in the penile tunica albuginea (Cantini et al., 2008). While beneficial during the initial wound healing phase, the continued expression of MSTN may be detrimental to the scar in the remodelling phase, causing fibrosis. Therefore, the downregulation of MSTN in mature normotrophic scar fibroblasts may be a normal response after healing has concluded, preventing further ECM secretion and contraction and thus fibrosis. MSTN levels in other skin scar pathologies must be examined in order to determine whether continued expression of MSTN contributes to abnormal scar formation.

Platelet-derived growth factor D (PDGFD) was the second most downregulated gene after MSTN, with a 3.4-fold decrease in normotrophic scar fibroblasts compared to normal skin fibroblasts (table 3.2). PDGF has long been established to have a significant impact on wound healing, where it acts as a potent mitogen for mesenchymal derived cells and promotes chemotaxis of wound fibroblasts (Martin, 1997). Overexpression of PDGFD in mouse epidermis caused macrophage accumulation in the skin, and this effect was enhanced in healing skin wounds (Uutela et al., 2004). Macrophages secrete growth factors important for wound healing, and removal of macrophages significantly impairs the healing process (Leibovich and Ross, 1975). However, overexpression of PDGFD in mice did not cause the wounds to heal any faster (Uutela et al., 2004).

Related to PDGFD downregulation, the platelet-derived growth factor receptor-like gene (PDGFRL) was also significantly downregulated in the normotrophic scar fibroblasts, although to a lesser extent (1.6-fold decrease, table 3.2). Little is known about PDGFRL, except for a few studies on colorectal cancer, where it acts as a tumour suppressor (Hou et al., 2013), and Behçet’s disease, a multisystem disorder characterised by ulcerations and skin lesions (Hou et al., 2013). That both PDGFD and PDGFRL are significantly downregulated increases the likelihood that there are changes to PDGF signalling rather than just changes in the expression of an individual gene. As with MSTN, while beneficial in the acute phase of wound healing, the presence of PDGFD and PDGFRL in the remodelling stages may promote fibrosis and thus be undesirable. Their downregulation in mature normotrophic scar fibroblasts may again reflect the normal response to prevent further recruitment of macrophages to the wound, fibroblast proliferation, wound contraction and ECM deposition, all of which contribute to fibrosis. This suggests that these genes may be altered in abnormal scar pathologies. PDGFD and PDGFRL levels in other skin scar pathologies, such as hypertrophic and keloid scars, must therefore be examined in order to determine whether their continued expression contributes to abnormal scar formation.

6.2.4. The role of MKX and FOXF2 and potential for therapeutic modulation

In this study MKX and FOXF2 were assessed for their functional role with respect to collagen matrix production in scar fibroblasts (chapter 5). Other studies have shown that MKX is important in tendon, skeletal, limb bud, testis and kidney tissues during embryogenesis in mice (Anderson et al., 2006), that MKX is expressed in mesoderm derived tissues, and that MKX expression is important in the differentiation of bone marrow derived mesenchymal stem cells (BMMSC) into tendon tissue, upregulating ECM and collagen production (Otabe et al., 2015). MKX is downregulated in human tendinopathy and, importantly for scarring, achieves its effects partially through the TGF-β signalling pathway (Liu et al., 2015). However, almost no work on MKX has been done in skin or fibrosis, and this study appears to be the first to investigate the effect of MKX not only in skin scarring but in skin tissue itself. The only available study on human skin observed underexpression of MKX in skin fibroblasts from two sisters carrying a mutation that causes deficiency in galactosyltransferase II (GalT-II), an enzyme involved in the synthesis of glycosaminoglycans (GAGs), which are major ECM components; this deficiency causes many severe connective tissue problems (Ritelli et al., 2015).

Previous studies in tendons have shown a direct correlation between MKX expression and collagen production levels (Otabe et al., 2015) and collagen fibril organisation (Ito et al., 2010; Liu et al., 2010). Consistent with those studies, decreased MKX expression caused decreased collagen production in this study, as shown by the MKX siRNA knockdowns (Fig. 5.7). However, there was no association between the increased MKX expression and an increase in the amount of collagen produced in the scar fibroblasts in 3 of the 4 patients tested (Fig. 5.2). A similar pattern was seen for the role of MKX in collagen fibril organisation, with collagen coherence decreasing in one of the two patients after MKX siRNA knockdown (Fig. 5.10). As with the amount of collagen, however, there was no association between the increased MKX expression and any increase in collagen coherence in the scar fibroblasts in 3 of the 4 patients tested (Fig. 5.4). This did not match the data from studies in tendon tissue, where overexpression of MKX caused an increase in collagen I in the tendons of rats in vitro as well as in vivo (Otabe et al., 2015; Liu et al., 2015). This may be due to differences between skin and tendon tissue, to differences in MKX expression levels between the experimentally overexpressing tendons and the endogenously overexpressing normotrophic scar fibroblasts, or simply to the lack of sufficient samples in this study. It is also possible that the use of an in vitro system in this study contributed to the differences observed.

This study is also the first to implicate FOXF2 in normal scarring. FOXF2 has previously been identified as being expressed during fetal development in many organs, mainly those that have endodermal or ectodermal derived epithelia surrounded by mesoderm derived mesenchyme (Aitola et al., 2000). Studies of conditional knockout mice have shown that FOXF2 is essential for normal extracellular matrix production and maintenance in intestinal smooth muscle. In particular, collagens are severely reduced in FOXF2 mutant intestine, which causes epithelial depolarization and tissue disintegration (Ormestad et al., 2006). This is consistent with the results of this study, in which FOXF2 siRNA knockdown caused a significant decrease in the amount of collagen produced (Fig. 5.7).

Few data are available on the effects of FOXF2 in skin and fibrosis. However, the closely related FOXF1 gene has been found to be upregulated in fibroblasts from patients with idiopathic pulmonary fibrosis (IPF) (Melboucy-Belkhir et al., 2014). The relationship of FOXF2 with EMT is also of interest, as the origin of scar fibroblasts, whether they migrate in from the edge of the wound or come from the circulation, is currently unknown. EMT occurs when epithelial cells transform into a mesenchymal phenotype, mediated by the loss of epithelial cell adhesion, increased expression of α-smooth muscle actin, basement membrane disruption and the cells becoming migratory and invasive (Liu, 2004). Published work on FOXF2 and its relationship to EMT has so far only investigated this role in prostate tissue (van der Heul-Nieuwenhuijsen et al., 2009b), where it has been suggested to regulate not only EMT but also the production of many ECM proteins (van der Heul-Nieuwenhuijsen et al., 2009a).

6.2.5. Limitations and strengths of this study

The results of this study suggest that the integrative genomic strategy for identification of key proteins involved in scar maintenance was successful. Nevertheless, there are a number of key limitations that need to be considered when evaluating the outcomes of this research. The key limitations of the differential methylation analysis were the small sample size, the necessity of using cultured cells and the relatively limited coverage (low resolution) of the methylation array. Practical considerations underpinned all of these limitations, including the time needed to complete the analysis, the availability of suitable patients for recruitment and the costs associated with whole methylome sequencing. The limitations were partially mitigated by using paired control DNA from a matched non-injured site, by using the methylation array with the largest number of CpG sites available on a commercial chip, and by minimising culture time while using the same cultured cells for the methylation, expression and phenotypic assays. Culturing the cells allowed the generation of a large multidimensional data set, and multiple lines of evidence increased confidence in the results generated. Even so, the underpowered nature of the study increases the possibility that important potential targets were missed, and it is likely that an increased sample size and greater resolution of the methylation analyses would identify additional targets. It will therefore be important to continue this work with a larger sample to identify a more complete range of genes involved in normal scar maintenance. Further validation of the identified targets in the wider population would also be important to determine how representative this relatively small sample is.

As with other similar expression studies, a liberal p-value and fold-change cut-off was required for the expression data because, with the low sample size, correction for multiple testing was found to be too stringent (Rodriguez et al., 2014). Also, the similar nature of the two sample types compared in this study means that the differences in gene expression may be small compared to those in many other comparative expression studies focused on more extreme pathologies. In addition, whilst the Affymetrix array provides complete coverage of protein coding genes, recent projects investigating the importance and expression of the remaining >95% of the DNA have clearly demonstrated extensive and important expression of non-protein coding regions of the DNA (Kellis et al., 2014; Clark et al., 2012), which are largely not interrogated using expression arrays (Mercer et al., 2009). Therefore significant differences in these other regions of the DNA and their possible consequences for cell function have not been explored in this study.
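The trade-off between a liberal cut-off and correction for multiple testing can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995) as the correction; the gene names, p-values, fold changes and thresholds are hypothetical placeholders and are not the values used in this study. Fold changes are written as signed values, with negative values denoting a decrease.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def liberal_filter(genes, p_cut=0.05, fc_cut=1.5):
    """Keep genes passing an unadjusted p-value and absolute fold-change cut-off.

    The default thresholds are illustrative assumptions only.
    """
    return [name for name, p, fc in genes if p < p_cut and abs(fc) >= fc_cut]


# Hypothetical array: two genes with small p-values among 998 unchanged genes.
genes = [("GENE_A", 0.001, -4.4), ("GENE_B", 0.004, 1.2)]
genes += [(f"GENE_{i}", 0.5, 1.0) for i in range(998)]

adjusted = benjamini_hochberg([p for _, p, _ in genes])
print("GENE_A adjusted p-value:", adjusted[0])
print("Genes passing the liberal filter:", liberal_filter(genes))
```

In this toy example the adjusted p-value for GENE_A no longer falls below 0.05 even though its unadjusted p-value is 0.001, whereas the combined unadjusted p-value and fold-change filter retains it. The cost of the liberal approach is a higher expected proportion of false positives, which is why independent validation of the resulting targets is needed.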

The two most significant limitations of the bioinformatics integration were the use of only the promoter region to define changes in methylation and the uneven state of knowledge of gene function available to the ontology and pathway analyses. Use of the proximal promoter region to define differentially methylated genes removed some genes with intragenic changes in methylation, and the study design does not allow for analysis of the effects of distally methylated enhancers or repressors influencing expression. With respect to gene ontology, some genes have a large amount of information about function whilst others have very little, and some exist only as hypothetical proteins. Such genes may be crucial in the formation and maintenance of scar, but were overlooked due to a lack of evidence and/or understanding of their role at present.
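The effect of restricting the integration to the promoter region can be sketched as follows. This is a simplified illustration and not the pipeline used in this study: the promoter window (1500 bp upstream to 500 bp downstream of the transcription start site), the gene names and the coordinates are all hypothetical assumptions.

```python
def promoter_offset(cpg_pos, tss, strand):
    """Signed distance of a CpG from the TSS; negative means upstream."""
    return cpg_pos - tss if strand == "+" else tss - cpg_pos


def integrate(dm_cpgs, de_genes, gene_info, upstream=1500, downstream=500):
    """Split differentially expressed genes by where their DM CpGs fall.

    Returns (targets, missed): genes with at least one differentially
    methylated CpG inside the promoter window, and genes whose
    differentially methylated CpGs all lie outside it.
    """
    targets, missed = set(), set()
    for gene, pos in dm_cpgs:
        if gene not in de_genes:
            continue  # methylation change without an expression change
        tss, strand = gene_info[gene]
        if -upstream <= promoter_offset(pos, tss, strand) <= downstream:
            targets.add(gene)
        else:
            missed.add(gene)
    return targets, missed - targets


# Hypothetical annotation: gene -> (TSS coordinate, strand).
gene_info = {
    "GENE_A": (10_000, "+"),
    "GENE_B": (50_000, "-"),
    "GENE_C": (80_000, "+"),
}
# Hypothetical differentially methylated CpGs: (gene, coordinate).
dm_cpgs = [("GENE_A", 9_200), ("GENE_B", 45_000), ("GENE_C", 79_900)]
# Hypothetical differentially expressed genes.
de_genes = {"GENE_A", "GENE_B"}

targets, missed = integrate(dm_cpgs, de_genes, gene_info)
print("promoter-defined targets:", sorted(targets))
print("excluded (intragenic change only):", sorted(missed))
```

Here GENE_B is differentially expressed and differentially methylated, but its methylation change lies within the gene body and so it is excluded by the promoter-only definition. Distal enhancers or repressors would be missed in the same way, because they are not assigned to a gene by this kind of window.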

Finally, the key limitation of the phenotypic assay was the limited detection of matrix differences between scar and control fibroblasts, most likely due to the artificial conditions in which matrix was produced, including the use of TGF-β to stimulate collagen production in both control and scar fibroblasts. This suggests that additional assays to verify the effects of the target genes on matrix production are required to truly understand the impact of these genes on scar fibroblast phenotype. It may be that the real effects are only observed in vivo, and ultimately the use of an animal model will be essential to determine whether targeting the genes identified can modulate scar formation or maintenance.

Despite these limitations, the study has identified a number of novel targets and provided preliminary evidence supporting a role for these targets in scar maintenance. This suggests that the integrated bioinformatics and subsequent in vitro target validation approach was successful.

6.3. Future work

6.3.1. Expanding the data set to increase confidence in current targets and identify additional genes involved in scar fibroblast phenotype

To progress the current data set, a key next step is to increase patient numbers to validate the identified genes in a wider population. The use of additional patient samples will also provide an opportunity to identify additional targets in a study with increased power. In addition, the study to date has focused solely on scars in Caucasian males. Further work could expand this study to look at the effect of gender as well as to investigate the role of ethnicity. Substantial evidence exists that keloid scar formation in particular is more common in certain ethnic populations, and it may be that the key drivers of scar formation and maintenance are not the same in all populations. This investigation will be critical before therapeutic translation, to identify whether the targets are suitable in both genders and in different populations or whether the suitable patient population for targeting these genes is more restricted. Identification of potential responders is an important part of clinical development, and expanding the study to cover different patient populations will be key to facilitating this. Capturing the whole methylome and transcriptome using whole genome bisulfite sequencing and RNA-seq would also improve the study as more information emerges about the role of methylation in other gene regions. This higher resolution of mapping may also be particularly beneficial in closely related samples such as normotrophic scar and normal skin.

6.3.1.1. Further work to validate MKX, FOXF2 and other targets in vitro

Although the scar-in-a-jar assay was used successfully, it clearly has limitations, and expanding the number of functional studies, and ultimately using in vivo models, will be critical to successful validation of target genes. Different collagen assays such as ELISA and western blots may be more useful in determining properties of collagen deposition, by showing finer details of collagen turnover dynamics in scar cells. Important ECM proteins other than collagen I, such as collagen III and fibronectin, must also be examined in order to obtain a more complete picture of ECM deposition. Phenotypic assays such as contraction and proliferation assays would also be useful to determine properties of normotrophic scar fibroblasts both before and after modulation of target genes, to characterise both the effect of the modulation and the cells themselves.

Modulation of the targets in vitro will also be expanded to approaches other than siRNA. Complete deletion of the target genes could be carried out using a clustered regularly interspaced short palindromic repeats (CRISPR) system, to complement the siRNA data. To test whether DNA methylation of the promoter region is the reason for the enduring scar phenotype, targeted modification of DNA methylation will be carried out, possibly using transcription activator-like effector-ten-eleven translocation 1 (TALE-TET1) fusion proteins (Maeder et al., 2013) or other targeted demethylation tools.

6.3.2. Validation for FOXF2 and MKX in vivo

Once the association between scarring and FOXF2, MKX and other possible targets is further established in vitro, work will move to in vivo studies in animals with a view to developing therapeutic treatments. Knockout mice for both genes exist (Ormestad et al., 2006; Ito et al., 2010), and the wound healing and scarring properties of these mice would be of great interest. Inducible knockout mice for these genes would also be important to prevent systemic effects of the gene knockout affecting wound healing and scarring. Inducing knockout of the target genes only in the skin, or at particular times in the wound healing or scarring process, will be important in determining the mechanistic effects of MKX and FOXF2 in fibrosis.

A pig model will also be established, in which the effect of knocking down MKX, FOXF2 and other possible target genes will be assessed on scars of varying maturity, using either topically applied siRNA, as established in other studies (Ritprajak et al., 2008), or local injections. If this work reinforces the importance of MKX and FOXF2 modulation in ameliorating scarring, further work to develop treatments targeting these genes will be warranted.

6.4. Conclusions

This study was the first to measure and integrate data from both the methylome and transcriptome of normal scar fibroblasts. This gave an overview of their methylation and expression profile, and the integration of these two datasets revealed two target genes, previously not associated with skin fibrosis, that may have a role in driving the formation and maintenance of scarring. Knockdown of these two genes decreased both the amount and the coherence of the collagen produced by the scar cells, suggesting a role for these targets in fibrosis. These findings add important new knowledge to the understanding of why scars form and persist throughout the life of the patient. Further work exploring the role of the target genes in scarring and the mechanisms underlying the changes is required for eventual therapeutic intervention.

References

Aarabi S, Longaker MT and Gurtner GC. (2007) Hypertrophic scar formation following burns and trauma: new approaches to treatment. PLoS Med 4: e234.

Abergel RP, Pizzurro D, Meeker CA, et al. (1985) Biochemical composition of the connective tissue in keloids and analysis of collagen metabolism in keloid fibroblast cultures. J Invest Dermatol 84: 384-390.

Aitola M, Carlsson P, Mahlapuu M, et al. (2000) Forkhead transcription factor FoxF2 is expressed in mesodermal tissues involved in epithelio-mesenchymal interactions. Dev. Dyn. 218: 136-149.

Anderson DM, Arredondo J, Hahn K, et al. (2006) Mohawk is a novel homeobox gene expressed in the developing mouse embryo. Dev. Dyn. 235: 792-801.

Antus B, Yao Y, Liu S, et al. (2001) Contribution of androgens to chronic allograft nephropathy is mediated by dihydrotestosterone. Kidney Int 60: 1955-1963.

Arakawa M, Hatamochi A, Mori Y, et al. (1996) Reduced collagenase gene expression in fibroblasts from hypertrophic scar tissue. Br J of Dermatol 134: 863-868.

Artlett CM, Sassi-Gaha S, Rieger JL, et al. (2011) The inflammasome activating caspase 1 mediates fibrosis and myofibroblast differentiation in systemic sclerosis. Arthritis & Rheum. 63: 3563-3574.

Aryee MJ, Wu Z, Ladd-Acosta C, et al. (2011) Accurate genome-scale percentage DNA methylation estimates from microarray data. Biostatistics 12: 197-210.

Ashcroft GS and Mills SJ. (2002) Androgen receptor–mediated inhibition of cutaneous wound healing. J Clin Invest 110: 615-624.

Ashitani J-i, Mukae H, Nakazato M, et al. (1998) Elevated pleural fluid levels of defensins in patients with empyema. CHEST Journal 113: 788-794.

Assenov Y, Muller F, Lutsik P, et al. (2014) Comprehensive analysis of DNA methylation data with RnBeads. Nat Meth 11: 1138-1140.

auf dem Keller U, Kumin A, Braun S, et al. (2006) Reactive oxygen species and their detoxification in healing skin wounds. J Investig Dermatol Symp P 11: 106-111.

Babraj JA, Cuthbertson DJ, Smith K, et al. (2005) Collagen synthesis in human musculoskeletal tissues and skin. Am J Physiol Endocrinol Metab 289: E864-869.

Bahar Halpern K, Vana T and Walker MD. (2014) Paradoxical role of DNA methylation in activation of FoxA2 gene expression during endoderm development. J Biol Chem 289: 23882-23892.

Bayat A, McGrouther DA and Ferguson MWJ. (2003) Skin scarring. BMJ 326: 88-92.

Bechtel W, McGoohan S, Zeisberg EM, et al. (2010) Methylation determines fibroblast activation and fibrogenesis in the kidney. Nat Med 16: 544-550.

Bender DA. (2014) A Dictionary of Food and Nutrition: Oxford University Press.

Benjamini Y and Hochberg Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statis. Soc. B 57: 289-300.

Benton M, Johnstone A, Eccles D, et al. (2015) An analysis of DNA methylation in human adipose tissue reveals differential modification of obesity genes before and after gastric bypass and weight loss. Genome Biol. 16: 8.

Berthet E, Chen C, Butcher K, et al. (2013) Smad3 binds scleraxis and mohawk and regulates tendon matrix organization. J Orthop Res 31: 1475-1483.

Berthod F, Germain L, Li H, et al. (2001) Collagen fibril network and elastic system remodeling in a reconstructed skin transplanted on nude mice. Matrix Biol. 20: 463-473.

Bibikova M and Fan J-B. (2009) GoldenGate® Assay for DNA methylation profiling. In: Tost J (ed) DNA Methylation. Humana Press, 149-163.

Bird A. (2002) DNA methylation patterns and epigenetic memory. Genes & Development 16: 6-21.

Bird AP and Wolffe AP. (1999) Methylation-induced repression—belts, braces, and chromatin. Cell 99: 451-454.

Boaru SG, Borkham-Kamphorst E, Tihaa L, et al. (2012) Expression analysis of inflammasomes in experimental models of inflammatory and fibrotic liver disease. J Inflamm (Lond) 9: 49.

Bock C. (2012) Analysing and interpreting DNA methylation data. Nat Rev Genet 13: 705-719.

Boess F, Kamber M, Romer S, et al. (2003) Gene expression in two hepatic cell lines, cultured primary hepatocytes, and liver slices compared to the in vivo liver gene expression in rats: possible implications for toxicogenomics use of in vitro systems. Toxicol Sci. 73: 386-402.

Brckalo T, Calzetti F, Pérez-Cabezas B, et al. (2010) Functional analysis of the CD300e receptor in human monocytes and myeloid dendritic cells. European Journal of Immunology 40: 722-732.

Brewster RC, Weinert FM, Garcia HG, et al. (2014) The transcription factor titration effect dictates level of gene expression. Cell 156: 1312-1323.

Brockes JP. (1997) Amphibian limb regeneration: rebuilding a complex structure. Science 276: 81-87.

Brown DL, Kao WWY and Greenhalgh DG. (1997) Apoptosis down-regulates inflammation under the advancing epithelial wound edge: Delayed patterns in diabetes and improvement with topical growth factors. Surgery 121: 372-380.

Brown JJ and Bayat A. (2009) Genetic susceptibility to raised dermal scarring. Br J Dermatol 161: 8-18.

Buie VC, Owings MF, DeFrances CJ, et al. (2010) National Hospital Discharge Survey: 2006 summary. Vital Health Stat 13.

Burridge PW, Thompson S, Millrod MA, et al. (2011) A universal system for highly efficient cardiac differentiation of human induced pluripotent stem cells that eliminates interline variability. PLoS ONE 6: e18293.

Campbell KHS, McWhir J, Ritchie WA, et al. (1996) Sheep cloned by nuclear transfer from a cultured cell line. Nature 380: 64-66.

Cantini LP, Ferrini MG, Vernet D, et al. (2008) Profibrotic role of myostatin in Peyronie's disease. JSM 5: 1607-1622.

Carvalho BS and Irizarry RA. (2010) A framework for oligonucleotide microarray preprocessing. Bioinformatics 26: 2363-2367.

Cass DL, Bullard KM, Sylvester KG, et al. (1997) Wound size and gestational age modulate scar formation in fetal wound repair. J Pediatr Surg 32: 411-415.

Cedar H and Bergman Y. (2009) Linking DNA methylation and histone modification: patterns and paradigms. Nat Rev Genet 10: 295-304.

Chambers SM, Fasano CA, Papapetrou EP, et al. (2009) Highly efficient neural conversion of human ES and iPS cells by dual inhibition of SMAD signaling. Nat Biotech 27: 275-280.

Chang HY, Chi JT, Dudoit S, et al. (2002) Diversity, topographic differentiation, and positional memory in human fibroblasts. Proc Natl Acad Sci U S A 99: 12877-12882.

Chen CZ and Raghunath M. (2009) Focus on collagen: in vitro systems to study fibrogenesis and antifibrosis state of the art. Fibrogenesis Tissue Repair 2: 7.

Chen D, Jarrell A, Guo C, et al. (2012a) Dermal β-catenin activity in response to epidermal Wnt ligands is required for fibroblast proliferation and hair follicle initiation. Development 139: 1522-1533.

Chen M, Shabashvili D, Nawab A, et al. (2012b) DNA methyltransferase inhibitor, zebularine, delays tumor growth and induces apoptosis in a genetically engineered mouse model of breast cancer. Mol Cancer Ther 11: 370-382.

Cheng JB and Cho RJ. (2012) Genetics and epigenetics of the skin meet deep sequence. J Invest Dermatol 132: 923-932.

Chiquet-Ehrismann R. (2004) Tenascins. Int J Biochem Cell Biol 36: 986-990.

Chodankar R, Chang C-H, Yue Z, et al. (2003) Shift of localized growth zones contributes to skin appendage morphogenesis: role of the Wnt/beta-catenin pathway. J Investig Dermatol 120: 20-26.

Choy M-K, Movassagh M, Goh H-G, et al. (2010) Genome-wide conserved consensus transcription factor binding motifs are hyper-methylated. BMC Genomics 11: 519-519.

Chung YL, Wang A-J and Yao L-F. (2004) Antitumor histone deacetylase inhibitors suppress cutaneous radiation syndrome: Implications for increasing therapeutic gain in cancer radiotherapy. Mol Cancer Ther 3: 317-325.

Clark MB, Johnston RL, Inostroza-Ponta M, et al. (2012) Genome-wide analysis of long noncoding RNA stability. Genome Research 22: 885-898.

Cogliati B, Mennecier G, Willebrords J, et al. (2016) Connexins, Pannexins, and Their Channels in Fibroproliferative Diseases. The Journal of Membrane Biology 249: 199-213.

Compton CC, Nadire KB, Regauer S, et al. (1998) Cultured human sole-derived keratinocyte grafts re-express site-specific differentiation after transplantation. Differentiation 64: 45-53.

Coussens LM, Fingleton B and Matrisian LM. (2002) Matrix metalloproteinase inhibitors and cancer—trials and tribulations. Science 295: 2387-2392.

Dalemans W, Barbry P, Champigny G, et al. (1991) Altered chloride ion channel kinetics associated with the ΔF508 cystic fibrosis mutation. Nature 354: 526-528.

Darby I, Skalli O and Gabbiani G. (1990) Alpha-smooth muscle actin is transiently expressed by myofibroblasts during experimental wound healing. Lab Invest 63: 21-29.

Dasu MRK, Hawkins HK, Barrow RE, et al. (2004) Gene expression profiles from hypertrophic scar fibroblasts before and after IL-6 stimulation. J Pathol 202: 476-485.

de Koning HD, Bergboer JGM, van den Bogaard EH, et al. (2012) Strong induction of AIM2 expression in human epidermis in acute and chronic inflammatory skin conditions. Exp Dermatol 21: 961-964.

de Koning HD, van Vlijmen-Willems IMJJ, Zeeuwen PLJM, et al. (2014) Absent in Melanoma 2 is predominantly present in primary melanoma and primary squamous cell carcinoma, but largely absent in metastases of both tumors. J Am Acad Dermatol 71: 1012-1015.

de Vega WC, Vernon SD and McGowan PO. (2014) DNA methylation modifications associated with chronic fatigue syndrome. PLoS ONE 9: e104757.

Deitch EA, Wheelahan TM and Rose MP. (1984) Hypertrophic burn scars: analysis of variables. Injury 16: 213.

Diegelmann RF, Cohen IK and McCoy BJ. (1979) Growth kinetics and collagen synthesis of normal skin, normal scar and keloid fibroblasts in vitro. J Cell Physiol 98: 341-346.

Djebali S, Davis CA, Merkel A, et al. (2012) Landscape of transcription in human cells. Nature 489: 101-108.

Du P, Zhang X, Huang C-C, et al. (2010) Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis. BMC Bioinformatics 11: 1-9.

Duan X, Ponomareva L, Veeranki S, et al. (2011) Differential roles for the interferon-inducible IFI16 and AIM2 innate immune sensors for cytosolic DNA in cellular senescence of human fibroblasts. Mol. Cancer Res. 9: 589-602.

Duboule D. (1995) Guidebook to the homeobox genes: Oxford University Press.

Eastoe JE. (1955) The amino acid composition of mammalian collagen and gelatin. Biochem J 61: 589-600.

Eaton C. (2012) Dupuytren's disease and related hyperproliferative disorders: principles, research, and clinical perspectives, Heidelberg; New York: Springer.

Eddy AA and Giachelli CM, with the technical assistance of McCulloch L, et al. (1995) Renal expression of genes that promote interstitial inflammation and fibrosis in rats with protein-overload proteinuria. Kidney Int 47: 1546-1557.

Ehrlich HP and Kelley SF. (1992) Hypertrophic scar: an interruption in the remodeling of repair-a laser doppler blood flow study. Plast Reconstr Surg 90: 993-998.

Estany S, Vicens-Zygmunt V, Llatjós R, et al. (2014) Lung fibrotic tenascin-C upregulation is associated with other extracellular matrix proteins and induced by TGFβ1. BMC Pulmonary Medicine 14: 120.

Esteller M. (2007) Cancer epigenomics: DNA methylomes and histone-modification maps. Nat Rev Genet 8: 286-298.

Fabbrocini G, Annunziata MC, D'Arco V, et al. (2010) Acne scars: pathogenesis, classification and treatment. Dermatol Res Pract. 2010.

Feinberg AP. (2004) The epigenetics of cancer etiology. Semin Cancer Biol 14: 427-432.

Feinberg AP. (2007) Phenotypic plasticity and the epigenetics of human disease. Nature 447: 433-440.

Ferguson MWJ and O'Kane S. (2004) Scar–free healing: from embryonic mechanisms to adult therapeutic intervention. Philos Trans R Soc Lond B Biol Sci. 359: 839-850.

Fernandes-Alnemri T, Yu J-W, Datta P, et al. (2009) AIM2 activates the inflammasome and cell death in response to cytoplasmic DNA. Nature 458: 509-513.

Fink SL and Cookson BT. (2005) Apoptosis, pyroptosis, and necrosis: mechanistic description of dead and dying eukaryotic cells. Infect Immun 73: 1907-1916.

Ford LC, King DF, Lagasse LD, et al. (1983) Increased androgen binding in keloids: a preliminary communication. J Dermatol Surg Oncol. 9: 545-547.

Forrester HB, Ivashkevich A, McKay MJ, et al. (2013) Follistatin is induced by ionizing radiation and potentially predictive of radiosensitivity in radiation-induced fibrosis patient derived fibroblasts. PLoS ONE 8: e77119.

Fuks F. (2005) DNA methylation and histone modifications: teaming up to silence genes. Curr. Opin. Genet. Dev 15: 490-495.

Gardiner DM. (2005) Ontogenetic decline of regenerative ability and the stimulation of human regeneration. Rejuvenation Res. 8: 141-153.

Gardiner DM, Carlson MRJ and Roy S. (1999) Towards a functional analysis of limb regeneration. Semin Cell Dev Biol 10: 385-393.

Gargett CE, Chan RWS and Schwab KE. (2008) Hormone and growth factor signaling in endometrial renewal: Role of stem/progenitor cells. Mol Cell Endocrinol 288: 22-29.

Garner WL, Karmiol S, Rodriguez JL, et al. (1993) Phenotypic differences in cytokine responsiveness of hypertrophic scar versus normal dermal fibroblasts. J Investig Dermatol 101: 875-879. Gilliver SC, Wu F and Ashcroft GS. (2003) Regulatory roles of androgens in cutaneous wound healing. Thromb Haemost 90: 978-985. Grabiec AM and Hussell T. (2016) The role of airway macrophages in apoptotic cell clearance following acute and chronic lung inflammation. Seminars in Immunopathology 38: 409-423. Graca I, J. Sousa E, Baptista T, et al. (2014) Anti-tumoral effect of the non-nucleoside DNMT inhibitor RG108 in human prostate cancer cells. Curr Pharm Des 20: 1803-1811. Grigoriadis AE, Kennedy M, Bozec A, et al. (2010) Directed differentiation of hematopoietic precursors and functional osteoclasts from human ES and iPS cells. Blood. 115: 2769-2776. Gurdon JB. (1962) The developmental capacity of nuclei taken from intestinal epithelium cells of feeding tadpoles. J Embryol Exp Morphol 10: 622-640. Hagemann S, Heil O, Lyko F, et al. (2011) Azacytidine and decitabine induce gene-specific and non-random DNA demethylation in human cancer cell lines. PLoS ONE 6: e17388. Hahn E, Wick G, Pencev D, et al. (1980) Distribution of basement membrane proteins in normal and fibrotic human liver: collagen type IV, laminin, and fibronectin. Gut 21: 63-71. Han M, Yang X, Taylor G, et al. (2005) Limb regeneration in higher vertebrates: Developing a roadmap. Anat Rec B New Anat 287B: 14-24. Hangauer MJ, Vaughn IW and McManus MT. (2013) Pervasive transcription of the human genome produces thousands of previously unidentified long intergenic noncoding RNAs. PLoS Genet 9: e1003569. Hardy MA. (1989) The biology of scar formation. Phys Ther 69: 1014-1024. Harrop AR, Ghahary A, Scott PG, et al. (1995) Regulation of collagen synthesis and mRNA expression in normal and hypertrophic scar fibroblasts in vitro by interferon-γ. J Surg Res 58: 471-477. Heinemeier KM, Schjerling P, Heinemeier J, et al.
(2013) Lack of tissue renewal in human adult Achilles tendon is revealed by nuclear bomb 14C. FASEB J.

Hemmatazad H, Rodrigues HM, Maurer B, et al. (2009) Histone deacetylase 7, a potential target for the antifibrotic treatment of systemic sclerosis. Arthritis & Rheum. 60: 1519-1529. Herndon DN. (2007) Total burn care: Elsevier Health Sciences. Hou S, Xiao X, Zhou Y, et al. (2013) Genetic variant on PDGFRL associated with Behçet disease in Chinese Han populations. Hum. Mut. 34: 74-78. Igarashi A, Nashiro K, Kikuchi K, et al. (1996) Connective tissue growth factor gene expression in tissue sections from localized scleroderma, keloid, and other fibrotic skin disorders. J Invest Dermatol 106: 729-733. IHGSC. (2001) Initial sequencing and analysis of the human genome. Nature 409: 860-921. Illingworth CM. (1974) Trapped fingers and amputated finger tips in children. J Pediatr Surg 9: 853-858. Irizarry RA, Ladd-Acosta C, Wen B, et al. (2009) The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores. Nat Genet 41: 178-186. Issa J-PJ and Kantarjian HM. (2009) Targeting DNA methylation. Clin Cancer Res 15: 3938-3946. Ito Y, Toriuchi N, Yoshitaka T, et al. (2010) The Mohawk homeobox gene is a critical regulator of tendon differentiation. Proc Natl Acad Sci U S A 107: 10538-10542. Jackson AL and Linsley PS. (2010) Recognizing and avoiding siRNA off-target effects for target identification and therapeutic application. Nat Rev Drug Discov 9: 57-67. Jacoby JJ, Kalinowski A, Liu M-G, et al. (2003) Cardiomyocyte-restricted knockout of STAT3 results in higher sensitivity to inflammation, cardiac fibrosis, and heart failure with advanced age. Proc Natl Acad Sci U S A 100: 12929-12934. Johansson JA and Headon DJ. (2014) Regionalisation of the skin. Semin Cell Dev Biol 25–26: 3-10. Johns MM, Kolachala V, Berg E, et al. (2012) Radiation fibrosis of the vocal fold: From man to mouse. Laryngoscope 122: SS107-SS125. Jones PA. (2012) Functions of DNA methylation: islands, start sites, gene bodies and beyond.
Nat Rev Genet 13: 484-492. Jones PA and Takai D. (2001) The role of DNA methylation in mammalian epigenetics. Science 293: 1068-1070.

Juneja SC. (2013) Cellular distribution and gene expression profile during flexor tendon graft repair: a novel tissue engineering approach. J. Tissue Eng. 4. Kador KE, Montero RB, Venugopalan P, et al. (2013) Tissue engineering the retinal ganglion cell nerve fiber layer. Biomaterials 34: 4242-4250. Kalluri R and Weinberg RA. (2009) The basics of epithelial-mesenchymal transition. J Clin Invest 119: 1420-1428. Kalluri R and Zeisberg M. (2006) Fibroblasts in cancer. Nat Rev Cancer 6: 392-401. Karolchik D, Hinrichs AS, Furey TS, Roskin KM, Sugnet CW, Haussler D, Kent WJ. (2004) The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 32(Database Issue): D493-496. Kellis M, Wold B, Snyder MP, et al. (2014) Defining functional DNA elements in the human genome. Proc Natl Acad Sci U S A 111: 6131-6138. Kendall AC and Nicolaou A. (2013) Bioactive lipid mediators in skin inflammation and immunity. Prog Lipid Res 52: 141-164. Kerby JD, McGwin GJ, George RL, et al. (2006) Sex differences in mortality after burn injury: results of analysis of the National Burn Repository of the American Burn Association. J Burn Care Res 27: 452-456. Kiecolt-Glaser JK, Marucha PT, Mercado AM, et al. (1995) Slowing of wound healing by psychological stress. The Lancet 346: 1194-1196. Kim W, Moon S-O, Lee SY, et al. (2006) COMP–angiopoietin-1 ameliorates renal fibrosis in a unilateral ureteral obstruction model. J Am Soc Nephrol 17: 2474-2483. Klein MB. (2007) Thermal, chemical and electrical injuries. Grabb and Smith’s Plastic Surgery: 132-149. Klinge U, Si ZY, Zheng H, et al. (2000) Abnormal collagen I to III distribution in the skin of patients with incisional hernia. Eur Surg Res 32: 43-48. Knapp AC, Franke WW, Heid H, et al. (1986) Cytokeratin No. 9, an epidermal type I keratin characteristic of a special program of keratinocyte differentiation displaying body site specificity. J Cell Biol 103: 657-667. Knight D, Mutsaers SE and Prêle CM. (2011) STAT3 in tissue fibrosis: Is there a role in the lung?
Pulm Pharmacol Ther 24: 193-198. Koch CM, Suschek CV, Lin Q, et al. (2011) Specific age-associated DNA methylation changes in human dermal fibroblasts. PLoS ONE 6: e16679.

Kragl M, Knapp D, Nacu E, et al. (2009) Cells keep a memory of their tissue origin during axolotl limb regeneration. Nature 460: 60-65. Kulaeva OI, Draghici S, Tang L, et al. (2003) Epigenetic silencing of multiple interferon pathway genes after cellular immortalization. Oncogene 22: 4118. Lareu RR, Zeugolis DI, Abu-Rub M, et al. (2010) Essential modification of the Sircol Collagen Assay for the accurate quantification of collagen content in complex protein solutions. Acta Biomaterialia 6: 3146-3151. Larson BJ, Longaker MT and Lorenz HP. (2010) Scarless fetal wound healing: a basic science review. Plast Reconstr Surg 126: 1172-1180. Laurent GJ. (1987) Dynamic state of collagen: pathways of collagen degradation in vivo and their possible role in regulation of collagen mass. Am J Physiol. 252: C1-C9. Lavker RM. (1979) Structural alterations in exposed and unexposed aged skin. J Investig Dermatol 73: 59-66. Law JA and Jacobsen SE. (2010) Establishing, maintaining and modifying DNA methylation patterns in plants and animals. Nat. Rev. Genet. 11: 204-220. Lebkowski J. (2011) GRNOPC1: the world’s first embryonic stem cell-derived therapy. Regen Med 6: 11-13. Lehmann OJ, Sowden JC, Carlsson P, et al. (2003) Fox's in development and disease. Trends Genet 19: 339-344. Lehnertz B, Ueda Y, Derijck AA, et al. (2003) Suv39h-mediated histone H3 lysine 9 methylation directs DNA methylation to major satellite repeats at pericentric heterochromatin. Curr Biol 13: 1192-1200. Lehoczky JA, Robert B and Tabin CJ. (2011) Mouse digit tip regeneration is mediated by fate-restricted progenitor cells. Proc Natl Acad Sci U S A 108: 20609-20614. Leibovich SJ and Ross R. (1975) The role of the macrophage in wound repair. A study with hydrocortisone and antimacrophage serum. Am J Pathol 78: 71-100. Li J, Chen J and Kirsner R. (2007) Pathophysiology of acute wound healing. Clin Dermatol 25: 9-18. Li Y, Daniel M and Tollefsbol T. (2011) Epigenetic regulation of caloric restriction in aging. 
BMC Medicine 9: 98. Li Y, Sawalha AH and Lu Q. (2009) Aberrant DNA methylation in skin diseases. J Dermatol Sci 54: 143-149.

Liechty KW, Kim HB, Adzick NS, et al. (2000) Fetal wound repair results in scar formation in interleukin-10–deficient mice in a syngeneic murine model of scarless fetal wound repair. J Pediatr Surg 35: 866-873. Liesegang TJ. (1997) Apoptosis, necrosis, and proliferation: Possible implications in the etiology of keloids. Am J Ophthalmol 123: 724. Lim CP, Phan TT, Lim IJ, et al. (2006) Stat3 contributes to keloid pathogenesis via promoting collagen production, cell proliferation and migration. Oncogene 25: 5416-5425. Linares HA. (1996) From wound to scar. Burns 22: 339-352. Linares HA and Larson DL. (1974) Early differential diagnosis between hypertrophic and nonhypertrophic healing. J Invest Dermatol 62: 514-516. Linge C, Richardson J, Vigor C, et al. (2005) Hypertrophic scar cells fail to undergo a form of apoptosis specific to contractile collagen - the role of tissue transglutaminase. J Investig Dermatol 125: 72-82. Lister R, Pelizzola M, Dowen RH, et al. (2009) Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462: 315-322. Lister R, Pelizzola M, Kida YS, et al. (2011) Hotspots of aberrant epigenomic reprogramming in human induced pluripotent stem cells. Nature 471: 68-73. Liu H, Zhang C, Zhu S, et al. (2015) Mohawk promotes the tenogenesis of mesenchymal stem cells through activation of the TGFβ signaling pathway. STEM CELLS 33: 443-455. Liu W, Watson SS, Lan Y, et al. (2010) The atypical homeodomain transcription factor mohawk controls tendon morphogenesis. Mol Cell Biol 30: 4797-4807. Liu X, Xu J, Brenner DA, et al. (2013) Reversibility of Liver Fibrosis and Inactivation of Fibrogenic Myofibroblasts. Current Pathobiology Reports 1: 209-214. Liu Y. (2004) Epithelial to mesenchymal transition in renal fibrogenesis: pathologic significance, molecular mechanism, and therapeutic intervention. J Am Soc Nephrol 15: 1-12. Lockhart RD. (1965) Anatomy of the human body / by R.D. Lockhart, G.F. Hamilton [and] F.W.
Fyfe, London: Faber & Faber Ltd. Lu H, Liu X, Deng Y, et al. (2013) DNA methylation, a hand behind neurodegenerative diseases. Front Aging Neurosci. 5: 85. Machesney M, Tidman N, Waseem A, et al. (1998) Activated keratinocytes in the epidermis of hypertrophic scars. Am J Pathol 152: 1133-1141.

Madden JW and Peacock EE, Jr. (1971) Studies on the biology of collagen during wound healing. 3. Dynamic metabolism of scar collagen and remodeling of dermal wounds. Ann Surg 174: 511-520. Maeder ML, Angstman JF, Richardson ME, et al. (2013) Targeted DNA demethylation and activation of endogenous genes using programmable TALE-TET1 fusion proteins. Nat Biotech 31: 1137-1142. Mann J, Oakley F, Akiboye F, et al. (2007) Regulation of myofibroblast transdifferentiation by DNA methylation and MeCP2: implications for wound healing and fibrogenesis. Cell Death Differ 14: 275-285. Mariani TJ, Budhraja V, Mecham BH, et al. (2003) A variable fold change threshold determines significance for expression microarrays. FASEB J 17: 321-323. Martin KJ, Kritzman BM, Price LM, et al. (2000) Linking gene expression patterns to therapeutic groups in breast cancer. Cancer Res 60: 2232-2238. Martin P. (1997) Wound healing--aiming for perfect skin regeneration. Science 276: 75-81. Masuda H, Maruyama T, Hiratsu E, et al. (2007) Noninvasive and real-time assessment of reconstructed functional human endometrium in NOD/SCID/γcnull immunodeficient mice. Proc Natl Acad Sci U S A 104: 1925-1930. McCroskery S, Thomas M, Platt L, et al. (2005) Improved muscle healing through enhanced regeneration and reduced fibrosis in myostatin-null mice. J Cell Sci 118: 3531-3541. McGrath JA and Uitto J. (2010) Anatomy and organization of human skin. Rook's Textbook of Dermatology. Wiley-Blackwell, 1-53. McKleroy W, Lee T-H and Atabai K. (2013) Always cleave up your mess: targeting collagen degradation to treat tissue fibrosis. Am J Physiol Lung Cell Mol Physiol 304: L709-L721. McNeil PL and Kirchhausen T. (2005) An emergency response team for membrane repair. Nat Rev Mol Cell Biol 6: 499-505. Melboucy-Belkhir S, Pradère P, Tadbiri S, et al. (2014) Forkhead Box F1 represses cell growth and inhibits COL1 and ARPC2 expression in lung fibroblasts in vitro. Am J Physiol Lung Cell Mol Physiol 307: L838-L847.
Mercer TR, Dinger ME and Mattick JS. (2009) Long non-coding RNAs: insights into functions. Nat Rev Genet 10: 155-159. Michalopoulos GK and DeFrances MC. (1997) Liver Regeneration. Science 276: 60-66.

Moiseeva EP, Straatman KR, Leyland ML, et al. (2014) CADM1 controls actin cytoskeleton assembly and regulates extracellular matrix adhesion in human mast cells. PLoS ONE 9. Mooradian AD, Morley JE and Korenman SG. (1987) Biological actions of androgens. Endocr Rev 8: 1-28. Mori L, Bellini A, Stacey MA, et al. (2005) Fibrocytes contribute to the myofibroblast population in wounded skin and originate from the bone marrow. Exp Cell Res 304: 81-90. Morihara K, Takai S, Takenaka H, et al. (2006) Cutaneous tissue angiotensin-converting enzyme may participate in pathologic scar formation in human skin. J Am Acad Dermatol 54: 251-257. Most C. (1992) Molecular features of CD34: a hemopoietic progenitor cell-associated molecule. Leukemia 6: 31-36. Muir IF. (1990) On the nature of keloid and hypertrophic scars. Br J Plast Surg 43: 61-69. Murray CJL, Vos T, Lozano R, et al. (2012) Disability-adjusted life years (DALYs) for 291 diseases and injuries in 21 regions, 1990–2010: a systematic analysis for the Global Burden of Disease Study 2010. The Lancet 380: 2197-2223. Naumova OY, Lee M, Koposov R, et al. (2012) Differential patterns of whole-genome DNA methylation in institutionalized children and children raised by their biological parents. Dev Psychopathol 24: 143-155. Nelson TJ, Martinez-Fernandez A, Yamada S, et al. (2009) Repair of acute myocardial infarction by human stemness factors induced pluripotent stem cells. Circulation 120: 408-416. Nestor CE, Ottaviano R, Reinhardt D, et al. (2015) Rapid reprogramming of epigenetic and transcriptional profiles in mammalian culture systems. Genome Biology 16: 11. Niesen MI, Osborne AR, Yang H, et al. (2005) Activation of a methylated promoter mediated by a sequence-specific DNA-binding protein, RFX. J Biol Chem 280: 38914-38922. NIH. (2014) A Brief Guide to Genomics. Available at: http://www.genome.gov/19016904.

Ning P, Peng Y, Liu D, et al. (2016) Tetrandrine induces microRNA differential expression in human hypertrophic scar fibroblasts in vitro. Genetics and molecular research: GMR 15. Nusse R and Varmus H. (2012) Three decades of Wnts: a personal perspective on how a scientific field developed. EMBO J. 31: 2670-2684. O'Sullivan ST, O'Shaughnessy M and O'Connor TPF. (1996) Aetiology and management of hypertrophic scars and keloids. Ann R Coll Surg Engl 78: 168-175. Ogata H, Chinen T, Yoshida T, et al. (2006) Loss of SOCS3 in the liver promotes fibrosis by enhancing STAT3-mediated TGF-β1 production. Oncogene 25: 2520-2530. Ohshima H. (2003) Genetic and epigenetic damage induced by reactive nitrogen species: implications in carcinogenesis. Toxicol Lett 140–141: 99-104. Oikarinen A, Autio P, Kiistala U, et al. (1992) A new method to measure type I and III collagen synthesis in human skin in vivo: demonstration of decreased collagen synthesis after topical glucocorticoid treatment. J Invest Dermatol 98: 220-225. Olivieri J, Smaldone S and Ramirez F. (2010) Fibrillin assemblies: extracellular determinants of tissue formation and fibrosis. Fibrogenesis Tissue Repair 3: 24. Ormestad M, Astorga J, Landgren H, et al. (2006) Foxf1 and Foxf2 control murine gut development by limiting mesenchymal Wnt signaling and promoting extracellular matrix production. Development 133: 833-843. Otabe K, Nakahara H, Hasegawa A, et al. (2013) The transcription factor mohawk plays an important role for maintaining human ACL homeostasis and ligament/tendon differentiation of mesenchymal stem cells. Osteoarthritis Cartilage 21: S48. Otabe K, Nakahara H, Hasegawa A, et al. (2015) Transcription factor Mohawk controls tenogenic differentiation of bone marrow mesenchymal stem cells in vitro and in vivo. J Orthop Res 33: 1-8. Paddock HN, Schultz GS, Baker HV, et al. (2003) Analysis of gene expression patterns in human postburn hypertrophic scars. J Burn Care Res 24: 371-377.
Pang M, Ma L, Gong R, et al. (2010) A novel STAT3 inhibitor, S3I-201, attenuates renal interstitial fibroblast activation and interstitial fibrosis in obstructive nephropathy. Kidney Int 78: 257-268.

Papamitsou T, Barlagiannis D, Papaliagkas V, et al. (2011) Testosterone-induced hypertrophy, fibrosis and apoptosis of cardiac cells – an ultrastructural and immunohistochemical study. Med Sci Monit. 17: BR266-BR273. Peck MD. (2011) Epidemiology of burns throughout the world. Part I: Distribution and risk factors. Burns 37: 1087-1100. Penuela S, Kelly JJ, Churko JM, et al. (2014) Panx1 Regulates Cellular Properties of Keratinocytes and Dermal Fibroblasts in Skin Development and Wound Healing. Journal of Investigative Dermatology 134: 2026-2035. Perugorria MJ, Wilson CL, Zeybel M, et al. (2012) Histone methyltransferase ASH1 orchestrates fibrogenic gene transcription during myofibroblast transdifferentiation. Hepatology 56: 1129-1139. Porubsky SF, G. Rampoldi, F. Kuppe, C. Gröne, H.-J. (2014) 10th International Podocyte Conference, Freiburg, June 4-6, 2014: Abstracts "Lack of Protein Fatty-Acylation Can Cause Focal Segmental Glomerulosclerosis (FSGS)". Nephron Clinical Practice 126: 159-228. Profyris C, Tziotzios C and Do Vale I. (2012) Cutaneous scarring: pathophysiology, molecular mechanisms, and scar reduction therapeutics: part I. The molecular basis of scar formation. J Am Acad Dermatol 66: 1-10. Qiu J. (2006) Epigenetics: Unfinished symphony. Nature 441: 143-145. Rabinovich EI, Kapetanaki MG, Steinfeld I, et al. (2012) Global methylation patterns in idiopathic pulmonary fibrosis. PLoS ONE 7: e33770. Ramsahoye BH, Biniszkiewicz D, Lyko F, et al. (2000) Non-CpG methylation is prevalent in embryonic stem cells and may be mediated by DNA methyltransferase 3a. Proc Natl Acad Sci U S A 97: 5237-5242. Rea S, Giles NL, Webb S, et al. (2009) Bone marrow-derived cells in the healing burn wound—more than just inflammation. Burns 35: 356-364. Reardon S and Cyranoski D. (2014) Japan stem-cell trial stirs envy. Nature 513: 287-288. Reik W. (2007) Stability and flexibility of epigenetic gene regulation in mammalian development. Nature 447: 425-432.
Reilkoff RA, Bucala R and Herzog EL. (2011) Fibrocytes: emerging effector cells in chronic inflammation. Nat Rev Immunol 11: 427-435. Ritelli M, Chiarelli N, Zoppi N, et al. (2015) Insights in the etiopathology of galactosyltransferase II (GalT-II) deficiency from transcriptome-wide expression profiling of skin fibroblasts of two sisters with compound heterozygosity for two novel B3GALT6 mutations. Mol Genet Metab Rep 2: 1-15. Ritprajak P, Hashiguchi M and Azuma M. (2008) Topical application of cream-emulsified CD86 siRNA ameliorates allergic skin disease by targeting cutaneous dendritic cells. Mol Ther 16: 1323-1330. Roberson EDO, Liu Y, Ryan C, et al. (2012) A subset of methylated CpG sites differentiate psoriatic from normal skin. J Invest Dermatol 132: 583-592. Rodriguez E, Baurecht H, Wahn AF, et al. (2014) An integrated epigenetic and transcriptomic analysis reveals distinct tissue-specific patterns of DNA methylation associated with atopic dermatitis. J Invest Dermatol 134: 1873-1883. Roy S and Gatien S. (2008) Regeneration in axolotls: a model to aim for! Exp Gerontol 43: 968-973. Roy S and Lévesque M. (2006) Limb Regeneration in Axolotl: Is It Superhealing? Sci World J 6. Rucklidge GJ, Milne G, McGaw BA, et al. (1992) Turnover rates of different collagen types measured by isotope ratio mass spectrometry. Biochimica et Biophysica Acta (BBA) - General Subjects 1156: 57-61. Russell SB, Russell JD, Trupin KM, et al. (2010) Epigenetically altered wound healing in keloid fibroblasts. J Invest Dermatol 130: 2489-2496. Saladin KS. (2001) Anatomy and Physiology: The Unity of Form and Function, New York, NY, USA: McGraw-Hill. Sanders YY, Ambalavanan N, Halloran B, et al. (2012) Altered DNA Methylation Profile in Idiopathic Pulmonary Fibrosis. Am J Respir Crit Care Med 186: 525-535. Sandoval J, Heyn H, Moran S, et al. (2011) Validation of a DNA methylation microarray for 450,000 CpG sites in the human genome. Epigenetics 6: 692-702. Sassi M-L, Jukkola A, Riekki R, et al. (2001) Type I collagen turnover and cross-linking are increased in irradiated skin of breast cancer patients. Radiother Oncol 58: 317-323. Savagner P, Kusewitt DF, Carver EA, et al.
(2005) Developmental transcription factor slug is required for effective re-epithelialization by adult keratinocytes. J Cell Physiol 202: 858-866.

Schena M, Shalon D, Davis RW, et al. (1995) Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270: 467-470. Schiller M, Javelaud D and Mauviel A. (2004) TGF-β-induced SMAD signaling and gene regulation: consequences for extracellular matrix remodeling and wound healing. J Dermatol Sci 35: 83-92. Schindelin J, Arganda-Carreras I, Frise E, et al. (2012) Fiji: an open-source platform for biological-image analysis. Nat Meth 9: 676-682. Schneider EL and Mitsui Y. (1976) The relationship between in vitro cellular aging and in vivo human age. Proc Natl Acad Sci U S A 73: 3584-3588. Schroder K, Muruve DA and Tschopp J. (2009) Innate immunity: cytoplasmic DNA sensing by the AIM2 inflammasome. Curr Biol 19: R262-R265. Schwartz SD, Hubschman J-P, Heilwell G, et al. (2012) Embryonic stem cell trials for macular degeneration: a preliminary report. The Lancet 379: 713-720. Seidman JG and Seidman C. (2002) Transcription factor haploinsufficiency: when half a loaf is not enough. J Clin Invest 109: 451-455. Seifert O, Bayat A, Geffers R, et al. (2008) Identification of unique gene expression patterns within different lesional sites of keloids. Wound Repair Regen 16: 254-265. Seok J, Warren HS, Cuenca AG, et al. (2013) Genomic responses in mouse models poorly mimic human inflammatory diseases. Proc Natl Acad Sci U S A 110: 3507-3512. Sharma A, Mehan MM, Sinha S, et al. (2009) Trichostatin A inhibits corneal haze in vitro and in vivo. Invest Ophthalmol Vis Sci 50: 2695-2701. Sharma M, Kambadur R, Matthews KG, et al. (1999) Myostatin, a transforming growth factor-β superfamily member, is expressed in heart muscle and is upregulated in cardiomyocytes after infarct. J Cell Physiol 180: 1-9. Sharma S, Kelly TK and Jones PA. (2010) Epigenetics in cancer. Carcinogenesis 31: 27-36. Shaw T and Martin P. (2009a) Epigenetic reprogramming during wound healing: loss of polycomb-mediated silencing may enable upregulation of repair genes.
EMBO Rep 10: 881-886. Shaw TJ and Martin P. (2009b) Wound repair at a glance. J Cell Sci 122: 3209-3213.

Sheridan RL and Tompkins RG. (2004) What's new in burns and metabolism. J Am Coll Surg 198: 243-263. Shiio Y and Eisenman RN. (2003) Histone sumoylation is associated with transcriptional repression. Proc Natl Acad Sci U S A 100: 13225-13230. Simpkins SB, Bocker T, Swisher EM, et al. (1999) MLH1 promoter methylation and gene silencing is the primary cause of microsatellite instability in sporadic endometrial cancers. Hum Mol Genet 8: 661-666. Smith JC, Boone BE, Opalenik SR, et al. (2007) Gene profiling of keloid fibroblasts shows altered expression in multiple fibrosis-associated pathways. J Invest Dermatol 128: 1298-1310. Smyth GK. (2004) Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol 3: 1. Sodek J. (1976) A new approach to assessing collagen turnover by using a micro-assay. A highly efficient and rapid turnover of collagen in rat periodontal tissues. Biochem J 160: 243-246. Soneja A, Drews M and Malinski T. (2005) Role of nitric oxide, nitroxidative and oxidative stress in wound healing. Pharmacological Reports 57: 108. Song M-A, Tiirikainen M, Kwee S, et al. (2013) Elucidating the landscape of aberrant DNA methylation in hepatocellular carcinoma. PLoS ONE 8: e55761. Soriano AO, Yang H, Faderl S, et al. (2007) Safety and clinical activity of the combination of 5-azacytidine, valproic acid, and all-trans retinoic acid in acute myeloid leukemia and myelodysplastic syndrome. Blood 110: 2302-2308. Stefanovic L and Stefanovic B. (2012) Role of cytokine receptor-like factor 1 in hepatic stellate cells and fibrosis. World J Hepatol 4: 356-364. Strahl BD and Allis CD. (2000) The language of covalent histone modifications. Nature 403: 41-45. Stresemann C, Brueckner B, Musch T, et al. (2006) Functional diversity of DNA methyltransferase inhibitors in human cancer cell lines. Cancer Res 66: 2794-2800. Subramanian A, Tamayo P, Mootha VK, et al.
(2005) Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102: 15545-15550. Sun G, Reddy MA, Yuan H, et al. (2010) Epigenetic histone methylation modulates fibrotic gene expression. J Am Soc Nephrol.

Takahashi K, Tanabe K, Ohnuki M, et al. (2007) Induction of pluripotent stem cells from adult human fibroblasts by defined factors. Cell 131: 861-872. Takahashi K and Yamanaka S. (2006) Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 126: 663-676. Takai D and Jones PA. (2002) Comprehensive analysis of CpG islands in human chromosomes 21 and 22. Proc Natl Acad Sci U S A 99: 3740-3745. Taub R. (2004) Liver regeneration: from myth to mechanism. Nat Rev Mol Cell Biol 5: 836-847. Team RC. (2015) R: A Language and Environment for Statistical Computing. Available at: http://www.R-project.org. Teschendorff AE, Marabita F, Lechner M, et al. (2013) A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. Bioinformatics 29: 189-196. Teschendorff AE, Menon U, Gentry-Maharaj A, et al. (2009) An epigenetic signature in peripheral blood predicts active ovarian cancer. PLoS ONE 4: e8274. Thompson CM, Sood RF, Honari S, et al. (2015) What score on the Vancouver Scar Scale constitutes a hypertrophic scar? Results from a survey of North American burn-care providers. Burns. Togo T, Krasieva TB and Steinhardt RA. (2000) A decrease in membrane tension precedes successful cell-membrane repair. Mol Biol Cell 11: 4339-4346. Torres-Martín M, Lassaletta L, de Campos JM, et al. (2015) Genome-wide methylation analysis in vestibular schwannomas shows putative mechanisms of gene expression modulation and global hypomethylation at the HOX gene cluster. Gene Chromosome Cancer 54: 197-209. Tredget EE, Nedelec B, Scott PG, et al. (1997) Hypertrophic scars, keloids, and contractures: the cellular and molecular basis for therapy. Surg Clin North Am 77: 701-730. Tredget EE, Wang R, Shen Q, et al. (2000) Transforming growth factor-beta mRNA and protein in hypertrophic scar tissues and fibroblasts: antagonism by IFN-alpha and IFN-gamma in vitro and in vivo.
J Interferon Cytokine Res 20: 143-151. Triche TJ, Weisenberger DJ, Van Den Berg D, et al. (2013) Low-level processing of Illumina Infinium DNA Methylation BeadArrays. Nucleic Acids Res 41: e90.

Tsou R, Cole JK, Nathens AB, et al. (2000) Analysis of hypertrophic and normal scar gene expression with cDNA microarrays. J Burn Care Res 21: 541-550. Tzouvelekis A, Harokopos V, Paparountas T, et al. (2007) Comparative Expression Profiling in Pulmonary Fibrosis Suggests a Role of Hypoxia-inducible Factor-1α in Disease Pathogenesis. American Journal of Respiratory and Critical Care Medicine 176: 1108-1119. Uhlén M, Fagerberg L, Hallström BM, et al. (2015) Tissue-based map of the human proteome. Science 347. Uitto J, Olsen DR and Fazio MJ. (1989) Extracellular matrix of the skin: 50 years of progress. J Invest Dermatol 92: 61S-77S. Uutela M, Wirzenius M, Paavonen K, et al. (2004) PDGF-D induces macrophage recruitment, increased interstitial pressure, and blood vessel maturation during angiogenesis. Blood. 104: 3198-3204. Valinluck V, Tsai H-H, Rogstad DK, et al. (2004) Oxidative damage to methyl-CpG sequences inhibits the binding of the methyl-CpG binding domain (MBD) of methyl-CpG binding protein 2 (MeCP2). Nucleic Acids Res 32: 4100-4108. Valvis SM, Waithman J, Wood FM, et al. (2015) The immune response to skin trauma is dependent on the etiology of injury in a mouse model of burn and excision. J Invest Dermatol 135: 2119-2128. van der Heul-Nieuwenhuijsen L, Dits N, Van Ijcken W, et al. (2009a) The FOXF2 pathway in the human prostate stroma. The Prostate 69: 1538-1547. van der Heul-Nieuwenhuijsen L, Dits NF and Jenster G. (2009b) Gene expression of forkhead transcription factors in the normal and diseased human prostate. BJU International 103: 1574-1580. van der Horst GTJ, van Steeg H, Berg RJW, et al. (1997) Defective transcription-coupled repair in Cockayne syndrome B mice is associated with skin cancer predisposition. Cell 89: 425-435. van Zuijlen PPM, Ruurda JJB, van Veen HA, et al. (2003) Collagen morphology in human skin and scar tissue: no adaptations in response to mechanical loading at joints. Burns 29: 423-431.
Vandesompele J, De Preter K, Pattyn F, et al. (2002) Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biology 3: research0034.0031-research0034.0011.

Verhaegen PDHM, Van Zuijlen PPM, Pennings NM, et al. (2009) Differences in collagen architecture between keloid, hypertrophic scar, normotrophic scar, and normal skin: An objective histopathological analysis. Wound Repair Regen 17: 649-656. Verzijl N, DeGroot J, Thorpe SR, et al. (2000) Effect of collagen turnover on the accumulation of advanced glycation end products. J Biol Chem 275: 39027-39031. Walker N, Badri L, Wettlaufer S, et al. (2011) Resident Tissue-Specific Mesenchymal Progenitor Cells Contribute to Fibrogenesis in Human Lung Allografts. The American Journal of Pathology 178: 2461-2469. Waller JM and Maibach HI. (2006) Age and skin structure and function, a quantitative approach (II): protein, glycosaminoglycan, water, and lipid content and structure. Skin Res Technol 12: 145-154. Wang D, Yan L, Hu Q, et al. (2012) IMA: an R package for high-throughput analysis of Illumina's 450K Infinium methylation data. Bioinformatics 28: 729-730. Wang GG, Allis CD and Chi P. (2007) Chromatin remodeling and cancer, part I: covalent histone modifications. Trends Mol Med 13: 363-372. Wang J, Chin MY and Li G. (2006) The novel tumor suppressor p33ING2 enhances nucleotide excision repair via inducement of histone H4 acetylation and chromatin relaxation. Cancer Res 66: 1906-1911. Wang Y and Shang Y. (2013) Epigenetic control of epithelial-to-mesenchymal transition and cancer metastasis. Exp Cell Res 319: 160-169. Werner S, Krieg T and Smola H. (2007) Keratinocyte-fibroblast interactions in wound healing. J Invest Dermatol 127: 998-1008. Witt O, Deubzer HE, Milde T, et al. (2009) HDAC family: What are the cancer relevant targets? Cancer Lett. 277: 8-21. Woerner SM, Kloor M, Schwitalle Y, et al. (2007) The putative tumor suppressor AIM2 is frequently affected by different genetic alterations in microsatellite unstable colon cancers. Genes Chromosomes Cancer 46: 1080-1089. Wu Y, Zhao RCH and Tredget EE.
(2010) Concise review: bone marrow-derived stem/progenitor cells in cutaneous repair and regeneration. STEM CELLS 28: 905-915. Xu J, Cong M, Park TJ, et al. (2015) Contribution of bone marrow-derived fibrocytes to liver fibrosis. Hepatobiliary Surg Nutr. 4: 34-47.

Xu Q, Norman JT, Shrivastav S, et al. (2007) In vitro models of TGF-β-induced fibrosis suitable for high-throughput screening of antifibrotic agents. Am J Physiol Renal Physiol 293: F631-F640.
Yan C, Grimm WA, Garner WL, et al. (2010) Epithelial to mesenchymal transition in human skin wound healing is induced by tumor necrosis factor-α through bone morphogenic protein-2. Am J Pathol 176: 2247-2258.
Yang X, Han H, De Carvalho DD, et al. (2014) Gene body methylation can alter gene expression and is a therapeutic target in cancer. Cancer Cell 26: 577-590.
Yoshida T, Ogata H, Kamio M, et al. (2004) SOCS1 is a suppressor of liver fibrosis and hepatitis-induced carcinogenesis. J Exp Med 199: 1701-1707.
Zhang C, Tan CK, McFarlane C, et al. (2012) Myostatin-null mice exhibit delayed skin wound healing through the blockade of transforming growth factor-β signaling by decorin. Am J Physiol Cell Physiol 302: C1213-C1225.
Zhang K, Asai S, Yu B, et al. (2015) IL-1β irreversibly inhibits tenogenic differentiation and alters metabolism in injured tendon-derived progenitor cells in vitro. Biochem Biophys Res Commun 463: 667-672.
Zhang L, Zhou W, Velculescu VE, et al. (1997) Gene expression profiles in normal and cancer cells. Science 276: 1268-1272.
Zhang Q, Tao K, Huang W, et al. (2013) Elevated expression of pleiotrophin in human hypertrophic scars. J Mol Histol 44: 91-96.
Zouridis H, Deng N, Ivanova T, et al. (2012) Methylation subtypes and large-scale epigenetic alterations in gastric cancer. Sci Transl Med 4: 156ra140.

Appendices

Appendix I – patient information sheet and consent forms

Patient Data Collection Form

Name:

Date of Birth:

Date of Injury:

Cause of Injury:

Treatment Type (e.g. Conservative, SSG etc.):

Smoker Y/N:

Other Medications:

Other medical conditions:

ROYAL PERTH HOSPITAL

CONSENT FORM (2)

STUDY TITLE: Western Australian Scar Outcome Study – Changes in scar cells

Chief Investigator: Professor Fiona Wood

Part A: Consent for Skin Collection, Storage and Use for Medical Research

The Royal Perth Hospital Human Research Ethics Committee requires that the collection, use and storage of tissue samples be conducted in accordance with the NHMRC National Statement on Ethical Conduct in Research Involving Humans (2007). Accordingly, in this study your samples will be used in the following ways:

1. Please nominate one or more categories of research for which your skin samples (including cells) may be used:
i. Samples may be used for the purposes specified for the WA Scar Outcome Study: Yes [ ] No [ ]
ii. Samples may also be used for other research on scarring and related wound healing disorders: Yes [ ] No [ ]

2. Please nominate for how long your skin sample (including cells) may be stored:
i. Current storage for WA Scar Outcome Study: Yes [ ] No [ ]
ii. Indefinite storage for other research on scarring and related wound healing disorders: Yes [ ] No [ ]

Part B: General I understand that:

If a researcher wishes to obtain additional information or samples from me, my name will not be divulged to that researcher without my written permission;

Any research results about me, and the fact that I have taken part in this study, will not be revealed to any third party without my written consent, except if required by law;

The researchers will not reveal my identity and personal information about this project if published in any public form;

I will not receive, or be entitled to, any reward or remuneration for providing my skin samples for this project. I understand that the sample being donated could be used for commercial development and acknowledge the public interest in the research and donate the sample absolutely;

I understand the potential benefits and risks involved in taking part in this study which have been explained to me and I accept the risks involved. I have had the opportunity to ask questions and am satisfied with the explanation and the answers to my questions;

I may withdraw from the study at any time, no questions asked, and without it negatively affecting my future medical treatment;

If I choose to withdraw from the study, I understand that any information about me already collected by the researchers will be retained unless I request otherwise in writing;

I understand that my information will only be collected and used for the study of scarring and related conditions. Related conditions are defined as other types of wound healing problems.

Part C: Declaration (Please tick the option that applies)

I, (print name)______,

Of (address)______freely give my consent to participate in the WA Scar Outcome Study (2) - .

I am over 18 years of age
I have also read and understood Part B
I have read and understood the Patient Information Sheet for this study
Any questions I have asked have been answered to my satisfaction
I freely give my consent for my skin to be collected and used as indicated on this form
I have been given a copy of this Consent Form for my records

Yes No

Please sign:

……………………………... …………...………… …………………...
Name of Participant Signature Date

……………………………... …………...………… …………………...
Name of Witness Signature Date

The Human Research Ethics Committee at Royal Perth Hospital requires that all participants are informed that, if they have any complaint regarding the manner in which a research project is conducted, it may be given to the researcher or, alternatively, to the Chairman, Human Research Ethics Committee, Royal Perth Hospital, Perth WA, 6001.

Appendix II - IMA code for analysis of 450k human genechip using R

R version 2.15.3 was installed and run. The IMA package was used, following the instructions in the vignette on the rForge page explaining how to use IMA (http://www.rforge.net/IMA/meth450k.pdf).

1. Connection was established with Bioconductor:

> source("http://bioconductor.org/biocLite.R")

2. limma, bioDist, WriteXLS, MASS and IMA installed and IMA loaded:

> biocLite(c("limma","bioDist"))
> install.packages(c("WriteXLS","MASS"),repos="http://cran.r-project.org")
> install.packages("IMA",repos=c("http://rforge.net"))
> library(IMA)

3. Methylation file read in. This was the output from Illumina GenomeStudio as a .txt file, with all the columns included except for the ‘index’ column:

> MethyFileName = "2nd run all pts all col.txt"

4. Phenotype file read in. This was a .txt file that told R which samples to compare in a pairwise fashion, as per section 3.1 of the loading data section of the vignette.

> PhenoFileName = "Phenotype2ndrun.txt"

5. Data read in using the two text files:

> data = IMA.methy450R(fileName = MethyFileName,columnGrepPattern=list(beta=".AVG_Beta",
+ detectp=".Detection.Pval"),groupfile = PhenoFileName)

6. Data preprocessed using default settings:

> dataf = IMA.methy450PP(data,na.omit = TRUE,peakcorrection = FALSE,normalization=FALSE,transfm = FALSE,
+ samplefilterdetectP = 1e-5,samplefilterperc = 0.75,sitefilterdetectP = 0.05,
+ sitefilterperc = 0.75,locidiff = FALSE,locidiffgroup = list(c("g1"),"g2"), XYchrom = c(FALSE,"X","Y"))

7. Sites compared pairwise using a 5% FDR (BH) cut off:

> sitetestALL = sitetest(dataf,gcase="g2",gcontrol=c("g1"),test ="limma" ,Padj="BH",
+ rawpcut = NULL,adjustpcut =NULL,betadiffcut = NULL,paired = TRUE)

8. Regionswrapper function applied, which groups the sites together into different regions and genes:

> regionswrapper(dataf,indexmethod ="mean",gcase = "g2",gcontrol=c("g1"),testmethod = "limma",
+ Padj="BH",concov = "OFF",list11excel="list11result.xls",list11Rdata="list11result.Rdata",
+ rawpcut = NULL,adjustpcut = NULL,betadiffcut = NULL,paired = TRUE)

9. Annotation was carried out using instructions from section 4 of the r-forge instructions on how to use IMA (http://ima.r-forge.r-project.org/):

> dataf2 = IMA.methy450PP(data,peakcorrection = FALSE,na.omit = FALSE,normalization=FALSE,transfm = FALSE,samplefilterdetectP =FALSE,locidiff = FALSE, XYchrom = FALSE,snpfilter=FALSE )
> fullannot = dataf2@annot
> temp = c("TSS1500Ind","TSS200Ind","UTR5Ind","EXON1Ind","GENEBODYInd","UTR3Ind","ISLANDInd","NSHOREInd","SSHOREInd","NSHELFInd","SSHELFInd")
> for( i in 1:11){eval(parse(text=paste(temp[i],"=dataf2@",temp[i],sep="")))}
> eval(parse(text = paste("save(fullannot", paste(temp,collapse = ","), "file = 'fullannotInd.rda')", sep = "," )))
> head(sitetestALL)
> load("./fullannotInd.rda")
> summary(fullannot)

10. Data written to a .csv file:

> write.csv(fullannot,"2ndrunallsamples.csv")
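Steps 7 and 8 rest on a paired comparison followed by Benjamini-Hochberg (BH) adjustment. As a rough illustration of what that involves for individual sites, the sketch below uses base R only. It substitutes an ordinary paired t-test for limma's moderated statistics, so it is not the IMA implementation, and the simulated beta values, object names and the `fdr_cut` variable are all hypothetical.

```r
# Minimal sketch with simulated data: NOT the IMA/limma implementation.
# All object names (beta_g1, beta_g2, fdr_cut) are hypothetical.
set.seed(1)
n_sites <- 200   # hypothetical number of CpG sites
n_pairs <- 6     # hypothetical number of matched sample pairs

# Simulated beta values for group g1 (control) and group g2 (case)
beta_g1 <- matrix(runif(n_sites * n_pairs, 0.2, 0.7), nrow = n_sites,
                  dimnames = list(paste0("site", seq_len(n_sites)), NULL))
beta_g2 <- beta_g1 + matrix(rnorm(n_sites * n_pairs, 0, 0.02), nrow = n_sites)

# Shift the first 10 sites so that some true differences exist
beta_g2[1:10, ] <- beta_g2[1:10, ] + 0.15

# Delta-beta: mean within-pair difference (case minus control) per site
delta_beta <- rowMeans(beta_g2 - beta_g1)

# Ordinary paired t-test per site, then BH adjustment across all sites
raw_p <- vapply(seq_len(n_sites), function(i) {
  t.test(beta_g2[i, ], beta_g1[i, ], paired = TRUE)$p.value
}, numeric(1))
adj_p <- p.adjust(raw_p, method = "BH")

# Keep sites passing a 5% FDR threshold
fdr_cut <- 0.05
results <- data.frame(site = rownames(beta_g1), delta_beta, raw_p, adj_p)
significant <- results[results$adj_p < fdr_cut, ]
```

With only a handful of pairs, the moderated statistics used by limma borrow variance information across sites, so the real analysis would generally not give the same p-values as this per-site test.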

Appendix III - Full list of promoter regions of genes with p<0.05 from differential methylation analysis

Full list of promoter regions of genes with p<0.05 from differential methylation analysis, sorted by Δβ

Gene Symbol | Full name | Adjusted P value | Δβ
LAMA4 | laminin, alpha 4 | 0.022 | 0.298
SLC6A6 | solute carrier family 6 | 0.011 | 0.249
NMT2 | N-myristoyltransferase 2 | 0.045 | 0.228
MIR575 | microRNA 575 | 0.028 | 0.223
PTN | pleiotrophin | 0.048 | 0.220
FOXF2 | forkhead box F2 | 0.023 | 0.212
CLEC3B | C-type lectin domain family 3, member B | 0.030 | 0.193
CNTFR | ciliary neurotrophic factor receptor | 0.046 | 0.190
GIPC2 | GIPC PDZ domain containing family, member 2 | 0.030 | 0.189
CEL | carboxyl ester lipase | 0.021 | 0.188
CD34 | CD34 molecule | 0.022 | 0.181
ESR1 | estrogen receptor 1 | 0.049 | 0.174
MIR505 | microRNA 505 | 0.022 | 0.171
GRWD1 | glutamate-rich WD repeat containing 1 | 0.041 | 0.169
ABCB4 | ATP-binding cassette, sub-family B | 0.002 | 0.166
HSD17B2 | hydroxysteroid | 0.040 | 0.165
C6orf186 | chromosome 6 open reading frame 186 | 0.037 | 0.162
TNXB | tenascin XB | 0.010 | 0.160
CRIP1 | cysteine-rich protein 1 | 0.045 | 0.159
LOC399959 | Mir-100-Let-7a-2 Cluster Host Gene | 0.032 | 0.158
KRT7 | keratin 7 | 0.031 | 0.156
PCK2 | phosphoenolpyruvate carboxykinase 2 | 0.034 | 0.155
C15orf62 | chromosome 15 open reading frame 62 | 0.044 | 0.154
LOC100270710 | uncharacterized LOC100270710 | 0.028 | 0.151
S100A3 | S100 calcium binding protein A3 | 0.032 | 0.147
CDH29 | cadherin-related family member 4 | 0.046 | 0.147
MKX | Mohawk Homeobox | 0.050 | 0.147
IL15RA | interleukin 15 receptor, alpha | 0.036 | 0.146
TGFBR3 | transforming growth factor, beta receptor III | 0.035 | 0.144
GKN2 | gastrokine 2 | 0.046 | 0.144
MIR548Q | microRNA 548q | 0.042 | 0.142
GUCY2D | guanylate cyclase 2D, membrane | 0.048 | 0.141
GGT1 | gamma-glutamyltransferase 1 | 0.046 | 0.140
KIAA0240 | KIAA1456 | 0.030 | 0.137
CLIC6 | chloride intracellular channel 6 | 0.018 | 0.135
MIR591 | microRNA 591 | 0.044 | 0.132
ISLR | immunoglobulin superfamily containing leucine-rich repeat | 0.037 | 0.131
NAV1 | neuron navigator 1 | 0.048 | 0.130
C17orf46 | chromosome 1 open reading frame 68 | 0.048 | 0.127
COLEC12 | collectin sub-family member 12 | 0.014 | 0.127
ATP2B4 | ATPase, Ca++ transporting, plasma membrane 4 | 0.045 | 0.126
GMPR | guanosine monophosphate reductase | 0.049 | 0.126
GRIA1 | glutamate receptor, ionotropic, AMPA 1 | 0.037 | 0.125
LOC401097 | uncharacterized LOC401097 | 0.044 | 0.124
STAB1 | stabilin 1 | 0.032 | 0.124
SERPINB9 | serpin peptidase inhibitor, clade B | 0.046 | 0.123
FAM196B | family with sequence similarity 196, member B | 0.037 | 0.123
HAS1 | hyaluronan synthase 1 | 0.037 | 0.122
ZDHHC8P | Zinc Finger, DHHC-Type Containing 8 Pseudogene 1 | 0.049 | 0.120
GREM1 | gremlin 1, DAN family BMP antagonist | 0.045 | 0.120
C21orf121 | chromosome 21 open reading frame 121 | 0.036 | 0.117
CT62 | cancer/testis antigen 62 | 0.028 | 0.116
FBLN1 | fibulin 1 | 0.020 | 0.115
S100A4 | S100 calcium binding protein A4 | 0.046 | 0.114
KLF11 | Kruppel-like factor 11 | 0.019 | 0.111
SLC16A3 | solute carrier family 16, member 3 | 0.019 | 0.110
GJC2 | gap junction protein, gamma 2, 47kDa | 0.037 | 0.110
PI4KAP1 | phosphatidylinositol 4-kinase, catalytic, alpha pseudogene 1 | 0.030 | 0.108
CDA |  | 0.034 | 0.108
REC8 | REC8 homolog | 0.022 | 0.108
RYBP | RING1 and YY1 binding protein | 0.028 | 0.107
CCKAR | cholecystokinin A receptor | 0.044 | 0.106
ITGB3 | integrin, beta 3 | 0.046 | 0.106
RASL11B | RAS-like, family 11, member B | 0.028 | 0.105
KLF8 | Kruppel-like factor 8 | 0.030 | 0.105
TPM1 | tropomyosin 1 | 0.029 | 0.105
CHST1 | carbohydrate | 0.048 | 0.104
CDCP1 | CUB domain containing protein 1 | 0.032 | 0.104
SYNGR1 | synaptogyrin 1 | 0.011 | 0.102
SPACA4 | sperm acrosome associated 4 | 0.048 | 0.101
AR | androgen receptor | 0.037 | 0.099
IGSF9 | immunoglobulin superfamily, member 9 | 0.049 | 0.098
LDLR | low density lipoprotein receptor | 0.044 | 0.098
OSBPL10 | oxysterol binding protein-like 10 | 0.046 | 0.097
TP53BP1 | tumor protein p53 binding protein 1 | 0.049 | 0.097
RBPMS | RNA binding protein with multiple splicing | 0.042 | 0.096
PTHLH | parathyroid hormone-like hormone | 0.028 | 0.096
CTGF | connective tissue growth factor | 0.030 | 0.096
MIR7-2 | microRNA 7-2 | 0.032 | 0.095
TIMP4 | TIMP metallopeptidase inhibitor 4 | 0.047 | 0.095
TNS3 | tensin 3 | 0.046 | 0.094
FSTL1 | follistatin-like 1 | 0.032 | 0.094
CAMK2A | calcium/calmodulin-dependent protein kinase II alpha | 0.019 | 0.094
DENND4B | DENN/MADD domain containing 4B | 0.014 | 0.092
MFSD2A | major facilitator superfamily domain containing 2A | 0.042 | 0.091
RAB3IL1 | RAB3A interacting protein | 0.047 | 0.091
MMRN2 | multimerin 2 | 0.028 | 0.091
ATXN7L1 | ataxin 7-like 1 | 0.016 | 0.089
PTPN6 | protein tyrosine phosphatase, non-receptor type 6 | 0.032 | 0.087
DDR1 | discoidin domain receptor tyrosine kinase 1 | 0.048 | 0.086
TNK2 | tyrosine kinase, non-receptor, 2 | 0.048 | 0.086
TRAK1 | trafficking protein, kinesin binding 1 | 0.028 | 0.085
TNFSF13 | tumor necrosis factor | 0.044 | 0.085
CMTM8 | CKLF-like MARVEL transmembrane domain containing 8 | 0.046 | 0.085
NAGS | N-acetylglutamate synthase | 0.046 | 0.085
KRTAP4-4 | keratin associated protein 4-4 | 0.030 | 0.083
RINL | Ras and Rab interactor-like | 0.028 | 0.083
ATP6V0C | ATPase, H+ transporting, lysosomal 16kDa, V0 subunit c | 0.041 | 0.082
LRRC16B | leucine rich repeat containing 16B | 0.037 | 0.082
C8orf51 | chromosome 8 open reading frame 51 | 0.049 | 0.082
WISP2 | WNT1 inducible signaling pathway protein 2 | 0.032 | 0.081
HMGA1 | high mobility group AT-hook 1 | 0.046 | 0.081
RASSF5 | Ras association | 0.041 | 0.080
MIR29B2 | MicroRNA 29b-2 | 0.028 | 0.080
TNFAIP2 | tumor necrosis factor, alpha-induced protein 2 | 0.032 | 0.080
SORT1 | sortilin 1 | 0.028 | 0.079
SLC38A2 | solute carrier family 38, member 2 | 0.048 | 0.079
MSN | moesin | 0.042 | 0.078
LTBP3 | latent transforming growth factor beta binding protein 3 | 0.030 | 0.078
ULK2 | unc-51-like kinase 2 | 0.046 | 0.075
TIMP3 | TIMP metallopeptidase inhibitor 3 | 0.018 | 0.075
BAIAP2L1 | BAI1-associated protein 2-like 1 | 0.049 | 0.075
ZBTB42 | zinc finger and BTB domain containing 42 | 0.047 | 0.075
DMKN | dermokine | 0.030 | 0.074
ETV7 | ets variant 7 | 0.046 | 0.074
C8orf79 | chromosome 8 open reading frame 79 | 0.022 | 0.073
PKD2 | polycystic kidney disease 2 | 0.042 | 0.073
KCNS3 | potassium voltage-gated channel, delayed-rectifier, subfamily S, member 3 | 0.032 | 0.071
ECE1 | endothelin converting enzyme 1 | 0.032 | 0.067
ELFN2 | extracellular leucine-rich repeat and fibronectin type III domain containing 2 | 0.046 | 0.067
CCL27 | chemokine | 0.030 | 0.067
SYT8 | synaptotagmin VIII | 0.024 | 0.066
LPP | LIM domain containing preferred translocation partner in lipoma | 0.035 | 0.066
TRIM68 | tripartite motif containing 68 | 0.044 | 0.066
ANTXRL | Anthrax Toxin Receptor-Like | 0.036 | 0.066
DRD4 | dopamine receptor D4 | 0.019 | 0.065
CKMT2 | creatine kinase, mitochondrial 2 | 0.048 | 0.065
GRID1 | glutamate receptor, ionotropic, delta 1 | 0.046 | 0.065
PTPRM | protein tyrosine phosphatase, receptor type, M | 0.019 | 0.065
USP19 | ubiquitin specific peptidase 19 | 0.036 | 0.064
HYAL2 | hyaluronoglucosaminidase 2 | 0.045 | 0.064
ODZ3 | Teneurin Transmembrane Protein 3 | 0.046 | 0.063
ARHGAP29 | Rho GTPase activating protein 29 | 0.042 | 0.063
HCST | hematopoietic cell signal transducer | 0.045 | 0.063
RCSD1 | RCSD domain containing 1 | 0.044 | 0.063
U2AF2 | U2 small nuclear RNA auxiliary factor 2 | 0.049 | 0.062
PGF | placental growth factor | 0.047 | 0.062
CLDN4 | claudin 4 | 0.046 | 0.061
FBXO44 | F-box protein 44 | 0.046 | 0.061
HHIPL1 | HHIP-like 1 | 0.046 | 0.060
FBLIM1 | filamin binding LIM protein 1 | 0.049 | 0.060
PHLDA3 | pleckstrin homology-like domain, family A, member 3 | 0.048 | 0.060
TMEM14E | transmembrane protein 14E | 0.041 | 0.059
HCRTR1 | hypocretin | 0.049 | 0.059
SCN8A | sodium channel, voltage gated, type VIII, alpha subunit | 0.049 | 0.058
RASGRP1 | RAS guanyl releasing protein 1 | 0.037 | 0.058
PLEKHG3 | pleckstrin homology domain containing, family G | 0.049 | 0.058
POL3S | General Transcription Factor IIIC, Polypeptide 1, Alpha 220kDa | 0.046 | 0.057
C20orf103 | chromosome 20 open reading frame 103 | 0.046 | 0.057
CUEDC1 | CUE domain containing 1 | 0.036 | 0.057
GALNT7 | UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 7 | 0.019 | 0.055
EEF1A1 | eukaryotic translation elongation factor 1 alpha 1 | 0.046 | 0.053
SCMH1 | sex comb on midleg homolog 1 | 0.043 | 0.052
LMNA | lamin A/C | 0.045 | 0.052
CHRNB4 | cholinergic receptor, nicotinic, beta 4 | 0.044 | 0.052
ZSCAN20 | zinc finger and SCAN domain containing 20 | 0.029 | 0.050
NFIC | nuclear factor I/C | 0.046 | 0.050
HMGCL | 3-hydroxymethyl-3-methylglutaryl-CoA | 0.046 | 0.049
NRG1 | neuregulin 1 | 0.047 | 0.048
CTNNA1 | catenin | 0.046 | 0.047
LOC729156 | Uncharacterized LOC729156 | 0.032 | 0.046
FAM111B | family with sequence similarity 111, member B | 0.046 | 0.045
LOC440461 | Rho GTPase Activating Protein 27 Pseudogene | 0.037 | 0.044
MIR941-1 | microRNA 941-1 | 0.049 | 0.042
RGS12 | regulator of G-protein signaling 12 | 0.034 | 0.042
PPP2R4 | protein phosphatase 2A activator, regulatory subunit 4 | 0.034 | 0.042
NOTUM | notum pectinacetylesterase homolog | 0.044 | 0.041
NTRK1 | neurotrophic tyrosine kinase, receptor, type 1 | 0.048 | 0.041
CAMK1D | calcium/calmodulin-dependent protein kinase ID | 0.046 | 0.040
GJB6 | gap junction protein, beta 6, 30kDa | 0.032 | 0.040
PLEKHG5 | pleckstrin homology domain containing, family G | 0.036 | 0.039
STK24 | serine/threonine kinase 24 | 0.048 | 0.038
LRRC49 | leucine rich repeat containing 49 | 0.049 | 0.036
GPRC5C | G protein-coupled receptor, family C, group 5, member C | 0.046 | 0.033
DTX1 | deltex homolog 1 | 0.048 | 0.032
ZNF280D | zinc finger protein 280D | 0.049 | -0.037
CYFIP1 | cytoplasmic FMR1 interacting protein 1 | 0.041 | -0.037
ZNF772 | zinc finger protein 772 | 0.046 | -0.041
C19orf41 | chromosome 19 open reading frame 41 | 0.049 | -0.042
ASPRV1 | aspartic peptidase, retroviral-like 1 | 0.049 | -0.047
APOBEC3H | apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3H | 0.030 | -0.048
NXF1 | nuclear RNA export factor 1 | 0.049 | -0.049
SNORD116-15 | small nucleolar RNA, C/D box 116-15 | 0.049 | -0.049
C6orf115 | chromosome 6 open reading frame 115 | 0.046 | -0.050
TRIT1 | tRNA isopentenyltransferase 1 | 0.049 | -0.050
COX6C | cytochrome c oxidase subunit VIc | 0.046 | -0.051
C12orf69 | open reading frame 69 | 0.049 | -0.052
USP4 | ubiquitin specific peptidase 4 | 0.032 | -0.052
FGFR1 | fibroblast growth factor receptor 1 | 0.048 | -0.052
SNAR-A10 | Small ILF3/NF90-Associated RNA A10 | 0.046 | -0.054
SNAR-A11 | Small ILF3/NF90-Associated RNA A11 | 0.046 | -0.054
SNAR-A14 | Small ILF3/NF90-Associated RNA A14 | 0.046 | -0.054
SNAR-A3 | small ILF3/NF90-associated RNA A3 | 0.046 | -0.054
SNAR-A4 | Small ILF3/NF90-Associated RNA A4 | 0.046 | -0.054
SNAR-A5 | Small ILF3/NF90-Associated RNA A5 | 0.046 | -0.054
SNAR-A6 | Small ILF3/NF90-Associated RNA A6 | 0.046 | -0.054
SNAR-A7 | Small ILF3/NF90-Associated RNA A7 | 0.046 | -0.054
SNAR-A8 | Small ILF3/NF90-Associated RNA A8 | 0.046 | -0.054
SNAR-A9 | Small ILF3/NF90-Associated RNA A9 | 0.046 | -0.054
MS4A12 | membrane-spanning 4-domains, subfamily A, member 12 | 0.048 | -0.055
SPTLC3 | serine palmitoyltransferase, long chain base subunit 3 | 0.023 | -0.055
MIR200A | microRNA 200a | 0.048 | -0.057
NRXN1 | neurexin 1 | 0.040 | -0.059
GSTA1 | glutathione S-transferase alpha 1 | 0.042 | -0.063
ANKRD30A | ankyrin repeat domain 30A | 0.040 | -0.064
PPIE | peptidylprolyl E | 0.041 | -0.065
ALG1L | ALG1, chitobiosyldiphosphodolichol beta-mannosyltransferase-like | 0.048 | -0.066
NCRNA00164 | Ankyrin Repeat Domain 30B-Like | 0.046 | -0.067
OR10A6 | olfactory receptor, family 10, subfamily A, member 6 | 0.049 | -0.067
OR10G7 | olfactory receptor, family 10, subfamily G, member 7 | 0.048 | -0.067
SNORD115-33 | small nucleolar RNA, C/D box 115-33 | 0.049 | -0.068
TRPV6 | transient receptor potential cation channel, subfamily V, member 6 | 0.043 | -0.070
OR11H4 | olfactory receptor, family 11, subfamily H, member 4 | 0.040 | -0.072
SNORD116-20 | small nucleolar RNA, C/D box 116-20 | 0.046 | -0.073
ZNF217 | zinc finger protein 217 | 0.036 | -0.073
LOC100133050 | glucuronidase, beta pseudogene | 0.044 | -0.076
PRSS16 | protease, serine, 16 | 0.030 | -0.076
KRT4 | keratin 4 | 0.046 | -0.077
CECR1 | cat eye syndrome chromosome region, candidate 1 | 0.019 | -0.077
WIPF1 | WAS/WASL interacting , member 1 | 0.049 | -0.077
SERPINA4 | serpin peptidase inhibitor, clade A | 0.046 | -0.079
MIR449A | microRNA 449a | 0.048 | -0.079
MIR449B | microRNA 449b | 0.048 | -0.079
MIR187 | microRNA 187 | 0.047 | -0.079
HERC2P4 | hect domain and RLD 2 pseudogene 4 | 0.046 | -0.079
LOC29034 | uncharacterized LOC29034 | 0.030 | -0.080
SNORD115-13 | small nucleolar RNA, C/D box 115-13 | 0.034 | -0.081
TNNT1 | troponin T type 1 | 0.028 | -0.082
CABP5 | calcium binding protein 5 | 0.033 | -0.082
BPIL3 | BPI Fold Containing Family B, Member 6 | 0.030 | -0.082
MYOZ1 | myozenin 1 | 0.049 | -0.083
HLA-DQA2 | major histocompatibility complex, class II, DQ alpha 2 | 0.045 | -0.083
WNT8B | wingless-type MMTV integration site family, member 8B | 0.028 | -0.085
OR10AD1 | olfactory receptor, family 10, subfamily AD, member 1 | 0.023 | -0.087
GSTA5 | glutathione S-transferase alpha 5 | 0.030 | -0.088
SLC46A3 | solute carrier family 46, member 3 | 0.046 | -0.088
LOC732275 | Uncharacterized LOC732275 | 0.049 | -0.088
MIR548H3 | microRNA 548h-3 | 0.026 | -0.088
OR4C15 | olfactory receptor, family 4, subfamily C, member 15 | 0.028 | -0.089
A2BP1 | RNA binding protein, fox-1 homolog | 0.038 | -0.089
ZG16B | zymogen granule protein 16B | 0.048 | -0.090
MS4A4A | membrane-spanning 4-domains, subfamily A, member 4A | 0.048 | -0.090
ABCC13 | ATP-binding cassette, sub-family C | 0.049 | -0.091
LOC642006 | uncharacterized LOC642006 | 0.037 | -0.091
CARD14 | caspase recruitment domain family, member 14 | 0.040 | -0.093
CD300LF | CD300 molecule-like family member f | 0.048 | -0.093
B3GALT1 | UDP-Gal:betaGlcNAc beta 1,3-galactosyltransferase, polypeptide 1 | 0.046 | -0.094
MIR519D | microRNA 519d | 0.044 | -0.095
SNORD115-35 | small nucleolar RNA, C/D box 115-35 | 0.042 | -0.095
SPANXN4 | SPANX family, member N4 | 0.048 | -0.095
PATE2 | prostate and testis expressed 2 | 0.046 | -0.096
DKFZp434L192 | uncharacterized protein DKFZp434L192 | 0.049 | -0.096
OR6C1 | olfactory receptor, family 6, subfamily C, member 1 | 0.046 | -0.096
IL1R1 | interleukin 1 receptor, type I | 0.036 | -0.097
SCN11A | sodium channel, voltage-gated, type XI, alpha subunit | 0.049 | -0.097
PRSS33 | protease, serine, 33 | 0.042 | -0.097
SLC25A41 | solute carrier family 25, member 41 | 0.045 | -0.099
OR5P3 | olfactory receptor, family 5, subfamily P, member 3 | 0.041 | -0.099
GPR45 | G protein-coupled receptor 45 | 0.028 | -0.100
KCNK17 | potassium channel, subfamily K, member 17 | 0.030 | -0.100
ALDH3B2 | aldehyde dehydrogenase 3 family, member B2 | 0.021 | -0.101
FCGR3A | Fc fragment of IgG, low affinity IIIa, receptor | 0.036 | -0.102
TREM1 | triggering receptor expressed on myeloid cells 1 | 0.020 | -0.103
MIR200B | microRNA 200b | 0.011 | -0.104
SOX6 | SRY (Sex Determining Region Y)-Box 6 | 0.032 | -0.104
DEFB122 | defensin, beta 122 | 0.049 | -0.104
OPRK1 | opioid receptor, kappa 1 | 0.025 | -0.104
KLK12 | kallikrein-related peptidase 12 | 0.041 | -0.105
FAT3 | FAT atypical cadherin 3 | 0.042 | -0.105
SLC34A2 | solute carrier family 34 | 0.014 | -0.106
HBII-52-46 | Small Nucleolar RNA, C/D Box 115-47 | 0.040 | -0.106
PRO1768 | protein kinase D2 | 0.032 | -0.107
OR1A2 | olfactory receptor, family 1, subfamily A, member 2 | 0.034 | -0.107
GIF | gastric intrinsic factor | 0.037 | -0.107
ZNF648 | zinc finger protein 648 | 0.028 | -0.108
KIAA0125 | KIAA0125 | 0.046 | -0.110
SPRR1B | small proline-rich protein 1B | 0.028 | -0.110
TTLL2 | tubulin tyrosine -like family, member 2 | 0.046 | -0.110
NOX4 | NADPH oxidase 4 | 0.037 | -0.112
TCL1B | T-cell leukemia/lymphoma 1B | 0.028 | -0.112
GAGE12C | G Antigen 12C | 0.046 | -0.112
GAGE12D | G antigen 12D | 0.046 | -0.112
GAGE12E | G antigen 12E | 0.046 | -0.112
GAGE12G | G antigen 12G | 0.046 | -0.112
OR1S1 | olfactory receptor, family 1, subfamily S, member 1 | 0.030 | -0.113
C4orf41 | chromosome 4 open reading frame 41 | 0.036 | -0.113
MTNR1B | melatonin receptor 1B | 0.032 | -0.114
OR10A7 | olfactory receptor, family 10, subfamily A, member 7 | 0.046 | -0.115
LOC84856 | Uncharacterized LOC84856 | 0.046 | -0.116
CYP2E1 | cytochrome P450, family 2, subfamily E, polypeptide 1 | 0.028 | -0.117
C20orf114 | chromosome 20 open reading frame 114 | 0.037 | -0.117
ZIM3 | zinc finger, imprinted 3 | 0.028 | -0.119
RNASE3 | ribonuclease, RNase A family, 3 | 0.032 | -0.120
GAGE10 | G antigen 10 | 0.049 | -0.120
MEFV | Mediterranean fever | 0.046 | -0.122
LOC643406 | Uncharacterized LOC643406 | 0.048 | -0.122
OR52I1 | olfactory receptor, family 52, subfamily I, member 1 | 0.044 | -0.123
YIPF7 | Yip1 domain family, member 7 | 0.030 | -0.123
DSCAM | Down syndrome cell adhesion molecule | 0.030 | -0.123
OR10W1 | olfactory receptor, family 10, subfamily W, member 1 | 0.048 | -0.124
PCDHA8 | protocadherin alpha 8 | 0.049 | -0.124
ITLN2 | intelectin 2 | 0.032 | -0.124
LOC100128554 | uncharacterized LOC100128554 | 0.046 | -0.125
FAM198B | family with sequence similarity 198, member B | 0.049 | -0.126
KRTDAP | keratinocyte differentiation-associated protein | 0.042 | -0.126
ITIH5L | inter-alpha-trypsin inhibitor heavy chain family, member 6 | 0.019 | -0.127
TSPY2 | testis specific protein, Y-linked 2 | 0.030 | -0.127
TBC1D21 | TBC1 domain family, member 21 | 0.012 | -0.127
LILRB1 | leukocyte immunoglobulin-like receptor, subfamily B | 0.049 | -0.127
OR1A1 | olfactory receptor, family 1, subfamily A, member 1 | 0.037 | -0.130
TREML4 | triggering receptor expressed on myeloid cells-like 4 | 0.028 | -0.132
DEFB116 | defensin, beta 116 | 0.030 | -0.132
OR7C2 | olfactory receptor, family 7, subfamily C, member 2 | 0.037 | -0.133
TRIM64B | tripartite motif containing 64B | 0.045 | -0.134
HULC | hepatocellular carcinoma up-regulated long non-coding RNA | 0.032 | -0.134
OR8B4 | olfactory receptor, family 8, subfamily B, member 4 | 0.046 | -0.135
LILRB5 | leukocyte immunoglobulin-like receptor, subfamily B | 0.049 | -0.135
OR6F1 | olfactory receptor, family 6, subfamily F, member 1 | 0.037 | -0.137
CXorf64 | chromosome X open reading frame 64 | 0.028 | -0.137
DNAH7 | dynein, axonemal, heavy chain 7 | 0.032 | -0.138
C18orf20 | chromosome 18 open reading frame 20 | 0.028 | -0.138
RASAL3 | RAS protein activator like 3 | 0.028 | -0.139
TAT | tyrosine aminotransferase | 0.046 | -0.139
GAGE12J | G antigen 12J | 0.038 | -0.139
FAM75A5 | SPATA31 subfamily A, member 5 | 0.047 | -0.140
FAM75A7 | SPATA31 subfamily A, member 7 | 0.047 | -0.140
FAM90A10 | Family With Sequence Similarity 90, Member A10, Pseudogene | 0.044 | -0.140
OR4A15 | olfactory receptor, family 4, subfamily A, member 15 | 0.048 | -0.141
CASP14 | caspase 14, apoptosis-related cysteine peptidase | 0.030 | -0.141
GFRAL | GDNF family receptor alpha like | 0.023 | -0.141
SIGLEC14 | sialic acid binding Ig-like lectin 14 | 0.046 | -0.142
P704P | cDNA FLJ43851 fis, clone TESTI4006728. | 0.023 | -0.143
MIR2053 | microRNA 2053 | 0.049 | -0.143
EXOG | endo/exonuclease | 0.034 | -0.144
MIR153-2 | microRNA 153-2 | 0.046 | -0.145
MIR542 | microRNA 542 | 0.046 | -0.146
ITLN1 | intelectin 1 | 0.021 | -0.146
RAB9P1 | RAB9B, Member RAS Oncogene Family Pseudogene 1 | 0.028 | -0.146
PPP3R2 | protein phosphatase 3, regulatory subunit B, beta | 0.049 | -0.148
FABP1 | fatty acid binding protein 1, liver | 0.018 | -0.151
LCE1D | late cornified envelope 1D | 0.046 | -0.151
KRTAP7-1 | keratin associated protein 7-1 | 0.030 | -0.153
LRIT2 | leucine-rich repeat, immunoglobulin-like and transmembrane domains 2 | 0.019 | -0.153
CD244 | CD244 molecule, natural killer cell receptor 2B4 | 0.037 | -0.153
SPRR2G | small proline-rich protein 2G | 0.049 | -0.155
TSIX | TSIX transcript, XIST antisense RNA | 0.044 | -0.156
CYP2C18 | cytochrome P450, family 2, subfamily C, polypeptide 18 | 0.049 | -0.157
CYP2C9 | cytochrome P450, family 2, subfamily C, polypeptide 9 | 0.028 | -0.158
C19orf30 | chromosome 19 open reading frame 30 | 0.019 | -0.158
OR8G2 | olfactory receptor, family 8, subfamily G, member 2 | 0.047 | -0.159
DEFA3 | Epithelial discoidin domain-containing receptor 1 | 0.048 | -0.160
OR9G4 | olfactory receptor, family 9, subfamily G, member 4 | 0.030 | -0.160
KRT2 | keratin 2 | 0.049 | -0.161
PDILT | protein disulfide isomerase-like, testis expressed | 0.029 | -0.162
KIR3DX1 | killer cell immunoglobulin-like receptor, three domains, X1 | 0.030 | -0.162
SLC6A14 | solute carrier family 6 | 0.032 | -0.162
LY6G6F | lymphocyte antigen 6 complex, locus G6F | 0.028 | -0.163
DEFB115 | defensin, beta 115 | 0.046 | -0.164
OR10G8 | olfactory receptor, family 10, subfamily G, member 8 | 0.045 | -0.165
GPX6 | glutathione peroxidase 6 | 0.049 | -0.166
MIR519E | microRNA 519e | 0.036 | -0.167
OR11G2 | olfactory receptor, family 11, subfamily G, member 2 | 0.046 | -0.169
OR8B8 | olfactory receptor, family 8, subfamily B, member 8 | 0.048 | -0.170
CLEC1A | C-type lectin domain family 1, member A | 0.042 | -0.171
COG5 | component of oligomeric golgi complex 5 | 0.045 | -0.171
CLNK | cytokine-dependent hematopoietic cell linker | 0.044 | -0.171
FAM90A20 | Family With Sequence Similarity 90, Member A20, Pseudogene | 0.041 | -0.172
CSTA | cystatin A | 0.048 | -0.172
HNRNPCL1 | heterogeneous nuclear ribonucleoprotein C-like 1 | 0.035 | -0.174
ZNF679 | zinc finger protein 679 | 0.037 | -0.174
LCE1A | late cornified envelope 1A | 0.046 | -0.175
PGLYRP3 | peptidoglycan recognition protein 3 | 0.047 | -0.175
FAM12A | epididymal protein 3A | 0.024 | -0.177
CTSG | cathepsin G | 0.032 | -0.178
SLC1A6 | solute carrier family 1 | 0.037 | -0.179
KRTAP11-1 | keratin associated protein 11-1 | 0.032 | -0.179
PRAMEF12 | PRAME family member 12 | 0.011 | -0.185
LOC388796 | uncharacterized LOC388796 | 0.011 | -0.185
OR5H2 | olfactory receptor, family 5, subfamily H, member 2 | 0.049 | -0.187
MIR518B | microRNA 518b | 0.030 | -0.187
NAPG | N-ethylmaleimide-sensitive factor attachment protein, gamma | 0.036 | -0.189
SPAG7 | sperm associated antigen 7 | 0.014 | -0.190
LRRTM4 | leucine rich repeat transmembrane neuronal 4 | 0.014 | -0.199
C2CD4D | C2 calcium-dependent domain containing 4D | 0.046 | -0.199
CCNL1 | cyclin L1 | 0.035 | -0.199
SMPDL3A | sphingomyelin phosphodiesterase, acid-like 3A | 0.046 | -0.208
RPA3 | replication protein A3, 14kDa | 0.042 | -0.228
PANX3 | pannexin 3 | 0.024 | -0.234
C1orf68 | chromosome 1 open reading frame 68 | 0.046 | -0.236
CCDC86 | coiled-coil domain containing 86 | 0.011 | -0.237
DEFB119 | defensin, beta 119 | 0.032 | -0.238
LCTL | lactase-like | 0.045 | -0.248
C12orf77 | chromosome 12 open reading frame 77 | 0.011 | -0.259
KRTAP22-2 | keratin associated protein 22-2 | 0.046 | -0.264
KRTAP6-3 | keratin associated protein 6-3 | 0.046 | -0.264
GSTA3 | glutathione S-transferase alpha 3 | 0.024 | -0.279
CD300E | CD300e molecule | 0.032 | -0.307
MIR424 | microRNA 424 | 0.046 | -0.318
AIM2 | absent in melanoma 2 | 0.011 | -0.341
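For readers who want to reproduce the ordering of this table from exported results, the following is a minimal sketch in base R. It uses five rows copied from the table, reads the leading "-" in the listing as a negative Δβ (an assumption about the original layout), and uses hypothetical column names (`gene`, `adj_p`, `delta_beta`) that do not come from the IMA output.

```r
# A few rows copied from the table above; column names are hypothetical.
promoters <- data.frame(
  gene       = c("AIM2", "LAMA4", "PTN", "MIR200B", "TGFBR3"),
  adj_p      = c(0.011, 0.022, 0.048, 0.011, 0.035),
  delta_beta = c(-0.341, 0.298, 0.220, -0.104, 0.144),
  stringsAsFactors = FALSE
)

# Keep promoters with adjusted p < 0.05, sorted from largest to smallest delta-beta
sig <- promoters[promoters$adj_p < 0.05, ]
sig <- sig[order(sig$delta_beta, decreasing = TRUE), ]

# Split the list by the sign of delta-beta
positive <- sig$gene[sig$delta_beta > 0]
negative <- sig$gene[sig$delta_beta < 0]
```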

Appendix IV – R code for analysis of HuGene 2.0 expression data

http://www.bioconductor.org/packages/2.12/bioc/vignettes/oligo/inst/doc/primer.pdf
https://github.com/benilton/oligo/wiki/Getting-the-grips-with-the-oligo-Package
http://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf

Open R and start a new session. Install the packages “limma” and “oligo”:

> source("http://bioconductor.org/biocLite.R")
> biocLite("limma")
> biocLite("oligo")

Go to File and change directory to the microarray directory (e.g. C:\Andrew’s Files\Microarray).

Open Notepad, lay out the file in the following format and save it in the same directory. This is equivalent to the “Sibship” analysis in the limma vignette chapter 9.4.

FileName Pt Skintype
AS_C1_(HuGene-2_0-st).CEL 1 C
AS_C2_(HuGene-2_0-st).CEL 2 C
AS_C3_(HuGene-2_0-st).CEL 3 C
AS_C4_(HuGene-2_0-st).CEL 4 C
AS_C5_(HuGene-2_0-st).CEL 5 C
AS_C6_(HuGene-2_0-st).CEL 6 C
AS_S1_(HuGene-2_0-st).CEL 1 S
AS_S2_(HuGene-2_0-st).CEL 2 S
AS_S3_(HuGene-2_0-st).CEL 3 S
AS_S4_(HuGene-2_0-st).CEL 4 S
AS_S5_(HuGene-2_0-st).CEL 5 S
AS_S6_(HuGene-2_0-st).CEL 6 S
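The targets file was typed by hand in Notepad. As an optional alternative, a file with the same three columns could be generated from R. The sketch below is only an illustration: it writes a tab-delimited file to a temporary directory, and the object names and the output file name `targets_example.txt` are hypothetical.

```r
# Hypothetical alternative to typing the targets file by hand.
pt       <- rep(1:6, times = 2)           # patient number for each array
skintype <- rep(c("C", "S"), each = 6)    # skin type codes as used in the file above

targets_df <- data.frame(
  FileName = sprintf("AS_%s%d_(HuGene-2_0-st).CEL", skintype, pt),
  Pt       = pt,
  Skintype = skintype,
  stringsAsFactors = FALSE
)

# Write as tab-delimited text (file name is hypothetical)
out_file <- file.path(tempdir(), "targets_example.txt")
write.table(targets_df, out_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back to confirm the layout
check <- read.delim(out_file, stringsAsFactors = FALSE)
```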

Issue the following command at the R prompt: > library(oligo) > library(limma) > celFiles <- list.celfiles() > affyRaw <- read.celfiles(celFiles) Loading required package: pd.hugene.2.0.st Loading required package: RSQLite Loading required package: DBI Platform design info loaded. Reading in : AS_C1_(HuGene-2_0-st).CEL Reading in : AS_C2_(HuGene-2_0-st).CEL Reading in : AS_C3_(HuGene-2_0-st).CEL Reading in : AS_C4_(HuGene-2_0-st).CEL Reading in : AS_C5_(HuGene-2_0-st).CEL Reading in : AS_C6_(HuGene-2_0-st).CEL Reading in : AS_S1_(HuGene-2_0-st).CEL Reading in : AS_S2_(HuGene-2_0-st).CEL Reading in : AS_S3_(HuGene-2_0-st).CEL Reading in : AS_S4_(HuGene-2_0-st).CEL Reading in : AS_S5_(HuGene-2_0-st).CEL Reading in : AS_S6_(HuGene-2_0-st).CEL > library(pd.hugene.2.0.st) > conn <- db(pd.hugene.2.0.st) ##loads annotation info


> targets <- readTargets("targetsAO.txt") ##tells R which .CEL file is for which patient and how they’re paired
> targets ##shows the experimental design
                                       FileName Pt Skintype
AS_C1_(HuGene-2_0-st) AS_C1_(HuGene-2_0-st).CEL  1        C
AS_C2_(HuGene-2_0-st) AS_C2_(HuGene-2_0-st).CEL  2        C
AS_C3_(HuGene-2_0-st) AS_C3_(HuGene-2_0-st).CEL  3        C
AS_C4_(HuGene-2_0-st) AS_C4_(HuGene-2_0-st).CEL  4        C
AS_C5_(HuGene-2_0-st) AS_C5_(HuGene-2_0-st).CEL  5        C
AS_C6_(HuGene-2_0-st) AS_C6_(HuGene-2_0-st).CEL  6        C
AS_S1_(HuGene-2_0-st) AS_S1_(HuGene-2_0-st).CEL  1        S
AS_S2_(HuGene-2_0-st) AS_S2_(HuGene-2_0-st).CEL  2        S
AS_S3_(HuGene-2_0-st) AS_S3_(HuGene-2_0-st).CEL  3        S
AS_S4_(HuGene-2_0-st) AS_S4_(HuGene-2_0-st).CEL  4        S
AS_S5_(HuGene-2_0-st) AS_S5_(HuGene-2_0-st).CEL  5        S
AS_S6_(HuGene-2_0-st) AS_S6_(HuGene-2_0-st).CEL  6        S
> Pt <- factor(targets$Pt)
> Skintype <- factor(targets$Skintype, levels=c("C","S"))
> design <- model.matrix(~Pt+Skintype)
> design ##shows the design
   (Intercept) Pt2 Pt3 Pt4 Pt5 Pt6 SkintypeS
1            1   0   0   0   0   0         0
2            1   1   0   0   0   0         0
3            1   0   1   0   0   0         0
4            1   0   0   1   0   0         0
5            1   0   0   0   1   0         0
6            1   0   0   0   0   1         0
7            1   0   0   0   0   0         1
8            1   1   0   0   0   0         1
9            1   0   1   0   0   0         1
10           1   0   0   1   0   0         1
11           1   0   0   0   1   0         1
12           1   0   0   0   0   1         1
attr(,"assign")
[1] 0 1 1 1 1 1 2
attr(,"contrasts")
attr(,"contrasts")$Pt
[1] "contr.treatment"
attr(,"contrasts")$Skintype
[1] "contr.treatment"
> rmaC0 <- rma(affyRaw, target='core') ##normalises, background corrects etc.
Background correcting
Normalizing
Calculating Expression
> featureData(rmaC0) <- getNetAffx(rmaC0, 'transcript')
> names(fData(rmaC0))
[1] "transcriptclusterid" "probesetid" "seqname" "strand" "start" "stop"


[7] "totalprobes" "geneassignment" "mrnaassignment" "swissprot" "unigene" "gobiologicalprocess"
[13] "gocellularcomponent" "gomolecularfunction" "pathway" "proteindomains" "crosshybtype" "category"
> autosomes <- paste('chr', 1:22, sep='')
> iCore <- fData(rmaC0)$seqname %in% autosomes
> coreAutosomes <- rmaC0[iCore,] ##autosomal subset; not used below, the model is fitted to all of rmaC0
> fit <- lmFit(rmaC0, design) ##fits the paired linear model to each transcript cluster
> fit <- eBayes(fit) ##empirical Bayes moderation of the standard errors
> topTable(fit, coef="SkintypeS", adjust="BH", number = 10) ## check annotation has worked
> tt1 = topTable(fit, coef="SkintypeS", adjust="BH", number = 54000) ##all values
> write.csv(tt1, "6ptsonly.csv") ##writes to a .csv file in selected directory
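To illustrate what the SkintypeS coefficient in the design above represents, the following sketch fits the same additive model (~Pt+Skintype) to simulated values for a single gene, using base R only. The data, the effect size of 0.7 and the noise levels are invented for illustration and do not come from the arrays. With a balanced paired design the coefficient equals the mean within-patient difference (S minus C), and the ordinary t-statistic matches a paired t-test. The moderated t-statistic from eBayes() will generally differ, because it borrows variance information across genes.

```r
set.seed(1)

# Same layout as the targets file: six patients, control (C) then scar (S)
Pt <- factor(rep(1:6, times = 2))
Skintype <- factor(rep(c("C", "S"), each = 6), levels = c("C", "S"))

# Simulated log2 expression for one gene (invented values):
# baseline + patient effect + scar effect + noise
patient_effect <- rnorm(6, sd = 1)
y <- 8 + patient_effect[as.integer(Pt)] + 0.7 * (Skintype == "S") +
  rnorm(12, sd = 0.2)

# Additive model with patient as a blocking factor
fit_lm <- lm(y ~ Pt + Skintype)
scar_coef <- unname(coef(fit_lm)["SkintypeS"])
scar_t <- summary(fit_lm)$coefficients["SkintypeS", "t value"]

# Equivalent paired comparison within patients
paired_diff <- y[Skintype == "S"] - y[Skintype == "C"]
paired_tt <- t.test(y[Skintype == "S"], y[Skintype == "C"], paired = TRUE)
```

Here scar_coef agrees with mean(paired_diff) and scar_t agrees with the paired t-statistic, which is why the blocked model can be read as a per-gene paired comparison of scar against control skin.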


Appendix V - Full list of significantly differentially expressed genes


Appendix V: Full list of significantly differentially expressed genes, with a fold change of ±1.5 and a nominal p value of p<0.5

Gene symbol  Gene name  Fold change  p-value
ACSL5  acyl-CoA synthetase long-chain family member 5  0.64  0.002
ADAMTS5  ADAM metallopeptidase with thrombospondin type 1 motif  0.47  0.000
ADH1B  alcohol dehydrogenase 1B  0.62  0.005
ADRA2A  adrenoceptor alpha 2A  0.64  0.047
AIM2  absent in melanoma 2  1.70  0.001
AMIGO2  adhesion molecule with Ig-like domain 2  1.89  0.000
ANGPTL2  angiopoietin-like 2  0.61  0.003
ANKH  ankylosis, progressive homolog  1.86  0.001
ANKRD1  ankyrin repeat domain 1  3.35  0.002
ANKRD44  ankyrin repeat domain 44  1.60  0.001
AOX1  aldehyde oxidase 1  0.66  0.026
APOD  apolipoprotein D  0.51  0.006
AR  androgen receptor  0.65  0.003
ARMC4  armadillo repeat containing 4  1.55  0.000
ASPA  0.41  0.001
BEX1  brain expressed, X-linked 1  0.49  0.044
BHLHE40  basic helix-loop-helix family, member e40  1.57  0.009
C13orf15  regulator of cell cycle  0.61  0.015
C13orf33  mesenteric estrogen-dependent adipogenesis  0.63  0.007
C21orf96  RUNX1 intronic transcript 1  1.63  0.010
C3  complement component 3  0.60  0.007
C4orf31  neuron-derived neurotrophic factor  0.60  0.006
CADM1  cell adhesion molecule 1  2.30  0.000
CCBE1  collagen and calcium binding EGF domains 1  0.57  0.005
CCL2  chemokine  0.43  0.000
CCRL1  chemokine  0.39  0.000
CD24  CD24 molecule  1.77  0.006
CDCA7  cell division cycle associated 7  0.47  0.003
CDCP1  CUB domain containing protein 1  0.65  0.003
CDH1  cadherin 1, type 1, E-cadherin  1.52  0.014
CDH2  cadherin 2, type 1, N-cadherin  1.52  0.024
CDKN2B  cyclin-dependent kinase inhibitor 2B  1.50  0.003
CLGN  calmegin  0.62  0.020
CLIC6  chloride intracellular channel 6  1.71  0.003
COL11A1  collagen, type XI, alpha 1  1.62  0.052
COL4A1  collagen, type IV, alpha 1  0.65  0.021
COLEC12  collectin sub-family member 12  0.59  0.050
COMP  cartilage oligomeric matrix protein  2.38  0.001
CPM  carboxypeptidase M  0.56  0.001
CSTA  cystatin A  2.04  0.001
CYP24A1  cytochrome P450, family 24, subfamily A, polypeptide 1  0.53  0.020


DDX43  DEAD  1.97  0.003
DIO2  deiodinase, iodothyronine, type II  0.57  0.010
DKK2  dickkopf WNT signaling pathway inhibitor 2  0.55  0.013
DPT  dermatopontin  0.48  0.038
DSG2  desmoglein 2  1.69  0.002
DTNA  dystrobrevin, alpha  0.60  0.012
ECM2  extracellular matrix protein 2, female organ and adipocyte specific  0.65  0.008
EDIL3  EGF-like repeats and discoidin I-like domains 3  2.96  0.001
EMCN  endomucin  0.66  0.001
ENPP2  ectonucleotide pyrophosphatase/phosphodiesterase 2  0.59  0.002
ESM1  endothelial cell-specific molecule 1  1.90  0.010
EYA2  eyes absent homolog 2  2.05  0.000
F2RL2  coagulation factor II  2.17  0.036
FAM155A  family with sequence similarity 155, member A  1.62  0.007
FAM38B  piezo-type mechanosensitive ion channel component 2  1.72  0.011
FBN2  fibrillin 2  3.40  0.000
FGL2  fibrinogen-like 2  0.59  0.055
FLG  filaggrin  2.33  0.005
FNDC1  fibronectin type III domain containing 1  2.85  0.008
FOXF2  forkhead box F2  2.43  0.003
GABBR2  gamma-aminobutyric acid  0.46  0.013
GALNTL2  UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 15  0.51  0.000
GBP2  guanylate binding protein 2, interferon-inducible  0.59  0.001
GPR116  G protein-coupled receptor 116  1.55  0.013
GPR133  G protein-coupled receptor 133  0.64  0.031
GPR158  G protein-coupled receptor 158  1.50  0.003
GPRC5A  G protein-coupled receptor, family C, group 5, member A  0.64  0.005
GRIA1  glutamate receptor, ionotropic, AMPA 1  0.50  0.011
GRIK2  mRNA for kainate receptor subunit  2.57  0.001
GRPR  gastrin-releasing peptide receptor  0.39  0.004
GSTM5  glutathione S-transferase mu 5  0.50  0.003
HHIP  hedgehog interacting protein  1.92  0.036
HIST1H1D  histone cluster 1, H1d  0.60  0.009
HIST1H3H  histone cluster 1, H3h  1.61  0.019
HTR1F  5-hydroxytryptamine  1.63  0.022
HTR2A  5-hydroxytryptamine  1.51  0.006
IGF2BP1  insulin-like growth factor 2 mRNA binding protein 1  0.60  0.000
IGSF10  immunoglobulin superfamily, member 10  0.43  0.002
IL13RA2  interleukin 13 receptor, alpha 2  0.32  0.000
IL1R1  interleukin 1 receptor, type I  0.58  0.005
INHBA  inhibin, beta A  2.35  0.009
ITGB8  integrin, beta 8  1.53  0.039


KCNH1  potassium voltage-gated channel, subfamily H  1.63  0.001
KCNRG  potassium channel regulator  1.59  0.002
KIAA1324L  KIAA1324-like  0.52  0.009
KIAA1462  KIAA1462  1.53  0.005
KITLG  KIT ligand  0.61  0.005
KRT7  keratin 7  0.60  0.005
KRT7  mannan-binding lectin serine peptidase 1  0.51  0.012
LOC401097  mesenteric estrogen-dependent adipogenesis  1.59  0.012
MASP1  met proto-oncogene  0.46  0.000
MET  uncharacterized LOC90768, non-coding RNA.  1.55  0.012
MGC45800  midline 1  1.65  0.012
MID1  microRNA 100  0.66  0.002
MIR100  microRNA 32  1.68  0.000
MIR32  microRNA 519a-2  1.56  0.006
MIR519A2  mohawk homeobox  0.63  0.019
MKX  matrix metallopeptidase 1  1.78  0.002
MMP1  monooxygenase, DBH-like 1  0.52  0.030
MOXD1  N-methylpurine-DNA glycosylase  1.77  0.000
MSTN  myostatin  0.23  0.002
MT1F  metallothionein 1F  0.58  0.002
NID2  nidogen 2  0.66  0.001
NR4A3  nuclear receptor subfamily 4, group A, member 3  0.66  0.002
NR5A2  nuclear receptor subfamily 5, group A, member 2  0.59  0.000
ODZ2  partial mRNA for teneurin-2 variant containing intronic insert  0.59  0.001
ODZ3  teneurin transmembrane protein 1  1.91  0.006
OR5L1  olfactory receptor, family 5, subfamily L, member 1  1.52  0.028
PAPPA  pregnancy-associated plasma protein A, pappalysin 1  0.62  0.002
PCSK1  proprotein convertase subtilisin/kexin type 1  0.53  0.007
PDE1A  phosphodiesterase 1A, calmodulin-dependent  1.77  0.016
PDGFD  platelet derived growth factor D  0.29  0.000
PDGFRL  platelet-derived growth factor receptor-like  0.65  0.007
PLCB1  phospholipase C, beta 1  0.58  0.014
PLD5  phospholipase D family, member 5  1.70  0.009
PLXDC2  plexin domain containing 2  4.56  0.000
PPAPDC1A  phosphatidic acid phosphatase type 2 domain containing 1A  1.91  0.004
PPP4R4  protein phosphatase 4, regulatory subunit 4  1.69  0.015
PRICKLE1  prickle homolog 1  1.96  0.001
PRL  prolactin  0.53  0.009
PROS1  protein S  0.66  0.003
PRUNE2  prune homolog 2  0.58  0.004
PSG5  pregnancy specific beta-1-glycoprotein 5  0.64  0.004
PTGFR  prostaglandin F receptor  0.59  0.006
PTPRB  protein tyrosine phosphatase, receptor type, B  1.77  0.001
PTPRD  protein tyrosine phosphatase, receptor type, D  2.03  0.005
RASGRP1  RAS guanyl releasing protein 1  0.57  0.023


RCAN2  regulator of calcineurin 2  0.58  0.032
REP15  RAB15 effector protein  1.51  0.001
RIMS1  regulating synaptic membrane exocytosis 1  2.14  0.002
RSPO3  R-spondin 3  0.65  0.037
S1PR1  sphingosine-1-phosphate receptor 1  0.66  0.004
SAA1  serum amyloid A1  1.64  0.011
SEMA3A  sema domain, immunoglobulin domain  0.48  0.034
SESN3  sestrin 3  0.61  0.016
SFRP1  secreted frizzled-related protein 1  0.48  0.001
SFRP4  secreted frizzled-related protein 4  1.52  0.031
SNORD111  small nucleolar RNA, C/D box 111  1.66  0.003
SNORD114-20  small nucleolar RNA, C/D box 114-20  1.64  0.048
SNORD114-25  small nucleolar RNA, C/D box 114-25  1.59  0.027
SNORD114-27  small nucleolar RNA, C/D box 114-27  1.74  0.014
SNORD114-28  small nucleolar RNA, C/D box 114-28  1.51  0.043
SNORD114-29  small nucleolar RNA, C/D box 114-29  1.55  0.053
SNORD114-8  small nucleolar RNA, C/D box 114-8  1.60  0.010
SNORD116-14  small nucleolar RNA, C/D box 116-14  0.60  0.006
SNORD72  small nucleolar RNA, C/D box 72  0.66  0.005
SPACA5  sperm acrosome associated 5  1.52  0.009
STC1  stanniocalcin 1  0.59  0.005
STEAP4  STEAP family member 4  0.54  0.012
SULF1  sulfatase 1  1.63  0.018
SVEP1  sushi, von Willebrand factor type A, EGF and pentraxin domain containing 1  0.66  0.024
TBC1D28  TBC1 domain family, member 28  0.62  0.002
TBX5  T-box 5  0.64  0.014
TCF21  transcription factor 21  0.57  0.003
TGFB2  transforming growth factor, beta 2  1.67  0.014
THRB  thyroid hormone receptor, beta  0.63  0.011
TIAM2  T-cell lymphoma invasion and metastasis 2  1.73  0.002
TMTC2  transmembrane and tetratricopeptide repeat containing 2  1.55  0.010
TSPAN2  tetraspanin 2  0.57  0.050
UNC5B  unc-5 homolog B  1.68  0.002
WNT2  wingless-type MMTV integration site family member 2  1.56  0.043
XRCC4  X-ray repair complementing defective repair in Chinese hamster cells 4  1.71  0.007
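The fold changes tabulated above are linear ratios, whereas limma's topTable() reports log2 fold changes in its logFC column. The sketch below shows one way such a table could be derived, assuming the tabulated values correspond to 2^logFC and that "±1.5" means a ratio of at least 1.5 or at most 1/1.5. The gene names and numbers in tt_demo are invented, and the p-value cut-off p_cut is an illustrative assumption rather than the threshold used for this appendix.

```r
# Mock of a few topTable()-style rows; all values are invented for illustration
tt_demo <- data.frame(
  gene    = c("geneA", "geneB", "geneC", "geneD"),
  logFC   = c(1.0, -1.0, 0.3, -0.7),
  P.Value = c(0.001, 0.020, 0.004, 0.200),
  stringsAsFactors = FALSE
)

# Convert the log2 fold change to a linear fold change (scar / control)
tt_demo$FoldChange <- 2^tt_demo$logFC

fc_cut <- 1.5   # ratio threshold in either direction
p_cut  <- 0.05  # assumed nominal p-value cut-off, for illustration only

# Keep genes that pass both the fold-change and the nominal p-value filter
keep <- (tt_demo$FoldChange >= fc_cut | tt_demo$FoldChange <= 1 / fc_cut) &
  tt_demo$P.Value < p_cut
sig <- tt_demo[keep, c("gene", "FoldChange", "P.Value")]
```

In this mock example geneC fails the fold-change filter and geneD fails the p-value filter, so only geneA and geneB remain in sig.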


Appendix VI - 507 differentially expressed gene sets using a Mann-Whitney U test with a p<0.05 revealed in GSEA


Appendix VI: 507 differentially expressed gene sets using a Mann-Whitney U test with a p<0.05 revealed in GSEA

Name  # of Entities  # of Measured Entities  Median change  p-value
extracellular matrix  250  230  -1.014  3.28E-08
extracellular space  1166  908  -1.007  4.404E-07
cell adhesion  658  578  1.000  2.256E-06
blood circulation  44  42  -1.047  7.051E-06
extracellular region  2319  1823  -1.004  1.459E-05
chemotaxis  164  138  -1.009  1.577E-05
ureteric bud development  50  47  -1.041  3.009E-05
response to corticosterone  35  34  -1.058  4.161E-05
heparin binding  166  141  -1.007  8.173E-05
positive regulation of tyrosine phosphorylation of Stat3 protein  27  26  1.000  8.195E-05
plasma membrane  5701  3843  -1.002  9.877E-05
cellular response to tumor necrosis factor  69  57  -1.040  0.0001264
female gonad development  29  23  -1.009  0.0001759
embryonic digestive tract development  21  18  -1.059  0.0001848
immune response  449  355  -1.009  0.0001982
cellular response to interferon-beta  28  16  -1.063  0.0002416
G-protein coupled receptor activity  2257  753  -1.001  0.0003043
positive regulation of cardiac muscle hypertrophy  12  12  -1.054  0.0003277
negative chemotaxis  14  14  -1.009  0.000337
chemorepellent activity  7  7  -1.193  0.0003432
cellular response to lipopolysaccharide  187  103  -1.040  0.000373
positive regulation of inflammatory response  59  57  -1.006  0.0003735
retinal dehydrogenase activity  9  8  -1.118  0.0003774
axon extension involved in axon guidance  16  15  1.044  0.0004785
induction of positive chemotaxis  21  17  1.049  0.000486
positive regulation of gene expression  215  190  -1.015  0.0004986
semaphorin-plexin signaling pathway  20  17  -1.054  0.0005515
positive regulation of ERK1 and ERK2 cascade  113  101  1.011  0.0005549


heterotypic cell-cell adhesion  10  9  -1.124  0.0005722
positive regulation of cardiac muscle cell proliferation  23  20  1.019  0.0005843
N-formyl peptide receptor activity  18  8  -1.081  0.0006314
spinal cord development  29  28  -1.006  0.0007056
regulation of branching involved in salivary gland morphogenesis by mesenchymal-epithelial signaling  5  5  -1.133  0.0007076
collagen binding  61  55  1.012  0.0007191
learning or memory  59  56  -1.045  0.0007442
response to glucocorticoid  139  130  -1.019  0.000892
positive regulation of developmental growth  5  5  -1.446  0.0009359
negative regulation of norepinephrine secretion  10  10  -1.117  0.0010597
phagocytosis, recognition  14  8  -1.109  0.0010815
positive regulation of JAK-STAT cascade  23  20  -1.018  0.0011052
response to cold  43  40  -1.028  0.0011129
DNA-dependent DNA replication  32  29  -1.059  0.0011131
DNA replication  179  155  -1.045  0.0011354
proteinaceous extracellular matrix  318  292  -1.009  0.0011359
female pregnancy  142  110  -1.031  0.0011662
olfactory receptor activity  1710  378  1.003  0.0012617
positive regulation of epithelial to mesenchymal transition  25  23  1.046  0.0012922
neural crest cell differentiation  6  6  1.108  0.0013875
response to low-density lipoprotein particle  5  5  -1.333  0.0014827
positive regulation of corticotropin secretion  7  7  -1.112  0.001515
angiogenesis  249  229  -1.014  0.0015558
G-protein coupled receptor signaling pathway  2248  812  -1.007  0.0015583
positive regulation of catenin import into nucleus  11  11  -1.019  0.0015765
complement activation  50  41  -1.064  0.0016219
regulated secretory pathway  14  13  -1.016  0.0016363
regulation of cytokine biosynthetic process  9  9  -1.077  0.0016771
inflammatory response  367  322  -1.008  0.0017238


extracellular matrix organization  290  266  1.001  0.0017435
response to laminar fluid shear stress  10  10  1.094  0.0017462
ethanol oxidation  11  11  -1.109  0.00176
signal transduction  2955  1835  -1.007  0.0017609
hematopoietic progenitor cell differentiation  16  16  1.062  0.0017836
positive regulation of chemokine secretion  8  8  1.063  0.0018512
cytokine activity  224  184  -1.012  0.001885
negative regulation of endothelial cell apoptotic process  18  16  1.054  0.0019024
adhesion to symbiont  11  8  -1.126  0.0019455
heparan sulfate proteoglycan binding  22  19  1.029  0.0019699
chemokine activity  56  44  -1.006  0.0020452
negative regulation of angiogenesis  69  63  -1.013  0.0020455
regulation of striated muscle tissue development  7  7  -1.082  0.0020462
parturition  14  12  -1.086  0.0024596
positive regulation of follicle-stimulating hormone secretion  5  5  1.102  0.0025665
regulation of cell proliferation  193  165  -1.015  0.0026544
positive regulation of myeloid cell differentiation  7  5  -1.142  0.0027348
response to estrogen  105  97  -1.020  0.0027937
JAK-STAT cascade  34  33  -1.044  0.0028152
regulation of transcription involved in G1-S transition of mitotic cell cycle  22  21  -1.084  0.0028779
regulation of ossification  15  15  -1.038  0.0029015
complement activation, classical pathway  61  45  -1.046  0.0029754
excitatory synapse  30  25  -1.026  0.00299
cyclin-dependent protein serine-threonine kinase regulator activity  14  10  -1.068  0.0030889
cellular response to interleukin-1  53  46  -1.015  0.0031186
fever generation  6  6  1.045  0.0033609
vascular endothelial growth factor receptor signaling pathway  25  22  -1.027  0.0036283


negative regulation of transposition  8  6  -1.132  0.0036784
positive regulation of cell proliferation  534  478  -1.004  0.0039509
STAT protein import into nucleus  7  7  1.053  0.0039517
liver development  127  113  -1.025  0.003976
sterol metabolic process  20  19  -1.034  0.0039797
CD4 receptor binding  5  5  -1.129  0.0040767
DNA synthesis involved in DNA repair  12  10  -1.047  0.0041127
positive regulation of transcription regulatory region DNA binding  8  8  -1.030  0.0042354
integral component of plasma membrane  1292  1100  -1.004  0.0042435
actinin binding  10  8  1.132  0.0043218
response to cortisol  6  5  -1.118  0.0043679
response to radiation  48  44  -1.023  0.004373
negative regulation of protein serine-threonine kinase activity  15  14  1.015  0.0043964
DNA cytosine deamination  6  5  -1.194  0.0044323
Wnt-protein binding  33  28  -1.015  0.0044911
DNA replication initiation  25  23  -1.061  0.0045465
negative regulation of systemic arterial blood pressure  9  8  1.073  0.0045562
positive regulation of peptidyl-tyrosine phosphorylation  89  80  -1.005  0.0046054
laminin-1 complex  7  7  -1.007  0.0046197
CCR2 chemokine receptor binding  5  5  -1.128  0.004688
negative regulation of cell proliferation  470  415  -1.010  0.0046922
embryonic camera-type eye development  12  12  -1.083  0.0047355
alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid selective glutamate receptor complex  29  28  -1.016  0.0047548
glutathione metabolic process  53  46  1.007  0.0048706
generation of neurons  18  17  -1.039  0.0049182
extracellular matrix structural constituent  73  69  -1.012  0.0049402
cellular response to dsRNA  6  6  -1.121  0.0049967
symbiont-containing vacuole membrane  10  6  -1.126  0.005016


activation of JNKK activity  8  8  -1.051  0.0054312
motor neuron axon guidance  28  25  1.027  0.0054507
skeletal system development  167  148  -1.005  0.005453
deaminase activity  6  6  -1.141  0.005464
copper ion import  6  6  -1.078  0.005518
embryonic skeletal system morphogenesis  63  54  -1.033  0.005755
cellular response to hormone stimulus  46  42  -1.038  0.0057596
antigen binding  97  53  -1.035  0.0057791
positive regulation of small GTPase mediated signal transduction  5  5  -1.083  0.0058257
spindle organization  19  16  -1.086  0.0058742
phosphatidylserine acyl-chain remodeling  17  16  1.057  0.0058843
response to lithium ion  21  21  -1.046  0.0059304
growth factor activity  186  165  -1.001  0.0059915
lung vasculature development  9  8  -1.037  0.0060035
Wnt-activated receptor activity  22  21  -1.015  0.0061684
response to mechanical stimulus  92  83  -1.028  0.0062763
melanocyte differentiation  30  27  -1.024  0.0062767
neuronal action potential  25  22  -1.010  0.0062832
mitotic cytokinesis  21  20  -1.037  0.0063986
positive regulation of actin cytoskeleton reorganization  14  12  1.090  0.0064818
sympathetic nervous system development  27  25  -1.018  0.0065088
extrinsic component of membrane  54  48  -1.027  0.0065501
cellular response to ethanol  13  11  1.044  0.0065526
receptor signaling protein activity  45  45  -1.018  0.006623
lung development  145  121  -1.007  0.0066951
regulation of cytokine secretion  14  11  -1.049  0.0067571
response to wounding  93  83  -1.006  0.006832
response to estradiol  160  145  -1.021  0.0068344
cAMP catabolic process  16  16  1.007  0.0068941
blood vessel development  72  64  -1.016  0.0069068
negative regulation of astrocyte differentiation  15  14  -1.037  0.006997
positive regulation of cytokine secretion  36  32  -1.014  0.0070193
anchored component of membrane  163  134  1.001  0.0070305


low-density lipoprotein particle clearance  9  9  -1.126  0.00706
AMP binding  14  14  -1.036  0.0071077
positive regulation of phosphatidylinositol 3-kinase signaling  61  52  -1.004  0.007211
negative regulation of smooth muscle cell apoptotic process  7  6  -1.073  0.0072142
retinal metabolic process  14  13  -1.017  0.0073446
response to fungicide  14  14  -1.054  0.0073655
negative regulation of activation-induced cell death of T cells  6  5  1.108  0.0073914
intermediate filament  189  137  -1.003  0.0073923
mitotic M phase  9  9  -1.094  0.0074915
regulation of cell migration  67  59  -1.002  0.0075064
positive regulation of neutrophil chemotaxis  19  19  1.016  0.0075219
positive regulation of activated T cell proliferation  23  20  1.034  0.0077793
blood vessel morphogenesis  33  30  -1.010  0.0078004
activin binding  13  12  1.055  0.0079154
R-SMAD binding  21  20  -1.016  0.0080122
peripheral nervous system development  42  38  1.029  0.0080207
Rho GTPase activator activity  28  24  1.013  0.0080736
phosphoglucomutase activity  6  5  1.128  0.0081371
node of Ranvier  15  14  1.019  0.0081891
response to cytokine  104  89  1.002  0.0081906
embryonic hindgut morphogenesis  5  5  1.081  0.0081953
positive regulation of cell-matrix adhesion  24  20  1.026  0.0083267
central nervous system myelination  10  9  1.034  0.0083282
type II transforming growth factor beta receptor binding  8  7  1.068  0.0085132
positive regulation of nitric-oxide synthase biosynthetic process  19  17  -1.035  0.0087048
hemopoiesis  106  91  -1.012  0.0087204
vasoconstriction  19  19  -1.035  0.0089511
calcium- and calmodulin-dependent protein kinase complex  8  7  -1.078  0.0089636
artery smooth muscle contraction  9  9  -1.059  0.0089813


interleukin-1 receptor activity  8  8  -1.138  0.0089855
male gonad development  118  104  -1.011  0.0091123
cGMP catabolic process  7  7  1.018  0.0091236
cell-cell adhesion  144  132  -1.010  0.0091595
cell growth involved in cardiac muscle cell development  7  7  -1.037  0.0092975
positive regulation of macrophage derived foam cell differentiation  15  15  -1.041  0.0093219
spleen development  38  32  1.015  0.0093694
cytokine secretion  11  10  1.138  0.0095044
brain renin-angiotensin system  7  6  1.066  0.0095629
cyclin binding  21  20  -1.044  0.0097817
regulation of embryonic development  12  11  1.081  0.0098125
regulation of fibroblast growth factor receptor signaling pathway  8  8  1.090  0.0098351
organ regeneration  99  84  -1.032  0.0100262
limb morphogenesis  31  27  -1.039  0.0101815
negative regulation of single stranded viral RNA replication via double stranded DNA intermediate  7  6  -1.040  0.0102274
transdifferentiation  9  8  -1.049  0.010289
positive regulation of tyrosine phosphorylation of Stat1 protein  9  9  -1.059  0.0104203
cyclin-dependent protein kinase holoenzyme complex  12  10  -1.026  0.0104214
3',5'-cyclic-nucleotide phosphodiesterase activity  27  24  1.007  0.0104388
cellular response to interferon-gamma  45  36  -1.034  0.0106609
fibronectin binding  26  22  1.011  0.0106968
DNA strand elongation involved in DNA replication  31  31  -1.060  0.0107737
relaxation of cardiac muscle  9  9  -1.037  0.0108255
attachment of spindle microtubules to kinetochore  9  9  -1.094  0.0111551
response to superoxide  6  6  -1.078  0.0112798
primary amine oxidase activity  10  6  -1.124  0.011597
positive regulation of protein phosphorylation  146  132  -1.020  0.0116266
AT DNA binding  8  7  -1.090  0.0116389
axon guidance  342  332  -1.003  0.01177
patterning of blood vessels  45  40  -1.020  0.0117821


interleukin-1 receptor binding  20  12  -1.062  0.0118471
cellular response to hydroperoxide  5  5  -1.163  0.0118535
negative regulation of growth  20  18  -1.044  0.0122314
response to lipopolysaccharide  228  207  -1.018  0.0122704
glycogen catabolic process  18  18  1.050  0.0124542
positive regulation of cell migration  158  136  -1.009  0.0124741
xenobiotic metabolic process  159  144  -1.007  0.0124816
calmodulin-dependent protein kinase activity  17  16  1.017  0.0125357
neuromuscular process  27  24  1.036  0.0125966
macrophage chemotaxis  15  14  -1.022  0.012604
phosphatidylcholine acyl-chain remodeling  25  24  1.044  0.0128151
sleep  13  12  1.053  0.0128645
mast cell granule  8  6  1.062  0.0129421
response to water deprivation  7  6  -1.050  0.0129444
3',5'-cyclic-AMP phosphodiesterase activity  12  12  1.007  0.0130099
kidney development  140  126  1.006  0.0130397
lens fiber cell differentiation  12  11  -1.054  0.0130816
positive regulation of epithelial cell proliferation involved in lung morphogenesis  5  5  1.141  0.013093
cyclic-nucleotide phosphodiesterase activity  8  8  1.018  0.0132296
receptor binding  438  378  -1.000  0.0136526
regulation of dendrite development  17  14  -1.072  0.0137553
phosphatidylethanolamine acyl-chain remodeling  23  22  1.046  0.0140502
axon initial segment  12  12  -1.064  0.0141198
response to gamma radiation  42  41  -1.033  0.0141212
androgen binding  5  5  -1.084  0.0143028
positive regulation of apoptotic process  371  344  -1.011  0.0145701
positive regulation of skeletal muscle tissue development  6  6  1.041  0.0146396
collagen type IV  8  6  -1.105  0.0146839
activation of phospholipase C activity  60  60  -1.002  0.0146868
cell migration  175  152  1.001  0.0147056
positive regulation of interleukin-5 production  5  5  -1.082  0.0147757
positive regulation of monocyte chemotaxis  14  10  1.089  0.0149107


regulation of epithelial cell proliferation  16  14  -1.035  0.0149372
regulation of vasodilation  17  16  1.047  0.0149389
calcium ion binding  792  645  1.003  0.015114
growth hormone receptor binding  7  6  1.113  0.0151195
cerebral cortex regionalization  6  6  -1.007  0.0156982
condensed chromosome, centromeric region  6  5  -1.097  0.0157953
positive regulation of protein kinase B signaling  80  72  -1.000  0.0158001
positive regulation of endothelial cell proliferation  70  62  -1.011  0.015888
positive regulation of MAP kinase activity  54  48  1.015  0.0159191
maternal process involved in parturition  8  8  -1.042  0.0161093
muscle fiber development  17  13  1.042  0.0161115
lysozyme activity  17  10  1.063  0.0161146
platelet-derived growth factor receptor binding  16  14  -1.012  0.0163996
growth factor receptor binding  6  5  -1.129  0.0164481
blood vessel remodeling  45  41  -1.010  0.0165057
synaptic cleft  5  5  1.077  0.0165649
cardiac left ventricle morphogenesis  16  13  -1.092  0.0166069
positive regulation of calcineurin-NFAT signaling cascade  5  5  -1.161  0.0166697
cellular triglyceride homeostasis  6  5  -1.091  0.0167112
positive chemotaxis  32  26  1.012  0.0167181
somatic stem cell maintenance  52  42  -1.035  0.0170704
positive regulation of vascular permeability  13  11  1.058  0.0173922
interleukin-6 receptor binding  6  5  -1.097  0.017592
G-protein coupled purinergic nucleotide receptor activity  12  10  -1.002  0.017695
semaphorin receptor binding  10  9  -1.053  0.0179815
regulation of cell growth  90  82  1.010  0.0181524
activin receptor signaling pathway  12  12  1.024  0.0181561
regulation of complement activation  26  25  -1.038  0.0182037
neuromuscular junction  57  51  -1.014  0.0182065
positive regulation of interleukin-10 production  20  18  -1.000  0.0182356


positive regulation of tyrosine phosphorylation of Stat5 protein  18  16  1.008  0.0182598
phosphatidylglycerol acyl-chain remodeling  17  16  1.050  0.0183562
regulation of behavior  11  11  -1.009  0.0184562
phospholipid dephosphorylation  5  5  1.043  0.0185854
positive regulation of interleukin-6 secretion  6  6  1.066  0.0186119
lung alveolus development  46  39  -1.037  0.0186461
response to antibiotic  44  41  -1.021  0.0186801
ectopic germ cell programmed cell death  9  8  -1.025  0.018827
positive regulation of macrophage differentiation  12  12  1.066  0.0189279
positive regulation of leukocyte chemotaxis  13  12  -1.014  0.018999
negative regulation of necrotic cell death  9  8  1.023  0.0193097
peptide cross-linking via chondroitin 4-sulfate glycosaminoglycan  10  6  1.105  0.0194415
mast cell activation  13  12  1.004  0.0196564
regulation of inflammatory response  60  54  1.014  0.0196622
sebaceous gland development  7  5  1.086  0.0196811
negative regulation of sequence-specific DNA binding transcription factor activity  67  63  -1.012  0.0198222
response to interferon-beta  7  6  -1.015  0.0198359
cellular response to zinc ion  13  13  -1.063  0.0199002
activity, acting on the CH-CH group of donors, NAD or NADP as acceptor  6  6  1.001  0.020292
folic acid binding  14  12  1.067  0.0202942
axonogenesis involved in innervation  5  5  -1.081  0.0203622
response to electrical stimulus  42  39  -1.030  0.020572
positive regulation of peptidase activity  15  13  -1.032  0.020801
positive regulation of cytosolic calcium ion concentration involved in phospholipase C-activating G-protein coupled signaling pathway  18  18  -1.009  0.0208675


testosterone dehydrogenase (NAD+) activity  7  7  -1.010  0.0210282
palate development  79  69  -1.016  0.0210485
dopamine biosynthetic process  12  11  -1.066  0.0211107
lymphocyte chemotaxis  10  9  1.013  0.0211943
reproductive system development  6  6  1.018  0.0214894
ethanol catabolic process  8  7  -1.109  0.0214983
phospholipase A2 activity  28  24  1.044  0.0215711
cytoskeletal protein binding  74  66  -1.001  0.0217559
negative regulation of cell migration  96  83  1.009  0.0219021
thrombin receptor activity  6  5  1.075  0.0220942
salivary gland cavitation  5  5  1.041  0.0222484
signal transduction by phosphorylation  41  39  1.001  0.0222602
negative regulation of protein import into nucleus  7  5  1.055  0.0224854
endocytic vesicle membrane  56  54  -1.001  0.0225235
steroid metabolic process  127  110  1.005  0.0226932
ionotropic glutamate receptor complex  14  14  1.060  0.0228927
negative regulation of epithelial cell proliferation  65  58  -1.013  0.0229439
apoptotic cell clearance  14  14  1.065  0.0229925
positive regulation of receptor internalization  18  18  1.039  0.023261
chemoattractant activity  26  20  1.050  0.0233231
positive regulation of DNA biosynthetic process  12  11  -1.073  0.0234577
muscle organ development  113  107  -1.005  0.0235137
I-SMAD binding  11  11  1.090  0.0235352
detection of chemical stimulus involved in sensory perception of smell  1407  298  1.003  0.0235701
regulation of relaxation of cardiac muscle  6  5  -1.078  0.0238754
CCR5 chemokine receptor binding  7  7  -1.078  0.0240057
eyelid development in camera-type eye  11  11  -1.051  0.0242371
negative regulation of Ras protein signal transduction  27  23  1.031  0.0244695
tetrahydrobiopterin biosynthetic process  7  6  1.075  0.0247628
G-protein beta-subunit binding  7  6  1.083  0.0248646
calmodulin binding  190  177  1.006  0.0249623


regulation of metabolic process  7  7  -1.003  0.0250244
response to acidity  10  6  1.132  0.0253789
signal transducer activity  1905  856  -1.005  0.0254058
positive regulation of epithelial cell proliferation  78  68  -1.010  0.0255302
salivary gland morphogenesis  10  10  1.018  0.025726
positive regulation of cell division  50  45  1.006  0.0257824
cysteine metabolic process  7  6  1.009  0.0258851
cell surface  508  456  -1.001  0.0258869
branching involved in ureteric bud morphogenesis  55  53  -1.003  0.0259047
receptor signaling protein tyrosine kinase activity  12  12  -1.014  0.0259471
dopamine transport  6  6  1.085  0.0259654
response to reactive oxygen species  24  22  1.016  0.0264261
polysaccharide metabolic process  8  8  -1.050  0.02645
protein targeting  53  45  1.016  0.0268116
negative regulation of cell differentiation  64  58  -1.000  0.0268966
positive regulation of immune response  14  9  1.010  0.0272159
cellular response to cadmium ion  15  15  -1.063  0.0272215
positive regulation of membrane potential  9  8  1.067  0.0273746
positive regulation of Ras protein signal transduction  20  17  -1.048  0.0274233
positive regulation of cellular protein metabolic process  12  11  1.027  0.02775
receptor catabolic process  7  7  1.096  0.0277575
voltage-gated potassium channel complex  79  76  -1.005  0.0278026
regulation of binding  5  5  1.090  0.0278305
organ formation  8  7  1.006  0.028128
inflammatory response to antigenic stimulus  19  15  1.019  0.0284227
positive regulation of p38MAPK cascade  7  7  1.092  0.0286469
insulin-like growth factor binding  23  23  1.014  0.0288223
positive regulation of stress-activated MAPK cascade  21  20  -1.020  0.0290676
cardiac myofibril assembly  15  12  -1.012  0.0291655


negative regulation of vascular permeability  12  11  -1.036  0.0293177
response to vitamin A  31  28  1.006  0.0294136
calcium-dependent cell-cell adhesion  33  29  1.038  0.029468
cellular response to cAMP  53  51  -1.037  0.0295676
positive regulation of pathway-restricted SMAD protein phosphorylation  28  26  1.024  0.0299868
regulation of vascular endothelial growth factor production  5  5  1.013  0.0300148
potassium ion transport  136  134  -1.012  0.0301179
mitotic sister chromatid segregation  20  17  -1.054  0.0301258
nucleosome assembly  157  101  -1.031  0.030172
positive regulation of Wnt signaling pathway  28  25  -1.009  0.0302684
regulation of developmental pigmentation  8  7  1.047  0.0306629
intramolecular transferase activity, phosphotransferases  8  8  1.128  0.0307939
oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, reduced flavin or flavoprotein as one donor, and incorporation of one atom of oxygen  33  18  -1.017  0.0308296
interleukin-6-mediated signaling pathway  8  8  -1.091  0.0309208
positive regulation of keratinocyte proliferation  6  5  -1.100  0.0310272
bud elongation involved in lung branching  7  6  -1.052  0.0311666
astrocyte cell migration  9  8  1.055  0.0312227
vasculogenesis  76  66  -1.035  0.0313722
striated muscle cell differentiation  17  15  1.042  0.0315473
mechanoreceptor differentiation  7  7  -1.064  0.0317303
intermediate filament cytoskeleton organization  14  13  -1.029  0.0317925
NAD(P)+-protein-arginine ADP-ribosyltransferase activity  7  6  1.064  0.0322048
E-box binding  30  28  -1.034  0.0322945


cellular response to potassium ion 6 5 -1.091 0.0325009
phosphatidylinositol phospholipase C activity 31 31 -1.004 0.032736
fat pad development 8 6 -1.054 0.0327601
regulation of immune response 114 98 -1.022 0.0328947
positive regulation of odontogenesis 5 5 1.040 0.0330293
S-adenosylmethionine-dependent methyltransferase activity 19 16 -1.062 0.0332747
thrombin receptor signaling pathway 9 9 1.042 0.0334057
myoblast fusion 23 20 1.050 0.0334367
negative regulation of erythrocyte differentiation 10 8 -1.080 0.0337957
cellular response to cytokine stimulus 28 25 1.020 0.0341013
cell wall macromolecule catabolic process 23 15 1.042 0.0341238
clathrin coat of coated pit 6 6 1.072 0.0341591
platelet-derived growth factor binding 12 12 -1.012 0.0341622
response to steroid hormone 53 50 -1.040 0.0341984
mRNA 3'-UTR AU-rich region binding 7 6 1.040 0.0343614
positive regulation of NF-kappaB import into nucleus 21 20 -1.049 0.0345604
response to virus 145 121 -1.016 0.0347967
face development 9 9 -1.072 0.0349187
RNA polymerase II regulatory region sequence-specific DNA binding 35 29 -1.006 0.0349452
synapse organization 33 30 -1.024 0.0349636
skeletal system morphogenesis 56 45 -1.012 0.0349686
cellular response to leptin stimulus 5 5 1.046 0.034994
guanyl-nucleotide exchange factor complex 5 5 -1.094 0.0351129
metallocarboxypeptidase activity 28 24 -1.011 0.0352608
Rac guanyl-nucleotide exchange factor activity 11 11 -1.033 0.0355073
sodium channel regulator activity 23 21 -1.021 0.0355237


negative regulation of neurotrophin TRK receptor signaling pathway 6 5 -1.135 0.0356518
ketone body catabolic process 6 5 -1.027 0.0357242
blood coagulation, extrinsic pathway 6 6 -1.048 0.0358051
glutamate receptor signaling pathway 14 14 -1.020 0.0358949
aldehyde dehydrogenase (NAD) activity 10 9 -1.032 0.0359237
positive regulation of SMAD protein import into nucleus 11 11 1.049 0.0361928
positive regulation of interleukin-8 production 22 21 -1.049 0.0362535
peptidyl-arginine methylation, to asymmetrical-dimethyl arginine 5 5 -1.098 0.0364115
protein-arginine omega-N asymmetric methyltransferase activity 5 5 -1.098 0.0364115
positive regulation of macrophage chemotaxis 11 10 1.011 0.0364361
protein ADP-ribosylation 22 19 -1.057 0.0367381
leading edge membrane 6 5 1.082 0.0371159
positive regulation of fibroblast proliferation 52 46 -1.009 0.0372002
IgG binding 10 9 -1.064 0.0374731
dendritic spine membrane 8 7 1.090 0.0377986
intermediate filament organization 16 14 1.033 0.0378824
response to immobilization stress 25 21 -1.033 0.0380001
telencephalon cell migration 8 6 1.024 0.0380114
cleavage furrow formation 8 7 -1.097 0.0380288
transmembrane receptor protein tyrosine kinase signaling pathway 99 94 -1.004 0.0380476
response to aluminum ion 9 9 -1.044 0.038071
intracellular signal transduction 495 452 -1.003 0.0381839
lateral line nerve glial cell development 8 7 1.056 0.0384386
iridophore differentiation 8 7 1.056 0.0384386
positive regulation of Rac protein signal transduction 6 6 1.026 0.0385595
retina morphogenesis in camera-type eye 13 13 -1.037 0.0386881


positive regulation of acrosome reaction 5 5 -1.089 0.0387511
negative regulation of peptidyl-threonine phosphorylation 9 8 1.014 0.0387818
gamma-glutamyltransferase activity 12 8 -1.014 0.0392226
positive regulation of cell death 43 40 -1.037 0.0392869
reproductive structure development 11 11 -1.021 0.0395009
wound healing 121 109 1.010 0.0395451
estradiol 17-beta-dehydrogenase activity 12 12 1.072 0.0398591
positive regulation of glycolytic process 15 13 -1.045 0.0399877
negative regulation of endothelial cell migration 17 15 -1.054 0.0402665
P granule 9 8 1.028 0.0403846
CCR1 chemokine receptor binding 6 6 1.093 0.0404492
positive regulation of interleukin-1 beta production 9 9 -1.035 0.0405359
positive regulation of blood coagulation 13 11 1.011 0.0406447
positive regulation of cardiac muscle cell apoptotic process 7 6 1.115 0.0406691
positive regulation of interleukin-6 production 50 38 -1.021 0.0411671
response to acid 22 17 -1.016 0.0416411
growth factor binding 42 41 1.017 0.0416886
potassium channel activity 87 87 -1.016 0.0417126
regulation of protein kinase B signaling 11 11 1.057 0.0420238
morphogenesis of an epithelial sheet 8 8 1.059 0.0420246
detection of mechanical stimulus involved in sensory perception of pain 13 12 -1.020 0.0420533
endocardial cushion development 12 10 1.011 0.042162
protein binding transcription factor activity 13 13 1.035 0.0422751
somatic stem cell division 11 10 1.030 0.0424853
aromatase activity 46 27 -1.006 0.0425797
DNA methylation involved in gamete generation 16 15 -1.045 0.0427259
glucose import 6 6 1.084 0.0427311


protein tyrosine kinase activator activity 12 10 1.056 0.0429889
regulation of sodium ion transmembrane transport 6 6 1.089 0.0431278
forebrain radial glial cell differentiation 6 6 1.053 0.0432371
defense response to virus 184 146 -1.030 0.0434008
fibroblast growth factor-activated receptor activity 5 5 1.096 0.0434638
peptidyl-arginine methylation 5 5 -1.098 0.0437691
cellular response to peptide 13 13 -1.031 0.0441314
serotonin binding 12 10 1.071 0.0442132
cell chemotaxis 30 25 -1.006 0.0442337
response to UV-B 10 9 -1.057 0.0442358
positive regulation of glucose transport 7 5 -1.068 0.0443345
negative regulation of multicellular organism growth 15 15 1.039 0.0447462
positive regulation of vascular endothelial growth factor receptor signaling pathway 22 20 -1.028 0.0447929
substrate-specific transmembrane transporter activity 15 13 -1.034 0.0447975
growth 46 39 -1.011 0.0450976
positive regulation of immunoglobulin secretion 7 7 1.050 0.0452006
regulation of cholesterol biosynthetic process 7 6 1.079 0.045363
long-term synaptic potentiation 31 26 1.026 0.0454553
fascia adherens 13 13 1.006 0.0455811
neurexin family protein binding 17 15 -1.026 0.0457294
positive regulation of Wnt signaling pathway, planar cell polarity pathway 6 5 -1.077 0.0458142
interstitial matrix 20 16 -1.041 0.0458511
activated T cell proliferation 11 10 -1.006 0.0459869
cellular response to organic cyclic compound 76 70 -1.020 0.0460293
oxidoreductase activity, acting on the aldehyde or oxo group of donors, NAD or NADP as acceptor 30 21 -1.051 0.0460491
central nervous system development 152 139 1.002 0.0461002


negative regulation of osteoblast differentiation 53 37 -1.039 0.0462126
adrenal gland development 32 27 -1.045 0.0463009
response to fatty acid 28 26 -1.049 0.0463436
pituitary gland development 35 33 -1.028 0.0465321
neuron projection membrane 9 7 1.042 0.0466072
positive regulation of DNA damage response, signal transduction by p53 class mediator 14 11 -1.009 0.0466164
complement binding 8 7 -1.120 0.0466452
pathway-restricted SMAD protein phosphorylation 13 13 1.024 0.0470969
positive regulation of cell cycle 32 30 1.044 0.047198
response to organic cyclic compound 239 219 -1.016 0.047214
ventricular cardiac muscle tissue morphogenesis 32 29 -1.018 0.0472218
positive regulation of interleukin-13 production 5 5 1.034 0.0474321
oocyte maturation 18 18 -1.036 0.0475638
positive regulation of BMP signaling pathway 33 30 1.014 0.04761
synaptic transmission 452 439 -1.009 0.0476724
positive regulation of cardioblast differentiation 5 5 1.049 0.0476908
transcription factor complex 309 276 -1.011 0.0477185
ligand-gated sodium channel activity 10 5 1.084 0.0477672
vascular endothelial growth factor signaling pathway 10 8 -1.068 0.0478353
positive regulation of mitotic metaphase-anaphase transition 6 6 1.036 0.0479276
mesenchymal cell differentiation 10 10 1.052 0.0481533
positive regulation of positive chemotaxis 14 13 -1.068 0.048179
positive regulation of cell-cell adhesion 17 16 -1.009 0.0483837
smooth muscle cell differentiation 21 19 -1.036 0.0485126
positive regulation of protein tyrosine kinase activity 23 21 1.017 0.0486038
peptidyl-tyrosine phosphorylation 114 106 -1.009 0.0489741
acute inflammatory response to antigenic stimulus 8 7 1.043 0.0491019


ligand-activated sequence-specific DNA binding RNA polymerase II transcription factor activity 44 43 -1.027 0.0495191
orbitofrontal cortex development 5 5 -1.092 0.049597


Appendix VII - Full list of gene ontologies associated with 16 differentially methylated and expressed genes


Appendix VII: Full list of gene ontologies associated with 16 differentially methylated and expressed genes

Gene Symbol: AIM2
Gene Name: Absent in Melanoma 2
Gene Ontology Terms: protein binding,activation of innate immune response,positive regulation of defense response to virus by host,immune system process,DNA binding,double-stranded DNA binding,apoptotic process,cytosol,nucleus,cytoplasm,inflammatory response,immune response,identical protein binding,negative regulation of NF-kappaB transcription factor activity,positive regulation of protein oligomerization,positive regulation of interleukin-1 beta production,tumor necrosis factor-mediated signaling pathway,cellular response to interferon-beta,cellular response to drug,nucleotide-binding domain, leucine rich repeat containing receptor signaling pathway,innate immune response,interleukin-1 beta secretion,positive regulation of interleukin-1 beta secretion,positive regulation of NF-kappaB transcription factor activity,pyroptosis,AIM2 inflammasome complex,positive regulation of cysteine-type endopeptidase activity

Gene Symbol: AR
Gene Name: androgen receptor
Gene Ontology Terms: DNA binding,sequence-specific DNA binding transcription factor activity,steroid hormone receptor activity,androgen receptor activity,steroid binding,metal ion binding,nucleus,transcription, DNA-templated,regulation of transcription, DNA-templated,zinc ion binding,androgen receptor signaling pathway,steroid hormone mediated signaling pathway,sequence-specific DNA binding

Gene Symbol: CDCP1
Gene Name: CUB Domain Containing Protein 1
Gene Ontology Terms: plasma membrane,integral component of membrane,membrane,extracellular region

Gene Symbol: CLIC6
Gene Name: Chloride intracellular channel protein 6
Gene Ontology Terms: plasma membrane,integral component of membrane,membrane,voltage-gated ion channel activity,chloride channel activity,cytoplasm,transport,ion transport,chloride transport,protein C-terminus binding,D2 dopamine receptor binding,D3 dopamine receptor binding,D4 dopamine receptor binding,chloride channel complex,regulation of ion transmembrane transport,protein homodimerization activity,dopamine receptor binding,extracellular vesicular exosome,chloride transmembrane transport

Gene Symbol: COLEC12
Gene Name: Collectin 12
Gene Ontology Terms: plasma membrane,integral component of membrane,pattern recognition receptor signaling pathway,membrane,defense response,carbohydrate binding,scavenger receptor activity,collagen trimer,metal ion binding,galactose binding,receptor-mediated endocytosis,phagocytosis, recognition,immune response,signaling pattern recognition receptor activity,carbohydrate mediated signaling,low-density lipoprotein particle binding,endocytic vesicle membrane,protein homooligomerization,innate immune response,extracellular vesicular exosome

Gene Symbol: CSTA
Gene Name: Cystatin A (Stefin A)
Gene Ontology Terms: intracellular,cornified envelope,protease binding,protein binding, bridging,cell adhesion,endopeptidase inhibitor activity,cysteine-type endopeptidase inhibitor activity,structural molecule activity,extracellular space,nucleus,cytoplasm,nucleoplasm,negative regulation of peptidase activity,negative regulation of endopeptidase activity,single organismal cell-cell adhesion,peptidase inhibitor activity,peptide cross-linking,keratinocyte differentiation,negative regulation of proteolysis,extracellular vesicular exosome

Gene Symbol: FOXF2
Gene Name: Forkhead Box F2
Gene Ontology Terms: RNA polymerase II regulatory region sequence-specific DNA binding,RNA polymerase II transcription regulatory region sequence-specific DNA binding transcription factor activity involved in positive regulation of transcription,epithelial to mesenchymal transition,DNA binding,sequence-specific DNA binding transcription factor activity,nucleus,transcription factor complex,transcription, DNA-templated,regulation of transcription, DNA-templated,transcription factor binding,positive regulation of transcription, DNA-templated,negative regulation of transcription, DNA-templated,extracellular matrix organization,establishment of planar polarity of embryonic epithelium,sequence-specific DNA binding,positive regulation of transcription from RNA polymerase II promoter,embryonic digestive tract development,embryonic camera-type eye morphogenesis,genitalia development,palate development


Gene Symbol: GRIA1
Gene Name: Glutamate Receptor, Ionotropic, AMPA 1
Gene Ontology Terms: plasma membrane,protein binding,integral component of membrane,membrane,receptor activity,ionotropic glutamate receptor activity,alpha-amino-3-hydroxy-5-methyl-4-isoxazole propionate selective glutamate receptor activity,signal transduction,endoplasmic reticulum,ion channel activity,extracellular-glutamate-gated ion channel activity,endoplasmic reticulum membrane,transport,ion transport,synaptic transmission,long-term memory,synaptic vesicle,glutamate receptor activity,ionotropic glutamate receptor complex,cell surface,postsynaptic density,neuron projection,cell junction,PDZ domain binding,dendrite,endocytic vesicle membrane,receptor internalization,alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid selective glutamate receptor complex,dendrite membrane,dendritic spine membrane,ion transmembrane transport,ionotropic glutamate receptor signaling pathway,synaptic transmission, glutamatergic,cell projection,neuronal cell body,dendritic spine,axonal spine,neuron spine,synapse,postsynaptic membrane,recycling endosome,long term synaptic depression

Gene Symbol: IL1R1
Gene Name: Interleukin 1 Receptor, Type I
Gene Ontology Terms: plasma membrane,protein binding,integral component of membrane,protease binding,membrane,extracellular region,signal transducer activity,transmembrane signaling receptor activity,interleukin-1 receptor activity,interleukin-1, Type I, activating receptor activity,signal transduction,integral component of plasma membrane,platelet-derived growth factor receptor binding,immune response,cell surface receptor signaling pathway,cell surface,cytokine-mediated signaling pathway,regulation of inflammatory response,interleukin-1-mediated signaling pathway,response to interleukin-1

Gene Symbol: KRT7
Gene Name: Keratin 7
Gene Ontology Terms: protein binding,intermediate filament,structural molecule activity,nucleus,cytoplasm,viral process,keratin filament,extracellular vesicular exosome

Gene Symbol: MEDAG
Gene Name: MEDAG (mesenteric estrogen-dependent adipogenesis)
Gene Ontology Terms: cytoplasm,positive regulation of fat cell differentiation

Gene Symbol: MKX
Gene Name: Mohawk Homeobox
Gene Ontology Terms: negative regulation of transcription from RNA polymerase II promoter,RNA polymerase II core promoter proximal region sequence-specific DNA binding,sequence-specific DNA binding RNA polymerase II transcription factor activity,RNA polymerase II core promoter proximal region sequence-specific DNA binding transcription factor activity involved in positive regulation of transcription,RNA polymerase II core promoter proximal region sequence-specific DNA binding transcription factor activity involved in negative regulation of transcription,RNA polymerase II transcription factor binding,tendon sheath development,DNA binding,transcription from RNA polymerase II promoter,nucleus,regulation of transcription, DNA-templated,multicellular organismal development,muscle organ development,regulation of gene expression,positive regulation of gene expression,collagen fibril organization,positive regulation of collagen biosynthetic process,tendon cell differentiation,tendon formation,sequence-specific DNA binding,negative regulation of myoblast differentiation,positive regulation of transcription from RNA polymerase II promoter

Gene Symbol: RASGRP1
Gene Name: RAS Guanyl Releasing Protein 1 (Calcium And DAG-Regulated)
Gene Ontology Terms: plasma membrane,Golgi membrane,intracellular,calcium ion binding,cytokine production,inflammatory response to antigenic stimulus,membrane,blood coagulation,cytosol,signal transduction,endoplasmic reticulum,intracellular signal transduction,guanyl-nucleotide exchange factor activity,metal ion binding,cytoplasm,endoplasmic reticulum membrane,Golgi apparatus,lipid binding,small GTPase mediated signal transduction,Ras protein signal transduction,regulation of phosphatidylinositol 3-kinase signaling,cell differentiation,platelet activation,secretory granule localization,activation of Ras GTPase activity,activation of Rho GTPase activity,Fc-epsilon receptor signaling pathway,mast cell granule,regulation of GTPase activity,mast cell degranulation,innate immune response,vesicle transport along microtubule,regulation of small GTPase mediated signal transduction

Gene Symbol: SAA1
Gene Name: Serum Amyloid A1
Gene Ontology Terms: acute-phase response,extracellular region,high-density lipoprotein particle


Gene Symbol: TENM3
Gene Name: Teneurin Transmembrane Protein 3
Gene Ontology Terms: integral component of membrane,membrane,cell adhesion,signal transduction,homophilic cell adhesion via plasma membrane adhesion molecules,positive regulation of neuron projection development,cell differentiation,axon,protein homodimerization activity,cell projection,protein heterodimerization activity,camera-type eye morphogenesis,self proteolysis


Appendix VIII - Full list of gene set enrichment results for 16 differentially methylated and differentially expressed genes.


Appendix VIII: Full list of gene set enrichment results for 16 differentially methylated and differentially expressed genes. Genes are sorted by number of genes in gene set (overlap).

Columns: Name, # of Entities, Overlap, Overlapping Entities, p-value

RNA polymerase II transcription factor binding 45 2 MKX;AR 0.0004
endocytic vesicle membrane 56 2 GRIA1;COLEC12 0.0007
activation of prostate induction by androgen receptor signaling pathway 1 1 AR 0.0007
negative regulation of integrin biosynthetic process 1 1 AR 0.0007
male somatic sex determination 1 1 AR 0.0007
tendon sheath development 1 1 MKX 0.0007
positive regulation of interleukin-1 secretion 1 1 SAA1 0.0007
innate immune response 669 4 AIM2;SAA1;COLEC12;RASGRP1 0.0009
POU domain binding 2 1 AR 0.0013
D4 dopamine receptor binding 2 1 CLIC6 0.0013
interleukin-1, Type I, activating receptor activity 2 1 IL1R1 0.0013
regulation of developmental growth 2 1 AR 0.0014
lateral sprouting involved in mammary gland duct morphogenesis 2 1 AR 0.0014
carbohydrate mediated signaling 2 1 COLEC12 0.0014
protease binding 90 2 CSTA;IL1R1 0.0016
androgen receptor activity 3 1 AR 0.0020
D2 dopamine receptor binding 3 1 CLIC6 0.0020
AIM2 inflammasome complex 3 1 AIM2 0.0021
regulation of prostatic bud formation 3 1 AR 0.0021
tertiary branching involved in mammary gland duct morphogenesis 3 1 AR 0.0021
establishment of planar polarity of embryonic epithelium 3 1 FOXF2 0.0021
tendon formation 3 1 MKX 0.0021
activation of Ras GTPase activity 3 1 RASGRP1 0.0021
D3 dopamine receptor binding 4 1 CLIC6 0.0027

alpha-amino-3-hydroxy-5-methyl-4-isoxazole propionate selective glutamate receptor activity 4 1 GRIA1 0.0027
interleukin-1 beta secretion 4 1 AIM2 0.0028
positive regulation of integrin biosynthetic process 4 1 AR 0.0028
regulation of establishment of protein localization to plasma membrane 4 1 AR 0.0028
morphogenesis of an epithelial fold 4 1 AR 0.0028
cellular response to amine stimulus 4 1 GRIA1 0.0028
regulation of phosphatidylinositol 3-kinase signaling 4 1 RASGRP1 0.0028
secretory granule localization 4 1 RASGRP1 0.0028
androgen binding 5 1 AR 0.0034
beta-2 adrenergic receptor binding 5 1 GRIA1 0.0034
positive regulation of NF-kappaB transcription factor activity 126 2 AIM2;AR 0.0034
immune response 449 3 AIM2;COLEC12;IL1R1 0.0034
neuron spine 5 1 GRIA1 0.0034
epithelial cell differentiation involved in prostate gland development 5 1 AR 0.0035
male sex differentiation 5 1 AR 0.0035
male genitalia morphogenesis 5 1 AR 0.0035
tendon cell differentiation 5 1 MKX 0.0035
signaling pattern recognition receptor activity 6 1 COLEC12 0.0040
myosin V binding 6 1 GRIA1 0.0040
positive regulation of cysteine-type endopeptidase activity 6 1 AIM2 0.0042
reproductive system development 6 1 AR 0.0042
cellular response to dsRNA 6 1 GRIA1 0.0042
response to nitric oxide 6 1 IL1R1 0.0042
protein complex 497 3 AR;IL1R1;GRIA1 0.0043
G-protein beta-subunit binding 7 1 GRIA1 0.0047
Rap guanyl-nucleotide exchange factor activity 7 1 RASGRP1 0.0047
postsynaptic density 153 2 GRIA1;IL1R1 0.0048
pyroptosis 7 1 AIM2 0.0049


positive regulation of transcription from RNA polymerase III promoter 7 1 AR 0.0049
regulation of receptor recycling 7 1 GRIA1 0.0049
galactose binding 8 1 COLEC12 0.0054
interleukin-1 receptor activity 8 1 IL1R1 0.0054
mast cell granule 8 1 RASGRP1 0.0055
organ formation 8 1 AR 0.0056
positive regulation of interleukin-1 beta production 9 1 AIM2 0.0063
prostate gland growth 9 1 AR 0.0063
genitalia development 9 1 FOXF2 0.0063
positive regulation of membrane potential 9 1 GRIA1 0.0063
cellular response to steroid hormone stimulus 10 1 AR 0.0070
regulation of protein secretion 10 1 SAA1 0.0070
lymphocyte chemotaxis 10 1 SAA1 0.0070
positive regulation of protein oligomerization 11 1 AIM2 0.0077
positive regulation of intracellular estrogen receptor signaling pathway 11 1 AR 0.0077
reproductive structure development 11 1 AR 0.0077
seminiferous tubule development 11 1 AR 0.0077
regulation of systemic arterial blood pressure 11 1 AR 0.0077
mast cell degranulation 11 1 RASGRP1 0.0077
activation of innate immune response 12 1 AIM2 0.0084
positive regulation of insulin-like growth factor receptor signaling pathway 12 1 AR 0.0084
prostate gland epithelium morphogenesis 12 1 AR 0.0084
prostate gland development 12 1 AR 0.0084
vesicle transport along microtubule 12 1 RASGRP1 0.0084
adenylate cyclase binding 13 1 GRIA1 0.0087
small GTPase binding 13 1 GRIA1 0.0087
glutamate receptor activity 13 1 GRIA1 0.0087
pattern recognition receptor signaling pathway 13 1 COLEC12 0.0091
interleukin-1-mediated signaling pathway 13 1 IL1R1 0.0091

ovulation 13 1 IL1R1 0.0091
activation of Rho GTPase activity 13 1 RASGRP1 0.0091
platelet activation 211 2 SAA1;RASGRP1 0.0093
ionotropic glutamate receptor complex 14 1 GRIA1 0.0096
asymmetric synapse 14 1 GRIA1 0.0096
phagocytosis, recognition 14 1 COLEC12 0.0098
response to fungicide 14 1 GRIA1 0.0098
regulation of gene expression 218 2 MKX;AR 0.0099
voltage-gated chloride channel activity 15 1 CLIC6 0.0101
sequence-specific DNA binding 693 3 AR;MKX;FOXF2 0.0101
endocytic vesicle lumen 15 1 SAA1 0.0102
negative regulation of myoblast differentiation 15 1 MKX 0.0104
macrophage chemotaxis 15 1 SAA1 0.0104
platelet-derived growth factor receptor binding 16 1 IL1R1 0.0107
ionotropic glutamate receptor activity 18 1 GRIA1 0.0121
Leydig cell differentiation 18 1 AR 0.0125
extracellular-glutamate-gated ion channel activity 19 1 GRIA1 0.0127
lipid binding 262 2 RASGRP1;AR 0.0130
structural molecule activity 264 2 CSTA;KRT7 0.0132
protein homodimerization activity 765 3 TENM3;CLIC6;GRIA1 0.0132
embryonic camera-type eye morphogenesis 19 1 FOXF2 0.0132
response to arsenic-containing substance 19 1 GRIA1 0.0132
inflammatory response to antigenic stimulus 19 1 RASGRP1 0.0132
protein domain specific binding 265 2 GRIA1;AR 0.0133
low-density lipoprotein particle binding 20 1 COLEC12 0.0134
positive regulation of defense response to virus by host 20 1 AIM2 0.0139
mammary gland alveolus development 20 1 AR 0.0139
dendrite membrane 21 1 GRIA1 0.0143
embryonic digestive tract development 21 1 FOXF2 0.0146
response to lithium ion 21 1 GRIA1 0.0146


positive regulation of interleukin-1 beta secretion 22 1 AIM2 0.0153
positive regulation of synaptic transmission 22 1 GRIA1 0.0153
protein kinase A binding 23 1 GRIA1 0.0154
axon 284 2 TENM3;AR 0.0158
sex differentiation 23 1 AR 0.0160
ionotropic glutamate receptor binding 24 1 IL1R1 0.0160
cellular process 24 1 AR 0.0167
regulation of synaptic transmission 24 1 GRIA1 0.0167
ionotropic glutamate receptor signaling pathway 24 1 GRIA1 0.0167
cellular response to peptide hormone stimulus 24 1 GRIA1 0.0167
peptide cross-linking 25 1 CSTA 0.0174
neuronal action potential 25 1 GRIA1 0.0174
positive regulation of collagen biosynthetic process 25 1 MKX 0.0174
G-protein alpha-subunit binding 27 1 GRIA1 0.0180
high-density lipoprotein particle 27 1 SAA1 0.0184
transcription factor complex 309 2 MKX;FOXF2 0.0185
cellular response to interferon-beta 28 1 AIM2 0.0194
cytokine production 28 1 RASGRP1 0.0194
alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid selective glutamate receptor complex 29 1 GRIA1 0.0197
spinal cord development 29 1 GRIA1 0.0201
excitatory synapse 30 1 GRIA1 0.0204
transcription factor binding 334 2 FOXF2;AR 0.0205
synaptic transmission, glutamatergic 30 1 GRIA1 0.0208
long-term memory 31 1 GRIA1 0.0215
positive regulation of phosphorylation 32 1 AR 0.0222
tumor necrosis factor-mediated signaling pathway 33 1 AIM2 0.0229
receptor internalization 33 1 GRIA1 0.0229
positive regulation of fat cell differentiation 34 1 MEDAG 0.0235
dendrite 352 2 GRIA1;AR 0.0236
response to interleukin-1 35 1 IL1R1 0.0242


positive regulation of cytokine secretion 36 1 SAA1 0.0249
epithelial to mesenchymal transition 38 1 FOXF2 0.0263
response to cocaine 38 1 GRIA1 0.0263
cornified envelope 40 1 CSTA 0.0271
androgen receptor binding 41 1 AR 0.0273
intracellular receptor signaling pathway 40 1 AR 0.0276
steroid binding 43 1 AR 0.0286
nucleotide-binding domain, leucine rich repeat containing receptor signaling pathway 42 1 AIM2 0.0290
negative regulation of extrinsic apoptotic signaling pathway 42 1 AR 0.0290
response to electrical stimulus 42 1 GRIA1 0.0290
ligand-activated sequence-specific DNA binding RNA polymerase II transcription factor activity 44 1 AR 0.0292
androgen receptor signaling pathway 44 1 AR 0.0304
regulation of synaptic plasticity 45 1 GRIA1 0.0310
collagen fibril organization 45 1 MKX 0.0310
positive regulation of cell adhesion 47 1 SAA1 0.0324
endopeptidase inhibitor activity 49 1 CSTA 0.0325
neutrophil chemotaxis 48 1 SAA1 0.0331
G-protein coupled receptor binding 50 1 SAA1 0.0332
chloride channel complex 51 1 CLIC6 0.0344
acute-phase response 51 1 SAA1 0.0351
dendritic shaft 54 1 GRIA1 0.0364
cellular response to drug 53 1 AIM2 0.0365
sequence-specific DNA binding transcription factor activity 1135 3 AR;MKX;FOXF2 0.0372
scavenger receptor activity 57 1 COLEC12 0.0377
fertilization 55 1 AR 0.0378
steroid hormone receptor activity 58 1 AR 0.0384
neuromuscular junction 57 1 GRIA1 0.0384
steroid hormone mediated signaling pathway 56 1 AR 0.0385
protein oligomerization 58 1 AR 0.0398
cysteine-type endopeptidase inhibitor activity 61 1 CSTA 0.0403
cell growth 60 1 AR 0.0412

regulation of inflammatory response 60 1 IL1R1 0.0412
regulation of catalytic activity 62 1 AR 0.0425
signal transduction 2955 5 AR;GRIA1;TENM3;RASGRP1;IL1R1 0.0440
negative regulation of epithelial cell proliferation 65 1 AR 0.0446
chloride channel activity 68 1 CLIC6 0.0448
cell surface 508 2 IL1R1;GRIA1 0.0463
cellular response to glucose stimulus 69 1 IL1R1 0.0472
beta-catenin binding 72 1 AR 0.0474
negative regulation of NF-kappaB transcription factor activity 71 1 AIM2 0.0486
keratinocyte differentiation 71 1 CSTA 0.0486
protein binding, bridging 75 1 CSTA 0.0494
