Gene expression and regulation in survivors of critical illness with muscle weakness and meta-analysis of transcriptomic profiles across muscle diseases

by

Christopher J Walsh

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Institute of Medical Science University of Toronto

©Copyright by Christopher J Walsh 2019

Transcriptional profiling and regulation in survivors of critical illness with muscle weakness and meta-analysis across human muscle diseases

Christopher J Walsh

Doctor of Philosophy

Institute of Medical Science University of Toronto

2019

ABSTRACT:

ICU acquired weakness (ICUAW) is a common complication of critical illness characterized by decreased muscle mass and function with resulting physical impairment that may persist for years after ICU discharge. Transcriptomic profiling of peripheral muscle biopsies in patients with muscle weakness and healthy controls may detect changes in key biological processes related to muscle impairment. We hypothesized that abnormal expression of mRNAs and miRs related to muscle repair may be an important feature of ICUAW compared to controls.

In study 1, we integrated clinical data and mRNA transcriptomic data from quadriceps muscle biopsies from patients with ICUAW at day 7 post-ICU discharge and at follow-up at month 6 post-ICU discharge and compared them to healthy controls. A co-expression network analysis method detected groups of co-expressed genes related to muscle repair that were downregulated in ICUAW compared to healthy controls. In study 2, we aimed to identify miRs that are significant regulators of mRNAs and mRNA networks in ICUAW. miR-424-5p was found to regulate the greatest number of mRNAs in early ICUAW, including downregulated mRNAs related to striated muscle cell differentiation. At month 6 post-ICU, a differentially expressed miR signature distinguished patients who increased quadriceps muscle mass (“Improvers”) from those who did not (“Non-improvers”). In study 3, we performed a meta-analysis of mRNA transcriptional profiles from muscle biopsies from human muscle diseases and healthy controls to identify a common signature of genes dysregulated across muscle diseases as well as genes with expression changes that are unique to ICUAW. We detected a common muscle signature of 131 genes similarly expressed across five categories of muscle diseases. Finally, removing the genes common to muscle disease from a meta-analysis of only ICUAW cohorts revealed downregulated muscle development and contraction genes specific to ICUAW. In summary, dysregulation of mRNAs and miRs related to muscle repair was detected in ICUAW compared to controls. Transcriptional changes unique to ICUAW versus other categories of muscle disease were detected using a meta-analysis strategy.

ACKNOWLEDGEMENTS:

I would like to express my deepest gratitude to several groups of individuals without whom this PhD thesis would not have been possible. First, I would like to thank all the study participants and researchers who have made their transcriptomic data publicly available.

Secondly, I would like to thank my PAC committee, Dr. Claudia dos Santos, Dr. Jane Batt, Dr. Pingzhao Hu, and Dr. Gary Bader, who provided me with invaluable insights throughout my PhD.

I am grateful for support from our collaborators Dr. Margaret Herridge and Dr. Sunita Mathur. I would also like to thank Dr. Purvesh Khatri (Assistant Professor at Biomedical Informatics Research in the Department of Medicine at Stanford) for his invaluable assistance with our meta-analysis and Dr. Zhi Wei (Associate Professor at the Department of Computer Sciences, New Jersey Institute of Technology) for his invaluable statistical review of my first transcriptomic paper. I am grateful to Dr. Dmitry Rozenberg at Toronto General Hospital for his ongoing encouragement both as a colleague and friend. I would also like to thank the MEND-ICU and Canadian Critical Care Translational Biology Group for their support.

I am grateful for the financial support provided by the Canadian Thoracic Society (CTS) Research Committee of the Lung Association (CLA) studentship (2015-2017).

Finally, I thank my caring wife and family for their enduring support.

STATEMENT OF CONTRIBUTIONS

The thesis is presented in manuscript format and consists of six chapters. Chapter 1 is a Literature Review, including a summary of ICU acquired weakness, gene transcriptomic analysis, miR-target interactions, and microarray meta-analysis. My previously published review articles on these topics were incorporated as the source of most references. Chapter 2 outlines the hypothesis and objectives of each of the studies. Chapters 3 through 5 present original investigations from three separate manuscripts, the first of which has been published. Chapter 6 is a general discussion of the thesis, including limitations and future directions.

Summary of Contributions Related to Thesis:

1. Walsh, Christopher, Batt, Jane, Herridge, Margaret S., Santos, Claudia. (2014). Muscle Wasting and Early Mobilization in Acute Respiratory Distress Syndrome. Clinics in Chest Medicine. 35. 10.1016/j.ccm.2014.08.016.

Contributions: C.J.W. contributed to selection of the articles and wrote the first draft of the manuscript. C.J.W., J.B., M.H. and C.S. revised the manuscript for important intellectual content. All authors reviewed and approved the final manuscript.

2. Walsh, Christopher, Batt, Jane, Herridge, Margaret S., Mathur, Sunita, Bader, Gary D., Hu, Pingzhao, Santos, Claudia. (2016). Transcriptomic analysis reveals abnormal muscle repair and remodeling in survivors of critical illness with sustained weakness. Scientific Reports. 6. 29334. 10.1038/srep29334.

Contributions: C.C.d.S., J.B. and M.S.H. conceived the study and designed the experiments. S.M. collected clinical data, J.B. performed muscle biopsies and C.C.d.S. extracted the RNA for microarray analysis. C.J.W. wrote the code for microarray analysis including data visualization, analyzed the data, and drafted the manuscript. C.J.W., J.B. and C.C.d.S. analyzed the data, and G.D.B. and P.H. supported the analysis of the data. C.J.W. made critical revisions and all authors discussed and approved the final manuscript.

3. Walsh, Christopher, Hu, Pingzhao, Batt, Jane, Santos, Claudia. (2016). Discovering MicroRNA-Regulatory Modules in Multi-Dimensional Cancer Genomic Data: A Survey of Computational Methods. Cancer Informatics. 15. 25-42. 10.4137/CIN.S39369

Contributions: C.J.W. selected the articles, wrote the first draft of the manuscript, and designed and created the figures. Contributed to the writing of the manuscript: P.H., C.C.d.S. Agree with manuscript results and conclusions: C.J.W., P.H., J.B., C.C.d.S. Jointly developed the structure and arguments for the paper: C.J.W., P.H. Made critical revisions and approved final version: C.J.W., J.B., P.H., C.C.d.S. All authors reviewed and approved the final manuscript.

4. Christopher J Walsh*, Carlos Escudero*, Pam Plant, Muskan Gupta, Judy Correa, Pingzhao Hu, Claudia C. dos Santos, and Jane Batt, on behalf of the Canadian Critical Care Translational Biology Group. Identification of novel microRNAs regulating muscle regeneration in ICUAW. Manuscript in preparation for submission to American Journal of Respiratory and Critical Care Medicine. * Co-first authors

Contributions: C.C.d.S. and J.B. conceived the study and designed the experiments. S.M. collected clinical data. C.J.W. wrote the code for microarray analysis including data visualization, analyzed the data, and drafted the manuscript. C.E. performed the murine myoblast cell line experimental analysis and wrote the methods and results from this analysis in the manuscript. P.P. and M.G. conducted experimental analysis. C.J.W., C.E., J.B. and C.C.d.S. analyzed the data and revised the manuscript. P.H. supported the analysis of the data. All authors discussed and critically reviewed the manuscript.

5. Walsh, Christopher, Hu, Pingzhao, Batt, Jane, Santos, Claudia. (2015). Microarray Meta-Analysis and Cross-Platform Normalization: Integrative Genomics for Robust Biomarker Discovery. Microarrays. 4. 389-406. 10.3390/microarrays4030389.

Contributions: C.J.W. contributed to selection of the articles and wrote the first draft of the manuscript. C.J.W., J.B., P.H. and C.S. revised the manuscript. All authors discussed and approved the final manuscript.

6. Walsh, Christopher, Batt J, Herridge M.S., Mathur S, Bader GD, Hu P, Khatri P, and CC. dos Santos. Comprehensive multi-cohort transcriptional meta-analysis of muscle diseases identifies a signature of disease severity. Manuscript submitted to American Journal of Respiratory and Critical Care Medicine (pending review).

Contributions: C.J.W. conceived the study and designed the analysis workflow, collected microarray studies from public repositories, evaluated which microarrays were relevant to the analysis, summarized the selected studies, wrote the computer code for the microarray analysis, meta-analysis and data visualization, analyzed the data, and drafted the manuscript. S.M. collected clinical data for one study included in the analysis. P.H., G.B., J.B. and C.C.d.S. supported the analysis and interpretation of the data. P.K. contributed to the development of computational methods and to the analysis and interpretation of the data. All authors discussed and critically reviewed the manuscript.

Financial Support:

C.J.W. received funding from the Canadian Thoracic Society (CTS) Research Committee of the Lung Association (CLA) studentship (2015-2017) for the original research presented below.

TABLE OF CONTENTS:

Acknowledgements
Statement of Contributions
Table of Contents
List of Tables
List of Figures
List of Appendices
Abbreviations

CHAPTER 1: Literature Review and Introduction
1.0 Statement of the problem
1.1 Overview of Intensive Care Unit Acquired Weakness (ICUAW)
1.1.1 Diagnosis and presentation of ICU acquired weakness
1.1.2 Epidemiology of ICU acquired weakness
1.1.3 Risk factors for ICUAW
1.1.4 Putative therapies for ICUAW: Non-pharmacological strategies
1.1.5 Putative pharmacologic strategies for ICUAW
1.1.6 Current state of pathomechanisms
1.2 Gene co-expression
1.2.1 Gene co-expression networks for discovering molecular mechanisms
1.2.2 Weighted gene co-expression network analysis (WGCNA)
1.3 Integration of microRNA and mRNA expression data
1.3.1 Overview of high-throughput experimental methods for miR-target identification
1.3.2 Experimentally supported miR-target interaction databases
1.3.3 Sequence-based miR-target prediction
1.3.4 Methods for integrating miR-mRNA expression data
1.4 Approaches to meta-analysis across muscle diseases: identifying common transcriptional signatures
1.4.1 Comparison of microarray chip design across common platforms
1.4.2 Integrative Transcriptomic Data Analysis
1.4.3 Pre-processing and quality control prior to integrative analysis
1.4.4 Meta-analysis
1.4.5 Cross-platform normalization
1.4.6 Comparison of meta-analysis vs. cross-platform normalization
1.4.7 Comparison of meta-analysis methods
1.4.8 Software and websites implementing microarray meta-analysis and cross-platform merging/normalization
1.4.9 Examples of disease signatures discovered using MetaIntegrator

Thesis overview
2.1 Overall aims and hypothesis
2.2 Study aims and hypothesis

3.0 Transcriptomic analysis reveals abnormal muscle repair and remodeling in survivors of critical illness with sustained weakness
3.1 Abstract
3.2 Introduction
3.3 Methods
3.3.1 Patient selection
3.3.2 Outcome measures of physical function, strength and mass
3.3.3 Muscle sample collection
3.3.4 Muscle sample staining
3.3.5 Microarray samples and quality control
3.3.6 Single-gene differential expression analysis
3.3.7 Co-expression network analysis
3.3.8 Gene ontology and Human phenotype ontology analysis
3.3.9 Gene set visualization using enrichment map
3.3.10 Transcription factor binding site analysis
3.3.11 Preservation analysis
3.3.12 Independent validation data sets
3.4 Results
3.5 Discussion

4.0 microRNA-RNA interactions underlying abnormal muscle repair in survivors of critical illness with sustained weakness
4.1 Abstract
4.2 Introduction
4.3 Methods
4.4 Results
4.5 Discussion

5.0 Multi-cohort transcriptional meta-analysis of muscle diseases
5.1 Abstract
5.2 Introduction
5.3 Methods
5.3.1 Data collection and pre-processing
5.3.2 Meta-analysis
5.3.4 CMDM score
5.3.5 Correlation of the CMDM genes with clinical and histological severity
5.3.6 Correlation of the CMDM genes with response to exercise therapy in inflammatory myopathy
5.3.7 Association of the CMDM genes with normal aging
5.3.8 Muscle disease category specific meta-analysis
5.3.9 Identification of enriched transcription factors
5.3.10 Assessment of cell type specificity in CMDM genes
5.3.11 Subcellular localization analysis
5.3.12 Availability of supporting data
5.4 Results
5.4.1 Meta-analysis identifies a common gene signature of muscle disease
5.4.2 CMDM significantly associates with clinical and histological measures of disease severity
5.4.3 Meta-analysis highlights common mechanisms of muscle disease
5.4.4 Transcription factors associated with the CMDM
5.4.5 Cell type-specificity analysis of CMDM genes
5.4.6 Subcellular localization analysis of CMDM genes
5.4.7 Disease-specific patterns of gene expression changes
5.4.8 Comparison of CMDM signature with Transforming Growth Factor-β signature
5.4.9 Association of CMDM signature with response to exercise therapy in inflammatory myopathy
5.4.10 Assessment of CMDM muscle disease z-score as marker of response to muscle disease-specific pharmacotherapy
5.4.11 Characterizing the association of CMDM genes with normal aging
5.5 Discussion
5.6 Limitations
5.7 Conclusions

CHAPTER 6 General Discussion
6.1 Overview of findings
6.2 Limitations
6.3 Conclusion
6.4 Future Directions

REFERENCES

APPENDIX 1 – Supplementary Data
Chapter 3 Supplementary Data
Chapter 4 Supplementary Data
Chapter 5 Supplementary Data

LIST OF TABLES:

Table 1.1: List of software and websites for performing microarray meta-analysis
Table 3.1: Descriptive statistics (mean and standard deviation) and test of difference of means for demographic and clinical data for the three subgroups
Table 3.2: Weighted gene correlation network identifies eleven ICUAW relevant modules
Table 4.1: MicroRNAs with differential expression in ICUAW identified as master regulators of gene target expression
Table 5.1: Summary of public gene expression data sets used in the discovery and validation data set meta-analysis
Table 5.2: Common muscle disease module (CMDM) genes

LIST OF FIGURES:

Figure 1.1: Putative modifiable risk factors for muscle atrophy and ICUAW and potential therapeutic interventions
Figure 1.2: Schematic to conceptualize biological complexity at multiple scales
Figure 1.3: Sources of gene-gene correlation/co-expression
Figure 1.4: Inferring MTIs by integrating matched miR–mRNA expression profiles and sequence-based target prediction data
Figure 1.5: Methods for estimation of False Discovery Rate (FDR) from MTI data
Figure 1.6: Outline of two microarray integration methods
Figure 2.1: Conceptual flow diagram for integration of clinical phenotyping and co-expression network analysis to identify clinically relevant co-expression modules for validation in independent cohorts of ICUAW
Figure 3.1: Differentially expressed genes in ICUAW
Figure 3.2: Heatmap of correlations between co-expression module eigengenes (rows) and quantitative clinical traits (columns)
Figure 3.3: Module 1 and 3 are associated with ICUAW at day 7 and month 6 post-ICU discharge, respectively
Figure 3.4: Enrichment Map results of the gene set functional enrichment analysis for module 1 and module 3
Figure 3.5: Modules M1 and M3 from human ICUAW patients are preserved in two independent data sets
Figure 4.1: Workflow of Master MicroRNA Regulator Analysis (MMRA) pipeline adapted based on this study’s research objectives
Figure 4.2: Expression patterns of differentially expressed (DE) microRNAs (miRs) in ICUAW
Figure 4.3: Relative mmu-miRNA expression during C2C12 myoblast proliferation and differentiation
Figure 5.1: Meta-analysis workflow diagram
Figure 5.2: Meta-analysis and leave-one-disease-out analysis reveal common differentially expressed genes across muscle diseases
Figure 5.3: CMDM significantly associates with clinical and histological severity in ICUAW
Figure 5.4: CMDM significantly associates with clinical and histological severity in ALS
Figure 5.5: CMDM signature in cancer cachexia (GSE34111)
Figure 5.6: CMDM signature shows significant difference between mild and moderate and severe fibrosis in a cohort of congenital muscle disease (GSE17091)
Figure 5.7: Functional enrichment reveals common pathways in muscle disease
Figure 5.8: Transcription factor binding site enrichment analysis identifies transcription factors upstream of CMDM genes
Figure 5.9: Bar graph of genes in the CMDM signature by subcellular localization
Figure 5.10: Disease-specific meta-analysis

LIST OF APPENDICES:

Appendix 1 – Supplementary Tables and Figures

ABBREVIATIONS:

6MW = Six minute Walk Test
ADL = Activity of Daily Living
AIC = Akaike Information Criterion
ALS = Amyotrophic lateral sclerosis
APACHE II = Acute Physiology and Chronic Health Evaluation II
ARACNe = Algorithm for the Reconstruction of Accurate Cellular Networks
ARDS = Acute Respiratory Distress Syndrome
BH = Benjamini Hochberg (false discovery rate correction method)
CIP = critical illness polyneuropathy
CIM = critical illness myopathy
CIPNM = critical illness polyneuromyopathy
CLIP = cross-linking immunoprecipitation
CLASH = cross-linking, ligation, and sequencing of hybrids
CMDM = Common muscle disease module
CMD = Congenital muscle disease
COPD = Chronic obstructive pulmonary disease
CSA = cross-sectional area
CSM = chronic systemic diseases affecting muscle
CT = Computed Tomography
DE = Differential Expression
DI = Disuse and Immobility
EB = Empirical Bayes
ECM = Extracellular matrix
EMG = electromyography
GEO = Gene Expression Omnibus
FC = Fold change (non-log space)
FDR = False Discovery Rate
FEM = Fixed effects model
FIM = Functional Independence Measure
FP = False positive
GEP = gene expression profiles
GLM = Generalized Linear Model
GO = Gene ontology
GSEA = Gene Set Enrichment Analysis
HPO = Human phenotype ontology
ICUAW = ICU acquired weakness
ICUAW-RM = ICU acquired weakness-relevant modules
IM = Inflammatory myopathies
ME = Module Eigengene
LMM = Linear mixed model
MD = Muscular dystrophy
MI = Mutual information
MTI = miR-target interactions
MRM = MicroRNA regulatory module
MODS = Multiorgan dysfunction syndrome
mRNA = messenger ribonucleic acid
miR = microRNA (ribonucleic acid)
MMRA = MicroRNA master regulator analysis
MRCSS = Medical Research Council sum score
Nm = Newton-meters
NMES = Neuromuscular electrical stimulation
NMBA = Neuromuscular blocking agents
PCC = Pearson’s Correlation Coefficient
PITA = Probability of Interaction by Target Accessibility
ReMOAT = Re-annotation and Mapping for Oligonucleotide Array Technologies
REM = Random effects model
RIN = RNA integrity number
RISC = RNA induced silencing complex
RCT = Randomized Controlled Trial
SBT = spontaneous breathing trial
SDM = substitute decision maker
SIRS = systemic inflammatory response syndrome
SOFT = simple omnibus format in text
SLR = stepwise linear regression
TF = Transcription factor
TGF-β = transforming growth factor beta
TFBS = Transcription factor binding site
TOM = topological overlap measure
TP = True positive
UPS = ubiquitin-proteasome system
UTR = untranslated region
WGCNA = Weighted gene correlation network analysis

CHAPTER 1: Literature Review and Introduction

1.0 Statement of the problem

Intensive care unit-acquired weakness (ICUAW) describes a spectrum of muscle weakness associated with critical illness that may persist for years after ICU discharge and contributes to significant long-term disability 1. The first three to six months after critical illness are crucial, as many patients show a marked improvement in muscle function before reaching a plateau, resulting in sustained ICUAW 2,3. The biological mechanisms responsible for recovery of muscle strength versus persistence of muscle weakness remain poorly understood. Comprehensive, longitudinal studies concurrently assessing structural, functional and molecular features of ICUAW in survivors of critical illness are lacking. While molecular data from animal models have been used to infer the molecular pathways of early ICUAW, these models suffer from a number of limitations 4 and cannot be used to model sustained ICUAW. In the acute phase, ICUAW is associated with failure of ventilator weaning, prolonged ICU stay and increased mortality 5-8. Among survivors, a large proportion of patients (40-65%) have diminished functional capacity 5 years post-ICU discharge. The determinants of this persistent ICUAW remain inadequately defined. No pharmacologic therapies are currently available for the treatment of ICUAW. While early mobilization and physiotherapy have demonstrated short-term improvements in functional outcomes in several studies, long-term physical recovery has not been assessed 9. Drug discovery will likely require elucidation of the biological mechanisms underlying ICUAW using biological network analysis to integrate multiple sources of experimental data with linkage to functional outcomes.

This thesis aims to characterize the gene expression and microRNA changes in patients with ICUAW compared to healthy muscle and to patients with ICUAW who improve muscle mass. We seek to determine whether these gene expression changes are associated with clinical measures of muscle strength and mass. We then aim to utilize the large number of publicly available human muscle samples from numerous muscle diseases to characterize a common gene expression signature as well as a signature specific to ICUAW. We hypothesized that the ICUAW-specific signature will have aberrant expression of genes related to muscle development and regeneration.

The present introductory chapter comprises four sections. Section 1.1 provides an overview of Intensive Care Unit Acquired Weakness (ICUAW). It is partly adapted from Walsh, Christopher, Batt, Jane, Herridge, Margaret S., Santos, Claudia. (2014). Muscle Wasting and Early Mobilization in Acute Respiratory Distress Syndrome. Clinics in Chest Medicine. 35. Section 1.2 will outline methods for gene co-expression network analysis for discovering molecular mechanisms in ICUAW. Section 1.3 will outline methods for joint microRNA and gene expression integration and is partly adapted from Walsh, Christopher, Hu, Pingzhao, Batt, Jane, Santos, Claudia. (2016). Discovering MicroRNA-Regulatory Modules in Multi-Dimensional Cancer Genomic Data: A Survey of Computational Methods. Cancer Informatics. 15. 25-42. Section 1.4 will review methods for integration of microarray datasets. It is partly adapted from Walsh, Christopher, Hu, Pingzhao, Batt, Jane, Santos, Claudia. (2015). Microarray Meta-Analysis and Cross-Platform Normalization: Integrative Genomics for Robust Biomarker Discovery. Microarrays. 4. 389-406.

1.1 Overview of Intensive Care Unit Acquired Weakness (ICUAW)

Intensive care unit (ICU)-acquired weakness (ICUAW) is a syndrome of generalized limb weakness in patients with critical illness in the absence of other etiologies of muscle impairment. ICUAW is well described in the acute phase of critical illness and is increasingly recognized to contribute to long-term disability in survivors of critical illness. Skeletal muscle wasting and weakness acquired during critical illness may result from muscle dysfunction, loss of myosin, and less commonly, frank myofiber necrosis (critical illness myopathy [CIM]), sensory-motor axonopathy (critical illness polyneuropathy [CIP]), or a combination of both. Both processes manifest as muscle weakness, induced by the resultant and variable combination of muscle wasting and impaired muscle contractility.

1.1.1 Diagnosis and presentation of ICU acquired weakness

At present, no universally accepted diagnostic criteria for ICUAW are available. Diagnosis has been made on the basis of clinical examination of respiratory and peripheral muscle force using handgrip strength and maximal inspiratory pressure 10,11. It is often difficult to differentiate CIM and CIP on physical examination, and definitive diagnosis requires specific neuromuscular electrophysiology testing and muscle biopsy. Early identification of ICUAW enables the healthcare team to limit patient exposure to risk factors that compound muscle weakness (discussed in section 1.1.3), initiate further investigations, and consider potential interventions and trial enrolment. A well recognized, less frequent clinical presentation of ICUAW is symmetric flaccid paralysis (quadriparesis) with relative sparing of the facial muscles. Respiratory and limb muscles are often affected concurrently in ICUAW 7. A common presenting feature of ICUAW in the acute setting is failure to wean from mechanical ventilation (MV) 12; however, it is often not suspected until the critically ill patient has failed unsupported breathing and returns to MV. Complications arising from delayed diagnosis of ICUAW highlight the importance of earlier assessments for this condition 13. Manual muscle testing using the Medical Research Council (MRC) sum score has been advocated as a primary means of diagnosing ICUAW 14. This approach provides a global estimate of motor function by combining strength scores obtained from predefined muscle groups in each extremity, yielding a total score ranging from 0 to 60 (full strength). An arbitrarily defined score cut-off of < 48 has been used to define ICUAW and has been found to be associated with increased duration of MV and ICU stay, and increased mortality 7,10,12,15. An inherent limitation of the MRC score is its inability to evaluate patients with impaired cognitive state. In one recent study, a significant proportion of ICU patients were unable to perform MRC testing, and MRC scores less than 48 had limited clinical predictive value 16. Diagnostic approaches have been proposed in other reviews of ICUAW 17,18.
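The MRC sum score arithmetic described above can be illustrated with a minimal sketch. The six muscle groups listed below are an assumption based on the commonly used protocol (the text only refers to predefined muscle groups in each extremity), and the function and helper names are hypothetical. Only the 0 to 60 range and the < 48 cut-off are taken from the text.

```python
# Muscle groups tested on each side of the body. ASSUMPTION: this six-group
# list follows the commonly used MRC sum score protocol; the text only says
# "predefined muscle groups in each extremity".
MUSCLE_GROUPS = (
    "shoulder_abduction",
    "elbow_flexion",
    "wrist_extension",
    "hip_flexion",
    "knee_extension",
    "ankle_dorsiflexion",
)
SIDES = ("left", "right")
MAX_GRADE = 5       # each muscle group is graded from 0 to 5 (normal strength)
ICUAW_CUTOFF = 48   # cut-off quoted in the text: a sum score < 48 defines ICUAW


def mrc_sum_score(grades):
    """Sum the 0-5 grades over all (side, muscle group) pairs.

    `grades` maps (side, group) tuples to integer grades. A missing entry
    raises KeyError, which mirrors the limitation that the score cannot be
    computed for patients who are unable to complete testing.
    """
    total = 0
    for side in SIDES:
        for group in MUSCLE_GROUPS:
            grade = grades[(side, group)]
            if not 0 <= grade <= MAX_GRADE:
                raise ValueError(f"grade out of range for {side} {group}: {grade}")
            total += grade
    return total


def below_icuaw_cutoff(score):
    """Return True when the sum score falls below the quoted cut-off."""
    return score < ICUAW_CUTOFF


def uniform_grades(grade):
    """Hypothetical helper: the same grade in every tested muscle group."""
    return {(side, group): grade for side in SIDES for group in MUSCLE_GROUPS}
```

With twelve tested muscle groups, a grade of 4 in every group gives a sum of exactly 48, which does not meet the < 48 definition, whereas a grade of 3 throughout gives 36 and does.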
Other important factors that may contribute to muscle weakness and poor functional performance in survivors of critical illness include 1) poor prehospital functional status and comorbidity 3, 2) persistent organ dysfunction 3, 3) deconditioning and disuse atrophy 19, and 4) psychological disturbance (mood and cognition) 20. Therefore, it is important to exclude a history of neuromuscular weakness prior to ICU admission and to assess premorbid function in the critical illness survivor. The beneficial effects of potential therapies for ICUAW after hospital discharge can be assessed using peripheral muscle strength (e.g. MRC score 21), exercise capacity (e.g. 6 minute walk test 22), the functional independence measure (FIM score 23), and health related quality of life scores 24. There is significant variation in the extent of muscle mass and strength improvement among patients in ICUAW cohorts. At present, minimally important differences for these outcome measures have not been established in the ICUAW population. Clinical phenotypes of ICUAW that are closely linked to long-term prognosis and therapeutic response will continue to be refined as more studies measuring these outcome variables become available.

1.1.2 Epidemiology of ICU acquired weakness

The incidence of ICUAW in a general ICU population is difficult to establish given that it is strongly dependent on the risk factors, diagnostic method, and timing of examination 25,26. The incidence in two general ICU cohorts diagnosed by clinical exam was 25% and 23.8%, after 7 days of mechanical ventilation and 10 days of ICU stay, respectively 15,27. The rate of ICUAW in severe Acute Respiratory Distress Syndrome (ARDS) patients was evaluated as a pre-specified secondary outcome in a double-blind RCT comparing neuromuscular blockade to placebo, using a validated measure (MRC sum score) to screen for ICUAW 28. ICUAW was found to occur in 37.7% (61/162) of patients receiving placebo on day 28 or at the time of hospital discharge. A secondary analysis of 128 ARDS patients surviving more than 60 days after enrollment in a prospective trial found that 34% were diagnosed with ICUAW, though the authors speculated that some cases likely went undetected 13. A retrospective study of 50 consecutive ARDS patients screened for ICUAW using electrophysiological studies and physical exam (MRC score) found that more than half (27/50) of patients were diagnosed with ICUAW. At the time of admission, all ARDS patients fulfilled sepsis/SIRS criteria and 38 of 50 patients had severe sepsis with shock and/or MODS. Risk factors that may be associated with the development of ICUAW in ARDS patients were specifically examined in this retrospective study by comparing patients diagnosed with ICUAW to those without ICUAW (controls). The occurrence of ICUAW was significantly associated with increased age and with elevated blood glucose level; however, the authors speculate that age may have influenced ICUAW indirectly due to the higher incidence of hyperglycemia in the elderly. The study did not detect a significant difference in duration of sepsis, severity of illness or multi-organ failure, or administration of potentially iatrogenic medications (e.g. aminoglycosides, neuromuscular relaxing agents, corticosteroids) between ARDS patients with and without ICUAW. The relatively higher incidence of ICUAW in ARDS versus the general ICU population is not surprising given its connection to sepsis and MODS.
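As an arithmetic check, the incidence figures quoted in this section can be recomputed from the reported counts. The sketch below also attaches a 95% Wilson score interval to each proportion; these intervals are an illustrative addition and are not reported in the cited studies. Function names and labels are hypothetical.

```python
import math


def proportion(events, total):
    """Observed proportion of patients with the outcome."""
    return events / total


def wilson_interval(events, total, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion.

    Illustrative only: the cited studies do not report these intervals.
    """
    p = events / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the text: (patients with ICUAW, patients assessed).
REPORTED_COUNTS = {
    "ARDS RCT, placebo arm": (61, 162),
    "Retrospective ARDS cohort": (27, 50),
}

for label, (events, total) in REPORTED_COUNTS.items():
    low, high = wilson_interval(events, total)
    pct = 100 * proportion(events, total)
    print(f"{label}: {pct:.1f}% (95% CI {100 * low:.1f}-{100 * high:.1f}%)")
```

The point estimates are 37.7% for 61/162 and 54.0% for 27/50. The small denominators are one reason the incidence estimates vary widely between cohorts.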

1.1.3 Risk factors for ICUAW

Identification of risk factors for ICUAW may be useful to stratify critically ill patients by likelihood of developing early (and possibly also sustained) muscle weakness 1. A recent meta-analysis of 14 prospective cohort studies of ICUAW identified the Acute Physiology and Chronic Health Evaluation II (APACHE II) score, neuromuscular blocking agents and aminoglycosides as significantly associated with ICUAW 29 . The latter two risk factors are frequently unavoidable therapies required for treatment of critical illness. The APACHE II reflects degree of severity of critical illness, including multi-organ dysfunction (MODS); its association with ICUAW likely reflects that peripheral muscle is a target of end organ damage in critical illness. While the above risk factors are associated with development of ICUAW, they are not predictive of recovery of strength or other long-term outcomes in ICUAW. Several other risks factors for ICUAW that have been identified in multiple studies include sepsis, immobility, and hyperglycemia. Age, burden of co-morbid disease, and ICU length of stay have been recognized as major risk modifiers of long-term recovery of function after critical illness. Patients with sepsis and MODS are at high risk for ICUAW; one systematic review found a nearly 50% incidence of ICUAW in this population 25 . The severity and duration of both systemic inflammatory response syndrome (SIRS) and MODS have been associated with ICUAW in a number of studies and several authors have concluded that ICUAW is one manifestation of MODS 14,18,30-33 . Witeveen et al 34 found that, after adjusting for disease severity, patients that developed ICUAW had higher levels of inflammatory markers at 48hrs of MV compared to critically ill patients that did not develop ICUAW in a prospective observational cohort. 
Hyperglycemia, a frequent complication of critical illness and inactivity, has been linked to ICUAW in multiple observational studies 25 and in two large randomized controlled trials (RCTs) that examined the effect of intensive insulin therapy (IIT) versus conventional insulin therapy (CIT) on ICUAW as a secondary outcome 35,36 . The first RCT screened for ICUAW by weekly electromyography in 363 surgical patients requiring an ICU stay of one week or more. The trial found a reduced incidence of ICUAW (28.7% vs. 51.9%, p < 0.001) and a faster resolution of ICUAW cases in the IIT group versus CIT 36 . The second RCT enrolled 420 medical ICU patients requiring more than 1 week in the ICU and found similar outcomes 36 . Both trials demonstrated reduced 180-day mortality, ICU stay and

duration of mechanical ventilation in the total population and the population screened for ICUAW. It is unclear whether the shortened duration of mechanical ventilation and mortality benefit can be explained by fewer cases of ICUAW or reductions of other hyperglycemia associated morbidities 37 . However, a large multicenter trial found increased mortality at 90 days (mostly from cardiovascular complications) in 6104 patients randomized to IIT vs. CIT (number needed to harm of 38) 38 . While ICUAW was not formally evaluated in this study, it did not find any difference in duration of mechanical ventilation or ICU stay. A significantly higher proportion of patients in the IIT arm received corticosteroids in this trial, which may have confounded the outcome measure. Nevertheless, the benefits of preventing ICUAW with an intensive insulin protocol must be weighed against the harms of potentially higher mortality. There have been conflicting conclusions regarding the association of corticosteroids with ICUAW 25,30,39,40 . An ARDSnet RCT that compared methylprednisolone versus placebo for severe persistent ARDS found 9 patients in the intervention group developed serious adverse events associated with ICUAW versus none in the placebo group 41 . A secondary analysis of this trial, based on review of patient charts, found no difference in ICUAW among the 128 patients that survived 60 days after study enrollment 42 . Another RCT found no difference in rate of ICUAW between prolonged administration of methylprednisolone (1 mg/kg/d) versus controls in early ARDS using a protocol that avoided NMBA 43 . ICUAW was not a secondary outcome measure in these trials and patients were not systematically evaluated for ICUAW. It has been argued that, while corticosteroids may increase the risk of myopathy, these agents may decrease the overall risk of ICUAW in the ARDS patient by reducing the duration of shock and mechanical ventilation 42 .
Neuromuscular blocking agents (NMBA) are commonly used in the management of ARDS to prevent patient-ventilator asynchrony and improve ventilation. These agents have been associated with ICUAW in retrospective studies of patients with sepsis 40 and severe asthma 44 ; however, these studies are confounded by the use of high-dose glucocorticoid therapy. A double-blind trial that randomized 340 patients with severe ARDS to a 48-hour continuous infusion of cisatracurium or placebo found decreased mortality at 28 days with no increased risk of ICUAW measured by physical exam (MRC score) at day 28 and at time of ICU discharge 28 . Changes to the administration of NMBA, including shorter duration and avoidance of agents associated with persistent drug effect, along with bedside monitoring to detect excessive blockade, may explain the lower risk of ICUAW in more recent trials 30 .

Further studies assessing the incidence of ICUAW with NMBA, as well as the interaction of NMBA with corticosteroids, using more sensitive methods to detect ICUAW are warranted. ICUAW has been associated with immobilization in several studies using the duration of mechanical ventilation and ICU stay as an indirect measure of immobility 14,30,45 .

1.1.4 Putative therapies for ICUAW: Non-pharmacological strategies

1.1.4.1 Early rehabilitation therapy for disuse atrophy:

Moderate exercise during critical illness has been proposed to counteract the atrophy-inducing effects of the inflammatory state that occurs with inactivity and ICUAW 46,47 . There is increasing evidence demonstrating the benefits of early physical therapy in the ICU. Early rehabilitation is generally defined as physical therapy starting at the period of initial physiological stabilization, possibly within 24 to 48 hours after initiating MV. Multiple barriers to adopting early mobilization regimens in the ICU have been identified including safety concerns, patient sedation, and the requirement and cost of a trained multidisciplinary team to minimize the risk of adverse events 48 . Several observational studies have demonstrated that early rehabilitation is safe and feasible without increased costs 49-52 . Eligibility criteria that have been used to commence early rehabilitation in the critically ill patient are: 1) neurologic criteria: ability to cooperate, 2) respiratory criteria: fraction of inspired oxygen less than 60% and positive end expiratory pressure ≤ 10 cm H2O, 3) cardiovascular criteria: no requirement for vasopressors and no symptomatic orthostasis. Patients with unstable fractures are excluded from early rehabilitation 53,54 . Patients meeting the neurologic criteria but missing a single cardiac or respiratory criterion can be started cautiously in early rehabilitation trials with close monitoring for physiologic deterioration 53 . A prospective trial that applied an early activity protocol in a respiratory ICU included 103 patients and found 41% of activity events occurred in patients on MV. Adverse events were infrequent, occurring in only 1% of all activities, and no event was serious 49 . An observational study of 104 acute respiratory failure patients requiring MV > 4 days transferred to an ICU setting with early physical therapy found the number of patients ambulating had tripled compared to pre-intervention 53 .
Patients in the study that had not received any sedation were two-fold more likely to ambulate.

Development and implementation of an early rehabilitation protocol in an ICU setting where physical therapy was provided infrequently has been shown in one prospective study to increase the proportion of ICU patients receiving physical therapy versus usual care 50 . Patients in the protocol group were far more likely to receive physical therapy (91.5% vs. 12.5%, p ≤ 0.001) and had earlier mobilization and significantly shorter length of ICU and in-hospital stay. No adverse events were reported and costs of usual care versus mobility protocol were the same. A four-phase protocol for progressive mobilization in the ICU has been established with categories at each phase that include education for patients and families, positioning, transfer training, exercises, and a walking program 55 . Patients are classified within four phases based on functional status: phase 1) inability to bear weight, phase 2) able to begin transfer training with a walker, phase 3) able to begin walking re-education, and phase 4) patients transferred out of the ICU. Duration and frequency of training sessions are also specified for patients at each stage in this guideline. One barrier to rehabilitation after ICU discharge in some centers is a relative lack of physical therapists. To address this issue, a self-help rehabilitation manual was provided to patients after discharge from ICU with instructions to perform their own physical therapy 56 . A randomized trial compared the self-help manual versus control in a mixed population of 126 post-ICU patients and found that it improved physical function scores at 8 weeks and 6 months post-ICU versus usual care.

1.1.4.2 Cycle ergometry and Neuromuscular electrical stimulation

Novel rehabilitation devices are being studied that may have potential to improve muscle strength in the critically ill, particularly for those unable to move actively due to weakness or sedation. The bedside cycle ergometer may be used to perform active or passive cycling (for sedated patients) at multiple levels of resistance that are individually adjusted. One randomized trial found that patients using the cycle ergometer, in addition to standard mobilization therapies initiated early in ICU rehabilitation, showed no difference in quadriceps force or physical function at ICU discharge 57 . However, cycle ergometry significantly improved quadriceps force, functional scores and 6-minute walk distance (average of 56 m greater in the training group) at hospital discharge. While this study did not establish whether extra time spent performing standardized physical therapy would have been as beneficial as cycle ergometry, it does provide

limited evidence associating increased physical activity in the ICU with improved functional outcomes. There is growing evidence that neuromuscular electrical stimulation (NMES) improves muscular function in the critically ill. NMES applies electrical stimulation using surface electrodes, typically on target muscles of the lower limbs, to produce visible muscle contractions. It does not require active patient cooperation and has been shown in a small controlled study to increase protein synthesis and quadriceps cross-sectional area in orthopedic patients with knee immobilization 58 . A randomized trial of NMES applied to the lower limb versus sham (52 ICU patients total) found a significant reduction in ICUAW measured by MRC (27.3% vs. 39.3%, p = 0.04) 59 . A systematic review of RCTs that compared NMES versus sham in ICU patients found 5 studies that evaluated strength of different muscle groups and 4 that evaluated muscle mass (thickness or volume) 60 . All 5 studies that evaluated muscle strength found an improvement with NMES while only 2 of the 4 studies assessing muscle mass found an improvement. Meta-analysis of these 8 trials was not possible due to high inconsistency in the ICU patient characteristics between studies. However, the data point to moderate treatment effects on muscle strength, but minimal impact on muscle wasting. Heterogeneity of NMES protocols across studies also limits the generalizability of the findings. Compliance and tolerability were generally high without any adverse events reported in these studies. The only contraindication to use of NMES is the use of NMBAs 59 . The major treatment modalities for early rehabilitation studied in prospective trials are summarized in Table 2.

1.1.4.3 Sedation interruption

The use of sedation in patients receiving MV has been found to increase the duration of MV and hinder early rehabilitation 61 . A protocol that combined the interruption of sedation with spontaneous breathing trials (SBT) significantly reduced duration of MV versus routine sedation care with SBT in a randomized controlled trial of 336 ICU patients with respiratory failure 62 . Similarly, early rehabilitation performed during periods of interruption of sedation resulted in a significantly shorter duration of MV, improved functional outcomes at hospital discharge and shorter periods of delirium versus interruption of sedation with routine care in one

randomized controlled trial 63 . No differences in duration of ICU stay, hospital stay, or hospital mortality were detected.

1.1.4.4 Potential limitations of early rehabilitation therapy

At present, the most effective timing, mode, intensity and frequency of early rehabilitation have not been established in clinical trials 64 . To what extent clinical phenotypes at high risk for limited long-term functional improvement (advanced age, co-morbid disease, and poor previous functional status) may benefit from even “optimal” physical therapies is controversial 55,65 . The significant correlation between the degree of MODS and muscle wasting suggests that physical rehabilitation may only counteract a portion of lost muscle mass in severe critical illness 66 . Though early rehabilitation may be able to attenuate muscle proteolysis and normalize muscle mass in some patients with ICUAW, it may be unable to restore normal muscle strength 67 . Thus, therapeutic interventions in addition to early rehabilitation therapy are crucial in order to improve management of ICUAW. Potential pharmacologic adjuncts to early rehabilitation are reviewed briefly below.

1.1.5 Putative pharmacologic strategies for ICUAW

At present, no pharmacological therapies have been approved for the treatment of ICUAW. The anabolic hormone insulin-like growth factor 1 (IGF-1) was found to increase in-hospital mortality in a randomized controlled trial of patients with prolonged ICU stay 68 . The increased rate of multi-organ dysfunction and septic shock in the treatment group led the authors to suspect that IGF-1 exerts an immunomodulatory effect. Grip and hand strength were unaffected by treatment with IGF-1. It has also been postulated that poor delivery of hormones and nutrients to muscle due to aberrant microvascular perfusion in muscle may explain this anabolic resistance 69 . Estrogen has been shown to regulate skeletal muscle regeneration and mass recovery in disuse atrophy by activating Akt phosphorylation 70 . The phyto-estrogen 8-prenylnaringenin was found to prevent loss of muscle mass using a denervation model in mice 71 . Whether this agent can enhance recovery from disuse atrophy or ICUAW in humans by accelerating the blunted Akt pathway should be investigated in future trials.

Clenbuterol, a selective β2-adrenergic agonist, administered chronically in high doses in multiple animal studies has been shown to increase muscle mass and

force-producing capacity. Clenbuterol has been shown to activate the Akt/mTOR pathway and inhibit proteasomal and lysosomal proteolysis independently of Akt, leading to a muscle-sparing effect in animal models of atrophy 72,73 . Concerns have been raised that the β-agonist induced shift from oxidative to glycolytic metabolism may result in clinically significant muscle fatigue 74 and that the adverse effects of IGF-1 stimulation may also be applicable to β-agonist administration 75 . Bortezomib (Velcade), a proteasome inhibitor approved for use in multiple myeloma and non-Hodgkin lymphoma, has been shown to prevent atrophy in several animal models 76,77 ; however, no effect on diaphragm atrophy was found in one model, which may have been explained by increased calpain activity 78 . There is currently no evidence that bortezomib improves muscle atrophy in humans, and the numerous reports of cardiotoxicity linked to bortezomib regimens suggest that long-term administration might adversely affect muscle function 79 . Trials investigating more specific therapies that target the degradation of sarcomeric proteins and impaired muscle contractility are needed 75 .

1.1.6 Mechanisms underlying loss of strength and muscle mass in ICUAW

1.1.6.1 Current state of pathomechanisms in Early ICUAW

In the context of critical illness, intensive care unit acquired weakness (ICUAW) has recently gained much attention as a target of organ failure and complication of prolonged ICU care 80 . Most studies investigating the pathomechanisms of ICUAW were conducted in animal models of early critical illness and sepsis. There is currently limited supporting evidence from human studies. ICUAW results from a spectrum of nerve dysfunction (CIP) and primary muscle injury (CIM), the combination being termed critical illness neuromyopathy (CINM). The mechanisms underlying CIM are better understood than those underlying CIP. It is known that muscle weakness in CIM occurs due to a combination of loss of muscle mass (atrophy) and diminished contractility 1. Skeletal muscle in ICUAW can display muscle contractile dysfunction, causing decreased muscle-specific force-generating capacity, despite being structurally intact. This functional impairment has been attributed to diminished membrane excitability due to an acquired sodium channel abnormality 81 , mitochondrial dysfunction 82 , oxidative stress, and altered calcium homeostasis within muscle fibers impairing excitation-contraction coupling 1. Membrane inexcitability

has been shown to occur early in critical illness and predicts the development of muscle weakness 83 . Critically ill patients with greater severity of illness and elevated IL-6 plasma levels at the onset of critical illness were found to be at high risk for development of non-excitable muscle membrane 84 . Decreased mitochondrial enzyme activity and mitochondrial content in skeletal muscle have been found in animal models of sepsis 85 and in critically ill patients 86,87 . Fredriksson et al 87 found that the pattern of mitochondrial gene expression observed in septic patients with MODS differed appreciably from muscle unloading in humans. Reduced mitochondrial function is associated with loss of muscle mass and muscle dysfunction in several animal models 88 . It is hypothesized that dynamic changes in mitochondrial morphology regulate mitochondrial function and alter signaling pathways linked to atrophy; however, these mechanisms remain unclear 89 . Levels of proinflammatory cytokines IL-1, IL-6, and tumour necrosis factor α (TNF-α) are correlated with severity of illness and mortality in critically ill patients 90 and implicated in ICUAW. TNF-α, the most extensively studied inflammatory cytokine in muscle, promotes both contractile dysfunction and muscle atrophy 91 . TNF-α upregulates reactive oxygen species (ROS) that can disrupt muscle contractile proteins and induce contractile dysfunction in the absence of muscle atrophy. The production of free radicals with mitochondrial dysfunction results in energy loss and muscle fiber damage leading to muscle weakness and fatigue 1. Muscle atrophy, the net loss of muscle protein and fat-free mass, results when rates of muscle proteolysis exceed those of protein synthesis. Proinflammatory cytokines including TNF-α, IL-6, and Growth and Differentiation Factor 15 (GDF-15) promote atrophy by inducing upregulation of genes involved in the UPS pathway 91,92 .
Proteolysis is achieved by several cellular signaling networks, but the predominant proteolytic pathway activated in animal models of muscle atrophy is the ubiquitin (Ub) - proteasome system (UPS) with the ubiquitin ligases atrogin-1 and muscle specific RING finger protein-1 (MuRF1) playing key regulatory roles early in the process. Unlike the animal models, analysis of engagement of the UPS in the muscle of critically ill patients has yielded inconsistent results; some studies reported UPS activation and/or up-regulation of atrogin-1 / MuRF1 93,94 , while others found decreased atrogin-1/MuRF1 expression levels 66,95 , implying decreased UPS mediated proteolysis. These discrepancies are most likely explained by temporal changes in atrogin-1/MuRF1 expression assessed by differing single time-point studies and by possible discordance between the assessment of the expression level

of a limited number of UPS components and proteasome function in studies that do not directly assess muscle proteasome activity. Autophagy, an intracellular process of bulk degradation of cytoplasmic substrates, is increased in multiple animal models of muscle atrophy 96,97 . Both autophagy and the UPS pathway are activated by the forkhead box O (FoxO) transcription factors, which are induced by immobility, inflammation, nutrient depletion and cellular stress 96,98,99 . However, the complete depletion of muscle-specific autophagy in mice has also been associated with significant myofiber degeneration and muscle weakness 100 . Thus, the balance of autophagy activation appears essential to skeletal muscle homeostasis. Few studies have assessed autophagy in ICU patients, with some having found decreased autophagy in the peripheral muscle, but increased autophagy in the diaphragm 101,102 . However, a more recent study of ICUAW examined autophagy at day 7 and month 6 post-ICU discharge compared to healthy controls and found no differences in the majority of autophagy markers at month 6 compared to controls. The investigators could not draw conclusions regarding differences at day 7 due to variability among assays 103 . Thus, the contribution of autophagy to early ICUAW remains uncertain. Decreased protein synthesis, in addition to increased protein degradation, may play a role in early ICUAW 104 . While protein synthesis was found to be decreased at day 1 of ICU admission, it increased by day 7 to rates associated with controls 66 . Two major signaling networks control muscle protein synthesis and muscle growth in animal models, the IGF-1/PI3K/Akt pathway and the myostatin-Smad2/3 pathway, functioning as positive and negative regulators respectively 105 .
The activation of Akt concurrently inhibits the up-regulation of proteolysis pathways by preventing nuclear translocation of FoxO transcription factors, and subsequently the development of atrophy, in an animal model 106 . At present, the significant pathways governing anabolic signaling in patients with critical illness remain to be delineated. Future studies will be needed to determine whether pharmacological upregulation of Akt and inhibition of myostatin restore muscle protein synthesis in ICUAW 93 . In addition to synthesis of contractile and structural proteins, injured muscle grows via skeletal muscle hypertrophy that is dependent on recruitment of muscle stem cells (satellite cells). The satellite cells are activated in response to muscle injury, causing them to proliferate, differentiate and fuse into mature muscle 1. Satellite cells are essential for regeneration of injured muscle, whereas they are not required for regrowth of muscle atrophied by immobility 107 . Studies in ICUAW that have investigated the role of satellite cells are discussed in Section 1.1.6.2 .


Previous transcriptomic analyses of early ICUAW

Whole transcriptome profiling of ICUAW in human muscle biopsies has been limited by challenges related to the invasive nature of the muscle biopsies and costs of microarray profiling 108 . Three prior transcriptomic studies of peripheral muscle in early ICUAW have assessed changes in gene expression and functional pathways compared to healthy controls 87,109,110 . Each of the studies performed differential gene expression (DE) analysis between ICUAW and healthy controls using biopsies of the quadriceps muscles. Also common to each of these transcriptomic studies were limited clinical phenotyping (e.g. muscle mass and/or strength were unmeasured), relatively small sample sizes, and absence of samples from patients with persistent ICUAW. Using whole-transcriptome gene expression profiling, Langhans et al 110 found increased SAA1 expression in patients with ICUAW, which correlated with muscle membrane excitability in critically ill patients. Di Giovanni et al 109 found upregulation of transforming growth factor beta (TGF-β) pathways in a cohort of five patients with ICUAW, whereas Fredriksson et al 87 studied a cohort of septic patients with MODS in the ICU (therefore having high likelihood of ICUAW) and found dramatic loss of unique muscle-related gene expression, leading the investigators to suggest that patient muscle tissue was undergoing a de-differentiation process. The investigators also found downregulation of genes involved in oxidative stress response; decreased mitochondrial enzyme function was found using spectrophotometric analysis.

Contribution of immobility and other factors to atrophy in critical illness

Critically ill patients in the ICU are frequently immobilized, sedated, underfed and have heightened inflammation and systemic cellular energy stress. Any of these conditions may occur independently in healthy individuals and each is associated with increased muscle degradation via upregulation of proteolytic pathways 1. An important challenge in understanding the pathomechanisms of ICUAW has been identifying the biological pathways unique to ICUAW versus those that are common to other types of muscle atrophy. Critically ill patients in the ICU have reduced physical activity of the peripheral and respiratory muscles due to bed rest, limb immobilization and mechanical ventilation 111 . This mechanical unloading results in skeletal muscle wasting termed disuse atrophy, characterized by decreased

muscle mass and fiber cross-sectional area 112 . As patients with ICUAW are frequently immobile and at risk for disuse atrophy, questions have remained as to what degree the muscle wasting observed is related to immobility versus underlying ICUAW. In CIM there is a remarkably preferential loss of myosin relative to actin, which cannot be explained merely by immobilization 113 . These structural changes are more likely explained by direct muscle impairment that underlies CIM and indirect muscle impairment due to axonal degeneration in CIP 114 . The deleterious effects of critical illness and immobilization may act synergistically to intensify the pro-inflammatory state and accelerate muscle turnover 115-118 . Bed rest reduces mechanical load on skeletal muscles, which has been shown to induce muscle catabolism and decrease muscle contractile strength, even in the absence of critical illness 19,119 . Muscle fiber atrophy begins within hours of immobility in healthy volunteers, resulting in an average decrease of 3% of muscle mass weekly 120 , with more than 50% of muscle atrophy occurring in the initial 2 weeks 121 . Immobility has been shown to increase oxidative stress leading to accelerated muscle fiber degradation via activation of multiple proteolytic systems in animal models 99,118 . Moderate exercise increases anti-oxidants to counteract oxidative stress, suggesting potential therapeutic benefits with early rehabilitation in the ICU setting 46,82,122,123 . Fredriksson et al 87 attempted to utilize previous data from animal models of muscle wasting, inactivity, and inflammation. However, there are a number of disadvantages to using animal models to study ICUAW 91 and the utility of such models to accurately represent human muscle wasting disorders is controversial 124 .

1.1.6.2 Sustained ICUAW

The pathobiology of sustained ICUAW is still undetermined; however, emerging evidence supports impaired muscle regeneration as an important causative factor. A murine model of sepsis has shown long-term impairment in satellite cells that leads to inefficient muscle regeneration 125 . In humans, a recent longitudinal cohort study was the first to assess muscle biopsies and clinical measures of muscle mass and strength in early and sustained ICUAW 103 . Patients with ICUAW were assessed at Day 7 (early ICUAW) and Month 6 (sustained ICUAW) post-ICU discharge. Whereas myopathy persisted at the 6-month follow-up, neuropathy remained in only one patient. For each muscle biopsy sample, the activity of the ubiquitin-proteasome system (UPS), mitochondrial content, vascularity, inflammatory

infiltration, myofiber cross-sectional areas, and quantity of muscle stem cells (satellite cells) were assessed. The initially observed decreased mitochondrial content and increased ubiquitin-proteasome system-mediated muscle proteolysis and inflammation all normalized at 6 months. Thus, patients with sustained ICUAW were characterized by normal mitochondrial content and structure, without excessive inflammation or upregulation of proteolysis. Normal myofiber structure was noted on electron microscopy and peripheral nerve function was normal on nerve conduction studies. Strikingly, a reduced quantity of satellite cells was noted in muscle biopsies compared to healthy controls and to ICUAW patients who normalized muscle mass. These findings suggest that impaired regenerative capacity may explain the reduced muscle mass and weakness in patients with sustained ICUAW.

Figure 1.1 Putative modifiable risk factors for muscle atrophy and ICUAW (above) and potential therapeutic interventions (below) in a mechanically ventilated patient with critical illness. Solid red arrows denote interventions for critical illness. Dashed arrows denote adverse effects of the intervention in addition to underlying critical illness. Interventions with predominantly inconclusive or contradictory findings in the literature are denoted by a question mark (?).


1.2 Gene co-expression networks

1.2.1 Gene co-expression networks for discovering molecular mechanisms

Complex human diseases, like all biological systems, can be examined at multiple scales 126,127 (Figure 1.2). At the lowest scale is the expression of a single gene (itself represented by multiple probes on a microarray) and at the largest scale is the clinical phenotype of the disease as assessed by clinical measurements 128 . Variations across clinical phenotypes are viewed as an emergent property arising from complex interactions at multiple levels of biological scale. Gene expression reflects the biological state of the cells being analyzed 129 . Changes in the expression of sets of genes mediated by shared regulatory mechanisms result in variations in the cellular state and ultimately the clinical phenotype. Therefore, changes in the expression of these genes as measured by mRNA abundance (a molecular phenotype) can be correlated with measurable changes in the trait (clinical phenotype). Differential expression (DE) analysis can be used to identify genes altered in disease; however, it does not account for the relationships between genes, and cannot identify small but potentially biologically significant changes in groups of co-expressed genes. Thus, to understand how groups of co-expressed genes contribute to the disease phenotype it has become necessary to apply data-driven methods 129 to identify critical networks that are relevant to disease mechanisms 130-132 . Molecular phenotyping using microarrays remains the most commonly used method to measure relative (but not absolute) gene expression 133 , and has been shown to be accurate and reproducible across platforms, with similar performance and lower cost than RNA sequencing 134 . Genes that perform similar functions or interact within the same molecular pathway are often termed “gene sets”. Gene ontology (GO) is the most widely used ontology (uniform vocabulary and structure) for categorizing functionally related gene sets.
Groups of genes that are tightly co-expressed, thus having gene expression highly correlated across multiple samples, are often termed “modules” and tend to be functionally related 130,135 . Co-expression analysis therefore has the potential to identify modules enriched for genes comprising one or more biological pathways.
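As an illustration of the idea of modules as tightly co-expressed genes, the following minimal Python sketch computes pairwise Pearson correlations across samples and groups genes whose correlation exceeds a threshold. The gene names, expression values, the 0.9 threshold and the connected-components grouping are all hypothetical choices made for illustration; this is a simplified stand-in and not the network construction method used in this thesis.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def coexpression_modules(expr, threshold=0.9):
    """Group genes into modules: connected components of the graph that
    links two genes when their signed correlation is >= threshold.
    (Assumption: a hard threshold is used purely to keep the toy simple.)"""
    genes = sorted(expr)
    neighbours = {g: set() for g in genes}
    for g, h in combinations(genes, 2):
        if pearson(expr[g], expr[h]) >= threshold:
            neighbours[g].add(h)
            neighbours[h].add(g)

    modules, seen = [], set()
    for g in genes:
        if g in seen:
            continue
        stack, component = [g], []
        while stack:  # depth-first walk over correlated neighbours
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.append(node)
            stack.extend(neighbours[node] - seen)
        modules.append(sorted(component))
    return modules


# Hypothetical expression values: rows are genes, columns are five samples.
EXPR = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [3.0, 5.0, 1.0, 4.0, 2.0],
    "geneD": [6.1, 9.8, 2.2, 8.1, 3.9],
}

MODULES = coexpression_modules(EXPR)
print(MODULES)
```

In this toy data, geneA and geneB rise and fall together across samples, as do geneC and geneD, so the sketch recovers two modules of two genes each.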

Gene Ontology (GO) and other biological pathway databases are incomplete, with bias favoring well characterized genes 136 (see Limitations section 6.2 ). Therefore, co-expression analysis can allow genes without known function to have their function inferred using the validated “guilt by association” principle in modules enriched for genes with known function 137-139 . Gene Set Enrichment Analysis (GSEA) is another method that can detect small but coordinated changes in gene expression by testing whether gene sets accumulate at the top or bottom of a full ranked list of genes ordered by magnitude of gene expression change 140 . Like co-expression analysis, GSEA does not require the user to select a threshold on fold change, thereby taking account of all measured genes. However, unlike co-expression analysis, which can function in an unsupervised manner (described in Section 1.2.2 ), GSEA requires a priori defined gene sets as input. Furthermore, GSEA does not explicitly take into account the interactions between genes in the gene sets. The sources of gene co-expression can be biological and non-biological (e.g. batch effects) (Figure 1.3). Modules from biological sources reflect convergent mechanisms related to genetic, regulatory (e.g. transcription factors [TFs] and microRNAs [miRs] discussed in Section 1.3 ), cell signaling, and environmental influences 141 . The degree of cellular heterogeneity (proportions of different cell types) within samples can also influence co-expression modules, which may be considered biologically useful in some instances. For example, cell-type specific modules, driven by co-expression of cell-type specific genes, can be identified if several cell types are present across samples 139 . As the correlation patterns from non-biological sources are indistinguishable from biological, caution is required when interpreting the biological significance of modules that are identified.
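The notion of a gene set accumulating at the top or bottom of a ranked gene list, as tested by GSEA, can be illustrated with a simplified running-sum statistic. The Python sketch below is an unweighted toy version under stated assumptions: the gene names and gene sets are hypothetical, every hit contributes equally, and no permutation-based significance testing is performed. Published GSEA implementations weight hits by the ranking metric and estimate significance by permutation, so this should not be read as the GSEA algorithm itself.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    Walk down the ranked list, stepping up when a gene belongs to the
    gene set and down otherwise; return the largest deviation from zero.
    A positive score means the set is concentrated near the top of the
    list, a negative score means it is concentrated near the bottom.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running, extreme = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(extreme):
            extreme = running
    return extreme


# Hypothetical ranked list, ordered from most up- to most down-regulated.
RANKED = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]

ES_TOP = enrichment_score(RANKED, {"g1", "g2", "g3"})     # set at the top
ES_BOTTOM = enrichment_score(RANKED, {"g6", "g7", "g8"})  # set at the bottom
print(ES_TOP, ES_BOTTOM)
```

Because the statistic uses the position of every gene in the ranked list, no fold-change cutoff has to be chosen, which is the property highlighted in the text above.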
Modules are more likely to be derived from a biological source if they are enriched for biological pathways or genes sharing a similar regulatory mechanism, or if the summary expression of the module correlates with relevant phenotypic trait(s). However, it is not possible to conclude causality between gene expression and clinical phenotype even if significant correlation between the module and phenotypic trait is detected. Another common means to assess biological networks is to examine their global structural properties. Gene co-expression networks are typically modeled as undirected networks consisting of nodes (e.g. genes) and connections between them termed edges (e.g. correlation values). Directed networks, in contrast, have directionality specified, such as in TF binding networks or miR regulatory networks, where the regulatory factor interacts with its targets.

A number of studies have examined the frequency distribution of gene network connectivity and many have found these networks to exhibit scale-free topology 142. The scale-free topology is characterized by a large number of nodes having only one or a small number of connections and a smaller number of highly connected nodes termed “hubs”. This topology is notable for its robustness, as loss of individual components would typically preserve the overall network architecture 143. The tightly clustered topology of the scale-free network allows for well-organized transcriptional activation 144. With a low path length between nodes, the highly connected hubs can serve as master regulators of cellular function and development. Mutations in genes known to be hubs have been found to have critical roles in disease 139,145. Thus, when selecting amongst gene co-expression networks derived from correlation data, a model with approximate scale-free topology would be more plausible than a network that does not meet this criterion.

As microarray studies are often characterized by noise and small sample sizes, pairwise correlation values of gene-gene expression have been shown to be suboptimal for analysis of co-expression networks 135. To reduce the impact of erroneous correlations between genes and detect more biologically meaningful results, the Topological Overlap Measure (TOM) was adapted from protein-protein interaction studies 146 for application to gene expression correlation matrices. TOM, a robust measure of interconnectedness, is based on the concept that when assessing two genes, the greater the number of common neighboring genes shared between them, the stronger their relationship. Thus TOM integrates both the direct gene-gene connection strengths and those mediated by shared neighbors in the network.
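To make the topological overlap concept concrete, the following is a minimal Python sketch. It assumes the commonly used weighted form TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), where a_ij is the adjacency between genes i and j, l_ij is the summed product of their adjacencies to every shared neighbor, and k_i is the connectivity of gene i. The function name and the toy adjacency matrix are illustrative only and are not taken from the studies cited above.

```python
def topological_overlap(adj):
    """Return the TOM matrix for a symmetric adjacency matrix with entries in [0, 1]."""
    n = len(adj)
    # Connectivity k_i: summed connection strength of gene i to all other genes.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # l_ij: connection strength routed through shared neighbours u.
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom


# Toy network: genes 0 and 1 are not directly connected but share neighbours 2 and 3.
adjacency = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
tom = topological_overlap(adjacency)
```

In this toy example genes 0 and 1 receive a non-zero overlap despite having no direct connection, which illustrates how shared neighbors contribute to the measure.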
The biological relevance of a co-expression module can also be assessed based on reproducibility across independent cohorts, animal models, or other cell types. Module preservation statistics allow the investigator to assess the degree of similarity of the modules generated in independent samples which can serve as external validation 147 . Two other fundamental concepts in co-expression networks are related to thresholding and scaling the gene correlation matrix to improve detection of biologically meaningful modules:

Weighted vs. unweighted networks

In an unweighted network the interactions between genes are binary (present or absent). Weighted networks use continuous values from 0 to 1 (derived from the correlation measures) to indicate the strength of the interaction between two genes. Weighted networks can be transformed into an unweighted network using hard thresholding, for example by thresholding the absolute values of the correlation matrix. Soft thresholding has been used to preserve the continuous nature of the gene co-expression data while emphasizing strong correlations and penalizing weak correlations. A weighted network adjacency can be generated by raising the co-expression similarity s_ij to a power β ≥ 1. A common method of selecting the value of β is by evaluating whether the network at the chosen β exhibits an approximate scale-free topology.
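The difference between hard and soft thresholding can be illustrated with a short Python sketch. The similarity matrix, the cutoff `tau` and the power `beta` below are arbitrary illustrative values, not parameters used in this thesis, and the selection of β by the scale-free topology criterion is not shown.

```python
def hard_threshold(similarity, tau):
    """Unweighted adjacency: 1 if the similarity reaches tau, otherwise 0."""
    return [[1 if s >= tau else 0 for s in row] for row in similarity]


def soft_threshold(similarity, beta):
    """Weighted adjacency: a_ij = s_ij ** beta, with beta >= 1."""
    return [[s ** beta for s in row] for row in similarity]


# Toy co-expression similarity matrix for three genes (values in [0, 1]).
similarity = [
    [1.0, 0.9, 0.5],
    [0.9, 1.0, 0.2],
    [0.5, 0.2, 1.0],
]
unweighted = hard_threshold(similarity, tau=0.8)
weighted = soft_threshold(similarity, beta=6)
```

With soft thresholding the moderate similarity of 0.5 is shrunk towards zero rather than discarded outright, so the ranking of connection strengths is preserved.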

Signed vs. unsigned networks

Correlation-based weighted networks can be analyzed as signed or unsigned networks. In an unsigned network, the absolute value of the correlation is used to represent the relationship between genes. As a result, negatively correlated genes are grouped together with positively correlated genes. To overcome this limitation, a signed network ensures that genes with negative correlation have a low similarity by scaling the correlation values into the range between 0 and 1, such that values < 0.5 indicate negative correlation and those > 0.5 indicate positive correlation. A signed hybrid method keeps positive correlation values unchanged (between 0 and 1) and sets the similarity to 0 whenever the corresponding correlation is negative. Signed networks have been shown to identify modules with more coherent functional annotation 148. In the next step, modules are identified using a clustering technique.
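The three similarity definitions can be written compactly as follows. This is a sketch assuming the usual scaling (1 + r) / 2 for the signed network; the function names are hypothetical.

```python
def unsigned_similarity(r):
    """Absolute correlation: the sign of the relationship is discarded."""
    return abs(r)


def signed_similarity(r):
    """Rescale r from [-1, 1] to [0, 1]; values below 0.5 indicate negative correlation."""
    return (1.0 + r) / 2.0


def signed_hybrid_similarity(r):
    """Keep positive correlations unchanged and set negative correlations to 0."""
    return r if r > 0 else 0.0


# A strongly negative correlation is treated very differently by the three schemes.
r = -0.8
similarities = (unsigned_similarity(r), signed_similarity(r), signed_hybrid_similarity(r))
```

For a correlation of -0.8, only the unsigned scheme yields a high similarity, which is why negatively correlated genes can end up in the same unsigned module.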

1.2.2 Weighted gene co-expression network analysis (WGCNA)

Weighted gene co-expression network analysis identifies modules in a multiple-step process. First, a gene-gene correlation matrix S = [s_ij] is calculated, where s_ij is the pairwise correlation coefficient between the expression of genes i and j. For small sample sizes (n ~ 20), robust correlation measures such as Tukey’s biweight or Spearman’s correlation have been shown to reduce the effect of outliers 149. The correlation matrix S is then transformed by raising it to a power β ≥ 1 chosen based on the scale-free topology criterion (soft threshold). Next, the topological overlap measure is applied to generate dissimilarity (distance) measures, where genes with high co-expression have a small dissimilarity and those without co-expression have a large dissimilarity. These values are then input into average linkage hierarchical clustering. Modules are defined as branches of the resulting tree produced by clustering. These steps occur in an unsupervised manner, without any reference or bias to external gene sets, databases or publications.

Next, for each sample, the expression profiles of all genes within each module are summarized using the first principal component (PC), termed the module eigengene (ME). Principal component analysis is a data reduction method to assess dominant patterns in multivariate data. The first PC (i.e. the module eigengene) captures the largest explained variance within the data. Module eigengene expression can then be compared between cases and controls or correlated to clinical traits. Thus, WGCNA exploits data reduction to mitigate the multiple testing problem inherent to transcriptomic analysis by reducing analysis to modules (typically fewer than 100 detected by WGCNA) rather than tens of thousands of genes. WGCNA is commonly used to detect modules associated with disease or clinical traits. For example, if module eigengene expression increases with severity of pathology, genes within the module can be further characterized based on functional similarity or known disease associations. While WGCNA provides a useful conceptual framework to better understand the pathomechanisms of disease, modules are less suited for use as disease classifiers compared to methods for detecting gene signatures for biomarker discovery (discussed in Section 1.4) 130.

Other network analysis methods capture non-linear relationships in gene co-expression data. The Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) applies the mutual information (MI) statistic between two variables using a hard threshold chosen based on the significance level estimated using permutation testing. Allen et al 150 found that WGCNA and ARACNe both performed well in constructing the global network architecture of simulated and E. coli transcriptomic data.
In Section 1.3 , application of ARACNe for detection of gene networks targeted by a miR regulator will be discussed.
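The module eigengene step of WGCNA described in this section can be sketched in plain Python. The example below is illustrative only: it obtains the first principal component by power iteration rather than by the singular value decomposition typically used in practice, it assumes that no gene is constant across samples and that the average standardized profile is not all zeros, and the toy expression values and function names are hypothetical.

```python
import math


def standardize(profile):
    """Scale one gene's expression across samples to mean 0 and unit variance."""
    n = len(profile)
    mean = sum(profile) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in profile) / (n - 1))
    return [(x - mean) / sd for x in profile]


def module_eigengene(expression, iterations=200):
    """First principal component (one value per sample) of a genes x samples module."""
    z = [standardize(gene) for gene in expression]
    n_genes, n_samples = len(z), len(z[0])
    # Start from the average standardized profile. Because Z^T Z is positive
    # semi-definite, the result stays positively oriented with this average.
    v = [sum(gene[s] for gene in z) / n_genes for s in range(n_samples)]
    for _ in range(iterations):
        scores = [sum(gene[s] * v[s] for s in range(n_samples)) for gene in z]  # Z v
        v = [sum(scores[g] * z[g][s] for g in range(n_genes)) for s in range(n_samples)]  # Z^T (Z v)
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v


# Toy module: three co-expressed genes measured in five samples.
module = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 11.0],
    [5.0, 7.0, 9.0, 11.0, 12.0],
]
eigengene = module_eigengene(module)
```

The resulting vector has one value per sample and could then be compared between groups or correlated with a clinical trait, as described above.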

Figure 1.2 Schematic to conceptualize biological complexity at multiple scales. The top figure ranks biological sources from left to right by increasing biological complexity (single gene probes from a microarray, biological pathway, WGCNA co-expression module, population of cells/tissues, and clinical phenotype). The methods used to assess and integrate these biological sources are listed below. The figure at the bottom shows a microarray (left) for measuring gene expression and CT scan images (right) used to estimate muscle mass for clinical phenotyping. Gene co-expression modules (middle) characterize relationships between genes and clinical phenotypes. Top part of figure adapted from Deffur et al 127 with permission.

Figure 1.3 Sources of gene-gene correlation/co-expression. Biological and non-biological processes can contribute to the observed expression patterns of two or more genes. The biological sources include genetic, gene regulatory (transcription factors, epigenetic mechanisms including methylation, and microRNAs), biochemical, cell type heterogeneity and environmental factors. These multiple biological processes have convergent influence on the transcriptome and give rise to co-expression networks which represent the biological state. Additionally, non-biological technical variability, such as batch effects, can generate gene correlation patterns indistinguishable from biological sources of co-expression (figure adapted with permission from Gaiteri et al 130).

1.3 Integration of microRNA and mRNA expression data

MicroRNAs (miRs) are small single-stranded noncoding RNAs (~20-22 nucleotides) that function in RNA silencing and post-transcriptional regulation of mRNA expression. MicroRNAs target messenger RNA (mRNA) expression by binding to sites in the target mRNA that are complementary to the miR seed sequence 151. This interaction, mediated by the miR-induced silencing complex (miR-RISC), reduces the stability and translational rate of the mRNA target 152,153. These miRs are predicted to target one-third of all mRNAs in the genome, where each miR is expected to target hundreds of transcripts 154,155. An increasing number of miRs have been implicated in muscle diseases and shown to play important roles in myogenesis and muscle repair 156,157. The skeletal muscle enriched miRs are the “myomiRs” miR-1, miR-133a/b, miR-206, miR-486, and miR-499a/b. These miRs have been shown to play key roles in myogenesis, regeneration and hypertrophy of adult muscle 157-159. There is evidence suggesting that miRs can contribute to variability in myonuclear recruitment, determined by the balance of satellite cell/myoblast proliferation and differentiation 160. Higher levels of miRs that promote the pluripotent-stem-cell phenotype favor maintenance of muscle mass, whereas higher levels of miRs that promote withdrawal from the cell cycle and differentiation favor wasting (Lewis et al., 2016). These observations imply that delaying myocyte differentiation and extending the proliferation period increases the ability of the muscle to regenerate, thus leading to a slower loss of muscle mass in response to injury or insult. Understanding the regulatory mechanism of miRs in mRNA regulatory networks will help elucidate the molecular mechanisms underlying muscle regeneration. miRs have potential as disease biomarkers, including prognostic markers of ICUAW, and as therapeutic targets 161. Van de Worp et al 108 have proposed the term “atromiRs” to describe miRs involved in muscle atrophy and cachexia.
miR-29b is a potential atromiR as it was shown to be required for the loss of muscle mass in murine models of muscle atrophy 162. Using a pre-selected screen of miRs, miR-542-3p and miR-424-5p were found to be upregulated in patients with ICUAW, and miR-424-5p expression was inversely proportional to physical function 163. Farre Garros et al 164 selected miR-542 for further evaluation in their ICUAW cohort as it was dysregulated in COPD and was predicted to target pathways controlling protein turnover and energy balance; it was found to be upregulated in their ICUAW cohort compared to controls and associated with mitochondrial dysfunction. As miR-424-5p is expressed from the same locus as miR-542, Connolly et al 163 studied miR-424-5p and found it upregulated in the same ICUAW cohort used by Farre Garros et al (cohort characteristics described in Bloch et al. 92). Overexpression of the rodent miR-424 orthologue, miR-322, in mice resulted in muscle atrophy and reduced ribosomal RNA levels. These studies are the first to identify DE miRs that are partially conserved between different human diseases and contribute to atrophy across other muscle diseases 108. It is expected that numerous important miR regulators of muscle function in ICUAW remain to be discovered. Therefore, unsupervised miR transcriptome profiling from patients with ICUAW compared to healthy controls and other muscle diseases is required to identify other miRs dysregulated in ICUAW.

As the number of published miR sequences continues to increase with small RNA deep sequencing experiments 165, the biological implications of miRs as modulators of post-transcriptional regulation expand. As of August 2018, there are 2815 miR sequences annotated in miRBase (http://www.mirbase.org), the primary miR sequence repository (described in more detail in Section 1.3.2). However, the functions of only a subset of these miRs have been experimentally determined. As miRs mainly regulate function through their targets, elucidating the miR-target interactions (MTIs) is vital for functional characterization of miRs. Therefore, much progress has been made over the past decade to develop high-throughput experimental and computational methods for MTI identification.
Despite advances, determination of MTIs and identification of functional modules composed of miRs and their specific targets remains a challenge. While high-throughput measurement of miR and mRNA expression has become relatively straightforward, their joint integration for detecting high-confidence interacting miR:mRNA pairs is more challenging 166,167 . The integration of matched miR and mRNA expression data with sequence-based target prediction has been shown to significantly improve the quality of the identified MTIs 167,168 .

1.3.1 Overview of high-throughput experimental methods for miR-target identification

High-quality experimentally derived training data are generally required to improve sequence-based target prediction performance 169. High-throughput methods such as those employing crosslinking and immunoprecipitation (CLIP) are an important class of capture-based methods for detection of direct miR-target binding events associated with the Argonaute protein (Ago) 170,171. Argonaute high-throughput sequencing of RNAs isolated by CLIP (Ago-HITS-CLIP) simultaneously sequences Ago-miR and Ago-mRNA binding sites to identify interaction sites between miR-target pairs 171. One limitation of this approach is that miR-target complexes are dissociated prior to sequencing, requiring the target sequence in each miR-target pair to be inferred computationally, which is prone to error. Recently, a method that ligates the miR-target pair, called crosslinking, ligation, and sequencing of hybrids (CLASH), has been shown to be more robust than CLIP for identification of miR target sites 172. CLASH is similar to CLIP, but adds a ligation step between the miR and target, allowing direct characterization of the chimeric, or hybrid, miR-target to unambiguously identify the miR bound at a specific target site. A novel finding from CLASH analysis was the detection of strongly overrepresented motifs in the interaction sites of several miRs, suggesting that individual miRs systematically differ in their binding site modes. Although this likely affects the response of RISC to miR-target binding, it is unclear how it impacts in vivo function of MTIs 172. While CLASH holds much promise, at the present time this method has a very low yield, with only about 2% of the reads obtained in an experiment corresponding to miR-target chimaeras. Thus, further improvements to CLASH will be needed before comprehensive mapping of MTIs will be possible 173.
As each cell line has a different miR expression profile, the cell line used in an experimental analysis will yield different sets of MTIs than other cell lines or disease conditions. For example, a miR with low expression in a cancer (sub)type profiled using CLASH may not be detected, whereas that miR may have high expression in another (sub)type of cancer. Therefore, studies have integrated knowledge of MTIs from one cell line or condition identified by CLIP and CLASH with miR and mRNA expression profiles from the cell line or condition of interest, yielding condition-specific MTIs.

1.3.2 Experimentally supported miR-target interaction databases

DIANA-TarBase is a manually curated database of experimentally derived MTIs. The eighth version of TarBase contains > 670,000 unique miR-target pairs (the largest public database of experimental MTIs) from roughly 600 cell types/tissues and over 33 experimental methodologies (including low and high-throughput methods). While high-throughput experimental (in vitro) approaches are being increasingly performed, they currently suffer from several disadvantages compared to in silico predicted MTIs: 1) the number of experimentally derived MTIs remains far fewer than predicted MTIs 174 and 2) the tissue specificity of the experimental method may exclude availability of cell or disease (sub)types of interest to the investigator. Therefore, the most common approach to facilitate inference of disease-specific MTIs involves profiling paired miR and mRNA expression datasets (described below). Integrating condition-specific (“dynamic”) expression profiles with in silico predicted MTI (“static”) datasets has been shown to improve MTI prediction 167.

miRBase is the primary database of all miR annotations and sequences for all species, providing a centralized repository for miR names and official rules for miR nomenclature that is updated yearly 175. The latest miRBase version 22, released in March 2018, included 2815 miR sequences from the human genome derived from published literature. Annotations of many miRs vary markedly across previous miRBase versions. Each new release of miRBase includes corrections of misannotated miRs from previous versions. Additionally, changes to the miR symbol annotations took place between miRBase versions 16 through 19 compared to previous versions. Previous miRBase versions used annotation that included family relationships and expression levels that have been subsequently removed to alleviate the complexity of the annotations. For example, in miRBase 16 the “-as” nomenclature (indicating antisense miRNA) and the “miR*” designation (indicating the miR is found at lower concentration than the mature product without asterisk) were discontinued.
Since miRBase version 19 the “-5p” and “-3p” suffixes have been consistently adopted (where 5p means that the miR is from the 5’ arm of the hairpin and 3p that it is from the 3’ arm). If miR sequences are similar then the “a” and “b” suffixes are added to the miR name.

1.3.3 Sequence-based miR-target prediction

A number of bioinformatics algorithms for predicting miR recognition sites within transcripts have been developed using knowledge from experimentally validated target sites. There is a diversity of algorithms employed for miR-target prediction, which can be categorized based on seven basic features 176: (i) assessment of evolutionary conservation of the putative binding region, (ii) target sequence analysis of putative binding regions (e.g. miR sequence complementarity), (iii) binding energy between miR and putative target sequence, (iv) use of miR/mRNA expression profiles, (v) use of CLIP data, (vi) use of machine learning methods, (vii) use of prior predictions from other miR-target prediction tools.

Many algorithms, including TargetScan 177 and TargetRank 178, use evolutionary conservation of the target site to select predicted targets and reduce false positive (FP) predictions. However, roughly 20% of functional target sites are not conserved between species, and conservation is further decreased in a step-wise fashion in larger taxonomic groups 179, indicating that sensitivity of target prediction decreases with higher conservation thresholds 180. Early experimental studies found that a primary determinant of target specificity was perfect complementarity (“canonical site”) to the miR “seed region” at positions 2-7 of the 5’ end 151,152,178. Therefore, these algorithms initially were primarily focused on sequence complementarity between the seed region of the miR and the 3’ untranslated region (UTR) of the putative target 178,181. However, given the large number of randomly occurring 6 nucleotide sequences in the 3’UTR of an mRNA, a perfect seed match by itself is a poor predictor of miR regulation 180,182,183. Several studies have proposed that target sites in which the pairing between miR seed and mRNA does not completely match (termed “non-canonical sites”) also confer regulatory effect 172,184. Recent studies 177,185 found that while miRs bind to non-canonical sites, there was no detectable repression based on mRNA stability or translation using multiple cell types. This finding has supported the focus on canonical binding sites by sequence-based target prediction programs. Multiple studies have demonstrated the important role of structural accessibility in miR-target recognition 186,187.
The thermodynamic stability of the seed/target binding has been shown to be a significant determinant of the miR-target recognition by preventing or promoting the interaction 185. Several algorithms model the mRNA secondary structure and accessibility of the target sequence to explain variability in target strength 169,186,188. The Probability of Interaction by Target Accessibility (PITA) algorithm 186 scans the mRNA sequences for sites complementary to miR seeds and computes the difference between the free energy gained from the formation of the miR-target duplex and the energetic cost of unpairing the target to make it accessible to the miR.

The growing number of experimental MTI data has prompted more recent use of machine learning algorithms to train classifiers directly on the experimental data. TargetScan version 7 177 was developed using microarray expression sets from miR knock-up and knock-down experiments and CLIP datasets to assemble a set of 26 candidate features to predict the effects of miR binding to canonical sites. Stepwise regression models were built with an increasing number of features until the optimal Akaike Information Criterion (AIC) value was reached. The AIC evaluates the tradeoff between the cost of increasing the complexity of the model (more features) and the benefit of increasing the likelihood of the regression fit. The final model selected 14 different features of the miR and mRNA, which were used to train multiple linear regression models to predict which sites within mRNAs are most effectively targeted by miRs. The investigators found that TargetScan performed significantly better than existing models and was as good as recent high-throughput experimental approaches at identifying effective target sites. DIANA-microT 189 is another miR-target prediction algorithm based on feature extraction from CLIP datasets. True sites were defined as overlapping in CLIP whereas false sites were nonoverlapping. Significant features were combined through generalized linear models and independently tested on high-throughput proteomics data. miRTarget 169 is a target prediction model generated using a support vector machine algorithm based on structural and sequence features from several CLASH datasets. The predicted targets are represented in the online database miRDB. Agarwal et al 177 recently compared 17 sequence-based target prediction algorithms and found that the number of potential MTIs varied greatly, reflecting the varied strategies of these algorithms. Other sequence-based miR target prediction algorithms have been comprehensively reviewed elsewhere 190.

The miR Data Integration Portal (miRDIP) database 176 integrates human miR-target predictions from a large number of resources (currently 30, including miRDB, TargetScan, and PITA described above).
The miRDIP investigators found that individual resources overlap only mildly, suggesting that predictions are heavily dependent on the underlying methodology and data used by each resource. miRDIP quantified the confidence of the predictions within each resource by assessing the subset of experimentally validated MTIs that are present in each prediction resource to obtain a set of ranks and associated predictions. Then the precision of all the predictions from the resource was obtained, termed the confidence score, which allows quantitative comparison of the predictions across the resources. The confidence scores are grouped into four intuitive categories, “very high”, “high”, “medium” and “low” confidence, based on their ranks to assist in interpretation. Finally, an integrative score is assigned to each MTI based on predictions across all the prediction resources. The integrated score was shown to provide more accurate predictions of MTIs than those obtained from individual resources.

1.3.4 Methods for integrating miR-mRNA expression data

The use of joint miR-mRNA expression data facilitates assessment of the regulatory relationship between miRs and their putative targets, allowing the functional significance of miRs to be ascertained. A number of approaches have been used to quantify the statistical significance of the association between a miR and its target using their expression measurements. Cantini et al 191 divide these approaches into two main categories based on the final aim of the method: (i) those investigating miR-mRNA pairs and (ii) miR-centered approaches. A limitation of methods identifying miR-mRNA pairs is that evaluation of the miR is always conditioned by the activity of the associated mRNA. To overcome this, miR-centered approaches begin with miR differential expression analysis followed by evaluation of their regulatory effect on mRNAs based on expression. Next, those genes evaluated to be regulated by the miR are compared to sequence-based target predictions for the same miR, and the intersection between the two mRNA lists is selected as the output (MTIs).

The principle of assuming that the expression levels of miRs and target mRNAs are negatively correlated is commonly used to detect MTIs 192-194. These methods typically select potential miR-target pairs that (i) are negatively correlated above some statistical significance threshold and (ii) have been identified to interact using sequence-based target prediction or experimental methods (Figure 1.4). GenMiR++ was the first target prediction algorithm developed that integrated miR and mRNA expression data with sequence-based target predictions (from TargetScan) 195, and it scored candidate miR regulators according to how much the miR expression profile explained downregulation of the mRNA expression using a Bayesian inference algorithm. The large number of miR–target correlations calculated necessitates estimation of the false discovery rate (FDR), defined as the number of FP divided by the number of FP and true positives (TPs).
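The two-criterion selection of candidate MTIs described above (negative correlation plus presence in a target prediction resource) can be sketched as follows. This is a minimal illustration with hypothetical miR and gene names and toy expression values; it applies only a correlation cutoff and omits the P-value threshold that would normally accompany it.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def candidate_mtis(mir_expr, mrna_expr, predicted_pairs, r_cutoff=-0.5):
    """Keep predicted miR-mRNA pairs whose matched expression is negatively correlated."""
    hits = {}
    for mir, gene in predicted_pairs:
        r = pearson(mir_expr[mir], mrna_expr[gene])
        if r < r_cutoff:
            hits[(mir, gene)] = r
    return hits


# Matched expression across five samples (hypothetical values).
mir_expr = {"miR-A": [1.0, 2.0, 3.0, 4.0, 5.0]}
mrna_expr = {
    "GENE1": [9.8, 8.1, 6.2, 3.9, 2.0],  # falls as miR-A rises
    "GENE2": [1.1, 2.0, 2.9, 4.2, 5.1],  # rises with miR-A
    "GENE3": [5.0, 4.0, 3.0, 2.0, 1.0],  # anticorrelated but not a predicted target
}
predicted_pairs = {("miR-A", "GENE1"), ("miR-A", "GENE2")}
hits = candidate_mtis(mir_expr, mrna_expr, predicted_pairs)
```

Only the pair that satisfies both criteria is retained: GENE2 is a predicted target but is positively correlated, and GENE3 is negatively correlated but is absent from the prediction set.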
Peng et al. 193 proposed a permutation-based method to estimate the FDR of miR–target correlations at a given statistical threshold (Figure 1.5 A). The FDR was defined as the ratio of the number of correlated miR–mRNA pairs passing a given threshold (e.g. Pearson’s r < −0.5 and P < 0.01) in a randomly permuted dataset (i.e. FP) to the number of pairs passing the threshold in the original dataset (i.e. TP and FP). To generate the randomly permuted miR–mRNA datasets, the sample labels for miR and mRNA were randomly swapped such that the samples in the random miR datasets did not correspond to the samples in the random mRNA datasets. This process was repeated 100 times, and a median value of FDR was selected. Using this approach, the investigators were able to ascertain that Pearson’s r < −0.55 with P-value < 0.01 was associated with an approximately 5% FDR.

Empirical Bayes (EB) methods have been used to explicitly borrow information across samples to increase the power of detecting TPs and reduce FPs for MTI detection 196. In the first stage, Pearson's correlation coefficients (PCCs) for all miR–mRNA pairs are expressed as z-scores (derived from Fisher transformation), which approximately follow a normal distribution, where increasingly negative z-scores are more likely to represent MTIs. Next, to correct for multiple hypothesis testing, a variant of the Benjamini–Hochberg FDR, termed local FDR 197, is applied to the set of all z-scores. The local FDR method estimates the empirical null distribution (histogram of z-scores) using maximum likelihood, where the central peak of the distribution mainly consists of null cases (non-interacting pairs) and the (negative) tail tends to contain non-null cases (interacting pairs with negative z-scores). Thus, the FDR can be determined at any given z-score threshold; a lower (absolute) z-score threshold results in a greater sensitivity at the cost of a higher FDR and vice versa (Figure 1.5 B).

More recent studies have shown that miRs can upregulate mRNA expression by direct and indirect mechanisms 159,198. To reflect this, some methods do not use miR-sequence complementarity target prediction and consider both positive and negative correlations between miRs and genes.
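The permutation-based FDR estimate described above can be sketched in Python as follows. This is a simplified illustration rather than the published implementation: it uses only a correlation cutoff (no P-value threshold), shuffles the sample order of the miR matrix to break the correspondence with the mRNA samples, and caps each per-permutation ratio at 1. The function names, the seed and the toy data are assumptions of this sketch.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def count_hits(mir_expr, mrna_expr, pairs, r_cutoff):
    """Number of miR-mRNA pairs with correlation below the (negative) cutoff."""
    return sum(1 for mir, gene in pairs
               if pearson(mir_expr[mir], mrna_expr[gene]) < r_cutoff)


def permutation_fdr(mir_expr, mrna_expr, pairs, r_cutoff=-0.5, n_perm=100, seed=0):
    """Median over permutations of (hits in permuted data) / (hits in original data)."""
    observed = count_hits(mir_expr, mrna_expr, pairs, r_cutoff)  # TP + FP
    if observed == 0:
        return None  # FDR is undefined when no pair passes the threshold
    rng = random.Random(seed)
    n_samples = len(next(iter(mir_expr.values())))
    ratios = []
    for _ in range(n_perm):
        order = list(range(n_samples))
        rng.shuffle(order)  # miR samples no longer match mRNA samples
        shuffled = {mir: [values[i] for i in order] for mir, values in mir_expr.items()}
        false_hits = count_hits(shuffled, mrna_expr, pairs, r_cutoff)  # FP estimate
        ratios.append(min(1.0, false_hits / observed))
    return statistics.median(ratios)


# Toy matched data across eight samples (hypothetical values).
mir_expr = {"miR-A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]}
mrna_expr = {
    "GENE1": [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
    "GENE2": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
}
pairs = [("miR-A", "GENE1"), ("miR-A", "GENE2")]
fdr = permutation_fdr(mir_expr, mrna_expr, pairs)
```

In practice the estimate would be computed over thousands of candidate pairs and repeated for several cutoffs in order to choose the threshold that corresponds to an acceptable FDR.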
Cantini et al 191 argue that the combined use of miR target prediction and miR-mRNA expression analysis, without selecting the intersection of the two analyses, represents a good compromise between methods that take advantage of joint expression alone and those that use the intersection. These approaches are represented by Context-Specific miR analysis (Cosmic) 199 , targetRunningSum 200 and miR Master Regulatory Analysis (MMRA) 201 . MMRA performs four sequential steps to progressively reduce the number of candidate miRs, with the goal of identifying “master regulators”. The first step is differential expression analysis of miRs. The second step is target transcript enrichment analysis, which selects DE miRs whose predicted targets are enriched in the associated DE mRNA signature. The third step is network analysis, in which an mRNA network is constructed around each miR using ARACNe (described below). The complete set of genes regulated by a given miR is termed its “regulon”. In the final step, stepwise linear regression (SLR) analysis is used to identify miRs whose expression ‘explains’ the DE mRNA expression. MMRA was experimentally validated in colon cancer cell lines by miR silencing experiments 201 .

ARACNe is an experimentally validated algorithm that can be used to infer the targets of transcription factors (TFs) or miRs, using as input mRNA gene expression profiles (GEPs) and miR expression profiles containing two or more phenotypes (e.g. control versus disease of interest). ARACNe relies on the assumption that the expression of a miR or TF is (positively or negatively) correlated with that of its targets. In contrast to miRs, which demonstrate higher correlation with the expression of their targets, most TFs show only a weak correlation with the expression of their mRNA target genes 202 . Therefore miR expression data are expected to be better suited than TF mRNA expression for modeling target networks in ARACNe. Regulator analysis is then performed to determine whether the inferred miR targets are enriched in the phenotype signature (typically a list of genes based on expression differences between the two phenotypes) 201,203,204 . The miRs or TFs with the greatest enrichment are termed master regulators. ARACNe can take as input the expression of a single miR or a list of miRs and the corresponding GEP to perform the following three key steps 205 : i) Preprocessing: mutual information (MI) threshold estimation is used to identify a significance threshold for MI values from the GEPs (dependent on the number of samples). ii) Bootstrapping and network construction: resampling is used to address the noise in microarray data and MI estimation errors. Bootstrap datasets are generated by randomly selecting samples with replacement from the original dataset, and connections that are not statistically significant are removed using the MI threshold from (i). Indirect interactions (i.e. those likely mediated by another miR) are removed using the Data Processing Inequality tolerance filter 206 . iii) Building the consensus network: a consensus network is identified by retaining edges (interactions) supported across a significant number of bootstrap networks.
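
The pruning and consensus steps can be illustrated with a minimal Python sketch. This is not the published ARACNe implementation: the function names, the dictionary representation of MI estimates, the exact form of the tolerance rule and the fixed support count (used here in place of a formal significance test) are all assumptions made for illustration.

```python
from itertools import combinations


def dpi_filter(mi, tolerance=0.0):
    """Drop the weakest edge of each fully connected triplet.

    mi maps frozenset({node_a, node_b}) to a mutual information estimate.
    """
    nodes = sorted({node for edge in mi for node in edge})
    weakest_edges = set()
    for a, b, c in combinations(nodes, 3):
        triangle = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if not all(edge in mi for edge in triangle):
            continue  # the inequality only applies to closed triplets
        ranked = sorted(triangle, key=lambda edge: mi[edge])
        # Assumed tolerance rule: remove the weakest edge only if it is
        # clearly below the next-weakest edge of the triplet.
        if mi[ranked[0]] < mi[ranked[1]] * (1.0 - tolerance):
            weakest_edges.add(ranked[0])
    return {edge: value for edge, value in mi.items() if edge not in weakest_edges}


def consensus_network(bootstrap_networks, min_support):
    """Keep edges present in at least min_support bootstrap networks."""
    support = {}
    for network in bootstrap_networks:
        for edge in network:
            support[edge] = support.get(edge, 0) + 1
    return {edge for edge, count in support.items() if count >= min_support}
```

In this sketch an edge between two genes that is weaker than both of their edges to a shared miR is treated as indirect and removed, and only edges recovered in enough bootstrap networks are retained.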

1.3.5 Linking mRNA co-expression modules with miR regulon

As expression levels of genes within a regulon are mutually controlled by a shared miR regulator, significant overlap with one or more modules from mRNA co-expression network analysis is expected 203 . Thus, Fisher’s exact test can be used to test for a significant intersection between clinically relevant modules (e.g. determined using WGCNA) and regulons (e.g. determined using MMRA).
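
As a minimal sketch of such an overlap test, the one-sided Fisher’s exact p-value for enrichment can be computed as a hypergeometric tail probability. The function name and the choice of gene universe (here simply passed in as a count of all genes considered) are illustrative assumptions.

```python
from math import comb


def overlap_p_value(module, regulon, universe_size):
    """One-sided Fisher's exact test for module/regulon overlap.

    Returns the overlap size and the probability of observing an overlap
    at least that large when the regulon is drawn at random from a
    universe of universe_size genes.
    """
    module, regulon = set(module), set(regulon)
    k = len(module & regulon)
    m, n = len(module), len(regulon)
    total = comb(universe_size, n)
    tail = sum(
        comb(m, i) * comb(universe_size - m, n - i)
        for i in range(k, min(m, n) + 1)
    )
    return k, tail / total
```

The universe should contain only genes that could have been assigned to both a module and a regulon (for example, genes passing expression filters), since an inflated universe makes overlaps appear more significant.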


Figure 1.4 Inferring MTIs by integrating matched miR–mRNA expression profiles and sequence-based target prediction data. Sequence-predicted targets from a pre-selected database (orange matrix) are depicted as a binary matrix (indicating the presence or absence of miR–target pairs as dark orange or light orange boxes, respectively). Expression profiles from matched mRNA (green matrix) and miR (red matrix) microarrays are correlated and input into the purple matrix. MTIs with correlation values above a selected threshold and present in the sequence-based target prediction database are indicated as dark purple boxes, while all other pairs are shown as light purple boxes.


Figure 1.5 Methods for estimation of False Discovery Rate (FDR) from MTI data. (A) Permutation-based method to estimate FDR at a given statistical threshold. Randomly permuted datasets are generated to estimate the proportion of FPs above a given threshold. To generate randomly permuted datasets, the sample labels for miR and mRNA expression matrices are randomly swapped such that samples in the random miR datasets do not correspond to samples in the random mRNA datasets (swapping of sample labels indicated by red arching arrows above miR and above mRNA expression matrices). Pearson’s Correlation Coefficient (PCC) is calculated for each miR–mRNA pair in the sequence-based target prediction matrix (dark orange boxes above). The distribution of PCC values is depicted as a gray histogram (below). PCC values below a threshold of −0.4 (vertical dashed red line) represent FP in the permuted data (gray). PCC values below this threshold using the original (unpermuted) dataset represent TPs (yellow region in the histogram). The FDR is calculated as the number of FP divided by the number of FP and TP. (B) Empirical Bayes (EB) estimation of FDR using local FDR. PCC values for all miR–mRNA pairs within the sequence-based target prediction matrix are calculated and transformed into z-scores. The z-scores corresponding to false (null) interactions are depicted as a histogram in gray (below), and TPs are shown in yellow within the histogram. The empirical null distribution (orange curve) and theoretical null distribution (blue curve) are shown overlapping the histogram. Note that the theoretical null is too narrow for the data, whereas the empirical null (determined using local FDR) is a better fit for the data.
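
The permutation-based estimate in panel A can be sketched in Python as follows. This is an illustrative sketch only: the function names, the dictionary layout of the expression data, the number of permutations and the averaging of FP counts across permutations are assumptions; the FDR formula follows the definition given in the caption.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def permutation_fdr(mir, mrna, predicted_pairs, threshold=-0.4, n_perm=200, seed=1):
    """mir and mrna map a name to expression values over the same ordered samples."""
    rng = random.Random(seed)
    n_samples = len(next(iter(mir.values())))
    # Pairs below the threshold in the original (unpermuted) data.
    hits = sum(pearson(mir[m], mrna[g]) < threshold for m, g in predicted_pairs)
    null_hits = 0
    for _ in range(n_perm):
        order = list(range(n_samples))
        rng.shuffle(order)  # break the miR-to-mRNA sample correspondence
        for m, g in predicted_pairs:
            permuted = [mir[m][i] for i in order]
            null_hits += pearson(permuted, mrna[g]) < threshold
    false_pos = null_hits / n_perm  # average FP count per permuted dataset
    if hits + false_pos == 0:
        return hits, 0.0
    return hits, false_pos / (false_pos + hits)
```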


1.4 Approaches to gene expression meta-analysis across muscle diseases: identifying common transcriptional signatures

The diagnostic and prognostic potential of the vast quantity of publicly available microarray data has driven the development of methods for integrating data from different microarray platforms. Microarray platform integration can be conceptually divided into approaches that perform early stage integration (cross-platform normalization) versus late stage data integration (meta-analysis). A growing number of statistical methods and associated software for platform integration are available to the user; however, an understanding of their comparative performance and potential pitfalls is critical for best implementation. This section provides an evidence-based review of microarray meta-analysis and cross-platform integration.

The discovery of highly reliable biomarkers from high dimensional microarray data is an important goal in molecular medicine with wide-ranging clinical applications. Potential roles for biomarkers include early detection of disease in healthy individuals, disease classification, prognosis, prediction of response to therapy, and use as surrogate outcomes in clinical trials 207 . The ideal biomarker is inexpensive, robust, easily interpretable, well-validated, and clinically useful (e.g. improving prognosis or choice of therapy) compared to current standards of practice, meaning that the result is “actionable, leading to patient benefit” 207 . Publicly available microarray data have vast potential to serve as a source for biomarker discovery, as there is an enormous quantity of existing gene expression data 45,208 . As of March 2018, the Gene Expression Omnibus, a repository of array- and sequence-based expression data, contained 2,431,671 samples performed on 18,261 platforms 209 . The most widely known of these platforms include the Affymetrix GeneChips (in-situ synthesized oligonucleotide microarrays) and the Illumina high-density bead arrays 14 .
While other types of microarrays exist, such as protein and miR arrays 210,211 , this section focuses on integration of gene expression data from multiple oligonucleotide DNA microarray platforms as it relates to discovery of gene signatures that may serve as biomarkers for clinical applications. While microarrays measure the expression of thousands of genes simultaneously, it is expected that only a small subset of the genes will be associated with the clinical or biological outcome of interest. This subset of genes, often termed a “gene signature” or “prognostic signature”, has a collective expression pattern that is unique to the outcome of interest and thus has potential to function as a biomarker 212 . The gene signature is typically composed of far fewer genes (often fewer than 100) than are present on a microarray chip (often more than 20,000), making it feasible for further study using approaches such as quantitative RT-PCR. Point of Care (POC) devices that rely on transcriptional signatures are gaining momentum as diagnostic tools for routine use in the clinical setting, as their practicality and affordability make this approach highly accessible in the form of cheaper diagnostic kits 33,213 . Biomarkers for the monitoring of disease activity at the POC are currently lacking.

A number of published gene signatures validated using independent samples have been shown to serve as significant predictors of clinical outcome 38,214-216 . However, the development of prognostic signatures that are robust and stable (e.g. the same biomarkers are identified in both discovery and validation sets) 217 has proven challenging 218-220 . Published prognostic gene signatures derived from internal validation often show little overlap with genes identified by other study groups 216 . Potential causes of low reproducibility include differences in sample collection methods, processing protocols and microarray platforms, as well as patient heterogeneity and small sample sizes 214 . Due to the difficulty of acquiring samples, particularly from human tissue, and the associated costs, microarray experiments from single-institution patient cohorts often have small sample sizes. Predictive models derived from gene signatures identified in smaller individual studies are less robust 216,221 .
Michiels et al 37 re-analyzed data from nine studies predicting cancer prognosis and found an unstable misclassification rate for the gene signature (defined as the 50 genes for which expression was most highly correlated with outcome) using training sets derived by a re-sampling approach, with performance increasing as the size of the training set increased. Integration of multiple microarray data sets has been advocated to improve gene signature selection 222 . Four characteristics of microarray experiments determine the false discovery rate (FDR) of DE genes: (1) the proportion of truly DE genes, (2) the distribution of the true differences (fold changes) of truly DE genes, (3) measurement variability, and (4) the study sample size 223 . Increasing the sample size increases the statistical power, yields a more precise estimate of (differential) gene expression, allows the heterogeneity of the overall estimate to be assessed, and reduces the effects of individual study-specific biases 224-227 . Meta-analysis is most commonly applied for the purpose of detecting differentially expressed (DE) genes 228 , which may serve as a candidate gene signature or be used as features in classification models or classifiers to further refine a clinically useful gene signature 229 .

The following sections compare microarray chip design across platforms, review integrative transcriptomic methods (meta-analysis and cross-platform integration), and lastly discuss several promising transcriptomic biomarkers for disease diagnosis and prognosis that have been identified using meta-analysis approaches.

1.4.1 Comparison of microarray chip design across common platforms

The three most popular oligonucleotide microarray manufacturers (Illumina, Affymetrix and Agilent) produce microarray platforms that differ in structure in several important aspects. These include 1) oligonucleotide physical attachment, 2) probe selection, and 3) probe design. Affymetrix, the most commonly used microarray platform, uses 25-mer oligonucleotide probes to measure the abundance of mRNA transcripts. The traditional Affymetrix platforms use 3’ in vitro transcription (e.g. the Affymetrix Human Genome U133 Plus 2.0 array), where 11 probe pairs termed “perfect match” and “mismatch” interrogate each transcript. The perfect match (PM) probes are chosen from 600-mer sequence fragments located near the noncoding 3’ end of a gene transcript 230 . A mismatch (MM) probe, created by mutating the central (13th) nucleotide base of every PM probe, is intended to measure the level of non-specific hybridization 231 . Single nucleotide polymorphisms (SNPs) lower the efficiency of hybridization, resulting in a decreased expression signal, which is used to estimate the level of non-specific hybridization via MM probes. The newer generation Affymetrix microarrays, such as the Human Exon (HuEx) 1.0 ST and Human Gene (HuGene) 1.0 ST arrays, have three major changes to their design 232 . First, HuEx and HuGene contain probes with affinity to the individual exons in a given transcript, rather than mostly to the 3’ end of the gene. Each probeset consists of four individual probes which usually correspond to a single exon. The probesets targeting individual exons can be further grouped into “transcript clusters” of around 25 groups corresponding to an individual gene. The exon-level expression can be used to investigate splicing events, whereas the gene-level expression estimates facilitate traditional gene expression array analysis. The second design change to the newer platforms is that MM probes are no longer included.
Instead, the non-specific hybridization is measured using two sets of negative control probes. The first are the antigenomic background probes, which query sequence that is not present in the human or other commonly studied genomes, to cover the range of GC content. The second are genomic background probes, also non-complementary to any human gene sequence, that are designed to evaluate background intensity levels for probes of different sequence characteristics. The last major difference in the newer generation Affymetrix arrays is that they allow for more probes on each array (> 5 million individual probes on HuEx). Thus the feature size (number of probes per given area) has been reduced by almost one fifth the area 232 .

Illumina microarrays use multiple copies of a 50-mer probe (produced from standard oligonucleotide synthesis methods) that are attached to microbeads, which are then put onto plasma-etched silicon wafers using random self-assembly 233 . Multiple Illumina arrays are placed on the same substrate, allowing hybridization and other steps to be performed in parallel, whereas Affymetrix arrays are processed separately. Affymetrix arrays are constructed in a specific layout with each probe at a predefined location, whereas Illumina arrays require a “decoding” step to determine the location of each probe on the array using a molecular address. Additionally, Illumina uses 30 copies of the same oligonucleotide on the randomly generated array to provide internal technical replication, whereas Affymetrix does not have technical replicates. Agilent microarrays typically use sets of 60-mer probes using SurePrint inkjet arrays for in-situ oligonucleotide synthesis 234,235 . The number of probes per gene is considerably lower in Agilent than in Affymetrix arrays, ranging from 2 probes in the 8 x 60 k platform to 8 on average in the exon arrays (2 x 400 k) 230 . The longer probes used by Agilent and Illumina compared to Affymetrix tend to have higher specificity; however, the lower number of probes per gene makes the Agilent microarrays more sensitive to single SNPs 230 .
For Affymetrix arrays, mismatches influence signal only at the individual probe level, with little impact on the overall transcript-specific probe set. A large meta-analysis of Illumina microarrays found that the vast majority of SNPs within probes had no significant effect on hybridization efficiency 236 . Despite differing technologies among commercially available platforms, the microarrays yield very comparable results overall 232,233,237 . Comparison of Affymetrix and Illumina microarrays found high experimental reproducibility when the analysis was restricted to probes that target the same set of transcripts for a given gene 235 . Transcriptome data are often inaccurate at the time of microarray design, which can result in low probe specificity (e.g. probesets targeting transcripts from multiple genes) or probes that do not map to any known transcripts. Thus reassembled versions of the microarray annotations based on updated versions of the corresponding genome are common 238 .

The Microarray Quality Control (MAQC-I) project was designed to assess the reproducibility and repeatability of gene expression microarray data across multiple testing sites and the comparability of results across platforms 237 . The study found intraplatform consistency across test sites and a high level of consistency across platforms in terms of the number of genes detected, gene list overlap, and detection. The MAQC investigators proposed that an even higher level of agreement would be detected by comparing results from functional gene set enrichment analysis such as GO enrichment analysis.

1.4.2 Integrative Transcriptomic Data Analysis

Two fundamental approaches to combining the information from multiple independent microarray studies performed on different platforms (termed “integrative analysis” 224 ) are meta-analysis and cross-platform normalization (also termed merging). A conceptual framework by Hamid et al. 222 classifies microarray meta-analysis as “late stage” data integration as it combines the final statistical results from different studies, whereas cross-platform normalization integrates data at the “early stage”. Application of these approaches requires that all of the included studies test the same hypothesis and/or are performed under comparable conditions or treatments 208,239,240 . While the degree of similarity that is required between “suitably similar” datasets remains to be determined, cross-platform integration for the purpose of biomarker discovery is most appropriate using relatively homogeneous datasets selected to answer well-defined questions 241 . Early or late stage integration of data can be used regardless of the biological question (e.g. differential expression analysis or class prediction). The overall principle of these two approaches is summarized in Figure 1.6.

1.4.3 Pre-processing and quality control prior to integrative analysis

Ramasamy et al 225 identified key issues and steps for performing a meta-analysis, including identifying suitable microarrays, pre-processing and preparing individual datasets, selection of the meta-analysis method, and interpretation of results. The expression data must be transformed to a common scale (e.g. log2) and resolution (e.g. 12, 14, 16, or 20 bit) 242 . Another important pre-processing step in meta-analysis is ascertaining which probes represent a given gene within and across the different microarray platforms. The relationship between probes and genes may be determined by mapping probes to the gene using sequence-matched datasets or using gene-level identifiers such as Gene ID available in the annotation packages in R/Bioconductor 243 to unify the microarray datasets. Sources of high-quality probe re-annotation include Simple Omnibus Format in Text (SOFT) files from GEO 209 , alternative chip definition files for Affymetrix 244 and ReMOAT (Re-annotation and Mapping for Oligonucleotide Array Technologies) and its associated annotation packages in R/Bioconductor for Illumina 238 . Typically, only genes that are present across the different platforms being integrated will remain for further analysis, while those absent in one or more platforms will be “lost”, reflecting the tradeoff between increasing sample size and power versus decreasing the number of genes analyzed 241 . Genes with low mean expression across most studies are typically filtered out prior to meta-analysis. Turnbull et al 241 applied relatively strict filter thresholds for their microarray integration analysis based on a prior study that found genes with low or intermediate expression have poorer inter-platform reproducibility than highly expressed genes 218,245 . Furthermore, incorporation of a quality measure based on detection p-values estimated from Affymetrix arrays into the study-specific test statistics within a meta-analysis of two Affymetrix array studies using an effect size model produced more biologically meaningful results than an unweighted model 226,246 . If multiple probes match a single gene, numerous approaches have been recommended to generate a single measure of expression for each transcript (gene) on the microarray. These include selecting the probe with the highest interquartile range 247 , selecting the probe with the highest average expression 248 , and summarizing the gene using the fixed effect inverse-variance model 249 .
A fixed effect probe level model is recommended as high levels of variation among probes are often observed, with variance for a single probe across replicates being smaller than the variance between probes within a replicate 250 . A systematic review of microarray meta-analysis studies in the literature has found that while objective methods to exclude microarray studies are available, the decision to remove studies must be weighed by the investigator as a highly homogenous study population may not yield generalizable results 228 . There are a number of quality assessment packages available for Bioconductor including Simpleaffy 251 and affyPLM 42 for Affymetrix. The MetaQC package provides six quality control measurements 252,253 . While the objective of MetaQC is to identify problematic studies across multiple platforms, the thresholds for excluding studies appear to remain incompletely defined. Therefore, inclusion of all eligible studies (with at least 5 patient samples and 5 controls) for meta-analysis, regardless of quality assessment, is recommended (P. Khatri, personal communication).
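
The probe-to-gene summarization and gene-overlap steps described in this section can be sketched as follows. The sketch implements only one of the cited options (keeping the probe with the highest average expression); the function names and the dictionary-based data layout are hypothetical and chosen for illustration.

```python
from statistics import mean


def collapse_probes(expression, probe_to_gene):
    """Keep, for each gene, the probe with the highest mean expression.

    expression maps probe id -> list of log2 values (one per sample);
    probe_to_gene maps probe id -> gene identifier (unmapped probes are dropped).
    """
    best = {}
    for probe, values in expression.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue
        average = mean(values)
        if gene not in best or average > best[gene][0]:
            best[gene] = (average, values)
    return {gene: values for gene, (average, values) in best.items()}


def shared_genes(collapsed_studies):
    """Genes measured in every study; all other genes are 'lost'."""
    return set.intersection(*(set(study) for study in collapsed_studies))
```

Restricting the analysis to `shared_genes` makes the tradeoff noted above explicit: each additional platform can only shrink the set of genes carried forward.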


1.4.4 Meta-analysis

In the meta-analysis approach, each experiment (study) is first analyzed separately and the results of each study are then combined. Meta-analysis methods that combine primary statistics (e.g. p-values or effect sizes) require the use of raw gene expression data, whereas methods that combine secondary statistics rely only on ranked lists of genes. Popular methods for meta-analysis mainly combine one of three types of statistics: p-value 254 , effect size 255 , and ranked gene lists (“rank aggregation”) 228,256,257 . Ranked lists of genes produced for each study (e.g. ranked by order of p-value for DE of each gene) have been aggregated into a single gene ranking (“consensus”) using a number of methods including the rank product method 257 . Combining effect sizes to generate an estimate of the overall effect size and its confidence interval is frequently done in microarray meta-analysis. A major advantage of combining effect sizes is that an overall estimate of effect size is generated, which can be useful to assess the importance of the result 133 . Choi et al 255 described one of the first methods to combine effect sizes, using a random-effects modeling (REM) approach for combining datasets from individual studies of two groups to form an overall estimate of the weighted effect size. The effect size was measured by the standardized mean difference, obtained by dividing the difference in the average gene expression between the treatment and control groups by a pooled estimate of standard deviation (Hedges’ g) 226 . The effect size was used to measure the magnitude of treatment effect in each study and a random effects model was used to incorporate inter-study variability. Random-effects models are generally the preferred method for use in gene-expression meta-analysis, as the model assumes that each individual study effect is an estimate from a hypothetical population of studies, whereas fixed effects models (FEM) estimate a summary effect size only of the studies present in the meta-analysis 258 .
Thus, whereas FEM simply synthesizes the data, REM aims to discover the “true” biological background effect 133 . The statistical meta-analysis method is selected based on the biological purpose of the analysis. A gene serving as a biomarker from a meta-analysis is expected to show concordant biological effects across all or most experiments for a given condition derived from relatively homogeneous sources (e.g. upregulation of a gene predicting risk of lung cancer detection from lung epithelium biopsied from a cohort of smokers versus healthy non-smokers) 259 . While detecting biomarkers DE in all studies seems an ideal goal, it can be too stringent when the number of samples is large, increasing the heterogeneity of experimental, platform, or biological samples 31 .

Meta-analysis methods detecting DE in the majority of samples (HS r) are generally recommended as they provide robustness and detection of relevant signals across the majority of samples 256 . The objective and type of outcome (e.g. two-class, multi-class, survival) 225 will govern the choice of both the test statistic (t-statistic, F-statistic, log-rank statistic) and the meta-analysis method (combining p-values, effect sizes, or ranks). Methods combining effect sizes (standardized mean differences or odds ratios) are appropriate for combining two-class outcomes.
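
For a single gene measured in a two-class design, the effect size approach can be sketched in Python as follows. The sketch assumes the commonly used small-sample correction and variance approximation for Hedges’ g and the DerSimonian-Laird moment estimator of the between-study variance; the function names are illustrative, and neither group may have zero variance.

```python
from math import sqrt
from statistics import mean, variance


def hedges_g(cases, controls):
    """Bias-corrected standardized mean difference and its variance."""
    n1, n0 = len(cases), len(controls)
    pooled_sd = sqrt(
        ((n1 - 1) * variance(cases) + (n0 - 1) * variance(controls)) / (n1 + n0 - 2)
    )
    d = (mean(cases) - mean(controls)) / pooled_sd
    correction = 1 - 3 / (4 * (n1 + n0) - 9)  # small-sample bias correction
    var_d = (n1 + n0) / (n1 * n0) + d * d / (2 * (n1 + n0))
    return correction * d, correction ** 2 * var_d


def dersimonian_laird(effects, variances):
    """Random-effects summary: (summary effect, standard error, tau^2, Q)."""
    k = len(effects)
    w = [1 / v for v in variances]
    fixed = sum(wi * g for wi, g in zip(w, effects)) / sum(w)
    q = sum(wi * (g - fixed) ** 2 for wi, g in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]
    summary = sum(wi * g for wi, g in zip(w_star, effects)) / sum(w_star)
    return summary, sqrt(1 / sum(w_star)), tau2, q
```

When the estimated between-study variance is zero the random-effects weights reduce to the fixed-effect inverse-variance weights, which is why the two models agree for genes with homogeneous effects across studies.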

1.4.5 Cross-platform normalization

Cross-platform normalization (also termed “data merging” 224 ) considers all data from experiments across different microarray platforms as a single data set from the same experiment. Direct integration of data sets performed on different microarray platforms may introduce undesirable batch effects due to systematic multiplicative biases 224,241,260 . Cross-platform transformation and normalization methods have been developed with the aim of removing the artifactual differences between data from different microarray platforms while preserving the underlying biological differences between conditions. This step is essential, as non-biological differences (“batch effects”) in the gene signature discovery data can obscure real biological differences between clinical groups. The level of difficulty involved in combining multiple datasets has been termed dataset complexity 261 . For example, datasets from different Affymetrix platforms are less complex to integrate by meta-analysis or cross-platform normalization than datasets performed across very different platforms. Studies using low complexity datasets, mainly from the Affymetrix platform, have directly merged the studies to construct a gene signature 60,262-264 .
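
The idea of early stage integration can be illustrated with a deliberately simple Python sketch based on gene-wise standardization within each study before merging. This is a generic approach chosen for illustration and is not one of the specific published cross-platform normalization methods cited above; the function names and data layout are assumptions.

```python
from statistics import mean, pstdev


def standardize_study(study):
    """Gene-wise z-scores within one study.

    study maps gene -> list of log2 expression values (one per sample).
    """
    standardized = {}
    for gene, values in study.items():
        centre, spread = mean(values), pstdev(values)
        standardized[gene] = [
            (v - centre) / spread if spread > 0 else 0.0 for v in values
        ]
    return standardized


def merge_studies(studies):
    """Merge standardized studies over their shared genes into one dataset."""
    shared = set.intersection(*(set(study) for study in studies))
    standardized = [standardize_study(study) for study in studies]
    return {
        gene: [v for study in standardized for v in study[gene]]
        for gene in sorted(shared)
    }
```

Because each gene is centered within each study, this sketch implicitly assumes a similar case-control composition across studies; where that does not hold, standardization can remove biological signal along with the batch effect.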

1.4.6 Comparison of meta-analysis vs. cross-platform normalization

Cross-platform normalization has been argued to have better performance than meta-analysis for the identification of robust biomarkers on the premise that “deriving separate statistics and then averaging is often less powerful than directly computing statistics from aggregated data” 262 . In a comparative study, Taminau et al 224 found significantly more differentially expressed genes using cross-platform normalization than meta-analysis.

While cross-platform normalization has been applied in multiple studies 54,265,266 , it has been used less frequently in the literature than meta-analysis 208 . A recent comprehensive systematic literature review of studies applying microarray integration methods found that only 27% of the studies directly merged microarray data, and this subset of studies was mostly performed on the same platform 228 . Cross-platform normalization methods do not guarantee elimination of laboratory or batch effects across experiments. Rung and Brazma 45 have argued that microarray meta-analysis provides better assessment and control of inter-study heterogeneity, which can be assessed using Cochran’s Q statistic and the between-study variance (τ2) derived from it. For a gene whose effect size has a large spread across studies (higher τ2), we would be less confident that it should be synthesized in the overall effect. Different random-effects meta-analysis models estimate τ2 differently and then apply different weighting of τ2 to the final significance.

1.4.7 Comparison of meta-analysis methods

Several studies systematically comparing meta-analysis methods for microarray data have been published 256,261,267 . Meta-analysis methods have been categorized based on the hypothesis settings that gene biomarkers are differentially expressed “in all studies” (HS A), “in the majority of studies” (HS r), or in “one or more studies” (HS B) 31,256 . In Fisher’s, Stouffer’s and the minP methods, an extremely small p-value in one study is likely to meet the criteria for statistical significance; these methods therefore detect DE in “one or more studies” (HS B), whereas the maxP or rank product methods tend to detect gene biomarkers DE in “all studies” (HS A). Song and Tseng 268 proposed a robust order statistic, the rth ordered p-value (rOP), that tests the alternative hypothesis that there are significant p-values in at least a given percentage of studies. This method detects biomarkers DE in the majority of studies (e.g. >70% of studies) based on a user-specified threshold of studies. Chang et al 256 benchmarked the performance of six p-value combination methods (Fisher, Stouffer, adaptively weighted Fisher, minP, maxP, and rOP), two combined effect size methods (fixed effects and random effects) and four combined ranks methods (RankProd, RankSum, product of ranks and sum of ranks). The 12 meta-analysis methods were categorized into three hypothesis settings (candidate markers DE in “all” [HS A], “most” [HS r], or “one or more” [HS B] studies) based on their strengths for detecting DE genes. They then applied four statistical criteria to the assessment of each meta-analysis method: 1) detection capability (the number of DE genes detected), 2) biological association (degree of association between the DE list and predefined genes from pathways related to the disease), 3) stability (randomly splitting the data and comparing results of the two meta-analyses) and 4) robustness (effect of including an outlying irrelevant study in the meta-analysis).
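
The behaviour of the p-value combination rules under the different hypothesis settings can be illustrated with a short Python sketch for a single gene. It assumes independent studies and one-sided p-values strictly between 0 and 1, and uses the standard null distributions of each statistic (chi-square for Fisher, standard normal for Stouffer, and the Beta distribution of order statistics for minP, maxP and rOP); the function names are illustrative.

```python
from math import comb, exp, log, sqrt
from statistics import NormalDist


def fisher(p_values):
    """Fisher's method: -2 * sum(log p) follows chi-square with 2k df."""
    k = len(p_values)
    half = -sum(log(p) for p in p_values)
    # Closed-form survival function of a chi-square with even df.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return exp(-half) * total


def stouffer(p_values):
    """Stouffer's method: unweighted sum of z-scores."""
    normal = NormalDist()
    z = sum(normal.inv_cdf(1 - p) for p in p_values) / sqrt(len(p_values))
    return 1 - normal.cdf(z)


def rop(p_values, r):
    """rth ordered p-value; under the null it follows Beta(r, k - r + 1)."""
    k = len(p_values)
    x = sorted(p_values)[r - 1]
    return sum(comb(k, j) * x ** j * (1 - x) ** (k - j) for j in range(r, k + 1))


def min_p(p_values):
    return rop(p_values, 1)


def max_p(p_values):
    return rop(p_values, len(p_values))
```

For example, with study p-values of 1e-10, 0.5 and 0.9, `fisher` and `min_p` return very small combined p-values (HS B behaviour), whereas `max_p` does not reach significance (HS A behaviour); `rop` with an intermediate `r` lies between the two.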

Among the methods based on the HS A setting, maxP performed the worst on the four criteria and the investigators recommended that it be avoided. The rank product method had better performance but weaker detection capability. The two methods that tended to detect DE in the majority of samples were the Random Effects Model (REM) and the rth ordered p-value (rOP). rOP outperformed REM based on stronger biological association and detection capabilities, but this was achieved at the expense of diminished stability and robustness. A recent systematic analysis of random-effects models 133 using non-simulated data compared the relative accuracy of different models using a true positive list made of the intersection of genes that were found to be significant (FDR < 0.01) by three random effects models across the entire set of studies in the meta-analysis (termed the “silver standard”). Six random effects models were tested: Sidik-Jonkman, Hedges-Olkin, empirical Bayes, restricted maximum likelihood, DerSimonian-Laird, and Hunter-Schmidt. Each model incorporates inter-study heterogeneity differently; Sidik-Jonkman strongly penalizes inter-study heterogeneity, whereas Hunter-Schmidt is strongly permissive of inter-study heterogeneity, and DerSimonian-Laird is intermediate between these two models. At equivalent significance thresholds, the more conservative methods were found to contain subsets of genes identified by the less conservative methods. The investigators also found that increasing stringency for residual heterogeneity decreased more TP than FP. Therefore it was recommended that for identification of a diagnostic signature, more conservative models be used, whereas for exploration of the biology of a given disease, less conservative models were recommended. For a single REM method, the investigators simultaneously varied the significance and effect size thresholds and compared the performance.
Importantly, they found that more stringent significance thresholds reduced both TP and FP, whereas more stringent ES thresholds reduced more FP than TP. For example, FDR adjusted p-value (“q-value”) < 0.05 and ES > 1.4 fold returned more TP than q-value < 0.01 and ES > 1.2 fold. An effect size threshold of 1.2-1.3 fold (in non-log space) decreased FP while minimally impacting TP. In summary, TP were maximized with a fixed number of FP using a less conservative method with a high effect size threshold. The investigators also compared the performance of a small number of larger studies versus a large number of smaller studies. They found that for any fixed sample size, accuracy increased with the number of datasets present. In general, adding more datasets tended to decrease the number of FP while increasing the number of TP. Therefore, the investigators concluded that funding a larger number of smaller yet modestly powered studies for a given disease (rather than a single large study) may improve signature discovery. The investigators also advocated for consistent methods across studies and for all studies to be made publicly available.
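
The joint use of a q-value threshold and an effect size threshold can be sketched as follows. The sketch assumes that summary effects are expressed as log2 fold changes, so that a fold threshold in non-log space is converted with log2; the function names, the input layout and the default thresholds are illustrative assumptions rather than recommended settings.

```python
from math import log2


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        q_values[index] = running_min
    return q_values


def select_signature(results, q_max=0.05, min_fold=1.3):
    """results maps gene -> (summary log2 fold change, p-value)."""
    genes = list(results)
    q_values = benjamini_hochberg([results[g][1] for g in genes])
    cutoff = log2(min_fold)
    return [
        g for g, q in zip(genes, q_values)
        if q < q_max and abs(results[g][0]) >= cutoff
    ]
```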

1.4.8 General guidelines for designing a gene expression meta-analysis

Meta-analyses have traditionally used cross-validation or split datasets at random into discovery and validation datasets. However, this approach inaccurately assumes that these datasets are derived from unbiased sampling of a large homogeneous population of samples 217 . Sweeney et al 133 have outlined a general set of best practices for designing a gene expression meta-analysis to obtain reproducible results. A discovery set that includes at least 4-5 cohorts (datasets), ideally with a nearly even split between cases and controls, is recommended. The discovery set should comprise a greater number of modestly powered studies, from multiple platforms if possible, to increase heterogeneity. Cohorts that have clinical phenotyping should be assigned to the validation set so that the derived gene set can be correlated with clinical factors, including confounding variables. The investigators strongly recommend avoiding cross-validation and instead using validation cohorts that are fully independent from the discovery cohorts. In particular, all data from the same research group should be kept together due to subtle dependencies in the datasets.

1.4.9 Software and websites implementing microarray meta-analysis and cross-platform merging/normalization

Software, including packages in R/Bioconductor, and websites allowing users to implement microarray meta-analysis and cross-platform merging and normalization methods are listed in Table 1.1. The MetaIntegrator package 269 by Purvesh Khatri’s lab is most notable as it has demonstrated utility for identification of clinically relevant disease signatures. MetaIntegrator is an analysis pipeline which implements meta-analysis by computing the Hedges’ g effect size for each gene in each dataset. Effect sizes are pooled across datasets into a summary effect size using the DerSimonian-Laird REM, with p-values corrected for multiple hypothesis testing across all genes using a Benjamini-Hochberg False Discovery Rate (FDR) correction. Signature selection can then be performed by varying one or more filtering parameters including: gene effect size, effect size FDR, and the number of studies in which the gene was present. For a selected set of signature genes, a signature score can be calculated for each sample as a normalized z-score that centers the samples of each study around zero. MetaIntegrator includes a number of useful built-in visualization tools. Forest plots examine the effect sizes and standard errors (and summary effect size) for genes of interest across studies. Violin plots compare the signature scores across categorical variables (like disease subtypes), and regression plots evaluate the relationship of the signature score with continuous variables like muscle strength.
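The pooling step described above can be illustrated with a minimal sketch. It uses the textbook formulas for Hedges' g and the DerSimonian-Laird random-effects estimator and is not taken from the MetaIntegrator source; the function names `hedges_g` and `dersimonian_laird` are hypothetical, and the package's exact variance formula may differ.

```python
import math

def hedges_g(cases, controls):
    """Bias-corrected standardized mean difference (Hedges' g) and its variance."""
    n1, n2 = len(cases), len(controls)
    m1 = sum(cases) / n1
    m2 = sum(controls) / n2
    v1 = sum((x - m1) ** 2 for x in cases) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in controls) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    j = 1 - 3 / (4 * (n1 + n2) - 9)  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return j * d, j ** 2 * var_d

def dersimonian_laird(effects, variances):
    """Pool per-study effects; returns (summary effect, standard error, tau^2)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero (and zero for a single study).
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]
    summary = sum(wi * e for wi, e in zip(w_star, effects)) / sum(w_star)
    return summary, math.sqrt(1 / sum(w_star)), tau2
```

In use, `hedges_g` would be applied to one gene in each dataset, and the resulting effects and variances passed to `dersimonian_laird` to obtain that gene's summary effect size.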

1.4.10 Examples of disease signatures discovered using MetaIntegrator

Khatri et al 249 performed meta-analysis of 8 independent transplant datasets from 4 organs (236 graft biopsy samples) to detect an 11 gene signature that was significantly overexpressed in acute rejection across all transplant organs. The rejection module predicted future rejection injury in 2 independent cohorts. Li et al 270 performed meta-analysis of 13 patient cohorts among 4 neurodegenerative diseases to identify a common transcriptional signature of neurodegeneration. The analysis applied a “leave-one-disease-out” analysis in order to ensure that the meta-analysis was not biased towards any one specific neurodegenerative disease. The investigators repeated the meta-analysis four times by removing datasets corresponding to one disease at a time. At each iteration significant DE genes were identified, and genes that were significant across each iteration (i.e. irrespective of which subset of neurodegenerative diseases was analyzed) formed the common neurodegeneration module, comprising 243 DE genes which were similarly dysregulated in independent validation cohorts from 7 neurodegenerative diseases. The signature was shown to correlate with histologic disease severity.
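The “leave-one-disease-out” logic can be sketched as an intersection of the significant gene sets obtained at each iteration. The function `leave_one_disease_out`, the toy majority rule, and the gene labels below are hypothetical placeholders; in practice the callback would run the full meta-analysis on the remaining datasets.

```python
def leave_one_disease_out(datasets_by_disease, find_significant_genes):
    """Return genes that are significant no matter which disease is left out.

    datasets_by_disease: dict mapping disease name -> list of datasets.
    find_significant_genes: callable taking a list of datasets and returning
        an iterable of significant gene identifiers (assumed, user-supplied).
    """
    common = None
    for held_out in datasets_by_disease:
        remaining = [
            dataset
            for disease, datasets in datasets_by_disease.items()
            if disease != held_out
            for dataset in datasets
        ]
        significant = set(find_significant_genes(remaining))
        common = significant if common is None else common & significant
    return common if common is not None else set()


def toy_majority_rule(datasets):
    """Toy stand-in for a meta-analysis: a gene is 'significant' when it is
    flagged in more than half of the datasets supplied."""
    counts = {}
    for flagged_genes in datasets:
        for gene in flagged_genes:
            counts[gene] = counts.get(gene, 0) + 1
    return {gene for gene, n in counts.items() if n > len(datasets) / 2}


toy = {
    "disease_A": [{"g1", "g2"}],
    "disease_B": [{"g1", "g3"}],
    "disease_C": [{"g1", "g2"}],
    "disease_D": [{"g1", "g2", "g3"}],
}
common_module = leave_one_disease_out(toy, toy_majority_rule)
```

In this toy input, `g3` is significant only when particular diseases are included, so it is excluded from the common module.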


Table 1.1: List of software and websites for performing microarray meta-analysis

Microarray meta-analysis (command line packages)

| Software name | Language | Features |
|---|---|---|
| MetaIntegrator | R | Implements Hedges g effect size, DerSimonian-Laird REM. Analyzes signature results in discovery and validation (calculateScore). Numerous visualization functions (forestPlot, Heatmaps, violinPlots). Demonstrated utility for identification of clinically relevant disease signatures |
| metaDE (metaOmics) | R | Implements 12 major meta-analysis methods 253 |
| MAMA | R | Implements combined effect size, combined p-values, combined ranks |
| metaMA | R | Implements combined moderated effect size, combined p-values |
| metaGEM | R | Implements combined effect size, combined p-values, vote counting 225 |
| metahdep | R | Effect size estimates particularly when hierarchical dependence is present |
| GeneMeta | R | Implements combined effect size 255 |
| OrderedList | R | Combine ranks with or without expression data |
| RankProd | R | Implements Product of Ranks method |
| RankAggreg | R | Aggregation of ordered lists based on the ranks using several different algorithms |

Automated web applications for microarray meta-analysis / normalization

| Software name | Features and URL |
|---|---|
| INMEX | Meta-analysis. Support for 45 microarray platforms for human, rat. Combines P-values, effect sizes, rank order, others. http://www.inmex.ca/INMEX/ |
| Network Analyst | Meta-analysis. Combines P-values, effect sizes, rank order. Significantly altered genes are then presented within the context of protein-protein interaction networks. http://www.networkanalyst.ca/NetworkAnalyst/faces/home.xhtml |
| A-MADMAN | Affymetrix platform normalization using quantile distribution transformation. http://compgen.bio.unipd.it/bioinfo/amadman/ |
| MAAMD | Affymetrix meta-analysis. http://www.biokepler.org/use_cases/maamd-workflow-standardize-meta-analyses-affymetrix-microarray-data |

Microarray cross-platform merging/normalization (command line packages)

| Software name | Language | Features |
|---|---|---|
| mergeMaid | R | Implements Probability of Expression transformation (POE) 271 |
| metaArray | R | Implements POE 271 |
| CONOR | R | Implements XPN, Empirical Bayes (EB), Quantile normalization (QN), Quantile discretization (QD), others 208 |
| VirtualArray | R | Implements EB, QN, QD, others 242 |
| inSilico Merging | R | Implements XPN, EB, DWD, others 224 |
| Automated Microarray Data Analysis v2.13 | R | Implements. Allows analysis of Illumina, Affymetrix and Agilent. |
| XPN | R | Implements Cross Platform Normalization 227 |
| DWD | JAVA, R, MATLAB | Implements Distance Weighted Discrimination method 272 |
| Combat | R | Implements Empirical Bayes methods 41 |
| PLIDA | MATLAB | Normalizes an arbitrary number of platforms 273 |
| metAnalyzeAll | R | Elastic net classifier 274 |


Figure 1.6: Outline of two microarray integration methods. a) Meta-analysis (“late integration”). Individual case-cohort microarray studies are pre-processed and each study is used to identify ranked gene lists which are then combined in the final step. b) Cross-platform merging and normalization (“early integration”). After pre-processing of individual studies, a single unified case-cohort dataset is generated (“clustered” into cases and cohorts, indicating removal of batch to batch variation) and, in this example, used to identify a ranked gene list.


CHAPTER 2:

Thesis overview

2.1 Overall aims and hypothesis

The proposed doctoral work aims to analyze gene and miR expression profiles from muscle biopsies in survivors of critical illness with muscle weakness assessed longitudinally after discharge from the intensive care unit (ICU). The analysis aims to integrate this expression data with key clinical measures of muscle weakness, including muscle mass and strength. Then using a meta-analysis of transcriptional profiles across all human muscle diseases, we aim to identify a gene expression signature specific to ICU acquired weakness (ICUAW) as well as a signature of common muscle disease and to assess whether it can serve as a robust marker of muscle disease severity.

2.2 Study aims and hypothesis

Study #1 (Chapter 3)

Study 1 will evaluate gene transcriptional profiles from quadriceps muscle biopsies from patients with ICUAW assessed at day 7 and month 6 post-ICU and healthy controls. The study will integrate clinical measures of muscle weakness, including muscle mass and strength. This ICUAW cohort is a subgroup from a large prospective cohort study (RECOVER) which assessed clinical outcomes in survivors who required mechanical ventilation for at least 7 days.

Aims

1) Assess transcriptomic profiles using differential expression analysis and weighted gene co-expression network analysis.

2) Identify ICUAW relevant modules of co-expressed genes based on association with clinically relevant measures of muscle impairment. Then assess enrichment of functional pathways and transcription factor binding sites within the modules to gain insight into molecular mechanisms of ICUAW.

3) Assess preservation of co-expression topology of ICUAW-relevant modules in independent cohorts of critically ill patients at high risk for ICUAW and in a porcine model of ICUAW.

Hypothesis

We hypothesize that transcriptomic gene co-expression analysis that integrates clinical phenotypes of early and sustained ICUAW will identify novel disease-relevant pathways and genes dysregulated in ICUAW compared to controls. Specifically, we hypothesize that among the pathways identified, those related to muscle development and regeneration will be over-represented in patients with sustained ICUAW. Lastly, we hypothesize that the ICUAW relevant modules will have preserved topology in independent cohorts of critically ill patients at high risk for ICUAW and in a porcine model of ICUAW.

Figure 2.1 Conceptual flow diagram for integration of clinical phenotyping and co-expression network analysis to identify clinically relevant co-expression modules for validation in independent cohorts of ICUAW. Abbreviations: WGCNA = weighted gene co-expression network analysis; TFBS = transcription factor binding site.

Study #2 (Chapter 4)

Study 2 will integrate joint miR and mRNA transcriptomic profiles from quadriceps muscle biopsies from patients with ICUAW assessed at day 7 and month 6 post-ICU and healthy controls to identify networks of genes regulated by miRs. This study is taken from the same ICUAW cohort as in study 1. The miR-mRNA networks from Study 2 will be assessed for enrichment of genes within the ICUAW relevant modules identified in Study 1.

Aims

1) Identify miR master regulators of mRNA signatures associated with ICUAW using an analytic pipeline. Identify functional pathways enriched within the mRNA networks targeted by the miR regulators.

2) Assess whether patients with ICUAW that recover their muscle mass have a different miR expression profile than those patients that do not.

3) Assess enrichment of mRNAs in the networks targeted by the miR regulators within the ICUAW relevant modules from Study 1.

4) Assess the expression of the identified miR regulators during myogenic differentiation using in vitro C2C12 murine myoblasts.

Hypothesis

We hypothesize that there are significant differences in the expression of miRs between patients with ICUAW that demonstrate improvement in muscle mass (“improvers”) compared to those patients without improvement in muscle mass. We hypothesize that dysregulated expression of miRs in patients with ICUAW will serve to identify networks of mRNA targets with roles in muscle function and regeneration. It is expected that the mRNA networks targeted by miRs will be enriched for genes within the ICUAW co-expression modules identified in Study 1.

See Figure 4.1 for flow diagram for Chapter 4

Study #3 (Chapter 5)

Study 3 is a meta-analysis of gene transcriptional profiles from muscle biopsies from human muscle diseases and healthy controls, drawn from public microarray repositories and fulfilling quality criteria. Studies will be divided into six categories: i) immobility, ii) inflammatory myopathy, iii) ICU acquired weakness (ICUAW), iv) congenital muscle diseases, v) chronic systemic diseases, vi) motor neuron disease. For identification of a common signature of muscle disease, patient cohorts will be separated into discovery and validation cohorts retaining roughly equal proportions of samples from the disease categories. To remove bias towards a specific muscle disease, the “leave-one-disease-category-out” method will be applied. Next, disease category specific signatures will be identified. For both the common muscle disease signature and the disease specific signatures, functional pathway analysis will be used to improve understanding of disease pathomechanisms.

Aims

1) Identify a common signature of muscle disease using meta-analysis as described above and assess the following:
   i) association between the summary expression of the signature and clinical measures of muscle disease in cohorts with clinical data available;
   ii) enrichment of transcription factor binding sites, functional pathways, and cell-type specific signatures, and subcellular localization analysis;
   iii) comparison of the meta-analysis signature with other signature(s) identified in the literature.

2) Identify muscle disease specific category signatures using meta-analysis. Assess functional pathway enrichment within each disease category. Identify differentially expressed genes and functional pathways that are unique to ICUAW (not present in other muscle disease specific categories).

Hypothesis

We hypothesize that there is a common set of genes dysregulated across muscle diseases compared to healthy muscle and that these genes correlate with severity of muscle disease. Next, meta-analysis performed separately on each muscle disease category, with removal of common genes across muscle disease categories, is expected to reveal novel mechanistic differences between ICUAW and other causes of muscle atrophy. Furthermore, integration of all publicly available ICUAW datasets from human samples using meta-analysis may serve to improve detection of a robust signature of ICUAW.

See Figure 5.1 for flow diagram for Chapter 5


Chapter 3

3.0 Transcriptomic analysis reveals abnormal muscle repair and remodeling in survivors of critical illness with sustained weakness

3.1 Abstract:

Background/Objectives: ICU acquired weakness (ICUAW) is a common complication of critical illness characterized by structural and functional impairment of skeletal muscle. The resulting physical impairment may persist for years after ICU discharge, with few patients regaining functional independence. Elucidating molecular mechanisms underscoring sustained ICUAW is crucial to understanding outcomes linked to different morbidity trajectories as well as for the development of novel therapies.

Methods: Quadriceps muscle biopsies and functional measures of muscle strength and mass were obtained at 7 days and 6 months post-ICU discharge from a cohort of ICUAW patients. Unsupervised co-expression network analysis of transcriptomic profiles identified discrete modules of co-expressed genes associated with the degree of muscle weakness and atrophy in early and sustained ICUAW.

Results: Modules were enriched for genes involved in skeletal muscle regeneration and extracellular matrix deposition. Collagen deposition in persistent ICUAW was confirmed by histochemical stain. Modules were further validated in an independent cohort of critically ill patients with sepsis-induced multi-organ failure and a porcine model of ICUAW, demonstrating disease-associated conservation across species and peripheral muscle type.

Conclusions: Our findings provide a pathomolecular basis for sustained ICUAW, implicating aberrant expression of distinct skeletal muscle structural and regenerative genes in early and persistent ICUAW.

Chapter reproduced with permission from Nature Publishing Group. The article has been published in final form: Walsh, Christopher, Batt, Jane, S. Herridge, Margaret, Mathur, Sunita, D. Bader, Gary, Hu, Pingzhao & Santos, Claudia (2016). Transcriptomic analysis reveals abnormal muscle repair and remodeling in survivors of critical illness with sustained weakness. Scientific Reports 6, 29334. doi:10.1038/srep29334.

3.2 Introduction

A fundamental question regarding the pathomechanism of ICUAW is whether convergent transcriptional changes in response to muscle injury are associated with impaired recovery of muscle mass and strength in ICUAW. We hypothesized that the degree of aberrant expression of genes involved in skeletal muscle regeneration and repair is associated with the extent of muscle atrophy and weakness (muscle phenotypes) in ICUAW 275,276 . To test this hypothesis, quadriceps muscle biopsies and functional measures of muscle strength and mass were obtained at 7 days and 6 months post-ICU discharge from a cohort of ICUAW patients enrolled in the RECOVER Program (phase 1: Towards RECOVER) 275-277 (Supplementary Table 3.1), a multi-center prospective longitudinal study evaluating functional outcomes in critically ill patients following prolonged mechanical ventilation over a 1-year period after ICU discharge 275,276 . Clinical and functional data for the entire RECOVER and muscle biopsy cohorts are published 103 .

3.3 Methods

3.3.1 Patient selection

Patients were enrolled from a multi-center nested prospective study (Towards RECOVER). Patients were included if mechanically ventilated for a minimum of 1 week and if fully ambulatory and independent with activities of daily living prior to ICU admission. Exclusion criteria included any of the following: important current or pre-existing neurological injury that would preclude cognitive testing; a formal diagnosis of neuromuscular disease; non-ambulatory prior to ICU; anticipated death or withdrawal of life sustaining treatment within 48 hours; history of psychiatric illness or significant cognitive impairment with documented hospital admission; not fluent in English or French; living greater than 300 km from referral centre; physician refusal; patient or SDM (substitute decision maker) consent refusal; and no next of kin or SDM available (if patient unable to provide consent). Additional exclusion criteria for this nested study included important current neurological injury that would preclude motor testing; known HIV, Hepatitis B or Hepatitis C infection; therapeutic anticoagulation and/or active cancer undergoing treatment. Written informed consent was obtained from all participants or their surrogate decision makers and participants were re-consented when capacity was regained. The study protocol was approved by the research ethics boards of all participating institutions. Banked muscle biopsy specimens previously collected from consenting healthy individuals were used as controls. All controls had been screened by interview and self-reported the absence of malignancy and of significant cardiac, pulmonary, hepatic, renal or endocrine disease.

3.3.2 Outcome measures of physical function, strength and mass

Physical functional capacity was measured using the motor component of the Functional Independence Measure questionnaire (FIM score) 23 . The Functional Independence Measure questionnaire provides a numerical score for cognitive and motor function that has been validated and standardized across diverse patient populations. A higher FIM score (scale 18-126) connotes better function in both domains. The motor subscore is based on the individual’s ability to perform their activities of daily living (ADLs; toileting, dressing, walking, climbing stairs, eating). A healthy individual with complete and unassisted independence of ADL would achieve a maximal FIM motor subscore of 91. The measure is independent of an individual’s gender or age. Global muscle strength was assessed by the Medical Research Council Sum Score (MRCSS) 21 . Manual assessment of muscle strength was performed by a physiotherapist, grading muscle strength from 0 to 5, as established by the Medical Research Council. The higher the score, the stronger the muscle group, with 5 representing what is deemed by the assessor to be normal strength, taking into account the individual’s sex and age. The MRC sum score is calculated by summing the scores for bilateral shoulder abduction, elbow flexion, wrist extension, hip flexion, knee extension, and ankle dorsiflexion. The score for an individual with normal strength is 60.
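As a worked illustration of the MRC sum score arithmetic described above (six movements, two sides, grades 0 to 5), a minimal sketch follows. The function name `mrc_sum_score` and the dictionary layout are hypothetical conveniences, not part of the study protocol.

```python
MRC_MOVEMENTS = (
    "shoulder abduction",
    "elbow flexion",
    "wrist extension",
    "hip flexion",
    "knee extension",
    "ankle dorsiflexion",
)

def mrc_sum_score(left, right):
    """Sum MRC grades (0-5) over six movements on both sides (maximum 60)."""
    total = 0
    for side in (left, right):
        for movement in MRC_MOVEMENTS:
            grade = side[movement]
            if not 0 <= grade <= 5:
                raise ValueError(f"MRC grade out of range for {movement}: {grade}")
            total += grade
    return total

# An individual graded 5 for every movement on both sides reaches the maximum.
normal_side = {movement: 5 for movement in MRC_MOVEMENTS}
normal_score = mrc_sum_score(normal_side, normal_side)
```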

The midthigh quadriceps femoris muscle cross-sectional area (CSA) was determined as previously described 278 . Briefly, computed tomography imaging of the right thigh, halfway between the pubic symphysis and the inferior condyle of the femur, using a Light Speed QXi 4 slice helical scanner (General Electric, Milwaukee, WI), was performed with the subject in the supine position. Each image was 10 to 20 mm thick, and the muscle identified as tissue with a density of 40 to 100 Hounsfield units. Images were analyzed and the CSA of thigh muscle determined by a single radiologist, blinded as to the categorization of each test subject. Quadriceps was then manually traced using ImageJ software (version 1.42q, NIH, USA) by a single investigator blinded to participant categorization and the CSA calculated. The cross-sectional area of the muscle at the midsection of the thigh determined by computerized tomography (CT) was compared to age and sex matched publication based norms 279 and muscle mass for study subjects is expressed as a percent of the age and sex matched norms.

Quadriceps isometric peak torque (strength) was measured from maximal voluntary contractions of the knee extensors. The patient was seated on the Biodex dynamometer (Biodex System 4), with the hip at 85 degrees and knee at 60 degrees of flexion. The patient performed five maximal voluntary contractions with one minute rests, according to a previously described protocol 280 . Peak torque was recorded in Newton-meters (Nm). Patients’ peak torque was reported in absolute terms and as percentage of the predicted normal, as previously described 281 :

Predicted quadriceps force (Nm) = -(2.21 × age) + (55.9 × gender [female = 0, male = 1]) + (1.78 × body weight) + 124

All testing was conducted at 7 days and 6 months post-ICU discharge, with the exception of the Biodex measures, which were conducted solely at 6 months.
For controls, the values for FIM motor subscore, MRCSS, percent predicted quadriceps isometric peak strength, and percentage of normal quadriceps CSA were set to the normal value in a healthy population (91, 60, 100% and 100%, respectively). Descriptive statistics of patient demographic and clinical variables are shown in Supplementary table 3.1.
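The prediction equation for quadriceps force can be applied as in the minimal sketch below. The function names are hypothetical, and the units for age (years) and body weight (kilograms) are assumptions, since the text above does not state them.

```python
def predicted_quadriceps_torque_nm(age_years, is_male, body_weight):
    """Predicted normal quadriceps peak torque (Nm) from the equation above.

    Assumes age in years and body weight in kilograms (units not stated in
    the source text).
    """
    gender_term = 1 if is_male else 0  # female = 0, male = 1
    return -(2.21 * age_years) + (55.9 * gender_term) + (1.78 * body_weight) + 124

def percent_predicted(measured_nm, age_years, is_male, body_weight):
    """Measured peak torque expressed as a percentage of the predicted normal."""
    predicted = predicted_quadriceps_torque_nm(age_years, is_male, body_weight)
    return 100.0 * measured_nm / predicted

# Illustrative values only: a 60-year-old man weighing 80 kg.
example_predicted = predicted_quadriceps_torque_nm(60, True, 80)
```

For these illustrative inputs the predicted torque is -132.6 + 55.9 + 142.4 + 124 = 189.7 Nm.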

3.3.3 Muscle sample collection

Biopsy of the vastus lateralis muscle was performed under local anesthetic using a modified Bergstrom needle, as previously described 282 . Briefly, under sterile conditions, the skin and subcutaneous tissue overlying the lateral aspect of the distal third of the muscle were anesthetized, and a small incision was made in the outer fascial layer with a scalpel blade. The Bergstrom needle was advanced through the incision roughly 1 cm into the muscle, suction was applied as the trocar was advanced, and several pieces of muscle tissue were obtained. The needle was withdrawn under counter pressure, the skin closed with a single suture, and a pressure dressing applied. Tissue (approx. 200 mg in total) was immediately sectioned with a sterile scalpel blade and pieces were processed for electron and light microscopy or flash frozen in liquid nitrogen (N2).

3.3.4 Muscle sample staining

Muscle biopsies were fixed in 10% buffered formalin phosphate for 24 hours at room temperature, rinsed in ethanol, paraffin embedded and sectioned (10 µm thickness) on cross section. The sections were rehydrated in a series of xylene and ethanol washes. Nuclei were stained with Weigert's haematoxylin for 8 minutes, washed in running tap water and then sections stained in picro-sirius red for one hour. Sections were then washed in two changes of acidified water, dehydrated with ethanol and xylene washes and mounted.

3.3.5 Microarray samples and Quality control.

Muscle sample collection. Percutaneous muscle biopsy of the vastus lateralis was performed at day 7 post-ICU discharge (n = 14) and 6 months post-ICU discharge (n = 10). Healthy muscle biopsy samples (controls) were obtained from previously banked specimens collected from consenting individuals (n = 8).

RNA extraction and microarray hybridization. RNA was processed, amplified, and labeled as previously described 283 . RNA samples included in the expression analysis had high RNA quality (median RNA integrity number [RIN] of 8.5). Expression profiles were obtained using Illumina HT-12 v4 microarrays (1 microarray per sample). All microarray data are deposited in GEO under accession number GSE78929.

Microarray data analysis. Microarray data analysis was performed using the R software and Bioconductor packages. Raw expression data were background corrected, quantile normalized and log2 transformed using the neqc function in the limma package. Data quality control included high inter-array correlation (Pearson correlation coefficients > 0.85) and detection of outlier arrays based on mean inter-array correlation and hierarchical clustering. All samples fulfilled data quality control criteria. Probes listed as “No match” or “Bad” using the illuminaHumanv4 package were removed, resulting in 34,476 high-quality probes. Robustly expressed probes were defined as those with detection P value < 0.05 for at least half of the samples in the data set and standard deviation of probe expression > 0.25 (11,482 probes, corresponding to 9869 unique genes).
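The definition of robustly expressed probes can be sketched as a simple filter. This is an illustration in Python rather than the R code used for the analysis; the function name `robustly_expressed` and the dictionary-based input layout are hypothetical, and the toy values are invented for demonstration.

```python
import statistics

def robustly_expressed(expression, detection_p, p_cutoff=0.05,
                       min_fraction=0.5, min_sd=0.25):
    """Return probe IDs that pass both the detection and variability filters.

    expression:  dict mapping probe ID -> list of log2 expression values.
    detection_p: dict mapping probe ID -> list of detection P values
                 (same sample order as `expression`).
    """
    kept = []
    for probe, values in expression.items():
        n_detected = sum(p < p_cutoff for p in detection_p[probe])
        detected_enough = n_detected >= min_fraction * len(values)
        # Sample standard deviation (n - 1 denominator), as in R's sd().
        variable_enough = statistics.stdev(values) > min_sd
        if detected_enough and variable_enough:
            kept.append(probe)
    return kept

toy_expression = {
    "probe_1": [7.0, 8.0, 9.0, 10.0],   # detected in half, variable: kept
    "probe_2": [7.0, 7.1, 7.0, 7.1],    # detected, but nearly constant
    "probe_3": [6.0, 8.0, 6.0, 8.0],    # variable, but rarely detected
}
toy_detection = {
    "probe_1": [0.01, 0.01, 0.20, 0.30],
    "probe_2": [0.00, 0.00, 0.00, 0.00],
    "probe_3": [0.01, 0.50, 0.50, 0.50],
}
kept_probes = robustly_expressed(toy_expression, toy_detection)
```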

3.3.6 Single-gene differential expression analysis

Differential expression of all robustly expressed probes (see above) in ICUAW day 7 and month 6 post-ICU and control samples was assessed in limma as gene-wise linear models adjusted for patient age and sex and for consensus correlation between patient samples using the duplicateCorrelation function 284 . Moderated F-statistics combined t-statistics for all three pair-wise comparisons (contrasts) into an overall test of significance for each gene. The decideTests function with the “global” setting performed error rate control across multiple contrasts and genes simultaneously, with a pre-specified significance threshold of Benjamini-Hochberg FDR < 5% and absolute values of non-log fold change (aFC) > 1.0. Diagnostic plots of the linear model fit (gene-wise residual standard deviations against average log-expression) were examined and no variance trends were identified. One gene with differential expression in both up and downregulated probes at month 6 post-ICU was removed. Hierarchical agglomerative clustering using average-linkage with the hclust() function was performed using the gene expression data from the differentially expressed genes across all samples. The clustering analysis was performed on the columns (samples) and rows (differentially expressed genes) with data visualized as a heatmap using the heatmap.3() function.
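The Benjamini-Hochberg adjustment underlying the FDR threshold can be written out as a short sketch. This is a generic implementation of the procedure, not limma's code; the function name `benjamini_hochberg` is hypothetical. To mimic error rate control across contrasts and genes simultaneously, the p-values from all contrasts would be pooled into one list before adjustment, which is our understanding of the “global” setting.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

toy_p = [0.01, 0.04, 0.03, 0.005]
toy_q = benjamini_hochberg(toy_p)
significant = [q < 0.05 for q in toy_q]
```

For the toy input the adjusted values are 0.02, 0.04, 0.04 and 0.02, so all four tests pass an FDR threshold of 5%.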

3.3.7 Co-expression network analysis

Signed hybrid co-expression networks were detected using the WGCNA package in R. We chose signed networks as they have been shown to detect modules with more significant enrichment of functional classifications 148 . We used a linear mixed model (LMM) to correct the gene expression for age and sex effects (fixed effects) with random intercept terms to account for correlation among samples from the same patient and for potential differences in age- or sex-related expression effects between ICUAW and controls. Residual values from the LMM were used for input in WGCNA and the remainder of the analysis 285 . Pairwise Tukey’s biweight correlations for the set of genes were calculated using the bicor function, as this correlation method is more robust than Pearson correlation and often more powerful than Spearman correlation 149 . Adjacency transformation was calculated by raising the correlation matrix to the power of 6, which was chosen using the scale-free topology criterion 135 . For each pair of probes the topological overlap measure was calculated based on the adjacency matrix, and the topological overlap dissimilarity measure was used as input for average linkage hierarchical clustering. The Dynamic Hybrid tree cutting algorithm was used to cut branches off the dendrogram because it produces robustly defined modules 52 . To obtain moderately large and distinct modules we set the minimum module size to 50 probes and the minimum height for merging modules to 0.2 (default parameter). Each module was summarized by the first principal component of its scaled, age- and sex-corrected (residual) expression values (the module eigengene). Probes were assigned to a module if they had high module membership (correlation between gene expression and module eigengene > 0.5). Modules were numerically labeled by module size, with M1 indicating the largest module. We then tested the association of each module eigengene with disease status (ICUAW at day 7 and ICUAW month 6 vs control) using a mixed effects model with a random intercept term to account for correlation among the eigengenes from the same patient, with a pre-specified significance threshold of FDR < 5%. P-values were adjusted for multiple tests using the p.adjust function (Benjamini-Hochberg) in the stats package in R. The co-expression analysis on the 11,482 age and sex corrected probes identified 17 modules; the 6846 probes that did not fulfill these criteria were assigned to the predefined module M0 (designating non-module genes). The module memberships of all 11,482 probes are shown in Supplementary Table 3.4. The association of each module eigengene to disease status is shown in Supplementary Table 3.5.
Module visualization. For each module the topological overlap measure (kME) was calculated and used to rank genes within the module. The top 50 ranked genes for each module were visualized using Cytoscape.

Module-clinical variable relationships. Tukey’s biweight correlations were calculated between continuous clinical variables and module eigengene values. Significant correlation to clinical variables was empirically defined as R ≥ 0.5 and adjusted p-value < 0.05 (Supplementary Figure 3.2).
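The adjacency and topological overlap steps of the network construction can be illustrated on a toy correlation matrix. The sketch below assumes the usual definitions of a signed hybrid adjacency (positive correlations raised to the soft-thresholding power, negative correlations set to zero) and of the unsigned-style topological overlap measure; it is a plain Python illustration with hypothetical function names, not the WGCNA implementation, and it takes the correlation matrix as given rather than computing biweight midcorrelations.

```python
def signed_hybrid_adjacency(cor, power=6):
    """Signed hybrid adjacency: cor**power if cor > 0, else 0 (diagonal = 1)."""
    n = len(cor)
    return [
        [1.0 if i == j else (cor[i][j] ** power if cor[i][j] > 0 else 0.0)
         for j in range(n)]
        for i in range(n)
    ]

def topological_overlap(adj):
    """Topological overlap matrix computed from an adjacency matrix."""
    n = len(adj)
    # Connectivity of each node, excluding the self-connection.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
    return tom

toy_cor = [
    [1.0, 0.9, -0.8],
    [0.9, 1.0, 0.5],
    [-0.8, 0.5, 1.0],
]
toy_adj = signed_hybrid_adjacency(toy_cor, power=6)
toy_tom = topological_overlap(toy_adj)
# Dissimilarity used as the input to hierarchical clustering.
toy_dissimilarity = [[1 - value for value in row] for row in toy_tom]
```

Note that the strongly negative correlation between the first and third probes contributes no direct adjacency, yet the pair retains a small topological overlap through their shared neighbour.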

3.3.8 Gene ontology and Human phenotype ontology analysis

Functional enrichment in Gene Ontology (GO) and Human Phenotype Ontology (HPO) was performed using the gProfiler tool in R 286 . The statistical significance threshold level for GO and HPO enrichment was Benjamini and Hochberg corrected p < 0.05 and a minimum of 10 genes detected. The background list for the enrichment analysis included all genes represented on the Illumina Human HT-12 v4 array with a detection P value < 0.05 in at least three samples.

3.3.9 Gene set visualization using enrichment map

Enrichment map, a network-based visual representation of enriched terms that groups similar gene sets (functional terms), was used to aid identification of functional themes (Supplementary Figure 3.1). Gene set functional enrichment analysis was performed using Gene Ontology (GO) biological process terms in the web-based version of gProfiler (http://biit.cs.ut.ee/gprofiler/). The statistical significance threshold level for GO enrichment was Benjamini and Hochberg corrected p < 0.05 and only gene sets between 10 and 100 genes were used from GO. The background list for the enrichment analysis included all genes represented on the Illumina Human HT-12 v4 array with a detection P value < 0.05 in at least three samples. Enrichment Map Cytoscape plugin software was used to create the enrichment map, with the parameters p-value < 0.005, corrected p-value < 0.05 and “Jaccard + overlap similarity” cutoff = 0.5.
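The “Jaccard + overlap similarity” used to connect gene sets can be illustrated as a weighted combination of the two coefficients. The equal weighting (`weight=0.5`) is an assumption made for this sketch, and the function names are hypothetical; the plugin's own defaults should be checked before relying on it.

```python
def jaccard_coefficient(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def overlap_coefficient(a, b):
    """Size of the intersection divided by the size of the smaller set."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0

def combined_similarity(a, b, weight=0.5):
    """Weighted mix of overlap and Jaccard coefficients (weight is assumed)."""
    return weight * overlap_coefficient(a, b) + (1 - weight) * jaccard_coefficient(a, b)

# Hypothetical gene sets, labelled with integers for brevity.
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}
set_c = {1, 2, 3}

edge_ab = combined_similarity(set_a, set_b) >= 0.5   # partial overlap
edge_ac = combined_similarity(set_a, set_c) >= 0.5   # set_c nested in set_a
```

Under these assumptions the partially overlapping pair falls below the 0.5 cutoff and would not be linked, whereas the nested pair would be.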

3.3.10 Transcription factor binding site analysis

oPOSSUM, version 3.0 287 (http://opossum.cisreg.ca/oPOSSUM3), was used to detect enrichment of human transcription factor binding sites (TFBSs) in the 5,000 base pairs (bp) upstream and downstream sequence of input genes (single site analysis; cutoffs were z score of ≥10, Fisher exact test score [negative natural logarithm of the hypergeometric p-value] of ≥7, default values in oPOSSUM based on empirical studies, and conservation cut-off 0.6). Results were ranked by Fisher exact score and the three highest scoring TFBSs above cutoffs were selected (Supplementary Table 3.7).
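The selection rule for enriched TFBSs (both cutoffs met, ranked by Fisher score, top three kept) can be sketched as follows. The record layout, function names, and the placeholder entries `TF_A` to `TF_E` with their scores are invented for illustration and do not correspond to results from this study.

```python
import math

def fisher_score(hypergeometric_p):
    """Fisher score as defined above: negative natural log of the p-value."""
    return -math.log(hypergeometric_p)

def top_tfbs(results, min_z=10.0, min_fisher=7.0, top_n=3):
    """Keep TFBS profiles passing both cutoffs, ranked by Fisher score."""
    passing = [
        r for r in results
        if r["z_score"] >= min_z and r["fisher_score"] >= min_fisher
    ]
    return sorted(passing, key=lambda r: r["fisher_score"], reverse=True)[:top_n]

# Placeholder records (hypothetical names and scores).
toy_results = [
    {"tf": "TF_A", "z_score": 15.2, "fisher_score": 9.1},
    {"tf": "TF_B", "z_score": 8.0, "fisher_score": 12.0},   # fails z cutoff
    {"tf": "TF_C", "z_score": 11.0, "fisher_score": 7.5},
    {"tf": "TF_D", "z_score": 20.0, "fisher_score": 6.0},   # fails Fisher cutoff
    {"tf": "TF_E", "z_score": 12.5, "fisher_score": 10.3},
]
selected = [r["tf"] for r in top_tfbs(toy_results)]
```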

3.3.11 Preservation analysis

To make the data from different microarray platforms comparable, we converted the probe-level measurements from the validation datasets and our dataset of ICUAW samples to gene-level measurements using the collapseRows function in WGCNA with the “MaxMean” setting. The two validation sets contained 5344 of the 9683 (55.2%) genes used in our co-expression analysis. The 5344 genes in the ICUAW dataset (reference dataset) were mapped to their corresponding module labels from the co-expression analysis performed on the original dataset of 11,482 probes. The module membership from the reference data was then applied to each validation dataset tested 147 . We applied the module preservation statistic Zsummary, computed with the modulePreservation function in WGCNA. The Zsummary statistic integrates the overlap in module membership with the connectivity (sum of connections) and density (mean connectivity) patterns of each module. Under the recommended significance thresholds, Zsummary < 2 implies no evidence for module preservation, 2 < Zsummary < 10 implies weak to moderate evidence, and Zsummary > 10 implies strong evidence for module preservation.
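The interpretation of the preservation thresholds can be sketched as follows. This is an illustrative helper rather than part of WGCNA: the composite score is assumed here to be the mean of a density and a connectivity Z statistic, values falling exactly on a boundary (2 or 10) are assigned to the middle band by assumption, and the module names and Z statistics are hypothetical.

```python
def z_summary(z_density, z_connectivity):
    """Composite score, assumed here to be the mean of the two Z statistics."""
    return (z_density + z_connectivity) / 2


def preservation_evidence(z):
    """Map a Zsummary value to the evidence categories quoted in the text."""
    if z < 2:
        return "none"
    if z <= 10:  # boundary values are placed in the middle band by assumption
        return "weak to moderate"
    return "strong"


# Hypothetical (density, connectivity) Z statistics for three modules.
modules = {"A": (1.0, 1.4), "B": (9.0, 8.2), "C": (30.0, 24.2)}
evidence = {
    name: preservation_evidence(z_summary(d, c)) for name, (d, c) in modules.items()
}
```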

3.3.12 Independent validation data sets

Two datasets were obtained from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database. Validation dataset 1: human skeletal muscle (vastus lateralis) transcriptome in ICU patients with sepsis-induced multi-organ failure (GSE13205, 13 sepsis and 8 control samples), profiled on the Affymetrix GeneChip HG-U133 Plus 2.0 Array 87 . Expression data were corrected for age and sex using the residuals of a linear regression model. Validation dataset 2: experimental model of ICUAW in Sus scrofa skeletal muscle (biceps femoris) transcriptome, containing a total of 40 samples from a series of 5-day longitudinal experiments (24 samples, GSE16348; 8 samples, GSE24239; 8 samples, GSE33037), profiled on the Affymetrix Porcine Genome Array 288 . For both validation datasets, raw series data were downloaded (Affymetrix .CEL files) and background corrected using rma in the expresso function of the affy package (“average difference” summary method) without normalization; arrays from the same platform were then merged and quantile normalized. Data quality was assessed as described above. Published putative homologs for the porcine dataset were used to link probe IDs to Human Genome Organization (HUGO) symbols 289 .
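Quantile normalization, used above to merge arrays from the same platform, forces every array to share one reference distribution. The sketch below is a plain illustration of that idea and not the Bioconductor routine used in the analysis; the intensities are hypothetical, and ties are broken by probe order rather than averaged.

```python
def quantile_normalize(arrays):
    """Quantile-normalize equal-length arrays (one list of probe values per array)."""
    n_probes = len(arrays[0])
    # Reference distribution: mean of the k-th smallest value across arrays.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [
        sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity on this array.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]  # replace each value by the reference quantile
        normalized.append(out)
    return normalized


# Hypothetical intensities for three arrays and four probes.
arrays = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(arrays)
```

After normalization each array contains the same set of values, and only the assignment of those values to probes differs between arrays.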

3.4 Results

We analyzed 32 vastus lateralis muscle tissue samples from 14 ICUAW patients (14 biopsies at day 7 and 10 follow-up biopsies at month 6 post-ICU discharge) and 8 healthy controls (Table 3.1) using Illumina microarrays. For the probes that passed quality control, expression was adjusted for age, sex, and correlation between samples from the same patient (see Methods). A total of 695 genes were found to be differentially expressed between 14 ICUAW day 7 post-ICU, 10 ICUAW month 6 post-ICU, and 8 control samples, using a false discovery rate (FDR) cut-off of < 5% (Supplementary Table 3.1). Hierarchical clustering performed using the differentially expressed genes showed that skeletal muscle profiles from ICUAW patients grouped largely by disease status and time-point, as expected given that the differential expression analysis compared ICUAW at day 7 and month 6 versus controls (Figure 3.1a). A total of 347 genes were under-expressed (downregulated) at day 7 post-ICU compared to controls. Eighty-seven of these were also downregulated in month 6 post-ICU samples (Figure 3.1b). Twenty-one of the 256 genes upregulated at day 7 were also upregulated at month 6 post-ICU discharge versus controls. Genes significantly downregulated in ICUAW at day 7 post-ICU compared to controls showed enrichment for Gene Ontology (GO) and Human Phenotype Ontology (HPO) terms that underscore the role of bioenergy metabolism and mitochondrial dysfunction in acute muscle injury, including mitochondrial inner membrane (Phypergeometric = 8.50 x 10^-11) and acidosis (Phypergeometric = 0.0129) (Supplementary Table 3.2), compatible with findings from animal models of ICUAW and cohorts of sepsis with multiple organ dysfunction syndrome (MODS) 87,288 . No significant enrichment of GO terms was detected for upregulated genes at day 7 post-ICU, or for up- (N = 50) or downregulated (N = 152) genes at month 6 post-ICU compared to controls.

We used weighted gene co-expression network analysis (WGCNA) to detect co-expression modules that correlate with clinical muscle phenotypes. Accordingly, a co-expression network based on age- and sex-adjusted expression data was built using 11,482 robustly expressed probes (corresponding to 9869 unique genes) from the entire dataset, containing all ICUAW and control samples (32 muscle transcriptomes). This identified 17 co-expression modules, labeled numerically by module size, with module 1 (M1) indicating the largest module (Supplementary Table 3.3). Module expression for each sample was summarized by the first principal component of its expression (module eigengene [ME]). We defined ICUAW-relevant modules as those having an ME associated with case-control status and correlated with at least one clinical variable. Modules with differential expression of the ME at day 7 post-ICU versus controls and at month 6 post-ICU versus controls were termed “early ICUAW-relevant” and “sustained ICUAW-relevant” modules, respectively. Differences in ME expression were tested using a linear mixed effects model to account for correlation between patient samples. Clinical variables tested for correlation with ME included the motor subscore of the Functional Independence Measure (FIM), global muscle strength measured by MRC sum score (MRCSS) and quadriceps cross sectional area (CSA), expressed as a percentage of published age and sex matched norms, each measured at 7 days and 6 months, and muscle strength determined by quadriceps peak torque (% predicted) 281 at month 6 post-ICU discharge (Figure 3.2 and Supplementary Table 3.4).

Eleven of 17 modules met the criteria for ICUAW-relevant modules: eight early, one sustained, and two both early and sustained ICUAW-relevant modules. Seven of the 11 disease-relevant modules were functionally enriched for GO terms (Table 3.2 and Supplementary Table 3.5), including modules M1 and M4, each enriched for the GO term mitochondrial inner membrane and significantly downregulated at day 7 post-ICU (P = 5.8 x 10^-5 and 3.3 x 10^-5, respectively), concordant with our single-gene differential expression analysis. Unlike single-gene analysis, we detected functional enrichment in an additional five differentially expressed modules.

We further hypothesized that genes within an ICUAW-associated module were co-regulated by shared transcription factor binding sites (TFBS) 139 . Analysis of conserved TFBS showed significant over-representation of at least one TFBS in six of 11 modules (M1, M2, M3, M6, M7, M17), many of which are known to be associated with muscle-related pathways (as expanded upon below). This suggests that shared TFBS may be co-regulating gene expression in ICUAW-associated modules (Table 3.2 and Supplementary Table 3.6).

The largest early ICUAW-relevant module, M1 (N = 900 genes), was downregulated at day 7 post-ICU (P = 5.77 x 10^-5), whereas no difference in expression was detected at month 6 post-ICU compared to controls (Table 3.2). Module eigengene expression correlated with quadriceps CSA (% of age/sex matched population based norm) (R = 0.60, p = 2.1 x 10^-3) and global muscle strength measured by MRCSS (R = 0.64, p = 9.7 x 10^-4). Genes in this module were significantly enriched for multiple GO terms relating to mitochondrial function, bioenergy metabolism, and muscle structure development (Figure 3.3a-d and Figure 3.4a). We found significant overrepresentation of multiple TFBS, including the myocyte enhancer factor 2A (MEF2A) binding site (Phypergeometric = 1.43 x 10^-9). MEF2A has been shown to be upregulated during muscle regeneration 290 and is required for adult early myogenic (myoblast) differentiation and regeneration in response to injury 291 .

Module M2 (N = 850 genes) was upregulated at day 7 post-ICU and inversely correlated with quadriceps CSA (R = -0.68, p = 4.0 x 10^-4) and global muscle strength (R = -0.60, p = 2.1 x 10^-3). The module was enriched for genes related to ribosome biogenesis (p = 9.62 x 10^-13) and protein targeting to the membrane (p = 3.01 x 10^-5). The GA-binding protein alpha (GABPA) TFBS was found to be significantly overrepresented in M2 (Phypergeometric = 1.23 x 10^-4). GABPA has been shown to regulate the expression of synaptic genes at the neuromuscular junction 292 .

The largest sustained ICUAW-relevant module, M3 (N = 592 genes), had no difference in ME expression at day 7 post-ICU, but was upregulated at 6 months post-ICU (P = 0.028). Module eigengene expression was inversely correlated with muscle strength as determined by quadriceps peak torque at 6 months (R = -0.65, P = 8.0 x 10^-4). The module was enriched for multiple terms relating to wound healing and repair such as extracellular matrix deposition, calcium handling (contractile function), and muscle structure development, suggesting activation of regenerative pathways (Figure 3.3e-h, Figure 3.4b). The most significantly overrepresented TFBS in M3, TEAD1 (Phypergeometric = 1.8 x 10^-21), has been shown to be a mediator of skeletal muscle development 293,294 . The skeletal muscle development genes enriched in M3 were distinct from those in M1, indicating specific temporal alterations in skeletal muscle regenerative expression profiles in early and sustained ICUAW. The in silico prediction of increased extracellular matrix deposition was corroborated by staining for collagen in a representative sustained ICUAW biopsy sample compared to a healthy control (Figure 3.3i).

We externally validated the robustness of our findings by determining whether the modules detected in our ICUAW dataset were preserved in an experimental pig (Sus scrofa) model of sepsis-induced early ICUAW and a human cohort of sepsis-induced MODS biopsied at an average of 7 days after admission to ICU. Previously annotated homologues were used to detect genes shared between pig and human in our dataset 289 , resulting in a network of 5344 genes consisting of the intersection of all three datasets. We then tested for preservation of module structure in the validation datasets using a permutation-based composite Z statistic, Zsummary, to assess the significance of the observed preservation statistics, where Z values > 10 imply significant preservation of module structure (see Methods). Module M1 was significantly preserved in both the cohort of sepsis-induced MODS (Z = 27.1, Figure 3.5a) and the pig model of ICUAW (Z = 14.8, Figure 3.5b). Module M3 was modestly preserved in both the MODS cohort (Z = 8.6, Figure 3.5a) and the pig model of ICUAW (Z = 9.5, Figure 3.5b). To assess differences in module expression between ICUAW and controls in the validation cohorts, we tested the association between the ME and disease status using linear mixed effects regression analysis. Concordant with the findings from our cohort, the early ICUAW-relevant module M1 was downregulated in both sepsis-induced MODS versus controls (p = 4.52 x 10^-6) and the porcine ICUAW model at day 5 versus pre-sepsis day 1 (p = 2.22 x 10^-5) (Supplementary Table 3.7), showing preservation of module structure and directionality of ME expression change.
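The module eigengene used throughout these results is the first principal component of a module's expression matrix. The following minimal sketch shows one way to compute it, using power iteration on standardized expression values; it is an illustration under stated assumptions and not the WGCNA implementation. The expression matrix is hypothetical, every gene is assumed to have non-zero variance, and orienting the sign to follow the mean expression is a convention adopted here.

```python
import math


def standardize(row):
    """Center a gene's expression across samples and scale to unit variance."""
    mean = sum(row) / len(row)
    sd = math.sqrt(sum((v - mean) ** 2 for v in row) / (len(row) - 1))
    return [(v - mean) / sd for v in row]


def module_eigengene(expr, n_iter=200):
    """First principal component (one value per sample) of a genes x samples matrix."""
    z = [standardize(row) for row in expr]
    n = len(z[0])
    # Sample-by-sample cross-product matrix of the standardized data.
    gram = [[sum(g[i] * g[j] for g in z) for j in range(n)] for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(n_iter):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Orient the eigengene so that it follows the mean standardized expression.
    avg = [sum(g[i] for g in z) / len(z) for i in range(n)]
    if sum(a * b for a, b in zip(avg, v)) < 0:
        v = [-x for x in v]
    return v


# Hypothetical module: three genes sharing one expression pattern over five samples.
pattern = [1.0, 2.0, 3.0, 4.0, 10.0]
expr = [pattern, [2 * v + 1 for v in pattern], [1.2, 1.9, 3.1, 4.2, 9.8]]
me = module_eigengene(expr)
```

Because the genes in this toy module follow a common pattern, the eigengene tracks that pattern closely, which is what allows a single value per sample to stand in for the whole module in the association tests.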

3.5 Discussion

We have performed the first longitudinal study of ICUAW to integrate clinical measurements of muscle mass and strength with transcriptomic profiles of skeletal muscle. We hypothesized that the degree of aberrant expression of genes involved in skeletal muscle regeneration and repair is associated with the extent of muscle atrophy and weakness (muscle phenotypes) in ICUAW. Using differential expression (DE) analysis we found distinct expression signatures between early and persistent ICUAW compared to controls. Functional analysis of the differentially expressed genes in early ICUAW found downregulation of mitochondrial genes, suggesting bioenergetic failure, in keeping with previous studies 9 . However, DE analysis of genes does not account for the variability of muscle phenotypes within our ICUAW cohort. We therefore applied co-expression analysis to detect groups of genes whose expression correlates with muscle phenotypes and differentiates ICUAW from healthy controls (DE modules), termed ICUAW-relevant modules. We found one early and one sustained ICUAW-relevant module that were enriched for skeletal muscle regeneration genes. One sustained ICUAW-relevant module, upregulated and enriched for extracellular matrix genes, was inversely correlated with muscle strength, suggesting that aberrant muscle repair may hinder force generation in sustained ICUAW. We verified the presence of collagen deposition in samples of sustained ICUAW using histochemical stains. Using two publicly available transcriptomic datasets of early ICUAW, we externally validated the module downregulated in early ICUAW and enriched in genes for muscle regeneration and the mitochondria. While we found moderate preservation of the largest sustained ICUAW-relevant module in the early ICUAW datasets, there are currently no other datasets of sustained ICUAW available for comparison.

Co-expression analysis has previously been shown to detect enrichment of prognostically relevant genes 295 and transcription factor binding sites (TFBS) 139 . Indeed, we detected significant overrepresentation of TFBS in the majority of ICUAW-relevant modules, suggesting that co-expression of genes in ICUAW-relevant modules is attributable to co-regulation by common TFs. Strikingly, in modules related to muscle regeneration, the most enriched TFBS have experimental evidence supporting their roles in muscle development and differentiation. The binding site for MEF2A, a TF critical for skeletal muscle regeneration 291,296 , was enriched in an early ICUAW-relevant module (M1) downregulated in early ICUAW, suggesting that impaired binding of MEF2A to its binding sites is associated with the degree of weakness and atrophy in early ICUAW. TEAD1 binding sites, the muscle CAT (MCAT) elements, were enriched in the sustained ICUAW module (M3) upregulated at 6 months post-ICU. TEAD-binding sites are found in the promoters of genes involved in terminal differentiation and are co-activated by Hippo pathway transduction 294,297,298 . RUNX1 and NFATC2, whose TFBS were also enriched in module M3, have been associated with the response to muscle damage 299,300 . The calcium-regulated transcription factor NFATC2 is activated only in newly formed myotubes and its gene targets play a key role in myoblast fusion and myotube growth 301 . The RUNX1-mediated transcriptional program regulates muscle-specific genes and structural proteins, controlling the balance of proliferation and differentiation in myoblasts during muscle regeneration 299 . The corroboration of our in silico analysis by in vivo experimental studies of muscle regeneration therefore serves to further validate our findings. The inverse correlation of muscle strength with M3 gene expression, together with the enrichment of these genes for extracellular matrix and muscle development genes, may denote aberrant muscle regeneration programs resulting in impaired muscle structure and function in sustained ICUAW.

3.6 Conclusions

In summary, application of WGCNA, a robust and unbiased systems-level method for gene network analysis, has significantly extended our understanding of ICUAW-specific transcriptional alterations compared to individual gene expression analysis. Our study identified distinct transcriptional alterations in the early and sustained phases of ICUAW. The strength of our study, the largest transcriptomic analysis of ICUAW, was the integration of comprehensive clinical measurements with gene network analysis. We have identified striking correlations between module expression and clinical measures of muscle mass and function, suggesting an association between disease-perturbed networks and phenotypic changes. Importantly, the relationships between module expression and clinical traits identified here do not imply causation, but they provide a foundation for future research. Increased sensitivity of gene module functional enrichment may be gained in future analyses with larger sample sizes, use of RNA sequencing, and transcriptomic analysis of single cells. Our analysis showed biologically meaningful enrichment in the majority of gene modules and provides novel insights regarding potential molecular mechanisms of sustained ICUAW. An important theme that emerged from our analysis was temporally distinct alterations of skeletal muscle regenerative genes in early and sustained ICUAW. The robustness of these findings was further supported by preservation of early ICUAW gene networks in other datasets of ICUAW. Thus we have established a comprehensive gene network-based framework to study candidate genes associated with muscle weakness in early and sustained ICUAW. We anticipate that these findings may provide avenues for development of therapies to ameliorate ICUAW in the future.

Table 3.1. Descriptive statistics (mean and standard deviation) and test of difference of means for demographic and clinical data for the three subgroups. Differences in continuous variables between the groups were tested using one-way Analysis of Variance (ANOVA). Differences in categorical variables between groups were tested using Fisher's exact test.

| Variable | Day 7 post ICU | Month 6 post ICU | Control | P-value |
|---|---|---|---|---|
| Age | 50.29 (15.86) | 48.10 (13.49) | 35.25 (9.48) | 0.055 |
| Sex | 7F 7M | 5F 5M | 3F 5M | 1 |
| Quadriceps cross sectional area (percentage of age and sex matched norms) | 37.34 (19.19) | 63.18 (25.23) | 100* | <0.001 |
| Quadriceps strength (% predicted isometric peak torque) | NA | 65.36 (17.29) | 100* | <0.001 |
| FIM motor subscore (max score 91) | 39.86 (22.82) | 83.78 (5.74) | 91* | <0.001 |
| MRCSS (max score 60) | 46.31 (14.31) | 55.40 (4.03) | 60* | 0.09 |

* Clinical data values for controls are set to the normal value in a healthy population. Abbreviations: Medical Research Council Sum Score (MRCSS), Functional Independence Measure (FIM), Female (F), Male (M).

Table 3.2: Weighted gene correlation network identifies eleven ICUAW-associated modules. Association with ICUAW for each module, represented by the module eigengene (first principal component). Direction of differential expression: a positive sign (+) indicates upregulation, and a negative sign (-) downregulation, of ICUAW samples compared to controls. Ten modules are significantly associated with ICUAW day 7 post-ICU at FDR < 0.05 (differential expression [DE] P-value: red boxes in the left-hand column indicate significant DE in ICUAW day 7 post-ICU), and three modules are associated with ICUAW month 6 post-ICU (red boxes in the right-hand column). Grey boxes in the left and right columns indicate no significant DE in ICUAW at day 7 and month 6 post-ICU discharge, respectively. Module phenotype correlations (p-value in brackets) for muscle strength (based on the highest absolute value of either MRCSS or peak torque), muscle mass, and physical function (FIM score). Significantly enriched gene ontology or human phenotype ontology terms, based on FDR < 0.05 (representative examples listed). Significantly over-represented transcription factor binding sites (TFBS) based on P-value < 9.1 x 10^-4 (three most significant TFBS above cutoff shown per module).

| Module | Size (# genes) | Direction | Association with phenotypes (Strength; Mass; Function), R (p) | GO and HPO terms | TFBS |
|---|---|---|---|---|---|
| M1 | 900 | - | 0.6 (2.1 x 10^-3); 0.6 (2.1 x 10^-3); 0.64 (9.7 x 10^-4) | Myopathy; Mitochondrial inner membrane; Muscle structure development | MEF2A; RORA_1; NR4A2 |
| M2 | 850 | + | -0.60 (2.1 x 10^-3); -0.68 (4.0 x 10^-4); -0.74 (1.3 x 10^-4) | Protein targeting to membrane; Ribosome biogenesis | GABPA |
| M3 | 592 | + | -0.65 (8.0 x 10^-4) | Calcium ion binding; Collagen metabolic process; Muscle structure development | TEAD1; HOXA5; SPI1 |
| M4 | 529 | - | 0.63 (1.3 x 10^-3); 0.74 (1.3 x 10^-4); 0.65 (7.4 x 10^-3) | Mitochondrial inner membrane | |
| M6 | 166 | + | -0.56 (5.5 x 10^-3); -0.54 (6.8 x 10^-3) | Endocytosis | ELK1 |
| M7 | 136 | + | -0.54 (6.8 x 10^-3); 0.52 (9.7 x 10^-3) | | Tal1::Gata1 |
| M11 | 80 | + | -0.67 (5.8 x 10^-4) | Inflammatory response | |
| M13 | 62 | + | -0.53 (7.5 x 10^-3); -0.66 (7.4 x 10^-4) | | |
| M14 | 57 | + | -0.56 (5.5 x 10^-3) | | |
| M16 | 59 | - | 0.70 (2.5 x 10^-4) | | |
| M17 | 46 | - | 0.54 (7.4 x 10^-2) | Angiogenesis | ELK1; INSM1; ZFx |

Figure 3.1: Differentially expressed genes in ICUAW. a. Heat map of 695 gene probes differentially expressed between ICUAW at day 7 and ICUAW at month 6 post-ICU versus healthy controls. Differential expression was assessed at a false discovery rate (FDR) < 0.05 and non-log fold change > 1.0. Scaled expression values are color coded according to the legend below the heat map. The top bars indicate patient variables: group (purple, ICUAW day 7; pink, ICUAW month 6; grey, control), age and sex (values are color coded according to the respective legend to the right of the heat map). b. Venn diagram of differentially expressed probes in ICUAW day 7 post-ICU (left) and ICUAW month 6 post-ICU (right). The numbers of overlapping genes shared between day 7 and month 6 are shown in the four squares within the yellow diamond; the numbers of genes exclusively differentially expressed in ICUAW day 7 (left) or ICUAW month 6 post-ICU (right) are shown in the four squares outside the yellow diamond.

Figure 3.2 Heatmap of correlations between co-expression module eigengenes (rows) and quantitative clinical traits (columns). The clinical traits (from left to right) are FIM, Functional Independence Measure (motor subscore); Quadriceps mass, quadriceps cross sectional area expressed as a percentage of age and sex matched norms determined by computerized tomography; APACHE2 score, a severity of disease classification system; MRC, a sum score of global muscle strength; Muscle strength determined by quadriceps peak torque (% predicted); RIN, RNA integrity number. Scale bar (right) indicates the range of possible correlations from positive (red, 1) to negative (green, -1).

Figure 3.3: Modules 1 and 3 are associated with ICUAW at day 7 and month 6 post-ICU discharge, respectively. a,e Module eigengene values (y-axis) across samples (x-axis): purple, ICUAW day 7 post-ICU; pink, ICUAW month 6 post-ICU; grey, controls. P-values are from linear mixed effects regression with age and sex as fixed effects. b,f Module eigengene values (y-axis) vs. clinical measurements (x-axis). Correlation R-values were calculated using Tukey’s biweight correlation, with p-values adjusted for multiple comparisons. c,g Relevant gene ontology categories enriched in the M1 and M3 modules. d,h Top 50 most highly connected genes in the M1 and M3 modules. i Low magnification (10x) representative photomicrographs of vastus lateralis muscle cross sections from a healthy control (left panel) and a sustained ICUAW patient at month 6 (right panel) stained with Picro-Sirius Red stain. Muscle stains yellow while collagen stains red.
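The biweight correlation named in the caption above is a median-based alternative to the Pearson correlation that downweights outlying samples. A minimal sketch of the standard definition follows; it is not the WGCNA routine, the input values are hypothetical, and each variable is assumed to have a non-zero median absolute deviation.

```python
import math
from statistics import median


def _biweight_terms(x):
    """Median-centred values multiplied by their Tukey biweight weights."""
    med = median(x)
    mad = median([abs(v - med) for v in x])  # raw median absolute deviation
    terms = []
    for v in x:
        u = (v - med) / (9 * mad)
        w = (1 - u * u) ** 2 if abs(u) < 1 else 0.0  # distant points get zero weight
        terms.append((v - med) * w)
    return terms


def bicor(x, y):
    """Biweight midcorrelation between two equal-length sequences."""
    a, b = _biweight_terms(x), _biweight_terms(y)
    norm_a = math.sqrt(sum(t * t for t in a))
    norm_b = math.sqrt(sum(t * t for t in b))
    return sum(p * q for p, q in zip(a, b)) / (norm_a * norm_b)


# Hypothetical eigengene-like and trait-like values; the last trait value is an outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 2.0, 3.0, 4.0, 100.0]
r = bicor(x, y)
```

In this toy example the extreme value in y receives zero weight, so it contributes nothing to the correlation rather than dominating it.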

Figure 3.4 Enrichment Map results of the gene set functional enrichment analysis for module 1 (top) and module 3 (bottom). A node in the Enrichment Map represents a gene set. Node size represents gene set size. Node color, blue or red, indicates whether gene sets are down- or up-regulated compared to controls, respectively.

Figure 3.5: Modules M1 and M3 from human ICUAW patients are preserved in two independent data sets. Summary preservation statistic (Zsummary score) based on permutation testing, using the human ICUAW patients as a reference. The y-axis displays the Z score for each module in a. human sepsis-induced multiple organ dysfunction and b. the pig model of ICUAW; numbers beside each module (colored circle) indicate the corresponding module in the reference dataset. The x-axis indicates the number of genes in the module. Z scores of less than 2 (bottom red line) imply no evidence for module preservation, while scores exceeding 5 (green line) and exceeding 10 imply moderate and strong evidence for module preservation, respectively.

Chapter 4:

4.0 microRNA-mRNA interactions underlying abnormal muscle repair in survivors of critical illness with sustained weakness

4.1 Abstract

Rationale & Objectives: ICU acquired weakness (ICUAW) is a common complication of critical illness characterized by decreased muscle mass and function that may persist for years after ICU discharge, with few patients regaining functional independence. Understanding the molecular mechanisms contributing to sustained ICUAW is necessary for the development of novel therapies. We aimed to identify miRs that are significant regulators of differentially expressed (DE) mRNAs and mRNA networks in ICUAW associated with muscle dysfunction.

Methods: From a cohort of critically ill patients, skeletal muscle strength, mass, and physical function were measured and whole-transcriptome miR and mRNA expression was assessed from quadriceps muscle biopsies at Day 7 and Month 6 post-ICU discharge. miR regulators of mRNA signatures associated with sustained muscle weakness following critical illness were detected using an analytic pipeline.

Results: Nineteen miRs were found to significantly regulate the differential mRNA signature at Day 7 post-ICU, with miR-424-5p regulating 24% of all DE mRNAs, suggesting its role as a master regulator of early ICUAW. The down-regulated mRNA network targeted by miR-424-5p was enriched for mRNAs related to skeletal muscle differentiation; mRNAs related to muscle function were also targeted by two other downregulated miRs (miR-206 and miR-29a). miR-424-5p and miR-206 were found to be upregulated during myogenic differentiation in vitro using C2C12 murine myoblasts. At Month 6 post-ICU, distinct miR expression signatures were found to separate ICUAW patients with significant improvement in muscle mass from those with little gain. Eight miRs in this signature were found to regulate more than 1% of the DE mRNA signature in patients who regained muscle mass over the 6-month period post-ICU discharge.

Conclusion: MicroRNA profiling identified key miRs involved in the regulation of muscle weakness at Day 7 and in the recovery of muscle mass at Month 6 post-ICU discharge.

4.2 Introduction

Intensive care unit-acquired weakness (ICUAW) describes a spectrum of muscle weakness that develops in the critically ill due to varied combinations of muscle denervation, impaired contractility, muscle proteolysis and limited regenerative capacity 302 . Patients with ICUAW often exhibit sustained muscle weakness resulting in permanent disability and loss of functional independence, even after resolution of critical illness and discharge from the ICU 103,303 . Heterogeneity of functional outcomes after critical illness has been observed in cohorts of ICUAW; within the first 3-6 months after ICU discharge, patients may have significant functional recovery before reaching a plateau at the first year following critical illness resolution 1,303 .

We have recently reported transcriptomic and histopathologic findings from quadriceps biopsies obtained during early and sustained ICUAW in a multi-center prospective longitudinal study of critically ill patients following prolonged mechanical ventilation (RECOVER program) 103,304 . Histopathologic stains of muscle biopsied from early and sustained ICUAW found decreased progenitor (satellite) cell content 103 . Co-expression analysis was used to detect groups of mRNAs (modules) with expression differences between ICUAW and healthy controls, correlated to measures of muscle mass, strength and/or physical function. We identified 11 modules satisfying these criteria, termed ICUAW-relevant modules (ICUAW-RMs); two of the modules contained mRNAs involved in skeletal muscle regeneration, one in early and the other in sustained ICUAW (measured at day 7 and month 6 post-ICU discharge, respectively). The early ICUAW-RM was externally validated using muscle transcriptomic data from a pig model of early ICUAW 288 and human muscle biopsies from a cohort of sepsis-induced multi-organ dysfunction 87 . These findings collectively suggested a link between muscle atrophy and impairment of muscle regenerative capacity 103 .

MicroRNAs (miRs) are small non-coding RNAs that modulate cell phenotypes by regulating the degradation and translation of sets of mRNAs in biological pathways. While miRs have primarily been found to mediate downregulation of mRNA expression and transcriptional repression, more recent studies have shown that miRs can upregulate mRNA expression by direct and indirect mechanisms 159,305 . miRs have been shown to play important roles in muscle repair and regeneration 156,306,307 , with distinctive miR expression patterns found in primary muscle disorders 308 . We hypothesized that alterations in miR expression mediate the aberrant expression of muscle regenerative genes in ICUAW-RMs. To test this hypothesis we obtained transcriptome-wide paired miR and mRNA expression profiles of muscle biopsy samples from ICUAW patients in the RECOVER cohort and healthy controls. We then implemented an analytic pipeline 309 to progressively filter miRs based on biologic relevance, including the degree of association with the expression of mRNA signatures within ICUAW subgroups. miRs having the highest association with mRNA signature expression are termed miR master regulators.

4.3 Methods

Patient selection, outcome measures and percutaneous muscle biopsy of the vastus lateralis were described previously in Chapter 3 304 . Written informed consent was obtained from all participants or their surrogate decision makers and participants were re-consented when capacity was regained. The study protocol was approved by the University Health Network Research Ethics Board and St. Michael’s Hospital Research Ethics Board. All methods were performed in accordance with the relevant guidelines and regulations within the study protocol. Percutaneous muscle biopsy of the vastus lateralis was performed at Day 7 post-ICU discharge (n = 14) and Month 6 post-ICU discharge (n = 10), referred to as “early” and “late” ICUAW, respectively. Two patient samples taken at Month 6 post-ICU discharge did not have miR expression profiling data available and were therefore excluded from the analysis. Healthy muscle biopsy samples (controls) were obtained from previously banked specimens collected from consenting individuals (n = 8). RNA was processed, amplified, and labeled as previously described 283 .

4.3.1 ICUAW Phenotyping

ICUAW patients had outcome measures of physical function, strength, and mass measured at Day 7 and Month 6 post-ICU discharge as described previously 304 . Briefly, physical functional capacity was measured using the motor component of the Functional Independence Measure questionnaire (FIM score) 23 , global muscle strength was assessed by the Medical Research Council Sum Score (MRCSS) 21 , and midthigh quadriceps femoris muscle cross-sectional area (CSA) was determined using computed tomography 278 . Patients at Month 6 post-ICU discharge having a greater than 10 cm^2 increase in muscle mass compared to Day 7, which resulted in normalization or near normalization of their quadriceps CSA compared to age and gender matched published population based norms, were defined as improvers; patients with a CSA increase of less than 10 cm^2 were defined as non-improvers.

4.3.2 miR and mRNA microarray pre-processing and quality control

RNA samples from muscle biopsies had high RNA quality (median RNA integrity number [RIN] of 8.5). RNA samples were labeled using the miRCURY LNA miR Hy3/Hy5 Hi-Power labeling Kit (Exiqon) and hybridized on the miRCURY LNA miR Array (Exiqon, 7th generation, containing 2042 H. sapiens miRs as annotated by miRBase 19, spotted in quadruplicate) according to the manufacturer’s guidelines. All capture probes for control spike-in oligonucleotides produced signals in the expected range. Data quality control included assessment of inter-array correlation (Pearson correlation coefficients >0.85) and detection of outlier arrays based on mean inter-array correlation and hierarchical clustering. All samples fulfilled data quality control criteria. Probes were processed, background corrected, and normalized using the MmPalateMiRNA package 310 . Probes with foreground intensity values above 1.3 times their background intensity level in at least 3 samples were retained for further analysis. After background correction, between-array normalization was performed using quantile normalization. Quadruplicate probes were averaged to a single expression value for each miR, resulting in 514 high-quality, robustly expressed miRs included for further analysis.

mRNA expression profiles were obtained using Illumina HT-12 v4 microarrays (1 microarray per sample); these data were previously published 304 and deposited in GEO under accession number GSE78929. All samples fulfilled data quality control criteria. Probes listed as “No match” or “Bad” using the illuminaHumanv4 package were removed, resulting in 34,476 high-quality probes. Robustly expressed probes were defined as those with detection p-value < 0.05 for at least half of the samples in the data set and standard deviation of probe expression >0.25 (11,482 probes). The probe-level measurements were then converted to mRNA-level measurements using the collapseRows function in WGCNA 135 with the “MaxMean” setting, yielding 9869 high-quality, robustly expressed unique mRNAs.
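The miR probe filter and between-array quantile normalization described above can be illustrated with a minimal Python sketch. This is not the MmPalateMiRNA code used in the study: the function names, the dictionary and list-of-rows data layout, and the handling of ties (broken by input order) are all assumptions made for illustration.

```python
from statistics import mean


def filter_probes(foreground, background, ratio=1.3, min_samples=3):
    """Keep probes whose foreground exceeds ratio * background in at least
    min_samples samples (hypothetical layout: dict of probe -> list of values)."""
    kept = {}
    for probe, fg_values in foreground.items():
        bg_values = background[probe]
        n_pass = sum(fg > ratio * bg for fg, bg in zip(fg_values, bg_values))
        if n_pass >= min_samples:
            kept[probe] = fg_values
    return kept


def quantile_normalize(matrix):
    """Quantile-normalize a probes x arrays matrix (list of rows).

    Each array (column) is sorted, the mean across arrays is taken at every
    rank, and each value is replaced by the mean for its rank.
    Ties are broken by row order, which is a simplification.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[matrix[r][c] for r in range(n_rows)] for c in range(n_cols)]
    sorted_columns = [sorted(col) for col in columns]
    rank_means = [mean(col[i] for col in sorted_columns) for i in range(n_rows)]
    normalized = [[0.0] * n_cols for _ in range(n_rows)]
    for c, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda r: col[r])
        for rank, r in enumerate(order):
            normalized[r][c] = rank_means[rank]
    return normalized
```

After normalization every array shares the same empirical distribution, which is what makes expression values comparable between arrays before replicate probes are averaged.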

4.3.3 Master miR-regulator analysis (MMRA)

MMRA is an analysis pipeline designed to detect DE miRs significantly contributing to the expression of target mRNAs in disease subgroups 309 . MMRA combines statistical tests, target prediction and unsupervised network analysis and was implemented using R code available at http://eda.polito.it/MMRA/ with the following modifications: 1) use of limma in place of the Kolmogorov-Smirnov test in step (i); 2) updated versions of target prediction databases in step (ii); 3) empirical significance thresholds for enrichment analysis in step (ii); and 4) an updated version of the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) in step (iii). The rationale for each modification is described below. All expression profiles analyzed in MMRA were adjusted for age and sex using a random-effects linear model in the lme4 package to account for intra-subject correlation between samples obtained longitudinally.

4.3.3.1 MMRA step (i) Differential expression (DE) analysis. Subgroups in this analysis consist of comparisons of ICUAW Day 7 versus controls, ICUAW Month 6 versus controls and improvers versus non-improvers. The DE mRNAs and miRs resulting from subgroup analysis are termed the “subgroup signature” for mRNAs and miRs, respectively. Differential expression of all robustly expressed miRs and mRNAs in ICUAW Day 7 and Month 6 post-ICU and control samples was assessed in limma using linear models adjusted for patient age and sex and for consensus correlation between patient samples using the duplicateCorrelation function 284 . Moderated F-statistics, which combine the t-statistics for all three pair-wise comparisons (contrasts) into an overall test of significance for each mRNA or miR, were used. The decideTests function with the “global” setting performed error rate control across multiple contrasts and miRs or mRNAs simultaneously. Differential expression of all robustly expressed miRs and mRNAs in ICUAW patients classified as improvers and non-improvers (as defined above) was assessed in limma using linear models adjusted for patient age and sex with pairwise testing using t-statistics. To detect DE miRs, the pre-specified significance threshold was set at Benjamini-Hochberg (BH) false discovery rate (FDR) < 5% and absolute value of non-log fold change (aFC) >1.5. Differentially expressed mRNAs were defined as aFC > 1.0 with FDR threshold < 20% for analysis of ICUAW at Day 7 and Month 6 post-ICU versus controls. For comparison of improvers versus non-improvers, due to the smaller size of these subgroups, an unadjusted p-value of 0.05 with aFC > 1.0 was defined as significant for DE mRNA analysis. For this analysis we chose more stringent FC and FDR thresholds to detect DE miRs than for the DE mRNA analysis, as most high-quality studies of miR expression changes in disease states report FC in the range of 1.5-4 fold in miR levels 311 . In contrast, mRNA network analysis has demonstrated that multiple co-expressed mRNAs, each having small effect (i.e. FC), function to modulate disease phenotype 131 , as we have previously demonstrated for this ICUAW mRNA expression data 304 . Diagnostic plots of the linear model fit (miR-wise or mRNA-wise residual standard deviations against average log-expression) were examined and no variance trends were identified. The limma package was used instead of the Kolmogorov-Smirnov test for DE analysis as originally implemented in MMRA 309 , as the former has been shown to be advantageous in experiments with smaller sample sizes 284 . Hierarchical agglomerative clustering using average linkage with the hclust() function was performed using the expression data from the differentially expressed miRs across all samples. The clustering analysis was performed on the columns (samples) and rows (differentially expressed miRs), with data visualized as a heatmap using the heatmap.3() function.
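The miR thresholding rule above (BH FDR < 5% and aFC > 1.5) can be sketched in Python as follows. This is an illustration only, not the limma workflow used in the study; it assumes that per-miR p-values and log2 fold changes have already been estimated, and it interprets aFC as 2 raised to the absolute log2 fold change, which is an assumption about the definition. All function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def absolute_fold_change(log2_fc):
    """Non-log fold change irrespective of direction (assumed definition of aFC)."""
    return 2 ** abs(log2_fc)


def call_de(log2_fcs, pvalues, fdr=0.05, min_afc=1.5):
    """Flag features passing both the FDR and the aFC threshold."""
    qvalues = benjamini_hochberg(pvalues)
    return [
        q < fdr and absolute_fold_change(lfc) > min_afc
        for lfc, q in zip(log2_fcs, qvalues)
    ]
```

Calling `call_de(log2_fcs, pvalues, fdr=0.20, min_afc=1.0)` would correspond to the more permissive mRNA thresholds under the same assumptions.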

4.3.3.2 MMRA step (ii) Target enrichment analysis: Target enrichment analysis was performed for each DE miR in a given subgroup comparison analysis (ICUAW at Day 7 and Month 6 post-ICU versus controls, improvers versus non-improvers) in order to assess overlap between its putative mRNA targets and the DE mRNAs in each subgroup comparison analysis. To generate a list of putative mRNA targets we combined the results of four prediction databases (DBs): TargetScan V7.0 (conserved and non-conserved, 2015) 177 , miRDB (2014) 169 , DIANA-microT v5.0 (2013) 189 , and PITA (2010) 186 using mirDIP 312 (to ensure consistent mRNA and miR names). To increase the stringency of predicted targets, we selected from PITA the targets designated “top scores”, and for the other 3 DBs we filtered out the bottom 50% of targets based on each of their prediction scores. We then added all experimentally validated miR-target pairs from TarBase V7.0 313 to our list of putative targets. miR-target pairs were selected for enrichment analysis if present in TarBase or if present in at least two prediction databases. For each DE miR identified in a given subgroup comparison analysis in step (i), enrichment of the putative miR targets among the up-regulated and down-regulated DE mRNAs in each subgroup comparison analysis was assessed using hypergeometric test p-values and observed/expected (O/E) ratios. The significance thresholds for miRs were a Bonferroni-adjusted p-value < 0.2 and an O/E ratio > 1.1; thresholds were selected empirically for enrichment analysis of predicted miR-target pairs.
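A minimal Python sketch of the enrichment statistics described above is given here for illustration. It assumes a one-sided (upper-tail) hypergeometric test and defines the expected overlap as (number of targets × number of DE mRNAs) / universe size; the function names and argument layout are hypothetical and the MMRA implementation may differ in detail.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def target_enrichment(universe_size, n_targets, n_de, n_overlap):
    """Enrichment of a miR's putative targets among DE mRNAs.

    universe_size: robustly expressed mRNAs tested
    n_targets:     putative targets of the miR within the universe
    n_de:          DE mRNAs (up- or down-regulated) in the subgroup
    n_overlap:     putative targets that are also DE
    """
    expected = n_targets * n_de / universe_size
    return {
        "p_value": hypergeom_upper_tail(n_overlap, universe_size, n_targets, n_de),
        "oe_ratio": n_overlap / expected,
    }


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)
```

Under these assumptions a miR would be retained when `bonferroni(p_value, n_tests) < 0.2` and `oe_ratio > 1.1`.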

4.3.3.3 MMRA step (iiia) Network analysis: Unsupervised network analysis was performed using the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) with Adaptive Partitioning strategy (AP) 205,206 to infer interactions between each DE miR selected in step (ii) and any of the robustly expressed mRNAs. ARACNe uses an unbiased approach without assuming the underlying network topology 203 to construct a network based on mutual information (MI), an information-theoretic measure of the mutual dependence between two variables, for each input miR and its putative mRNA targets. The ARACNe algorithm provides advantages over simple pairwise miR-mRNA MI analysis as it removes indirect candidate relationships using the data processing inequality and uses bootstrapping to improve the statistical significance of the network for limited sample sizes 314 . The ARACNe algorithm performs bootstrapping with randomly selected samples and builds a consensus network of edges (miR-mRNA interactions) by estimating the statistical significance of the number of times a specific edge is detected across all bootstrap runs. For each network, 100 bootstraps were performed and the MI p-value significance threshold (10⁻⁷) was selected based on a previously recommended threshold 314 .
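The pruning of indirect relationships by the data processing inequality can be illustrated with a short Python sketch. It assumes that pairwise MI values have already been estimated and stored in a dictionary keyed by node pairs; the `tolerance` parameter and the function name are hypothetical, and the sketch omits MI estimation, bootstrapping and consensus building, so it is not a substitute for ARACNe-AP.

```python
from itertools import combinations


def apply_dpi(mi, tolerance=0.0):
    """Return the edges retained after data processing inequality pruning.

    mi: dict mapping frozenset({node_a, node_b}) -> mutual information.
    Within every fully connected triplet, the weakest edge is treated as an
    indirect interaction and removed if it is strictly weaker than both
    other edges (after applying the tolerance).
    """
    nodes = sorted({node for edge in mi for node in edge})
    removed = set()
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if not all(edge in mi for edge in edges):
            continue  # DPI only applies to closed triplets
        weakest = min(edges, key=lambda edge: mi[edge])
        others = [mi[edge] for edge in edges if edge != weakest]
        if mi[weakest] < min(others) * (1 - tolerance):
            removed.add(weakest)
    return set(mi) - removed
```

For example, if a miR shares high MI with mRNA A, and A shares high MI with mRNA B, a weak miR-B edge is interpreted as an indirect effect mediated through A and is dropped.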

4.3.3.4 MMRA step (iiib) Master regulator analysis: The consensus networks constructed around each miR in step (iiia) were then tested for significant enrichment in DE mRNAs for each subgroup using the master regulator analysis (MRA) algorithm 315 , which computes the statistical significance (Fisher’s exact test p-value) of the overlap between the mRNAs within each network and the DE mRNAs of the subgroup in which the miR was identified as DE. Next, we tested the significance of the overlap between the mRNAs within each network and the 11 ICUAW-relevant modules identified in Walsh et al 304 . To determine a p-value threshold, a random null model was built consisting of networks constructed using miRs not DE in any subgroup (signal-to-noise ratio, corresponding to FC divided by within-group standard deviation, < -0.05). The threshold for the MRA p-value was estimated by comparing the p-values obtained in our analysis with those of the null model. We performed MRA on a null model built from 30 randomly selected non-DE miRs and tested for enrichment in DE mRNAs of all ICUAW subgroups as well as for enrichment in all ICUAW-relevant modules. The MRA p-value threshold of 10⁻⁴ was chosen for DE mRNAs, and 10⁻³ for ICUAW-relevant modules, corresponding to the 90th percentile of each of the null models.

4.3.3.5 MMRA step (iv) Stepwise linear regression (SLR) analysis: Next, SLR was used to filter out weak miR-mRNA relations within each network. SLR constructs a linear model with each mRNA target as the response variable and all miRs linked to the mRNA (by ARACNe-AP in step [iiia]) as explanatory variables. A stepwise algorithm is used to select the best minimal set of explanatory variables within the model. The Akaike information criterion (AIC) 316 was used as the stop criterion. The output of SLR analysis is then reorganized at the miR level to identify, for each miR, the DE mRNAs in each subgroup that are linearly associated with the miR. The degree of regulation by a miR for each ICUAW subgroup signature was defined as the percentage of DE mRNAs (up or down) in each subgroup identified by SLR (positive or negative coefficient). For the final output of the MMRA pipeline, miRs satisfying all of the above thresholds were selected (Table 4.1).
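The idea of stepwise selection with AIC as the stop criterion can be sketched in Python as below. The text does not state the search direction, so forward selection is assumed here; the Gaussian AIC is computed up to an additive constant as n·ln(RSS/n) + 2k, and all names are hypothetical. The sketch assumes non-collinear predictors and a non-zero residual sum of squares.

```python
from math import log


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def aic(y, predictors):
    """AIC (up to a constant) of an ordinary least squares fit with intercept."""
    n = len(y)
    design = [[1.0] + [p[i] for p in predictors] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(xtx, xty)
    rss = sum(
        (y[i] - sum(beta[a] * design[i][a] for a in range(k))) ** 2 for i in range(n)
    )
    return n * log(rss / n) + 2 * k


def forward_stepwise(y, candidates):
    """Greedy forward selection: add the miR that lowers AIC most, stop when none does.

    y:          expression of one mRNA target across samples
    candidates: dict of miR name -> expression across the same samples
    """
    selected = []
    best = aic(y, [])
    improved = True
    while improved:
        improved = False
        for name in candidates:
            if name in selected:
                continue
            score = aic(y, [candidates[s] for s in selected + [name]])
            if score < best:
                best, choice, improved = score, name, True
        if improved:
            selected.append(choice)
    return selected
```

In this simplified form, the miRs returned for a given mRNA are those retained as explanatory variables; reorganizing the results by miR then gives the set of mRNAs associated with each miR.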

4.3.4 miR-clinical variable relationships

Tukey’s biweight correlation was calculated between continuous clinical variables and miR expression values (unadjusted for age and sex). Significant correlation to clinical variables was empirically defined as R ≥0.5 and p-value < 0.05.
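For readers unfamiliar with this robust correlation measure, the following Python sketch computes the biweight midcorrelation using the standard formulation based on the median and the median absolute deviation (MAD). It is an illustration rather than the implementation used in this analysis, it does not compute the accompanying p-value, and the function names are hypothetical.

```python
from math import sqrt
from statistics import median


def _biweight_terms(values):
    """Median-centred, biweight-weighted and normalized values."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; biweight weights are undefined")
    terms = []
    for v in values:
        u = (v - med) / (9 * mad)
        # Observations far from the median (|u| >= 1) receive zero weight.
        weight = (1 - u * u) ** 2 if abs(u) < 1 else 0.0
        terms.append((v - med) * weight)
    norm = sqrt(sum(t * t for t in terms))
    return [t / norm for t in terms]


def bicor(x, y):
    """Biweight midcorrelation between two equally long sequences."""
    return sum(a * b for a, b in zip(_biweight_terms(x), _biweight_terms(y)))
```

Because outlying observations are down-weighted, a single extreme sample has less influence on the estimate than it would on a Pearson correlation, which is useful with small clinical sample sizes.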

4.3.5 Gene ontology

Functional enrichment of Gene Ontology (GO) was performed using the gProfiler tool 286 in R with FDR corrected p-value < 0.05 with gene set size range 10-300, and minimum of 5 mRNAs intersecting. The background list (“universe”) for the enrichment analysis included all mRNAs represented on the Illumina Human HT-12 v4 array with a detection p-value < 0.05 in at least three samples.

4.3.6 Transcription Factor binding site analysis

oPOSSUM version 3.0 287 (http://opossum.cisreg.ca/oPOSSUM3) was used to detect enrichment of human transcription factor binding sites (TFBSs) in the 5,000bp upstream and downstream sequence of input miRs (single site analysis; z score cutoff ≥10, Fisher exact test score [negative natural logarithm of the hypergeometric p-value] of ≥7, default values in oPOSSUM based on empirical studies, and conservation cutoff 0.75). Results were ranked by Fisher exact score. miR sequences were obtained using the biomaRt package to access the Ensembl GRCh37 database of miR annotations, which were input into GenomicRanges to select the genome sequences in the UCSC hg19 human genome within the BSgenome.Hsapiens.UCSC.hg19 dataset in Bioconductor.

4.3.7 In vitro miR qRT-PCR

4.3.7.1 C2C12 culture and RNA Isolation. C2C12 myoblasts were cultured in Greiner CELLSTAR® 12-well plates at a density of 15,000 cells/well as quantified by the Vi-CELL Series Cell Viability analyzer (Beckman Coulter) at total volume 1ml, grown in DMEM + 10% FBS + pen/strep at 5% CO2. Total RNA >20bp was extracted from C2C12 myoblasts in RNAse-free conditions using the Qiagen miRNEasy mini kit. To quantify miRNA concentration, we used the Agilent Bioanalyzer 2100 Small RNA kit (Agilent Technologies), which applies chip-based microfluidic electrophoresis to separate and detect RNA ranging from 0-150bp. The Agilent Bioanalyzer quantifies both total RNA and miR-specific concentration, the latter of which was used for subsequent cDNA synthesis.

4.3.7.2 Taqman qRT-PCR. Top ranking miR candidates were selected for in vitro study based on: correlation of miR expression with clinical measure of muscle mass (quadriceps cross-sectional area) and adjusted p-value; non-log fold change of expression in Day 7 vs control, or improvers vs non-improvers; and availability of murine Taqman miRNA assays for select miR candidates. Quantification of miR expression was performed using Taqman miRNA assays, including a miR-specific RT primer for cDNA synthesis, and PCR primers and TaqMan® MGB probes for subsequent qPCR. cDNA synthesis was conducted using the Taqman miRNA Reverse Transcription Kit, which utilizes a multiscribe reverse transcriptase (RT). For each sample, a No-RT control was included to ensure RT primer specificity and absence of genomic contamination. Subsequent Taqman qPCR was conducted using the QuantStudio7 Flex Real-Time PCR System (Thermo) according to the manufacturer’s recommended settings. Candidate miRNA expression was assessed using the 2^-ΔΔCt method and Gaussian propagation of error to determine fold change ± SD. miR expression was compared across timepoints and normalized to the geometric mean of two housekeeping small RNAs (RNU6b and snoRNA234) per the Vandesompele method 317 . RNU6b and snoRNA234 are established as stable reference RNAs in murine cell lines in proliferative conditions. Values are presented as fold changes.
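The fold-change calculation can be illustrated with a brief Python sketch of the 2^-ΔΔCt method. It assumes 100% amplification efficiency, so that normalizing to the geometric mean of the two reference RNA quantities is equivalent to subtracting the arithmetic mean of their Ct values. Error propagation is omitted, the function names are hypothetical, and the signed convention for downregulation is an assumption about how negative fold changes are reported.

```python
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """Ct of the target miR relative to the reference RNAs.

    Averaging reference Ct values corresponds to taking the geometric mean
    of the reference quantities (assuming 100% PCR efficiency).
    """
    return target_ct - mean(reference_cts)


def fold_change(target_ct_later, ref_cts_later, target_ct_earlier, ref_cts_earlier):
    """Non-log fold change between two timepoints by the 2^-ddCt method."""
    ddct = delta_ct(target_ct_later, ref_cts_later) - delta_ct(
        target_ct_earlier, ref_cts_earlier
    )
    return 2 ** (-ddct)


def signed_fold_change(fc):
    """Express downregulation as a negative fold change (e.g. 0.5 -> -2.0)."""
    return fc if fc >= 1 else -1 / fc
```

For example, a target whose Ct falls from 24 to 22 between timepoints while both reference RNAs stay at Ct 18 and 20 has ΔΔCt = -2 and therefore a 4-fold increase in expression.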

4.4 Results

We analyzed 30 vastus lateralis muscle tissue samples from 14 ICUAW patients (14 biopsies “early” at Day 7 post-ICU discharge, and 8 “late” follow-up biopsies at Month 6 post-ICU discharge) and 8 healthy controls using Exiqon and Illumina microarrays to quantify miRNA (miR) and mRNA probes, respectively. For the miR and mRNA probes that passed quality control, linear models were used to identify differentially expressed (DE) probes, adjusting for age, sex, and correlation between samples from the same patient (Methods Section 4.3). A total of 52 miRs were DE at ICUAW Day 7 post-ICU and 3 miRs were DE at ICUAW Month 6 post-ICU compared to healthy control samples, using a cut-off of false discovery rate (FDR) < 5% and absolute non-log fold change (aFC) > 1.5 (Supplementary Table 4.1a). A total of 29 miRs were downregulated at Day 7 post-ICU compared to controls and 1 miR was downregulated at Month 6 post-ICU. Twenty-three miRs were upregulated at Day 7 post-ICU compared to controls, whereas 2 were upregulated at Month 6 post-ICU (Supplementary Table 4.1a-c). The DE mRNA signatures for each subgroup used in this analysis are shown in Supplementary Table 4.1d-f. Hierarchical clustering performed using the DE miRs showed that skeletal muscle profiles from ICUAW patients were grouped largely based on disease status and time-point, as expected given that the differential expression analysis compared ICUAW at Day 7 and Month 6 versus controls (Figure 4.2a). The sample grouping using DE miRs in the clustering analysis was similar to that of the clustering analysis using DE mRNAs 304 , which also showed separation of Day 7 samples from controls and heterogeneity of Month 6 samples, which were interspersed among the control and Day 7 subgroups. This suggests a strong correlation between transcriptional changes in the subgroup signatures of mRNAs and miRs.
We hypothesized that the relative lack of DE miRs at Month 6 post-ICU is caused by differences in miR expression profiles between phenotypic subgroups of ICUAW at Month 6. We found that a subset of ICUAW patients at Month 6 had a significantly greater increase in quadriceps cross-sectional area (CSA) 278 (i.e. greater improvement in muscle mass reconstitution) than the other patients. The ICUAW patients at Month 6 were therefore classified into two phenotypic subgroups: patients having a greater than 10 cm² increase in CSA compared with their Day 7 measurement, and who normalized or near normalized their CSA compared to gender- and age-matched published population-based norms, were defined as improvers (3 patients), whereas patients with less than 10 cm² CSA increase were defined as non-improvers (5 patients). Forty-one miRs were upregulated in the improvers versus non-improvers, and 9 were downregulated, suggesting an association between degree of muscle mass recovery and miR expression profiles. The expression profiles of the miRs differentially expressed between improvers and non-improvers are visualized as a heatmap in Figure 4.2b. For each DE miR, we then tested whether the DE mRNA signatures in each subgroup were targets of the miR and whether the identified targets were significantly enriched. miR-target pairs were defined as those present in TarBase (database of experimentally validated targets) or in at least two of four prediction databases (Methods); 2,221,231 unique miR-target pairs fulfilled these criteria. Twenty-one of the 52 miRs were enriched for miR targets in the DE mRNA signatures at Day 7 post-ICU, 2 of 3 miRs were enriched at Month 6 post-ICU, and 32 of 50 miRs were enriched for improvers (Supplementary Table 4.2). For each miR enriched for DE targets, a network composed of a single-hub miR regulating mRNA targets was reconstructed from the expression of the miR and robustly expressed mRNAs using the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe). The complete set of mRNAs regulated by a miR is termed its “regulon” 315 .
The number of edges (miR-target interactions) in the obtained regulons varied from 62 to 353 mRNAs (average 187 mRNAs) and the mutual information (MI) values within each network ranged between 0.32 and 0.58 (Supplementary Table 4.3). The regulons of each miR were tested for enrichment of DE mRNAs; 19 of 21 miRs were enriched at Day 7 post-ICU, 0 of 2 miRs were enriched at Month 6 post-ICU, and 11 of 32 miRs were enriched for improvers (Supplementary Table 4.4). In the final step of the analysis, candidate miRs were further restricted to include only those whose expression was found to fit the expression of DE mRNAs in each regulon using stepwise linear regression analysis (SLR). A total of 27 miRs were found to fit the expression of subtype DE mRNAs in regulons according to SLR analysis (19 miRs DE at Day 7 and 8 miRs DE in improvers vs non-improvers). For the majority of miRs (16) the expression was opposite to that of the associated DE mRNA signature, whereas 14 miRs had concordant expression (Table 4.1). Five miRs were found to regulate both up- and downregulated mRNA signatures at Day 7 post-ICU (miR-424-5p, miR-3175, miR-29a-3p, miR-29b-3p, miR-638). The percentage of DE mRNA signatures in each subtype regulated by each miR was found to range from 1.1% (miR-4698) to 24% (miR-424-5p). Of the 11 ICUAW-relevant modules previously characterized 304 , 8 modules (M1, M2, M3, M4, M6, M7, M11, M13) were significantly enriched in at least one regulon (Supplementary Table 4.5 and Table 4.1); all 19 miRs DE at Day 7 were found to target regulons having significant enrichment for mRNAs in at least one of the 8 ICUAW-relevant modules, whereas only one of the 8 miRs DE in improvers (miR-4732-3p) had a target regulon enriched for ICUAW-relevant modules. Significant correlations between miR expression and clinical measures of strength, mass and function were found for the majority of miRs (Table 4.1). Significant enrichment of Gene Ontology terms was detected in the regulons controlled by sixteen miRs. The majority of enriched GO terms (11) were related to cellular respiration and 3 GO terms were related to muscle processes (striated muscle cell differentiation [p=3.42x10⁻³] for regulons controlled by miR-424-5p, muscle system process [p=5.29x10⁻³] for miR-206, and muscle contraction [p=6.27x10⁻³] for miR-29a-3p). Finally, we found significant enrichment of 6 transcription factor binding sites (TFBS) among upregulated miRs at Day 7 post-ICU and 13 TFBS among downregulated Day 7 post-ICU miRs. Five TFBS were common to both up- and downregulated miRs (PAX5, ZNF143, ESR2, ZFX, and DDIT3::CEBPA; Supplementary Table 4.6a,b). Remarkably, 23 TFBS were enriched among the upregulated miRs for improvers vs non-improvers (Supplementary Table 4.6c). Paired box 4 (PAX4) was the most significantly enriched TFBS (p-value=4.56x10⁻⁷²), followed by CCCTC-binding factor (CTCF, p-value=1.31x10⁻⁵¹). PAX4 has been recently shown to enhance protein degradation during atrophy induced by denervation or fasting 318 .
CTCF has been shown to modulate myogenesis by regulating muscle-specific mRNA expression in mice 319 . No significant TFBS were found for the downregulated miRs in improvers vs non-improvers.

To evaluate the influence on skeletal myogenic processes of the miRs identified as master regulators by our transcriptomic analysis, miR expression was measured in proliferating and differentiating C2C12 murine myoblasts in vitro using Taqman qRT-PCR. miR expression was measured at 6h post-plating, when C2C12s were considered to be at baseline growth conditions immediately upon adherence to the plate plastic, at 48h post-plating when highly proliferative, and at 7 days post-plating during active differentiation (Figure 4.3a). Of our highest ranking miRs for which murine Taqman miRNA assays were available, miR-23a-3p, miR-424-5p (miR-322-5p in mice) and miR-206, which we identified as master regulators of muscle mRNA expression at Day 7 post-ICU, were significantly upregulated during myogenic differentiation, but not during myoblast proliferation (Figure 4.3b). miR-23a-3p had a non-log fold change (FC) increase in expression of 2.89 between 48h-7d, but only 1.3 from 6h-48h, while miR-322/424-5p had a 3.29-FC upregulation from 48h-7d, but interestingly was significantly downregulated (-1.15 FC) during proliferation. miR-206 had a FC of 1.47 during proliferation vs 58.08 during differentiation, confirming its established role as an inducer of myogenic differentiation 158 . miR-487b, significantly downregulated in improvers vs non-improvers, had no change in expression during proliferation compared to baseline (1.01 FC), but was significantly downregulated during myoblast differentiation (-1.42 FC), which supports previous research identifying a role for miR-487b in delaying myogenic differentiation 320 . Furthermore, miR-29b-3p was significantly upregulated during proliferation only (FC 2.39) and miR-542-3p was upregulated during both proliferation (FC 1.57) and differentiation (FC 2.09) (Figure 4.3b).

4.5 Discussion

The RECOVER cohort is the first to have longitudinal assessments integrating clinical measures of muscle mass and strength with transcriptomic profiles of skeletal muscle in ICUAW 103,304 . We have previously identified temporally distinct alterations in co-expressed mRNA modules in early and sustained ICUAW, with significant correlation between muscle strength and mass and module expression. These findings suggested an association between the disease-perturbed modules and phenotypic changes. The aim of the current study was to identify miRs regulating ICUAW-relevant modules (ICUAW-RMs) and differentially expressed mRNA signatures. In contrast to other studies of miR expression in ICUAW 164,321 , our analysis did not start with an a priori candidate set of potential miRs and mRNA targets, but instead began with transcriptome-wide miR and mRNA expression. While numerous methods have been proposed to discover miR-target interactions using paired miR-mRNA expression profiling 322 , we implemented an innovative data-driven analytic pipeline to identify miRs that best ‘explained’ the mRNA signatures and modules 309 . This pipeline, MiR Master Regulator Analysis (MMRA), integrates statistics and unsupervised network theory to identify mRNA-miR regulatory networks, each containing the set of mRNAs regulated by a miR, termed its regulon. Concordant with our findings from ICUAW-RMs, a number of regulons contained mRNAs involved in skeletal muscle differentiation and cellular respiration, suggesting that miRs have critical roles in bioenergetic and muscle regeneration dysfunction in ICUAW. We found significant overlap between mRNAs within regulons and the majority of ICUAW-RMs. Interestingly, the majority of driver miRs at Day 7 targeted multiple, overlapping ICUAW-RMs; this combinatorial miR regulation is characteristic of the global architecture of mammalian miR regulatory networks 323 . miR-206, miR-29a-3p and miR-424-5p were identified as master regulators of mRNA networks enriched for muscle processes at Day 7 post-ICU. All 3 miRs were downregulated at Day 7, with miR-424-5p targeting 30% of the differentially expressed mRNAs at this time-point. All 3 miRs have been identified previously in animal models as playing critical roles in muscle differentiation. miR-206, a well characterized “myo-miR” 157 expressed only in skeletal muscle, has been shown to play an important regulatory role in muscle differentiation 158 and found to be highly expressed in satellite cells and regenerating muscle fibres after injury 306,324 . Mice lacking miR-206 have been shown to have inefficient skeletal muscle regeneration in response to cardiotoxin injury 306 . The mouse homologue of miR-424 (miR-322) has also been shown to be induced during muscle differentiation and to block cell cycle progression during myogenesis 156 . Our in vitro findings validated the key roles of miR-206 and miR-424-5p in myogenic differentiation and implicate down-regulation of these miRs in early ICUAW as a critical factor in muscle impairment. Mouse models of muscular dystrophy and CKD have shown downregulation of both miR-29a and miR-29b in muscle 325,326 .
Increasing miR-29 was found to improve differentiation of myoblasts into myotubes 325 . In contrast, Li et al 162 , using mouse and rat atrophy models (denervation, dexamethasone, fasting, cancer and aging), found upregulation of miR-29b expression. Overexpression promoted muscle atrophy while its inhibition attenuated atrophy in their models. These findings suggest that miR-29b downregulation in early ICUAW may function as a compensatory process.

ICUAW patients with significant muscle mass increase had a distinct signature of differentially expressed miRs at Month 6 post-ICU compared to those with minimal muscle gain. Eight upregulated miRs (miR-4732-3p, miR-490-3p, miR-4762-5p, miR-4279, miR-4473, miR-642b-5p, miR-4421, miR-4698) were identified as regulators of the mRNA signatures in these patients. While no studies characterizing the roles of these miRs in muscle are currently available, we speculate that these miRs may also contribute to muscle regeneration in phenotypes of ICUAW with increased muscle mass reconstitution. Our in vitro study was limited to those miRs for which murine Taqman assays were available; thus the miRs upregulated in patients with greater increase in muscle mass compared to those without could not be tested in our assay.

The majority of miR regulators at Day 7 post-ICU targeted mRNA networks related to mitochondrial function. This is concordant with studies demonstrating decreased mitochondrial content or bioenergetic dysfunction in skeletal muscle in early ICUAW. Quadriceps muscle biopsied from ICUAW patients (at mean length of stay ~ 50 days) had a nearly 50% reduction of ATP synthesis in mitochondria compared to ambulatory controls 327 . Histologic analysis of muscle biopsies from our patient cohort showed decreased mitochondrial content at Day 7 post-ICU 103 . Garros et al 164 identified elevated miR-542-3p/5p from a subset of a priori candidate miRs in a cohort of ICUAW patients after aortic surgery, and miR-542 overexpression was shown to promote mitochondrial dysfunction in the muscles of mice. Interestingly, miR-542 was downregulated at Day 7 post-ICU in our cohort, possibly reflecting dynamic changes in miR expression over the different time-points for muscle biopsy between the two ICUAW cohorts. Both miRs and transcription factors (TF) regulate differential mRNA expression throughout the transcriptome and are known to interact with each other in the context of regulatory networks.
We found that the miR drivers upregulated in improvers compared to non-improvers had binding sites for TFs known to play important roles in skeletal muscle, and that several TFBS enriched among DE miRs (ZFX, RORA, TAL1::GATA1, INSM1) were also enriched in ICUAW-RMs for muscle structure development genes, reflecting common transcriptional controls for mRNA and miR expression in ICUAW. Importantly, we identified miRs regulating expression of mRNAs with biologically meaningful enrichment in the majority of targeted mRNA networks, thereby providing novel insights into the regulatory mechanisms underlying the pathobiology of ICUAW. These findings provide a framework to study miR drivers of muscle pathology in early and sustained ICUAW, which we anticipate will identify key therapeutic targets to promote ICUAW recovery in the future.


Figure 4.1: Workflow of Master MicroRNA Regulator Analysis (MMRA) pipeline 309 adapted based on this study’s research objectives. The figure indicates the procedure performed in each of the four analytic steps (middle) using data required as initial input (left) and/or the outputs of preceding analytic steps (right). Abbreviations: DBs, databases; MiR, microRNA; MRN, miR regulatory network; ICUAW, ICU acquired weakness; DE, differential expression.


Figure 4.2: Expression patterns of differentially expressed (DE) microRNAs (miRs) in ICUAW. a) Heatmap of 51 miRs differentially expressed between ICUAW at Day 7 and ICUAW at Month 6 post-ICU versus healthy controls. The top bars indicate patient variables: groups (purple, ICUAW Day 7; pink, ICUAW Month 6; grey, control), age and sex (values are color coded according to the respective legend to the left of the heatmap). Below the heatmap is a Venn diagram of differentially expressed probes in ICUAW Day 7 post-ICU (left) and ICUAW Month 6 post-ICU (right). The numbers of overlapping mRNAs shared between Day 7 and Month 6 are shown in the four squares within the yellow diamond; the numbers of mRNAs exclusively differentially expressed in ICUAW Day 7 (left) or ICUAW Month 6 post-ICU (right) are shown in the 4 squares outside the yellow diamond. b) Heatmap of 50 miRs DE between ICUAW ‘improvers’ and ‘non-improvers’ at Month 6 post-ICU. The top bars indicate patient variables: groups (red, non-improvers; green, improvers), age and sex (values are color coded according to the respective legend to the far left of the heatmap as shown in 1a). The numbers of downregulated and upregulated probes are shown below the heatmap. Differential expression was assessed at false discovery rate (FDR) < 0.05 and non-log fold change > 1.5. Scaled expression values are color coded according to the legend to the far left of the heatmaps.


Figure 4.3: Relative mmu-miRNA expression during C2C12 myoblast proliferation and differentiation. a) Phase-contrast microscopic images of C2C12 murine myoblasts taken during baseline conditions (6 hours post-plating), high proliferation (48 hours post-plating) and active differentiation (7 days post-plating). Central images were taken at 10X magnification, while top right images were taken at 20X magnification. b) Bar graph depicting non-log fold change (FC) of Mus musculus (mmu) miRNA expression during proliferation (6h-48h) and differentiation (48h-7d) of C2C12 murine myoblasts. miRNA expression is represented as the fold change between timepoints relative to the geometric mean of two stable reference RNAs, snoRNA234 and RNU6b. Error bars depict standard deviation as calculated by Gaussian propagation of error. The dotted line at FC=1 represents no change between timepoints relative to the geometric mean of the reference RNAs.

Table 4.1 MicroRNAs (miRs) with differential expression in ICUAW identified as master regulators of mRNA target expression. The table reports the miR master regulator analysis (MMRA) pipeline output with miRs ranked based on percentage of signature mRNAs putatively targeted by the miR (fourth column). The first column reports the identified miRs, the second column the subtype in which the miR is differentially expressed (down = downregulated, up = upregulated). The third column reports the mRNA signature associated with each miR. The fourth column reports the percentage of signature mRNAs whose expression is explained by the miR expression in step-wise linear regression (SLR) analysis. The fifth, sixth, and seventh columns show the Spearman correlation and p-value between miR expression and muscle strength, mass, and function, respectively. The eighth column lists the co-expression modules identified by Walsh et al 304 that are significantly enriched in the regulon targeted by the miR. Abbreviations: ICUAW = ICU acquired weakness; D7 = Day 7 post-ICU discharge; SLR = step-wise linear regression; GO terms = Gene Ontology terms; NA = no significantly enriched mRNA sets.
miR miR mRNA targets SLR -estimated Association with phenotypes ICUAW modules GO terms (FDR p -value) association association association (%) regulated with ICUAW with ICUAW Strength Mass Function hsa-mir-424-5p Down in D7 Up in D7 13.3 0.75 (7.36e -03) 0.86 (7.72e -04) 0.92 (5.66e -05) M1, M2, M3, M4, M7 NA hsa-mir-4780 Down in D7 Up in D7 10.3 0.75 (7.36e-03) 0.65 (2.99e-02) 0.76 (6.30e-03) M1, M2, M3, M4 GO:0045333: cellular respiration (6.9x10 -3) GO:0051146:striated muscle cell differentiation hsa-mir-424-5p Down in D7 Down in D7 10.0 0.75 (7.36e-03) 0.86 (7.72e-04) 0.92 (5.66e-05) M1, M2, M3, M4, M7 (3.4x10 -3) hsa-mir-3622a-3p Up in D7 Down in D7 8.3 -0.65 (2.95e-02) -0.77 (5.34e-03) -0.89 (2.12e-04) M1, M2, M3, M4 GO:0045333: cellular respiration (8.6x10 -4) hsa-mir-3175 Down in D7 Up in D7 6.6 0.83 (1.57e-03) 0.75 (7.39e-03) 0.88 (3.05e-04) M1, M2, M3, M4 NA hsa-mir-206 Down in D7 Up in D7 6.2 0.8 (2.82e-03) 0.83 (1.64e-03) 0.81 (2.53e-03) M2, M11 GO:0003012: muscle system process (5.3x10 -3) hsa-mir-3175 Down in D7 Down in D7 6.2 0.83 (1.57e-03) 0.75 (7.39e-03) 0.88 (3.05e-04) M1, M2, M3, M4 GO:0045333: cellular respiration (1.3x10 -3) hsa-mir-600 Up in D7 Up in D7 6.1 -0.77 (5.92e-03) -0.74 (9.95e-03) -0.78 (4.49e-03) M1, M3, M4, M13 GO:0045333 cellular respiration (6.2x10 -17 ) hsa-mir-23a-3p Down in D7 Up in D7 6.1 0.82 (2.13e-03) 0.86 (7.72e-04) 0.8 (3.10e-03) M1, M2, M3 NA hsa-mir-502-3p Down in D7 Up in D7 5.5 0.75 (8.17e-03) 0.79 (4.11e-03) 0.67 (2.55e-02) M1, M2, M3 NA hsa-mir-3136-3p Up in D7 Up in D7 5.5 -0.58 (5.99e-02) -0.56 (7.14e-02) -0.6 (5.09e-02) M1, M2, M3, M4 GO:0045333: cellular respiration (5.3x10 -12 ) hsa-mir-4795-5p Up in D7 Up in D7 5.4 -0.72 (1.32e-02) -0.51 (1.08e-01) -0.78 (4.49e-03) M1, M2, M3, M4 GO:0045333 cellular respiration (2.9x10 -5) hsa-mir-29b-3p Down in D7 Up in D7 4.0 0.44 (1.79e-01) 0.53 (9.33e-02) 0.65 (2.99e-02) M1, M2, M3, M4, M7 NA hsa-mir-638 Down in D7 Down in D7 3.8 0.8 (2.82e-03) 0.87 (4.98e-04) 0.68 
(2.03e-02) M1, M2, M3, M4 GO:0071294: cellular response to zinc ion (0.0101) hsa-mir-502-3p Down in D7 Down in D7 3.7 0.75 (8.17e-03) 0.79 (4.11e-03) 0.67 (2.55e-02) M1, M2, M3 NA hsa-mir-29a-3p Down in D7 Up in D7 3.5 0.56 (7.08e-02) 0.62 (4.04e-02) 0.61 (4.44e-02) M1, M2, M3, M4, M7 NA hsa-mir-29a-3p Down in D7 Down in D7 3.5 0.56 (7.08e-02) 0.62 (4.04e-02) 0.61 (4.44e-02) M1, M2, M3, M4, M7 GO:0006936: muscle contraction (6.3x10 -3) GO:0006122: mitochondrial electron transport hsa-mir-3133 Up in D7 Up in D7 3.2 -0.54 (8.74e-02) -0.61 (4.65e-02) -0.5 (1.19e-01) M2, M4 (4.42x10 -4) hsa-mir-29b-3p Down in D7 Down in D7 3.1 0.44 (1.79e-01) 0.53 (9.33e-02) 0.65 (2.99e-02) M1, M2, M3, M4, M7 NA hsa-mir-4488 Down in D7 Up in D7 3.1 0.83 (1.57e-03) 0.91 (9.22e-05) 0.75 (7.39e-03) M1, M3, M6, M11 NA Up in M1 NA hsa-mir-4732-3p Improvers Up in Improvers 2.6 0.76 (2.74e-02) 0.74 (3.75e-02) 0.67 (6.91e-02) hsa-mir-638 Down in D7 Up in D7 2.6 0.8 (2.82e-03) 0.87 (4.98e-04) 0.68 (2.03e-02) M1, M2, M3, M4 NA Up in hsa-mir-490-3p Improvers Up in Improvers 2.4 0.79 (1.94e-02) 0.85 (7.54e-03) 0.77 (2.59e-02) NA NA 97 hsa-mir-5704 Up in D7 Up in D7 2.3 -0.48 (1.34e-01) -0.32 (3.43e-01) -0.29 (3.90e-01) M1, M3, M4 GO:0045333: cellular respiration (2.61x10 -10 ) GO:0018105: peptidyl- phosphorylation hsa-mir-4516 Down in D7 Up in D7 2.1 0.8 (2.82e-03) 0.83 (1.46e-03) 0.69 (1.80e-02) M1, M2, M3, M4 (0.0149) hsa-mir-4764-3p Up in D7 Down in D7 2.1 -0.44 (1.72e-01) -0.25 (4.56e-01) -0.16 (6.42e-01) M1, M3, M4 GO:0045333: cellular respiration (9.7x10-12 ) Up in Down in NA GO:1903749: positive regulation of establishment of hsa-mir-490-3p Improvers Improvers 1.8 0.79 (1.94e-02) 0.85 (7.54e-03) 0.77 (2.59e-02) protein localization to mitochondrion (1.4 x 10 -2) Up in Down in NA NA hsa-mir-4762-5p Improvers Improvers 1.5 0.74 (3.72e-02) 0.6 (1.19e-01) 0.3 (4.77e-01) hsa-mir-551a Up in D7 Down in D7 1.5 -0.59 (5.65e-02) -0.76 (6.30e-03) -0.68 (2.15e-02) M2, M4 NA Up in Down in NA NA 
hsa-mir-4279 Improvers Improvers 1.4 0.68 (6.25e-02) 0.76 (2.83e-02) 0.59 (1.23e-01) Up in NA GO:0038128: ERBB2 signaling pathway (3.1x10 -3) hsa-mir-4473 Improvers Up in Improvers 1.2 0.3 (4.70e-01) 0.23 (5.87e-01) -0.1 (8.16e-01) Up in Down in NA NA hsa-mir-642b-5p improvers improvers 1.2 0.68 (6.25e-02) 0.76 (2.83e-02) 0.59 (1.23e-01) Up in Down in NA NA hsa-mir-4421 improvers Improvers 1.2 0.85 (8.17e-03) 0.68 (6.09e-02) 0.47 (2.37e-01) Up in Down in NA NA hsa-mir-4698 improvers improvers 1.1 0.79 (1.94e -02) 0.72 (4.27e -02) 0.57 (1.39e -01)


CHAPTER 5

5.0 Multi-cohort transcriptional meta-analysis of muscle diseases

5.1 Abstract

Introduction and Objective: Muscle diseases share common pathological features, suggesting common underlying mechanisms. We hypothesized that there is a common set of genes dysregulated across muscle diseases compared to healthy muscle and that these genes correlate with severity of muscle disease. We performed meta-analysis of transcriptional profiles from muscle biopsies from human muscle diseases and healthy controls.

Methods: Studies obtained from public microarray repositories fulfilling quality criteria were divided into six categories: i) immobility, ii) inflammatory myopathies, iii) ICU acquired weakness (ICUAW), iv) congenital muscle diseases, v) chronic systemic diseases, vi) motor neuron disease. Patient cohorts were separated into discovery and validation cohorts retaining roughly equal proportions of samples for the disease categories. To remove bias towards a specific muscle disease category we repeated the meta-analysis five times, removing the data sets corresponding to one muscle disease category at a time in a “leave-one-disease-out” analysis.

Results: Using 490 muscle tissue samples from 22 cohorts we identified 98 up-regulated and 33 down-regulated genes, which were validated in 17 additional patient cohorts (689 samples) encompassing five categories of muscle diseases. This gene signature inversely correlated with muscle mass in ICUAW (r = -0.70, p-value = 8.62 x 10^-6) and inversely correlated with motor functional capacity (r = -0.54, p-value = 1.67 x 10^-3). The signature also correlated with histologic assessment of muscle atrophy in amyotrophic lateral sclerosis (ALS) (r = 0.84, p-value = 5.10 x 10^-3) and inversely correlated with shoulder abduction strength (r = -0.74, p-value = 0.022) in ALS. Finally, removing the genes common to muscle disease from a meta-analysis of only ICUAW cohorts revealed uniquely down-regulated muscle development and contraction genes specific to ICUAW.

Conclusions: Our results identify a conserved muscle disease transcriptional signature associated with clinical and histologic disease severity, and identify numerous novel genes associated with muscle disease severity.

5.2 Introduction

Skeletal muscle diseases are associated with decreased skeletal muscle mass and function, resulting in impaired physical function, prolonged hospitalization and increased mortality. Understanding the pathomolecular mechanisms conserved across muscle diseases may provide vital insight to help develop therapies to ameliorate them. Among muscle diseases, the common themes of mitochondrial dysfunction, myofibril protein breakdown and force generation defects have been identified 328,329. It remains difficult to determine whether pathomechanisms studied in animal models translate to human muscle diseases. A crucial challenge in studying human muscle diseases remains linking individual responses to muscle dysfunction for a given severity of disease at the level of the transcriptome. A growing number of studies of human muscle diseases have identified dysregulated gene expression that is associated with muscle disease severity. However, these studies are usually limited by relatively small sample sizes without external validation from independent samples 330. The vast quantity of expression profiling data in the public repositories Gene Expression Omnibus (GEO) and ArrayExpress facilitates comprehensive integration of human muscle disease cohorts for meta-analysis. We have applied a multi-cohort analysis framework that leverages the biological and technical heterogeneity across independent data sets to identify reproducible disease gene signatures 249,270. This approach has successfully discovered robust signatures in organ rejection 249 and neurodegenerative diseases 270. While one recent paper examined gene co-expression across various muscle diseases on a single microarray platform 331, no systematic multi-cohort analysis has investigated transcriptional changes across human muscle diseases. We applied our meta-analysis approach to analyze publicly available gene expression datasets of peripheral muscle tissue for muscle diseases.

We hypothesized that convergent transcriptional abnormalities occur across muscle diseases regardless of the specific muscle pathophysiology. We identified a conserved signature of muscle disease applicable even to cerebral palsy (CP) and amyotrophic lateral sclerosis (ALS), diseases with a muscle phenotype that were not included in the original meta-analysis. We found that the muscle disease gene signature is significantly associated with clinical and histological muscle disease severity in independent validation cohorts. Finally, we identified patterns of gene dysregulation unique to each muscle disease category relative to the others.

5.3 Methods

5.3.1 Data collection and pre-processing

Two public gene expression microarray repositories (ArrayExpress, NIH GEO) were searched (search date: Dec 12, 2017) for human muscle disease datasets using the search terms “skeletal muscle”, “myopathy”, “dystrophy”, “cachexia”, “immobility”, “unloading”, “disuse”, “bed rest”, “paralysis”, “spinal cord injury (SCI)”, “morbid obesity”, “critical illness”, “ICU”, “sepsis”, “myositis”, “dermatomyositis”, “polymyositis”, “inclusion body myositis”, “chronic obstructive pulmonary disease (COPD)”, “cancer”, “amyotrophic lateral sclerosis (ALS)”, “hereditary spastic paraplegia (HSP)”, “primary lateral sclerosis (PLS)”, “progressive muscular atrophy (PMA)”, “spinal muscular atrophy (SMA)”, “motor neuron disease”, “mitochondrial”, “symptom”, “weakness”, “muscle mass”. We first identified data sets that satisfied the following criteria: (1) samples were from human peripheral muscle tissue, (2) data were originally acquired using a genome-wide gene expression microarray platform with probes representing > 10,000 genes, (3) the microarray platform had reasonably accessible and clear probe-to-gene mapping annotations, (4) there were >= 5 cases and >= 5 controls in total for the relevant patient cohort in each data set, and (5) the controls were taken from healthy muscle tissue. Samples that were taken after intervention (e.g. after leg casting or gastric bypass surgery) were excluded. Some data sets included more than one disease category; we therefore refer to each disease-specific group and its respective control group as a patient cohort. Two patient cohorts (GSE36398 and GSE38680) were each divided into two separate cohorts (designated by the suffixes “a” and “b”) due to batch effects identified during the preliminary analysis. Datasets containing a subset of samples included in a larger dataset were removed from further analysis to avoid redundant samples. We identified a total of 42 cohorts containing 1247 samples from 37 independent data sets that satisfied these criteria.
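The cohort-level inclusion criteria above can be expressed as a simple filter. The following is a minimal illustrative sketch in Python (the thesis analyses were run in R); the `Cohort` record, its field names and the example accessions are hypothetical and are not taken from the original analysis.

```python
from dataclasses import dataclass


@dataclass
class Cohort:
    """One disease group plus its controls. All field names are hypothetical."""
    accession: str
    human_peripheral_muscle: bool   # criterion (1)
    genes_on_platform: int          # criterion (2)
    clear_probe_annotation: bool    # criterion (3)
    n_cases: int                    # criterion (4)
    n_controls: int                 # criterion (4)
    healthy_controls: bool          # criterion (5)


def meets_inclusion_criteria(cohort: Cohort) -> bool:
    """Return True only if the cohort satisfies criteria (1) to (5)."""
    return (
        cohort.human_peripheral_muscle
        and cohort.genes_on_platform > 10_000
        and cohort.clear_probe_annotation
        and cohort.n_cases >= 5
        and cohort.n_controls >= 5
        and cohort.healthy_controls
    )
```

Sample-level exclusions (for example, biopsies taken after an intervention) would be applied before counting cases and controls; that step is not shown here.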

We divided the sample cohorts into 6 disease categories for analysis: 1) inflammatory myopathies (IM), 2) ICU acquired weakness (ICUAW), 3) congenital muscle diseases (CMD), 4) chronic systemic disease affecting muscle (CSM), 5) disuse and immobility (DI), and 6) motor neuron disease (MND). Next we divided the patient cohorts into a discovery cohort for the initial meta-analysis and a validation cohort for the independent validation analysis. For the discovery cohort we ensured that there were at least three cohorts for each disease category that met our selection criteria. As there were only two cohorts for the MND category, this was not included in the discovery cohort as a disease category; instead, the two MND cohorts were included in the validation cohort. We chose 22 patient cohorts containing 490 samples to include in the discovery cohort. For the validation cohorts, we used 17 patient cohorts containing 689 samples. The GEO accession numbers (or ArrayExpress accession if unavailable on GEO) of the data sets used in our analysis are summarized in Table 5.1. Each of the studies is summarized in Supplemental Section 5.1. To ensure similar normalization methods, all microarray data were renormalized from raw data (when available) using standardized methods. Affymetrix arrays were normalized using GC robust multiarray average (gcRMA) on arrays with mismatch probes, or RMA otherwise (R package affy). Gene detection (presence/absence calls) on Affymetrix HG-U133 series microarray data was performed using default parameters (R package panp). For all non-Affymetrix arrays, we downloaded data in non-normalized form, background corrected using the normal-exponential method, and then quantile normalized (R package limma). All probe-to-gene mappings were derived from the most current SOFT files in GEO (downloaded on Dec 12, 2017). All microarray data were log-2 transformed and probes were summarized to genes within datasets using a fixed-effect inverse variance model.
We restricted the gene lists to genes that were measured in all data sets.
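As an illustration of the last two steps, the sketch below shows one common form of fixed-effect inverse-variance pooling applied to probe-level estimates that map to the same gene, followed by restriction to genes measured in every data set. It is written in Python for illustration only (the thesis analyses were run in R), and it assumes that each probe contributes an effect estimate and a variance; the function names are hypothetical.

```python
def summarize_probes(effects, variances):
    """Pool probe-level estimates for one gene with inverse-variance weights.

    Returns (pooled_effect, pooled_variance). Assumes all variances are > 0.
    """
    if not effects or len(effects) != len(variances):
        raise ValueError("need one variance per probe-level effect")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    return pooled, 1.0 / total_weight


def genes_measured_everywhere(gene_lists):
    """Keep only genes present in every data set (needs at least one list)."""
    return set.intersection(*(set(genes) for genes in gene_lists))
```

Probes with smaller variance receive proportionally more weight, so a noisy probe has less influence on the gene-level summary than a precise one.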

5.3.2 Meta-analysis

Multicohort meta-analysis of gene expression was performed by combining gene effect sizes (Hedges’ g) using a DerSimonian-Laird random-effects model (using R package MetaIntegrator) 269, with correction for multiple testing by false discovery rate (FDR) via the Benjamini-Hochberg method. We set the significance threshold for differential expression at FDR less than 5% and effect size greater than 0.3 (non-log fold change of greater than 1.23). These thresholds were selected as they were previously used in a multi-cohort transcriptional meta-analysis of neurodegenerative diseases 270.
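The three statistical ingredients named above can be sketched as follows. This is an illustrative Python sketch using standard textbook formulas, not the MetaIntegrator implementation used in the thesis; the function names are hypothetical, and the small-sample correction and variance expressions are assumed to be the usual ones.

```python
import math


def hedges_g(mean_case, sd_case, n_case, mean_ctrl, sd_ctrl, n_ctrl):
    """Bias-corrected standardized mean difference and its variance."""
    df = n_case + n_ctrl - 2
    pooled_sd = math.sqrt(((n_case - 1) * sd_case ** 2
                           + (n_ctrl - 1) * sd_ctrl ** 2) / df)
    d = (mean_case - mean_ctrl) / pooled_sd
    j = 1.0 - 3.0 / (4.0 * df - 1.0)  # small-sample correction factor
    var_d = ((n_case + n_ctrl) / (n_case * n_ctrl)
             + d ** 2 / (2.0 * (n_case + n_ctrl)))
    return j * d, j ** 2 * var_d


def dersimonian_laird(effects, variances):
    """Random-effects pooled effect, its standard error, and tau^2."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-cohort variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, math.sqrt(1.0 / sum(w_star)), tau2


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Under this scheme a gene would be called differentially expressed when its adjusted p-value is below 0.05 and the absolute pooled effect size exceeds 0.3.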

We generated effect size plots to compare the effect size distributions in each cohort in the discovery and validation sets (Supplementary Figures 5.1A and 5.1B). To ensure that our meta-analysis was not biased towards a specific muscle disease category, we repeated our meta-analysis 5 times, removing the data sets corresponding to one disease category at a time (e.g. in the first iteration, ICU-acquired weakness data sets were removed, and the meta-analysis was completed on the combined remaining data sets). At each iteration, we identified significant DE genes (FDR <= 5%). Genes that were significant irrespective of which muscle disease category was removed formed the pre-validation common muscle disease module (CMDM). My collaborator, Dr. Purvesh Khatri, has previously shown the utility of the leave-one-disease-out approach in identifying a robust gene expression signature during acute rejection across different transplanted solid organs 249 and across neurodegenerative diseases 270.
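The leave-one-disease-out procedure described above amounts to intersecting the significant gene sets from each iteration. The sketch below is an illustrative Python outline (the thesis analyses were run in R); `leave_one_disease_out` and the `find_significant_genes` callback are hypothetical names, and the callback stands in for whatever meta-analysis routine returns the significant genes for a list of cohorts.

```python
def leave_one_disease_out(cohorts_by_category, find_significant_genes):
    """Return genes that stay significant whichever category is held out.

    cohorts_by_category: dict mapping a disease category to its cohorts.
    find_significant_genes: callable taking a list of cohorts and returning
        an iterable of significant gene identifiers (hypothetical interface).
    """
    surviving = None
    for held_out in cohorts_by_category:
        remaining = [
            cohort
            for category, cohorts in cohorts_by_category.items()
            if category != held_out
            for cohort in cohorts
        ]
        significant = set(find_significant_genes(remaining))
        # Keep only genes significant in every iteration so far.
        surviving = significant if surviving is None else surviving & significant
    return surviving or set()
```

A gene driven by a single disease category drops out as soon as that category is held out, which is the property the approach relies on.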

5.3.3 Gene ontology functional analysis

Gene Set Enrichment Analysis (GSEA) was used to identify the enrichment of pre-established curated gene sets in Gene Ontology (GO) without arbitrary thresholds for significance. To identify functional themes across all muscle disease categories from the discovery and validation meta-analysis, the complete list of genes with corresponding meta-effect sizes was input into GSEA PreRank, performing 100,000 permutations for assessment of GO term enrichment, using FDR < 5% and absolute normalized effect size > 1.0 as thresholds for significance (implemented in the clusterProfiler package). After removing redundant enriched GO terms using semantic similarity 332, the significantly enriched GO terms were visualized as a network based on their overlapping genes using EnrichmentMap 333.

5.3.4 CMDM score

Genes in the CMDM were separated according to whether their effect sizes were positive or negative (where “positive” means a positive effect size in muscle disease as compared to healthy controls, and “negative” means a negative effect size in muscle disease compared to healthy controls). We calculated the geometric mean of the gene expression intensity for the up-regulated and down-regulated genes within the CMDM separately within each sample. Each geometric mean was centered and standardized across all samples in a given experiment to give a z-score. The difference between the up-regulated and the down-regulated z-score in each sample is hereafter termed the CMDM score and is used in this analysis.
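The score calculation described above can be sketched as follows. This is an illustrative Python sketch rather than the code used in the thesis (which was run in R); it assumes strictly positive expression intensities, at least two samples per experiment, and a hypothetical input layout in which `expression[gene]` is a list of intensities, one per sample.

```python
import math
from statistics import mean, stdev


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def z_scores(values):
    """Center and standardize values across the samples of one experiment."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def cmdm_scores(expression, up_genes, down_genes):
    """Per-sample score: z(geometric mean of up genes) - z(geometric mean of down genes)."""
    n_samples = len(next(iter(expression.values())))
    up = [geometric_mean([expression[g][s] for g in up_genes])
          for s in range(n_samples)]
    down = [geometric_mean([expression[g][s] for g in down_genes])
            for s in range(n_samples)]
    return [u - d for u, d in zip(z_scores(up), z_scores(down))]
```

A sample with high expression of the up-regulated genes and low expression of the down-regulated genes therefore receives a high score.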

5.3.5 Correlation of the CMDM genes with clinical and histological severity

Two cohorts that measured muscle strength and degree of atrophy or mass (GSE78929 and E-MEXP-3260) were included in the Validation and Secondary Validation Cohorts, respectively. One cohort (GSE34111) measuring muscle strength was included in the Discovery Cohort. To assess the association of the CMDM summary expression with these clinical measures we performed Tukey’s Biweight correlation of each measure with the CMDM score. For GSE3411 and GSE89020, patients’ peak torque was reported as a percentage of the predicted normal, as previously described 281:

Predicted Quadriceps Force (Nm) = -(2.21 x age) + (55.9 x gender [female = 0, male = 1]) + (1.78 x body weight) + 124
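As a worked illustration of the prediction equation, the sketch below evaluates it in Python and expresses a measured value as a percentage of predicted. The units are not stated in the text; age in years and body weight in kilograms are assumptions made here, and the function names are hypothetical.

```python
def predicted_quadriceps_force(age_years, is_male, body_weight_kg):
    """Predicted quadriceps force (Nm) from the equation in Section 5.3.5.

    Units for age (years) and body weight (kg) are assumed, not stated.
    """
    gender = 1 if is_male else 0
    return -(2.21 * age_years) + (55.9 * gender) + (1.78 * body_weight_kg) + 124


def percent_predicted(measured_nm, age_years, is_male, body_weight_kg):
    """Measured peak torque as a percentage of the predicted normal value."""
    predicted = predicted_quadriceps_force(age_years, is_male, body_weight_kg)
    return 100.0 * measured_nm / predicted
```

For example, under these assumptions a 60-year-old man weighing 80 kg has a predicted value of -132.6 + 55.9 + 142.4 + 124 = 189.7 Nm.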

5.3.6 Correlation of the CMDM genes with response to exercise therapy in inflammatory myopathy

GSE95772 was downloaded and normalized, and the CMDM score was calculated for each sample as described above. Repeated measures ANOVA was performed with treatment as the between-subjects effect and the CMDM score as the repeated measure (pre- and post-treatment).

5.3.7 Association of the CMDM genes with normal aging

We used a previously published list of 957 age-associated genes 334 to investigate the proportion of CMDM genes associated with normal muscle aging. Su et al 334 identified age-associated genes by fitting linear models (with age, sex and study ID as covariates) to the gene expression of 361 human muscle arrays annotated with age and sex (ranging from < 1 year to 83 years), requiring an adjusted p-value for the slope of the age parameter of FDR < 0.05. Of the 361 samples used in their regression, 26 (7.2%) were also present in our Discovery Cohort (GSE5110 and GSE21496).

5.3.8 Muscle disease category specific meta-analysis

To identify patterns of gene expression changes that are unique to each muscle disease category, we performed meta-analysis on each disease category separately, as well as on the other four disease categories together, using the Discovery and Validation cohorts combined. We then removed genes from the individual disease category meta-analysis that were also significantly differentially expressed in the four-disease category meta-analysis (using the same criteria as above), thereby removing common DE genes in muscle disorders from the disease-specific gene list. The resulting individual disease meta-analysis gene lists were then input into GSEA PreRank for assessment of GO term enrichment as described above.

5.3.9 Identification of enriched transcription factors

oPOSSUM, version 3.0 287 (http://opossum.cisreg.ca/oPOSSUM3), was used to detect enrichment of human transcription factor binding sites (TFBSs) in the 5,000 bp upstream and downstream sequence of input genes using single site analysis. Cut-offs were a z-score of ≥10 and a Fisher exact test score (negative natural logarithm of the hypergeometric p-value) of ≥7 (default values chosen based on empirical studies), with a conservation cut-off of 0.6 and a matrix score threshold of 85%. The 24,752 genes in the oPOSSUM database were used as the background.

5.3.10 Assessment of cell type specificity in CMDM genes

To evaluate whether the CMDM may reflect changes in cell-type composition, we assessed gene enrichment using the ARCHS4 Tissues atlas 335, which is based on genes that are highly expressed in human tissues across 84,863 human samples from publicly available RNA-seq experiments in GEO, using Enrichr 336.

5.3.11 Subcellular localization analysis

Subcellular localization information from two carefully curated sources (UniProt and Gene Ontology) was downloaded from CellWhere 337. Subcellular localizations were assigned priority scores according to the location’s relevance to muscle physiology (‘Muscle flavor priority’). For genes having more than one subcellular annotation, the subcellular annotation having the highest muscle priority score was selected. For each subcellular localization, Fisher’s exact test was performed to assess enrichment of CMDM genes in the subcellular localization, using all of the genes in the discovery analysis as background.
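The enrichment test for a single subcellular localization can be sketched as a hypergeometric tail probability, which is the one-sided form of Fisher’s exact test. This is an illustrative Python sketch (the thesis analyses were run in R); whether the original test was one-sided or two-sided is not stated, so the one-sided form is an assumption, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(k, n_signature, n_in_compartment, n_background):
    """One-sided Fisher's exact test p-value for over-representation.

    k: signature genes annotated to the compartment
    n_signature: signature genes with any annotation
    n_in_compartment: background genes annotated to the compartment
    n_background: total background genes
    Returns P(X >= k) for X drawn from the hypergeometric distribution.
    """
    upper = min(n_signature, n_in_compartment)
    total = comb(n_background, n_signature)
    tail = sum(
        comb(n_in_compartment, i)
        * comb(n_background - n_in_compartment, n_signature - i)
        for i in range(k, upper + 1)
    )
    return tail / total
```

The resulting p-values, one per localization, would then be adjusted for multiple testing to give the q-values reported in the Results.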

All analyses were completed in R 3.4.1/Bioconductor. Data were tested for normality using the Shapiro-Wilk test. Heatmaps were created using the ComplexHeatmap package. The analysis workflow is shown in Figure 5.1.

5.3.12 Availability of supporting data

The datasets supporting the results of this article are available in the GEO and ArrayExpress online repositories at http://www.ebi.ac.uk/arrayexpress/ and http://www.ncbi.nlm.nih.gov/geo/. Data set accession numbers can be found in Table 5.1.

5.4 Results

5.4.1 Meta-analysis identifies a common gene signature of muscle disease

A total of 42 independent patient cohorts that profiled human muscle disease and normal muscle controls, comprising 1,231 samples (759 cases, 472 controls), met criteria for inclusion (Figure 5.1, Table 5.1). Collectively, the cohorts represent a broad range of patient ages and peripheral muscles from the upper and lower extremities (available phenotypic data for patient samples included in public repositories are shown in Supplementary Table 5.1; summary descriptions of each study are found in Supplemental Document 5.1). The cohorts were categorized as i) inflammatory myopathies (IM), ii) ICU acquired weakness (ICUAW), iii) congenital muscular diseases (CMD), iv) muscle disuse and immobility (DI), v) chronic systemic diseases affecting muscle (CSM) and vi) motor neuron diseases (MND). Each category except MND had at least 3 cohorts suitable for analysis; therefore the MND category was excluded from the discovery and primary validation analysis.

For the discovery meta-analysis, 22 patient cohorts (294 cases, 196 controls) containing at least 3 cohorts from each of the remaining 5 muscle disease categories were analyzed. To identify the most robust differentially expressed (DE) genes across muscle diseases measured on multiple different microarray platforms we performed gene expression meta-analysis 133,330. Criteria for DE were absolute effect size > 0.3 and false discovery rate < 5% measured in all 17 cohorts of the discovery set, using a “leave-one-disease-out” meta-analysis strategy. This was implemented to correct for heterogeneity in gene DE between muscle disease categories, and to avoid one muscle disease influencing the overall analysis, as described before 249,270. In this strategy, we repeated the analysis 5 times, each time removing all patient cohorts in one disease category prior to meta-analysis of the remaining patient cohorts for the other 4 disease categories (Methods Section 5.3). Genes that remained significantly DE in all 5 iterations of the “leave-one-disease-out” analysis were considered to represent the common genes dysregulated across muscle disease. For the meta-analysis of the discovery cohort, we identified 486 DE genes (221 up- and 265 down-regulated in muscle disease compared to controls) (Supplementary Table 5.3).

Next we validated these DE genes in an independent validation set of 17 cohorts consisting of 689 samples (Table 5.1). The validation set included at least 1 cohort belonging to each of the 5 muscle disease categories. We found that 98 up-regulated and 33 down-regulated genes were also significantly differentially expressed in the validation cohorts. These 131 validated genes were termed the common muscle disease module (CMDM) (Table 5.2; Supplementary Table 5.4). An additional 3 cohorts (2 cohorts from amyotrophic lateral sclerosis [ALS] and 1 cohort from cerebral palsy [CP]) could not be classified into any of the 5 muscle disease categories present in the discovery and validation sets (Table 5.1). Therefore we used these 3 cohorts (68 total samples) as a secondary validation set to test the generalizability of the CMDM.
Visual inspection of the heatmap of the CMDM genes in Figure 5.2 shows that its pattern of expression is generally highly consistent between the discovery and validation set, as well as the secondary validation set, supporting the generalizability of the CMDM to muscle disease.

5.4.2 CMDM significantly associates with clinical and histological measures of disease severity

We hypothesized that the summary expression of the CMDM would correlate with the severity of muscle disease. Importantly, disease severity was not considered during meta-analysis, with every sample classified as either “control” or “case.” To summarize the expression of the CMDM for each individual patient, we calculated a score as the difference between the geometric means of the up-regulated and down-regulated CMDM genes, as described before (Methods Section 5.3). The CMDM summary expression scores were then correlated to the measures of muscle mass, strength, and function. Two cohorts assessed the histological grade of muscle disease severity, GSE17091 (CMD) and E-MEXP-3260 (ALS), and were included in the validation set and secondary validation set, respectively (Table 5.1). Three cohorts, GSE78929 (ICUAW), GSE34111 (CSM), and GSE17091 (CMD), provided publicly available measurements of muscle disease severity based on muscle strength, mass or motor functional capacity.

For the ICUAW cohort (GSE78929), CMDM scores were significantly increased in both early ICUAW (Day 7 post-ICU, p-value = 1.188 x 10^-4) and sustained ICUAW (Month 6 post-ICU, p-value = 6.216 x 10^-3) compared to controls. Early ICUAW scores were significantly higher than those in sustained ICUAW (p-value = 0.031) (Figure 5.3A). The CMDM score was inversely correlated with the Functional Independence Measure motor subscore (r = -0.59, p-value = 3.31 x 10^-3) and inversely correlated with quadriceps muscle mass (r = -0.56, p-value = 4.17 x 10^-3) (Figure 5.3 B,C).

In the ALS cohort (E-MEXP-3260), CMDM scores were significantly higher in late ALS than in early ALS (p-value = 0.016). No significant difference in CMDM scores was detected between patients with late ALS and controls (p-value = 0.055), or between early ALS and controls (p-value = 0.45) (Figure 5.4 A). The CMDM score correlated with the histological assessment of muscle atrophy (r = 0.83, p-value = 5.90 x 10^-3) and was inversely correlated with muscle strength (r = -0.74, p-value = 0.023) (Figures 5.4 B,C).

There was no significant correlation between quadriceps muscle strength and the CMDM score (r = -0.34, p-value = 0.17, Figure 5.5) for GSE34111, a cohort of patients with upper GI cancers and weight loss. Review of the CMDM signature heatmap for GSE34111 (Figure 5.2) showed that it differed markedly from the expression pattern common to the majority of cohorts, with 86 genes significantly up-regulated across the meta-analysis.
Differential expression analysis found exclusively down-regulated genes, with no genes up-regulated in the pre-resection cancer muscle biopsies compared to controls at FDR < 10% 338. Therefore we speculate that, for GSE34111, the markedly different pattern of expression between cases and controls in the transcriptome-wide analysis, compared to other muscle disease cohorts, resulted in a lack of concordance with the CMDM signature as well as a lack of association with muscle strength in this cohort.

The cohort GSE17091 graded the degree of muscle fibrosis histologically as normal, mild, moderate, or severe in patients with dystrophic subtypes of CMD (Supplementary Table 5.1). CMDM scores were significantly higher in patients with moderate or severe fibrosis compared to mild fibrosis (p-value = 7.07 x 10^-3 and p-value = 1.13 x 10^-4, respectively) (Figure 5.6). There was no significant difference in CMDM scores between moderate and severe muscle fibrosis (p-value = 0.24). Comparison of the CMDM signature with the 56-gene signature of Transforming Growth Factor-β derived by Dadgar et al 339 using pathway analysis is discussed in Results Section 5.4.8.

5.4.3 Meta-analysis highlights common mechanisms of muscle disease

We hypothesized that conserved pathways dysregulated across muscle diseases could be identified using meta-analysis. Gene Set Enrichment Analysis (GSEA) evaluated the enrichment of Gene Ontology (GO) terms in the complete list of genes ranked by expression relative to controls, taken from the discovery meta-analysis prior to the “leave-one-disease-out” analysis and validation analysis (Figure 5.1). A total of 85 GO Biological Process (BP) terms were significantly enriched (FDR q-value < 0.05) after removing redundant GO terms (Supplementary Table 5.5). Eleven GO terms were down-regulated and 74 up-regulated. Networks of overlapping significantly enriched up- and down-regulated GO terms were visualized to aid in the interpretation of the GO enrichment results (Figure 5.7). The highest up-regulated and down-regulated gene sets were extracellular matrix (ECM) organization (q-value = 8.2 x 10^-4) and tricarboxylic acid (TCA) metabolic process (q-value = 1.7 x 10^-3), respectively. Eight of the 11 down-regulated gene sets, including TCA metabolic process, were related to mitochondrial metabolism. Coordinate down-regulation of mitochondrial genes has been described in a number of muscle diseases 327,340-342. The remaining 3 down-regulated sets were skeletal muscle adaptation (q-value = 1.7 x 10^-3), protein dephosphorylation (q-value = 3.6 x 10^-3), and muscle cell development (q-value = 1.7 x 10^-2). Nineteen of 74 up-regulated gene sets were related to immune system processes including neutrophil activation, innate immune response and antigen processing. Skeletal muscle tissue damage secondary to muscle disease induces immune activation culminating in inflammation and deposition of ECM 343,344. Skeletal muscle diseases are characterized by up-regulation of ECM genes including collagen, with progressive development of fibrosis leading to dysfunctional muscle tissue 345,346.
Collectively, these findings are consistent with literature in chronic skeletal muscle diseases proposing convergent final common pathways including chronic inflammation, fibrosis, oxidative stress, and mitochondrial dysfunction 346 .

5.4.4 Transcription factors associated with the CMDM

We hypothesized that shared transcriptional regulators would control CMDM genes with similar patterns of up- or down-regulation. Therefore enrichment analysis of transcription factor binding sites (TFBS) was performed to evaluate potential transcriptional regulators of the CMDM. The 33 down-regulated CMDM genes were significantly enriched (TFBS z-score > 10 and Fisher score > 7) for targets of MIZF, MZF1, and ELK1; the latter two TFs targeted the majority of down-regulated genes (84.8%). The 98 up-regulated genes were significantly enriched for targets of seven TFs: PAX6, SRF, NKX2-5, FOXA1, FOXA2, FOXD1, and ARID3A. NKX2-5 targeted the majority (81.3%) of the up-regulated genes (Figure 5.8). Of the 10 TFs with overrepresented binding sites in the CMDM genes, three have previously been implicated in skeletal muscle pathology. Overexpression of NKX2-5 in C2C12 myoblasts has been shown to inhibit differentiation and myoblast fusion 347,348. SRF has important roles in myogenesis and has been implicated in modulating muscle plasticity 349. ELK1 has been shown to play a role in the differentiation of L6E9 muscle cells 350.

5.4.5 Cell type-specificity analysis of CMDM genes

Comparisons of transcriptomic array data between healthy and diseased tissue can be complicated by increased proportions of cell types unrelated to the cell type of interest. During chronic muscle disease, the proportion of skeletal muscle cells may be expected to decrease relative to other tissue components such as adipose tissue or collagen deposition 346,351. Thus, genes up-regulated in the CMDM may reflect increased density of adipose or fibroblast cells, while genes down-regulated in the CMDM may reflect decreased skeletal muscle density. To test this hypothesis we determined whether the CMDM genes demonstrated a cell type-specific expression pattern based on public data sets for purified cell types. Cell type enrichment analysis was performed using Enrichr 336 analysis of ARCHS4, a comprehensive mRNA expression database containing 187,946 samples from human and mouse tissues 335. The down-regulated CMDM genes had the highest enrichment in skeletal muscle (20 genes, q-value = 8.9 x 10^-11), whereas the up-regulated CMDM genes had the highest enrichment in myoblasts (29 genes, q-value = 1.2 x 10^-6), followed by fibroblasts (27 genes, q-value = 6.4 x 10^-5) and dendritic cells (25 genes, q-value = 9.94 x 10^-5). The tissue types enriched in the up-regulated CMDM genes were not significant in the down-regulated CMDM genes (Supplementary Table 5.6). These results suggest that the decrease in expression of muscle-specific CMDM genes could be due either, in part, to a reduction in skeletal muscle cell density in muscle disease or to decreased gene expression in skeletal muscle without any change in skeletal muscle cell density.

5.4.6 Subcellular localization analysis of CMDM genes

We hypothesized that the CMDM genes may be overrepresented in certain subcellular compartments. Subcellular localization annotations from the curated annotation databases UniProt and GO were obtained from CellWhere 337. For genes having more than one localization, the subcellular localization with the greatest relevance to muscle physiology as determined by CellWhere was selected (Methods section 5.3). The majority of CMDM genes (96.2%) mapped to at least one subcellular localization. The greatest proportion of genes localized to the vesicular exosome (23.8%), followed by mitochondria (11.9%) and the nucleus (10.3%). Two genes (IL1R1 and LMCD1) were extracellular (Supplementary Table 5.7A, Figure 5.9). The CMDM signature was significantly overrepresented among genes located in the vesicular exosome (q-value = 5.4 × 10⁻³) and nucleus (q-value = 0.019) (Supplementary Table 5.7B). Vesicular exosomes, cell-derived vesicles containing signaling factors (including genes and microRNAs) for intercellular communication, have been found to have roles in muscle regeneration and congenital muscle diseases 352.

5.4.7 Disease-specific patterns of gene expression changes

We hypothesized that functional analysis of a disease category-specific gene signature, after removing genes shared with other disease categories, would provide insights into the unique pathomolecular mechanisms underlying each individual muscle disease. Thus, for each disease category we used the meta-analysis approach to generate a rank-ordered list of genes based on expression relative to controls for the Discovery and Validation cohorts together. We then visually examined the location of each of the CMDM genes in the ordered list of genes for each disease category (Figure 5.10A). The CMDM genes were more densely distributed among the most up- and down-regulated genes in each disease category gene list, confirming that the CMDM genes are similarly dysregulated in each muscle disease. We then utilized the "leave-one-disease-out" meta-analysis approach to iteratively generate ranked lists containing four of the five disease categories. Significantly DE genes identified in the four-disease meta-analysis gene list were then removed from the single disease category meta-analysis gene list. We removed 596, 291, 454, 546, and 537 genes from the gene lists for CMD, IM, ICUAW, DI, and CSM, respectively. The disease-specific gene lists represent genes that are expressed more strongly in a specific muscle disease category (Figure 5.10B). The gene lists were then input into GSEA PreRank for GO term enrichment analysis to identify disease-specific pathways (Supplementary Figure 5.1A-E, Supplementary Table 5.8A-E). Even after subtraction of genes significant in the other disease categories, significant down-regulation of mitochondrial genes was found in IM, ICUAW, DI, and CSM. Moreover, IM, ICUAW, and DI each had significant down-regulation of gene sets related to muscle contraction. Remarkably, only ICUAW had down-regulation of genes related to muscle development. Decreased numbers of satellite cells (precursors to skeletal muscle cells) in ICUAW compared to healthy controls have been detected histologically 103, supporting this enrichment analysis finding. Significant up-regulation of NF-κB signaling genes was found in the IM, ICUAW, and DI categories. NF-κB has been shown histochemically to play a role in IM 353,354 and has been studied in animal models of cancer cachexia 355. NF-κB has been linked to a number of human muscle diseases and animal models of muscle disease 356 and has been shown to be an inhibitor of skeletal myogenesis and muscle regeneration 357.
Our enrichment results suggest that, in addition to cancer, up-regulation of NF-κB signaling also occurs in COPD and polymyalgia rheumatica, which were included in our analysis of chronic systemic diseases. Down-regulation of genes involved in the ubiquitin-proteasome system (UPS) was detected only for CMD. Decreased UPS activity has been detected in nemaline myopathy 358 and myotonic dystrophy type II compared to control patients 359. In contrast, the proteasome system has been found to be up-regulated in limb-girdle muscular dystrophy (MD) type 2A 360 and in a mouse model of Duchenne MD. Therefore, it remains unclear whether down-regulation of the UPS is consistent across the congenital muscle diseases.
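The subtraction step of the leave-one-disease-out approach described above can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the code used in this thesis: the function name `disease_specific_genes`, the dictionary layout, and the toy gene identifiers are hypothetical, and the sketch assumes that significance has already been determined for every gene list.

```python
def disease_specific_genes(single_disease_de, leave_one_out_de):
    """Remove genes shared with the other disease categories.

    single_disease_de: dict mapping a disease category to the set of genes
        that are significantly DE in that category's own meta-analysis.
    leave_one_out_de: dict mapping a disease category to the set of genes
        that are significantly DE in the meta-analysis that EXCLUDES it.
    Returns a dict mapping category -> (specific genes, removed genes).
    """
    result = {}
    for category, genes in single_disease_de.items():
        # Genes also significant when this category is left out are "shared".
        shared = genes & leave_one_out_de.get(category, set())
        result[category] = (genes - shared, shared)
    return result


# Toy input with hypothetical gene identifiers (not results from this study).
single = {"ICUAW": {"G1", "G2", "G3"}, "IM": {"G2", "G4"}}
others = {"ICUAW": {"G2", "G9"}, "IM": {"G4", "G2"}}
specific = disease_specific_genes(single, others)
```

In this toy example, `specific["ICUAW"]` keeps G1 and G3 and reports G2 as removed, which mirrors how the per-category counts of removed genes would be tallied.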

5.4.8 Comparison of CMDM signature with Transforming Growth Factor-β signature

The GSE17091 cohort has previously been used to validate a 56-gene signature of Transforming Growth Factor-β (TGF-β) genes discovered using a supervised network selection analysis of gene expression from the cohorts in GSE3307 339. The TGF-β signature was shown to separate patients based on the degree of severity (moderate and severe vs. mild and normal) using unsupervised clustering of the signature expression. First, the ranks of the TGF-β signature genes were compared to those of the CMDM signature genes. The majority of genes (52 genes) in the TGF-β signature had positive summary effect sizes in the validation cohort. Therefore, the 52 TGF-β genes were compared to the 98 up-regulated CMDM genes based on their ranks among the top 5000 genes ranked by summary effect size for the samples in the validation cohort. Additionally, a random selection of 98 genes from 24,966 possible genes was also assessed. A histogram of the ranks of the CMDM, TGF-β, and randomly selected genes among the top 5000 genes ranked by summary effect size is shown in Supplementary Figure 5.2. All 98 CMDM genes were included in the top 5000 genes, whereas 38 of the 52 TGF-β genes and 9 of the 98 random genes were ranked in the top 5000 genes. A Wilcoxon signed-rank test between CMDM and TGF-β genes showed no significant difference (p-value = 0.11), whereas there was a significant difference in ranks between CMDM and random genes (p-value = 1.30 × 10⁻⁶) and between TGF-β and random genes (p-value = 1.01 × 10⁻³). Applying the z-score summary expression to the TGF-β signature, significant TGF-β signature z-score differences were found between mild fibrosis and moderate or severe fibrosis (p-value = 0.045 and p-value = 6.77 × 10⁻⁴, respectively), thus replicating the finding of Dadgar et al. 339 (Supplementary Figure 5.3). There was no significant difference between TGF-β scores for moderate and severe histology (p-value = 0.94).
We next sought to determine whether the TGF-β muscle fibrosis signature z-score would also separate disease severity in other muscle disease categories. TGF-β scores were significantly higher in sustained ICUAW compared with early ICUAW (p-value = 7.251 × 10⁻³). There was no significant difference between early ICUAW and controls (p-value = 0.66) or between sustained ICUAW and controls (p-value = 0.083) (Figure 5.12A). The TGF-β score was significantly correlated with the Functional Independence Measure motor subscore (r = 0.52, p-value = 0.010) (Figure 5.12B). No significant correlation between the TGF-β score and quadriceps muscle mass (r = 0.18, p-value = 0.39) was detected (Figure 5.12C). The TGF-β scores were significantly higher in late ALS compared to early ALS (p-value = 0.032). There was no significant difference in the TGF-β signature z-score between controls and early or late ALS (p-value = 0.24 and p-value = 0.68, respectively) (Figure 5.12D). The TGF-β scores did not significantly correlate with the histological assessment of muscle atrophy in ALS (r = 0.58, p-value = 0.10) or with muscle strength (r = -0.42, p-value = 0.25) (Figure 5.12E,F).
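To make the idea of a per-sample signature z-score concrete, the following Python sketch shows one common way such a summary score can be computed. It is an illustration under stated assumptions only: the exact scoring procedure used in this chapter is defined in Methods section 5.3 and may differ, and the function names `zscore` and `signature_scores`, the input layout, and the toy gene labels are hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize a list of numbers to mean 0 and (population) SD 1."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


def signature_scores(expr, up_genes, down_genes):
    """Assumed scoring scheme: mean z of up genes minus mean z of down genes.

    expr: dict mapping gene -> list of log2 expression values (one per sample).
    Returns one standardized score per sample.
    """
    n_samples = len(next(iter(expr.values())))
    z = {gene: zscore(values) for gene, values in expr.items()}
    ups = [g for g in up_genes if g in z]      # ignore genes not measured
    downs = [g for g in down_genes if g in z]
    raw = []
    for i in range(n_samples):
        up = mean(z[g][i] for g in ups) if ups else 0.0
        down = mean(z[g][i] for g in downs) if downs else 0.0
        raw.append(up - down)
    return zscore(raw)


# Toy data with hypothetical gene labels; three samples of increasing severity.
toy = {"UP1": [1, 2, 3], "UP2": [2, 4, 6], "DN1": [3, 2, 1]}
scores = signature_scores(toy, up_genes=["UP1", "UP2"], down_genes=["DN1"])
```

With a score of this form, a higher value indicates that up-regulated signature genes are more highly expressed and down-regulated signature genes less so, which is the direction compared across severity groups in the text.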

5.4.9 Association of CMDM signature with response to exercise therapy in inflammatory myopathy

We hypothesized that the CMDM score would correlate with response to therapy in clinical trials of muscle disease. Munters et al. 361 recently investigated the effects of exercise training on the gene profile of skeletal muscle in patients with IM (n = 7) compared to a control group of non-exercised patients with IM (n = 8). Paired baseline and 12-week muscle biopsy samples were obtained from the intervention (n = 3) and control groups (n = 3; GSE95772). Compared to the control group, Munters et al. 361 found that the exercise group had decreased disease activity based on clinical measures of muscle function. Comparing the CMDM scores, we found no difference in mean CMDM score between the exercise group and the control group at baseline (0.10 vs 0.21, p-value = 0.88). In the exercise group, there was a trend toward decreasing CMDM scores at 12 weeks versus the control group, though this did not reach statistical significance (Δ CMDM score -0.86 vs 0.24, p = 0.074) (Figure 5.13).

5.4.10 Assessment of CMDM muscle disease z-score as marker of response to muscle disease-specific pharmacotherapy

The transcriptomic response to pharmacotherapy of muscle disease was assessed in one cohort of patients with the congenital muscle disease infantile-onset glycogen storage disease type II (Pompe disease; GSE38680a). The patients selected for muscle transcriptome analysis were those with absence or presence of improvement in motor function at 52 weeks of therapy, termed poor clinical outcome (PR) and positive clinical outcome (R), respectively. Principal component analysis of transcriptome data at baseline (time 0) by Palermo et al. 362 found some separation of PR and R samples prior to onset of treatment. The CMDM score differed significantly between healthy controls and patients with Pompe disease with poor and positive clinical outcomes (p = 6.3 × 10⁻⁵ and p = 2.1 × 10⁻⁵, respectively; Figure 5.14A). There was no significant difference between PR and R at baseline or 52 weeks (p-value = 0.78 and p-value = 0.17, respectively). No significant difference in the change in z-scores (Δ z-score) between week 0 and week 52 was detected between PR and R subjects (0.28 vs. -0.22, p = 0.37). Visual comparison of muscle disease severity z-scores for PR and R subjects at time 0 (Figure 5.14B) suggests that a subset of patients with eventual poor outcome may have had worse baseline muscle disease based on higher baseline z-scores. These findings suggest that 1) while the CMDM signature differentiates Pompe disease from healthy muscle, transcriptomic alterations occurring in addition to the CMDM signature may be associated with clinical outcomes in Pompe disease and 2) the degree of muscle dysfunction at baseline may have been a confounding factor in the therapeutic trial. Palermo et al. 362 identified a 29-gene signature by ranking gene expression at time 0 based on correlation with eventual clinical outcome. To compare the performance of the CMDM signature with this supervised analysis, we converted the gene signature to a summary z-score, termed the supervised outcome signature. This signature showed no significant difference between PR and R at time 0 (p = 0.11), whereas the difference was significant at 52 weeks, as expected (p-value = 0.02; Figure 5.14C).

5.4.11 Characterizing the association of CMDM genes with normal aging

Aging in otherwise healthy individuals is associated with involuntary loss of muscle mass, strength, and function, termed sarcopenia 363. However, normal healthy aging does not involve the severe progressive loss of muscle mass and function observed across muscle diseases. Therefore, we hypothesized that the CMDM genes that are up- or down-regulated specifically in muscle diseases but not in normal aging may be specific drivers of muscle atrophy and could be biomarkers of the muscle disease process. We used the 957-gene age-associated signature identified by Su et al. 334 to determine the proportion of genes in the CMDM that are associated with age. We found that 17 up-regulated and 3 down-regulated genes in the CMDM were present in the age-associated gene signature. Table 5.2 lists the CMDM genes, including those associated with age. The direction of change in expression was the same between the age-associated signature and the CMDM signature for all genes except DCUN1D2, a member of the E3 ubiquitin-ligase complex, which is up-regulated with aging and down-regulated in muscle disease. There was no significant enrichment of age-associated genes in the CMDM signature (p-value = 0.26, Fisher's exact test). Thus, the majority of CMDM genes were not found to be related to sarcopenia.
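The logic of an overlap test of this kind can be sketched in Python using the hypergeometric distribution, which underlies the one-sided Fisher's exact test. This is a hedged illustration rather than the analysis performed here: the function name `overlap_p_value` is hypothetical, the background gene universe used in the thesis is not restated in this section and must be supplied by the reader, and the test reported above may have been two-sided.

```python
from math import comb


def overlap_p_value(universe, set_a, set_b, overlap):
    """One-sided enrichment p-value, P(X >= overlap), for two gene sets.

    universe: number of genes in the background (an assumption the analyst
        must choose; it is not given in this section).
    set_a, set_b: sizes of the two gene sets drawn from that background.
    overlap: number of genes observed in both sets.
    """
    if not (0 <= set_a <= universe and 0 <= set_b <= universe):
        raise ValueError("gene set sizes must lie between 0 and the universe size")
    if not (0 <= overlap <= min(set_a, set_b)):
        raise ValueError("overlap cannot exceed the smaller gene set")
    total = comb(universe, set_b)
    # Sum the hypergeometric probabilities of seeing `overlap` or more shared genes.
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, min(set_a, set_b) + 1)
    )
    return tail / total


# Small worked example: 10 genes in total, sets of size 4 and 3, 2 shared.
p_small = overlap_p_value(universe=10, set_a=4, set_b=3, overlap=2)
```

For the small example, the tail probability is (36 + 4) / 120 = 1/3. Applying the same function to the 20 of 131 CMDM genes found in the 957-gene age signature would additionally require the size of the background gene universe.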

5.5 Discussion

While transcriptomic studies of individual human muscle diseases continue to increase, no integrated analysis has previously identified the genes and pathways consistently conserved across most muscle diseases (Figure 5.11). In total, we examined 39 separate patient cohorts classified into five categories of muscle disease, consisting of 1195 independent patient samples collected using multiple microarray platforms, to identify a robust and reproducible signature. To assess the generalizability of the signature, we examined three more cohorts that could not be classified within the previous five muscle disease categories and found the signature to be reproducible in these cohorts as well. We speculated that these genes conserved across the majority of skeletal muscle diseases, termed the CMDM, represent convergent gene dysregulation in muscle disease. A review of the literature for the most up- and down-regulated genes in the CMDM supports the role of these genes in critical pathways related to muscle disease. The up-regulated CMDM gene ANXA2 has been recognized in inflammatory myopathies 364 and dysferlinopathies (a subset of congenital muscle diseases) 365. Increasing levels of ANXA2 have been associated with increasing disease severity in muscular dystrophies 365, and deletion of ANXA2 in dysferlin-deficient mice improved muscle dysfunction 366. Increased expression of GADD45A has been shown to be necessary for skeletal muscle atrophy [40], and GADD45A is up-regulated in animal models of denervation and immobilization [41, 42] and in early ICUAW [43]. CDKN1A, cyclin-dependent kinase inhibitor isoform 1A (also known as p21), has been shown to play an important role in muscle differentiation in mouse models of muscle injury 367, and transfection of CDKN1A has been found to induce muscle fiber atrophy via an unknown mechanism 368. Increased expression of cholinergic receptor nicotinic, alpha 1 (CHRNA1) has been recognized as a marker of muscle disease severity in denervation 369,370.
A recent study found that up-regulation of CHRNA1 was associated with dynamic epigenetic modifications of the gene in a rat model of disuse-induced atrophy 371. We assessed transcription factor enrichment in the up-regulated CMDM signature and found that NKX2-5 has binding sites targeting the majority of these genes. NKX2-5 mRNA levels were found to correlate with muscular dystrophy type I muscle histopathology in humans and mice 347. In contrast, muscle samples from other muscular dystrophies or healthy controls did not express NKX2-5. This finding calls into question the potential role of NKX2-5 as a transcriptional regulator across muscle diseases and will require further investigation. The down-regulated CMDM gene CAMK2G, calcium/calmodulin-dependent protein kinase type II (CaMKII) subunit gamma, is involved in sarcoplasmic reticulum Ca2+ transport in skeletal muscle and has been shown to remain active after exercise 372. Agonists of CaMKII have been proposed as potential pharmacologic therapies for muscle disease 373; however, it has been unclear which of the CaMKII subunits is most important in skeletal muscle disease, as these subunits are currently not well characterized. Given that CAMK2G is down-regulated across most muscle diseases in this study, we propose that it may be a suitable target for future pharmacologic studies. Mutations causing undetectable levels of the down-regulated CMDM gene PYGM, encoding muscle glycogen phosphorylase, cause a metabolic disorder known as McArdle's disease 374. It is unclear whether decreased PYGM expression contributes to other muscle diseases. There is currently no literature characterizing the three other most down-regulated CMDM genes, RPL3L, MN1, and FHL3, in skeletal muscle. As muscle disease severity exists along a continuum, we proposed that the degree of expression of the CMDM genes would be associated with the severity of muscle dysfunction. For individual muscle diseases, multiple sets of gene signatures may perform similarly in classifying the degree of muscle impairment, as multiple pathways are activated or inhibited during the progression of the disease.
However, signatures identified by assessing specific muscle disease types alone are unlikely to capture convergent gene dysfunction across muscle diseases, whereas signatures derived from analysis across multiple muscle disease categories, such as the CMDM and TGF-β signatures, are more likely to reflect gene dysregulation across muscle diseases. The latter signature was identified using prior knowledge of pathways and clustering analysis of twelve muscle diseases to identify up-regulated TGF-β pathway genes in muscle diseases associated with severe histological patterns of fibrosis 339. We found robust correlations between the CMDM signature and clinical and histological measures of disease severity across multiple categories of muscle disease. The CMDM signature performed comparably to the TGF-β signature score for measuring severity of fibrosis in muscular dystrophy samples. For both signatures, scores were higher for moderate or severe versus mild muscle fibrosis samples in muscular dystrophy. The CMDM signature was higher in late versus early ALS and in early versus sustained ICUAW; this corresponds to the time points with greater severity for each of these muscle diseases. The CMDM signature was correlated with muscle fibrosis in ALS and inversely correlated with muscle strength in ALS. Like the CMDM signature, the TGF-β scores were found to be higher in late versus early ALS. However, no correlation between the TGF-β scores and histology or muscle strength in ALS was detected. In contrast to the CMDM signature, the TGF-β scores were higher in sustained versus early ICUAW. Interestingly, whereas the CMDM score had a negative correlation with the Functional Independence Measure (FIM) motor score, the TGF-β score was positively associated with the FIM motor score. The observation of higher TGF-β scores in late ICUAW is compatible with the co-expression functional analysis in Chapter 3 that found increased expression of genes related to fibrosis in late ICUAW samples. Thus, the CMDM and TGF-β signatures may be useful for assessing different pathomechanistic aspects of muscle disease. We assessed whether muscle-specific therapies that improve physical capacity would be reflected in the CMDM as a decrease in post-therapy CMDM scores. In a small trial of patients with inflammatory myopathy undergoing exercise therapy versus control, the downward trend in the CMDM scores of the exercise group appears to reflect their improvement in physical capacity, though confirming this will require a larger number of study participants in future studies. Thus, our results provide strong evidence that the signature could have future applications as a biomarker for phenotyping muscle disease, specifically providing prognostic information and quantifying the molecular response to therapy for muscle disease. Measuring muscle disease severity using the CMDM signature at enrollment in therapeutic trials may aid the stratification of patients within trial arms.
Measuring changes in CMDM scores after therapy may improve the identification of responders. An unexpected finding of our meta-analysis was that the CMDM signature is enriched for genes targeted to the exosomal vesicle. Monitoring exosomal miRNAs has been proposed as a non-invasive method for tracking muscle disease progression 352,375. Future studies will assess whether plasma protein concentrations of the exosomal CMDM genes correlate with muscle disease severity to the same extent as their transcripts. Additionally, the predictive role of combined exosomal miR and CMDM protein plasma biomarkers will be compared with that of individual types of biomarkers. Although there are diverse genetic and environmental contributions to individual muscle diseases, we identified convergent pathways dysregulated across muscle diseases that are involved in fibrosis, inflammation, and mitochondrial dysfunction. Finally, we removed genes similarly dysregulated across muscle diseases to identify disease-specific signatures and pathways. Thus, our findings serve as a valuable resource for interpreting disease mechanisms, connecting findings across muscle diseases, and driving novel hypotheses.

5.6 Limitations

Our meta-analysis has limitations despite its comprehensiveness. Although most included studies attempted to select patients without co-morbidities that span more than one muscle disease category, there are potentially multiple pathologies in some of the muscle samples. Given the number of cohorts and size of the overall study, such confounding is likely to be minimal. Although we have made inferences about genes likely altered in skeletal muscle, there are inherent limitations in the resolution of cell-type specific data from mixed tissue. We have assessed cell-type specific expression in muscle disease using enrichment analysis of cell-specific gene expression profiles. A number of methods for statistical deconvolution of mixed tissue expression have been developed which may be applied to future analyses once further human skeletal muscle-cell specific profiles are available 369 . Based on the use of microarray data from multiple platforms, we cannot test for alterations in splicing regulation, which has been associated with several congenital muscle diseases including the most common adult onset muscular dystrophies 376 . Analysis of RNA-seq transcriptome data will be necessary to determine whether altered splice variants lead to muscle pathology in other disease categories. Identification of conserved epigenetic signatures of muscle disease will provide important insights into the underlying mechanisms resulting in gene transcriptomic dysregulation identified here, once future epigenome-wide association studies of various muscle diseases are available. The cohort GSE34111 had a global expression pattern that differed markedly from the other muscle diseases and disease categories. As this cohort was the only one in the analysis that included cancer cachexia, it remains unclear whether the difference in global expression pattern reflects significant differences in the pathomechanism of cancer cachexia or technical or experimental differences in the study. 
Future analyses comparing peripheral muscle from patients with cancer cachexia and controls are required. Within the validation set, the chronic systemic disease and ICUAW categories each consisted of one cohort, reducing the power to detect significant effect size differences from controls within these disease categories. For this reason, disease-specific pathway analysis was performed by combining both discovery and validation cohorts. Direct experimentation will be necessary to determine the role of the dysregulated genes and pathways in muscle disease as either causal drivers of or responses to muscle disease. Such a response may be initially adaptive, such as stress-related changes to preserve muscle viability in the acute stages of muscle injury, though ultimately contributing to muscle disease.

5.7 Conclusions

We carried out an integrated multi-cohort analysis of muscle disease to identify a conserved muscle disease transcriptional signature associated with clinical and histologic disease severity, and identified numerous novel genes associated with muscle disease severity.

Figure 5.1: Meta-analysis workflow diagram. See Methods section 5.3 for details. GSEA, Gene Set Enrichment Analysis; TF, transcription factor

Figure 5.2: Meta-analysis and leave-one-disease-out analysis reveal common differentially expressed genes across muscle diseases. Heat map shows consistent differential expression in the majority of discovery, validation, and secondary validation cohort data sets. Columns represent CMDM genes ranked from highest to lowest standardized mean difference (Hedges’ g in log 2 scale) from left to right. Rows denote data sets used in each stage of meta-analysis, arranged by unsupervised hierarchical clustering using Ward’s minimum variance method. Refer to Table 5.1 for data set information. ICUAW, Intensive Care Unit Acquired Weakness; IM, inflammatory myopathies; DI, Disuse and Immobility; CMD, congenital muscle disorders; CSM, chronic systemic diseases affecting muscle.
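As a brief illustration of the effect size named in this caption, the following Python sketch computes Hedges' g for a single gene in a single cohort using the standard small-sample correction. It is a minimal sketch under stated assumptions: the function name `hedges_g` and the toy expression values are hypothetical, inputs are assumed to be log2 expression values, and the pipeline used in this chapter (Methods section 5.3) may differ in detail.

```python
from math import sqrt
from statistics import mean, variance


def hedges_g(cases, controls):
    """Standardized mean difference with Hedges' small-sample correction.

    cases, controls: lists of log2 expression values for one gene.
    Raises ZeroDivisionError if both groups have zero variance.
    """
    n1, n2 = len(cases), len(controls)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two samples")
    # Pooled variance from the two sample variances.
    pooled_var = ((n1 - 1) * variance(cases) + (n2 - 1) * variance(controls)) / (n1 + n2 - 2)
    d = (mean(cases) - mean(controls)) / sqrt(pooled_var)  # Cohen's d
    j = 1 - 3 / (4 * (n1 + n2) - 9)                        # correction factor J
    return j * d


# Toy values (hypothetical): cases are one log2 unit higher than controls.
g_example = hedges_g([2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
```

In the toy example, Cohen's d is 1.0 and the correction factor is 0.8, so g is 0.8; a positive value corresponds to higher expression in cases than in controls.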

Figure 5.3: CMDM score significantly associates with clinical severity in ICUAW. Plots of A) CMDM scores for controls, early ICUAW (day 7 post-ICU), and sustained ICUAW (month 6 post-ICU; GSE78929) B) CMDM scores versus Functional Independence Measure motor subscore C) CMDM score versus quadriceps muscle mass. Each dot corresponds to an individual sample. ICUAW, ICU acquired weakness.

Figure 5.4 CMDM significantly associates with clinical and histological severity in ALS. Plots of a) CMDM scores in violin plots for controls, early ALS, and late ALS (EMEXP3260) b) CMDM score versus grading score of muscle atrophy in ALS (based on histology) c) CMDM score versus shoulder abduction (muscle strength). ALS, amyotrophic lateral sclerosis.

Figure 5.5 – CMDM signature in cancer cachexia (GSE34111). Plots of A) CMDM scores in violin plots for Cancer cachexia vs. controls. B) CMDM scores vs Quadriceps strength (% reference).

Figure 5.6 – CMDM signature shows significant differences between no fibrosis, mild, and moderate to severe fibrosis in a cohort of congenital muscle disease (GSE17091).

Figure 5.7: Functional enrichment reveals common pathways in muscle disease. a) EnrichmentMap 333 network for overlapping enriched Gene Ontology gene sets identified by GSEA. Each node represents a significantly enriched gene set (FDR q-value < 0.05); gene sets containing a larger number of genes are proportionally larger. Gene sets up-regulated in muscle disease compared to control are shown in red (top) and gene sets down-regulated are shown in blue (bottom).

Figure 5.8: Transcription factor binding site enrichment analysis identifies transcription factors upstream of CMDM genes. a) Heat map shows genes in the CMDM bound by transcription factors across discovery, validation, and secondary validation (SA) analysis cohorts (green, pink, and light blue columns, respectively, at the top of the figure). Heat map colors correspond to log2 standardized mean difference (Hedges' g). Up- and down-regulated CMDM genes were analyzed separately. Refer to Table 5.1 for data set information (abbreviations).

Figure 5.9 - Bar graph of genes in the CMDM signature by subcellular localization.

Figure 5.10: Disease-specific meta-analysis. a) Distribution of the 131 CMDM genes among individual disease meta-analysis gene lists. Each line represents the presence of a CMDM gene among the 8,042 genes generated from disease-specific meta-analysis, ranked from the most positive standardized mean difference (left) to the most negative standardized mean difference (right). b) Disease-specific meta-analysis after removing genes differentially expressed across the other four disease categories identifies genes more strongly expressed in a single disease. Top 10 up-regulated and top 10 down-regulated genes shown (if 20 or more genes present). ICUAW, Intensive Care Unit Acquired Weakness; IM, inflammatory myopathies; DI, Disuse and Immobility; CMD, congenital muscle disorders; CSM, chronic systemic diseases affecting muscle.

Figure 5.11: Conserved genes in skeletal muscle disease. Forest plots for 5 most up-regulated genes across muscle disease (top), and 5 most down-regulated genes (bottom).
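Forest plots of this kind summarize per-cohort effect sizes together with a pooled estimate. The following Python sketch shows one standard way to pool effect sizes, the DerSimonian-Laird random-effects estimator. It is offered as an assumption-labeled illustration only: this section does not restate which pooling model was used (see Methods section 5.3), and the function name `random_effects_summary` and the toy inputs are hypothetical.

```python
from math import sqrt


def random_effects_summary(effects, variances):
    """Pool per-cohort effect sizes with the DerSimonian-Laird estimator.

    effects: per-cohort effect sizes (for example, Hedges' g for one gene).
    variances: the corresponding within-cohort sampling variances (all > 0).
    Returns (pooled effect, standard error, between-cohort variance tau^2).
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two cohorts with matching variances")
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Toy example (hypothetical values): two cohorts that disagree.
pooled, se, tau2 = random_effects_summary([0.0, 1.0], [0.1, 0.1])
```

In the toy example, the heterogeneity between the two cohorts yields a between-cohort variance of 0.4, which widens the standard error of the pooled effect relative to a fixed-effect analysis.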

Figure 5.12: TGF-β scores in ICUAW and ALS. Plots of A) TGF-β scores in violin plots for controls, early ICUAW (day 7 post-ICU), and sustained ICUAW (month 6 post-ICU; GSE78929). B) TGF-β scores versus Functional Independence Measure motor subscore. C) TGF-β scores versus quadriceps muscle mass. D) TGF-β scores in violin plots for controls, early ALS, and late ALS (EMEXP3260). E) TGF-β scores versus grading score of muscle atrophy in ALS (based on histology). F) TGF-β scores versus shoulder abduction (muscle strength). Each dot corresponds to an individual sample. ICUAW, ICU acquired weakness; ALS, amyotrophic lateral sclerosis.

Figure 5.13 - CMDM score signature in a clinical trial of exercise in inflammatory myopathy (GSE95772) at baseline (time 0) and 12 weeks.

Figure 5.14 – CMDM signature in Pompe’s disease (GSE38680a) patients in therapeutic trial. A) Violin plots of CMDM scores comparing controls with Pompe’s disease patients with poor clinical outcome and positive clinical outcome. B) CMDM score at baseline (time 0) and 52 weeks. C) Supervised outcome score (signature derived from clinical response at 52 weeks) at baseline (time 0) and 52 weeks.

Table 5.1: Summary of public gene expression data sets used in the discovery and validation data set meta-analysis

| Disease category | Accession# | Reference | Cases | n Cases | n Control | Total Samples | Platform |
|---|---|---|---|---|---|---|---|
| Discovery | | | | | | | |
| ICUAW | GSE13205 | Fredriksson 87 | Sepsis MODS | 8 | 13 | 21 | GPL570 |
| ICUAW | GSE53702 | Langhans 110 | ICUAW | 6 | 7 | 13 | GPL5188 |
| ICUAW | GSE3307 | Bakay 377 | ICUAW | 5 | 16 | 21 | GPL96 |
| Congenital | GSE15090 | Arashiro 378 | FSHD | 10 | 5 | 15 | GPL570 |
| Congenital | GSE18715 | Voets a | POLG1 | 6 | 12 | 18 | GPL570 |
| Congenital | GSE36398a * | Rahimov 379 | FSHD | 16 | 16 | 32 | GPL6244 |
| Congenital | GSE36398b | Rahimov 379 | FSHD | 10 | 8 | 18 | GPL6244 |
| Congenital | GSE37084 | Perfetti 380 | MMD | 10 | 10 | 20 | GPL5175 |
| Congenital | GSE26852 | Tasca 381 | FSHD, dysferlinopathy | 12 | 7 | 19 | GPL6947 |
| Congenital | GSE47968 | Nakamori 382 | FSHD, DM | 23 | 8 | 31 | GPL5188 |
| Congenital | GSE42806 | Screen 383 | TMD | 5 | 7 | 12 | GPL570 |
| Congenital | GSE38417 | Dorsey a | DMD | 16 | 6 | 22 | GPL570 |
| IM | GSE48280 | Surez-Calvet 384 | PM, IBM, DM | 16 | 5 | 21 | GPL6244 |
| IM | GSE39454 | Zhu 385 | PM, IBM, NM | 31 | 5 | 36 | GPL570 |
| IM | GSE26852 | Tasca 381 | PM, IM, DM | 7 | 7 | 14 | GPL6947 |
| IM | GSE3112 | Greenberg 386 | DM, IBM | 29 | 11 | 40 | GPL96 |
| Immobility | GSE45745 | Barres 387 | Morbid obesity | 5 | 6 | 11 | GPL13667 |
| Immobility | GSE21496 | Reich 388 | Unloading | 14 | 7 | 21 | GPL570 |
| Immobility | GSE5110 | Urso 389 | Immobility | 5 | 5 | 10 | GPL570 |
| Chronic | GSE27536 | Turan 390 | COPD | 30 | 24 | 54 | GPL570 |
| Chronic | GSE34111 | Gallagher 338 | Cancer | 24 | 6 | 30 | GPL570 |
| Chronic | EMTAB3671 | Kreiner 391 | PMR | 6 | 5 | 11 | GPL570 |
| TOTAL | | | | 294 | 196 | 490 | |
| Validation | | | | | | | |
| ICUAW | GSE78929 | Walsh 304 | ICUAW | 24 | 8 | 32 | GPL10558 |
| Congenital | GSE13608 | Bachinksi 392 | DMD, MMD | 33 | 9 | 42 | GPL570 |
| Congenital | GSE38680a | Palermo 362 | GSD II | 41 | 17 | 58 | GPL570 |
| Congenital | GSE38680b | Palermo 362 | GSD II | 9 | 10 | 19 | GPL570 |
| Congenital | GSE109178 | Dadgar 339 | MD | 43 | 6 | 49 | GPL570 |
| Congenital | GSE3307 | Bakay 377 | MD | 70 | c | 86 | GPL570 |
| Congenital | GSE6011 | Pescatori 393 | DMD | 23 | 14 | 37 | GPL96 |
| Congenital | GSE12648 | Eisenberg 394 | HIBM | 10 | 10 | 20 | GPL96 |
| Congenital | GSE10760 | Osborne 395 | FSHD | 38 | 60 | 98 | GPL96 |
| Congenital | GSE11681 | Saenz 360 | LGMD2A | 10 | 19 | 29 | GPL96 |
| IM | GSE3307 | Bakay 377 | Juvenile DM | 21 | c | 37 | GPL96 |
| IM | GSE1551 | Greenberg 396 | PM | 13 | 10 | 23 | GPL96 |
| IM | EMEXP2681 | Bernasconi a | DM, PM | 10 | 5 | 15 | GPL96 |
| Immobility | GSE14901 | Abadi 340 | Limb disuse (casting) | 48 | 24 | 72 | GPL570 |
| Immobility | GSE474 | Park 397 | Morbid obesity | 8 | 8 | 16 | GPL96 |
| Immobility | GSE45462 | Chen 398 | Limb disuse (casting) | 16 | 16 | 32 | GPL570 |
| Chronic | GSE1786 | Radom-Aizik 399 | COPD | 12 | 12 | 24 | GPL96 |
| TOTAL | | | | 429 | 228 | 689 | |
| Secondary validation | | | | | | | |
| MND | EMEXP3260 | Pradat 400 | ALS | 9 | 10 | 19 | GPL96 |
| MND | GSE3307 | Bakay 377 | ALS | 9 | c | 25 | GPL96 |
| CP | GSE31243 | Smith 401 | CP | 20 | 20 | 40 | GPL570 |
| TOTAL | | | | 38 | 30 | 68 | |

Abbreviations: IM, inflammatory myositides; MMD, myotonic muscular dystrophy; MD, muscular dystrophies; DMD, Duchenne muscular dystrophy; FSHD, facioscapulohumeral muscular dystrophy; LGMD2A, limb-girdle muscular dystrophy type 2A; HIBM, heritable IBM; TMD, tibial muscular dystrophy; IBM; DM; GSD II, glycogen storage disease type II (also called Pompe disease); POLG1, mitochondrial DNA polymerase γ; ICUAW, Intensive Care Unit Acquired Weakness; MODS, multi-organ dysfunction syndrome. ^ unpublished. * Deltoid muscle samples removed as FSHD typically affects biceps. ** Same healthy controls used in subcohorts of GSE330


Table 5.2: Common muscle disease module (CMDM) genes. Genes are listed from largest absolute meta-effect size to smallest, top-to-bottom, left-to-right. Genes not correlated with age shown in bold.

Up -regulated Down - regulated CHRNA1 TMEM87A CANX APLP2 MN1 HIGD2A GADD45A MFSD1 IFITM1 SSB CAMK2G MAPK12 ANXA2 CETN2 SRGN BZW1 ATP2B2 TECR TIMP1 PON2 FBLN5 TERF2IP PYGM SEPW1 SAT1 APMAP MSN FAM13A DCUN1D2 EIF1 CDKN1A OSBPL8 CTSB DSE SAMD4A C1S LAMP1 HPRT1 SLC39A6 RPL3L C3 PREPL RIN2 TACC1 COX6A2 GBP2 GPNMB IL1R1 EHBP1 FHL3 ACTC1 CFH PITPNB ABI1 ENO3 CYFIP1 YWHAQ NOTCH2 MAPK1 EPM2A C1R SAE1 TUBB2A TGOLN2 LMCD1 CLIC1 ANXA4 CKAP4 ITGB5 PDK2 FTL HTRA1 ANGPTL2 FBXO7 PTPN3 ATP6AP2 CCNG2 AGFG1 ATP5D NUP93 CAPRIN2 LIMA1 FXYD1 SERPING1 EFEMP1 PTEN DIP2C TRIM38 PSME1 THYN1 NDUFB11 CAP1 TMEM43 EIF4E2 TPI1 ATP1B3 IQGAP1 GMFB CS EIF3D S100A13 PNMA1 GAMT CNDP2 SDCBP MMD GDE1 HEXB STAT6 FAM208A UCKL1 MGP CHMP5 TM9SF3 ENDOG IFITM2 LGALS3 DYNLT3 MFN2 PRCP HEY1 EXOC1 ALDOA COL6A3 DDX50 YWHAB C1orf21 SCPEP1 TGFBR2 YWHAZ PHB2


CHAPTER 6 General Discussion

6.1 Overview of findings

ICUAW has been postulated to be a complex and etiologically heterogeneous disorder resulting from multiple factors contributing to impaired muscle contractility in animal models of critical illness 402 . The pathophysiology of the acute onset of ICUAW has been thought to involve mainly four mechanisms 403 : 1) mitochondrial dysfunction, 2) impaired muscle membrane excitability (ion channel dysregulation), 3) proteolysis (predominantly by activation of the ubiquitin-proteasome system), 4) intracellular calcium dysregulation. While increased muscle proteolysis relative to synthesis is essential for the development of muscle atrophy, it has not been found to be sustained over the longer term in ICUAW 66,103 . Two recent studies suggest that muscle regeneration impairment underlies the long-term impairment in muscle function observed in ICUAW 403 103 . However, mechanisms contributing to reduced muscle regeneration in ICU survivors with persisting muscle atrophy are poorly understood. The present doctoral work addressed a previously unmet need for data-driven analysis linking transcriptional profiles of muscle in ICUAW patients with clinical phenotypes to derive novel insights into the pathomechanisms of ICUAW. The findings from the three studies included in this thesis identify convergent transcriptional abnormalities in patients with ICUAW. Our analysis showed biologically meaningful enrichment in gene networks and identified miRs that play roles in regulating groups of these co-expressed genes. This work has established a comprehensive framework to study candidate genes and transcriptional regulators associated with muscle weakness in early and sustained ICUAW. Our ICUAW cohort was notable as it is the first to assess longitudinal clinical and transcriptomic profiles concurrently. Additionally it is the only ICUAW cohort to have miR transcriptome profiles analyzed in an unsupervised manner. 
Finally, by integrating the wide range of publicly available transcriptomic datasets of human muscle diseases, we found a common gene expression signature shared across human muscle diseases and identified molecular functional themes specific to ICUAW. Thus, the present doctoral work provides a valuable resource for interpreting muscle disease mechanisms, relating and contrasting findings between ICUAW and other muscle disease categories to enable novel hypotheses.

In Chapter 3, we integrated the comprehensive clinical and gene expression profiles of patients with ICUAW at day 7 and month 6 post-discharge from ICU and of healthy control muscle biopsy samples. Clinically relevant modules detected by unsupervised analysis using WGCNA were defined as those associated with measures of muscle mass, function and/or strength and showing differences in summary expression in ICUAW versus controls. The modules that met these criteria were termed ICUAW relevant modules (ICUAW-RM). Our ICUAW cohort was comprehensively clinically phenotyped based on multiple measures of muscle impairment. These included measures of global muscle strength using the Medical Research Council Sum Score, isometric peak torque of the quadriceps (at month 6 post-ICU), and measures of muscle mass using CT scans of the quadriceps. Additionally, a well-validated test of physical function in critical illness, the Functional Independence Measure (motor subscore), was assessed. The long-term weakness observed in our ICUAW cohort resulted from variable combinations of muscle atrophy and decreased muscle-specific force. Muscle strength is determined by two distinct factors, total muscle mass and muscle force generating capacity (force per cross-sectional area), which are controlled by different processes 404. Reductions in muscle-specific force and muscle atrophy are often concurrent in neuromuscular dysfunction. However, atrophy can occur without decreasing muscle-specific force, and reductions in muscle-specific force may occur in the absence of atrophy 404. Among the ICUAW patients in our cohort with resolution of quadriceps atrophy, there was persistent impairment of muscle-specific force capacity. Five of the 11 ICUAW-RM were associated with both muscle strength and muscle mass, whereas 3 modules were associated with strength alone and 1 module was associated with mass alone.
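The module-trait associations described above can be illustrated with a minimal sketch. It assumes a simplified summary expression (the mean of per-gene z-scores across a module), whereas WGCNA itself summarizes a module by its eigengene, the first principal component of the module's expression matrix. The gene names, expression values, and strength scores below are hypothetical toy data, not values from the ICUAW cohort.

```python
import math

def zscore(values):
    """Standardize a list of numbers to mean 0 and unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def module_summary(expr_by_gene):
    """Average per-gene z-scores across genes, giving one summary value per sample.

    This is a simplification (an assumption of this sketch); WGCNA uses the
    module eigengene rather than a mean of z-scores.
    """
    z = [zscore(row) for row in expr_by_gene.values()]
    n_samples = len(z[0])
    return [sum(row[j] for row in z) / len(z) for j in range(n_samples)]

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical module of two co-expressed genes measured in four samples.
module = {
    "geneA": [5.0, 6.0, 7.0, 8.0],
    "geneB": [2.0, 2.5, 3.5, 4.0],
}
# Hypothetical clinical trait (e.g. a strength score) for the same four samples.
strength = [10.0, 12.0, 15.0, 19.0]

summary = module_summary(module)
r = pearson(summary, strength)
print(f"module-trait correlation: {r:.3f}")
```

A positive correlation here would correspond to a module whose summary expression rises with strength; as noted in the text, such associations do not by themselves imply causality.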
The ICUAW-RMs associated with only mass or strength served to elucidate putative mechanisms contributing to the dissociation between muscle mass and strength. An inverse association between muscle strength (but not mass) and the summary expression of M3 and M11 is concordant with the functional gene sets enriched in these modules. M3 is enriched for genes involved in calcium ion binding and M11 for genes related to inflammation. Thus, up-regulation of the genes in M3 and M11 may contribute to decreased contractile capacity in the absence of muscle mass loss at month 6 post-ICU discharge.

Four of the 11 ICUAW-RM were associated with the APACHE II score, a classification system for the severity of critical illness 405. The APACHE II score sums 12 variables including laboratory values, clinical measures, and acute and chronic disease history. The score is associated with the extent of organ dysfunction and risk of mortality in the ICU. A recent meta-analysis of prospective cohort studies in adult ICU patients found that APACHE II was associated with ICUAW 29. The summary expression of the ICUAW-RM M1 was inversely correlated with APACHE II score and positively correlated with muscle mass and strength. Thus, worsening severity of critical illness (i.e. increased APACHE II score) was associated with decreasing M1 expression and reductions in mass and strength. This is concordant with the finding of greater muscle mass decrease in critically ill patients with multi-organ failure versus single-organ failure by ICU day 7 66. Enrichment for genes related to muscle contraction and mitochondria in M1 is concordant with substantial evidence implicating mitochondrial dysfunction in sepsis-induced organ dysfunction 406. The association between ICUAW and APACHE II likely reflects that skeletal muscle is a target of end organ damage in critical illness, which contributes to ICUAW. Importantly, the relationships between module expression and clinical traits here (including the inverse correlation between APACHE II and M1 expression) do not imply causality, but serve as a framework for future investigation.

An important finding of this doctoral work was the identification of temporally distinct alterations of skeletal muscle regenerative genes in early and sustained ICUAW. We found one early (M1) and one sustained ICUAW-RM (M3) that were enriched for skeletal muscle regeneration genes. The skeletal muscle development genes enriched in M3 were distinct from those in M1, indicating specific temporal alterations in skeletal muscle regenerative expression profiles in early ICUAW and sustained ICUAW.
The inverse correlation of muscle strength with M3 gene expression, together with the enrichment of this module for extracellular matrix and muscle development genes, may suggest that aberrant muscle regeneration programs contribute to impaired muscle structure and function in sustained ICUAW.

We identified over-represented transcription factor binding sites (TFBS) in the majority of ICUAW-RM, suggesting that shared TFBS are co-regulating gene expression within each module. The identified over-represented TFBS increased the biological plausibility and characterization of these modules. Remarkably, in the modules related to muscle regeneration, the most enriched TFBS had experimental evidence supporting their roles in muscle development and differentiation. Binding sites for myocyte enhancer factor 2A (MEF2A), a TF critical for skeletal muscle regeneration, were enriched in M1, a module enriched for genes related to muscle regeneration and down-regulated in ICUAW at day 7 versus controls. This suggests that impaired binding of MEF2 to its target promoters is associated with the degree of weakness and atrophy in early ICUAW. In the sustained ICUAW-RM M3, three over-represented TFBS (TEAD1, RUNX1, and NFATC2) have been linked to muscle regeneration experimentally. TEAD-binding sites are found in the promoters of genes involved in terminal differentiation. RUNX1 regulates muscle-specific genes and structural proteins, controlling the balance of proliferation and differentiation in myoblasts during muscle regeneration 299. NFATC2 is activated only in newly formed myotubes, and its gene targets play a key role in myoblast fusion and myotube growth. These findings support histopathological studies of ICUAW that have linked persistent muscle wasting to impaired muscle regrowth, potentially related to the loss of satellite cell content 103. Correlation between muscle capillarization and satellite cell content has been previously demonstrated using histopathology 407. Downregulation of genes related to angiogenesis was detected at day 7 post-ICU, possibly reflecting decreased capillarization in muscles from ICUAW patients compared to controls. Thus, these identified networks may provide a molecular correlate to the histopathologic findings of impaired muscle regeneration and decreased angiogenesis. Proof of concept for the corroboration of in silico functional enrichment by histopathologic findings was demonstrated for M3, which predicts increased extracellular matrix deposition in sustained ICUAW. We found increased staining for collagen in a representative biopsy sample of sustained ICUAW compared to healthy control.

In contrast to the ICUAW-RMs, the genes identified by differential expression (DE) analysis for the study in Chapter 3 yielded a smaller number of functional themes directly related to muscle pathophysiology. Maximizing the power of DE analysis (i.e. finding the largest number of true DE genes while minimizing the FDR) remains challenging. Classification of genes as DE requires selection of a significance threshold in a relatively arbitrary manner.
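To make the dependence on the significance threshold concrete, the following minimal sketch applies the Benjamini-Hochberg procedure, one widely used way of controlling the FDR. It is an illustration only: the p-values are hypothetical, and the passage above does not state which FDR method or cutoff was used in Chapter 3.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene p-values from a disease-versus-control comparison.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
qvals = benjamini_hochberg(pvals)

# The set of genes called DE depends on the (arbitrary) FDR cutoff chosen.
de_at_005 = [i for i, q in enumerate(qvals) if q < 0.05]
de_at_010 = [i for i, q in enumerate(qvals) if q < 0.10]
print(de_at_005, de_at_010)
```

In this toy example two genes sit just above the 0.05 cutoff after adjustment, so relaxing the threshold to 0.10 doubles the number of genes classified as DE.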
To reduce the number of statistical tests and increase the statistical power, we removed genes with low expression and low variance as a general filtering procedure. Seven of the 11 ICUAW-RM had significant enrichment for biological processes in GO, with the majority of functional terms pertinent to muscle pathophysiology, including mitochondrial metabolism and muscle contraction. This functional enrichment further supported the biological plausibility of the identified ICUAW-RMs. DE analysis is expected to detect fewer DE genes when sample sizes are smaller and when genes have smaller mean expression differences and/or larger expression variability. Thus, comparing healthy controls to a disease with large heterogeneity of clinical phenotypes is expected to have less power to detect DE genes than a comparison with a less heterogeneous disease, due to larger variability of gene expression in the more heterogeneous phenotype. High variability genes tend to be hubs in co-expression networks 130, and DE genes are located in the periphery of co-expression networks in human brain studies of several mental illnesses compared to healthy controls 144. These findings reflect the bias against high variability genes in standard DE analysis. Co-expression methods are well suited to analyze the gene transcriptome for detection of small but putatively biologically relevant gene expression changes within networks with “cumulative disease effect” 144. Therefore, co-expression methods may serve as a complementary method in addition to differential expression in transcriptomic analysis.

Convergent transcriptional abnormalities within our ICUAW cohort were reproducible in other ICUAW cohorts. The network topologies of the ICUAW-relevant modules identified in our ICUAW cohort were compared to an independent cohort of quadriceps biopsies from patients at high risk for ICUAW (sepsis and MODS), and most were found to be at least moderately preserved with similar summary expression changes compared to controls. We also performed a meta-analysis of the four independent cohorts with ICUAW in Chapter 5 that identified a gene signature distinct from other muscle diseases (discussed below). Collectively, these findings provide robust evidence for convergent molecular abnormalities in ICUAW. Significant clinical heterogeneity was evident among the four independent ICUAW transcriptomic cohorts analyzed in this doctoral work 87,109,110,304. There were several between- and within-cohort differences, including degree of multi-organ dysfunction and patient survival. While only our ICUAW cohort had muscle mass and strength documented, it is expected that within each ICUAW cohort there was a spectrum of muscle weakness and atrophy across patients.
Additional confounding factors amongst the ICUAW cohorts were differences in timing of muscle biopsy and microarray platform. For our ICUAW cohort, the time points for muscle biopsy were selected to assess muscle undergoing maximal repair at day 7 post-ICU and muscle reaching completion of maximal recovery at 6 months post-ICU 103. A discussion regarding microarray platforms is found below.

In Chapter 4 we sought to identify miR master regulators of the ICUAW mRNA gene co-expression networks using joint miR and mRNA profiling from our ICUAW cohort and healthy controls. This work identified miR master regulators and their putative targets, establishing a database of miR-target interactions in ICUAW that serves as a framework for studying the pathomechanisms of miRs in ICUAW. DE analysis served as an important criterion in the initial step of identifying candidate miRs, whereas DE analysis was suboptimal for detecting disease-relevant mRNA expression changes in our ICUAW cohort in Chapter 3. Biologically active miRs have been shown to have expression changes in the range of 1.5 to 4 fold, which is expected to be detectable using standard DE analysis 284. Therefore, miRs with significant changes in expression are more likely to be biologically active than miRs with unchanged expression when comparing disease to the healthy control. Furthermore, among DE miRs, the miR that targets the greatest proportion of DE mRNAs is expected to cause the most significant change in pathway activities, resulting in greater associations of the miR with the clinical phenotype. Thus, the DE miRs targeting the largest proportion of DE mRNAs are termed miR master regulators. Several factors determine the effect of altering the expression of a miR on its target mRNA level 408. These include the affinity of the miR to the target, the relative expression of the miR compared to its target, and the relative expression of the target compared to all expressed targets of the miR. Thus, a miR that interacts with many targets or a highly abundant target will have its activity buffered within the cell such that for a given change in its expression, its effect on an individual target is minor 160. These factors highlight the importance of jointly analyzing miR-mRNA expression data using network analysis that computes the interaction of each miR across the transcriptome 322. mRNA and miR expression was integrated along with computationally predicted and/or experimentally supported miR-target interaction data to identify regulons, i.e. networks of mRNAs regulated by the same miR. The algorithm microRNA Master Regulator Analysis (MMRA) was used to rank the DE miRs based on the percentage of DE mRNAs in the ICUAW subgroups whose expression was “explained” by the miR.
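The ranking idea can be sketched as follows. This is not the MMRA implementation, which involves additional steps; it is a deliberately simplified illustration in which a DE mRNA counts as “explained” by a miR if it is a listed target of that miR and changes in the opposite direction. All miR names, gene names, fold changes, and target sets below are hypothetical.

```python
def rank_mirs(de_mirs, de_mrnas, targets):
    """Rank DE miRs by the fraction of DE mRNAs they could explain.

    de_mirs:  dict of miR name -> log fold change (disease vs control)
    de_mrnas: dict of gene name -> log fold change (disease vs control)
    targets:  dict of miR name -> set of predicted or validated target genes

    Simplifying assumption of this sketch: a target is "explained" when its
    fold change has the opposite sign to that of the miR.
    """
    scores = {}
    for mir, mir_lfc in de_mirs.items():
        explained = [
            gene
            for gene in targets.get(mir, set())
            if gene in de_mrnas and de_mrnas[gene] * mir_lfc < 0
        ]
        scores[mir] = len(explained) / len(de_mrnas)
    # Highest fraction first: candidate "master regulators" come out on top.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy inputs.
de_mirs = {"miR-X": -1.2, "miR-Y": 0.8}
de_mrnas = {"G1": 0.9, "G2": 1.1, "G3": -0.7, "G4": 0.5}
targets = {"miR-X": {"G1", "G2", "G3"}, "miR-Y": {"G3", "G4"}}

ranking = rank_mirs(de_mirs, de_mrnas, targets)
for mir, fraction in ranking:
    print(f"{mir}: explains {fraction:.0%} of DE mRNAs")
```

Here the down-regulated miR-X is ranked first because two of its listed targets are up-regulated, which is the anti-correlated pattern expected for miR-mediated repression.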
Therefore, DE miRs were prioritized based on the expected biological impact as determined by the number of putative mRNA targets amongst the ICUAW mRNA profiles. The analysis identified 22 miRs from 2042 human miRs on the Exiqon platform. The MMRA pipeline also functioned as a data-reduction scheme, as the 22 selected miR regulators were 21% of all DE miRs detected between ICUAW day 7 and month 6 vs. healthy controls. For each miR master regulator, a network of co-expressed mRNAs targeted by the miR (termed the transcriptional regulator unit or “regulon” of that miR) was detected using ARACNE. ARACNE aims to detect the regulatory effects of a pre-selected list of TFs or miRs on their target mRNAs, whereas WGCNA is designed to find co-expression modules (which may share common miR or TF regulators). ARACNE uses an information theoretic approach to prune indirect regulations in regulatory networks, which serves to decrease network complexity and highlight the most important (dominant) miR-target interactions. In contrast, the mRNA-mRNA interactions detected by WGCNA include both direct and indirect interactions. mRNAs within the 7 largest ICUAW-RM from Chapter 3 were enriched in at least one regulon. These findings indicate that multiple miRs have targets in most ICUAW-RM, likely giving rise to synergistic networks of miR regulation 409.

The miR-target integration analysis provided further evidence supporting aberrant muscle regeneration in ICUAW and identified miRs targeting dysregulated muscle regeneration genes in ICUAW. Importantly, the expression of the 22 miRs had significant correlation with strength, mass, or function. Concordant with our findings from ICUAW-RMs detected by WGCNA in Chapter 3, a number of regulons contained genes involved in skeletal muscle differentiation and cellular respiration. Ten regulons were enriched for mitochondrial genes and three were enriched for muscle processes. miR-424-5p was found to explain the greatest proportion of the genes dysregulated in early ICUAW, and the downregulated genes within its regulon were enriched for striated muscle differentiation. The downregulated genes within the miR-29a-3p regulon were enriched for muscle contraction, and the upregulated genes in the miR-206 regulon were enriched for muscle system process. These three miR master regulators, downregulated at day 7 post-ICU, have been previously identified in animal models as playing critical roles in muscle differentiation; miR-206 is a well characterized “myomiR” with muscle-specific expression, and miR-424-5p and miR-29b were included among the recently termed “atromiRs” based on their reported associations with muscle atrophy 108.
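The information-theoretic pruning attributed to ARACNE earlier in this section rests on the data processing inequality (DPI): within any fully connected triplet of nodes, the edge with the lowest mutual information is treated as a likely indirect interaction and removed. The sketch below is a minimal illustration under stated assumptions, not the ARACNE software itself: the mutual information values are supplied by hand rather than estimated from expression data, the `tolerance` parameter is a simplified stand-in for a DPI tolerance, and the node names are hypothetical.

```python
from itertools import combinations

def dpi_prune(mi, tolerance=0.0):
    """Remove the weakest edge of every fully connected triplet.

    mi: dict mapping frozenset({a, b}) -> mutual information between a and b.
    tolerance: fraction by which the weakest edge must fall below the other
               two before it is removed (an assumption of this sketch).
    All triplets are evaluated against the original network, and flagged
    edges are removed together at the end.
    """
    nodes = sorted({node for pair in mi for node in pair})
    to_remove = set()
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if not all(edge in mi for edge in edges):
            continue  # DPI is only applied to closed triangles
        weakest = min(edges, key=lambda edge: mi[edge])
        others = [edge for edge in edges if edge != weakest]
        if mi[weakest] < min(mi[edge] for edge in others) * (1 - tolerance):
            to_remove.add(weakest)
    return {edge: value for edge, value in mi.items() if edge not in to_remove}

# Hypothetical network: a miR, its direct target geneA, and geneB, which is
# co-expressed with geneA and therefore only indirectly related to the miR.
mi = {
    frozenset(("miR", "geneA")): 0.8,
    frozenset(("geneA", "geneB")): 0.7,
    frozenset(("miR", "geneB")): 0.3,
}
pruned = dpi_prune(mi)
print(sorted(tuple(sorted(edge)) for edge in pruned))
```

In this toy network the miR-geneB edge is dropped, leaving the two stronger edges, which mirrors how pruning is meant to reduce network complexity and retain the dominant miR-target interactions.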
Mouse models of muscular dystrophy and chronic kidney disease have shown downregulation of miR-29a and miR-29b in muscle 326, in keeping with our day 7 post-ICU expression data. Increasing miR-29 was found to improve differentiation of myoblasts into myotubes. In contrast, Li et al 162 found upregulation of miR-29b expression in mouse and rat atrophy models, and inhibition of miR-29b attenuated atrophy. These findings suggest that miR-29b downregulation may function as a compensatory process in ICUAW at day 7 post-ICU. miR-206 is critical for muscle differentiation and formation of new neuromuscular junctions following nerve injury 158. It has been found to be highly expressed in satellite cells and regenerating muscle fibers after injury. Genetic deletion of miR-206 in mice significantly delayed regeneration induced by cardiotoxin injury 306. Patients with amyotrophic lateral sclerosis (ALS) were found to have upregulation of miR-206 and miR-424 in skeletal muscle and plasma, with plasma expression levels correlated with clinical parameters and assayed again after 6 and 12 months of follow-up. Higher plasma levels of miR-206 at baseline predicted slower clinical deterioration over 12 months. These results suggest that miR-206 upregulation may be protective against muscle disease or injury 410. The expression of miR-206 trended toward a decrease over the 12 month follow-up, which led the investigators to hypothesize that miR-206 expression increases early in the disease course, reaches a plateau, then begins to fall, which parallels the loss of skeletal muscle in ALS. Similarly, miR-424 plasma levels at baseline correlated inversely with ALS disease progression. miR-424 was also found to be upregulated in quadriceps muscle biopsies from a cohort of ICUAW and a cohort of COPD using a supervised screen of putative miRs via RT-PCR 163,411. In contrast to these findings, our analysis in Chapter 4 found that miR-424-5p was downregulated in ICUAW at day 7 post-ICU. This contrasting pattern of expression between ICU cohorts could be explained by differences in temporal expression. Our ICUAW cohort was assessed at day 7 post-ICU discharge, whereas the ICU cohort used by Connolly et al 163 had biopsies performed during ICU stay (median day 20), which was roughly a “half-way” point in the ICU stay of most patients (total ICU length of stay median 42 days). Upregulation of miR-424-5p in the earlier stages of ICUAW followed by downregulation after critical illness is plausible given the dynamic expression changes of the miR-424-5p mouse homolog miR-322, which has been found to be induced during myogenesis 156. Our in vitro analysis of C2C12 murine myoblast proliferation and differentiation found downregulation of miR-322 at 6h-48h followed by upregulation during differentiation at 48h-7d, supporting its dynamic expression in the developing muscle.
While the C2C12 studies characterize what a given miR can do within the context of myogenesis, not what it does in muscle disease, they may be useful for investigating in vitro the miR-target interactions that are also present in ICUAW. Future analyses assessing temporal expression of miR-424-5p and its targets in ICUAW cohorts are warranted. While comprehensive profiling of miRs in other ICUAW cohorts for validation is limited, the biological plausibility of our findings is supported by 1) the recapitulation of important functional themes, including muscle regeneration and mitochondrial pathways, in the miR-targeted networks identified by our unsupervised analysis and 2) the correlation between miR master regulator expression and clinical parameters including muscle strength and mass.

In Chapter 4 we also found a miR expression signature separating patients who had a marked recovery of muscle mass (“improvers”) from those who did not (“non-improvers”) at month 6 post-ICU using DE analysis. Eight upregulated miRs (miR-4732-3p, miR-490-3p, miR-4762-5p, miR-4279, miR-4473, miR-642b-5p, miR-4421, miR-4698) were identified as regulators of the mRNA signatures in these patients. This suggests that different miRs may play important roles in reconstitution of muscle mass in sustained critical illness compared to early ICUAW. These miRs targeted a small proportion of the DE mRNAs (i.e. <3% of DE mRNAs per miR), unlike the DE miR master regulators at day 7 post-ICU discharge, which regulated up to 23% of DE mRNAs. This discrepancy likely reflects the smaller sample size in the improver and non-improver subgroups at month 6 post-ICU compared to the analysis of the day 7 post-ICU patients versus healthy controls. Additionally, a larger biological “signal” is expected when comparing disease versus healthy controls (i.e. ICUAW patients at day 7 versus healthy controls) than when comparing two subgroups of a disease (i.e. improvers vs non-improvers). While no studies characterizing the roles in muscle of the miRs separating improvers and non-improvers are currently available, we speculate that these miRs may also contribute to muscle regeneration in phenotypes of ICUAW with increased muscle mass reconstitution. Moreover, we hypothesize that these miRs may regulate individual variation in sensitivity to signals within pathways involved in muscle mass 160.

In Chapter 5 we integrated gene transcriptional profiles of human muscle disease and healthy muscle controls from public repositories to detect signatures common to muscle diseases as well as those specific to each category of muscle disease.
We hypothesized that leveraging the heterogeneous patient populations across muscle disease cohorts would identify a generalizable common muscle disease signature that is predictive of clinical severity across muscle diseases. Thus, we utilized a multi-cohort meta-analysis framework that has been used to identify robust gene signatures in complex diseases including neurodegeneration 270, organ transplant 249, and sepsis 412. In the present doctoral work, we identified a 131-gene signature termed the common muscle disease module (CMDM) derived from five categories of muscle disease in 22 discovery cohorts (490 samples) and 17 validation cohorts (689 samples). To remove bias towards a specific muscle disease category, the meta-analysis was repeated iteratively, removing datasets from one muscle disease category at a time in a “leave-one-disease-out” analysis. The CMDM was found to be associated with clinical measures of muscle disease severity including muscle mass, strength, and fibrosis in validation cohorts that had clinical data available.
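The “leave-one-disease-out” step can be sketched for a single gene as follows. The sketch assumes that each cohort contributes a standardized effect size with its variance, and it pools them with a DerSimonian-Laird random-effects model; this estimator is a common choice for multi-cohort meta-analysis but is an assumption here, as the passage above does not name the estimator. The effect sizes, variances, and cohort labels are hypothetical.

```python
import math

def dersimonian_laird(effects, variances):
    """Pool per-cohort effect sizes with a DerSimonian-Laird random-effects model.

    Returns (pooled effect, standard error of the pooled effect).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    if k < 2:
        tau2 = 0.0  # between-cohort variance cannot be estimated from one cohort
    else:
        q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
        c = sw - sum(wi ** 2 for wi in w) / sw
        tau2 = max(0.0, (q - (k - 1)) / c)
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se

def leave_one_category_out(cohorts):
    """Re-run the pooling once per disease category, omitting that category.

    cohorts: list of (category, effect size, variance) tuples for one gene.
    Returns a dict mapping the omitted category to the pooled effect.
    """
    results = {}
    for left_out in sorted({category for category, _, _ in cohorts}):
        kept = [c for c in cohorts if c[0] != left_out]
        pooled, _ = dersimonian_laird([c[1] for c in kept], [c[2] for c in kept])
        results[left_out] = pooled
    return results

# Hypothetical effect sizes for one gene across cohorts in four categories.
cohorts = [
    ("ICUAW", 1.0, 0.10),
    ("ICUAW", 0.8, 0.20),
    ("Congenital", 0.9, 0.10),
    ("IM", 1.1, 0.15),
    ("Immobility", 0.7, 0.20),
]
results = leave_one_category_out(cohorts)
# A gene is retained only if its direction of effect survives every omission.
robust = all(effect > 0 for effect in results.values())
print(results, robust)
```

A gene whose pooled effect keeps the same sign regardless of which category is omitted is unlikely to be driven by any single disease category, which is the rationale given in the text for the iterative analysis.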

Multiple sources of biological and non-biological heterogeneity were identified in the datasets. The cohorts included wide ranges of disease pathologies, peripheral muscle types, patient ages, age at disease onset, and clinical settings. While only a subset of cohorts formally reported muscle mass and strength, there is expected to be significant variation in the extent of muscle atrophy and weakness across the range of muscle diseases. The samples were collected in numerous academic centers across different countries using various experimental protocols and microarray platforms. Importantly, the likelihood that the signatures identified are reproducible and generalizable was increased due to these sources of heterogeneity.

Microarray platforms from multiple manufacturers were used across the datasets in the meta-analysis. As we expected, there was a wide variety of Affymetrix microarray platforms as well as a smaller number of Illumina and Agilent platforms. We ensured that discovery and validation datasets both included platforms from several manufacturers to increase the generalizability of the results 412. Quality control metrics to identify aberrant microarray samples for removal to reduce technical heterogeneity (thereby increasing statistical power) have been advocated by some investigators 252. Adjustment for technical factors such as batch, RNA integrity, and sample storage time has also been recommended 236. We did not remove any individual samples or perform batch correction, in an effort to optimize “real world” generalizability and avoid using potentially subjective parameters for denoting samples or cohorts as “problematic”.

The range of biopsy sites included in the meta-analysis likely reflects the muscle types affected across different muscle diseases that were most accessible to biopsy.
The majority of biopsies were taken from large muscle groups in the upper extremities (biceps brachii and deltoid) and in the lower extremities (vastus lateralis). Skeletal muscle types are heterogeneous at the level of whole muscle as well as at the individual muscle fiber level, which reflects their adaptations for different motor tasks 413. The pattern of muscle groups involved varies between different genetic and acquired muscle diseases. Moreover, differing skeletal muscle types may display divergent responses to the same stimulus. Some muscle groups can be spared until late in one type of disease while affected early in another disease (e.g. comparing Duchenne muscular dystrophy and oculopharyngeal dystrophy) 414. The mechanisms that give rise to the selectivity of muscles affected are poorly elucidated at present. The differing embryonic origins of muscle types may contribute to their diversity. For example, craniofacial muscles are developmentally distinct from peripheral muscles 413, which may explain the relative sparing of craniofacial muscles in ICUAW. Terry et al 415 hypothesize that the sensitivity of a muscle cell to different pathologic mechanisms is determined by intrinsic cellular properties such as gene expression.

Although there are heterogeneous muscle types and diverse genetic and acquired causes of different muscle diseases, the common muscle disease signature (CMDM) reflected convergent transcriptional pathways across peripheral skeletal muscles affected by muscle disease. This implies that prior characterization of any of the genes in the CMDM may be relevant across muscle diseases.

Some of the genes in the CMDM have been associated with muscle disease previously, whereas many remain unknown or poorly characterized in muscle. Increased expression of ANXA2 has been previously associated with increasing muscle disease severity in muscular dystrophies 365. Increased expression of GADD45A has been shown to be necessary for skeletal muscle atrophy in animal models of immobilization 416. Increased expression of CHRNA1 has been recognized as a marker of severity in muscle denervation 371. Several studies have implicated complement in muscle regeneration 417. Complement-mediated muscle injury has been characterized in mouse models of subtypes of muscular dystrophy 418. A cardiotoxin-induced injury mouse model found C3 to be a critical mediator of macrophage recruitment and subsequent muscle regeneration after injury. C1S and C1R activation was found to be involved in augmenting pathways associated with the impaired skeletal muscle regeneration seen with aging 419. Our analysis found that all six genes were among the most upregulated genes in the CMDM, suggesting that their prior characterizations in one disease type may be broadly applied across muscle diseases. CMDM genes without functional annotation can be prioritized for future experimental evaluation based on the strength of the molecular data (e.g.
effect size or correlation with clinical phenotype). CMDM genes may be conceptually divided into those having direct etiological contribution to muscle disease and those that represent a secondary phenomenon in the development of muscle disease, including stress-related changes or cell survival mechanisms 365. Further experimentation will be required to identify the CMDM genes directly contributing to disease, as these genes are expected to be good candidates for novel disease modifying therapies 270. In contrast to the CMDM, signatures identified by assessing specific muscle diseases alone are unlikely to capture convergent gene dysfunction across muscle diseases. This is exemplified by comparing the performance of the CMDM to that of the TGF-β pathway signature, which was previously derived based on prior knowledge of its role in fibrosis in many tissues 339. While both signatures classified the severity of fibrosis in muscular dystrophy, the TGF-β signature showed no differences between healthy controls and ALS or ICUAW.

The severity of muscle dysfunction based on clinical and histological measures of several muscle diseases was found to be associated with CMDM summary expression. Robust correlations with muscle mass in ICUAW, muscle strength in ICUAW and ALS, as well as degree of fibrosis in muscular dystrophy pathology samples were found. However, no association was found with muscle strength in the cancer cachexia cohort, likely reflecting the global discrepancies in the expression data of the cancer cohort (described in Limitations, Section 6.2). The association between the CMDM and the majority of clinical measures suggested that changes in the CMDM could reflect the effects of muscle-specific therapies. Thus, we assessed whether specific therapies that improve physical capacity would be reflected in the CMDM as a decrease in post-therapy CMDM scores. In a small trial of patients with inflammatory myopathy undergoing exercise therapy versus control, a non-significant downward trend in the CMDM scores of the exercise group appears to reflect the improved physical capacity in this group; however, a larger number of study participants will be required in future studies.

An unexpected finding of our meta-analysis was the enrichment of genes in the CMDM signature targeted to the exosomal vesicle. Exosomal vesicles, cell derived vesicles containing signaling factors including genes and miRs for intercellular communication, have been found to have roles in muscle regeneration and congenital muscle diseases. Recent transcriptomic analysis of skeletal muscle found dozens of genes encoding secreted proteins that may serve roles as “myokines” 415. Further assessment of the CMDM signature genes targeted to the exosome and their potential clinical applications is discussed in Future Directions.
For the muscle disease category-specific meta-analysis, our primary interest was mechanistic inquiry into muscle disease. Therefore, classes of functionally related genes, rather than single genes, that are uniquely altered in specific muscle diseases were assessed 235. Genes shared by more than one disease category were removed prior to functional pathway analysis. Using this strategy, we identified downregulated pathways involved in muscle development and regeneration in ICUAW that are not altered in other muscle diseases. This finding is concordant with the functional pathway analysis performed using WGCNA and MMRA in our ICUAW cohort as described in Chapters 3 and 4. Collectively, our transcriptional profiling using co-expression analysis and meta-analysis both found downregulation of genes related to muscle regeneration in ICUAW. The ICUAW-specific signature we identified may serve as a robust biomarker for distinguishing ICUAW from other muscle diseases in the future, as discussed below in Future Directions.

6.2 Limitations

Each phase of a microarray analysis experiment can be affected by factors that adversely influence the final expression estimates. Jaksik et al 230 identified eight steps in a microarray experiment and factors that may affect the quality of the measurements. In the first step, mRNA isolation, mRNA may be compromised by contamination and degradation. Prior to hybridization, the extent of RNA degradation was estimated in our ICUAW transcriptomic analysis using the RNA integrity number (RIN) as a benchmark, and RIN values were found to indicate good RNA quality. The next step, complementary DNA synthesis, is strongly influenced by previous RNA degradation, resulting in creation of truncated mRNAs and causing probes located further from the 3’-end to show lower signal intensity. In the third step, amplification and labeling, differences in processing efficiency for mRNA with various structures may lead to inconsistent signals among samples. In steps four and five, complementary RNA fragmentation and hybridization, differences in process efficiency between probes with various structures may result in uneven hybridization among samples. In step six, microarray washing and staining may impact levels of non-specific hybridization among probes with various RNA structures. Steps seven and eight, scanning and data pre-processing, may generate artifacts related to background correction and normalization. Other important sources of technical noise during microarray analysis include incomplete or outdated gene annotation data, probe binding to alternatively spliced transcripts, and unreliable measurement of genes at high or low levels of expression 420. These factors can all contribute to discrepancies in expression levels between probes representing the same gene for similar samples measured on different manufacturers' platforms (e.g. Agilent, Illumina and Affymetrix) 235, 230.
Probes from most microarrays target only a subset of gene isoforms or transcripts, hybridizing to different parts of the same transcripts. These discrepancies may yield different results that cannot be resolved with enhanced annotation 421. Whereas long oligo probes (e.g. 60-mer probes on Agilent) tend to disallow mishybridization, 25-mer probes on Affymetrix are less suited to discriminating short hybridization products and partially degraded samples 234. Degraded RNA results in high background signal compression which can be detected using clustering analysis. We followed quality control measures for microarray pre-processing that have been validated in the literature; however, this does not exclude the possible limitations described above.

The genes included in the meta-analysis were those present on most of the microarray platforms. Accurate annotation files mapping probes to genes are crucial for microarray integration. For each platform, the platform annotation (GPL) files from the Gene Expression Omnibus (GEO) database were used in the Chapter 5 meta-analysis. Our meta-analysis included more recent Affymetrix “Exon” arrays, which require two levels of summarization to reach the gene level: first the “probeset” summarization to the exon level, then the “transcript cluster” summarization to the gene level. Roughly 18% of probes in transcript clusters had non-unique annotation to genes, i.e. could not be unambiguously assigned 422. A previous study found that sequence alignment of probes to the transcriptome was more accurate than using the vendor’s annotation or Bioconductor, though it was unclear whether this reflected the use of more recent data 421.

There is a growing body of literature demonstrating the importance of specific splicing events for muscle function 380, 382. Mutations that disrupt splicing have been found to contribute to a number of myopathies 423. However, Bachinski et al 422 concluded that aberrant splicing in two forms of muscular dystrophy was not likely to be the driving mechanism of the muscle pathology.
Only a small subset of the microarrays included in the meta-analysis were performed on a platform designed to assess both global gene expression and splicing profiles (Affymetrix “Exon” microarrays 424). Thus we were unable to compare patterns of differential splicing across various muscle diseases. Analysis of conserved epigenetic and miR signatures of muscle disease will also be of great interest once sufficient data are available in the future.

Studies that assess the transcriptome or proteome from single fiber preparations of bulk skeletal muscle tissue have identified fiber type-specific differences in gene expression and protein levels 425. Slow twitch (type I) and fast twitch (type II) muscle fibers have different functional and metabolic characteristics. Predominantly type II muscle fiber atrophy has been described in muscle biopsies from ICUAW 8. We were unable to compare differences in muscle fiber types in our ICU cohort and cannot associate changes in gene expression with the proportions of muscle fiber types. The transcriptional profiles analyzed in this doctoral work are derived from bulk tissues from percutaneous muscle biopsies that may include vasculature, adipose, and immune cells. Cellular composition can change in the disease state, with an increased proportion of cell types unrelated to the cell type of interest that may influence gene expression measurements. During chronic muscle disease it could be expected that the proportion of skeletal muscle cells may decrease relative to other components such as adipose or collagen. Thus genes in the CMDM may reflect changes in these cell type proportions. We assessed cell-type specific expression in muscle disease using enrichment analysis of cell-specific expression profiles. This approach may be suboptimal in comparison to methods for statistical deconvolution of mixed tissue expression, which may be applied to future analysis once human skeletal muscle cell-specific profiles are available 426.

The phenotypic consequences of changes in gene expression levels are presumed to be due to the corresponding protein level changes. Protein abundance is only partially predicted by transcript abundance 427, 428; nevertheless, the abundance of a gene transcript can frequently be used to predict whether the protein is detectable within cells 427. Vogel and Marcotte 427 proposed a model where mRNA expression acts in a switch-like fashion, where low mRNA levels result in undetectable protein levels and protein abundance rises markedly with higher mRNA levels. The relationship between protein abundance and gene transcripts is likely related to numerous factors which affect protein levels independently of transcripts 429. These include post-transcriptional modifications, translational efficiency, alternative splicing, protein folding, secretion, subcellular localization and degradation. Ghazalpour et al 429 found that transcript levels were more strongly correlated with clinical traits (metabolic profiles) in mice than protein levels were.
The investigators attributed this finding partly to the greater technical challenges related to the quantification of proteins versus transcripts, as well as other possible explanations including: 1) increased buffering of protein abundance compared to transcripts in reaction to changes in clinical phenotypes (rather than the molecular phenotypes being causal), 2) phenotypes become more complex as they become further removed from DNA variation (see Figure 1.2), resulting in less variation in the phenotype being linked to DNA variation due to the contribution of numerous additional factors, and 3) differential genetic regulation between protein and transcripts attributed to phenotypic buffering 430.

Among the 11 ICUAW-RM identified in Chapter 3, four modules did not have any functional enrichment, making biological interpretation of these modules challenging. We hypothesize that increased sensitivity of gene module functional enrichment in future analysis may be gained with increasing sample size, use of RNA sequencing, and transcriptomic analysis of single cells. Additionally, the lack of functional themes among these ICUAW-RM may reflect the large bias within the GO database. Several studies have documented a strong research bias in the literature that favors well-annotated genes 431, 432, 136. Su and Hogenesch 431 demonstrated a power-law relationship in gene annotation (measured by links to the biomedical literature) and research funding (measured by gene references in funded grants) that they suggested reflects researchers’ tendency to study previously described genes, as they are easier to study than genes without known function. Haynes et al 432 found that the large inequality across gene annotation resources affected the functional interpretation of a meta-analysis of 104 distinct human diseases. Only 19.5% of published disease-gene associations were identified in the meta-expression analysis at an FDR of 5%. Thus the investigators concluded that the inequality of gene annotations is a result of researchers perpetually ignoring unannotated genes because pathway analysis tools cannot map them to any functional (Gene Ontology) terms. Tomczak et al 136, using the same gene meta-analysis signatures as Haynes et al 432, found strong annotation bias in the GO annotations, where 58% of the annotations are for 16% of the human genes. Thus the functional analysis performed in Chapters 3-5 reflects these inequalities and serves only as an exploratory assessment of functional pathways. The functional analysis was a final step in most of the analytic pathways presented in this doctoral work (e.g. determination of DE genes and modules was not dependent on a priori functional datasets).
Thus modules and DE genes without functional annotation can be prioritized for future functional annotation based on the strength of the molecular data (e.g. effect size or correlation with clinical phenotype).

Several studies have revealed underlying bias in functional enrichment testing when linking miRs with their predicted target genes 433, 176, 434. MiR-target prediction algorithms are more likely to identify targets belonging to certain functional categories, resulting in prediction bias. Therefore, the assumption of uniform sampling underlying the hypergeometric distribution used for standard functional enrichment analysis is violated, resulting in a biased estimate. We avoided the use of predicted miR-targets for functional enrichment analysis and instead performed enrichment on the regulons detected by ARACNE in an unsupervised manner, thereby avoiding prediction bias.

While we applied two highly cited unsupervised methods for network analysis (ARACNE and WGCNA), there is no single best method for detection of gene co-expression modules. WGCNA uses a clustering algorithm to determine module membership, which is dependent on a number of variables including the distance measure and tree cutting parameters. Differences in parameter choice may influence the types of co-expression modules obtained; therefore we applied two different network methods and found that both methods identified common functional themes, increasing our confidence in the selected parameters.

While the meta-analysis framework we applied has been used to detect robust signatures in a number of complex diseases, it also has limitations. The false positive rate of detected genes in the validation cohort of our meta-analysis was found to be substantially higher than estimated by the Benjamini-Hochberg FDR in our discovery analysis. This is concordant with the observation by Sweeney et al 133 that the FDR threshold is insufficient to estimate false positives in meta-analysis.

The heterogeneity of nomenclature for many human miR names can compromise results of miR analysis 176. The definition and annotation of numerous miRs has been significantly modified with successive versions of miRBase (notably miRBase versions 16, 17, 18, and 19) 175. However, publicly available databases of miR-target predictions inconsistently update the miR names denoted in miRBase. This variation in annotation across databases makes integration challenging. To mitigate differences between miR names across miR-target prediction databases, we used miRDIP 176 to provide a single list of pre-processed miR names included in numerous miR-target databases. Then the list of miR names in miRDIP was compared to the list of miR names from the Exiqon miR microarray and any discrepancies were resolved. However, this does not exclude the possibility that some miRs were not included in the analysis due to differences in annotation.
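For reference, the standard enrichment test and the FDR adjustment discussed in this section can be sketched in a few lines. The example below computes the upper-tail hypergeometric probability of observing at least k annotated genes in a gene list, followed by the Benjamini-Hochberg adjustment across several tested terms. It is an illustrative sketch using only the Python standard library, not the software used in Chapters 3-5, and the function names are hypothetical. Note that the hypergeometric calculation assumes every gene in the background is equally likely to be drawn, which is precisely the assumption that predicted miR-target lists violate.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for a hypergeometric draw.

    N: number of genes in the background
    K: number of background genes annotated to the term
    n: number of genes in the list being tested (e.g. a module)
    k: number of genes in the list annotated to the term
    Assumes every background gene is equally likely to be drawn.
    """
    total = comb(N, n)
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts: (k, N, K, n) for three terms tested on one module.
    tests = [(12, 15000, 200, 150), (3, 15000, 400, 150), (1, 15000, 50, 150)]
    pvals = [hypergeom_upper_tail(*t) for t in tests]
    for raw, adj in zip(pvals, benjamini_hochberg(pvals)):
        print(f"p = {raw:.3g}, BH-adjusted = {adj:.3g}")
```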
Of the 39 independent patient cohorts that profiled human muscle disease and normal muscle controls included in the transcriptomic meta-analysis in Chapter 5, only four studies made available clinical measurements of muscle mass or strength, and one study included histopathologic scoring of fibrosis. The clinical data from these cohorts were crucial for assessing the degree of association between the common muscle disease gene signature and severity of muscle impairment measured clinically. Given the importance of clinical data for interpretation of the gene and miR expression changes, it is hoped that future protocols for submission to peer-reviewed journals will require clinical data to be included along with transcriptomic profiles in public repositories such as GEO and ArrayExpress. Only one cohort of muscle atrophy secondary to cancer met quality criteria in the meta-analysis, and it had a vastly different expression profile from the other cohorts. It will be important to obtain more transcriptional profiles of peripheral muscle in future cohorts of cancer cachexia. Additionally, there were too few cohorts in the neuromuscular disease category to include it in the discovery and primary validation steps of the meta-analysis. The two available neuromuscular disease cohorts were used in a secondary validation to verify that the expression of the CMDM genes was in keeping with the other cohorts.

There are limitations to using healthy controls, rather than patients discharged from ICU who did not develop ICUAW, as the comparison group for our ICUAW cohort in transcriptional analysis. Patients with ICUAW may have complex comorbidities with reduced mobility and exposure to adverse effects (iatrogenicity) of ICU stay compared to healthy controls. In our analysis of the ICUAW cohort we did not control for exposure to pharmacologic agents during the ICU stay that have been implicated in ICUAW, such as corticosteroids or neuromuscular blockers. These differences may mask the underlying gene dysregulation contributing to ICUAW. Alternatively, ICUAW may result from pathways of critical illness and ICU iatrogenicity that converge to result in different degrees of muscle pathology. Thus other investigators have used controls undergoing elective orthopedic surgery 110, or high-risk cardiothoracic surgery patients without muscle weakness 92 with matched comorbidities. This strategy, however, does not correct for the various therapies and differences in mobility in patients with ICUAW versus these perioperative controls. There are reports of patients with abdominal infections, sepsis or trauma managed outside of the ICU who were diagnosed with ICUAW, suggesting that ICU stay is not an essential causative factor for the development of ICUAW 435. The primary focus of our ICUAW study was to identify all abnormal signaling and cellular processes contributing to ICUAW that are dysregulated compared to normal muscle. The study was not designed to identify biomarkers of ICUAW between survivors of ICU stay with and without ICUAW.
The normal controls were also useful to determine whether patients at month 6 post-ICU discharge showed resolution of the molecular phenotype back to normal (i.e. no gene expression differences compared to controls). The gene expression differences identified in this thesis work, however, do not imply causation. To assess a causative role for gene expression changes would require direct experimentation on patients with inhibitors or inducers of gene expression, which is infeasible in a discovery study.

While our ICUAW cohort is the largest to date to undergo transcriptomic profiling, its rather limited sample size reflects the challenges posed by recruitment of patients after critical illness and the invasive nature of the biopsy sample procurement. The sample size was further limited by subgroup analysis between Improvers and Non-improvers. Additionally, the cohort sample size may have limited the signal strength for functional pathway analysis using WGCNA. This is exemplified in a study by Gupta et al 436, which assessed the enrichment of a module initially detected by Voineagu et al 145 using an autism cohort with substantially increased sample size compared to the prior study by Voineagu et al. Gupta et al found that the previously identified module was enriched for more disease-relevant functional pathways compared to the WGCNA study using the smaller sample size. Additionally, the investigators were able to identify the cell-type enrichment within the module more accurately. Thus, it may be possible to more accurately characterize the ICUAW-RM described in Chapter 3 using a larger cohort. Additionally, a larger cohort of ICUAW patients at month 6 will be required to determine whether the miR signature separating Improvers from Non-improvers is reproducible.

6.3 Conclusion

This doctoral work has applied unsupervised, data-driven analyses to identify novel and clinically relevant miRs and genes dysregulated in the quadriceps muscle of patients with ICUAW at day 7 post-ICU discharge and at follow up at month 6 post-ICU compared to healthy controls. Additionally, a miR signature separated ICUAW patients that improved their muscle mass (“Improvers”) from those who did not (“Non-improvers”) after 6 month follow up post-ICU discharge. Gene co-expression changes associated with reductions in muscle mass, strength or both were identified, and functional analyses predominantly found alterations in genes associated with muscle regeneration and mitochondria, in keeping with histopathologic studies of ICUAW.

The doctoral work has identified miR master regulators that target a large proportion of the differentially expressed genes in ICUAW, predominantly at day 7 post-ICU. These miRs included two “atromiRs”, miR-424-5p and miR-29b, as well as the “myomiR” miR-206. MiR-424-5p was found to regulate the greatest proportion of DE genes (24%) in early ICUAW, including downregulated genes related to striated muscle cell differentiation. Using C2C12 murine myoblasts, miR-424-5p was found to be upregulated during myogenic differentiation. A miR signature separating Improvers from Non-improvers contained miRs with uncertain function that may also be linked to muscle atrophy observed in ICUAW patients without muscle mass improvement.

For our final study we performed meta-analysis of muscle transcriptional profiles of human muscle diseases and healthy controls to identify a common signature of genes dysregulated across muscle diseases as well as those genes with expression changes that are unique to ICUAW. A 131-gene signature common across muscle disease was identified using 490 muscle tissue samples from 22 cohorts and validated in 17 additional patient cohorts encompassing five categories of muscle diseases. Finally, removing the genes common to muscle disease from meta-analysis of only ICUAW cohorts revealed uniquely down-regulated muscle development and contraction genes specific to ICUAW. These findings collectively implicate dysregulated miRs and genes in aberrant muscle regeneration in ICUAW. However, future work will be needed to further characterize these miRs and genes to determine which directly contribute to muscle disease.

6.4 Future Directions

Clinical measures of muscle strength, the current diagnostic criteria for ICUAW, are often based on subjective clinician assessment and compromised by patients with altered consciousness in the ICU. Ideal biomarkers will provide objective, non-volitional and non-invasive measures for diagnosis and prognostication of functional outcomes in critically ill patients. Prognostic biomarkers able to predict disease outcomes such as loss of functional independence after critical illness would enable improved selection of critically ill patients in clinical trials of ICUAW therapies. There is an unmet need for clinicians to improve ICUAW patient care using novel biomarkers with diagnostic or prognostic capabilities. This prognostic information would be valuable for clinicians, patients and caregivers for discussions regarding goals of care in the intensive care setting. Additionally, as ICUAW cohorts tend to be composed of highly heterogeneous patient populations, prognostic biomarkers could also be used to enrich therapeutic trials by including patients with a high probability of developing significant illness. Furthermore, biomarkers serving as surrogate endpoints for functional performance in ICUAW may show smaller inter-individual variation than clinical measures, therefore enabling better controlled studies requiring fewer participants 437.

Previous biomarker candidates have comprised non-specific serum markers of muscle or nerve injury including creatine kinase, alpha-actin, myoglobin, troponin 438 and neurofilaments 439, the latter having demonstrated poor specificity and sensitivity for diagnosis of ICUAW in a recent study. Circulating miRs from blood (serum) samples have been found to be promising biomarkers in muscle diseases including muscular dystrophy 375. The advantage of using blood samples compared to muscle biopsy is the relatively non-invasive and simple means of obtaining blood samples without the need for specialists. Therefore larger sample numbers would likely be obtained, thereby increasing the power of future studies. Monitoring circulating miRs has been proposed as a non-invasive method for tracking muscle disease progression. The discovery of serum miR and mRNA biomarkers correlating with clinical measures of muscle weakness is expected to be feasible based on extrapolation from other human and animal studies of several types of muscle diseases. One prior study has shown that among a selected set of candidate miRs, serum miR-181a may serve as an early marker of muscle weakness in ICUAW 92. Furthermore, previous studies in other human muscle diseases 410 have found concordant expression of some DE miRs between muscle and plasma. The identification of dysregulated miRs in muscle from ICUAW patients in our analysis favors the probability of detecting these miRs in blood samples from patients with ICUAW.

The first study in our future work will comprise three parts, each assessing one component from blood samples in patients with ICUAW: 1) circulating miRs, 2) exosomal vesicle (EV) mRNA and 3) peripheral blood cell mRNA transcriptome. The analysis will compare 1) ICUAW at day 7 and month 6 to healthy controls and 2) Improvers versus Non-improvers at month 6 post-ICU, where Improvers are defined using the previously established criterion of a muscle mass increase greater than 10 cm² cross sectional area at month 6 compared to day 7 post-ICU discharge. We will obtain quadriceps muscle biopsies as described previously, in addition to blood samples taken concurrently.
In our future study, patients with ICUAW will be thoroughly clinically phenotyped at day 7 and month 6 as described in Chapter 3. Additionally, healthy volunteers will be recruited to provide a blood and muscle biopsy sample. Part one of the first study will compare the expression of the miR master regulators identified in Chapter 4 between serum samples and muscle biopsies in a new cohort of ICUAW patients and tissue bank healthy controls. The study will assess whether the miRs in skeletal muscle are also observed in the serum and whether the expression of these circulating miRs correlates with clinical measures of strength and mass in ICUAW. We will perform muscle and serum miR profiling for the miR master regulators using a commercially available RT-qPCR array platform. The relative levels of circulating miRs found in EVs will be compared with the non-vesicular fractions (expected to contain miRs bound to protein/lipoprotein complexes) by separation of supernatant and pellet using centrifugation methods previously described 440. Using this method, the pellet contains the EV fraction and the supernatant contains the non-vesicular fractions of miRs that can be quantified by RT-qPCR. RNA immunoprecipitation of the supernatant using anti-Argonaute-2 and anti-Apolipoprotein A-1 will be performed to determine the protein-binding partners of the dysregulated miRs of interest.

The second part of our first study will assess the mRNA expression profile of extracellular vesicles (EV) in the collected serum samples from patients with ICUAW and controls. The common muscle disease module (CMDM) identified in Chapter 5 was found to be enriched for genes located in the EV, favoring the possibility of identifying these mRNAs in EVs from the blood samples of ICUAW patients. The expression of EV genes will be identified using the Illumina microarray platform used in Chapter 3. The EV mRNA expression levels will also be correlated with clinical measures of muscle mass and weakness as described above, and the association will be compared with that of their corresponding muscle transcripts identified in Chapter 5. Additionally, the predictive role of combined EV miR and EV mRNA biomarkers will be compared to miR or mRNA biomarkers individually. Next we will assess for interactions among the EV miRs and EV mRNAs using miR target prediction summary scores from the miRDIP database.

Next, whole (peripheral) blood gene expression will be assessed via microarray whole-transcriptomic profiling to identify gene transcripts differentiating ICUAW versus controls and Improvers versus Non-improvers. The transcripts will also be correlated to clinical measures of mass and strength.
A large meta-analysis measuring hand grip strength in four cohorts and whole blood gene expression via microarray found 221 genes associated with strength after adjustment for cofactors 441. This suggests that whole blood (white blood cell) gene profiling may also be a valuable source of biomarkers in ICUAW. As whole blood is a mix of multiple cell subtypes, it is expected to have greater power to detect net expression changes in the cell subtypes that are in highest abundance (e.g. neutrophils), but will have less power to detect smaller expression changes within less numerous white cell subtypes.

This future study will also serve as a new cohort of patients to validate the miR signature derived from skeletal muscle separating Improvers from Non-improvers identified in Chapter 4. We will also assess whether a potential prognostic biomarker of muscle mass, predicting which patients become Improvers, taken at day 7 post-ICU from blood samples can be identified using each of the transcriptomic datasets obtained above. Differential expression analysis of serum miR expression changes at day 7 post-ICU will be performed comparing the patients that become Improvers versus Non-improvers at month 6 post-ICU.

The new cohort will also be important for assessing the performance of the CMDM signature using the gene profiles from muscle and blood (EV mRNA and peripheral blood) and clinical phenotyping from the new cohort. The sensitivity and specificity for separating ICUAW from controls will be assessed with construction of receiver operating characteristic curves. Correlations between the CMDM scores from each of the three transcriptomes (muscle, EV, peripheral blood) and the clinical measures of mass and strength will be compared. It is expected that muscle biopsy samples from new cohorts of human muscle diseases profiled using gene microarray will be submitted to public repositories, some of which will also contain clinical or histopathological phenotyping. These datasets submitted by other investigators will also serve as a secondary validation of the CMDM score. Additionally, a meta-analysis of miR transcriptional profiles comparing human muscle disease to controls may be feasible in the future as the number of publicly available datasets continues to increase.

Given the spectrum of disease progression and severity in ICUAW, results of clinical trials may be biased by unbalanced numbers of patients who do not recover muscle mass even in the absence of any therapy. Therefore, we hypothesize that a prognostic biomarker measured at day 7 that predicts muscle mass improvement (identified as described above), applied at enrollment in therapeutic trials, may aid the stratification of patients within trial arms. To test this hypothesis, we will collaborate with other investigators who are designing clinical trials testing therapies for ICUAW such as physical rehabilitation.
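As an illustration of the planned receiver operating characteristic analysis, the sketch below builds an ROC curve from a per-sample score (for example a CMDM score) and computes the area under the curve with the trapezoidal rule. It is a minimal example using only the Python standard library, with hypothetical function names and made-up scores; it assumes that a higher score indicates a more disease-like sample.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs over all thresholds.

    labels: 1 for an ICUAW case, 0 for a control.
    A sample is called positive when its score is >= the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        # Sensitivity is tp / n_pos; specificity is 1 - fp / n_neg.
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


if __name__ == "__main__":
    # Hypothetical scores for three cases (1) and three controls (0).
    example_scores = [2.1, 1.4, 0.2, 0.9, -0.5, -1.3]
    example_labels = [1, 1, 1, 0, 0, 0]
    curve = roc_points(example_scores, example_labels)
    print(curve)
    print(auc(curve))
```

Each point on the curve corresponds to one candidate threshold, so the sensitivity and specificity at any chosen cut-off can be read directly from the returned pairs.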
We will assess a clinical trial design with two comparisons: 1) no intervention versus intervention without stratification, and 2) no intervention versus intervention where both groups have a roughly even distribution of patients predicted to improve muscle mass using the prognostic biomarker. Additionally, these trials will measure changes in the CMDM severity scores after intervention and compare these to the clinical measures for assessing response to therapy. The biomarker-based measures may be complementary to the clinical measures and could be combined into a single prognostic scoring system. It is expected that clinical trial design and performance will be improved with use of biomarkers that serve as objective outcome measurements linked to changes in strength or muscle mass.

This thesis gives rise to possible avenues for investigation of potential therapies for ICUAW. Restoring satellite cell numbers using mesenchymal stem cell injection may restore muscle in patients with ICUAW, as has been demonstrated in murine models 125. Satellite cell transplantation using autologous (patient-derived) transplants from myogenic progenitor cells has been described in human and animal models 442. However, the clinical use of stem cell therapy is more challenging than that of traditional, readily available small molecule drugs. Therefore, we will investigate whether generic drug repurposing can effectively treat ICUAW. To identify drug candidates, we will perform computational pharmacogenomics to explore large-scale drug perturbation data sets, including the Library of Integrated Network-based Cellular Signatures (LINCS) and Connectivity Map (CMap). These databases contain transcriptomic profiles from dozens of cell lines treated with thousands of chemical compounds 443. The drug-specific gene expression profiles contained in these databases can be compared with the ICUAW disease-specific signatures that we have identified. Small molecules identified using this method are anticipated to reverse the gene expression changes of ICUAW. Such analysis will be performed using the PharmacoGx package to integrate platforms and remove biases from different sources 444.

A second possibility for ICUAW treatment involves miR therapy. Presently no miR therapies for muscle atrophy are undergoing clinical trials, and it remains unclear whether miR therapies should target a single miR or a cluster of miRs 108. However, this thesis identified putative miR master regulators that are good candidates for investigating miR therapies in ICUAW. New high-throughput approaches are becoming available to identify drug-miR relationships that may allow identification of known, readily available drugs that may alter miR expression and the disease phenotype 445.

Future experimental work will assess genes of unknown function within the 131-gene CMDM signature to determine which genes contribute directly to muscle disease pathology.
Candidate CMDM genes without any previous characterization in muscle disease based on a systematic literature search will be selected in order of absolute effect size for further study. To characterize the effect of a downregulated CMDM gene, a knockout mouse model will be studied. To study an upregulated CMDM gene, a transgenic mouse model will be used. The knockout and transgenic mouse models displaying muscle atrophy phenotypes can also be used to study responses to miR targeted therapies. The mouse orthologues of each miR master regulator can be over- or under- expressed in these mouse models and changes in muscle mass and strength can be assessed.
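The comparison of drug-specific expression profiles with the ICUAW disease signature, as proposed for the LINCS and CMap data, can be illustrated with a minimal sketch. It is written in Python for illustration only: it does not use PharmacoGx, and it replaces the connectivity scores used by CMap and LINCS with a plain Spearman correlation over shared genes. The function names (`ranks`, `spearman`, `rank_compounds`), the gene and compound labels, and all numeric values are hypothetical. Under these assumptions, a compound whose profile is strongly anti-correlated with the disease signature would be a candidate for reversing it.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no ranking information
    return cov / (var_x * var_y) ** 0.5


def rank_compounds(disease_sig, drug_sigs, min_shared=3):
    """Order compounds from most reversing (most negative score) upward.

    disease_sig: gene -> log fold change in disease versus control.
    drug_sigs: compound -> (gene -> log fold change, treated versus vehicle).
    Compounds sharing fewer than min_shared genes with the disease
    signature are skipped (a hypothetical threshold).
    """
    scores = []
    for compound, sig in drug_sigs.items():
        shared = sorted(set(disease_sig) & set(sig))
        if len(shared) < min_shared:
            continue
        rho = spearman([disease_sig[g] for g in shared],
                       [sig[g] for g in shared])
        scores.append((compound, rho))
    return sorted(scores, key=lambda item: item[1])


# Hypothetical toy data, not taken from the thesis.
disease_signature = {"GENE_A": -2.0, "GENE_B": -1.0, "GENE_C": 0.5, "GENE_D": 1.5}
drug_signatures = {
    "compound_1": {"GENE_A": 1.8, "GENE_B": 0.9, "GENE_C": -0.4, "GENE_D": -1.2},
    "compound_2": {"GENE_A": -1.5, "GENE_B": -0.7, "GENE_C": 0.2, "GENE_D": 1.1},
    "compound_3": {"GENE_A": 0.1, "GENE_B": 0.3},  # too few shared genes
}
ranking = rank_compounds(disease_signature, drug_signatures)
print(ranking)  # compound_1 is listed first: its profile opposes the disease signature
```

A rank-based score is used here because it tolerates differences in scale between platforms. A real analysis would additionally need the batch correction and cross-platform integration steps for which PharmacoGx is proposed above.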


REFERENCES

1. Batt J, dos Santos CC, Cameron JI, Herridge MS. Intensive care unit-acquired weakness: clinical phenotypes and molecular mechanisms. Am J Respir Crit Care Med. 2013;187(3):238-246.

2. Herridge MS, Tansey CM, Matte A, et al. Functional disability 5 years after acute respiratory distress syndrome. The New England journal of medicine. 2011;364(14):1293-1304.

3. Herridge MS, Cheung AM, Tansey CM, et al. One-year outcomes in survivors of the acute respiratory distress syndrome. The New England journal of medicine. 2003;348(8):683-693.

4. Bloch S, Polkey MI, Griffiths M, Kemp P. Molecular mechanisms of intensive care unit-acquired weakness. The European respiratory journal. 2012;39(4):1000-1011.

5. Levine S, Nguyen T, Taylor N, et al. Rapid disuse atrophy of diaphragm fibers in mechanically ventilated humans. The New England journal of medicine. 2008;358(13):1327-1335.

6. Garnacho-Montero J, Amaya-Villar R, García-Garmendía JL, Madrazo-Osuna J, Ortiz-Leyba C. Effect of critical illness polyneuropathy on the withdrawal from mechanical ventilation and the length of stay in septic patients*. Critical Care Medicine. 2005;33(2):349-354.

7. De Jonghe B, Bastuji-Garin S, Durand M-C, et al. Respiratory weakness is associated with limb weakness and delayed weaning in critical illness*. Critical Care Medicine. 2007;35(9):2007-2015.

8. Schefold JC, Bierbrauer J, Weber-Carstens S. Intensive care unit-acquired weakness (ICUAW) and muscle wasting in critically ill patients with severe sepsis and septic shock. Journal of cachexia, sarcopenia and muscle. 2010;1(2):147-157.

9. Walsh CJ, Batt J, Herridge MS, Dos Santos CC. Muscle Wasting and Early Mobilization in Acute Respiratory Distress Syndrome. Clinics in chest medicine. 2014;35(4):811-826.

10. Ali NA, O'Brien JM, Jr., Hoffmann SP, et al. Acquired weakness, handgrip strength, and mortality in critically ill patients. Am J Respir Crit Care Med. 2008;178(3):261-268.

11. Tzanis G, Vasileiadis I, Zervakis D, et al. Maximum inspiratory pressure, a surrogate parameter for the assessment of ICU-acquired weakness. BMC anesthesiology. 2011;11:14.

12. De Jonghe B, Bastuji-Garin S, Sharshar T, Outin H, Brochard L. Does ICU-acquired paresis lengthen weaning from mechanical ventilation? Intensive Care Med. 2004;30(6):1117-1121.

13. Hough CL, Steinberg KP, Taylor Thompson B, Rubenfeld GD, Hudson LD. Intensive care unit-acquired neuromyopathy and corticosteroids in survivors of persistent ARDS. Intensive Care Med. 2009;35(1).

14. Miller MB, Tang YW. Basic concepts of microarrays and potential applications in clinical microbiology. Clinical microbiology reviews. 2009;22(4):611-633.

15. De Jonghe B, Sharshar T, Lefaucheur JP, et al. Paresis acquired in the intensive care unit: a prospective multicenter study. Jama. 2002;288(22):2859-2867.

16. Connolly BA, Jones GD, Curtis AA, et al. Clinical predictive value of manual muscle strength testing during critical illness: an observational cohort study. Critical care. 2013;17(5):R229.

17. Stevens RD, Marshall SA, Cornblath DR, et al. A framework for diagnosing and classifying intensive care unit-acquired weakness. Critical care medicine. 2009;37(10 Suppl):S299-308.

18. Latronico N, Nisoli E, Eikermann M. Muscle weakness and nutrition in critical illness: matching nutrient supply and use. The Lancet Respiratory Medicine. 2013;1(8):589-590.

19. Ferrando AA, Paddon-Jones D, Wolfe RR. Bed rest and myopathies. Current opinion in clinical nutrition and metabolic care. 2006;9(4):410-415.

20. Wilcox ME, Herridge MS. Long-term outcomes in patients surviving acute respiratory distress syndrome. Seminars in respiratory and critical care medicine. 2010;31(1):55-65.

21. Kleyweg RP, van der Meche FG, Schmitz PI. Interobserver agreement in the assessment of muscle strength and functional abilities in Guillain-Barre syndrome. Muscle & nerve. 1991;14(11):1103-1109.

22. Feasson L, Stockholm D, Freyssenet D, et al. Molecular adaptations of neuromuscular disease-associated proteins in response to eccentric exercise in human skeletal muscle. The Journal of Physiology. 2002;543(1):297-306.

23. Linacre JM, Heinemann AW, Wright BD, Granger CV, Hamilton BB. The structure and stability of the Functional Independence Measure. Archives of physical medicine and rehabilitation. 1994;75(2):127-132.

24. McHorney CA, Ware JEJ, Lu JF, Sherbourne CD. The MOS 36-item Short-Form Health Survey (SF-36). III. 1994;32(1):40-66.

25. Stevens RD, Dowdy DW, Michaels RK, Mendez-Tellez PA, Pronovost PJ, Needham DM. Neuromuscular dysfunction acquired in critical illness: a systematic review. Intensive Care Med. 2007;33(11):1876-1891.

26. Latronico N, Shehu I, Elisa S. Neuromuscular sequelae of critical illness. Current opinion in critical care. 2005;11(4):381-390.

27. Nanas S, Kritikos K, Angelopoulos E, et al. Predisposing factors for critical illness polyneuromyopathy in a multidisciplinary intensive care unit. Acta neurologica Scandinavica. 2008;118(3):175-181.

28. Papazian L, Forel JM, Gacouin A, et al. Neuromuscular blockers in early acute respiratory distress syndrome. The New England journal of medicine. 2010;363(12):1107-1116.

29. Yang T, Li Z, Jiang L, Wang Y, Xi X. Risk factors for intensive care unit-acquired weakness: A systematic review and meta-analysis. Acta neurologica Scandinavica. 2018;138(2):104-114.

30. de Jonghe B, Lacherade JC, Sharshar T, Outin H. Intensive care unit-acquired weakness: risk factors and prevention. Critical care medicine. 2009;37(10 Suppl):S309-315.

31. Song C, Tseng GC. Hypothesis setting and order statistic for robust genomic meta-analysis. The Annals of Applied Statistics. 2014;8(2):777-800.

32. Bednarik J, Vondracek P, Dusek L, Moravcova E, Cundrle I. Risk factors for critical illness polyneuromyopathy. Journal of neurology. 2005;252(3):343-351.

33. McCollum ED, Preidis GA, Maliwichi M, et al. Clinical versus rapid molecular HIV diagnosis in hospitalized African infants: a randomized controlled trial simulating point-of-care infant testing. J Acquir Immune Defic Syndr. 2014;66(1):e23-30.

34. Witteveen E, Wieske L, van der Poll T, et al. Increased Early Systemic Inflammation in ICU-Acquired Weakness; A Prospective Observational Cohort Study. Critical care medicine. 2017;45(6):972-979.

35. van den Berghe G, Wouters P, Weekers F, et al. Intensive insulin therapy in critically ill patients. N Engl J Med. 2001;345(19):1359-1367.

36. Van den Berghe G, Wilmer A, Hermans G, et al. Intensive insulin therapy in the medical ICU. The New England journal of medicine. 2006;354(5):449-461.

37. Michiels S, Koscielny S, Hill C. Prediction of cancer outcome with microarrays: a multiple random validation strategy. The Lancet. 2005;365(9458):488-492.

38. Gesthalter YB, Vick J, Steiling K, Spira A. Translating the transcriptome into tools for the early detection and prevention of lung cancer. Thorax. 2015;70(5):476-481.

39. Bercker S, Weber-Carstens S, Deja M, et al. Critical illness polyneuropathy and myopathy in patients with acute respiratory distress syndrome. Critical care medicine. 2005;33(4):711-715.

40. Garnacho-Montero J, Madrazo-Osuna J, Garcia-Garmendia JL, et al. Critical illness polyneuropathy: risk factors and clinical consequences. A cohort study in septic patients. Intensive Care Med. 2001;27(8):1288-1296.

41. Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007;8(1):118-127.

42. Bolstad B. affyPLM: Model Based QC Assessment of Affymetrix GeneChips. 2015.

43. Meduri GU, Golden E, Freire AX, et al. Methylprednisolone infusion in early severe ARDS: results of a randomized controlled trial. Chest. 2007;131(4):954-963.

44. Li Z, Herold T, He C, et al. Identification of a 24-gene prognostic signature that improves the European LeukemiaNet risk classification of acute myeloid leukemia: an international collaborative study. Journal of clinical oncology : official journal of the American Society of Clinical Oncology. 2013;31(9):1172-1181.

45. Rung J, Brazma A. Reuse of public genome-wide gene expression data. Nature reviews Genetics. 2013;14(2):89-99.

46. Jackson MJ. Control of reactive oxygen species production in contracting skeletal muscle. Antioxidants & redox signaling. 2011;15(9):2477-2486.

47. Fan E, Zanni JM, Dennison CR, Lepre SJ, Needham DM. Critical illness neuromyopathy and muscle weakness in patients in the intensive care unit. AACN advanced critical care. 2009;20(3):243-253.

48. Morris PE. Moving our critically ill patients: mobility barriers and benefits. Critical care clinics. 2007;23(1):1-20.

49. Bailey P, Thomsen GE, Spuhler VJ, et al. Early activity is feasible and safe in respiratory failure patients. Critical care medicine. 2007;35(1):139-145.

50. Morris PE, Goad A, Thompson C, et al. Early intensive care unit mobility therapy in the treatment of acute respiratory failure. Critical care medicine. 2008;36(8):2238-2243.

51. Needham DM, Korupolu R, Zanni JM, et al. Early physical medicine and rehabilitation for patients with acute respiratory failure: a quality improvement project. Arch Phys Med Rehabil. 2010;91(4):536-542.

52. Langfelder P, Zhang B, Horvath S. Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut package for R. Bioinformatics. 2008;24(5):719-720.

53. Thomsen GE, Snow GL, Rodriguez L, Hopkins RO. Patients with respiratory failure increase ambulation after transfer to an intensive care unit where early activity is a priority. Critical care medicine. 2008;36(4):1119-1124.

54. Warnat P, Eils R, Brors B. Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes. BMC bioinformatics. 2005;6:265.

55. Perme C, Chandrashekar R. Early mobility and walking program for patients in intensive care units: creating a standard of care. American journal of critical care : an official publication, American Association of Critical-Care Nurses. 2009;18(3):212-221.

56. Jones C, Skirrow P, Griffiths RD, et al. Rehabilitation after critical illness: a randomized, controlled trial. Critical care medicine. 2003;31(10):2456-2461.

57. Burtin C, Clerckx B, Robbeets C, et al. Early exercise in critically ill patients enhances short-term functional recovery. Critical care medicine. 2009;37(9):2499-2505.

58. R Core Team. A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. 2012.

59. Routsi C, Gerovasili V, Vasileiadis I, et al. Electrical muscle stimulation prevents critical illness polyneuromyopathy: a randomized parallel intervention trial. Critical care. 2010;14(2):R74.

60. Konstantinopoulos PA, Cannistra SA, Fountzilas H, et al. Integrated analysis of multiple microarray datasets identifies a reproducible survival predictor in ovarian cancer. PloS one. 2011;6(3):e18202.

61. Vasilevskis EE, Ely EW, Speroff T, Pun BT, Boehm L, Dittus RS. Reducing iatrogenic risks: ICU-acquired delirium and weakness--crossing the quality chasm. Chest. 2010;138(5):1224-1233.

62. Girard TD, Kress JP, Fuchs BD, et al. Efficacy and safety of a paired sedation and ventilator weaning protocol for mechanically ventilated patients in intensive care (Awakening and Breathing Controlled trial): a randomised controlled trial. Lancet. 2008;371(9607):126-134.

63. Abeel T, Helleputte T, Van de Peer Y, Dupont P, Saeys Y. Robust biomarker identification for cancer diagnosis with ensemble feature selection methods. Bioinformatics. 2010;26(3):392-398.

64. Denehy L, Berney S. Physiotherapy in the intensive care unit. Physical Therapy Reviews. 2006;11(1):49-56.

65. Morris PE, Herridge MS. Early intensive care unit mobility: future directions. Critical care clinics. 2007;23(1):97-110.

66. Puthucheary ZA, Rawal J, McPhail M, et al. Acute skeletal muscle wasting in critical illness. Jama. 2013;310(15):1591-1600.

67. Batt J, Dos Santos CC, Herridge MS. Muscle injury during critical illness. Jama. 2013;310(15):1569-1570.

68. Takala J, Ruokonen E, Webster NR, et al. Increased mortality associated with growth hormone treatment in critically ill adults. N Engl J Med. 1999;341(11):785-792.

69. Rennie MJ. Anabolic resistance in critically ill patients. Critical care medicine. 2009;37(10 Suppl):S398-399.

70. McClung JM, Davis JM, Wilson MA, Goldsmith EC, Carson JA. Estrogen status and skeletal muscle recovery from disuse atrophy. J Appl Physiol (1985). 2006;100(6):2012-2023.

71. Mukai R, Horikawa H, Fujikura Y, et al. Prevention of disuse muscle atrophy by dietary ingestion of 8-prenylnaringenin in denervated mice. PloS one. 2012;7(9):e45048.

72. Goncalves DA, Silveira WA, Lira EC, et al. Clenbuterol suppresses proteasomal and lysosomal proteolysis and atrophy-related genes in denervated rat soleus muscles independently of Akt. American journal of physiology Endocrinology and metabolism. 2012;302(1):E123-133.

73. Kline WO, Panaro FJ, Yang H, Bodine SC. Rapamycin inhibits the growth and muscle-sparing effects of clenbuterol. J Appl Physiol (1985). 2007;102(2):740-747.

74. Ryall JG, Lynch GS. The potential and the pitfalls of beta-adrenoceptor agonists for the management of skeletal muscle wasting. Pharmacology & therapeutics. 2008;120(3):219-232.

75. Sandri M. Protein breakdown in muscle wasting: role of autophagy-lysosome and ubiquitin-proteasome. The international journal of biochemistry & cell biology. 2013;45(10):2121-2129.

76. Supinski GS, Vanags J, Callahan LA. Effect of proteasome inhibitors on endotoxin-induced diaphragm dysfunction. American journal of physiology Lung cellular and molecular physiology. 2009;296(6):L994-L1001.

77. Park JW, Kim KM, Oh KJ, Rhyu IJ, Jang HS. Proteasome inhibition promotes functional recovery after peripheral nerve reperfusion injury. The Journal of trauma. 2009;66(3):743-748.

78. Agten A, Maes K, Thomas D, et al. Bortezomib partially protects the rat diaphragm from ventilator-induced diaphragm dysfunction. Critical care medicine. 2012;40(8):2449-2455.

79. Xiao Y, Yin J, Wei J, Shang Z. Incidence and risk of cardiotoxicity associated with bortezomib in the treatment of cancer: a systematic review and meta-analysis. PloS one. 2014;9(1):e87671.

80. Parotto M, Batt J, Herridge M. The Pathophysiology of Neuromuscular Dysfunction in Critical Illness. Critical care clinics. 2018;34(4):549-556.

81. Allen DC, Arunachalam R, Mills KR. Critical illness myopathy: further evidence from muscle-fiber excitability studies of an acquired channelopathy. Muscle & nerve. 2008;37(1):14-22.

82. Carre JE, Orban JC, Re L, et al. Survival in critical illness is associated with early activation of mitochondrial biogenesis. Am J Respir Crit Care Med. 2010;182(6):745-751.

83. Weber-Carstens S, Koch S, Spuler S, et al. Nonexcitable muscle membrane predicts intensive care unit-acquired paresis in mechanically ventilated, sedated patients. Critical care medicine. 2009;37(9):2632-2637.

84. Weber-Carstens S, Deja M, Koch S, et al. Risk factors in critical illness myopathy during the early course of critical illness: a prospective observational study. Critical care. 2010;14(3):R119.

85. Brealey D, Karyampudi S, Jacques TS, et al. Mitochondrial dysfunction in a long-term rodent model of sepsis and organ failure. American journal of physiology Regulatory, integrative and comparative physiology. 2004;286(3):R491-497.

86. Brealey D, Brand M, Hargreaves I, et al. Association between mitochondrial dysfunction and severity and outcome of septic shock. The Lancet. 2002;360(9328):219-223.

87. Fredriksson K, Tjader I, Keller P, et al. Dysregulation of mitochondrial dynamics and the muscle transcriptome in ICU patients suffering from sepsis induced multiple organ failure. PloS one. 2008;3(11):e3686.

88. Romanello V, Sandri M. Mitochondrial Quality Control and Muscle Mass Maintenance. Front Physiol. 2015;6:422.

89. Picard M, Shirihai OS, Gentil BJ, Burelle Y. Mitochondrial morphology transitions and functions: implications for retrograde signaling? American journal of physiology Regulatory, integrative and comparative physiology. 2013;304(6):R393-406.

90. Sullivan JS, Kilpatrick L, Costarino AT, Lee SC, Harris MC. Correlation of plasma cytokine elevations with mortality rate in children with sepsis. The Journal of Pediatrics. 1992;120(4):510-515.

91. Friedrich O, Reid MB, Van den Berghe G, et al. The Sick and the Weak: Neuropathies/Myopathies in the Critically Ill. Physiol Rev. 2015;95(3):1025-1109.

92. Bloch SA, Lee JY, Syburra T, et al. Increased expression of GDF-15 may mediate ICU-acquired weakness by down-regulating muscle microRNAs. Thorax. 2015;70(3):219-228.

93. Constantin D, McCullough J, Mahajan RP, Greenhaff PL. Novel events in the molecular regulation of muscle mass in critically ill patients. The Journal of physiology. 2011;589(Pt 15):3883-3895.

94. Tiao G, Hobler S, Wang JJ, et al. Sepsis is associated with increased mRNAs of the ubiquitin-proteasome proteolytic pathway in human skeletal muscle. The Journal of clinical investigation. 1997;99(2):163-168.

95. Jespersen JG, Nedergaard A, Reitelseder S, et al. Activated protein synthesis and suppressed protein breakdown signaling in skeletal muscle of critically ill patients. PloS one. 2011;6(3):e18090.

96. Zhao J, Brault JJ, Schild A, et al. FoxO3 coordinately activates protein degradation by the autophagic/lysosomal and proteasomal pathways in atrophying muscle cells. Cell metabolism. 2007;6(6):472-483.

97. Wang X, Blagden C, Fan J, et al. Runx1 prevents wasting, myofibrillar disorganization, and autophagy of skeletal muscle. Genes & development. 2005;19(14):1715-1722.

98. Segal NA, Zimmerman MB, Brubaker M, Torner JC. Obesity and knee osteoarthritis are not associated with impaired quadriceps specific strength in adults. PM & R : the journal of injury, function, and rehabilitation. 2011;3(4):314-323; quiz 323.

99. Talbert EE, Smuder AJ, Min K, Kwon OS, Szeto HH, Powers SK. Immobilization-induced activation of key proteolytic systems in skeletal muscles is prevented by a mitochondria-targeted antioxidant. J Appl Physiol (1985). 2013;115(4):529-538.

100. Masiero E, Agatea L, Mammucari C, et al. Autophagy is required to maintain muscle mass. Cell metabolism. 2009;10(6):507-515.

101. Hussain SN, Mofarrahi M, Sigala I, et al. Mechanical ventilation-induced diaphragm disuse in humans triggers autophagy. Am J Respir Crit Care Med. 2010;182(11):1377-1386.

102. Vanhorebeek I, Gunst J, Derde S, et al. Insufficient activation of autophagy allows cellular damage to accumulate in critically ill patients. The Journal of clinical endocrinology and metabolism. 2011;96(4):E633-645.

103. Dos Santos C, Hussain SN, Mathur S, et al. Mechanisms of Chronic Muscle Wasting and Dysfunction after an Intensive Care Unit Stay. A Pilot Study. Am J Respir Crit Care Med. 2016;194(7):821-830.

104. Wollersheim T, Woehlecke J, Krebs M, et al. Dynamics of myosin degradation in intensive care unit-acquired weakness during severe critical illness. Intensive Care Med. 2014;40(4):528-538.

105. Schiaffino S, Dyar KA, Ciciliot S, Blaauw B, Sandri M. Mechanisms regulating skeletal muscle growth and atrophy. The FEBS journal. 2013;280(17):4294-4314.

106. Stitt TN, Drujan D, Clarke BA, et al. The IGF-1/PI3K/Akt pathway prevents expression of muscle atrophy-induced ubiquitin ligases by inhibiting FOXO transcription factors. Mol Cell. 2004;14(3):395-403.

107. Jackson JR, Mula J, Kirby TJ, et al. Satellite cell depletion does not inhibit adult skeletal muscle regrowth following unloading-induced atrophy. Am J Physiol Cell Physiol. 2012;303(8):C854-861.

108. van de Worp W, Theys J, van Helvoort A, Langen RCJ. Regulation of muscle atrophy by microRNAs: 'AtromiRs' as potential target in cachexia. Current opinion in clinical nutrition and metabolic care. 2018;21(6):423-429.

109. Di Giovanni S, Molon A, Broccolini A, et al. Constitutive activation of MAPK cascade in acute quadriplegic myopathy. Annals of neurology. 2004;55(2):195-206.

110. Langhans C, Weber-Carstens S, Schmidt F, et al. Inflammation-induced acute phase response in skeletal muscle and critical illness myopathy. PloS one. 2014;9(3):e92048.

111. Farhan H, Moreno-Duarte I, Latronico N, Zafonte R, Eikermann M. Acquired Muscle Weakness in the Surgical Intensive Care Unit: Nosology, Epidemiology, Diagnosis, and Prevention. Anesthesiology. 2016;124(1):207-234.

112. Bodine SC. Disuse-induced muscle wasting. The international journal of biochemistry & cell biology. 2013;45(10):2200-2208.

113. Lacomis D, Petrella JT, Giuliani M. Causes of neuromuscular weakness in the intensive care unit: a study of ninety-two patients. Muscle & nerve. 1998;21(5):610-617.

114. Tang H, Inoki K, Lee M, et al. mTORC1 promotes denervation-induced muscle atrophy through a mechanism involving the activation of FoxO and E3 ubiquitin ligases. Science signaling. 2014;7(314):ra18.

115. Winkelman C. Inactivity and inflammation in the critically ill patient. Critical care clinics. 2007;23(1):21-34.

116. Paddon-Jones D, Sheffield-Moore M, Cree MG, et al. Atrophy and impaired muscle protein synthesis during prolonged inactivity and stress. The Journal of clinical endocrinology and metabolism. 2006;91(12):4836-4841.

117. Hamburg NM, McMackin CJ, Huang AL, et al. Physical inactivity rapidly induces insulin resistance and microvascular dysfunction in healthy volunteers. Arteriosclerosis, thrombosis, and vascular biology. 2007;27(12):2650-2656.

118. Powers SK, Smuder AJ, Criswell DS. Mechanistic links between oxidative stress and disuse muscle atrophy. Antioxidants & redox signaling. 2011;15(9):2519-2528.

119. Chambers MA, Moylan JS, Reid MB. Physical inactivity and muscle weakness in the critically ill. Critical care medicine. 2009;37(10 Suppl):S337-346.

120. Warren GL, Summan M, Gao X, Chapman R, Hulderman T, Simeonova PP. Mechanisms of skeletal muscle injury and repair revealed by gene expression studies in mouse models. The Journal of physiology. 2007;582(Pt 2):825-841.

121. Stevens JE, Walter GA, Okereke E, et al. Muscle Adaptations with Immobilization and Rehabilitation after Ankle Fracture. Medicine & Science in Sports & Exercise. 2004;36(10):1695-1701.

122. Mir SA, Chatterjee A, Mitra A, Pathak K, Mahata SK, Sarkar S. Inhibition of signal transducer and activator of transcription 3 (STAT3) attenuates interleukin-6 (IL-6)-induced collagen synthesis and resultant hypertrophy in rat heart. The Journal of biological chemistry. 2012;287(4):2666-2677.

123. Suhr F, Brenig J, Muller R, Behrens H, Bloch W, Grau M. Moderate exercise promotes human RBC-NOS activity, NO production and deformability through Akt kinase pathway. PloS one. 2012;7(9):e45982.

124. Seok J, Warren HS, Cuenca AG, et al. Genomic responses in mouse models poorly mimic human inflammatory diseases. Proceedings of the National Academy of Sciences. 2013;110(9):3507-3512.

125. Rocheteau P, Chatre L, Briand D, et al. Sepsis induces long-term metabolic and mitochondrial muscle stem cell dysfunction amenable by mesenchymal stem cell therapy. Nat Commun. 2015;6:10145.

126. Voy BH. Systems genetics: a powerful approach for gene-environment interactions. J Nutr. 2011;141(3):515-519.

127. Deffur A, Wilkinson RJ, Mayosi BM, Mulder NM. ANIMA: Association network integration for multiscale analysis. Wellcome Open Research. 2018;3.

128. Pogoryelova O, Wilson IJ, Mansbach H, Argov Z, Nishino I, Lochmuller H. GNE genotype explains 20% of phenotypic variability in GNE myopathy. Neurol Genet. 2019;5(1):e308.

129. Parikshak NN, Gandal MJ, Geschwind DH. Systems biology and gene networks in neurodevelopmental and neurodegenerative disorders. Nature reviews Genetics. 2015;16(8):441-458.

130. Gaiteri C, Ding Y, French B, Tseng GC, Sibille E. Beyond modules and hubs: the potential of gene coexpression networks for investigating molecular mechanisms of complex brain disorders. Genes Brain Behav. 2014;13(1):13-24.

131. Rotival M, Petretto E. Leveraging gene co-expression networks to pinpoint the regulation of complex traits and disease, with a focus on cardiovascular traits. Brief Funct Genomics. 2014;13(1):66-78.

132. van Dam S, Vosa U, van der Graaf A, Franke L, de Magalhaes JP. Gene co-expression analysis for functional classification and gene-disease predictions. Briefings in bioinformatics. 2018;19(4):575-592.

133. Sweeney TE, Haynes WA, Vallania F, Ioannidis JP, Khatri P. Methods to increase reproducibility in differential gene expression via meta-analysis. Nucleic acids research. 2017;45(1):e1.

134. Consortium SM-I. A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. Nature biotechnology. 2014;32(9):903-914.

135. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC bioinformatics. 2008;9:559.

136. Tomczak A, Mortensen JM, Winnenburg R, et al. Interpretation of biological experiments changes with evolution of the Gene Ontology and its annotations. Scientific Reports. 2018;8(1).

137. Gillis J, Pavlidis P. The impact of multifunctional genes on "guilt by association" analysis. PloS one. 2011;6(2):e17258.

138. Gillis J, Pavlidis P. "Guilt by association" is the exception rather than the rule in gene networks. PLoS computational biology. 2012;8(3):e1002444.

139. Bettencourt C, Ryten M, Forabosco P, et al. Insights from cerebellar transcriptomic analysis into the pathogenesis of ataxia. JAMA neurology. 2014;71(7):831-839.

140. Subramanian A, Tamayo P, Mootha VK, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences of the United States of America. 2005;102(43):15545-15550.

141. Nunez YO, Truitt JM, Gorini G, et al. Positively correlated miRNA-mRNA regulatory networks in mouse frontal cortex during early stages of alcohol dependence. BMC genomics. 2013;14:725.

142. Horvath S. Weighted Network Analysis: Applications in Genomics and Systems Biology. 2011.

143. Zhu X, Gerstein M, Snyder M. Getting connected: analysis and principles of biological networks. Genes & development. 2007;21(9):1010-1024.

144. Gaiteri C, Sibille E. Differentially expressed genes in major depression reside on the periphery of resilient gene coexpression networks. Front Neurosci. 2011;5:95.

145. Voineagu I, Wang X, Johnston P, et al. Transcriptomic analysis of autistic brain reveals convergent molecular pathology. Nature. 2011;474(7351):380-384.

146. Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabasi AL. Hierarchical organization of modularity in metabolic networks. Science. 2002;297(5586):1551-1555.

147. Langfelder P, Luo R, Oldham MC, Horvath S. Is my network module preserved and reproducible? PLoS computational biology. 2011;7(1):e1001057.

148. Mason MJ, Fan G, Plath K, Zhou Q, Horvath S. Signed weighted gene co-expression network analysis of transcriptional regulation in murine embryonic stem cells. BMC genomics. 2009;10:327.

149. Song L, Langfelder P, Horvath S. Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC bioinformatics. 2012;13:328.

150. Allen JD, Xie Y, Chen M, Girard L, Xiao G. Comparing statistical methods for constructing large scale gene networks. PloS one. 2012;7(1):e29348.

151. Bartel DP. MicroRNAs: target recognition and regulatory functions. Cell. 2009;136(2):215-233.

152. Baek D, Villen J, Shin C, Camargo FD, Gygi SP, Bartel DP. The impact of microRNAs on protein output. Nature. 2008;455(7209):64-71.

153. Guo H, Ingolia NT, Weissman JS, Bartel DP. Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature. 2010;466(7308):835-840.

154. Griffiths-Jones S, Saini HK, van Dongen S, Enright AJ. miRBase: tools for microRNA genomics. Nucleic acids research. 2008;36(Database issue):D154-158.

155. Friedman RC, Farh KK, Burge CB, Bartel DP. Most mammalian mRNAs are conserved targets of microRNAs. Genome Res. 2009;19(1):92-105.

156. Sarkar S, Dey BK, Dutta A. MiR-322/424 and -503 are induced during muscle differentiation and promote cell cycle quiescence and differentiation by down-regulation of Cdc25A. Mol Biol Cell. 2010;21(13):2138-2149.

157. Ma G, Wang Y, Li Y, et al. MiR-206, a key modulator of skeletal muscle development and disease. Int J Biol Sci. 2015;11(3):345-352.

158. Kim HK, Lee YS, Sivaprasad U, Malhotra A, Dutta A. Muscle-specific microRNA miR-206 promotes muscle differentiation. The Journal of cell biology. 2006;174(5):677-687.

159. Zhang X, Zuo X, Yang B, et al. MicroRNA directly enhances mitochondrial translation during muscle differentiation. Cell. 2014;158(3):607-619.

160. Kemp PR, Griffiths M, Polkey MI. Muscle wasting in the presence of disease, why is it so variable? Biol Rev Camb Philos Soc. 2018.

161. Alexander MS, Kunkel LM. Skeletal Muscle MicroRNAs: Their Diagnostic and Therapeutic Potential in Human Muscle Diseases. J Neuromuscul Dis. 2015;2(1):1-11.

162. Li J, Chan MC, Yu Y, et al. miR-29b contributes to multiple types of muscle atrophy. Nat Commun. 2017;8:15201.

163. Connolly M, Paul R, Farre-Garros R, et al. miR-424-5p reduces ribosomal RNA and protein synthesis in muscle wasting. Journal of cachexia, sarcopenia and muscle. 2018;9(2):400-416.

164. Farre Garros R, Paul R, Connolly M, et al. miR-542 Promotes Mitochondrial Dysfunction and SMAD Activity and is Raised in ICU Acquired Weakness. Am J Respir Crit Care Med. 2017.

165. Kozomara A, Griffiths-Jones S. miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic acids research. 2014;42(Database issue):D68-73.

166. Sass S, Pitea A, Unger K, Hess J, Mueller NS, Theis FJ. MicroRNA-Target Network Inference and Local Network Enrichment Analysis Identify Two microRNA Clusters with Distinct Functions in Head and Neck Squamous Cell Carcinoma. International journal of molecular sciences. 2015;16(12):30204-30222.

167. Muniategui A, Pey J, Planes FJ, Rubio A. Joint analysis of miRNA and mRNA expression data. Briefings in bioinformatics. 2013;14(3):263-278.

168. Meyer SU, Sass S, Mueller NS, et al. Integrative Analysis of MicroRNA and mRNA Data Reveals an Orchestrated Function of MicroRNAs in Skeletal Myocyte Differentiation in Response to TNF-alpha or IGF1. PloS one. 2015;10(8):e0135284.

169. Wang X. Improving microRNA target prediction by modeling with unambiguously identified microRNA-target pairs from CLIP-ligation studies. Bioinformatics. 2016;32(9):1316-1322.

170. Steinkraus BR, Toegel M, Fulga TA. Tiny giants of gene regulation: experimental strategies for microRNA functional studies. Wiley Interdiscip Rev Dev Biol. 2016.

171. Chi SW, Zang JB, Mele A, Darnell RB. Argonaute HITS-CLIP decodes microRNA-mRNA interaction maps. Nature. 2009;460(7254):479-486.

172. Helwak A, Tollervey D. Mapping the miRNA interactome by cross-linking ligation and sequencing of hybrids (CLASH). Nature protocols. 2014;9(3):711-728.

173. Hausser J, Zavolan M. Identification and consequences of miRNA-target interactions--beyond repression of gene expression. Nature reviews Genetics. 2014;15(9):599-612.

174. Dweep H, Gretz N. miRWalk2.0: a comprehensive atlas of microRNA-target interactions. Nat Methods. 2015;12(8):697.

175. Xu T, Su N, Liu L, et al. miRBaseConverter: an R/Bioconductor package for converting and retrieving miRNA name, accession, sequence and family information in different versions of miRBase. BMC bioinformatics. 2018;19(Suppl 19):514.

176. Tokar T, Pastrello C, Rossos AEM, et al. mirDIP 4.1-integrative database of human microRNA target predictions. Nucleic acids research. 2018;46(D1):D360-D370.

177. Agarwal V, Bell GW, Nam JW, Bartel DP. Predicting effective microRNA target sites in mammalian mRNAs. Elife. 2015;4.

178. Nielsen CB, Shomron N, Sandberg R, Hornstein E, Kitzman J, Burge CB. Determinants of targeting by endogenous and exogenous microRNAs and siRNAs. RNA. 2007;13(11):1894-1910.

179. Xu J, Zhang R, Shen Y, Liu G, Lu X, Wu CI. The evolution of evolvability in microRNA target sites in vertebrates. Genome Res. 2013;23(11):1810-1816.

180. Ellwanger DC, Buttner FA, Mewes HW, Stumpflen V. The sufficient minimal set of miRNA seed types. Bioinformatics. 2011;27(10):1346-1350.

181. John B, Enright AJ, Aravin A, Tuschl T, Sander C, Marks DS. Human MicroRNA targets. PLoS Biol. 2004;2(11):e363.

182. Betel D, Koppal A, Agius P, Sander C, Leslie C. Comprehensive modeling of microRNA targets predicts functional non-conserved and non-canonical sites. Genome Biol. 2010;11(8):R90.

183. Didiano D, Hobert O. Perfect seed pairing is not a generally reliable predictor for miRNA-target interactions. Nat Struct Mol Biol. 2006;13(9):849-851.

184. Chi SW, Hannon GJ, Darnell RB. An alternative mode of microRNA target recognition. Nat Struct Mol Biol. 2012;19(3):321-327.

185. Wang X. Composition of seed sequence is a major determinant of microRNA targeting patterns. Bioinformatics. 2014;30(10):1377-1383.

186. Kertesz M, Iovino N, Unnerstall U, Gaul U, Segal E. The role of site accessibility in microRNA target recognition. Nature genetics. 2007;39(10):1278-1284.

187. Long D, Lee R, Williams P, Chan CY, Ambros V, Ding Y. Potent effect of target structure on microRNA function. Nat Struct Mol Biol. 2007;14(4):287-294.

188. Rennie W, Liu C, Carmack CS, et al. STarMir: a web server for prediction of microRNA binding sites. Nucleic acids research. 2014;42(Web Server issue):W114-118.

189. Reczko M, Maragkakis M, Alexiou P, Grosse I, Hatzigeorgiou AG. Functional microRNA targets in protein coding sequences. Bioinformatics. 2012;28(6):771-776.

190. Dweep H, Sticht C, Gretz N. In-Silico Algorithms for the Screening of Possible microRNA Binding Sites and Their Interactions. Curr Genomics. 2013;14(2):127-136.

191. Cantini L, Caselle M, Forget A, Zinovyev A, Barillot E, Martignetti L. A review of computational approaches detecting microRNAs involved in cancer. Frontiers in bioscience (Landmark edition). 2017;22:1774-1791.

192. Fu J, Tang W, Du P, et al. Identifying microRNA-mRNA regulatory network in colorectal cancer by a combination of expression profile and bioinformatics analysis. BMC Syst Biol. 2012;6:68.

193. Peng X, Li Y, Walters KA, et al. Computational identification of hepatitis C virus associated microRNA-mRNA regulatory modules in human livers. BMC genomics. 2009;10:373.

194. Zhang W, Edwards A, Fan W, Flemington EK, Zhang K. miRNA-mRNA correlation-network modules in human prostate cancer and the differences between primary and metastatic tumor subtypes. PloS one. 2012;7(6):e40130.

195. Huang JC, Morris QD, Frey BJ. Bayesian inference of MicroRNA targets from sequence and expression data. Journal of computational biology : a journal of computational molecular cell biology. 2007;14(5):550-563.

196. Chen X, Slack FJ, Zhao H. Joint analysis of expression profiles from multiple cancers improves the identification of microRNA-gene interactions. Bioinformatics. 2013;29(17):2137-2145.

197. Efron B. Microarrays, Empirical Bayes and the Two-Groups Model. Statistical Science. 2008;23(1):1-22.

198. Vasudevan S, Tong Y, Steitz JA. Switching from repression to activation: microRNAs can up-regulate translation. Science. 2007;318(5858):1931-1934.

199. Bossel Ben-Moshe N, Avraham R, Kedmi M, et al. Context-specific microRNA analysis: identification of functional microRNAs and their mRNA targets. Nucleic acids research. 2012;40(21):10614-10627.

200. Martignetti L, Tesson B, Almeida A, et al. Detection of miRNA regulatory effect on triple negative breast cancer transcriptome. BMC genomics. 2015;16:S4.

201. Cantini L, Isella C, Petti C, et al. MicroRNA–mRNA interactions underlying colorectal cancer molecular subtypes. Nature communications. 2015;6:8878.

202. Wu M, Chan C. Learning transcriptional regulation on a genome scale: a theoretical analysis based on gene expression data. Briefings in bioinformatics. 2012;13(2):150-161.

203. Wehrspaun CC, Haerty W, Ponting CP. Microglia recapitulate a hematopoietic master regulator network in the aging human frontal cortex. Neurobiol Aging. 2015;36(8):2443.e9-2443.e20.

204. Aubry S, Shin W, Crary JF, et al. Assembly and interrogation of Alzheimer's disease genetic networks reveal novel regulators of progression. PloS one. 2015;10(3):e0120352.

205. Lachmann A, Giorgi FM, Lopez G, Califano A. ARACNe-AP: gene network reverse engineering through adaptive partitioning inference of mutual information. Bioinformatics. 2016;32(14):2233-2235.

206. Margolin AA, Nemenman I, Basso K, et al. ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC bioinformatics. 2006;7 Suppl 1:S7.

207. Pepe MS, Feng Z. Improving biomarker identification with better designs and reporting. Clinical chemistry. 2011;57(8):1093-1095.

208. Rudy J, Valafar F. Empirical comparison of cross-platform normalization methods for gene expression data. BMC bioinformatics. 2011;12:467.

209. Gene Expression Omnibus. http://www.ncbi.nlm.nih.gov/geo/. Accessed.

210. Liu CG, Calin GA, Volinia S, Croce CM. MicroRNA expression profiling using microarrays. Nature protocols. 2008;3(4):563-578.

211. Hall DA, Ptacek J, Snyder M. Protein microarray technology. Mech Ageing Dev. 2007;128(1):161-167.

212. Chang C, Wang J, Zhao C, et al. Maximizing biomarker discovery by minimizing gene signatures. BMC genomics. 2011;12 Suppl 5:S6.

213. Park S, Zhang Y, Lin S, Wang TH, Yang S. Advances in microfluidic PCR for point-of-care infectious disease diagnostics. Biotechnology advances. 2011;29(6):830-839.

214. Director's Challenge Consortium for the Molecular Classification of Lung Adenocarcinoma, Shedden K, Taylor JM, et al. Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study. Nature medicine. 2008;14(8):822-827.

215. van Laar R, Flinchum R, Brown N, et al. Translating a gene expression signature for multiple myeloma prognosis into a robust high-throughput assay for clinical use. BMC medical genomics. 2014;7:25.

216. Shen R, Chinnaiyan AM, Ghosh D. Pathway analysis reveals functional convergence of gene expression profiles in breast cancer. BMC medical genomics. 2008;1:28.

217. Shi L, Campbell G, Jones WD, et al. The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models. Nature biotechnology. 2010;28(8):827-838.

218. Simon R. Genomic biomarkers in predictive medicine: an interim analysis. EMBO Mol Med. 2011;3(8):429-435.

219. Diamandis EP. Cancer biomarkers: can we turn recent failures into success? Journal of the National Cancer Institute. 2010;102(19):1462-1467.

220. Baker SG. Improving the biomarker pipeline to develop and evaluate cancer screening tests. Journal of the National Cancer Institute. 2009;101(16):1116-1119.

221. Cruz J, Wishart D. Applications of Machine Learning in Cancer Prediction and Prognosis. Cancer Inform. 2006;2:59-77.

222. Hamid JS, Hu P, Roslin NM, Ling V, Greenwood CM, Beyene J. Data integration in genetics and genomics: methods and challenges. Human genomics and proteomics : HGP. 2009;2009.

223. Pawitan Y, Michiels S, Koscielny S, Gusnanto A, Ploner A. False discovery rate, sensitivity and sample size for microarray studies. Bioinformatics. 2005;21(13):3017-3024.

224. Taminau J, Lazar C, Meganck S, Nowe A. Comparison of merging and meta-analysis as alternative approaches for integrative gene expression analysis. ISRN bioinformatics. 2014;2014:345106.

225. Ramasamy A, Mondry A, Holmes CC, Altman DG. Key issues in conducting a meta-analysis of gene expression microarray datasets. PLoS medicine. 2008;5(9):e184.

226. Hu P, Greenwood CM, Beyene J. Integrative analysis of multiple gene expression profiles with quality-adjusted effect size models. BMC bioinformatics. 2005;6:128.

227. Shabalin AA, Tjelmeland H, Fan C, Perou CM, Nobel AB. Merging two gene-expression studies via cross-platform normalization. Bioinformatics. 2008;24(9):1154-1160.

228. Tseng GC, Ghosh D, Feingold E. Comprehensive literature review and statistical considerations for microarray meta-analysis. Nucleic acids research. 2012;40(9):3785-3799.

229. Hu P, Wang X, Haitsma JJ, et al. Microarray meta-analysis identifies acute lung injury biomarkers in donor lungs that predict development of primary graft failure in recipients. PloS one. 2012;7(10):e45506.

230. Jaksik R, Iwanaszko M, Rzeszowska-Wolny J, Kimmel M. Microarray experiments and factors which affect their reliability. Biol Direct. 2015;10:46.

231. Wu Z, Irizarry RA, Gentleman R, Martinez-Murillo F, Spencer F. A Model-Based Background Adjustment for Oligonucleotide Expression Arrays. Journal of the American Statistical Association. 2004;99(468):909-917.

232. Robinson MD, Speed TP. A comparison of Affymetrix gene expression arrays. BMC bioinformatics. 2007;8:449.

233. Barnes M, Freudenberg J, Thompson S, Aronow B, Pavlidis P. Experimental comparison and cross-validation of the Affymetrix and Illumina gene expression analysis platforms. Nucleic acids research. 2005;33(18):5914-5923.

234. Stafford P, Brun M. Three methods for optimization of cross-laboratory and cross-platform microarray expression data. Nucleic acids research. 2007;35(10):e72.

235. Maouche S, Poirier O, Godefroy T, et al. Performance comparison of two microarray platforms to assess differential gene expression in human monocyte and macrophage cells. BMC genomics. 2008;9:302.

236. Schurmann C, Heim K, Schillert A, et al. Analyzing illumina gene expression microarray data from different tissues: methodological aspects of data analysis in the metaxpress consortium. PloS one. 2012;7(12):e50938.

237. MAQC Consortium, Shi L, Reid LH, et al. The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nature biotechnology. 2006;24(9):1151-1161.

238. Barbosa-Morais NL, Dunning MJ, Samarajiwa SA, et al. A re-annotation pipeline for Illumina BeadArrays: improving the interpretation of gene expression data. Nucleic acids research. 2010;38(3):e17.

239. Xia J, Gill EE, Hancock RE. NetworkAnalyst for statistical, visual and network-based meta-analysis of gene expression data. Nature protocols. 2015;10(6):823-844.

240. Kitchen RR, Sabine VS, Simen AA, Dixon JM, Bartlett JM, Sims AH. Relative impact of key sources of systematic noise in Affymetrix and Illumina gene-expression microarray experiments. BMC genomics. 2011;12:589.

241. Turnbull AK, Kitchen RR, Larionov AA, Renshaw L, Dixon JM, Sims AH. Direct integration of intensity-level data from Affymetrix and Illumina microarrays improves statistical power for robust reanalysis. BMC medical genomics. 2012;5:35.

242. Heider A, Alt R. virtualArray: a R/bioconductor package to merge raw data from different microarray platforms. BMC bioinformatics. 2013;14:75.

243. Gentleman RC, Carey VJ, Bates DM, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004;5(10):R80.

244. Dai M, Wang P, Boyd AD, et al. Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data. Nucleic acids research. 2005;33(20):e175.

245. Sabine VS, Sims AH, Macaskill EJ, et al. Gene expression profiling of response to mTOR inhibitor everolimus in pre-operatively treated post-menopausal women with oestrogen receptor-positive breast cancer. Breast cancer research and treatment. 2010;122(2):419-428.

246. Hu P, Beyene J, Greenwood CM. Tests for differential gene expression using weights in oligonucleotide microarray experiments. BMC genomics. 2006;7:33.

247. Wang X, Lin Y, Song C, Sibille E, Tseng GC. Detecting disease-associated genes with confounding variable adjustment and the impact on genomic meta-analysis: with application to major depressive disorder. BMC bioinformatics. 2012;13:52.

248. Miller JA, Cai C, Langfelder P, et al. Strategies for aggregating gene expression data: the collapseRows R function. BMC bioinformatics. 2011;12:322.

249. Khatri P, Roedder S, Kimura N, et al. A common rejection module (CRM) for acute rejection across multiple organs identifies novel therapeutics for organ transplantation. J Exp Med. 2013;210(11):2205-2221.

250. Cambon AC, Khalyfa A, Cooper NG, Thompson CM. Analysis of probe level patterns in Affymetrix microarray data. BMC bioinformatics. 2007;8:146.

251. Wilson CL, Miller CJ. Simpleaffy: a BioConductor package for Affymetrix Quality Control and data analysis. Bioinformatics. 2005;21(18):3683-3685.

252. Kang DD, Sibille E, Kaminski N, Tseng GC. MetaQC: objective quality control and inclusion/exclusion criteria for genomic meta-analysis. Nucleic acids research. 2012;40(2):e15.

253. Wang X, Kang DD, Shen K, et al. An R package suite for microarray meta-analysis in quality control, differentially expressed gene analysis and pathway enrichment detection. Bioinformatics. 2012;28(19):2534-2536.

254. Rhodes DR, Yu J, Shanker K, et al. Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(25):9309-9314.

255. Choi JK, Yu U, Kim S, Yoo OJ. Combining multiple microarray studies and modeling interstudy variation. Bioinformatics. 2003;19(Suppl 1):i84-i90.

256. Chang LC, Lin HM, Sibille E, Tseng GC. Meta-analysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline. BMC bioinformatics. 2013;14:368.

257. Hong F, Breitling R, McEntee CW, Wittner BS, Nemhauser JL, Chory J. RankProd: a bioconductor package for detecting differentially expressed genes in meta-analysis. Bioinformatics. 2006;22(22):2825-2827.

258. Waldron L, Riester M. Meta-Analysis in Gene Expression Studies. Methods Mol Biol. 2016;1418:161-176.

259. Ma S, Huang J. Regularized gene selection in cancer microarray meta-analysis. BMC bioinformatics. 2009;10:1.

260. Sims AH, Smethurst GJ, Hey Y, et al. The removal of multiplicative, systematic bias allows integration of breast cancer gene expression datasets - improving meta-analysis and prediction of prognosis. BMC medical genomics. 2008;1:42.

261. Campain A, Yang YH. Comparison study of microarray meta-analysis methods. BMC bioinformatics. 2010;11:408.

262. Xu L, Tan AC, Winslow RL, Geman D. Merging microarray data from separate breast cancer studies provides a robust prognostic test. BMC bioinformatics. 2008;9:125.

263. Liu CC, Hu J, Kalakrishnan M, Huang H, Zhou XJ. Integrative disease classification based on cross-platform microarray data. BMC bioinformatics. 2009;10 Suppl 1:S25.

264. Lee Y, Scheck AC, Cloughesy TF, et al. Gene expression analysis of glioblastomas identifies the major molecular basis for the prognostic benefit of younger age. BMC medical genomics. 2008;1:52.

265. Fielden MR, Nie A, McMillian M, et al. Interlaboratory evaluation of genomic signatures for predicting carcinogenicity in the rat. Toxicological sciences : an official journal of the Society of Toxicology. 2008;103(1):28-34.

266. Lu Y, Lemon W, Liu PY, et al. A gene expression signature predicts survival of patients with stage I non-small cell lung cancer. PLoS medicine. 2006;3(12):e467.

267. Hong F, Breitling R. A comparison of meta-analysis methods for detecting differentially expressed genes in microarray experiments. Bioinformatics. 2008;24(3):374-382.

268. Lu S, Li J, Song C, Shen K, Tseng GC. Biomarker detection in the integration of multiple multi-class genomic studies. Bioinformatics. 2010;26(3):333-340.

269. Haynes WA, Vallania F, Liu C, et al. Empowering Multi-Cohort Gene Expression Analysis to Increase Reproducibility. Pac Symp Biocomput. 2017;22:144-153.

270. Li MD, Burns TC, Morgan AA, Khatri P. Integrated multi-cohort transcriptional meta-analysis of neurodegenerative diseases. Acta Neuropathol Commun. 2014;2:93.

271. Shen R, Ghosh D, Chinnaiyan AM. Prognostic meta-signature of breast cancer developed by two-stage mixture modeling of microarray data. BMC genomics. 2004;5(1):94.

272. Huang H, Lu X, Liu Y, Haaland P, Marron JS. R/DWD: distance-weighted discrimination for classification, visualization and batch adjustment. Bioinformatics. 2012;28(8):1182-1183.

273. Deshwar AG, Morris Q. PLIDA: cross-platform gene expression normalization using perturbed topic models. Bioinformatics. 2014;30(7):956-961.

274. Hughey JJ, Butte AJ. Robust meta-analysis of gene expression using the elastic net. Nucleic acids research. 2015.

275. Batt J, Hussain S, Mathur S, et al. MEND ICU - Muscle Injury and Repair in Critical Illness Survivors Mechanically Ventilated for Over 7 Days. Am J Respir Crit Care Med. 2015;191:A2288.

276. Herridge MS, Chu LM, Matte AL, et al. The RECOVER Program: One-Year Disability in Critically Ill Patients Mechanically Ventilated (MV) for 7 Days. Am J Respir Crit Care Med. 2015;191:A5123.

277. dos Santos CC, Walsh C, Herridge MS, Mathur S, Bader G, Batt J. Co-Expression Network Analysis Identifies Molecular Pathways Related to Persistent Impairment of Muscle Strength in Survivors of Critical Illness (Mend-ICU Study Group). Am J Respir Crit Care Med. 2015;191:A2287.

278. Marquis K, Debigare R, Lacasse Y, et al. Midthigh muscle cross-sectional area is a better predictor of mortality than body mass index in patients with chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 2002;166(6):809-813.

279. Maughan RJ, Watson JS, Weir J. Strength and cross-sectional area of human skeletal muscle. J Physiol. 1983;338:37-49.

280. Mathur S, Makrides L, Hernandez P. Test-retest reliability of isometric and isokinetic torque in patients with chronic obstructive pulmonary disease. Physiotherapy Canada. 2004;56:94-101.

281. Gosselink R, Troosters T, Decramer M. Peripheral muscle weakness contributes to exercise limitation in COPD. Am J Respir Crit Care Med. 1996;153(3):976-980.

282. Bourgeois JM, Tarnopolsky MA. Pathology of skeletal muscle in mitochondrial disorders. Mitochondrion. 2004;4(5-6):441-452.

283. dos Santos CC, Murthy S, Hu P, et al. Network analysis of transcriptional responses induced by mesenchymal stem cell treatment of experimental sepsis. The American journal of pathology. 2012;181(5):1681-1692.

284. Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic acids research. 2015;43(7):e47.

285. Plaisier CL, Horvath S, Huertas-Vazquez A, et al. A systems genetics approach implicates USF1, FADS3, and other causal candidate genes for familial combined hyperlipidemia. PLoS genetics. 2009;5(9):e1000642.

286. Reimand J, Arak T, Vilo J. g:Profiler--a web server for functional interpretation of gene lists (2011 update). Nucleic acids research. 2011;39(Web Server issue):W307-315.

287. Kwon AT, Arenillas DJ, Worsley Hunt R, Wasserman WW. oPOSSUM-3: advanced analysis of regulatory motif over-representation across genes or ChIP-Seq datasets. G3. 2012;2(9):987-1002.

288. Banduseela VC, Chen YW, Kultima HG, et al. Impaired autophagy, chaperone expression, and protein synthesis in response to critical illness interventions in porcine skeletal muscle. Physiological genomics. 2013;45(12):477-486.

289. Tsai S, Cassady JP, Freking BA, Nonneman DJ, Rohrer GA, Piedrahita JA. Annotation of the Affymetrix porcine genome microarray. Animal genetics. 2006;37(4):423-424.

290. Snyder CM, Rice AL, Estrella NL, Held A, Kandarian SC, Naya FJ. MEF2A regulates the Gtl2-Dio3 microRNA mega-cluster to modulate WNT signaling in skeletal muscle regeneration. Development. 2013;140(1):31-42.

291. Estrella NL, Desjardins CA, Nocco SE, Clark AL, Maksimenko Y, Naya FJ. MEF2 Transcription Factors Regulate Distinct Gene Programs in Mammalian Skeletal Muscle Differentiation. The Journal of biological chemistry. 2015;290(2):1256-1268.

292. Tintignac LA, Brenner HR, Ruegg MA. Mechanisms Regulating Neuromuscular Junction Development and Function and Causes of Muscle Wasting. Physiol Rev. 2015;95(3):809-852.

293. Qiu H, Wang F, Liu C, Xu X, Liu B. TEAD1-dependent expression of the FoxO3a gene in mouse skeletal muscle. BMC molecular biology. 2011;12:1.

294. Wang F, Wang H, Wu H, et al. TEAD1 controls C2C12 cell proliferation and differentiation and regulates three novel target genes. Cellular signalling. 2013;25(3):674-681.

295. Yang Y, Han L, Yuan Y, Li J, Hei N, Liang H. Gene co-expression network analysis reveals common system-level properties of prognostic genes across cancer types. Nat Commun. 2014;5:3231.

296. Liu N, Nelson BR, Bezprozvannaya S, et al. Requirement of MEF2A, C, and D for skeletal muscle regeneration. Proceedings of the National Academy of Sciences of the United States of America. 2014;111(11):4109-4114.

297. Judson RN, Tremblay AM, Knopp P, et al. The Hippo pathway member Yap plays a key role in influencing fate decisions in muscle satellite cells. Journal of cell science. 2012;125(Pt 24):6009-6019.

298. Wackerhage H, Del Re DP, Judson RN, Sudol M, Sadoshima J. The Hippo signal transduction network in skeletal and cardiac muscle. Science signaling. 2014;7(337):re4.

299. Umansky KB, Gruenbaum-Cohen Y, Tsoory M, et al. Runx1 Transcription Factor Is Required for Myoblasts Proliferation during Muscle Regeneration. PLoS genetics. 2015;11(8):e1005457.

300. Demonbreun AR, Lapidos KA, Heretis K, et al. Myoferlin regulation by NFAT in muscle injury, regeneration and repair. Journal of cell science. 2010;123(Pt 14):2413-2422.

301. Horsley V, Jansen KM, Mills ST, Pavlath GK. IL-4 Acts as a Myoblast Recruitment Factor during Mammalian Muscle Growth. Cell. 2003;113(4):483-494.

302. Dos Santos CC, Batt J. ICU-acquired weakness: mechanisms of disability. Current opinion in critical care. 2012;18(5):509-517.

303. Herridge MS, Chu LM, Matte A, et al. The RECOVER Program: Disability Risk Groups & One Year Outcome after ≥ 7 Days of Mechanical Ventilation. Am J Respir Crit Care Med. 2016.

304. Walsh CJ, Batt J, Herridge MS, et al. Transcriptomic analysis reveals abnormal muscle repair and remodeling in survivors of critical illness with sustained weakness. Sci Rep. 2016;6:29334.

305. Vasudevan S. Posttranscriptional upregulation by microRNAs. Wiley interdisciplinary reviews RNA. 2012;3(3):311-330.

306. Liu N, Williams AH, Maxeiner JM, et al. microRNA-206 promotes skeletal muscle regeneration and delays progression of Duchenne muscular dystrophy in mice. The Journal of clinical investigation. 2012;122(6):2054-2065.

307. Nakasa T, Ishikawa M, Shi M, Shibuya H, Adachi N, Ochi M. Acceleration of muscle regeneration by local injection of muscle-specific microRNAs in rat skeletal muscle injury model. J Cell Mol Med. 2010;14(10):2495-2505.

308. Eisenberg I, Eran A, Nishino I, et al. Distinctive patterns of microRNA expression in primary muscular disorders. Proceedings of the National Academy of Sciences. 2007;104(43):17016-17021.

309. Cantini L, Isella C, Petti C, et al. MicroRNA-mRNA interactions underlying colorectal cancer molecular subtypes. Nat Commun. 2015;6:8878.

310. Brock GN, Mukhopadhyay P, Pihur V, Webb C, Greene RM, Pisano MM. MmPalateMiRNA, an R package compendium illustrating analysis of miRNA microarray data. Source Code Biol Med. 2013;8(1):1.

311. Witwer KW, Halushka MK. Toward the promise of microRNAs - Enhancing reproducibility and rigor in microRNA research. RNA biology. 2016;13(11):1103-1116.

312. Shirdel EA, Xie W, Mak TW, Jurisica I. NAViGaTing the micronome--using multiple microRNA prediction databases to identify signalling pathway-associated microRNAs. PloS one. 2011;6(2):e17429.

313. Vlachos IS, Paraskevopoulou MD, Karagkouni D, et al. DIANA-TarBase v7.0: indexing more than half a million experimentally supported miRNA:mRNA interactions. Nucleic acids research. 2015;43(Database issue):D153-159.

314. Margolin AA, Wang K, Lim WK, Kustagi M, Nemenman I, Califano A. Reverse engineering cellular networks. Nature protocols. 2006;1(2):662-671.

315. Carro MS, Lim WK, Alvarez MJ, et al. The transcriptional network for mesenchymal transformation of brain tumours. Nature. 2010;463(7279):318-325.

316. Zhang Z. Variable selection with stepwise and best subset approaches. Annals of translational medicine. 2016;4(7).

317. Vandesompele J, De Preter K, Pattyn F, et al. Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome biology. 2002;3(7):research0034. 0031.

318. Volodin A, Kosti I, Goldberg AL, Cohen S. Myofibril breakdown during atrophy is a delayed response requiring the transcription factor PAX4 and desmin depolymerization. Proceedings of the National Academy of Sciences. 2017;114(8):E1375-E1384.

319. Delgado-Olguin P, Brand-Arzamendi K, Scott IC, et al. CTCF promotes muscle differentiation by modulating the activity of myogenic regulatory factors. The Journal of biological chemistry. 2011;286(14):12483-12494.

320. Katase N, Terada K, Suzuki T, Nishimatsu S, Nohno T. miR-487b, miR-3963 and miR-6412 delay myogenic differentiation in mouse myoblast-derived C2C12 cells. BMC Cell Biol. 2015;16:13.

321. Lugg ST, Howells PA, Thickett DR. The increasing need for biomarkers in intensive care unit-acquired weakness--are microRNAs the solution? Critical care. 2015;19:189.

322. Walsh CJ, Hu P, Batt J, Dos Santos CC. Discovering MicroRNA-Regulatory Modules in Multi-Dimensional Cancer Genomic Data: A Survey of Computational Methods. Cancer Inform. 2016;15(Suppl 2):25-42.

323. Shalgi R, Lieber D, Oren M, Pilpel Y. Global and local architecture of the mammalian microRNA-transcription factor regulatory network. PLoS computational biology. 2007;3(7):e131.

324. Yuasa K, Hagiwara Y, Ando M, Nakamura A, Takeda Si, Hijikata T. MicroRNA-206 is highly expressed in newly formed muscle fibers: implications regarding potential for muscle regeneration and maturation in muscular dystrophy. Cell structure and function. 2008;33(2):163-169.

325. Wang XH, Hu Z, Klein JD, Zhang L, Fang F, Mitch WE. Decreased miR-29 suppresses myogenesis in CKD. Journal of the American Society of Nephrology. 2011;22(11):2068-2076.

326. Wang L, Zhou L, Jiang P, et al. Loss of miR-29 in myoblasts contributes to dystrophic muscle pathogenesis. Molecular Therapy. 2012;20(6):1222-1233.

327. Jiroutkova K, Krajcova A, Ziak J, et al. Mitochondrial function in skeletal muscle of patients with protracted critical illness and ICU-acquired weakness. Critical care. 2015;19:448.

328. Bonaldo P, Sandri M. Cellular and molecular mechanisms of muscle atrophy. Dis Model Mech. 2013;6(1):25-39.

329. Cohen S, Nathan JA, Goldberg AL. Muscle wasting in disease: molecular mechanisms and promising therapies. Nat Rev Drug Discov. 2015;14(1):58-74.

330. Walsh CJ, Hu P, Batt J, Santos CC. Microarray Meta-Analysis and Cross-Platform Normalization: Integrative Genomics for Robust Biomarker Discovery. Microarrays (Basel). 2015;4(3):389-406.

331. Mukund K, Subramaniam S. Co-expression Network Approach Reveals Functional Similarities among Diseases Affecting Human Skeletal Muscle. Front Physiol. 2017;8:980.

332. Wang JZ, Du Z, Payattakool R, Yu PS, Chen CF. A new method to measure the semantic similarity of GO terms. Bioinformatics. 2007;23(10):1274-1281.

333. Merico D, Isserlin R, Stueker O, Emili A, Bader GD. Enrichment map: a network-based method for gene-set enrichment visualization and interpretation. PloS one. 2010;5(11):e13984.

334. Su J, Ekman C, Oskolkov N, et al. A novel atlas of gene expression in human skeletal muscle reveals molecular changes associated with aging. Skeletal muscle. 2015;5:35.

335. Lachmann A, Torre D, Keenan AB, et al. Massive mining of publicly available RNA-seq data from human and mouse. Nat Commun. 2018;9(1):1366.

336. Kuleshov MV, Jones MR, Rouillard AD, et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic acids research. 2016;44(W1):W90-97.

337. Zhu L, Malatras A, Thorley M, et al. CellWhere: graphical display of interaction networks organized on subcellular localizations. Nucleic acids research. 2015;43(W1):W571-575.

338. Gallagher IJ, Stephens NA, MacDonald AJ, et al. Suppression of skeletal muscle turnover in cancer cachexia: evidence from the transcriptome in sequential human muscle biopsies. Clin Cancer Res. 2012;18(10):2817-2827.

339. Dadgar S, Wang Z, Johnston H, et al. Asynchronous remodeling is a driver of failed regeneration in Duchenne muscular dystrophy. The Journal of cell biology. 2014;207(1):139-158.

340. Abadi A, Glover EI, Isfort RJ, et al. Limb immobilization induces a coordinate down-regulation of mitochondrial and other metabolic pathways in men and women. PloS one. 2009;4(8):e6518.

341. Taivassalo T, Hussain SN. Contribution of the Mitochondria to Locomotor Muscle Dysfunction in Patients With COPD. Chest. 2016;149(5):1302-1312.

342. Temiz P, Weihl CC, Pestronk A. Inflammatory myopathies with mitochondrial pathology and protein aggregates. Journal of the neurological sciences. 2009;278(1-2):25-29.

343. Dumont N, Bouchard P, Frenette J. Neutrophil-induced skeletal muscle damage: a calculated and controlled response following hindlimb unloading and reloading. American journal of physiology Regulatory, integrative and comparative physiology. 2008;295(6):R1831-1838.

344. Madaro L, Bouche M. From innate to adaptive immune response in muscular dystrophies and skeletal muscle regeneration: the role of lymphocytes. BioMed research international. 2014;2014:438675.

345. Gillies AR, Chapman MA, Bushong EA, Deerinck TJ, Ellisman MH, Lieber RL. High resolution three-dimensional reconstruction of fibrotic skeletal muscle extracellular matrix. The Journal of physiology. 2017;595(4):1159-1171.

346. Mann CJ, Perdiguero E, Kharraz Y, et al. Aberrant repair and fibrosis development in skeletal muscle. Skeletal muscle. 2011;1(1):21.

347. Gladman JT, Yadava RS, Mandal M, Yu Q, Kim YK, Mahadevan MS. NKX2-5, a modifier of skeletal muscle pathology due to RNA toxicity. Human molecular genetics. 2015;24(1):251-264.

348. Riazi AM, Lee H, Hsu C, Van Arsdell G. CSX/Nkx2.5 modulates differentiation of skeletal myoblasts and promotes differentiation into neuronal cells in vitro. The Journal of biological chemistry. 2005;280(11):10716-10720.

349. Coletti D, Daou N, Hassani M, Li Z, Parlakian A. Serum Response Factor in muscle tissues: from development to ageing. European journal of translational myology. 2016;26(2).

350. Khurana A, Dey CS. Involvement of Elk-1 in L6E9 skeletal muscle differentiation. FEBS letters. 2002;527(1-3):119-124.

351. Addison O, Marcus RL, Lastayo PC, Ryan AS. Intermuscular fat: a review of the consequences and causes. Int J Endocrinol. 2014;2014:309570.

352. Murphy C, Withrow J, Hunter M, et al. Emerging role of extracellular vesicles in musculoskeletal diseases. Mol Aspects Med. 2018;60:123-128.

353. Monici MC, Aguennouz M, Mazzeo A, Messina C, Vita G. Activation of nuclear factor-κB in inflammatory myopathies and Duchenne muscular dystrophy. Neurology. 2003;60(6):993-997.

354. Yang C-C, Askanas V, Engel WK, Alvarez RB. Immunolocalization of transcription factor NF-κB in inclusion-body myositis muscle and at normal human neuromuscular junctions. Neuroscience letters. 1998;254(2):77-80.

355. Guttridge DC. NF-kappa B-Induced Loss of MyoD Messenger RNA: Possible Role in Muscle Decay and Cachexia. Science. 2000;289(5488):2363-2366.

356. Mourkioti F, Rosenthal N. NF-kappaB signaling in skeletal muscle: prospects for intervention in muscle diseases. J Mol Med (Berl). 2008;86(7):747-759.

357. Bakkar N, Guttridge DC. NF-kappaB signaling: a tale of two pathways in skeletal myogenesis. Physiol Rev. 2010;90(2):495-511.

358. Sanoudou D, Haslett JN, Kho AT, et al. Expression profiling reveals altered satellite cell numbers and glycolytic enzyme transcription in nemaline myopathy muscle. Proceedings of the National Academy of Sciences of the United States of America. 2003;100(8):4666-4671.

359. Rusconi F, Mancinelli E, Colombo G, et al. Proteome profile in Myotonic Dystrophy type 2 myotubes reveals dysfunction in protein processing and mitochondrial pathways. Neurobiol Dis. 2010;38(2):273-280.

360. Saenz A, Azpitarte M, Armananzas R, et al. Gene expression profiling in limb-girdle muscular dystrophy 2A. PloS one. 2008;3(11):e3750.

361. Munters LA, Loell I, Ossipova E, et al. Endurance Exercise Improves Molecular Pathways of Aerobic Metabolism in Patients With Myositis. Arthritis Rheumatol. 2016;68(7):1738-1750.

362. Palermo AT, Palmer RE, So KS, et al. Transcriptional response to GAA deficiency (Pompe disease) in infantile-onset patients. Mol Genet Metab. 2012;106(3):287-300.

363. Walston JD. Sarcopenia in older adults. Curr Opin Rheumatol. 2012;24(6):623-627.

364. Probst-Cousin S, Berghoff C, Neundorfer B, Heuss D. Annexin expression in inflammatory myopathies. Muscle & nerve. 2004;30(1):102-110.

365. Cagliani R, Magri F, Toscano A, et al. Mutation finding in patients with dysferlin deficiency and role of the dysferlin interacting proteins annexin A1 and A2 in muscular dystrophies. Hum Mutat. 2005;26(3):283.

366. Defour A, Medikayala S, Van der Meulen JH, et al. Annexin A2 links poor myofiber repair with inflammation and adipogenic replacement of the injured muscle. Human molecular genetics. 2017;26(11):1979-1991.

367. Chinzei N, Hayashi S, Ueha T, et al. P21 deficiency delays regeneration of skeletal muscular tissue. PloS one. 2015;10(5):e0125765.

368. Adams CM, Ebert SM, Dyle MC. Role of ATF4 in skeletal muscle atrophy. Current opinion in clinical nutrition and metabolic care. 2017;20(3):164-168.

369. Anderson DM, Cannavino J, Li H, et al. Severe muscle wasting and denervation in mice lacking the RNA-binding protein ZFP106. Proceedings of the National Academy of Sciences of the United States of America. 2016;113(31):E4494-4503.

370. von Grabowiecki Y, Abreu P, Blanchard O, et al. Transcriptional activator TAp63 is upregulated in muscular atrophy during ALS and induces the pro-atrophic ubiquitin ligase Trim63. Elife. 2016;5.

371. Fisher AG, Seaborne RA, Hughes TM, et al. Transcriptomic and epigenetic regulation of disuse atrophy and the return to activity in skeletal muscle. FASEB J. 2017;31(12):5268-5282.

372. Rose AJ, Hargreaves M. Exercise increases Ca2+-calmodulin-dependent protein kinase II activity in human skeletal muscle. The Journal of physiology. 2003;553(Pt 1):303-309.

373. Chin ER. The role of calcium and calcium/calmodulin-dependent kinases in skeletal muscle plasticity and mitochondrial biogenesis. Proceedings of the Nutrition Society. 2004;63(2):279-286.

374. Nogales-Gadea G, Brull A, Santalla A, et al. McArdle Disease: Update of Reported Mutations and Polymorphisms in the PYGM Gene. Hum Mutat. 2015;36(7):669-678.

375. Koutsoulidou A, Kyriakides TC, Papadimas GK, et al. Elevated Muscle-Specific miRNAs in Serum of Myotonic Dystrophy Patients Relate to Muscle Disease Progress. PloS one. 2015;10(4):e0125341.

376. Imbriano C, Molinari S. Alternative Splicing of Transcription Factors Genes in Muscle Physiology and Pathology. Genes (Basel). 2018;9(2).

377. Bakay M, Wang Z, Melcon G, et al. Nuclear envelope dystrophies show a transcriptional fingerprint suggesting disruption of Rb-MyoD pathways in muscle regeneration. Brain : a journal of neurology. 2006;129(Pt 4):996-1013.

378. Arashiro P, Eisenberg I, Kho AT, et al. Transcriptional regulation differs in affected facioscapulohumeral muscular dystrophy patients compared to asymptomatic related carriers. Proceedings of the National Academy of Sciences of the United States of America. 2009;106(15):6220-6225.

379. Rahimov F, King OD, Leung DG, et al. Transcriptional profiling in facioscapulohumeral muscular dystrophy to identify candidate biomarkers. Proceedings of the National Academy of Sciences of the United States of America. 2012;109(40):16234-16239.

380. Perfetti A, Greco S, Fasanaro P, et al. Genome wide identification of aberrant alternative splicing events in myotonic dystrophy type 2. PloS one. 2014;9(4):e93983.

381. Tasca G, Pescatori M, Monforte M, et al. Different molecular signatures in magnetic resonance imaging-staged facioscapulohumeral muscular dystrophy muscles. PloS one. 2012;7(6):e38779.

382. Nakamori M, Sobczak K, Puwanant A, et al. Splicing biomarkers of disease severity in myotonic dystrophy. Annals of neurology. 2013;74(6):862-872.

383. Screen M, Raheem O, Holmlund-Hampf J, et al. Gene expression profiling in tibial muscular dystrophy reveals unfolded protein response and altered autophagy. PloS one. 2014;9(3):e90819.

384. Suarez-Calvet X, Gallardo E, Nogales-Gadea G, et al. Altered RIG-I/DDX58-mediated innate immunity in dermatomyositis. J Pathol. 2014;233(3):258-268.

385. Zhu W, Streicher K, Shen N, et al. Genomic signatures characterize leukocyte infiltration in myositis muscles. BMC medical genomics. 2012;5:53.

386. Greenberg SA, Bradshaw EM, Pinkus JL, et al. Plasma cells in muscle in inclusion body myositis and polymyositis. Neurology. 2005;65(11):1782-1787.

387. Barres R, Kirchner H, Rasmussen M, et al. Weight loss after gastric bypass surgery in human obesity remodels promoter methylation. Cell Rep. 2013;3(4):1020-1027.

388. Reich KA, Chen YW, Thompson PD, Hoffman EP, Clarkson PM. Forty-eight hours of unloading and 24 h of reloading lead to changes in global gene expression patterns related to ubiquitination and oxidative stress in humans. J Appl Physiol (1985). 2010;109(5):1404-1415.

389. Urso ML, Scrimgeour AG, Chen YW, Thompson PD, Clarkson PM. Analysis of human skeletal muscle after 48 h immobilization reveals alterations in mRNA and protein for extracellular matrix components. J Appl Physiol (1985). 2006;101(4):1136-1148.

390. Turan N, Kalko S, Stincone A, et al. A systems biology approach identifies molecular networks defining skeletal muscle abnormalities in chronic obstructive pulmonary disease. PLoS computational biology. 2011;7(9):e1002129.

391. Kreiner FF, Borup R, Nielsen FC, Schjerling P, Galbo H. Gene expression profiling in patients with polymyalgia rheumatica before and after symptom-abolishing glucocorticoid treatment. BMC musculoskeletal disorders. 2017;18(1):341.

392. Bachinski LL, Sirito M, Bohme M, Baggerly KA, Udd B, Krahe R. Altered MEF2 isoforms in myotonic dystrophy and other neuromuscular disorders. Muscle & nerve. 2010;42(6):856-863.

393. Pescatori M, Broccolini A, Minetti C, et al. Gene expression profiling in the early phases of DMD: a constant molecular signature characterizes DMD muscle from early postnatal life throughout disease progression. FASEB J. 2007;21(4):1210-1226.

394. Eisenberg I, Novershtern N, Itzhaki Z, et al. Mitochondrial processes are impaired in hereditary inclusion body myopathy. Human molecular genetics. 2008;17(23):3663-3674.

395. Osborne RJ, Welle S, Venance SL, Thornton CA, Tawil R. Expression profile of FSHD supports a link between retinal vasculopathy and muscular dystrophy. Neurology. 2007;68(8):569-577.

396. Greenberg SA, Pinkus JL, Pinkus GS, et al. Interferon-alpha/beta-mediated innate immune mechanisms in dermatomyositis. Annals of neurology. 2005;57(5):664-678.

397. Park JJ, Berggren JR, Hulver MW, Houmard JA, Hoffman EP. GRB14, GPD1, and GDF8 as potential network collaborators in weight loss-induced improvements in insulin action in human skeletal muscle. Physiological genomics. 2006;27(2):114-121.

398. Chen YW, Gregory C, Ye F, et al. Molecular signatures of differential responses to exercise trainings during rehabilitation. Biomed Genet Genom. 2017;2(1).

399. Radom-Aizik S, Kaminski N, Hayek S, Halkin H, Cooper DM, Ben-Dov I. Effects of exercise training on quadriceps muscle gene expression in chronic obstructive pulmonary disease. J Appl Physiol (1985). 2007;102(5):1976-1984.

400. Pradat PF, Dubourg O, de Tapia M, et al. Muscle gene expression is a marker of amyotrophic lateral sclerosis severity. Neurodegener Dis. 2012;9(1):38-52.

401. Smith LR, Chambers HG, Subramaniam S, Lieber RL. Transcriptional abnormalities of hamstring muscle contractures in children with cerebral palsy. PloS one. 2012;7(8):e40686.

402. Batt J, Mathur S, Katzberg HD. Mechanism of ICU-acquired weakness: muscle contractility in critical illness. Intensive Care Med. 2017;43(4):584-586.

403. Bougle A, Rocheteau P, Sharshar T, Chretien F. Muscle regeneration after sepsis. Critical care. 2016;20(1):131.

404. Callahan LA, Supinski GS. Sepsis-induced myopathy. Critical care medicine. 2009;37(10 Suppl):S354-367.

405. Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification system. Critical care medicine. 1985;13(10):818-829.

406. Singer M. The role of mitochondrial dysfunction in sepsis-induced multi-organ failure. Virulence. 2014;5(1):66-72.

407. Christov C, Chretien F, Abou-Khalil R, et al. Muscle satellite cells and endothelial cells: close neighbors and privileged partners. Mol Biol Cell. 2007;18(4):1397-1409.

408. Salmena L, Poliseno L, Tay Y, Kats L, Pandolfi PP. A ceRNA hypothesis: the Rosetta Stone of a hidden RNA language? Cell. 2011;146(3):353-358.

409. Chavali S, Bruhn S, Tiemann K, et al. MicroRNAs act complementarily to regulate disease-related mRNA modules in human diseases. RNA. 2013;19(11):1552-1562.

410. de Andrade HM, de Albuquerque M, Avansini SH, et al. MicroRNAs-424 and 206 are potential prognostic markers in spinal onset amyotrophic lateral sclerosis. Journal of the neurological sciences. 2016;368:19-24.

411. Lewis A, Lee JY, Donaldson AV, et al. Increased expression of H19/miR-675 is associated with a low fat-free mass index in patients with COPD. Journal of cachexia, sarcopenia and muscle. 2016;7(3):330-344.

412. Sweeney TE, Shidham A, Wong HR, Khatri P. A comprehensive time-course-based multicohort analysis of sepsis and sterile inflammation reveals a robust diagnostic gene set. Sci Transl Med. 2015;7(287):287ra271.

413. Kalamgi RC, Larsson L. Mechanical Signaling in the Pathophysiology of Critical Illness Myopathy. Front Physiol. 2016;7:23.

414. Burakiewicz J, Sinclair CDJ, Fischer D, Walter GA, Kan HE, Hollingsworth KG. Quantifying fat replacement of muscle by quantitative MRI in muscular dystrophy. Journal of neurology. 2017;264(10):2053-2067.

415. Terry EE, Zhang X, Hoffmann C, et al. Transcriptional profiling reveals extraordinary diversity among skeletal muscle tissues. Elife. 2018;7.

416. Bullard SA, Seo S, Schilling B, et al. Gadd45a Protein Promotes Skeletal Muscle Atrophy by Forming a Complex with the Protein Kinase MEKK4. The Journal of biological chemistry. 2016;291(34):17496-17509.

417. Alexander JJ, Quigg RJ. Muscle, myeloid cells, and complement: a complex interaction. Cell Mol Immunol. 2018;15(11):992-993.

418. Han R, Frett EM, Levy JR, et al. Genetic ablation of complement C3 attenuates muscle pathology in dysferlin-deficient mice. The Journal of clinical investigation. 2010;120(12):4366-4374.

419. Naito AT, Sumida T, Nomura S, et al. Complement C1q activates canonical Wnt signaling and promotes aging-related phenotypes. Cell. 2012;149(6):1298-1313.

420. Ran D, Daye ZJ. Gene expression variability and the analysis of large-scale RNA-seq studies with the MDSeq. Nucleic acids research. 2017;45(13):e127.

421. Allen JD, Wang S, Chen M, et al. Probe mapping across multiple microarray platforms. Briefings in bioinformatics. 2012;13(5):547-554.

422. Bachinski LL, Baggerly KA, Neubauer VL, et al. Most expression and splicing changes in myotonic dystrophy type 1 and type 2 skeletal muscle are shared with other muscular dystrophies. Neuromuscul Disord. 2014;24(3):227-240.

423. Nakka K, Ghigna C, Gabellini D, Dilworth FJ. Diversification of the muscle proteome through alternative splicing. Skeletal muscle. 2018;8(1):8.

424. Rodrigo-Domingo M, Waagepetersen R, Bodker JS, et al. Reproducible probe-level analysis of the Affymetrix Exon 1.0 ST array with R/Bioconductor. Briefings in bioinformatics. 2014;15(4):519-533.

425. Chemello F, Bean C, Cancellara P, Laveder P, Reggiani C, Lanfranchi G. Microgenomic analysis in skeletal muscle: expression signatures of individual fast and slow myofibers. PloS one. 2011;6(2):e16807.

426. Avila Cobos F, Vandesompele J, Mestdagh P, De Preter K. Computational deconvolution of transcriptomics data from mixed cell populations. Bioinformatics. 2018.

427. Vogel C, Marcotte EM. Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nature reviews Genetics. 2012;13(4):227-232.

428. Battle A, Khan Z, Wang SH, et al. Genomic variation. Impact of regulatory variation from RNA to protein. Science. 2015;347(6222):664-667.

429. Ghazalpour A, Bennett B, Petyuk VA, et al. Comparative analysis of proteome and transcriptome variation in mouse. PLoS genetics. 2011;7(6):e1001393.

430. Hartman JL, Garvik B, Hartwell L. Principles for the buffering of genetic variation. Science. 2001;291(5506):1001-1004.

431. Su AI, Hogenesch JB. Power-law-like distributions in biomedical publications and research funding. Genome Biology. 2007;8(4):404.

432. Haynes WA, Tomczak A, Khatri P. Gene annotation bias impedes biomedical research. Sci Rep. 2018;8(1):1362.

433. Bleazard T, Lamb JA, Griffiths-Jones S. Bias in microRNA functional enrichment analysis. Bioinformatics. 2015;31(10):1592-1598.

434. Godard P, van Eyll J. Pathway analysis from lists of microRNAs: common pitfalls and alternative strategy. Nucleic acids research. 2015;43(7):3490-3497.

435. Latronico N, Rasulo FA. Presentation and management of ICU myopathy and neuropathy. Current opinion in critical care. 2010;16(2):123-127.

436. Gupta S, Ellis SE, Ashar FN, et al. Transcriptome analysis reveals dysregulation of innate immune response genes and neuronal activity-dependent genes in autism. Nat Commun. 2014;5:5748.

437. Szigyarto CA, Spitali P. Biomarkers of Duchenne muscular dystrophy: current findings. Degener Neurol Neuromuscul Dis. 2018;8:1-13.

438. Barreiro E. Models of disuse muscle atrophy: therapeutic implications in critically ill patients. Ann Transl Med. 2018;6(2):29.

439. Wieske L, Witteveen E, Petzold A, et al. Neurofilaments as a plasma biomarker for ICU-acquired weakness: an observational pilot study. Critical care. 2014;18(1):R18.

440. Roberts TC, Godfrey C, McClorey G, et al. Extracellular microRNAs are dynamic non-vesicular biomarkers of muscle turnover. Nucleic acids research. 2013;41(20):9500-9513.

441. Pilling LC, Joehanes R, Kacprowski T, et al. Gene transcripts associated with muscle strength: a CHARGE meta-analysis of 7,781 persons. Physiological genomics. 2016;48(1):1-11.

442. Briggs D, Morgan JE. Recent progress in satellite cell/myoblast engraftment -- relevance for therapy. The FEBS journal. 2013;280(17):4281-4293.

443. Musa A, Ghoraie LS, Zhang SD, et al. A review of connectivity map and computational approaches in pharmacogenomics. Briefings in bioinformatics. 2018;19(3):506-523.

444. Smirnov P, Safikhani Z, El-Hachem N, et al. PharmacoGx: an R package for analysis of large pharmacogenomic datasets. Bioinformatics. 2016;32(8):1244-1246.

445. Wang J, Meng F, Dai E, et al. Identification of associations between small molecule drugs and miRNAs based on functional similarity. Oncotarget. 2016;7(25):38658-38669.

446. Udd B, Meola G, Krahe R, et al. Myotonic dystrophy type 2 (DM2) and related disorders report of the 180th ENMC workshop including guidelines on diagnostics and management 3-5 December 2010, Naarden, The Netherlands. Neuromuscul Disord. 2011;21(6):443-450.

APPENDIX 1 – Supplementary Data

Chapter 3 Supplementary Data

1. Supplementary Table 3.1. Differentially expressed genes (N = 744 unique probes corresponding to 695 genes). A global LIMMA analysis of genes differentially expressed between ICUAW and controls at both day 7 and month 6 post-ICU was performed at a false discovery rate (FDR) of 5%, using data adjusted for age, sex, and correlation between patient samples.

Columns, in order: Illumina ID; Gene Symbol; Non-log fold change, ICUAW Day 7 vs. control; P value (FDR adjusted), Day 7; Non-log fold change, ICUAW Month 6 vs. control; P value (FDR adjusted), Month 6.

ILMN_1818859 HS.407666 -2.052 2.51E-06 -2.285 3.86E-07
ILMN_1710027 PNMT 1.368 0.00167 1.712 2.38E-06
ILMN_1883153 HS.543175 -1.707 1.27E-05 -1.748 1.11E-05
ILMN_1720745 LOC645385 1.715 1.15E-06 1.613 1.43E-05
ILMN_3245346 EFR3B -1.874 6.30E-07 -1.703 1.68E-05
ILMN_1850374 HS.333785 -2.021 5.77E-05 -2.154 2.75E-05
ILMN_1903693 HS.573359 -1.684 6.55E-05 -1.77 2.95E-05
ILMN_1808825 OR51F1 -1.661 0.000339 -1.885 3.00E-05
ILMN_1897911 HS.565086 -2.101 5.01E-05 -2.182 4.02E-05
ILMN_1819158 HS.542632 -1.562 0.00353 -1.995 4.29E-05
ILMN_2283814 C21ORF34 -1.595 0.000169 -1.704 4.72E-05
ILMN_2157717 MYF6 -3.489 0.000362 -4.526 5.22E-05
ILMN_1893033 HS.544451 -1.936 7.12E-05 -2.01 5.35E-05
ILMN_1812461 WISP2 1.302 0.102 2.115 5.43E-05
ILMN_1854165 HS.564962 -1.425 0.000281 -1.516 5.66E-05
ILMN_1817048 HS.545044 -1.922 0.000337 -2.172 6.08E-05
ILMN_1710458 LOC646990 -2.083 0.000279 -2.34 6.80E-05
ILMN_1822180 HS.571435 -1.596 7.79E-05 -1.62 8.10E-05
ILMN_1803882 VEGFA -2.745 1.51E-05 -2.517 8.52E-05
ILMN_1716583 NME7 1.783 3.97E-05 1.758 8.75E-05
ILMN_1718766 MT1F 1.624 0.000812 1.828 9.49E-05

ILMN_2405023 PPP1CB -1.424 0.000483 -1.517 0.000102
ILMN_1750763 LOC643699 -1.662 0.000619 -1.844 0.000103
ILMN_1728202 TMEM22 1.348 0.00661 1.595 0.000109
ILMN_1769229 BCL2A1 -1.438 0.00159 -1.608 0.00012
ILMN_1728787 AGR3 -1.696 0.00027 -1.791 0.000124
ILMN_2073235 FTHL12 -2.093 1.08E-05 -1.889 0.000139
ILMN_1889144 HS.569953 -1.762 0.000689 -1.959 0.000143
ILMN_1757081 SYN2 -1.635 0.00142 -1.872 0.000144
ILMN_1651429 SELM 1.91 0.000408 2.064 0.00016
ILMN_1732318 PTK2B -1.684 0.000323 -1.769 0.000163
ILMN_1874302 HS.543828 -1.897 0.000151 -1.927 0.00017
ILMN_1838702 HS.547858 -1.564 0.00303 -1.842 0.000174
ILMN_1904242 HS.114286 -1.641 0.000112 -1.636 0.000185
ILMN_1703408 FZD3 1.458 0.000516 1.528 0.000199
ILMN_1687842 PGA3 -1.965 0.000566 -2.138 0.000219
ILMN_1699836 LSP1 -1.515 0.000682 -1.605 0.000232
ILMN_2181892 BEX2 2.549 0.00173 3.217 0.000239
ILMN_1914284 HS.538083 -1.931 0.000341 -2.011 0.00025
ILMN_1706660 HYI 1.437 0.000501 1.489 0.000253
ILMN_1659306 SVIL -1.378 0.00656 -1.597 0.000254
ILMN_1854029 HS.566801 -1.775 0.000419 -1.858 0.000256
ILMN_1824465 HS.148448 -2.032 8.32E-05 -1.952 0.000261
ILMN_1780057 RENBP 1.904 0.000131 1.873 0.000271
ILMN_1778505 LOC642771 -2.101 0.000581 -2.269 0.000284
ILMN_1859160 HS.117299 -1.785 0.000259 -1.811 0.000289
ILMN_1877455 HS.541892 -1.763 0.00203 -2.031 0.000289
ILMN_1832180 HS.557356 -1.711 0.000992 -1.861 0.000299
ILMN_1720373 SLC7A5 -1.642 0.00453 -1.972 0.000305
ILMN_2128428 DAB2 1.519 0.0137 1.946 0.000324
ILMN_2223941 FBLN5 1.354 0.00528 1.522 0.000328
ILMN_1777513 KCTD11 -1.345 0.00464 -1.497 0.000331
ILMN_3255124 ATL1 -1.627 0.000678 -1.709 0.000341
ILMN_1695058 SLC38A5 -1.83 0.000104 -1.762 0.000343
ILMN_1693270 SUSD2 1.362 0.0257 1.725 0.000343
ILMN_1874126 HS.544721 -1.41 0.0194 -1.78 0.000344
ILMN_1836172 HS.582136 -1.825 0.00216 -2.106 0.00035
ILMN_1832252 HS.369978 -1.631 0.000934 -1.738 0.000364
ILMN_1703233 LOC653382 -1.399 0.00358 -1.55 0.000371
ILMN_2132898 SPRN -2.049 0.000181 -2.008 0.000379
ILMN_1706304 EIF2C4 -1.403 0.000618 -1.442 0.000383
ILMN_1705627 USP2 -1.734 0.000124 -1.678 0.00039
ILMN_1676361 ARHGAP22 1.568 0.000745 1.638 0.000394

ILMN_2358560 TIAM2 1.534 0.0157 1.975 0.000418
ILMN_1847029 HS.553290 -1.996 7.46E-05 -1.859 0.00042
ILMN_1668619 KIAA1467 1.314 0.00317 1.416 0.000421
ILMN_1772702 SFRS2B 1.177 0.0558 1.395 0.000425
ILMN_1807969 SNCAIP 1.295 0.0728 1.757 0.000438
ILMN_1877156 HS.567436 -1.795 0.00112 -1.939 0.000439
ILMN_1888925 HS.232535 -1.63 0.000212 -1.604 0.000459
ILMN_1879135 HS.555208 -1.694 0.00236 -1.894 0.000501
ILMN_3274671 LOC283481 1.754 1.70E-05 1.561 0.000519
ILMN_1657009 NFASC -1.546 0.00068 -1.587 0.000524
ILMN_1772369 PDHA1 -1.613 0.000347 -1.608 0.00054
ILMN_1813581 CNR1 1.185 0.0951 1.478 0.000554
ILMN_3203534 LOC100132474 -1.413 0.00367 -1.547 0.000559
ILMN_1861915 HS.545663 -1.244 0.0316 -1.466 0.000565
ILMN_1763730 APPL1 -1.657 4.54E-06 -1.44 0.000569
ILMN_2396991 HCST 1.891 0.000189 1.818 0.000573
ILMN_1711422 PLEKHN1 -1.555 0.00274 -1.711 0.000574
ILMN_1740217 HACE1 -1.553 0.000136 -1.495 0.000575
ILMN_1699357 SLC22A5 -1.772 0.00578 -2.144 0.000578
ILMN_2303955 FKBP1B -1.53 0.00121 -1.604 0.000581
ILMN_2375879 VEGFA -2.678 2.74E-05 -2.219 0.00059
ILMN_1663422 RGL4 -1.419 0.000363 -1.413 0.000596
ILMN_1692785 KLHL21 -1.385 0.0154 -1.646 0.000616
ILMN_2365686 ALG8 1.789 6.23E-06 1.529 0.000619
ILMN_1907597 HS.553278 -1.614 0.00362 -1.818 0.000622
ILMN_1833415 LOC730877 -1.517 0.00016 -1.466 0.000633
ILMN_2134538 FTHL11 -2.085 4.69E-05 -1.846 0.000638
ILMN_1902998 HS.17661 -1.472 0.00125 -1.533 0.00064
ILMN_1713918 CYTH3 -1.632 0.000979 -1.696 0.000642
ILMN_2392674 PRR3 -1.539 0.000622 -1.56 0.000646
ILMN_1663417 C22ORF33 -1.575 0.00332 -1.748 0.000654
ILMN_1713496 ST3GAL5 -1.425 0.00755 -1.623 0.000655
ILMN_1880885 HS.545048 -2.103 0.000443 -2.097 0.000664
ILMN_1819590 HS.287720 -1.62 0.000539 -1.631 0.000669
ILMN_1738207 CISH -2.057 0.00918 -2.731 0.000721
ILMN_1914579 HS.562118 -1.447 0.00159 -1.512 0.000731
ILMN_1690780 RFK -1.339 0.00719 -1.48 0.000735
ILMN_3178792 HNRNPA2B1 -1.391 0.000297 -1.367 0.000764
ILMN_1796126 MXRA7 -1.367 0.0091 -1.541 0.000766
ILMN_1666690 ACRC -1.664 0.000392 -1.64 0.00077
ILMN_1807181 BACH1 -1.544 0.00416 -1.716 0.000777
ILMN_1674243 TFRC -5.05 8.49E-08 -2.485 0.000779

ILMN_1823270 HS.544326 -1.485 0.00369 -1.623 0.00078
ILMN_1685433 COL8A1 1.501 0.042 2.082 0.000785
ILMN_1839905 HS.544351 -1.277 0.00807 -1.392 0.000791
ILMN_1860789 HS.313056 -2.035 0.000218 -1.923 0.000799
ILMN_2340643 INSC -1.999 7.00E-04 -2.029 0.000802
ILMN_1791728 SLC25A25 -2.632 0.00101 -2.778 0.000825
ILMN_1859259 HS.537451 -1.569 0.00127 -1.626 0.000838
ILMN_1651967 TP53I3 -1.645 0.0044 -1.855 0.000841
ILMN_1758895 CTSK 2.261 0.00088 2.334 0.000843
ILMN_3223843 LOC90499 -1.434 0.00242 -1.515 0.000866
ILMN_1796663 B4GALNT4 1.89 4.83E-05 1.676 0.000872
ILMN_1670037 POLR2L 1.351 0.0221 1.605 0.000876
ILMN_1803813 ASTE1 1.285 0.0409 1.56 0.000902
ILMN_1776213 RGMB -1.444 0.0125 -1.689 0.000903
ILMN_1659270 OTP -1.498 0.0034 -1.62 0.000914
ILMN_1903750 HS.544238 -1.751 0.000183 -1.651 0.000919
ILMN_1671123 LOC647543 1.711 0.0445 2.627 0.000928
ILMN_2408885 HDAC9 -1.661 0.000101 -1.541 0.000929
ILMN_2103720 MRPL15 -1.531 7.71E-05 -1.426 0.000951
ILMN_1768534 BHLHB2 -1.802 0.0195 -2.457 0.000953
ILMN_2229877 PCDH18 1.383 0.0322 1.721 0.000963
ILMN_1816948 HS.146561 -1.418 0.00459 -1.537 0.000976
ILMN_1689142 UBE1C -1.487 0.000575 -1.476 0.000986
ILMN_1736093 SNX33 1.216 0.0326 1.388 0.000988
ILMN_1821883 HS.539119 -1.788 0.00173 -1.89 0.000993
ILMN_1795336 PTER -2.519 8.45E-06 -1.928 0.001
ILMN_1755843 SLC26A8 -1.407 0.00499 -1.527 0.00101
ILMN_3244456 LOC100134122 -1.46 0.0019 -1.519 0.00102
ILMN_1765454 LOC732111 -1.325 0.00768 -1.446 0.00103
ILMN_1774589 IQCC -1.644 0.000783 -1.648 0.00105
ILMN_1703475 LOC642981 -1.712 0.000628 -1.697 0.00105
ILMN_1770787 DDAH2 1.241 0.137 1.694 0.00105
ILMN_1743836 MXRA7 1.111 0.35 1.512 0.00105
ILMN_1653940 USP2 -1.451 0.00236 -1.522 0.00106
ILMN_1858507 HS.544228 -1.627 0.000919 -1.642 0.00107
ILMN_2400922 OPRL1 1.283 0.00412 1.349 0.00108
ILMN_1655296 UTRN -1.939 0.0013 -2.009 0.00109
ILMN_1852279 HS.528210 -1.519 0.00102 -1.536 0.0011
ILMN_1719788 LDHAL6B -1.574 0.00224 -1.657 0.00112
ILMN_2381603 ING3 1.179 0.0682 1.38 0.00113
ILMN_1850574 HS.545111 -1.518 0.00504 -1.668 0.00115
ILMN_1836060 HS.582211 -1.364 0.00575 -1.472 0.00116

ILMN_2220184 GFPT1 1.275 0.0162 1.422 0.00116
ILMN_1909197 HS.580452 1.273 0.0158 1.417 0.00116
ILMN_1785191 TMEM14A 1.431 0.0105 1.625 0.00116
ILMN_1790985 DJ341D10.1 -1.393 0.0148 -1.606 0.00117
ILMN_2078074 MUT -1.447 0.00467 -1.564 0.00117
ILMN_1670708 F10 1.131 0.472 1.864 0.00117
ILMN_1861128 HS.543412 -1.489 0.0185 -1.802 0.00119
ILMN_1738707 S100A13 1.455 0.00899 1.641 0.00119
ILMN_1792384 HABP4 1.684 1.49E-05 1.458 0.0012
ILMN_3241607 LOC100132106 -1.81 0.000175 -1.671 0.00123
ILMN_1879739 HS.565146 -1.34 0.0131 -1.503 0.00123
ILMN_1672662 SLC20A1 -1.518 0.00543 -1.669 0.00123
ILMN_2071809 MGP 1.72 0.0213 2.269 0.00123
ILMN_1909895 HS.570330 -1.218 0.118 -1.563 0.00128
ILMN_3223966 LOC730254 -1.641 0.00265 -1.741 0.00128
ILMN_1719641 SMOC2 1.277 0.104 1.7 0.00129
ILMN_1745005 GGCT -1.509 2.12E-05 -1.353 0.00132
ILMN_1851912 HS.385477 -1.402 0.00263 -1.457 0.00133
ILMN_2085922 WRB -1.455 0.000831 -1.446 0.00137
ILMN_3261022 LOC100129906 -1.517 0.00455 -1.64 0.00138
ILMN_1882485 HS.554595 -1.458 6.89E-05 -1.349 0.0014
ILMN_1897230 HS.543617 -1.776 0.00346 -1.926 0.00143
ILMN_2088847 OTUD5 -1.25 0.0273 -1.414 0.00144
ILMN_1762106 MMP2 1.468 0.00192 1.506 0.00145
ILMN_3179396 LOC100129410 -1.493 0.00353 -1.581 0.00146
ILMN_1698334 LOC728863 -1.43 0.00156 -1.451 0.00147
ILMN_2344455 G3BP1 -1.242 0.0558 -1.48 0.00149
ILMN_1796431 GPR101 -1.807 0.00119 -1.817 0.0015
ILMN_1911873 HS.544751 -1.57 0.00217 -1.626 0.0015
ILMN_3237099 LOC100134815 -1.605 0.000314 -1.523 0.00152
ILMN_1857546 HS.575831 -1.416 0.00272 -1.467 0.00153
ILMN_2252136 YWHAE -1.198 0.0629 -1.397 0.00156
ILMN_1702489 TRIM63 -1.879 0.0038 -2.056 0.00158
ILMN_1706410 HUS1B -1.693 0.00254 -1.773 0.00162
ILMN_1795464 LTA -1.451 0.000859 -1.434 0.00163
ILMN_1817816 HS.560319 -1.406 0.0219 -1.652 0.00164
ILMN_2171783 CPEB3 -1.281 0.0637 -1.579 0.00166
ILMN_1860753 HS.85445 -1.529 0.000378 -1.462 0.00168
ILMN_1838836 HS.282153 -1.557 0.00182 -1.587 0.00169
ILMN_1658743 CCNDBP1 1.65 4.16E-07 1.326 0.00172
ILMN_1685483 FETUB -1.934 3.89E-05 -1.634 0.00173
ILMN_1822220 HS.128709 -1.379 0.00409 -1.443 0.00173

ILMN_1734010 C10ORF118 1.36 0.00577 1.443 0.00175
ILMN_2327947 SLC25A25 -1.874 0.00394 -2.041 0.00176
ILMN_1728349 TMEM63B -1.274 0.0126 -1.382 0.00176
ILMN_1703079 NFS1 -1.247 0.0284 -1.403 0.00177
ILMN_1703511 PDZRN3 -1.507 0.0273 -1.867 0.00177
ILMN_2277523 DIP2A -1.5 0.00341 -1.571 0.00178
ILMN_1672331 MAP3K7IP2 -1.3 0.0102 -1.402 0.0018
ILMN_2189842 SNORA10 -1.164 0.225 -1.54 0.0018
ILMN_1781819 PAPSS1 1.548 0.000211 1.445 0.00182
ILMN_1691090 MPV17 1.387 0.00427 1.453 0.00183
ILMN_1799600 STARD8 -1.387 0.00347 -1.438 0.00186
ILMN_1675721 UBE2R2 -1.24 0.0125 -1.329 0.00186
ILMN_1889215 HS.61208 -1.947 0.00263 -2.047 0.00187
ILMN_2306565 MTX2 -1.3 0.00404 -1.346 0.00187
ILMN_2401730 C1GALT1C1 1.689 1.96E-05 1.445 0.00188
ILMN_1774207 ANGPT2 2.043 0.000102 1.761 0.0019
ILMN_1678116 XAGE1E -1.455 0.00129 -1.45 0.00192
ILMN_1773125 ENTPD1 -1.999 0.000952 -1.944 0.00195
ILMN_1690802 TRMT112 1.879 2.65E-06 1.475 0.00196
ILMN_3230241 LOC728975 -1.657 0.0018 -1.674 0.00204
ILMN_2197164 TAAR1 -1.507 0.000529 -1.44 0.00232
ILMN_3243034 LOC100134098 -1.384 0.00132 -1.372 0.00234
ILMN_1696911 FTHL8 -1.955 8.01E-05 -1.659 0.00245
ILMN_3201658 LOC642585 -1.595 0.000149 -1.447 0.00248
ILMN_2053415 LDLR -2.176 1.25E-05 -1.67 0.00254
ILMN_1671905 C10ORF78 1.547 0.00161 1.535 0.00256
ILMN_1748034 KLHDC4 1.623 0.000153 1.463 0.00269
ILMN_2174574 HNRNPA3P1 -1.956 2.42E-06 -1.484 0.00282
ILMN_1916513 HS.572883 -1.569 0.00166 -1.549 0.0029
ILMN_1781626 C1S 1.697 0.000885 1.618 0.0029
ILMN_2407389 GPNMB 2.405 8.92E-06 1.744 0.00297
ILMN_1859701 HS.545514 -1.634 0.000483 -1.522 0.00301
ILMN_1732410 SLC16A9 2.371 0.000155 1.954 0.00304
ILMN_1694466 ZBED1 -1.468 0.00166 -1.449 0.00305
ILMN_3292114 LOC100131510 -1.682 9.07E-05 -1.471 0.00309
ILMN_1727177 MGAM -2.132 2.29E-05 -1.662 0.0031
ILMN_1701386 STRADB -1.845 1.31E-06 -1.409 0.00315
ILMN_2071446 PI15 -1.427 8.57E-05 -1.299 0.00323
ILMN_1766094 MOSPD2 1.464 0.000156 1.34 0.0033
ILMN_1658460 LOC653884 1.561 8.93E-05 1.387 0.00332
ILMN_3230880 BEND7 -1.644 0.000101 -1.445 0.00337
ILMN_1804833 LOC645039 -1.474 0.000298 -1.367 0.00346

ILMN_1776493 MTUS1 -1.428 0.00121 -1.386 0.00361
ILMN_3234124 C17ORF101 1.463 0.00104 1.408 0.00368
ILMN_1715543 ACOT1 -1.584 0.000202 -1.426 0.0038
ILMN_1746948 MYL5 4.574 0.00113 3.943 0.00385
ILMN_1716237 ACOT2 -1.736 2.78E-05 -1.436 0.00416
ILMN_2107991 HABP4 1.803 6.45E-05 1.509 0.00416
ILMN_2345016 PTGES2 -1.418 0.000748 -1.349 0.0042
ILMN_1681754 GGH 1.731 0.000878 1.608 0.00437
ILMN_1667577 LCMT2 1.44 0.000231 1.323 0.00438
ILMN_1904843 HS.129547 -1.47 0.000709 -1.386 0.00442
ILMN_2173835 FTHL3 -1.795 0.000499 -1.61 0.00465
ILMN_1785891 PRKD1 1.615 0.000934 1.514 0.00466
ILMN_1654016 MRLC2 1.519 2.55E-05 1.308 0.00467
ILMN_2203588 MYL5 4.935 0.00126 4.127 0.00476
ILMN_1757287 MAPK6 -1.557 0.000381 -1.42 0.00479
ILMN_1750008 SUPV3L1 -1.429 0.000225 -1.309 0.00487
ILMN_1733535 ZNF366 -1.929 5.45E-05 -1.558 0.00492
ILMN_1790741 RNF126 -1.525 0.000137 -1.357 0.00493
ILMN_1658437 SFXN4 -1.397 0.000307 -1.293 0.00519
ILMN_1870085 HS.128892 -1.688 0.00125 -1.582 0.00524
ILMN_1699772 RRAGD -1.423 0.0013 -1.363 0.00533
ILMN_3268813 LOC100130934 -1.626 0.00161 -1.547 0.00534
ILMN_3250067 ANGPT2 2.114 0.000572 1.834 0.00534
ILMN_1704793 MYPOP -1.616 4.79E-05 -1.371 0.00557
ILMN_2409167 ANXA2 1.634 0.00187 1.558 0.0058
ILMN_1884357 HS.137293 -1.692 0.00151 -1.59 0.00582
ILMN_1687351 ANKRA2 1.768 4.51E-06 1.374 0.00586
ILMN_2250445 SMTN -1.662 3.09E-05 -1.378 0.006
ILMN_1723978 LGALS1 1.491 0.000272 1.346 0.0062
ILMN_1742269 LOC642726 -1.681 0.00117 -1.552 0.00644
ILMN_2401258 FAM13A 1.721 0.000162 1.467 0.00665
ILMN_1849997 HS.436627 -1.681 0.0015 -1.567 0.00671
ILMN_1727618 C8ORF38 -1.434 0.000677 -1.331 0.007
ILMN_2358919 TP53I3 2.14 0.000189 1.713 0.00714
ILMN_1775008 NCAPD2 1.586 0.00167 1.49 0.00725
ILMN_3293244 LOC100131744 -1.353 0.00153 -1.295 0.00728
ILMN_1757660 CAPS -1.374 0.00102 -1.298 0.00733
ILMN_1672554 C17ORF81 -1.46 9.42E-05 -1.285 0.00766
ILMN_1778951 C6ORF203 -1.47 3.00E-04 -1.324 0.00768
ILMN_1890788 HS.534997 -1.522 0.00177 -1.437 0.0078
ILMN_1702247 CCNDBP1 1.64 5.75E-06 1.309 0.0078
ILMN_1651643 ASB11 -1.654 0.000224 -1.428 0.00782

ILMN_2215640 TUBA3D -2.235 0.000316 -1.794 0.00801
ILMN_1805636 PGAP3 1.491 0.00106 1.383 0.00812
ILMN_1718961 BNIP3L 1.395 0.00089 1.303 0.00826
ILMN_2403237 CHN2 1.905 0.000264 1.582 0.0083
ILMN_1774287 CFB 1.707 0.00033 1.474 0.00831
ILMN_1752510 FAM13A 1.611 0.000418 1.424 0.00833
ILMN_1892608 HS.76704 -1.482 0.00182 -1.397 0.00887
ILMN_1808792 ALKBH6 1.438 0.000978 1.334 0.00906
ILMN_2367275 BCL7B -1.401 0.000524 -1.286 0.0091
ILMN_3247023 FLJ22536 1.716 0.000583 1.503 0.0091
ILMN_1735822 TTC30A 1.651 7.14E-05 1.372 0.00925
ILMN_1806601 GRSF1 -1.437 9.70E-05 -1.263 0.00931
ILMN_1782167 RPL32 1.448 8.86E-05 1.267 0.00941
ILMN_1801348 GOT2 -2.048 4.84E-06 -1.455 0.00955
ILMN_1679150 LOC387647 -1.857 0.00093 -1.621 0.00981
ILMN_3233388 RELL1 -1.512 0.000357 -1.341 0.0102
ILMN_1718520 C10ORF59 -1.781 8.10E-05 -1.436 0.0104
ILMN_1703335 LACTB -1.488 0.00044 -1.331 0.0105
ILMN_1814924 FAM55C -1.691 0.000261 -1.433 0.0107
ILMN_1828245 HS.542149 -1.485 0.000946 -1.357 0.0107
ILMN_2336647 NNT -1.802 2.97E-05 -1.407 0.0107
ILMN_1712754 NFKBIB -1.441 0.000955 -1.322 0.0114
ILMN_2104877 CMPK1 -1.342 0.0014 -1.264 0.0115
ILMN_1811754 NDUFB10 -1.566 6.51E-06 -1.262 0.0115
ILMN_2153485 NMNAT3 -1.737 9.45E-05 -1.412 0.0115
ILMN_2336133 SULT1A4 1.423 0.000623 1.295 0.0115
ILMN_1754244 MYH8 12.6 0.000134 5.056 0.0117
ILMN_1712530 AKAP1 -1.612 0.000649 -1.419 0.0118
ILMN_1807359 CLEC11A 1.568 0.00191 1.444 0.0118
ILMN_1741180 HEXDC -1.455 0.00107 -1.334 0.0119
ILMN_1704369 LIMA1 1.576 0.00115 1.422 0.0119
ILMN_1690085 STK11IP 1.389 0.000533 1.266 0.0119
ILMN_1752199 LHPP 1.377 0.000601 1.261 0.0122
ILMN_3247835 CXORF64 1.695 0.00122 1.505 0.0124
ILMN_1763941 LRRC49 1.413 0.000546 1.279 0.0126
ILMN_1745152 UQCC -1.434 0.00059 -1.295 0.0127
ILMN_3182171 FGGY 1.931 0.000504 1.593 0.0127
ILMN_2307656 AGTRAP 2.38 2.86E-05 1.629 0.0128
ILMN_2368773 FAM3C 1.443 0.000578 1.3 0.0128
ILMN_1796244 CD2BP2 1.682 2.01E-06 1.278 0.0132
ILMN_1764043 TTL -1.566 0.0016 -1.424 0.0133
ILMN_1718853 UQCRC2 -1.381 0.000522 -1.255 0.0135

215 ILMN_1790891 CKAP4 1.424 0.00139 1.313 0.014 ILMN_1729130 C7ORF42 1.547 0.000196 1.323 0.0141 ILMN_1739496 PRRX1 1.552 0.00191 1.415 0.0148 ILMN_2183784 TTC12 1.593 0.000782 1.397 0.015 ILMN_1745779 TCTEX1D2 1.532 0.000279 1.321 0.0151 ILMN_1748077 DDX59 -1.57 0.000465 -1.36 0.0152 ILMN_1795839 SCCPDH -1.522 0.00146 -1.378 0.0152 ILMN_1801205 GPNMB 2.524 2.09E-05 1.642 0.0152 ILMN_2326512 CASP1 1.685 0.000186 1.389 0.0153 ILMN_2305225 NDRG4 -1.761 9.03E-05 -1.399 0.0155 ILMN_1796773 BTBD8 -2.057 4.10E -05 -1.496 0.0156 ILMN_1802089 SYMPK 1.547 0.000453 1.344 0.0157 ILMN_1724266 LYPD2 -1.422 0.000643 -1.279 0.0159 ILMN_1847612 HS.543981 -1.407 0.00115 -1.287 0.016 ILMN_3205656 LOC391075 -1.519 0.000654 -1.339 0.0161 ILMN_1742789 LPXN 1.545 0.000232 1.318 0.0162 ILMN_1708516 PRTFDC1 1.408 0.00136 1.293 0.0162 ILMN_1858599 HS.20255 1.487 0.0012 1.341 0.0163 ILMN_1767816 APH1B 1.451 0.000728 1.299 0.0165 ILMN_1704079 RBM38 -1.588 0.00152 -1.418 0.0166 ILMN_2121282 MRPS18B -1.618 1.01E-05 -1.272 0.0171 ILMN_3200421 LOC641746 -1.78 0.000114 -1.408 0.0174 ILMN_1732187 TMEM143 -1.899 0.000146 -1.473 0.0174 ILMN_1723834 FLJ32011 -1.456 0.00155 -1.326 0.0176 ILMN_1683133 KLF15 -1.715 0.000622 -1.447 0.0176 ILMN_3265228 LOC100128392 -1.703 0.000754 -1.448 0.0177 ILMN_1652379 SUCLG2 -1.468 0.000359 -1.283 0.0179 ILMN_2173611 MT1E 2.994 0.000329 2.024 0.0181 ILMN_1673535 LOC340156 -3.739 4.68E-05 -2.065 0.0182 ILMN_1805826 BIVM -1.568 4.13E-05 -1.277 0.0184 ILMN_1766054 ABCA1 2.406 1.50E-06 1.472 0.0184 ILMN_1677452 REXO4 -1.514 0.00188 -1.37 0.0185 ILMN_1745041 TCF15 -1.649 0.00011 -1.34 0.0185 ILMN_3248069 LOC653881 2.068 0.000121 1.532 0.0188 ILMN_1733998 DHRS9 3.387 2.21E-05 1.881 0.0189 ILMN_1745034 SLC11A2 -1.7 0.000111 -1.361 0.0192 ILMN_3236259 PPIAL4A 1.762 0.000226 1.417 0.0194 ILMN_1703370 ZDHHC12 1.554 9.72E-05 1.288 0.0194 ILMN_1729142 CENPV -1.81 0.000614 -1.489 0.0197 ILMN_1724533 LY96 2.094 0.000848 1.667 0.0197 ILMN_1682993 NKG7 2.604 0.000161 1.765 0.0202 
ILMN_1800103 LOC731196 -1.547 0.000273 -1.309 0.0207

216 ILMN_1729509 C1ORF43 -1.41 0.000164 -1.225 0.021 ILMN_2397880 CSTF3 1.422 0.00151 1.291 0.021 ILMN_1775182 GSR -1.5 0.000832 -1.317 0.0215 ILMN_1663605 RNF123 -1.324 0.00128 -1.22 0.0219 ILMN_1704477 COX5A -1.387 0.000172 -1.212 0.0223 ILMN_1838863 HS.497591 -1.602 0.000941 -1.38 0.0223 ILMN_1761309 ADCK5 1.488 0.000221 1.269 0.0224 ILMN_2048607 ANKRD9 -1.414 0.000979 -1.268 0.0226 ILMN_3244650 FLJ41941 -3.582 0.000157 -2.098 0.0227 ILMN_1691611 LOC645436 1.48 0.000693 1.295 0.023 ILMN_1725899 HOXC10 -1.514 0.00105 -1.329 0.0232 ILMN_1667670 SLC25A15 -1.546 0.00072 -1.334 0.0233 ILMN_1667418 LOC283953 -1.712 0.000868 -1.434 0.0235 ILMN_1813938 CHCHD4 -1.405 0.000251 -1.226 0.0237 ILMN_3264066 LOC100129720 -1.403 0.00154 -1.272 0.0239 ILMN_1709635 SLC38A11 -1.527 0.00101 -1.333 0.0242 ILMN_1777644 PIB5PA -2.344 6.16E-06 -1.475 0.0243 ILMN_1815882 HNRNPA1 1.512 0.000606 1.305 0.0244 ILMN_1707475 UBE2E2 1.458 0.000289 1.255 0.0245 ILMN_1805225 LPCAT3 1.688 0.000102 1.334 0.0252 ILMN_2128967 C11ORF1 1.851 8.60E-05 1.396 0.0254 ILMN_2404049 RBM38 -1.522 0.0019 -1.352 0.0256 ILMN_1749403 TSPAN33 -1.484 0.001 -1.303 0.0256 ILMN_1791388 ZNF787 -1.457 0.00067 -1.273 0.0264 ILMN_2367428 FAM96A -1.825 0.000281 -1.428 0.0266 ILMN_1731233 GZMH 2.274 0.00053 1.671 0.027 ILMN_1739886 HNF4A -2.04 0.00012 -1.48 0.0271 ILMN_1788099 LSM4 1.372 0.00171 1.247 0.0276 ILMN_1741475 C7ORF47 1.576 0.000164 1.291 0.0277 ILMN_2156172 HK2 -1.892 0.000458 -1.477 0.028 ILMN_1670901 COX10 -1.831 3.41E-05 -1.35 0.0283 ILMN_2360291 UGCGL1 1.46 0.000571 1.266 0.0283 ILMN_1704398 FZD9 -1.673 0.0013 -1.414 0.0287 ILMN_1701998 AFAP1 1.673 0.00186 1.434 0.0288 ILMN_3239568 MYLK4 -3.616 0.000123 -2.012 0.0289 ILMN_1708496 BRSK2 -2.232 0.000379 -1.614 0.0293 ILMN_2334296 IL18BP 1.75 0.000322 1.388 0.0296 ILMN_1714861 CD68 2.134 0.00178 1.686 0.0303 ILMN_1710756 ENO1 1.465 0.000271 1.245 0.0305 ILMN_2075334 HIST1H4C 1.419 0.000857 1.25 0.0306 ILMN_1851107 HS.280924 -1.575 0.000123 -1.277 0.0307 
ILMN_1740590 LOC647881 -1.739 0.000462 -1.394 0.031

217 ILMN_2347068 MKNK2 -1.638 0.00141 -1.391 0.031 ILMN_3291053 LOC346085 -1.655 0.000677 -1.368 0.0311 ILMN_2097858 KIAA1737 -1.614 0.00128 -1.372 0.0313 ILMN_2094166 CHMP5 1.57 0.000126 1.274 0.0314 ILMN_1660602 C1ORF43 -1.411 0.000597 -1.235 0.0315 ILMN_2050255 UCKL1 -1.398 0.000516 -1.224 0.0318 ILMN_1749409 LOC649864 -2.184 0.000286 -1.563 0.032 ILMN_3182120 LOC100129522 1.672 0.000881 1.385 0.0321 ILMN_1683023 PDGFC 1.823 0.000288 1.409 0.0323 ILMN_2319994 RPL3 1.841 0.00113 1.485 0.0325 ILMN_1792110 C10ORF76 -1.906 1.21E -05 -1.335 0.0331 ILMN_1782635 YARS2 -1.37 0.00122 -1.226 0.0341 ILMN_1667977 TAF1B 1.49 0.000287 1.252 0.0341 ILMN_2225144 EIF4E3 -1.553 0.000302 -1.283 0.0346 ILMN_1669669 KCMF1 -1.442 0.000924 -1.258 0.0347 ILMN_1778478 CCDC28B -1.758 0.00077 -1.415 0.0348 ILMN_2180315 ATG4D -1.381 0.000655 -1.216 0.0349 ILMN_1751206 MLLT10 -2.194 0.000208 -1.536 0.035 ILMN_1759396 NNT -1.822 3.45E -05 -1.33 0.0353 ILMN_1731699 RAB15 1.998 0.00109 1.549 0.0358 ILMN_1902571 HS.557622 -1.696 0.000169 -1.323 0.0374 ILMN_1722309 ENDOG -1.829 3.53E-05 -1.327 0.038 ILMN_1810392 ZNHIT2 1.547 0.000225 1.266 0.0384 ILMN_1736940 HPRT1 1.668 4.07E-05 1.273 0.039 ILMN_1767129 ABCC8 1.796 0.00028 1.376 0.0399 ILMN_1675421 LOC389293 -1.735 0.00115 -1.409 0.04 ILMN_1666553 SLC25A19 -1.483 0.000969 -1.272 0.04 ILMN_2385647 ALAS1 -2.162 1.47E -06 -1.338 0.0404 ILMN_1764410 C22ORF13 -1.531 0.000189 -1.251 0.0406 ILMN_2322842 PPHLN1 1.531 0.000714 1.286 0.0407 ILMN_1703005 IFP38 -1.413 0.00127 -1.241 0.0408 ILMN_1683576 MAGED2 1.669 0.00101 1.367 0.0408 ILMN_1787344 LOC652078 -1.374 0.00134 -1.221 0.0411 ILMN_3307930 RAN 1.379 0.000327 1.192 0.0416 ILMN_3281599 LOC642741 1.831 0.00113 1.45 0.0421 ILMN_3250389 LOC440895 1.628 2.22E -05 1.239 0.044 ILMN_1718852 PLCL1 -1.616 0.0016 -1.353 0.0443 ILMN_1748916 C18ORF55 -1.625 0.000122 -1.273 0.0445 ILMN_1698307 DBNL 1.578 0.000295 1.276 0.0445 ILMN_1786612 PSME2 1.573 0.000101 1.248 0.0447 ILMN_1798485 ATP6V1E1 1.438 0.00104 
1.242 0.0459 ILMN_2345319 PREPL 1.694 0.000369 1.33 0.0459

218 ILMN_2244841 ALDH4A1 1.836 0.000539 1.405 0.046 ILMN_1779616 SUCLG1 -1.54 7.09E-05 -1.226 0.0465 ILMN_1787109 CLK2 1.397 0.00116 1.222 0.0473 ILMN_1764266 CKMT2 -1.948 0.00011 -1.383 0.0476 ILMN_1666306 SRRD -1.534 0.00032 -1.255 0.0476 ILMN_1673026 CHCHD3 -1.6 0.000167 -1.266 0.0477 ILMN_2376502 RHOBTB1 -1.668 0.0019 -1.38 0.0479 ILMN_1652357 PDHX -1.454 0.00133 -1.254 0.049 ILMN_1759023 WFS1 1.422 0.00131 1.236 0.0496 ILMN_1700349 ADCK4 1.513 0.000316 1.243 0.0497 ILMN_2085844 GXYLT2 -2.086 4.73E -05 -1.393 0.05 ILMN_1812777 MRPL35 -1.423 0.000583 -1.215 0.0503 ILMN_1772261 GLG1 1.587 0.000141 1.254 0.0503 ILMN_1722622 CD163 1.968 0.00191 1.524 0.0507 ILMN_3247821 TMEM206 1.44 0.000372 1.213 0.0507 ILMN_1801996 MASP1 -3.231 1.94E-06 -1.533 0.0512 ILMN_2408001 RFWD2 -1.314 0.00104 -1.173 0.0512 ILMN_2207328 C18ORF10 1.802 7.38E-05 1.313 0.0518 ILMN_1714536 SCGB1D2 -3.46 1.28E -05 -1.652 0.0542 ILMN_1695717 RBM41 -1.412 0.000855 -1.215 0.0544 ILMN_1664921 PPP6C -1.377 0.000404 -1.183 0.0551 ILMN_1758827 RTN4IP1 -1.63 0.000295 -1.282 0.0557 ILMN_1692896 JMJD4 1.476 0.000878 1.243 0.0564 ILMN_1745271 EXOSC4 1.628 0.00152 1.334 0.0568 ILMN_1726114 SLC45A3 -2.267 0.000839 -1.577 0.0569 ILMN_2109708 ECGF1 1.746 0.000583 1.349 0.0569 ILMN_1733690 AKAP7 -1.985 0.000661 -1.451 0.0572 ILMN_1715508 NNMT 6.627 0.00039 2.657 0.0574 ILMN_1665123 NMNAT3 -1.57 0.00166 -1.308 0.0575 ILMN_2138801 TP73L 1.94 0.000216 1.384 0.0575 ILMN_2342695 PDGFA -1.591 0.000809 -1.293 0.0577 ILMN_1786278 FAM149A -2.402 0.000124 -1.506 0.0583 ILMN_1661804 CTF1 -1.509 0.00049 -1.241 0.0587 ILMN_1663532 RIC8B -1.601 7.24E-05 -1.234 0.0589 ILMN_2355665 MTP18 -2.38 0.00159 -1.664 0.0594 ILMN_1652006 SMC1A 1.405 0.000493 1.195 0.0594 ILMN_1774272 ESRRA -1.62 0.000154 -1.256 0.0598 ILMN_1652906 GBGT1 1.324 0.00145 1.176 0.0611 ILMN_1740633 PRF1 1.556 0.000997 1.276 0.0627 ILMN_1700081 FST 5.63 0.000403 2.401 0.0628 ILMN_2406892 C19ORF2 -1.463 0.00015 -1.194 0.0631 ILMN_2240009 ADSSL1 -2.054 0.00153 
-1.512 0.0641

219 ILMN_1728049 S100A16 1.657 0.000997 1.319 0.0647 ILMN_1741599 MEMO1 -1.448 0.000466 -1.207 0.0648 ILMN_1689828 DMPK 2.21 0.000285 1.47 0.0658 ILMN_1656186 SLC41A1 -1.944 0.000185 -1.365 0.066 ILMN_1693310 ITFG1 1.489 3.96E-05 1.18 0.066 ILMN_2345739 CAPRIN2 1.99 1.16E-05 1.3 0.0666 ILMN_1712067 CCDC135 -1.413 0.00168 -1.219 0.067 ILMN_1741148 ALDOA -1.613 7.43E-05 -1.23 0.0673 ILMN_1751753 IDH2 -1.647 0.000232 -1.267 0.0677 ILMN_3253126 FLJ41484 3.161 0.000127 1.676 0.0693 ILMN_1752728 FUCA1 1.75 0.000267 1.305 0.0695 ILMN_1738866 DEXI -1.558 0.000231 -1.23 0.0716 ILMN_3283772 LOC644237 -1.499 0.00189 -1.26 0.0717 ILMN_1705397 PDK2 -1.851 0.000481 -1.356 0.0734 ILMN_1669409 VSIG4 1.619 0.00189 1.314 0.0734 ILMN_2393341 LIAS -1.632 0.00017 -1.247 0.074 ILMN_1765746 SFT2D3 1.476 0.00117 1.233 0.0741 ILMN_1813344 C20ORF7 -1.507 0.00102 -1.242 0.0751 ILMN_1810228 TTF2 2.499 5.86E -05 1.459 0.076 ILMN_2099586 CCDC28B -1.717 0.00162 -1.347 0.0765 ILMN_3213792 LOC439953 1.614 0.000745 1.276 0.0771 ILMN_3246608 CENPV -1.745 0.00143 -1.352 0.0781 ILMN_1703324 PDSS1 -1.62 0.000177 -1.24 0.0781 ILMN_1681503 MCM2 1.689 0.000179 1.263 0.0784 ILMN_1761084 FNDC5 -2.019 0.00103 -1.443 0.0785 ILMN_1810875 SYNGR1 -1.684 0.000264 -1.271 0.0785 ILMN_1655694 LOC642031 -1.58 0.000173 -1.224 0.0801 ILMN_1801045 SPIN3 1.349 0.00125 1.172 0.0805 ILMN_2350607 C20ORF7 -1.555 0.00168 -1.272 0.0807 ILMN_1672081 NEURL -1.718 0.00175 -1.342 0.0827 ILMN_1814823 FTL 1.637 0.000998 1.287 0.0837 ILMN_3193306 C14ORF109 -1.446 0.000419 -1.189 0.0846 ILMN_2409055 FAM13C1 2.061 0.000791 1.432 0.0856 ILMN_1772964 CCL8 1.881 0.000796 1.367 0.0869 ILMN_1794230 SCAND1 1.442 0.00175 1.217 0.0872 ILMN_1779324 GZMA 2.003 0.00113 1.427 0.0875 ILMN_2052331 MYOC -2.382 0.000656 -1.523 0.0876 ILMN_2219556 ISCA1 -1.485 0.000971 -1.217 0.0911 ILMN_1785892 MASP1 -3.601 6.55E-07 -1.456 0.0928 ILMN_1731851 OXA1L -1.445 0.000205 -1.171 0.0933 ILMN_1806349 SLC6A8 -1.76 1.65E-05 -1.222 0.0951 ILMN_3237286 LOC100132794 
-2.261 5.73E-05 -1.369 0.0957

220 ILMN_1713985 MAF1 -1.347 0.000603 -1.149 0.0974 ILMN_2405305 ARNTL 2.342 0.000809 1.503 0.0977 ILMN_1662640 C20ORF127 2.355 0.000735 1.501 0.0979 ILMN_3271630 FGGY 1.772 0.000284 1.282 0.0987 ILMN_1653466 HES4 -1.891 0.000797 -1.354 0.0991 ILMN_1747759 WSB1 1.433 0.00125 1.195 0.101 ILMN_1712430 ATP5G1 -2.616 0.00189 -1.643 0.102 ILMN_1745368 TMEM50A 1.501 0.000934 1.215 0.102 ILMN_1667079 SPTBN2 -2.028 0.000155 -1.334 0.103 ILMN_1804328 WWP1 -1.542 0.000456 -1.213 0.104 ILMN_2080637 ZBTB44 -1.497 0.00164 -1.226 0.104 ILMN_1787526 MGC13057 -1.916 0.000397 -1.33 0.105 ILMN_2355042 CLUAP1 1.342 0.000919 1.15 0.105 ILMN_3225941 LOC728368 1.527 0.00132 1.231 0.105 ILMN_1755462 UGCGL1 1.562 7.20E -05 1.185 0.106 ILMN_1735495 TBC1D8 -1.717 0.00144 -1.305 0.108 ILMN_1779040 INO80B 1.651 0.000612 1.254 0.108 ILMN_1783681 MRPL34 -1.509 0.0017 -1.226 0.111 ILMN_2329927 ABCG1 2.271 0.00074 1.45 0.113 ILMN_2412927 GMPPB 1.569 0.000731 1.226 0.113 ILMN_1801504 RUNX1 3.376 0.000189 1.626 0.115 ILMN_1761981 FAM96A -1.763 5.11E-05 -1.225 0.117 ILMN_1777129 C16ORF56 1.393 0.000767 1.161 0.117 ILMN_1771957 MAN1B1 1.412 0.00161 1.182 0.117 ILMN_1903345 HS.575085 -1.375 0.000641 -1.151 0.119 ILMN_2049021 PTTG3P 1.933 0.000826 1.346 0.119 ILMN_2073289 MTSS1 1.81 7.75E-05 1.244 0.12 ILMN_1700047 ALAS1 -1.793 0.000204 -1.259 0.121 ILMN_1798712 USP4 -1.337 0.000795 -1.139 0.121 ILMN_1654690 CECR5 -1.564 0.000236 -1.195 0.122 ILMN_1721669 IDH3B -1.848 3.60E-05 -1.236 0.122 ILMN_1737604 FLJ10986 1.964 0.000962 1.358 0.122 ILMN_1719468 EPM2A -1.382 0.00163 -1.167 0.123 ILMN_2377669 CD247 1.851 0.000993 1.322 0.123 ILMN_1769520 UBE2L6 1.624 0.00021 1.208 0.127 ILMN_2261416 CD3D 1.652 0.00148 1.263 0.129 ILMN_1803856 DKFZP586I1420 1.669 0.000821 1.251 0.129 ILMN_1790549 TSPAN3 1.509 0.000475 1.187 0.129 ILMN_1802458 AGTRAP 1.856 0.000698 1.305 0.13 ILMN_1698934 CMTM7 2.819 0.000392 1.525 0.13 ILMN_1801553 LEO1 1.487 0.00131 1.199 0.131 ILMN_2412564 NCBP2 1.608 0.000218 1.201 0.131

221 ILMN_1716264 ANKRD1 2.477 0.000674 1.472 0.132 ILMN_3235065 ZNHIT6 1.321 0.00152 1.137 0.132 ILMN_1656951 APCDD1 -1.676 0.00116 -1.26 0.133 ILMN_1799488 ZNF383 -1.459 0.000758 -1.176 0.133 ILMN_2175114 KCNS3 -1.913 0.0014 -1.344 0.134 ILMN_1871457 HS.534680 2.17 0.000151 1.334 0.134 ILMN_1712678 RPS27L 1.722 0.000192 1.229 0.134 ILMN_1687519 SNAP23 1.533 0.000665 1.198 0.135 ILMN_1784447 PLCE1 2.779 0.000267 1.487 0.136 ILMN_2383383 PIR 1.547 0.00068 1.202 0.137 ILMN_1739847 EIF3D 1.423 0.000473 1.153 0.139 ILMN_1660900 SNORA7B 1.475 0.00153 1.191 0.144 ILMN_1740429 FTL 1.615 0.0014 1.234 0.149 ILMN_2324561 SLC7A6 2.577 0.000237 1.42 0.15 ILMN_2111739 MAN2C1 1.771 5.62E -05 1.207 0.151 ILMN_1737124 PRPF4B 1.391 0.0017 1.157 0.153 ILMN_2042771 PTTG1 2.058 0.000433 1.322 0.154 ILMN_1784186 C1ORF170 -2.591 0.000166 -1.398 0.159 ILMN_1675617 NT5M -1.461 0.00132 -1.175 0.16 ILMN_3194432 LOC100129913 1.667 0.000882 1.231 0.161 ILMN_2412822 SCN3B 1.907 0.000358 1.272 0.162 ILMN_1751431 WIBG 1.446 0.000431 1.15 0.162 ILMN_1721337 MRPS18B -1.443 0.000639 -1.154 0.164 ILMN_1755047 LRRC2 -2.388 0.00156 -1.45 0.165 ILMN_3200830 LOC649553 1.503 0.000437 1.166 0.165 ILMN_1795826 ATP6V0D1 1.402 0.00055 1.138 0.166 ILMN_1775111 SND1 1.397 0.000961 1.145 0.167 ILMN_1780188 B3GALNT2 -1.393 0.00123 -1.147 0.168 ILMN_1814917 TLE2 -1.812 0.000399 -1.246 0.168 ILMN_1652806 ATP5J -1.639 0.00109 -1.223 0.169 ILMN_3243961 ZNF252 -1.537 0.0017 -1.2 0.172 ILMN_1711729 LOC442454 -1.302 0.00152 -1.116 0.176 ILMN_1728355 PSMD4 1.448 0.000751 1.153 0.176 ILMN_1815115 CYC1 -1.364 0.000742 -1.127 0.177 ILMN_2136971 FABP3 -2.558 0.000109 -1.358 0.177 ILMN_3305304 POLD2 1.351 0.000731 1.122 0.178 ILMN_1788053 SLC25A12 -1.773 0.00019 -1.214 0.18 ILMN_1770479 LMO7 1.703 0.000255 1.203 0.18 ILMN_1787879 ARL2 1.362 0.000212 1.111 0.181 ILMN_1799024 VAC14 1.501 0.000558 1.163 0.181 ILMN_1653129 CSTF2 1.39 0.00169 1.146 0.182 ILMN_2070072 RPS7 2.046 0.000175 1.27 0.183

222 ILMN_1805161 LZTR1 1.364 0.00186 1.137 0.186 ILMN_1760741 NDUFA9 -1.573 0.00124 -1.194 0.191 ILMN_1699071 C21ORF7 2.565 0.000491 1.401 0.191 ILMN_2326509 CASP1 1.72 0.000251 1.199 0.195 ILMN_1734867 NR2C1 1.493 0.000661 1.156 0.198 ILMN_1800096 MPST -1.542 0.000263 -1.154 0.202 ILMN_2199926 OR7E37P -2.67 0.000317 -1.391 0.202 ILMN_1705224 TMEM110 -1.972 0.00118 -1.289 0.209 ILMN_2384591 HN1 2.223 9.67E-07 1.193 0.21 ILMN_1786308 NIPSNAP3B -2.243 0.000208 -1.292 0.211 ILMN_1777444 STX5 1.44 0.00129 1.147 0.211 ILMN_1672024 ISCA1L -1.597 0.00112 -1.188 0.212 ILMN_1758034 ETFDH -1.71 0.000477 -1.2 0.214 ILMN_2214098 BIVM -1.442 0.00165 -1.149 0.217 ILMN_3239284 B9D1 1.519 0.00176 1.174 0.217 ILMN_1705297 MYBPH 7.733 0.00129 2.096 0.228 ILMN_1807610 PRPH2 1.83 0.000658 1.226 0.23 ILMN_1815107 MATR3 1.443 0.000826 1.134 0.232 ILMN_3306482 LOC730107 -1.486 0.00191 -1.159 0.233 ILMN_2228732 CCNG2 2.393 0.00139 1.366 0.236 ILMN_1748836 FUZ -1.375 0.00123 -1.119 0.237 ILMN_2358733 TAZ -1.361 0.0016 -1.118 0.239 ILMN_1731418 SP110 1.491 0.000875 1.143 0.244 ILMN_1680501 GTF2IRD2B -2.077 0.000164 -1.234 0.246 ILMN_1786444 LPL -2.103 0.00182 -1.307 0.246 ILMN_1665132 CD36 -3.09 4.31E-05 -1.335 0.253 ILMN_1669709 TMEM108 -1.788 6.00E-04 -1.202 0.255 ILMN_1750722 RPS7 1.731 0.00187 1.215 0.255 ILMN_1671442 WDR43 1.358 0.00166 1.113 0.257 ILMN_1662917 LMO1 -2.117 0.00124 -1.288 0.259 ILMN_1696087 PHB2 -1.335 0.000933 -1.099 0.261 ILMN_2405521 MTHFD2 2.771 3.76E -05 1.289 0.261 ILMN_2306189 MAGED1 1.726 0.000325 1.175 0.262 ILMN_1656368 ALDH4A1 1.441 0.00183 1.135 0.263 ILMN_2091310 TMEM16A -1.425 0.000562 -1.115 0.264 ILMN_2201533 C17ORF61 1.437 0.000415 1.115 0.264 ILMN_2234873 NME2 -1.671 0.00109 -1.182 0.269 ILMN_1783333 C16ORF61 -1.607 0.00175 -1.174 0.273 ILMN_2358457 ATF4 -1.547 0.00147 -1.155 0.278 ILMN_2240866 MASP1 -4.5 3.58E -06 -1.364 0.278 ILMN_1814022 NR1H3 1.581 0.000407 1.141 0.281 ILMN_1655311 LOC145853 -2.029 5.16E-06 -1.159 0.283

223 ILMN_2185563 ANKRA2 1.664 0.000159 1.144 0.286 ILMN_1722239 TIMM8A -1.689 0.0015 -1.185 0.287 ILMN_1814726 SCARB2 1.416 0.000231 1.098 0.29 ILMN_2262288 EEF1G -1.431 0.0011 -1.117 0.295 ILMN_1736077 LIAS -1.553 0.00178 -1.153 0.296 ILMN_1672382 SLC38A3 -2.083 0.000109 -1.202 0.296 ILMN_2399893 RPS24 1.591 0.0014 1.157 0.298 ILMN_1687533 SEMA4D 1.843 0.000718 1.195 0.301 ILMN_2373632 IDH3B -1.926 0.000497 -1.201 0.305 ILMN_1808501 SH3KBP1 -2.289 0.00107 -1.283 0.305 ILMN_1814589 LOC728037 -1.818 0.000525 -1.181 0.309 ILMN_1806634 NNT -1.599 0.00104 -1.147 0.319 ILMN_1652797 FAM174B 3.098 0.000116 1.312 0.319 ILMN_1784365 MYOG 2.486 0.000449 1.277 0.32 ILMN_1733176 LIMS1 -1.457 0.000603 -1.108 0.324 ILMN_1691364 STAT1 1.753 0.000319 1.155 0.326 ILMN_1780698 ZFYVE19 -1.356 0.00122 -1.093 0.328 ILMN_1686664 MT2A 3.595 0.000754 1.418 0.335 ILMN_2361768 CHRNA1 7.065 7.50E -05 1.549 0.336 ILMN_1676980 MTSS1 1.608 0.000442 1.129 0.342 ILMN_2043918 DLEU1 -1.787 0.000382 -1.157 0.345 ILMN_1722855 VEGFB -1.825 2.18E-05 -1.127 0.35 ILMN_1766000 PM20D2 1.607 0.00158 1.144 0.354 ILMN_1740430 SLC2A4RG -1.366 0.00185 -1.093 0.358 ILMN_1680072 ZNF32 -1.442 0.000354 -1.092 0.36 ILMN_1810782 SH3KBP1 -2.279 0.000792 -1.238 0.362 ILMN_1723494 SIRT2 -1.425 0.00166 -1.102 0.369 ILMN_1775522 MAGED1 1.659 0.000851 1.138 0.371 ILMN_3239735 WASH5P 1.439 0.000469 1.092 0.371 ILMN_1798700 CHRNA1 6.066 0.000113 1.471 0.372 ILMN_2304512 SAA1 15.44 5.02E-05 1.733 0.373 ILMN_1893633 LOC439949 -1.617 0.00125 -1.136 0.374 ILMN_1804662 NRG4 -2.055 0.000237 -1.177 0.375 ILMN_1779751 C7ORF55 -1.537 0.00122 -1.118 0.381 ILMN_1743205 ABCA7 1.656 0.00105 1.135 0.391 ILMN_2052208 GADD45A 3.921 0.000458 1.372 0.391 ILMN_1671402 ARPP-21 2.802 9.00E-04 1.289 0.392 ILMN_1694111 PNKP 1.392 0.000647 1.082 0.395 ILMN_2376289 DBNL 1.72 0.000667 1.137 0.396 ILMN_1768101 HOXB6 -1.742 0.000327 -1.122 0.427 ILMN_1786189 MKI67IP 1.437 0.00159 1.089 0.44 ILMN_2400661 ZNF626 1.479 0.000394 1.083 0.441

224 ILMN_1686906 TP53INP2 -2.245 0.000789 -1.192 0.443 ILMN_1727740 SYNCRIP 1.451 0.0017 1.091 0.445 ILMN_1722206 MAF 1.576 0.00145 1.109 0.449 ILMN_1690754 SVIL 1.682 0.000353 1.109 0.452 ILMN_1758315 SLC9A9 1.753 0.000369 1.117 0.454 ILMN_3197097 TSTD1 2.163 0.00137 1.182 0.47 ILMN_2148668 RCBTB2 1.721 0.00052 1.111 0.476 ILMN_1672161 ARPP-21 1.983 1.03E-05 1.103 0.478 ILMN_1780298 FAM86A 1.864 0.00071 1.131 0.48 ILMN_1704446 SLC6A10P -1.599 0.000127 -1.082 0.486 ILMN_1652505 APEX2 1.663 0.00183 1.115 0.49 ILMN_1694075 GADD45A 2.897 0.000864 1.224 0.507 ILMN_1814315 PBXIP1 1.617 3.62E-05 1.07 0.521 ILMN_1798827 SRBD1 1.313 0.00189 -1.055 0.528 ILMN_3269324 FLJ37644 -2.205 0.000361 -1.139 0.531 ILMN_1701514 TRAF3IP2 1.78 0.000785 1.107 0.535 ILMN_1699695 TNFRSF21 1.825 0.00158 1.116 0.547 ILMN_1765409 STAM 1.341 0.00184 -1.056 0.549 ILMN_3234884 KIF22 -1.829 0.00108 -1.108 0.562 ILMN_1775703 TRAPPC6A 1.486 0.00125 1.069 0.568 ILMN_1897333 HS.18849 1.522 0.000264 1.062 0.575 ILMN_1726981 VEGFB -1.737 0.00017 -1.078 0.581 ILMN_2119224 KIFAP3 1.457 0.000561 1.059 0.581 ILMN_1682147 HOOK2 -1.803 0.000348 -1.089 0.582 ILMN_1751803 LSM10 -1.539 0.00174 -1.075 0.586 ILMN_1728047 AKR1A1 1.912 0.000245 1.093 0.588 ILMN_1676197 LRP11 1.454 0.00129 1.062 0.589 ILMN_1658110 C18ORF19 -1.506 0.000271 -1.057 0.596 ILMN_1662886 KLHL34 -2.945 0.000323 -1.157 0.605 ILMN_2044832 NOP56 1.368 0.00127 1.046 0.631 ILMN_1887174 KIAA0146 1.362 0.00183 1.044 0.652 ILMN_1737344 DDX41 1.337 0.00111 1.039 0.654 ILMN_1696601 VARS 1.581 0.00188 1.06 0.683 ILMN_1674706 MTHFD2 2.349 0.00045 1.096 0.688 ILMN_1671478 CKB -4.343 2.12E-05 -1.132 0.689 ILMN_1704380 INCA1 -1.469 0.00116 -1.045 0.697 ILMN_1690105 STAT1 1.705 0.000891 1.059 0.709 ILMN_1742358 CA14 -2.773 0.00189 -1.11 0.74 ILMN_1678191 GDF10 -1.658 6.50E-05 -1.038 0.75 ILMN_2289825 ARPP -21 1.866 1.41E -05 1.04 0.758 ILMN_1869087 HS.40289 1.502 0.000566 1.033 0.768 ILMN_1798181 IRF7 2.177 0.000341 1.054 0.798

ILMN_1673409 MGC16121 7.084 2.13E-07 1.074 0.82 ILMN_1674629 C9ORF3 -1.916 0.000138 -1.034 0.832 ILMN_1762093 ABCC12 -3.1 0.000168 -1.056 0.845 ILMN_1800008 ACAT1 -1.53 0.00144 -1.025 0.845 ILMN_2212763 ICAM3 -1.666 0.000394 -1.024 0.862 ILMN_1698673 EFCAB7 3.54 0.00031 1.056 0.867 ILMN_1730523 FAM195A -1.697 0.000468 -1.019 0.895 ILMN_1764596 MPST -1.558 3.00E-04 -1.005 0.964 ILMN_2380771 AKR1A1 1.684 0.00156 1 0.999

2. Supplementary Table 3.2. Enrichment analysis of 347 genes differentially expressed (downregulated in ICUAW post-ICU day 7 versus controls) at FDR < 0.05. Gene Ontology (GO) or Human Phenotype (HP) terms with FDR adjusted P-value (hypergeometric test) < 0.05.

GO or HP term | Description | FDR adjusted P-value (hypergeometric test)
GO:0044429 | mitochondrial part | 2.02E-14
GO:0005743 | mitochondrial inner membrane | 8.50E-11
GO:0045333 | cellular respiration | 9.35E-10
GO:0051186 | cofactor metabolic process | 0.00494
HP:0001941 | Acidosis | 0.0129
GO:0019752 | carboxylic acid metabolic process | 0.0134
GO:0035383 | thioester metabolic process | 0.0137
GO:0001954 | positive regulation of cell-matrix adhesion | 0.0141
GO:0045244 | succinate-CoA ligase complex (GDP-forming) | 0.0381
GO:0043183 | vascular endothelial growth factor receptor 1 binding | 0.0381
GO:0004776 | succinate-CoA ligase (GDP-forming) activity | 0.0381
GO:0006082 | organic acid metabolic process | 0.0492
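The adjusted P-values in Supplementary Table 3.2 come from a hypergeometric over-representation test followed by FDR correction. The following is a minimal Python sketch of that kind of calculation, not the pipeline used in the thesis. It assumes Benjamini-Hochberg as the FDR procedure (the table does not name the method), the function names are illustrative, and the counts are made-up toy values rather than thesis data.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from a
    background of N genes, K of which carry the annotation term."""
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.
    (Assumed FDR method; the source only says 'FDR adjusted'.)"""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy example (hypothetical numbers): background of 20 genes, 5 annotated
# to a term, a list of 4 genes of which 2 are annotated.
p_term = hypergeom_upper_tail(k=2, N=20, K=5, n=4)

# Adjust a small set of hypothetical raw p-values, one per tested term.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In a real analysis the background size `N` would be the set of genes eligible for testing, and one raw p-value would be computed per ontology term before adjustment.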

3. Supplementary Table 3.3. Gene and kME in different modules. Module eigengene connectivity (kME), a module membership measure, for all unique Illumina probes included in the WGCNA analysis data set (N = 11,482). Excel drop box link: https://www.dropbox.com/sh/hia5l5gc3gc7fqb/AADzWyJa0PwlvhFxYEF4nAE1a?dl=0
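In WGCNA, kME is commonly defined as the Pearson correlation between a probe's expression profile and a module eigengene. The sketch below illustrates that definition in Python under this assumption. The eigengene values are taken as given (in practice they are derived as the first principal component of the module's expression matrix), and all names and numbers are illustrative rather than thesis data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def kme(probe_expression, module_eigengene):
    """Module membership of one probe: its correlation with the eigengene.
    Values near +1 or -1 indicate strong (anti-)correlation with the module."""
    return pearson(probe_expression, module_eigengene)


# Hypothetical expression of one probe across five samples and a
# hypothetical eigengene for one module across the same samples.
probe = [7.1, 7.9, 8.4, 6.8, 7.5]
eigengene = [-0.30, 0.25, 0.60, -0.45, -0.10]
membership = kme(probe, eigengene)
```

Repeating this for every probe against every module eigengene would give a probe-by-module table of the kind the caption describes.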


4. Supplementary Table 3.4. Association of co-expression modules with disease status. Differences in module eigengene (ME) expression were tested using a linear mixed effects model to account for correlation between patient samples: ICUAW day 7 vs. control and ICUAW month 6 vs. control.

Module | Coefficient ICUAW Day 7 vs. control | FDR Adjusted P-Value | Coefficient ICUAW Month 6 vs. control | FDR Adjusted P-Value
1 | -2.604E-01 | 5.770E-05 | -9.307E-02 | 1.791E-01
2 | 3.067E-01 | 2.820E-08 | 9.377E-02 | 1.127E-01
3 | 3.983E-02 | 5.990E-01 | 1.757E-01 | 2.805E-02
4 | -2.638E-01 | 3.340E-05 | -9.430E-02 | 1.641E-01
5 | 5.260E-02 | 5.118E-01 | -4.590E-02 | 5.920E-01
6 | 1.906E-01 | 1.032E-02 | 1.159E-01 | 1.435E-01
7 | 1.836E-01 | 1.236E-02 | 5.736E-02 | 4.627E-01
8 | 1.397E-01 | 7.228E-02 | 5.585E-02 | 5.019E-01
9 | -2.380E-02 | 7.718E-01 | -2.675E-02 | 7.606E-01
10 | 1.249E-01 | 9.043E-02 | -5.129E-02 | 5.156E-01
11 | 1.736E-01 | 1.994E-02 | 1.755E-01 | 2.589E-02
12 | 2.446E-01 | 2.450E-04 | 9.116E-02 | 1.955E-01
13 | 2.092E-01 | 3.922E-03 | 1.325E-01 | 8.747E-02
14 | 1.923E-01 | 3.689E-03 | -2.564E-02 | 7.157E-01
15 | 8.912E-02 | 2.361E-01 | -6.525E-02 | 4.152E-01
16 | -2.312E-01 | 4.178E-04 | -2.067E-01 | 2.636E-03
17 | -2.067E-01 | 3.918E-03 | -7.180E-02 | 3.491E-01

5. Supplementary Table 3.5.
5a. Module 1 and enrichment of Gene Ontology and Human Phenotype Ontology with FDR < 0.05.
5b. Module 2 and enrichment of Gene Ontology and Human Phenotype Ontology with FDR < 0.05.
5c. Module 3 and enrichment of Gene Ontology and Human Phenotype Ontology with FDR < 0.05.
5d. Module 4 and enrichment of Gene Ontology and Human Phenotype Ontology with FDR < 0.05.
5e. Module 11 and enrichment of Gene Ontology and Human Phenotype Ontology with FDR < 0.05.
5f. Module 17 and enrichment of Gene Ontology and Human Phenotype Ontology with FDR < 0.05.
Excel drop box link: https://www.dropbox.com/sh/hia5l5gc3gc7fqb/AADzWyJa0PwlvhFxYEF4nAE1a?dl=0


6. Supplementary Table 3.6.
6a. Module 1 and transcription factor binding site overrepresentation analysis with Fisher score >= 7 and Z-score >= 10.
6b. Module 2 and transcription factor binding site overrepresentation analysis with Fisher score >= 7 and Z-score >= 10.
6c. Module 3 and transcription factor binding site overrepresentation analysis with Fisher score >= 7 and Z-score >= 10.
6d. Module 6 and transcription factor binding site overrepresentation analysis with Fisher score >= 7 and Z-score >= 10.
6e. Module 7 and transcription factor binding site overrepresentation analysis with Fisher score >= 7 and Z-score >= 10.
6f. Module 17 and transcription factor binding site overrepresentation analysis with Fisher score >= 7 and Z-score >= 10.
Excel drop box link: https://www.dropbox.com/sh/hia5l5gc3gc7fqb/AADzWyJa0PwlvhFxYEF4nAE1a?dl=0

7. Supplementary Table 3.7.
7a. Module conservation between modules constructed for the human sepsis with multi-organ dysfunction validation dataset compared with reference (human ICUAW).
7b. Module conservation between modules constructed for the porcine model of ICUAW compared with reference (human ICUAW).
7c. Association of module eigengene (ME) with disease status; human sepsis with multi-organ dysfunction vs. controls.
7d. Association of ME with disease status; porcine model of ICUAW day 5 sepsis vs. day 1 control.
Excel drop box link: https://www.dropbox.com/sh/hia5l5gc3gc7fqb/AADzWyJa0PwlvhFxYEF4nAE1a?dl=0

Chapter 4 Supplementary Data

Supplementary Table 4.1 Excel drop box link: https://www.dropbox.com/s/zwwnejsueprautd/SuppTable4.1a-f.xlsx?dl=0

Supplementary Table 4.2: List of differentially expressed miRs with a significant number of targets in the signature(s) for the subgroup(s) in which they are differentially expressed. For each miR, the table lists the subgroup in which the miR was differentially expressed and the direction of miR expression, followed by the direction of target gene expression, the number of miR targets in the signature, the hypergeometric p-value with Bonferroni adjustment, the observed/expected ratio and the minimum number of databases supporting miR target prediction for the analysis. Abbreviations: ICUAW = ICU acquired weakness; Exp = experimentally validated database (Tarbase); D7 = Day 7 post-ICU (compared with healthy controls); M6 = Month 6 post-ICU (vs. controls); Improver = ICUAW phenotype at month 6 post-ICU with increase in muscle mass cross sectional area > 10 cm² (compared with non-improvers having increase in muscle mass cross sectional area < 10 cm²).

miR | Subgroup | Direction of miR expression | Direction of target gene expression | Number of targets | Bonferroni adjusted p-value | Observed/Expected | Number of DBs predicting target interaction*
hsa-miR-1321 | D7 | up | down | 4 | 1.87E-01 | 1.78 | 4
hsa-miR-206 | D7 | down | up | 62 | 1.45E-01 | 1.15 | 2
hsa-miR-23a-3p | D7 | down | up | 124 | 3.05E-06 | 1.50 | Exp
hsa-miR-29a-3p | D7 | down | down | 72 | 8.19E-02 | 1.18 | 2
hsa-miR-29a-3p | D7 | down | down | 10 | 1.04E-01 | 1.58 | 4
hsa-miR-29a-3p | D7 | down | up | 88 | 4.82E-02 | 1.19 | Exp
hsa-miR-29b-3p | D7 | down | down | 71 | 9.88E-02 | 1.17 | 2
hsa-miR-29b-3p | D7 | down | down | 9 | 1.76E-01 | 1.44 | 4
hsa-miR-29b-3p | D7 | down | up | 71 | 1.42E-01 | 1.14 | Exp
hsa-miR-3133 | D7 | up | up | 33 | 8.74E-02 | 1.28 | 3
hsa-miR-3136-3p | D7 | up | up | 40 | 8.80E-02 | 1.25 | 2
hsa-miR-3175 | D7 | down | up | 3 | 9.14E-02 | 2.79 | Exp
hsa-miR-3175 | D7 | down | down | 3 | 1.44E-01 | 2.27 | Exp
hsa-miR-3622a-3p | D7 | up | down | 54 | 1.33E-01 | 1.17 | 2
hsa-miR-3622a-3p | D7 | up | down | 13 | 1.07E-02 | 2.06 | 3
hsa-miR-424-3p | D7 | down | up | 11 | 1.94E-01 | 1.35 | Exp
hsa-miR-424-5p | D7 | down | down | 51 | 4.66E-02 | 1.27 | 3
hsa-miR-424-5p | D7 | down | up | 143 | 1.70E-02 | 1.18 | Exp
hsa-miR-424-5p | D7 | down | down | 170 | 3.02E-02 | 1.15 | Exp
hsa-miR-4488 | D7 | down | up | 4 | 1.03E-04 | 13.93 | Exp
hsa-miR-4516 | D7 | down | up | 16 | 9.61E-02 | 1.43 | 3

hsa-miR-4707-5p | D7 | down | up | 7 | 5.47E-02 | 2.05 | 2
hsa-miR-4707-5p | D7 | down | up | 2 | 7.72E-02 | 4.29 | 3
hsa-miR-4764-3p | D7 | up | down | 5 | 1.31E-01 | 1.86 | 3
hsa-miR-4780 | D7 | down | up | 6 | 1.05E-01 | 1.86 | 3
hsa-miR-4795-5p | D7 | up | up | 7 | 1.01E-01 | 1.77 | 3
hsa-miR-502-3p | D7 | down | down | 12 | 1.95E-01 | 1.33 | 3
hsa-miR-502-3p | D7 | down | down | 7 | 2.05E-02 | 2.52 | 4
hsa-miR-502-3p | D7 | down | up | 26 | 5.28E-02 | 1.40 | Exp
hsa-miR-548as-3p | D7 | up | up | 97 | 7.56E-02 | 1.15 | 2
hsa-miR-551a | D7 | up | down | 2 | 4.55E-02 | 5.67 | Exp
hsa-miR-5704 | D7 | up | up | 24 | 2.99E-02 | 1.51 | 2
hsa-miR-574-3p | D7 | up | up | 1 | 1.67E-01 | 5.57 | 4
hsa-miR-600 | D7 | up | up | 20 | 1.17E-02 | 1.75 | 3
hsa-miR-600 | D7 | up | up | 8 | 1.46E-02 | 2.50 | 4
hsa-miR-638 | D7 | down | up | 5 | 8.37E-02 | 2.14 | Exp
hsa-miR-638 | D7 | down | down | 6 | 6.62E-02 | 2.09 | Exp
hsa-miR-663a | D7 | down | down | 4 | 1.69E-01 | 1.85 | 3
hsa-miR-424-3p | M6 | down | up | 4 | 8.79E-02 | 2.38 | 3
hsa-miR-5007-3p | M6 | up | up | 21 | 1.49E-01 | 1.28 | 2
hsa-miR-181a-5p | Improver | down | up | 57 | 1.35E-01 | 1.16 | 2
hsa-miR-181a-5p | Improver | down | down | 38 | 1.04E-01 | 1.24 | 2
hsa-miR-181a-5p | Improver | down | up | 185 | 1.49E-09 | 1.53 | Exp
hsa-miR-181a-5p | Improver | down | down | 132 | 7.93E-11 | 1.75 | Exp
hsa-miR-196b-5p | Improver | down | down | 30 | 3.21E-02 | 1.43 | 2
hsa-miR-196b-5p | Improver | down | up | 14 | 1.49E-01 | 1.37 | 3
hsa-miR-196b-5p | Improver | down | down | 12 | 2.84E-02 | 1.87 | 3
hsa-miR-196b-5p | Improver | down | up | 4 | 1.76E-01 | 1.82 | 4
hsa-miR-196b-5p | Improver | down | down | 3 | 1.58E-01 | 2.18 | 4
hsa-miR-196b-5p | Improver | down | up | 51 | 5.90E-02 | 1.25 | Exp
hsa-miR-196b-5p | Improver | down | down | 42 | 1.13E-03 | 1.65 | Exp
hsa-miR-205-3p | Improver | up | down | 90 | 7.99E-02 | 1.16 | 2
hsa-miR-205-3p | Improver | up | down | 30 | 9.69E-03 | 1.58 | 3
hsa-miR-3124-3p | Improver | up | down | 31 | 5.11E-02 | 1.36 | 2
hsa-miR-33b-5p | Improver | down | down | 42 | 9.21E-03 | 1.46 | 2
hsa-miR-33b-5p | Improver | down | down | 16 | 1.52E-02 | 1.83 | 3
hsa-miR-33b-5p | Improver | down | up | 7 | 5.98E-02 | 2.01 | 4
hsa-miR-33b-5p | Improver | down | down | 5 | 6.75E-02 | 2.29 | 4
hsa-miR-33b-5p | Improver | down | up | 38 | 5.23E-05 | 1.98 | Exp
hsa-miR-33b-5p | Improver | down | down | 18 | 5.80E-02 | 1.50 | Exp
hsa-miR-3611 | Improver | up | up | 11 | 1.59E-01 | 1.41 | Exp
hsa-miR-3611 | Improver | up | down | 10 | 2.52E-02 | 2.05 | Exp
hsa-miR-4279 | Improver | up | down | 54 | 1.36E-01 | 1.16 | 2
hsa-miR-4290 | Improver | up | up | 7 | 6.65E-02 | 1.96 | 3
hsa-miR-4299 | Improver | up | down | 25 | 1.98E-01 | 1.20 | 2
hsa-miR-4419b | Improver | up | down | 59 | 5.77E-02 | 1.23 | 2
hsa-miR-4421 | Improver | up | down | 3 | 9.23E-02 | 2.79 | Exp
hsa-miR-4473 | Improver | up | up | 12 | 2.53E-02 | 1.90 | Exp
hsa-miR-4478 | Improver | up | down | 58 | 1.02E-01 | 1.18 | 2
hsa-miR-4530 | Improver | up | down | 54 | 1.48E-01 | 1.16 | 2
hsa-miR-4536-3p | Improver | down | down | 8 | 4.00E-02 | 2.08 | 2

hsa-miR-4667-5p | Improver | up | down | 11 | 2.43E-02 | 1.98 | 3
hsa-miR-4682 | Improver | up | down | 13 | 2.20E-02 | 1.89 | 3
hsa-miR-4698 | Improver | up | down | 131 | 8.64E-05 | 1.37 | 2
hsa-miR-4698 | Improver | up | down | 46 | 2.14E-03 | 1.56 | 3
hsa-miR-4698 | Improver | up | up | 7 | 1.06E-01 | 1.75 | Exp
hsa-miR-4701-5p | Improver | down | down | 52 | 6.01E-02 | 1.25 | 2
hsa-miR-4701-5p | Improver | down | down | 6 | 1.58E-01 | 1.65 | 3
hsa-miR-4709-3p | Improver | up | down | 68 | 1.10E-02 | 1.32 | 2
hsa-miR-4712-3p | Improver | up | down | 36 | 6.60E-02 | 1.30 | 2
hsa-miR-4732-3p | Improver | up | up | 42 | 1.40E-01 | 1.19 | 2
hsa-miR-4732-3p | Improver | up | down | 28 | 1.19E-01 | 1.27 | 2
hsa-miR-4732-3p | Improver | up | up | 8 | 4.72E-02 | 2.00 | 3
hsa-miR-4739 | Improver | up | down | 13 | 1.91E-01 | 1.32 | 3
hsa-miR-4747-5p | Improver | up | down | 63 | 5.29E-02 | 1.23 | 2
hsa-miR-4747-5p | Improver | up | down | 16 | 4.38E-02 | 1.61 | 3
hsa-miR-4762-5p | Improver | up | up | 62 | 1.60E-01 | 1.14 | 2
hsa-miR-4762-5p | Improver | up | down | 44 | 5.07E-02 | 1.29 | 2
hsa-miR-4762-5p | Improver | up | up | 24 | 2.86E-02 | 1.52 | 3
hsa-miR-4800-3p | Improver | up | down | 3 | 1.64E-01 | 2.14 | 2
hsa-miR-483-3p | Improver | up | down | 9 | 5.53E-02 | 1.86 | 2
hsa-miR-485-3p | Improver | up | down | 37 | 1.14E-02 | 1.48 | 2
hsa-miR-485-3p | Improver | up | up | 11 | 1.35E-01 | 1.46 | Exp
hsa-miR-485-3p | Improver | up | down | 7 | 1.94E-01 | 1.49 | Exp
hsa-miR-490-3p | Improver | up | down | 32 | 2.15E-02 | 1.46 | 2
hsa-miR-490-3p | Improver | up | down | 10 | 1.91E-02 | 2.15 | 3
hsa-miR-490-3p | Improver | up | up | 4 | 7.89E-03 | 4.89 | 4
hsa-miR-490-3p | Improver | up | down | 2 | 9.16E-02 | 3.91 | 4
hsa-miR-490-3p | Improver | up | down | 14 | 4.82E-03 | 2.21 | Exp
hsa-miR-516b-5p | Improver | up | up | 66 | 1.41E-01 | 1.14 | 2
hsa-miR-516b-5p | Improver | up | down | 46 | 5.48E-02 | 1.28 | 2
hsa-miR-516b-5p | Improver | up | down | 12 | 1.32E-01 | 1.44 | 3
hsa-miR-516b-5p | Improver | up | down | 4 | 5.11E-02 | 2.86 | 4
hsa-miR-5187-3p | Improver | up | down | 47 | 1.32E-02 | 1.40 | 2
hsa-miR-589-5p | Improver | down | down | 8 | 1.39E-01 | 1.57 | 3
hsa-miR-589-5p | Improver | down | down | 3 | 2.88E-02 | 4.46 | 4
hsa-miR-597 | Improver | down | down | 8 | 1.60E-01 | 1.52 | 2
hsa-miR-597 | Improver | down | down | 4 | 1.55E-02 | 4.13 | 3
hsa-miR-642b-5p | Improver | up | down | 21 | 3.89E-02 | 1.52 | 2
hsa-miR-642b-5p | Improver | up | down | 6 | 8.50E-02 | 1.97 | 3
hsa-miR-660-3p | Improver | up | down | 44 | 1.06E-01 | 1.21 | 2
hsa-miR-660-3p | Improver | up | down | 19 | 5.10E-02 | 1.51 | Exp
hsa-miR-665 | Improver | up | down | 55 | 1.90E-01 | 1.13 | 2
hsa-miR-744-5p | Improver | up | down | 20 | 1.80E-01 | 1.25 | Exp
hsa-miR-1321 | D7 | up | down | 4 | 1.87E-01 | 1.78 | 4
hsa-miR-206 | D7 | down | up | 62 | 1.45E-01 | 1.15 | 2
hsa-miR-23a-3p | D7 | down | up | 124 | 3.05E-06 | 1.50 | Exp
hsa-miR-29a-3p | D7 | down | down | 72 | 8.19E-02 | 1.18 | 2
hsa-miR-29a-3p | D7 | down | down | 10 | 1.04E-01 | 1.58 | 4
hsa-miR-29a-3p | D7 | down | up | 88 | 4.82E-02 | 1.19 | Exp
hsa-miR-29b-3p | D7 | down | down | 71 | 9.88E-02 | 1.17 | 2

hsa-miR-29b-3p D7 down down 9 1.76E-01 1.44 4
hsa-miR-29b-3p D7 down up 71 1.42E-01 1.14 Exp
hsa-miR-3133 D7 up up 33 8.74E-02 1.28 3
hsa-miR-3136-3p D7 up up 40 8.80E-02 1.25 2
hsa-miR-3175 D7 down up 3 9.14E-02 2.79 Exp
hsa-miR-3175 D7 down down 3 1.44E-01 2.27 Exp
hsa-miR-3622a-3p D7 up down 54 1.33E-01 1.17 2
hsa-miR-3622a-3p D7 up down 13 1.07E-02 2.06 3
hsa-miR-424-3p D7 down up 11 1.94E-01 1.35 Exp
hsa-miR-424-5p D7 down down 51 4.66E-02 1.27 3
hsa-miR-424-5p D7 down up 143 1.70E-02 1.18 Exp
hsa-miR-424-5p D7 down down 170 3.02E-02 1.15 Exp
hsa-miR-4488 D7 down up 4 1.03E-04 13.93 Exp
hsa-miR-4516 D7 down up 16 9.61E-02 1.43 3
hsa-miR-4707-5p D7 down up 7 5.47E-02 2.05 2
hsa-miR-4707-5p D7 down up 2 7.72E-02 4.29 3
hsa-miR-4764-3p D7 up down 5 1.31E-01 1.86 3
hsa-miR-4780 D7 down up 6 1.05E-01 1.86 3
hsa-miR-4795-5p D7 up up 7 1.01E-01 1.77 3
hsa-miR-502-3p D7 down down 12 1.95E-01 1.33 3
hsa-miR-502-3p D7 down down 7 2.05E-02 2.52 4
hsa-miR-502-3p D7 down up 26 5.28E-02 1.40 Exp
hsa-miR-548as-3p D7 up up 97 7.56E-02 1.15 2
hsa-miR-551a D7 up down 2 4.55E-02 5.67 Exp
hsa-miR-5704 D7 up up 24 2.99E-02 1.51 2
hsa-miR-574-3p D7 up up 1 1.67E-01 5.57 4
hsa-miR-600 D7 up up 20 1.17E-02 1.75 3
hsa-miR-600 D7 up up 8 1.46E-02 2.50 4
hsa-miR-638 D7 down up 5 8.37E-02 2.14 Exp
hsa-miR-638 D7 down down 6 6.62E-02 2.09 Exp
hsa-miR-663a D7 down down 4 1.69E-01 1.85 3
hsa-miR-181a-5p Improver down up 57 1.35E-01 1.16 2
hsa-miR-181a-5p Improver down down 38 1.04E-01 1.24 2
hsa-miR-181a-5p Improver down up 185 1.49E-09 1.53 Exp
hsa-miR-181a-5p Improver down down 132 7.93E-11 1.75 Exp
hsa-miR-196b-5p Improver down down 30 3.21E-02 1.43 2
hsa-miR-196b-5p Improver down up 14 1.49E-01 1.37 3
hsa-miR-196b-5p Improver down down 12 2.84E-02 1.87 3
hsa-miR-196b-5p Improver down up 4 1.76E-01 1.82 4
hsa-miR-196b-5p Improver down down 3 1.58E-01 2.18 4
hsa-miR-196b-5p Improver down up 51 5.90E-02 1.25 Exp
hsa-miR-196b-5p Improver down down 42 1.13E-03 1.65 Exp
hsa-miR-205-3p Improver up down 90 7.99E-02 1.16 2
hsa-miR-205-3p Improver up down 30 9.69E-03 1.58 3
hsa-miR-3124-3p Improver up down 31 5.11E-02 1.36 2
hsa-miR-33b-5p Improver down down 42 9.21E-03 1.46 2
hsa-miR-33b-5p Improver down down 16 1.52E-02 1.83 3
hsa-miR-33b-5p Improver down up 7 5.98E-02 2.01 4
hsa-miR-33b-5p Improver down down 5 6.75E-02 2.29 4
hsa-miR-33b-5p Improver down up 38 5.23E-05 1.98 Exp

hsa-miR-33b-5p Improver down down 18 5.80E-02 1.50 Exp
hsa-miR-3611 Improver up up 11 1.59E-01 1.41 Exp
hsa-miR-3611 Improver up down 10 2.52E-02 2.05 Exp
hsa-miR-4279 Improver up down 54 1.36E-01 1.16 2
hsa-miR-4290 Improver up up 7 6.65E-02 1.96 3
hsa-miR-4299 Improver up down 25 1.98E-01 1.20 2
hsa-miR-4419b Improver up down 59 5.77E-02 1.23 2
hsa-miR-4421 Improver up down 3 9.23E-02 2.79 Exp
hsa-miR-4473 Improver up up 12 2.53E-02 1.90 Exp
hsa-miR-4478 Improver up down 58 1.02E-01 1.18 2
hsa-miR-4530 Improver up down 54 1.48E-01 1.16 2
hsa-miR-4536-3p Improver down down 8 4.00E-02 2.08 2
hsa-miR-4667-5p Improver up down 11 2.43E-02 1.98 3
hsa-miR-4682 Improver up down 13 2.20E-02 1.89 3
hsa-miR-4698 Improver up down 131 8.64E-05 1.37 2
hsa-miR-4698 Improver up down 46 2.14E-03 1.56 3
hsa-miR-4698 Improver up up 7 1.06E-01 1.75 Exp
hsa-miR-4701-5p Improver down down 52 6.01E-02 1.25 2
hsa-miR-4701-5p Improver down down 6 1.58E-01 1.65 3
hsa-miR-4709-3p Improver up down 68 1.10E-02 1.32 2
hsa-miR-4712-3p Improver up down 36 6.60E-02 1.30 2
hsa-miR-4732-3p Improver up up 42 1.40E-01 1.19 2
hsa-miR-4732-3p Improver up down 28 1.19E-01 1.27 2
hsa-miR-4732-3p Improver up up 8 4.72E-02 2.00 3
hsa-miR-4739 Improver up down 13 1.91E-01 1.32 3
hsa-miR-4747-5p Improver up down 63 5.29E-02 1.23 2
hsa-miR-4747-5p Improver up down 16 4.38E-02 1.61 3
hsa-miR-4762-5p Improver up up 62 1.60E-01 1.14 2
hsa-miR-4762-5p Improver up down 44 5.07E-02 1.29 2
hsa-miR-4762-5p Improver up up 24 2.86E-02 1.52 3
hsa-miR-4800-3p Improver up down 3 1.64E-01 2.14 2
hsa-miR-483-3p Improver up down 9 5.53E-02 1.86 2
hsa-miR-485-3p Improver up down 37 1.14E-02 1.48 2
hsa-miR-485-3p Improver up up 11 1.35E-01 1.46 Exp
hsa-miR-485-3p Improver up down 7 1.94E-01 1.49 Exp
hsa-miR-490-3p Improver up down 32 2.15E-02 1.46 2
hsa-miR-490-3p Improver up down 10 1.91E-02 2.15 3
hsa-miR-490-3p Improver up up 4 7.89E-03 4.89 4
hsa-miR-490-3p Improver up down 2 9.16E-02 3.91 4
hsa-miR-490-3p Improver up down 14 4.82E-03 2.21 Exp
hsa-miR-516b-5p Improver up up 66 1.41E-01 1.14 2
hsa-miR-516b-5p Improver up down 46 5.48E-02 1.28 2
hsa-miR-516b-5p Improver up down 12 1.32E-01 1.44 3
hsa-miR-516b-5p Improver up down 4 5.11E-02 2.86 4
hsa-miR-5187-3p Improver up down 47 1.32E-02 1.40 2
hsa-miR-589-5p Improver down down 8 1.39E-01 1.57 3
hsa-miR-589-5p Improver down down 3 2.88E-02 4.46 4
hsa-miR-597 Improver down down 8 1.60E-01 1.52 2
hsa-miR-597 Improver down down 4 1.55E-02 4.13 3
hsa-miR-642b-5p Improver up down 21 3.89E-02 1.52 2

hsa-miR-642b-5p Improver up down 6 8.50E-02 1.97 3
hsa-miR-660-3p Improver up down 44 1.06E-01 1.21 2
hsa-miR-660-3p Improver up down 19 5.10E-02 1.51 Exp
hsa-miR-665 Improver up down 55 1.90E-01 1.13 2
hsa-miR-744-5p Improver up down 20 1.80E-01 1.25 Exp
hsa-miR-424-3p M6 down up 4 8.79E-02 2.38 3
hsa-miR-5007-3p M6 up up 21 1.49E-01 1.28 2

Supplementary Table 4.3: Network reconstruction output. For each microRNA, the table reports the number of interactions (links or edges) that constitute its regulon, and the minimum, 25th percentile, 50th percentile, 75th percentile and maximum of the mutual information distribution inside the regulon.

Columns Min to Max summarize the mutual information distribution.

miR               Interactions  Min      25%      50%      75%      Max
hsa-mir-600       284           0.33061  0.36964  0.38333  0.4035   0.51085
hsa-mir-3622a-3p  332           0.32639  0.36473  0.37752  0.39375  0.50178
hsa-mir-502-3p    275           0.3306   0.36354  0.37702  0.39567  0.48841
hsa-mir-23a-3p    231           0.33441  0.36328  0.37488  0.39076  0.48098
hsa-mir-424-5p    332           0.32454  0.36635  0.38014  0.39953  0.49575
hsa-mir-4488      186           0.32229  0.36039  0.37037  0.38545  0.46097
hsa-mir-490-3p    97            0.33011  0.36097  0.38055  0.39911  0.43651
hsa-mir-4698      64            0.32145  0.36265  0.37836  0.39717  0.43368
hsa-mir-5704      181           0.32832  0.36636  0.38209  0.39661  0.44966
hsa-mir-551a      173           0.32915  0.36443  0.38101  0.40111  0.47926
hsa-mir-29a-3p    239           0.32592  0.36327  0.37882  0.39618  0.43845
hsa-mir-642b-5p   91            0.32904  0.36451  0.37348  0.39417  0.44258
hsa-mir-4732-3p   111           0.32991  0.36288  0.37773  0.39725  0.46168
hsa-mir-4473      62            0.33947  0.36602  0.38503  0.41563  0.44725
hsa-mir-3136-3p   261           0.33565  0.36908  0.38345  0.40493  0.58508
hsa-mir-29b-3p    259           0.32372  0.3628   0.37661  0.39732  0.46295
hsa-mir-3133      159           0.33381  0.36435  0.38244  0.39821  0.45898
hsa-mir-4516      195           0.32172  0.35857  0.37495  0.39128  0.44148
hsa-mir-638       215           0.33727  0.3652   0.38048  0.39478  0.5238
hsa-mir-3175      266           0.33333  0.36822  0.38336  0.39909  0.4742
hsa-mir-4762-5p   66            0.33705  0.36255  0.3806   0.39488  0.46934
hsa-mir-3124-3p   83            0.33298  0.3651   0.38209  0.40282  0.48894
hsa-mir-4290      91            0.33682  0.36298  0.38037  0.40092  0.4694
hsa-mir-660-3p    104           0.33164  0.36454  0.37861  0.39672  0.45829
hsa-mir-4421      73            0.331    0.36408  0.38018  0.39417  0.43765
hsa-mir-206       193           0.33029  0.36019  0.37657  0.39105  0.48625
hsa-mir-4764-3p   201           0.32786  0.3635   0.37674  0.39284  0.4561
hsa-mir-4795-5p   353           0.31655  0.36631  0.38121  0.39748  0.47209
hsa-mir-4780      334           0.32999  0.36964  0.38371  0.40305  0.48604
hsa-mir-4279      107           0.32004  0.36764  0.38557  0.4022   0.45663
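As an illustration of how the per-regulon summary in Supplementary Table 4.3 could be derived, the following minimal sketch computes the interaction count and the minimum, quartiles and maximum of a list of mutual information values. The function name, the example values and the percentile convention (linear interpolation, "inclusive" method) are assumptions made for illustration; the thesis does not state which convention or software produced the table.

```python
from statistics import quantiles


def summarize_regulon(mi_values):
    """Return (n_interactions, min, q25, q50, q75, max) for one regulon.

    mi_values: mutual information of each miR-target edge in the regulon.
    The "inclusive" quantile method is an assumption, not the thesis's
    documented choice.
    """
    if len(mi_values) < 2:
        raise ValueError("need at least two interactions to compute quartiles")
    q25, q50, q75 = quantiles(mi_values, n=4, method="inclusive")
    return len(mi_values), min(mi_values), q25, q50, q75, max(mi_values)


# Hypothetical mutual information values for the edges of one regulon.
example_mi = [0.33, 0.35, 0.36, 0.38, 0.40, 0.42, 0.51]
print(summarize_regulon(example_mi))
```

Applied once per miR, a function of this kind would yield one table row per regulon.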

Supplementary Table 4.4: Results of the master regulator analysis (MRA) for each miR regulon. The table lists, for each miR, the subgroup in which it is differentially expressed (DE), its direction of change, the direction of change of the DE genes in the MRA, and the MRA p-value (Fisher's exact test). Abbreviations and terms: D7, day 7 post-ICU discharge; Improver, ICUAW phenotype at month 6 post-ICU with an increase in muscle cross-sectional area >=10 cm2; up, upregulated; down, downregulated.

miRNA             Subgroup  miR regulation  Gene regulation  p-value
hsa-mir-206       D7        down            up               1.22E-36
hsa-mir-23a-3p    D7        down            up               1.00E-34
hsa-mir-29a-3p    D7        down            up               7.57E-23
hsa-mir-29a-3p    D7        down            down             3.16E-42
hsa-mir-29b-3p    D7        down            down             8.71E-40
hsa-mir-29b-3p    D7        down            up               2.46E-27
hsa-mir-3133      D7        up              up               2.17E-07
hsa-mir-3136-3p   D7        up              up               1.18E-19
hsa-mir-3175      D7        down            up               4.42E-37
hsa-mir-3175      D7        down            down             2.30E-49
hsa-mir-3622a-3p  D7        up              down             3.47E-61
hsa-mir-424-5p    D7        down            up               2.44E-74
hsa-mir-424-5p    D7        down            down             4.63E-50
hsa-mir-4488      D7        down            up               1.75E-14
hsa-mir-4516      D7        down            up               1.68E-24
hsa-mir-4764-3p   D7        up              down             3.40E-21
hsa-mir-4780      D7        down            up               9.94E-35
hsa-mir-4795-5p   D7        up              up               1.34E-31
hsa-mir-502-3p    D7        down            down             5.02E-21
hsa-mir-502-3p    D7        down            up               3.07E-37
hsa-mir-551a      D7        up              down             6.81E-20
hsa-mir-5704      D7        up              up               9.02E-14
hsa-mir-600       D7        up              up               1.33E-14
hsa-mir-638       D7        down            up               2.13E-16
hsa-mir-638       D7        down            down             5.71E-33
hsa-mir-3124-3p   Recover   up              down             3.78E-05
hsa-mir-4279      Recover   up              down             2.65E-06
hsa-mir-4290      Recover   up              up               3.35E-09
hsa-mir-4421      Recover   up              down             2.64E-05
hsa-mir-4473      Recover   up              up               7.64E-05
hsa-mir-4698      Recover   up              down             8.55E-05
hsa-mir-4732-3p   Recover   up              up               8.61E-10
hsa-mir-4762-5p   Recover   up              down             4.85E-08
hsa-mir-490-3p    Recover   up              down             1.04E-07
hsa-mir-490-3p    Recover   up              up               3.87E-13
hsa-mir-642b-5p   Recover   up              down             5.79E-09
hsa-mir-660-3p    Recover   up              down             6.02E-06
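The enrichment p-values in this table come from Fisher's exact test. A minimal sketch of the one-sided form of that test, which is the upper tail of the hypergeometric distribution, is given below for the overlap between a regulon and a set of DE genes. The function name, the example counts, the one-sided alternative and the definition of the gene universe are all assumptions for illustration; the thesis does not specify these details here.

```python
from math import comb


def enrichment_p_value(n_universe, n_regulon, n_de, n_overlap):
    """One-sided Fisher's exact test p-value, P(overlap >= n_overlap).

    n_universe: number of genes tested (assumed background).
    n_regulon:  number of genes in the miR regulon.
    n_de:       number of DE genes in the chosen direction.
    n_overlap:  number of regulon genes that are also DE.
    """
    max_overlap = min(n_regulon, n_de)
    total = comb(n_universe, n_de)
    # Sum hypergeometric probabilities for overlaps at least as large as observed.
    tail = sum(
        comb(n_regulon, k) * comb(n_universe - n_regulon, n_de - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes tested, 5 in the regulon, 6 DE, 4 shared.
print(enrichment_p_value(20, 5, 6, 4))
```

The same calculation applies to testing whether a co-expression module is enriched for regulon genes, with the module taking the place of the DE gene set.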

Supplementary Table 4.5: The table lists each miR (regulating a gene network termed the regulon) and the Intensive Care Unit Acquired Weakness (ICUAW)-relevant co-expression gene modules with significant enrichment for genes within the regulon (Fisher's exact test, hypergeometric p-value).

microRNA          ICUAW-relevant modules (hypergeometric p-value)
hsa-miR-1321      M1(7.53e-35), M2(2.31e-07), M3(6.68e-07), M4(1.11e-07)
hsa-miR-206       M2(8.3e-05), M11(2.82e-05)
hsa-miR-23a-3p    M1(2.62e-26), M2(6.08e-49), M3(1.04e-05)
hsa-miR-29a-3p    M1(5.98e-05), M2(3.08e-07), M3(5.53e-05), M4(7.95e-20), M7(1.2e-06)
hsa-miR-29b-3p    M1(0.000305), M2(2.17e-15), M3(1.8e-05), M4(2.61e-27), M7(3.28e-06)
hsa-miR-3124-3p   M2(0.000721)
hsa-miR-3133      M2(6.66e-06), M4(3e-08)
hsa-miR-3136-3p   M1(1.98e-22), M2(1.7e-12), M3(0.000547), M4(2.27e-09)
hsa-miR-3175      M1(3.94e-14), M2(8.96e-23), M3(7.18e-08), M4(1.34e-17)
hsa-miR-3611      M4(0.000885)
hsa-miR-3622a-3p  M1(1.82e-06), M2(1.71e-16), M3(2.74e-07), M4(4.84e-32)
hsa-miR-424-3p    M1(7.43e-34), M2(5.53e-40), M3(1.49e-10), M4(3.41e-07), M17(0.000118)
hsa-miR-424-5p    M1(1.33e-15), M2(6.53e-43), M3(1.23e-09), M4(1.77e-14), M7(0.000775)
hsa-miR-4488      M1(8.45e-06), M3(0.000495), M6(6.44e-07), M11(3.41e-08)
hsa-miR-4516      M1(4.05e-06), M2(4.5e-13), M3(6.37e-06), M4(1.26e-09)
hsa-miR-4530      M4(0.000657)
hsa-miR-4536-3p   M1(2.27e-17), M3(4.29e-06), M4(1.38e-22)
hsa-miR-4701-5p   M4(5.13e-06)
hsa-miR-4732-3p   M1(0.000415)
hsa-miR-4764-3p   M1(6.15e-05), M3(0.000504), M4(1.05e-11)
hsa-miR-4780      M1(2.49e-33), M2(4.79e-45), M3(1.86e-08), M4(1.18e-07)
hsa-miR-4795-5p   M1(6.23e-29), M2(1.61e-17), M3(2.1e-05), M4(5.32e-22)
hsa-miR-4800-3p   M4(2.22e-06)
hsa-miR-5007-3p   M4(1.16e-05)
hsa-miR-502-3p    M1(5.49e-15), M2(3.53e-37), M3(5.03e-08)
hsa-miR-548as-3p  M4(0.00082)
hsa-miR-551a      M2(1.35e-11), M4(2.26e-06)
hsa-miR-5704      M1(2.43e-05), M3(0.000231), M4(6.76e-10)
hsa-miR-574-3p    M1(2.03e-41), M2(4.27e-17), M3(4.93e-10), M4(2.45e-33)
hsa-miR-589-5p    M4(1.06e-23), M13(4.68e-05)
hsa-miR-597       M4(2.29e-07)
hsa-miR-600       M1(1.08e-10), M3(4.53e-07), M4(6.47e-53), M13(7.34e-05)
hsa-miR-638       M1(6.37e-20), M2(3.45e-24), M3(3.44e-05), M4(1.69e-08)
hsa-miR-660-3p    M4(0.000152)
hsa-miR-663a      M1(2.62e-49), M2(1.11e-28), M3(3.29e-06)
hsa-miR-744-5p    M6(9.26e-06)

Supplementary Table 4.6. Excel Dropbox link: https://www.dropbox.com/sh/z3l62g5wl6zju6u/AADdPPaGzT3qlE0nLUT0G81Ba?dl=0

Chapter 5 Supplementary Data

Supplementary Data 5.1: Study summaries

This section describes each dataset used in the analysis. To ensure an accurate description of the datasets, descriptions have been taken verbatim from the corresponding publications (or from the public repository if unpublished) whenever possible. Where relevant, subgroups that were split or removed from our meta-analysis are specifically described (and shown in italics). If a study made available clinical or pathologic data for the skeletal muscle samples profiled, this is specifically indicated (and shown underlined). For unpublished datasets we have listed the first contributor in the public repository.

GSE13205 Fredriksson et al 87 profiled muscle biopsies obtained from the lateral portion of the vastus lateralis muscle, 10–20 cm above the knee in seventeen patients with sepsis admitted to the general Intensive Care Unit (ICU) at Karolinska University Hospital. Patients younger than 18 years of age, patients with severe liver failure, undergoing dialysis, and patients with impaired coagulation were excluded from the study. Ten patients undergoing elective surgery were included as a control group.

GSE53702 Langhans et al 110 profiled biopsy specimens from vastus lateralis muscle in seven ICU patients. Three of the seven patients were diagnosed with critical illness myopathy (a subtype of ICU acquired weakness [ICUAW]) using muscle membrane inexcitability after direct muscle stimulation at day 6 ICU admission. Six patients undergoing elective orthopedic surgery without neuromuscular disorders were used as controls.

GSE15090 Arashiro et al 378 profiled muscle biopsies to compare the gene-expression profiles of individuals affected by facioscapulohumeral muscular dystrophy (FSHD), asymptomatic carriers, and normal controls. Biopsies were taken from related members (affected, asymptomatic carrier, and normal control) belonging to 5 unrelated families. Muscle biopsies were taken from the biceps in 3 families and from the deltoid in the remaining 2 families, because the clinically affected patients had severe atrophy. Asymptomatic carriers were removed from our meta-analysis.


GSE18715 Voets (unpublished) https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE18715 Skeletal muscle gene expression profiles of six patients with mutations in the catalytic DNA polymerase gamma (POLG1) gene (resulting in the accumulation of mtDNA mutations) were compared with twelve controls. Patients and control subjects from three age categories (<10 yrs; 11-49 yrs; >50 yrs) were selected.

GSE36398 Rahimov et al 379 aimed to better understand the pathophysiology of FSHD and develop mRNA-based biomarkers of affected muscles by profiling the biceps, which typically shows early and severe disease involvement, and the deltoid, which is relatively uninvolved, in patients with FSHD compared to control biopsies of biceps and deltoid, respectively. Microarray samples were generated in 5 batches. We have designated batches 1-4 as GSE36398a and batch 5 as GSE36398b. These batches were separated based on principal components analysis. We have excluded biopsies obtained from the deltoid as this muscle is relatively spared in FSHD.

GSE37084 Perfetti et al 380 profiled muscle biopsies from biceps brachii from ten patients with myotonic dystrophy type 2 (DM2), where clinical diagnosis of DM2 was based upon the criteria set by the International Consortium for Myotonic Dystrophies 446. Ten control biopsies were from subjects admitted with a suspected neuromuscular disorder of undetermined nature. Control biopsies did not show overt signs of muscle pathology on histological and immunohistochemical examination. All muscle biopsies were processed by the same pathology team and each was analyzed by two expert pathologists. The aim of the study was the identification of new aberrant alternative splicing events in DM2 patients.

GSE26852 Tasca et al 381 profiled muscle biopsies from FSHD and inflammatory myopathies (IM) as described below:

FSHD cohort: muscle biopsies from various peripheral muscles (biceps femoris, paravertebral, quadriceps, biceps femoris) in twelve FSHD patients and four dysferlinopathies (limb-girdle muscular dystrophy type 2B; LGMD2B) (age range 28–35). Unrelated, genetically confirmed (D4Z4 EcoRI fragment <40 kb) FSHD patients who had undergone lower limb muscle magnetic resonance imaging (MRI) were considered as candidates for the study. Inclusion criteria were i) having at least one muscle showing hyperintensity on T2-short tau inversion recovery (T2-STIR) sequences, or ii) having normal T1-weighted and T2-STIR sequence signal on quadriceps muscle. Gene expression was compared with seven normal controls (age range 18–58).

IM cohort: muscle biopsies from seven immunopathologically characterized inflammatory myopathies (IM): 2 dermatomyositis (DM), 2 polymyositis (PM), 1 necrotizing myopathy and 2 IM with nonspecific histopathological features (age range 23–73). Gene expression was compared with seven normal controls (age range 18–58).

GSE47968 Nakamori et al 382 profiled quadriceps muscle biopsies in eight patients with FSHD, eight patients with DM1, seven patients with DM2 and eight healthy controls. The objective of the study was to perform a global analysis of alternative splicing in DM1 and DM2. Nonambulant individuals and patients with congenital or childhood onset of DM1 were excluded to eliminate confounding effects of muscle disuse or maldevelopment.

GSE42806 Screen et al 383 profiled skeletal muscle from distal muscle sites (tibialis anterior, tibialis posterior, gastrocnemius lateralis, gastrocnemius medialis, soleus, extensor hallucis longus, extensor digitorum longus, thigh posterior) in seven patients with tibial muscular dystrophy (TMD) to analyze gene expression compared to five healthy controls. All patients were diagnosed based on DNA mutation testing. Ages ranged from 37 to 92 years.

GSE38417 Dorsey (unpublished) https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE38417 Gene expression data is from RNA extracted from muscle biopsy samples taken from patients with Duchenne muscular dystrophy (DMD) or pathologically normal controls.

GSE38680 Palermo et al 362 profiled muscle biopsies from two cohorts of infantile-onset Pompe disease (Glycogen Storage Disease Type II) to identify transcriptional differences that may contribute to the disease phenotype. In the first cohort, biceps biopsies from 9 infantile-onset Pompe patients and 10 controls were compared. In a separate experiment, quadriceps biopsies from 11 Pompe patients at either 0, 12, or 52 weeks after the initiation of treatment with Myozyme were compared to quadriceps biopsies from 7 controls. We have designated the latter experimental cohort (quadriceps biopsies) as GSE38680a and the biceps biopsies in the first cohort as GSE38680b.


GSE11681 Saenz et al 360 profiled skeletal muscle (quadriceps, deltoid, or biceps brachialis) in ten muscle biopsy samples from limb-girdle muscular dystrophy type 2A (LGMD2A) patients in whom the molecular diagnosis was ascertained. Gene expression profiling was compared to ten normal muscle samples.

GSE12648 Eisenberg et al 394 profiled skeletal muscle specimens (deltoid, biceps, quadriceps, tibialis anterior, gluteus, paraspinal, triceps) from ten hereditary inclusion body myopathy (HIBM) patients carrying the M712T Persian Jewish founder mutation and presenting mild histological changes, compared with ten healthy matched control individuals. Only mildly HIBM-affected muscle biopsies were selected when possible in order to detect changes as primary as possible.

GSE6011 Pescatori et al 393 profiled 23 quadriceps muscle biopsies from patients with Duchenne muscular dystrophy (DMD), diagnosed based on the absence of dystrophin immunoreactivity on quadriceps muscle sections. None of the participants was, or had been, under corticosteroid treatment at the time of biopsy. Control biopsies (n = 14) were from individuals who came to the hospital with a suspected metabolic disorder that was not confirmed by biochemical and histopathological studies. Control biopsies did not show signs of muscle pathology on histological and histochemical examination.

GSE48280 Suárez-Calvet et al 384 profiled skeletal muscle biopsies in five patients with dermatomyositis (DM), five with polymyositis (PM) and five with inclusion body myositis (IBM). The patients fulfilled established diagnostic criteria and did not receive any treatment prior to biopsy. All DM and PM subjects were female, aged 25–71 years, while the IBM subjects were male (n = 3) and female (n = 2), aged 67–77. Samples from patients with a neoplasm or poor RNA yield were excluded. Five control muscle biopsies were obtained from subjects undergoing hip replacement surgery. Routine histological stains were normal.

GSE1551 Greenberg et al 396 profiled skeletal muscle (biceps, deltoids, quadriceps) from 13 patients with dermatomyositis (DM). Seven were treated with corticosteroids for a median duration of 12 days (range, 1–350 days) and 7 had never received immunosuppressive therapy. Biopsies were performed for clinical indications independent of the study. Expression was compared to 10 normal subjects without clinical or histological evidence of a neuromuscular disorder.


E-MEXP-2681 Bernasconi (unpublished) https://www.ebi.ac.uk/arrayexpress/experiments/E-MEXP-2681/ Muscle biopsies were taken from 6 patients with dermatomyositis (DM), 4 with polymyositis (PM) and 5 non-myopathic subjects as controls.

GSE3307 Bakay et al 377 examined disease-specific transcriptional profiles of normal skeletal muscle and 12 muscle disease groups to determine if these expression fingerprints provide either pathophysiological or diagnostic information for these diseases. We have organized the 12 muscle disease groups into appropriate cohorts as follows:
i) ICUAW cohort: one disease group with critical care myopathy.
ii) Inflammatory myopathy cohort: one disease group with juvenile dermatomyositis.
iii) Congenital disease cohort: seven groups of congenital diseases (facioscapulohumeral muscular dystrophy [FSHD]; Emery-Dreifuss muscular dystrophy [both X-linked recessive emerin form and autosomal dominant lamin A/C form]; Becker muscular dystrophy; Duchenne muscular dystrophy; calpain 3 [LGMD2A]; dysferlin [LGMD2B]; FKRP [fukutin-related protein]).
iv) ALS cohort: one group of amyotrophic lateral sclerosis (ALS).
v) Upper motor neuron disease cohort: one group with spastic paraplegia (SPG4, spastin).
We separated i) – iv) into separate cohorts in our meta-analysis. Disease group v) was excluded as it contained only 4 samples and was not categorized among the other cohorts.

GSE45745 Barres et al 387 profiled vastus lateralis skeletal muscle samples obtained from 5 morbidly obese subjects immediately before and 6 months after Gastric Bypass (GB) surgery as well as from 6 lean healthy control subjects. Samples obtained 6 months after GB surgery were excluded.

GSE21496 Reich et al 388 profiled left vastus lateralis of healthy, sedentary men (N = 7) at baseline and immediately following 48 hours of unloading via unilateral lower limb suspension and 24 hours of reloading. Samples at baseline served as healthy controls; samples taken at 24 hours of reloading were excluded from our meta-analysis.

GSE5110 Urso et al 389 profiled biopsies taken from the vastus lateralis muscle of five men (20.4 +/- 0.5 yr) before and after 48-h immobilization.

GSE474 Park et al 397 profiled skeletal muscle samples obtained from the rectus abdominis during abdominal surgery for eight lean women (BMI 25 kg/m2), eight morbidly obese women (BMI 40 kg/m2), and eight obese patients (BMI 25-40 kg/m2). This study aimed to identify the mRNA of proteins involved in fat oxidation that may be reduced in obese and morbidly obese individuals. Obese patients were excluded from our meta-analysis.

GSE27536 Turan et al 390 profiled skeletal muscle biopsies from the vastus lateralis of 15 patients with stable chronic obstructive pulmonary disease (COPD) and 12 age-matched healthy sedentary subjects before and after 8 weeks of a supervised endurance exercise program. Nine COPD patients had normal fat-free mass index (FFMI, 21 kg/m2) and 6 COPD patients had low FFMI (16 kg/m2).

GSE1786 Radom-Aizik et al 399 profiled skeletal muscle biopsies from vastus lateralis from six COPD patients and five sedentary age-matched healthy men, before and after 3 months of exercise training.

E-MTAB-3671 Kreiner et al 391 profiled skeletal muscle biopsies from trapezius in nine glucocorticoid-naive patients with newly diagnosed, untreated polymyalgia rheumatica (PMR) and 10 matched (age, sex, and BMI) non-PMR control subjects before and after treatment with 14 days prednisolone (20mg/day) in a comprehensive clinical experimental research program. In all patients, the trapezius muscle exhibited the symptoms characteristic of PMR, i.e. aching, tenderness and stiffness. Controlled chronic comorbidities were accepted in both groups.

GSE78929 Walsh et al 304 profiled skeletal muscle biopsies from vastus lateralis in patients with ICUAW Day 7 (n=14) and Month 6 (n=10) post-ICU discharge and compared with 8 healthy control subjects obtained from previously banked specimens collected from consenting individuals. Clinical variables assessed in the cohort included the motor subscore of the Functional Independence Measure (FIM), global muscle strength measured by MRC sum score (MRCSS) and quadriceps cross sectional area (CSA), expressed as a percentage of published age and sex matched norms.

GSE13608 Bachinski et al 392 profiled skeletal muscle biopsies from DM1, DM2, Becker muscular dystrophy (BMD), Duchenne muscular dystrophy (DMD), tibial muscular dystrophy (TMD), myotonia congenita, autosomal dominant (MC-AD), DM-like, and normal individuals (both adult and fetal). The 3 fetal healthy control samples were removed from the meta-analysis.

GSE109178 Dadgar et al 339 profiled skeletal muscle biopsies from 6 normal controls, 17 DMD (absence of dystrophin), 11 BMD (present but abnormal dystrophin), 7 LGMD2I (FKRP deficiency, a glycosylation defect), and 8 LGMD2B (DYSF). Patients had a broad range of ages, clinical severity of their disease, and histopathological findings, although all neuromuscular disease patients showed evidence of a dystrophic process (degeneration/regeneration of muscle fibers). The study sought to determine the mechanisms underlying failure of muscle regeneration that is observed in dystrophic muscle through hypothesis generation using muscle profiling data. The amount of fibrotic replacement (fibrosis) was visually approximated by the same evaluator (E.P. Hoffman), and divided into normal, mild, moderate, or severe fibrosis categories.

GSE10760 Osborne et al 395 profiled skeletal muscle in vastus lateralis from 19 patients with FSHD compared to thirty healthy individuals, profiled before and after antibody enhancement. The objective of the study was to identify pathways that are abnormally regulated early in the FSHD disease process.

GSE3112 Greenberg et al 386 profiled muscle biopsy specimens from 23 patients with inclusion body myositis (IBM), six with polymyositis (PM), and 11 controls without neuromuscular disease.

GSE39454 Zhu et al 385 profiled various muscle groups (biceps, quadriceps, deltoid) from patients with inflammatory myopathies (5 necrotizing myopathy [NM], 8 DM, 8 PM and 10 IBM) compared to 5 normal controls. Normal controls were not suspected clinically to have neuromuscular disease; had normal muscle strength by examination; and showed normal serum CK levels. The objective of the study was to develop gene signatures to characterize myositis patients at the molecular level.

GSE14901 Abadi et al 340 profiled skeletal muscle biopsies from vastus lateralis from recreationally active, non-smoking, healthy men (N = 12) and women (N = 12) before immobilization and after 48 hours and 14 days of immobilization. Subjects had a randomly assigned leg immobilized using a knee brace and were provided with walking crutches such that there was no weight bearing on the immobilized leg. The purpose of this study was to examine changes in global gene transcription during immobilization-induced muscle atrophy in humans. Muscle strength testing was conducted at each session using a dynamometer and magnetic resonance imaging (MRI) was used to determine the cross-sectional area (CSA) of the vastus muscles.

GSE45462 Chen et al 398 profiled medial gastrocnemius in 24 subjects (13 men, 11 women; mean age 26.7 ± 8.3 years) with an injury to the lower leg (closed malleolus fracture) treated conservatively with 6 weeks of cast immobilization. Following immobilization, each subject completed a 6-week structured rehabilitation program focusing on progressive resistance training of the ankle plantar flexor muscles. Four longitudinal muscle biopsies were taken at the following time points: before rehabilitation (pre-rehab; post-immobilization), after 3 weeks of rehabilitation (early transcriptional changes) and immediately after 6 weeks of rehabilitation (chronic transcriptional changes). An additional muscle biopsy was taken at 4 months post-immobilization from the uninvolved (contralateral) medial gastrocnemius, which served as a control sample. For our meta-analysis we included the pre-rehab sample and the uninvolved gastrocnemius control sample, and excluded the 3-week mid-rehab and 6-week post-rehab samples.

GSE34111 Gallagher et al 338 profiled quadriceps biopsies in twelve patients with upper gastrointestinal cancer pre-resection (weight loss 7%) and at a median 8-month post-resection follow-up (range, 5–12 months). Post-resection patients were disease-free and weight-stable for the previous 2 months. Six healthy controls recruited from the community underwent a single quadriceps biopsy. Maximum voluntary isometric quadriceps strength was measured using an established method. Data were normalized to body mass (N/kg).

E-MEXP-3260 Pradat et al 400 profiled skeletal muscle in the middle portion of the deltoid muscle of 9 patients with probable or definite ALS. Normal control deltoid muscle samples were taken from 10 subjects without any significant neurological history who underwent shoulder orthopedic surgery. All patients had sporadic ALS and presented with symptoms of limb onset. They underwent a complete needle electromyography (EMG) investigation, as performed in the routine work-up of patients with ALS. EMG measures were obtained from the anterior portion of the deltoid muscle. The study was designed to identify gene expression changes in skeletal muscle that could reliably define the degree of disease severity. Patients were classified prior to biopsy based on EMG measures and manual muscle testing of the shoulder abductors (scored from 0 to 5, with 0 representing total paralysis and 5 normal strength, according to the Medical Research Council score).

GSE31243 Smith et al 401 profiled skeletal muscle biopsies from both the gracilis and semitendinosus obtained from 10 patients with cerebral palsy undergoing medial hamstring lengthening surgery. The control group was 10 patients undergoing ACL reconstruction with hamstring autograft. This study was designed to gain further understanding of the skeletal muscle response to cerebral palsy using microarrays and correlating the transcriptional data with functional measures.

Supplementary Figure 5.1A-E – Figures available via Dropbox link: https://www.dropbox.com/sh/mw2mazq2zu7qba9/AABTFo3-08JOhyJesH6avz4Oa?dl=0 Disease-specific gene set enrichment analysis after removal of genes significant in the other 4 disease categories for A) Congenital Muscle Diseases (CMD) B) Inflammatory Myopathies (IM) C) Disuse and immobility (DI) D) Intensive Care Unit Acquired Weakness (ICUAW) E) Chronic Diseases Affecting Muscle (CSM). EnrichmentMap network for overlapping enriched Gene Ontology gene sets identified by GSEA. Each node represents a significantly enriched gene set (FDR q-value < 0.05); gene sets containing a larger number of genes are shown as proportionally larger nodes. Gene sets upregulated in the muscle disease category compared to control are shown in red (top) and downregulated gene sets are shown in blue (bottom).

[Figure: TGF-beta score (y-axis) by fibrosis category (None, Mild, Moderate, Severe). P-values annotated on the original plot: 6.773e-04, 4.482e-02, 9.350e-01, 3.281e-02, 1.998e-02, 5.944e-05.]

Supplementary Figure 5.2 TGF-β signature z-scores for a cohort of congenital muscle diseases, classified by degree of fibrosis (none, mild, moderate, severe), GSE109178.
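A signature z-score of the kind plotted in Supplementary Figure 5.2 can be illustrated with a minimal sketch. The scoring rule used here (z-score each signature gene across samples, then average the z-scores within each sample) is an assumption for illustration and is not confirmed as the procedure used in this thesis; the function name `signature_z_scores` and the toy gene names are hypothetical.

```python
from statistics import mean, stdev


def signature_z_scores(expression, signature_genes):
    """Return one score per sample: the mean z-score of the signature genes.

    expression: dict mapping gene symbol -> list of expression values,
                one value per sample, in the same sample order for every gene.
    signature_genes: iterable of gene symbols making up the signature.
    """
    per_gene_z = []
    for gene in signature_genes:
        values = expression.get(gene)
        if values is None:
            continue  # gene not present in this data set
        sd = stdev(values)  # sample standard deviation across samples
        if sd == 0:
            continue  # a constant gene cannot be standardized
        mu = mean(values)
        per_gene_z.append([(v - mu) / sd for v in values])
    if not per_gene_z:
        raise ValueError("no usable signature genes found")
    # Average across genes, sample by sample.
    return [mean(column) for column in zip(*per_gene_z)]


# Toy data (placeholders, not thesis data): three samples, four requested genes.
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [2.0, 4.0, 6.0],
    "GENE_C": [5.0, 5.0, 5.0],  # zero variance, skipped
}
scores = signature_z_scores(
    toy_expression, ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
)
```

Per-sample scores computed this way could then be grouped by fibrosis grade (none, mild, moderate, severe) for comparison, as in the figure.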


Supplementary Figure 5.3. Histogram of gene ranks for the up-regulated 98-gene CMDM signature (red), the up-regulated 55-gene TGF-beta signature (green), and a random signature of 98 genes (blue).


SUPPLEMENTARY TABLES FOR CHAPTER 5

Supplementary Table 5.1. Gene expression data set sample accession IDs and publicly available clinical data. Excel Dropbox link: https://www.dropbox.com/sh/mw2mazq2zu7qba9/AABTFo3-08JOhyJesH6avz4Oa?dl=0

Supplementary Table 5.2. Gene expression meta-analysis gene list. Excel Dropbox link: https://www.dropbox.com/sh/mw2mazq2zu7qba9/AABTFo3-08JOhyJesH6avz4Oa?dl=0

Supplementary Table 5.3. Common Muscle Disease Module (CMDM): pre-validated differentially expressed genes. Excel Dropbox link: https://www.dropbox.com/sh/mw2mazq2zu7qba9/AABTFo3-08JOhyJesH6avz4Oa?dl=0

Supplementary Table 5.4. Common Muscle Disease Module (CMDM): validated differentially expressed genes. Excel Dropbox link: https://www.dropbox.com/sh/mw2mazq2zu7qba9/AABTFo3-08JOhyJesH6avz4Oa?dl=0

Supplementary Table 5.5. Gene set enrichment analysis of the meta-analysis output for Gene Ontology (GO) terms. Excel Dropbox link: https://www.dropbox.com/sh/mw2mazq2zu7qba9/AABTFo3-08JOhyJesH6avz4Oa?dl=0

Supplementary Table 5.6. Cell type specificity (Enrichr). Excel Dropbox link: https://www.dropbox.com/sh/mw2mazq2zu7qba9/AABTFo3-08JOhyJesH6avz4Oa?dl=0

Supplementary Table 5.7 A. Subcellular localization of genes in the Common Muscle Disease Module (CMDM)

Gene      Location
HEXB      Vesicular exosome
ALDOA     Sarcomere
FHL3      Focal adhesion
IQGAP1    Focal adhesion
GBP2      Actin cytoskeleton
LIMA1     Focal adhesion
ACTC1     Focal adhesion
ATP1B3    Caveolae
FXYD1     Sarcomere
TGFBR2    Caveolae
MAPK1     Focal adhesion
ANXA2     Cell junction
CHRNA1    Cell junction
CYFIP1    Focal adhesion
SDCBP     Focal adhesion
ATP2B2    Cell junction
LGALS3    Cell junction
SAMD4A    Cell junction
ABI1      Cell junction
ATP6AP2   Vesicular exosome
LAMP1     Vesicular exosome
APMAP     Vesicular exosome
IL1R1     Extracellular
MSN       Focal adhesion
SLC39A6   Motile parts
PHB2      Mitochondrion
NOTCH2    Golgi
ITGB5     Focal adhesion
ANXA4     Vesicular exosome
PDK2      Mitochondrion
SRGN      Golgi
PTPN3     Membrane
PREPL     Golgi
YWHAB     Focal adhesion
GADD45A   Nucleus
PNMA1     Nucleolus
PYGM      Vesicular exosome
PTEN      Mitochondrion
RIN2      Cytoplasm
APLP2     Vesicular exosome
EXOC1     Membrane
DYNLT3    Microtubule cytoskeleton
CLIC1     Mitochondrion
DDX50     Nucleolus
EIF1      Nucleus
UCKL1     Nucleus
CAPRIN2   Microtubule cytoskeleton
RPL3L     Membrane
HPRT1     Vesicular exosome
YWHAZ     Focal adhesion
FTL       Vesicular exosome
TRIM38    Cytoplasm
SAT1      Cytoplasm
TERF2IP   Nucleus
CCNG2     Cytoplasm
CHMP5     Vesicular exosome
PITPNB    Vesicular exosome
EIF4E2    Nucleus
PSME1     Vesicular exosome
TMEM87A   Golgi
OSBPL8    Endoplasmic reticulum
CETN2     Microtubule cytoskeleton
S100A13   Vesicular exosome
TGOLN2    Vesicle
GPNMB     Membrane
CTSB      Mitochondrion
CANX      Mitochondrion
CDKN1A    Nucleolus
FAM13A    Cytoplasm
MFN2      Microtubule cytoskeleton
EHBP1     Endosome
EIF3D     Nucleus
LMCD1     Extracellular
SSB       Nucleus
SAE1      Nucleus
CAMK2G    Sarcoplasmic reticulum
GAMT      Vesicular exosome
FBXO7     Mitochondrion
DSE       Golgi
YWHAQ     Focal adhesion
ENDOG     Mitochondrion
STAT6     Nucleus
EPM2A     Endoplasmic reticulum
BZW1      Membrane
HTRA1     Vesicular exosome
CAP1      Focal adhesion
CKAP4     Vesicular exosome
TUBB2A    Microtubule cytoskeleton
HEY1      Nucleus
CNDP2     Vesicular exosome
ENO3      Vesicular exosome
SCPEP1    Vesicular exosome
TPI1      Vesicular exosome
TACC1     Microtubule cytoskeleton
MAPK12    Mitochondrion
AGFG1     Vesicle
TECR      Endoplasmic reticulum
TMEM43    Golgi
TIMP1     Vesicular exosome
COL6A3    Vesicular exosome
C3        Vesicular exosome
MMD       Lysosome
FBLN5     Vesicular exosome
SERPING1  Vesicular exosome
ANGPTL2   Vesicular exosome
CFH       Vesicular exosome
EFEMP1    Vesicular exosome
C1S       Vesicular exosome
MGP       Vesicular exosome
C1R       Vesicular exosome
PON2      Mitochondrion
PRCP      Vesicular exosome
NUP93     Nucleus
MFSD1     Membrane
IFITM1    Membrane
GDE1      Vesicle
IFITM2    Membrane
NDUFB11   Mitochondrion
HIGD2A    Mitochondrion
TM9SF3    Membrane
ATP5D     Mitochondrion
COX6A2    Mitochondrion
CS        Mitochondrion
FAM208A   Nucleus
THYN1     Nucleus
DCUN1D2   Proteasome

Supplementary Table 5.7 B. Subcellular localization. Enrichment of subcellular localizations in the Common Muscle Disease Module (CMDM)

Location                  Fisher's exact test p-value
Actin cytoskeleton        1
Caveolae                  0.712702327
Cell junction             1
Cytoplasm                 1
Endoplasmic reticulum     1
Endosome                  1
Extracellular             1
Focal adhesion            0.166154911
Golgi                     1
Lysosome                  1
Membrane                  1
Microtubule cytoskeleton  1
Mitochondrion             1
Motile parts              1
Nucleolus                 1
Nucleus                   0.018689449
Proteasome                1
Sarcomere                 1
Sarcoplasmic reticulum    1
Vesicle                   1
Vesicular exosome         0.005405427
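As an illustration of how a per-location enrichment p-value of this kind can be obtained, the sketch below computes a one-sided Fisher's exact test (over-representation) from a 2x2 contingency of module versus background genes. It is a minimal sketch under stated assumptions: the one-sided alternative, the function name `fisher_enrichment_p`, and all counts in the example are hypothetical and are not taken from the thesis, which does not state the background set or whether any multiple-testing adjustment was applied.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test p-value for over-representation.

    k: module genes annotated to the location
    n: total genes in the module
    K: background genes annotated to the location
    N: total genes in the background (module genes included)

    Returns P(X >= k) where X follows the hypergeometric distribution.
    """
    if not (0 <= k <= n <= N and 0 <= K <= N):
        raise ValueError("counts are inconsistent")
    total = comb(N, n)  # number of ways to draw the module from the background
    upper = min(n, K)   # largest possible overlap
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts for illustration only (not thesis data):
# 8 of 20 module genes carry the annotation, against 50 of 1000 in the background.
p_example = fisher_enrichment_p(k=8, n=20, K=50, N=1000)
```

With this one-sided form, a location with no more module genes than expected by chance gives a p-value near 1, and a location with zero observed module genes gives exactly 1.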

Supplementary Table 5.8 A-E. Disease-specific gene set enrichment analysis after removal of genes significant in the other 4 diseases for A) Congenital Muscle Diseases (CMD) B) Inflammatory Myopathies (IM) C) Disuse and immobility (DI) D) Intensive Care Unit Acquired Weakness (ICUAW) E) Chronic Diseases Affecting Muscle (CSM). Excel Dropbox link: https://www.dropbox.com/sh/mw2mazq2zu7qba9/AABTFo3-08JOhyJesH6avz4Oa?dl=0
