CELLULAR PROCESSES REGULATING CYTOSKELETAL

SIGNALING, INSULIN RESISTANCE AND CALCIUM SIGNALING

ARE COMMONLY MISREGULATED IN TYPE 1 AND TYPE 2

MYOTONIC DYSTROPHY PATIENTS

by

NRIPESH PRASAD

A DISSERTATION

Submitted in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

in

The Biotechnology Science and Engineering Program

to

The School of Graduate Studies of The University of Alabama in Huntsville

HUNTSVILLE, ALABAMA 2014

ACKNOWLEDGEMENT

The journey we take is supported by many others. To some, an acknowledgement might be just a trifling thing written on a piece of paper. Nevertheless, in its true essence, it gives us an opportunity to remember and express our feelings to those whom we love and revere and with whom we share our secrets. Here I have the chance to express my thanks to the people who helped and supported me in completing this journey.

It is my sublime duty to express my deepest sense of gratitude and veneration to my advisor, Dr. Shawn E. Levy, Director, Genomics Service Lab, HudsonAlpha Institute for Biotechnology, for his sincere and indelible inspiration, constant encouragement, constructive criticism, meticulous guidance, sustained interest, immense patience, and supportive nature throughout the investigation and the preparation of this manuscript.

I express my deepest sense of reverence and indebtedness to the esteemed members of my Advisory Committee, Drs. Joseph Ng, Debra Moriarity, Devin Absher, Greg Cooper and Luis Cruz-Vera, for their valuable suggestions and encouragement at various stages of my research and thesis writing. Sincere regards are due to the Dean, Graduate School, University of Alabama in Huntsville, Huntsville, AL, USA, for providing all the necessary help and support. I also owe my gratitude to Dr. Rick Myers, Director, HudsonAlpha Institute for Biotechnology, for all his help from time to time. I would also like to thank Dr. Andrew Link, Associate Professor of Pathology, Microbiology and Immunology, Vanderbilt University, TN; Dr. Bjarne Udd, Department of Neurology, Vasa Central Hospital, Finland; and Dr. Ralf Krahe, Professor, Department of Genetics, University of Texas MD Anderson Cancer Center, Houston, TX, for helping me by providing the facilities necessary to carry out the research work.

I would also like to take this moment to remember the late Dr. Gopi Podila; he was my academic advisor during my first year of graduate school and helped me begin graduate school successfully. I will always remember him for his supportive and caring nature towards me and my family.

I express a profound sense of love to my beloved wife Meenakshi and my endearing and adorable son Aayush, who have always been a source of inspiration and encouragement for me. Their great sacrifice and constant support instilled in me a sense of responsibility and confidence that helped me get through graduate school. My love and immeasurable affection also go to my respected mother Sangeeta Sharma, my father Kamesh Prasad, my brother Aditya, his lovely wife Jennifer and their cute little daughter Julia.

Words are too limited to express my thanks for the unconditional support, love, care and cooperation extended by present and past members of the Genomics Service Lab at HudsonAlpha Institute for Biotechnology, Huntsville, Alabama. I would like to thank Braden, Liz, Angela, Melanie, Joanna, John, Niki, Terri, Jack and Dan, who were a constant source of encouragement for me during my research and thesis work.

The help and guidance provided by senior scientists, post-docs and fellow graduate students is deeply acknowledged; without them it would have been difficult to pull through tough challenges. The pleasant company of Parimal Samir, Brittany Lasseigne, Joy Agee, Kenny Day, Kevin Bowling, Arnab Sen Gupta, Avinash Sreedasayam and Geethika Trivedi has been a source of positive inspiration and a fun experience.

I am also grateful to all faculty and staff members of the Department of Biological Sciences, University of Alabama in Huntsville, for all the help they provided during my days as a graduate student. The University Teaching Assistantship received during the first year and a half of my degree program is duly acknowledged. I would also like to express my deepest gratitude for the Graduate Research Assistantship I received from HudsonAlpha Institute for Biotechnology.

Last but not least, I would like to express my deepest and sincere thanks for the generosity of the patients with type 1 and type 2 myotonic dystrophy from Finland who donated their muscle biopsies in the hope of helping others in the future.

This list is obviously partial, but allow me to submit that the omissions are inadvertent, and I once again record my deeply felt gratitude to all those who helped me in this endeavor. Finally, I thank the Almighty God for giving me courage and the company of so many wonderful persons, without whom I could not have succeeded in my pursuit.

Huntsville (Nripesh Prasad)

March, 2014 Author


TABLE OF CONTENTS

List of Figures
List of Tables

Chapter 1
INTRODUCTION
1.1: Project objectives

Chapter 2
REVIEW OF THE LITERATURE
2.1: Muscular system
2.2: Skeletal muscle system
2.3: Muscle contraction
2.4: Simple tandem repetitive (STR) sequence and their molecular consequences
2.5: Fundamental aspects of Myotonic Dystrophy
2.6: Prevalence of myotonic dystrophies
2.7: Type 1 Myotonic Dystrophy (DM1)
2.8: Type 2 Myotonic Dystrophy (DM2)
2.9: Molecular pathomechanism of DM1 and DM2 disease: RNA-gain-of-function model
2.10: Role of MBNL protein in muscle development
2.11: Altered transcriptional and microRNA misregulation in DM patients
2.12: Microarray vs. Next generation sequencing technology

Chapter 3
MATERIALS AND METHODS
3.1: Procurement of human muscle biopsy samples
3.2: Procurement of muscle biopsy samples from mouse models
3.3: RNA extraction and its quality assessment
3.4: RNA-seq library preparation and sequencing
3.5: Small RNA (miRNA) library preparation and sequencing
3.6: Processing of RNA-seq Reads
3.7: Processing of small RNA-seq Reads
3.8: iTRAQ quantitation of the proteome
3.9: Functional enrichment analysis of differentially expressed mRNAs and miRNAs
3.10: Validation studies for mRNA expression by Real-time quantitative PCR (RT-qPCR)

Chapter 4
RESULTS
4.1: Unbiased, novel integrated comparative analysis strategy adapted for studying transcriptional, miRNA and proteomics expression layout in skeletal muscle biopsies from DM1 and DM2 human patients
4.2: Transcriptomics layout of the DM and control patients as identified by RNA-seq
4.3: Analysis of mRNA expression reveals unique as well as overlapping differentially expressed genes (DEGs) in both DM1 and DM2 patient groups
4.4: Analysis of overlapping differentially expressed genes across DM1 and DM2 patients
4.4.1: Overlapping down regulated set from DM1 and DM2 patients identifies unique genes affecting decrease in muscular contraction and muscle mass
4.4.2: Common up regulated gene set in DM1 and DM2 patients identifies unique genes affecting increased microtubule dynamics
4.4.3: Identification of enriched canonical pathways in common DEGs in DM1 and DM2 patients
4.4.4: Transcriptional factors predicted to be affected by the common DEGs in DM1 and DM2 patients
4.5: Analysis of uniquely differentially expressed genes in DM1 and DM2 patients reveals considerable insight into the unique attributes in these two patient groups
4.6: Expression layout of micro RNAs (miRNAs) in DM1 and DM2 patients as identified by small RNA-seq
4.7: Analysis of miRNA expression reveals unique as well as overlapping miRNAs in both DM1 and DM2 patient groups
4.8: Analysis of the differentially expressed miRNAs identifies unique biomarkers for muscular dystrophy and insulin resistance in DM1 and DM2 patients
4.9: Analysis of overlapping differentially expressed miRNAs across DM1 and DM2 patients along with integration of RNA-seq data
4.9.1: Targeting of Actin cytoskeletal signaling pathway
4.9.2: Targeting of Calcium signaling pathway
4.9.3: Targeting of Cardiac hypertrophy signaling pathway
4.9.4: Targeting Axonal guidance signaling pathway
4.10: Integration of RNA-seq data with common down regulated miRNAs in DM1 and DM2 patients reveals up regulation of various synaptic pathways
4.11: Analysis of uniquely up regulated miRNAs reveals unique enrichment of transforming growth factor-beta (TGF-β) and mitogen-activated protein kinase (MAPK) signaling pathways to be specifically enriched in DM2 patients
4.12: Transcriptomics layout of the transgenic mouse models as identified by RNA-seq
4.13: Comparison of human DM2 and DM2-KI transgenic mouse RNA-seq data reveals actin cytoskeletal signaling controlled by Rho GTPases, as the centerpiece for the pathophysiology in type 2 myotonic dystrophy disease in human patients
4.14: Proteomics layout of the DM1 and DM2 skeletal muscle biopsy samples as identified by iTRAQ

Chapter 5
DISCUSSION

Chapter 6
CONCLUSIONS

Chapter 7
FUTURE DIRECTIONS
7.1: Larger study with more samples and data integration
7.2: Development of diagnostic panels
7.3: Cell lines and drug testing

Appendix A
Appendix B
Abstract Title 2: Macrophages recruited to pancreatic islets in response to vascular endothelial growth factor-A promotes β cell proliferation
Abstract Title 3: Differential expression of Phox2b marks distinct progenitor cell populations that differ in developmental potential and gene expression in the fetal mouse enteric nervous system
Abstract Title 4: Systems Biology Assessment of Human Immune Responses after Seasonal Trivalent Inactivated Influenza Vaccine
Abstract Title 5: Systems Analysis of Inactivated Influenza Vaccine Responses in Distinct Immune Cell Types
Abstract Title 6: Defining the Lactocrine-Sensitive Neonatal Porcine Uterine Transcriptome and Refining Tools for Evaluation of Developmentally Critical Endometrial Gene Expression Events In Situ
Abstract Title 7: Splicing anomalies as potential prognostic markers for severity of Myotonic Dystrophy Type 2
Abstract Title 8: Acquired resistance to Aurora A inhibitor, Alisertib, in melanoma is associated with inhibition of tumor immune surveillance
Abstract Title 9: Transcriptome profiling of Rat Embryonic Stem Cell (ESCs) and comparing it to human and mouse ESCs to identify differences in gene expression and differential splicing
Abstract Title 10: RNA-Seq: A pioneering tool to uncover the intricate transcriptomics
Abstract Title 11: Differential Expression of Phox2b marks distinct enteric progenitor cell populations and facilitates analysis of regulatory pathways in ENS ontogeny

Appendix C
Abstracts of peer-reviewed research articles
Article 1: The Pan-ErbB Negative Regulator Lrig1 Is an Intestinal Stem Cell Marker that Functions as a Tumor Suppressor
Abstract
Article 2: Islet Microenvironment, Modulated by Vascular Endothelial Growth Factor-A Signaling, Promotes β Cell Regeneration
Abstract
Article 3: Vascular endothelial growth factor coordinates islet innervation via vascular scaffolding
Abstract
Article 4: Characterization of the Merkel Cell Carcinoma miRNome

Appendix D
REFERENCES

LIST OF FIGURES

Figure 1.1: Schematic representation of a. (CTG)n repeat expansion in 3’ untranslated region (UTR) of dystrophia myotonica-protein kinase (DMPK) gene located on 19q13.3 in type 1 myotonic dystrophy (DM1) patients b. (CCTG)n repeat expansion in intron 1 of cellular nucleic acid-binding protein (CNBP) gene located on 3q21.3 in DM2 patients.
Figure 1.2: Co-localization of CCUGexp RNA and muscle blind-like protein 1 (MBNL1) protein foci as determined by immunofluorescence and fluorescence in situ hybridization (IFL-FISH) method in combination with high resolution confocal microscopy from skeletal muscle biopsies of DM2 patients (Lukáš et al. 2012) reprinted with permission.
Figure 2.1: Detailed cross-section of human skeletal muscle. (The figure was produced using Servier Medical Art at http://www.servier.com.)
Figure 2.2: Simple tandem repeat expansions associated with various neuromuscular diseases in humans. Postulated mechanisms of pathogenesis mediated by protein loss-of-function (white), RNA and protein gain of function (yellow) (Wojciechowska and Krzyzosiak 2011) reprinted with permission.
Figure 2.3: Hairpin loop like DNA structures formed by expandable repeats in various genetically inherited diseases: spinocerebellar ataxia type 1 (Sca1); myotonic dystrophy (MyD). Flanking sequence is indicated in brackets. Calculated free energy (ΔG) in kilocalories per mole of each hairpin loop is indicated below repeat expansions (Gacy et al. 1995) reprinted with permission.
Figure 2.4: Amino acid sequence of cellular nucleotide binding protein (CNBP) in the seven repeats of 18 amino acids each. (Kothekar 1990) reprinted with permission.
Figure 2.5: Proposed RNA-gain-of-function molecular mechanism in type 1 (DM1) and type 2 (DM2) myotonic dystrophies in human patients (Udd and Krahe 2012) reprinted with permission.

Figure 3.1: a. Schematic representation of the generation of DM2 knock-in (DM2-KI) mouse model by inserting 189 (CCTG) repeats in mouse Cellular nucleic acid-binding protein (CNBP) gene. b. Muscle histopathology of control (+/+) and DM2-KI (ki/ki) mouse with fluorescence in situ hybridization (FISH); arrows pointing towards intranuclear inclusion bodies of muscleblind like (MBNL) proteins.
Figure 3.2: Unique in-house data analysis pipeline followed for processing of RNA-seq, miRNA-seq and protein expression from DM1 and DM2 samples.
Figure 4.1: Spearman correlation of mRNA expression from all DM samples compared with control a. averaged expression layout b. non-averaged expression layout.
Figure 4.2: Principal component analysis (PCA) plot of mRNA expression from all DM samples compared with control a. averaged expression layout b. non-averaged expression layout.
Figure 4.3: The distribution of overall gene expression levels in DM patients a. Averaged expression values of 20,637 protein coding genes from each of three subsets: DM1 (n=7), DM2 (n=5) and control (n=3) of human muscle biopsy samples were plotted as integrated log2 values. The distribution is based on the number of genes falling in each log2 gene expression category. b. Volcano plot of averaged mRNA expression values from DM1 patients compared against control samples c. Volcano plot of averaged mRNA expression values from DM2 patients compared against control samples.
Figure 4.4: a. -wide genomic hotspot of differentially expressed genes (DEGs) in DM1 and DM2 patients. x-axis has the number of DEGs plotted against on y-axis b. Relative expression of DMPK, CNBP, MBNL1, MBNL2, MBNL3, CUBP1 and ETR3 genes as Log2 expression (x-axis) implicated in the molecular pathology of DM1 and DM2 disease as compared to control patients. c. Venn diagram of overlapping differentially expressed genes between DM2 (brown) and DM1 (blue) human patient skeletal muscle biopsy samples. Enriched canonical pathways identified in overlapping as well as unique DEGs are mentioned.
Figure 4.5: Ranked enriched canonical pathways in both a. DM1 patients b. DM2 patients using Ingenuity Pathway Analysis (IPA) software. Red color indicates down regulated genes and green color indicates up regulated genes affecting a pathway. y-axis has the percentage of genes found to be differentially regulated in the predicted pathways (x-axis). Number on top of each horizontal bar represents the total number of genes known to be affecting that particular pathway (x-axis).
Figure 4.6: Prediction of affected protein classes (y-axis) from the differentially expressed genes (x-axis) in DM1 and DM2 patients as identified by Panther gene list classification system. Panther protein class Id is mentioned in the parenthesis.
Figure 4.7: Predicted inhibition of skeletal muscle contractility, mass of muscles and increased fatigue of muscles from the 13 common down regulated genes in DM patients. Log2 fold change expression values are mentioned below each respective gene.
Figure 4.8: Predicted activation of microtubule dynamics identified from the overlapping up regulated gene sets between DM1 and DM2 patients.
Figure 4.9: Increased activity of nuclear factor κB (NF-κB) transcription factor in skeletal muscles of DM patients resulting in downstream up regulation of genes as shown below. The predicted up regulation of some of the genes are hallmarks for development of rheumatoid arthritis in DM patients.
Figure 4.10: Increased activity of hypoxia-inducible factor-1 A (HIF1A) transcription factor in skeletal muscles of DM patients resulting in downstream up regulation of genes regulating stress response.
Figure 4.11: Increased activity of catenin (cadherin-associated protein), beta 1 (CTNNB1) transcription factor in skeletal muscles of DM patients resulting in downstream up regulation of genes regulating microtubule dynamics.

Figure 4.12: Collective down regulation of pleiomorphic adenoma gene 1 (PLAG1), SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 4 (SMARCA4), peroxisome proliferator-activated receptor gamma, coactivator 1 alpha (PPARGC1A) and peroxisome proliferator-activated receptor gamma (PPARG) transcriptional factors (TFs) from the common down regulated DEGs and their affected biological functions in DM patients.
Figure 4.13: Predicted inhibition of v-myc myelocytomatosis viral related oncogene, neuroblastoma derived (MYCN) transcriptional factors (TFs) from the overlapping DEGs in DM patients.
Figure 4.14: Spearman correlation of micro RNA expression from all DM samples compared with control a. averaged expression layout b. non-averaged expression layout.
Figure 4.15: Principal component analysis (PCA) plot of micro RNA expression from all DM samples compared with control a. averaged expression layout b. non-averaged expression layout.
Figure 4.16: Venn diagram of overlapping differentially expressed microRNAs (miRNAs) between DM2 (red) and DM1 (blue) human patient muscle biopsy samples. miRNAs implicated in various relevant diseases and biological functions as identified by literature search are mentioned in boxes.
Figure 4.17: Uniquely differentially expressed miRNA in both type 1 and type 2 myotonic dystrophy patients may serve as biological markers for development of muscular dystrophy as identified by using IPA analysis.
Figure 4.18: Uniquely differentially expressed 30 miRNA in both type 1 and type 2 myotonic dystrophy patients may serve as biological markers of development of insulin resistance in DM patients as identified by using IPA analysis.
Figure 4.19: Gene targeting of 6 mRNAs regulating actin cytoskeleton signaling pathway by 18 uniquely up regulated miRNAs in both type 1 and type 2 myotonic dystrophy patients. Green fill represents up regulated event, red fill represents down regulated event.
Figure 4.20: Gene targeting of 13 mRNAs regulating calcium signaling pathway by 42 uniquely up regulated miRNAs in both type 1 and type 2 myotonic dystrophy patients. Green fill represents up regulated event, red fill represents down regulated event.
Figure 4.21: Gene targeting of 6 mRNAs regulating cardiac hypertrophy signaling pathway by 28 uniquely up regulated miRNAs in both type 1 and type 2 myotonic dystrophy patients. Green fill represents up regulated event, red fill represents down regulated event.
Figure 4.22: Gene targeting of 8 mRNAs regulating Axonal guidance signaling pathway by 29 uniquely up regulated miRNAs in both type 1 and type 2 myotonic dystrophy patients. Green fill represents up regulated event, red fill represents down regulated event.
Figure 4.23: Up regulation of a. Glutamate receptor signaling pathway b. GABA receptor signaling pathway in both type 1 and type 2 myotonic dystrophy patients. Green fill represents up regulated event, red fill represents down regulated event.
Figure 4.24: Up-regulation of genes regulating mitogen-activated protein kinases (MAPK) and transforming growth factor-beta (TGF-β) signaling pathways identified in response to 21 uniquely up regulated miRNAs in type 2 myotonic dystrophy patients.
Figure 4.25: Expression layout of mRNA from skeletal muscle biopsy of mouse models for DM a. unsupervised Spearman correlation b. Principal component analysis (PCA) plot.
Figure 4.26: Volcano plot of differentially expressed genes in three transgenic mouse models as compared to control mouse mRNA expression. Up regulated genes (green), down regulated genes (red) and no change (black). a. DM2-KI vs. control b. HSA-CTG vs. control c. HSA-CCTG vs. control.
Figure 4.27: Venn diagram of overlapping differentially expressed genes between DM2-KI (red), HSA-CTG (blue) and HSA-CCTG (green) mouse muscle biopsy samples. KEGG enriched canonical pathways identified from common DEGs are mentioned in boxes.
Figure 4.28: Enriched canonical Signaling by Rho family GTPase pathway regulating microtubule dynamics in skeletal muscle biopsy of a. Human DM2 patients b. DM2-KI transgenic mouse c. Human DM1 patients.
Figure 4.29: Enriched canonical actin cytoskeleton signaling pathway in skeletal muscle biopsy of a. Human DM2 patients b. DM2-KI transgenic mouse c. Human DM1 patients.
Figure 4.30: Enriched canonical pathway regulation of actin based motility signaling in skeletal muscle biopsy of a. Human DM2 patients b. DM2-KI transgenic mouse c. Human DM1 patients.
A.1: Ovation® RNA-Seq System V2 library preparation protocol.
A.10: Up regulation of production of ketone bodies in DM2-KI transgenic mice skeletal muscles as predicted by KEGG analysis and validated by RNA-seq expression data.

LIST OF TABLES

Table 2.1: Characteristic features of skeletal, cardiac, and smooth muscles.
Table 2.2: Repeat Expansion Disorders and Toxic RNAs (Dickson and Wilusz 2010) reprinted with permission.
Table 2.3: Clinical Manifestations of Myotonic dystrophies (Udd and Krahe 2012) reprinted with permission.
Table 3.1: Parameters for muscle biopsy sample collection for the proposed study.
Table 4.1: Detailed sample description for myotonic dystrophy patients and control individuals skeletal muscle biopsy samples used in the current investigation.
Table 4.2: Alignment statistics overview of mRNA sequencing of human muscle biopsy samples.
Table 4.3: Alignment statistics overview of microRNA sequencing of all human skeletal muscle biopsy samples.
Table 4.4: Differentially expressed genes (DEGs) statistics across DM1 and DM2 patients as compared to control individuals as estimated by TMM normalization of RNA-seq data. Differential expression was calculated on the basis of their fold change (cut-off ≥ ±2.0); p-value was estimated by implementing z-score calculations using Benjamini Hochberg FDR corrections of 0.05.
Table 4.5: Top 10 common up & down regulated genes in both DM2 and DM1 patients.
Table 4.6: Top 10 enriched cluster results from common down regulated genes in both DM1 and DM2 patients as identified by DAVID functional annotation clustering analysis.
Table 4.7: Top 10 enriched cluster results from common up regulated genes in both DM1 and DM2 patients as identified by DAVID functional annotation clustering analysis.

Table 4.8: Top biological functions predicted to be affected in both DM1 and DM2 patients from overlapping down regulated genes as identified by IPA analysis.
Table 4.9: Enriched canonical pathways in common down regulated genes in both DM1 and DM2 patients as identified by Kyoto Encyclopedia of Genes and Genomes (KEGG), Panther classification, Reactome and Ingenuity Pathway Analysis (IPA, Ingenuity Systems) software.
Table 4.10: Enriched canonical pathways in common up regulated genes in both DM1 and DM2 patients as identified by Kyoto Encyclopedia of Genes and Genomes (KEGG), Panther classification, Reactome and Ingenuity Pathway Analysis (IPA, Ingenuity Systems) software.
Table 4.11: Transcriptional factors predicted to be activated or inhibited from the overlapping up and down regulated genes from DM1 and DM2 patients as identified by IPA analysis.
Table 4.12: Top 10 enriched canonical pathways from uniquely expressed genes in DM1 and DM2 patients as identified by Ingenuity pathway analysis.
Table 4.13: Top 10 ranked enriched biological networks from unique DEGs in DM1 and DM2 patients as compared to control individuals as identified by IPA.
Table 4.14: Differentially expressed micro RNAs in type 1 and type 2 myotonic dystrophy patients on the basis of their fold change (cut-off ≥ ±1.5), p-value of differentially expressed miRNA's was estimated by implementing z-score calculations using Benjamini Hochberg FDR corrections of 0.05.
Table 4.15: Top Biological effects in type 1 and type 2 myotonic dystrophy patients predicted to be targeted by differentially expressed miRNAs as identified by IPA analysis.
Table 4.16: Expression values of 16 uniquely identified miRNAs from DM1 and DM2 patients that have been implicated in various forms of muscular dystrophy.
Table 4.17: Expression values of 30 uniquely identified miRNAs in both type 1 and type 2 myotonic dystrophy patients that have been implicated in development of insulin resistance in various previous studies.
Table 4.18: Top 10 lists of differentially up-regulated miRNAs in both type 1 and type 2 myotonic dystrophy patients.
Table 4.19: List of top 20 KEGG pathways predicted to be targeted by 9 common down regulated miRNAs in DM patients according to DIANA miRPath v.2.1.
Table 4.30: Clinical and histopathological manifestations of the three transgenic mouse models used in this study.
Table 4.31: Alignment statistics overview of mRNA sequencing of transgenic mouse muscle biopsy samples.
Table 4.32: Top 15 enriched canonical pathways in skeletal muscle biopsies of three transgenic mouse models as compared with control mouse skeletal muscle using IPA analysis.
A.2: Top 10 uniquely differentially expressed genes in DM1 patients as compared to control individuals as determined by RNA-seq.
A.3: Top 10 uniquely differentially expressed genes in DM2 patients as compared to control individuals as determined by RNA-seq.
A.4: List of top 20 KEGG pathways regulating muscular physiology and functions predicted to be targeted by 16 unique miRNAs implicated in various forms of muscular dystrophy in DM patients according to DIANA miRPath v.2.1.
A.5: List of top 20 KEGG pathways predicted to be targeted by 30 unique miRNAs implicated in development of insulin resistance in DM patients according to DIANA miRPath v.2.1.
A.6: List of 18 miRNAs targeting 6 mRNAs that regulate Actin cytoskeleton signaling pathway in both DM1 and DM2 patients along with their expression values.
A.7: List of 42 miRNAs targeting 13 mRNAs that regulate Calcium signaling pathway in both DM1 and DM2 patients along with their expression values.
A.8: List of 29 miRNAs targeting 8 mRNAs that regulate Axonal guidance signaling pathway in both DM1 and DM2 patients along with their expression values.
A.9: List of 9 common down regulated miRNAs in DM1 and DM2 patients along with their expression values.

Chapter 1

INTRODUCTION

Myotonic dystrophy (DM) is a multisystemic neurological disorder with autosomal dominant inheritance involving skeletal, smooth and cardiac muscles (Cho and Tapscott 2007; Udd and Krahe 2012). The most prominent clinical features include myotonia, muscle weakness, stiffness and wasting. Defects in cardiac conductivity, iridescent cataracts and various endocrine abnormalities, most prominently insulin resistance and testicular failure, are also commonly observed in human patients (Savkur et al. 2004; Rusconi et al. 2010). Two types of DM have been identified: type 1 myotonic dystrophy (DM1) and type 2 myotonic dystrophy (DM2), both of which are caused by microsatellite repeat expansions in transcribed, non-coding regions of their respective genes (Vihola et al. 2013).

DM1 is caused by a trinucleotide repeat expansion, [CTG]n, in the 3' untranslated region (UTR) of the dystrophia myotonica-protein kinase (DMPK) gene (Figure 1.1a) located on chromosome 19q13.3 (Brook et al. 1992; Fu et al. 1992; Mahadevan et al. 1992). DM2, on the other hand, is caused by a tetranucleotide repeat expansion, [CCTG]n, within the first intron of the cellular nucleic acid-binding protein (CNBP) gene, also known as zinc finger protein 9 (ZNF9) (Figure 1.1b), mapping to the 3q21 chromosomal region (Liquori, Ricker, and Moseley 2001; Massa et al. 2010; Rusconi et al. 2010). Both DMPK and ZNF9 are expressed in a wide range of tissues; hence, multi-organ involvement is seen in both DM1 and DM2 patients. To date, no mutation other than these repeat expansions has been found to cause DM.

[Figure 1.1 image. Panel a, 5’ to 3’: (CTG) repeats, normal 5-37; pre-mutation 39 to ~49; full mutation >50 to ~4,000. Panel b, 3’ to 5’: (TGCC) repeats, normal <24; full mutation 52 to ~11,000.]

Figure 1.1: Schematic representation of a. (CTG)n repeat expansion in 3’ untranslated region (UTR) of dystrophia myotonica-protein kinase (DMPK) gene located on 19q13.3 in type 1 myotonic dystrophy (DM1) patients b. (CCTG)n repeat expansion in intron 1 of cellular nucleic acid-binding protein (CNBP) gene located on 3q21.3 in DM2 patients.
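The repeat-length ranges labeled in Figure 1.1 can be read as a simple lookup. The following is a minimal illustrative sketch in Python, not part of the study's analysis pipeline; the function names and the "unclassified" label are assumptions introduced here, and the thresholds are taken only from the figure labels. Counts that fall between the printed ranges (for example, 38 or 50 CTG repeats, or 24 to 51 CCTG repeats) are deliberately left unclassified rather than guessed.

```python
def classify_ctg_repeats(n):
    """Classify a DMPK (CTG)n repeat count using the ranges labeled in Figure 1.1a.

    Hypothetical helper for illustration only; thresholds come from the figure labels.
    """
    if 5 <= n <= 37:
        return "normal"
    if 39 <= n <= 49:
        return "pre-mutation"
    if n > 50:
        # The figure gives an approximate upper bound (~4,000); it is not enforced here.
        return "full mutation"
    # Values such as 38 or 50 fall between the ranges as printed in the figure.
    return "unclassified"


def classify_cctg_repeats(n):
    """Classify a CNBP (CCTG)n repeat count using the ranges labeled in Figure 1.1b.

    Hypothetical helper for illustration only; thresholds come from the figure labels.
    """
    if 0 <= n < 24:
        return "normal"
    if n >= 52:
        # The figure gives an approximate upper bound (~11,000); it is not enforced here.
        return "full mutation"
    # The figure does not label the range between 24 and 51 repeats.
    return "unclassified"


if __name__ == "__main__":
    for count in (20, 45, 1000):
        print("CTG", count, classify_ctg_repeats(count))
    for count in (10, 75):
        print("CCTG", count, classify_cctg_repeats(count))
```

Returning an explicit "unclassified" value keeps the sketch faithful to the figure, which does not assign every possible repeat count to a category.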


Whereas DM1 can have congenital as well as early childhood onset, DM2 patients have been reported to have only late adult onset. Although DM1 and DM2 patients share numerous overlapping clinical features, the symptoms of DM2 patients are generally milder and the overall prognosis is better than for DM1 patients (Meola and Moxley 2004; Day and Ranum 2005). It has been reported that in both diseases the microsatellite repeat expansions form hairpin loop-like structures (Jasinska et al. 2003), which accumulate as distinct foci within the nuclei of affected cells (Taneja et al. 1995; Liquori, Ricker, and Moseley 2001). These foci are suggested to sequester other nucleic acid binding proteins (Figure 1.2) such as muscle blind-like proteins (MBNL) (Fardaei et al. 2002; Cardani et al. 2006; Cho and Tapscott 2007; Holt et al. 2009). This phenomenon is referred to as the "RNA-gain-of-function" model (Larkin and Fardaei 2001; Meola and Moxley 2004; Ranum and Day 2004a), in which numerous splicing abnormalities have been reported as a consequence of the sequestered proteins.

Figure 1.2: Co-localization of CCUGexp RNA and muscle blind-like protein 1 (MBNL1) protein foci as determined by immunofluorescence and fluorescence in situ hybridization (IFL-FISH) method in combination with high resolution confocal microscopy from skeletal muscle biopsies of DM2 patients (Lukáš et al. 2012) reprinted with permission.

Until now, most research has focused on the misregulation of only a handful of genes, microRNAs and transcription factors that have been directly or indirectly implicated in this disease, such as chloride channel-1 (ClC-1), insulin receptor (INSR), calcium-activated potassium channel (SK3) protein, myotubularin related protein 1 (MTMR1), calcium channel, voltage-dependent, beta 1 subunit (CACNB1), sodium channel, voltage-gated, type I, beta (SCN1B) gene and a few others. Useful as their findings are, these investigations share one common drawback: they cannot connect the dots and help us better understand the overall makeup of these two very closely related diseases. Given how complex DM1 and DM2 are in themselves, every investigator who has worked in this field has acknowledged in one way or another that there must be a much broader network of genes and proteins whose regulation is affected by a complex set of regulatory activities (Bachinski et al. 2013).

Over the last few years ultra high-throughput next generation sequencing (NGS) techniques such as RNA-sequencing (RNA-seq) and microRNA sequencing (miRNA-seq) have gained favor for studying whole-genome transcriptomics. With NGS technology we now have the opportunity to sequence thousands of megabases of DNA/cDNA in just a matter of days (Marioni et al. 2008; Wang, Gerstein, and Snyder 2009; Nowrousian 2010). RNA-seq captures overall cellular gene expression in the form of digital reads, and these reads can be efficiently mapped to the reference genome so that the expression level of each gene can subsequently be determined statistically (Mortazavi et al. 2008; Langmead et al. 2009; Trapnell et al. 2010). Expression levels can also be compared efficiently between genes within a sample and between samples. With the advent of various NGS data analysis platforms, it has become much easier to identify splice variants (E.T. Wang et al. 2008a; Richard et al. 2010), single nucleotide polymorphisms (E.T. Wang et al. 2008a), novel genes and transcripts (Mortazavi et al. 2008; Trapnell, Pachter, and Salzberg 2009) and gene fusion events (Edgren et al. 2011a; Nacu et al. 2011), opening up the opportunity to analyze allele-specific expression and changes in RNA editing in depth.
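As a simple illustration of how mapped read counts can be turned into expression levels that are comparable between genes and between samples, the sketch below computes reads per kilobase of transcript per million mapped reads (RPKM), the normalization introduced by Mortazavi et al. (2008). The gene names, read counts, gene lengths and library size are invented for the example, and the sketch is not the analysis pipeline used in this study.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical values: gene -> (mapped reads, gene length in bp)
toy_counts = {"GENE_A": (500, 2000), "GENE_B": (500, 4000)}
toy_library_size = 10_000_000  # total mapped reads in the sample (assumed)

toy_expression = {
    gene: rpkm(reads, length, toy_library_size)
    for gene, (reads, length) in toy_counts.items()
}
```

Both toy genes receive the same number of reads, but the shorter gene obtains the higher RPKM value, which is why length normalization is needed before expression is compared between genes.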

Additionally, measurements of protein expression levels via use of stable isotopes have gained traction in modern day proteomics research (Ross et al. 2004). Estimation of protein abundance using iTRAQ (isobaric tags for relative and absolute quantification) is a departure from traditional 2-dimensional gel electrophoresis to a much more robust quantitative mass spectrometric (MS) technique (Nassa et al. 2014; Tu et al. 2014; White et al. 2014). iTRAQ quantification allows comparative analysis of peptides and proteins between various samples including disease and control in a single mixture.
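The comparison of samples within a single iTRAQ mixture can be sketched as follows: for every peptide, the reporter ion intensity of the disease channel is divided by that of the control channel, and the protein-level change is summarized as the median log2 ratio over its peptides. The assignment of reporter channels 114 and 116 to control and disease, the intensities, and the use of the median are assumptions made for illustration only and do not describe the quantification software used in this work.

```python
import math
from statistics import median


def protein_log2_ratio(peptides, disease_channel, control_channel):
    """Median log2(disease/control) reporter ion ratio across peptides."""
    ratios = [
        math.log2(p[disease_channel] / p[control_channel])
        for p in peptides
        # peptides without signal in either channel cannot give a ratio
        if p[disease_channel] > 0 and p[control_channel] > 0
    ]
    if not ratios:
        raise ValueError("no peptide has signal in both channels")
    return median(ratios)


# Hypothetical reporter ion intensities for three peptides of one protein
toy_peptides = [
    {"114": 1000.0, "116": 2000.0},
    {"114": 800.0, "116": 1600.0},
    {"114": 500.0, "116": 2000.0},
]
toy_ratio = protein_log2_ratio(toy_peptides, disease_channel="116", control_channel="114")
```

A positive value indicates higher abundance in the disease channel; the median is used here so that a single outlying peptide does not dominate the protein-level estimate.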

1.1: Project objectives

In spite of similar clinical features, DM1 and DM2 are two separate diseases that require a clearer differential diagnosis and their own management strategies. I undertook an unbiased, intensive investigative approach to understand the molecular events participating in the pathophysiology of these diseases. To do so I profiled the global mRNA, miRNA and protein expression landscape in skeletal muscle biopsies obtained from clinically diagnosed DM1 and DM2 patients and compared them with those of control muscle biopsies. First, the overall gene expression pattern in DM1 and DM2 patients was compared with that of healthy individuals. Second, the differential expression of microRNAs in DM1 and DM2 patients was examined. Third, the post-transcriptional genomic hotspots in DM1 and DM2 patients were determined by identifying the genomic loci with the highest mRNA activity in each patient. Fourth, the respective mRNA and miRNA expression profiles from DM1 and DM2 patients were compared with each other to determine overlapping and uniquely expressed mRNAs and miRNAs in both patient groups. Based on those findings, biological processes, canonical pathways and transcription factors that are commonly and uniquely enriched in DM1 and DM2 patients were identified. Fifth, miRNA expression profiles from DM1 and DM2 patients and control samples were integrated with the mRNA expression profile to identify genes targeted by differentially expressed miRNAs. Sixth, the gene expression pattern in skeletal muscle biopsies of three transgenic mouse models replicating phenotypic characteristics of human DM1 and DM2 patients was analyzed to test the hypothesis of an RNA-gain-of-function mechanism for DM1 and DM2. The gene expression data from human and mouse samples were cross-compared to find similar and distinct gene expression patterns. Finally, the protein expression profiles of human DM1 and DM2 patients were compared with their respective mRNA expression profiles to identify characteristic elements of the underlying pathophysiology of these two diseases.

Taken together, this novel comparative approach of integrating mRNA, microRNA and protein expression is the most comprehensive and unbiased strategy to better understand the underlying pathophysiology of DM1 and DM2 disease. In a broader view this approach could be a unique experimental design to perform in-depth investigation of other related neuromuscular diseases.


Chapter 2

REVIEW OF THE LITERATURE

2.1: Muscular system

Muscles form an essential organ system involved in a wide range of human body functions, notably providing support and strength to the body, cardiac conduction and gastro-intestinal movements. There are three major types of muscles (Table 2.1): skeletal muscles, smooth muscles and cardiac muscles. Skeletal muscles are voluntary in nature and are found in the greatest quantity. Their main function is to provide support and structure to the human body and to assist in locomotion. Smooth muscles are involuntary in nature and are found in hollow organs such as airway tracts, blood vessels, gastro-intestinal tracts, urinary tracts and genital organs. Cardiac muscles are found only in the heart; they have a physical appearance similar to skeletal muscles but, like smooth muscles, are involuntary in nature. Skeletal muscles are innervated by motor neurons that receive messages from the brain that regulate their function. There are approximately 650 skeletal muscles in the human body. Cardiac and smooth muscles, being involuntary in nature, are under autonomic control.

2.2: Skeletal muscle system

As the name suggests, skeletal muscles are attached to the skeleton (bones) of the body by structures called tendons that are made up of collagen fibers. The functional units of the skeletal muscles are the myocytes, that is, the muscle cells, more commonly called muscle fibers. These myocytes are long, cylindrical and multinucleated. The energy requirement of skeletal muscles is met by both aerobic and anaerobic respiration. Muscle cells store glucose as glycogen, which under aerobic conditions yields energy in the form of ATP via oxidative phosphorylation. In emergency situations and after prolonged exercise, on the other hand, glucose is released via glycogenolysis and anaerobic glycolysis then results in ATP and lactic acid formation.

Table 2.1: Characteristic features of skeletal, cardiac, and smooth muscles in humans

Characteristics | Skeletal Muscle | Cardiac Muscle | Smooth Muscle
Location | Attached to bones (skeleton) | Found only in the heart | Found in the walls of blood vessels and in the walls of organs of the digestive, respiratory, urinary, and reproductive tracts
Function | Movement of body. Prevention of movement of body. | Pumping of blood | Control of blood vessel diameter. Movement of content in hollow organs
Anatomical description | Very large, cylindrical, multinucleated cells arranged in parallel bundles | Short cells with blunt, branched ends. Cells joined to others by intercalated disc and gap junction | Small, spindle-shaped cells joined to each other by gap junctions
Initiation of contraction | Only by nerve cell | Spontaneous (pacemaker cells), modified by nerves | Small contraction always maintained. Modifiable by nerves
Voluntary | Yes | No | No
Gap Junction | No | Yes | Yes
Speed and sustainability of contraction | Fast: 50 msec. (0.05 seconds). Not sustainable | Moderate: 150 msec. (0.15 seconds). Not sustainable | Slow: 1-3 seconds. Sustainable indefinitely
Likelihood of fatigue | Varies widely depending on the type of skeletal muscle and work load | Low. Relaxation between contractions reduces the likelihood | Generally does not fatigue
Striated | Yes | Yes | No


Skeletal muscles are derived from the mesoderm layer of embryonic tissue. When observed under a microscope, skeletal muscles exhibit a distinct banding pattern of cytoskeletal elements composed of actin (thin filament) and myosin (thick filament) proteins. These proteins are arranged in a repetitive fashion, which gives skeletal muscles their striated appearance. An individual functional unit of skeletal muscle is known as a sarcomere (Figure 2.1). Microscopically, the sarcomere is composed of the thick A-band, with a dense M-line in the middle, and the thin I-band, with a very electron-dense Z-disc at the ends. The length of a sarcomere is calculated as the distance between two Z-discs. Although the size of a sarcomere depends on the skeletal muscle and its location in the body, the average length of a sarcomere ranges from 2.0-3.0 µm. The central portion of the sarcomere, i.e. the M-band, is rich in creatine kinase (CK), which transfers the phosphate group of creatine phosphate to ADP to generate ATP. An elevated level of CK in blood is often used as a diagnostic biomarker of muscle damage, which is an important component of various myopathies.

Several other muscle proteins also serve as accessories to the sarcomeric units. Some of these include: troponins (TNNC, TNNI, TNNT), tropomyosins (TPM1, TPM2, TPM3, TPM4), titin (TTN), nebulin (NEB), alpha-actinins (ACTN), myozenins (MYOZ), myotilin (MYOT), obscurin (OBSCN), filamins (FLNA, FLNB, FLNC), calpains (CAPN), telethonin (TCAP), syncoilin (SYNC), dystrobrevin (DTNA, DTNB), synemin (SYNM), sarcoglycans (SGC), dystroglycans (DAG1), and vinculin (VCL). These proteins are highly important as they assist in maintaining normal muscle structure and physiology.


Figure 2.1: Detailed cross-section of human skeletal muscle. (The figure was produced using Servier Medical Art at http://www.servier.com.)

2.3: Muscle contraction

Being voluntary in nature, skeletal muscles are innervated by motor neurons that regulate their motion. This dynamic association is referred to as the neuromuscular junction. The nerve terminals form synapses with the muscle fibers and, in response to an action potential, effectively pass signals from neurons in the form of chemicals such as acetylcholine (ACh). This causes changes in various ion stores such as sodium (Na+), potassium (K+) and calcium (Ca2+) at the muscular end of the junction. There is a net Na+ ion intake by the muscle cells that is accompanied by net Ca2+ ion release into the sarcoplasm, resulting in initiation of muscular contraction. Once released, Ca2+ binds to troponin C, which regulates the troponin-tropomyosin complex; this frees the myosin heads to interact with an actin filament, inducing a gliding motion that shortens the sarcomeric bands. Depletion of available Ca2+ results in relaxation of the muscle fiber. Various proteins are responsible for regulating this whole process. Some of these are dystrophin (DMD), receptor-associated protein of the synapse (RAPSN), ryanodine receptor (RYR1) protein, sarcoplasmic/endoplasmic reticulum Ca2+ reuptake pump (SERCA) family proteins, triadin (TRDN) and calsequestrin (CASQ).

2.4: Simple tandem repetitive (STR) sequences and their molecular consequences

Simple tandem repetitive (STR) sequences, also known as microsatellites, are omnipresent in both eukaryotes and prokaryotes. STRs are present in both protein-coding and non-coding regions (Jeffreys, Wilson, and Thein 1985; Kashi, King, and Soller 1997; Tóth, Gáspári, and Jurka 2000; Bhargava and Fuentes 2010). Usually the repeat units of STRs are 2-6 nucleotides in length and have been shown to have unique repeat patterns that are site specific. This makes them highly suitable candidates for genetic mapping, population genetics, cancer screening, and notably differential diagnosis of various diseases (Kashi and King 2006; Usdin 2008). Among genetically inherited disorders, one of the major disease groups in which microsatellite expansions have been widely implicated is the neuromuscular disorders (NMDs) (Ranum and Day 2002a; Albrecht and Mundlos 2005). There are at least 16 different genetically inherited NMDs (Table 2.2) in which repeat expansions have been clearly demonstrated to involve the neurological and/or muscular systems in their pathogenesis (Dickson and Wilusz 2010). These repeat expansions exert different functional effects depending upon their location (Figure 2.2), including gain-of-function or loss-of-function at both RNA and protein levels along with production of toxic RNA (Brouwer, Willemsen, and Oostra 2009; Shin, Charizanis, and Swanson 2009; Wojciechowska and Krzyzosiak 2011; Echeverria and Cooper 2012).
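To make the definition of an STR concrete, the following minimal sketch scans a DNA string for perfect tandem repeats whose repeat unit is 2-6 nucleotides long. The function name, the threshold of three consecutive units and the toy sequence are assumptions chosen for illustration; dedicated repeat-finding tools handle interrupted and imperfect repeats, which this sketch does not.

```python
import re


def find_strs(sequence, min_units=3):
    """Return (motif, unit_count, start) for perfect tandem repeats of 2-6 nt motifs."""
    # The lazy {2,6}? tries the shortest motif first; \1 demands exact copies of it.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_units - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # a run of one base (e.g. AAAAAA) is not a 2-6 nt motif
        hits.append((motif, len(match.group(0)) // len(motif), match.start()))
    return hits


# Toy sequence: five CTG units and four CCTG units separated by unique bases
toy_sequence = "GGA" + "CTG" * 5 + "TTAC" + "CCTG" * 4 + "A"
toy_repeats = find_strs(toy_sequence)
```

On the toy sequence the function reports the CTG tract (five units) and the CCTG tract (four units) together with their zero-based start positions.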

Figure 2.2: Simple tandem repeat expansions associated with various neuromuscular diseases in humans. Repeat sequences are indicated on the outside. Postulated mechanisms of pathogenesis mediated by protein loss-of-function (white), RNA and protein gain of function (yellow) (Wojciechowska and Krzyzosiak 2011) reprinted with permission.


Table 2.2: Repeat Expansion Disorders and Toxic RNAs (Dickson and Wilusz 2010) reprinted with permission.

Disease | Symptoms | Affected Gene | Repeat Type/Length | Toxic RNA Evidence

Group I: Diseases where toxic RNA has been implicated

DM1, Type 1 myotonic dystrophy | Multiple organ systems affected, but mostly muscle displaying myotonia, muscle weakness and atrophy; congenital form has neural involvement | DMPK, Dystrophia myotonica protein kinase | 3’ UTR CUG; Normal 5-37; Premutation 38-50; Mild 50-100; Classic 100-1000; Congenital >1000 | RNA retained in nuclear foci that are associated with MBNL1. Wide-ranging defects in splice site selection. Upregulation of CUGBP1
DM2, Type 2 myotonic dystrophy | Similar to DM1, but proximal muscles affected first and no congenital form | ZNF9, Zinc finger 9 | Intron 1 CCUG; Normal 7-24; Premutation 22-33; DM2 75-11,000 | Nuclear foci containing MBNL1. Wide-ranging defects in splice site selection
HDL2, Huntington’s disease-like 2 | Chorea, dystonia, rigidity, bradykinesia, psychiatric symptoms, dementia | JPH3, Junctophilin-3 | Coding or 3’ UTR CUG; Normal 6-27; Affected 40-57 | Nuclear foci containing MBNL in brain; mutant transcript induces cell toxicity
SCA3, Machado Joseph disease | Progressive cerebellar ataxia | ATXN3, Ataxin-3 | Coding CAG; Normal 13-36; Affected 68-79 | In Drosophila model MBNL1 overexpression or mutation of CAG rpt to CAACAG rpt mitigates toxicity
SCA8, Spinocerebellar ataxia 8 | Slow progressing cerebellar ataxia | ATXN8, Ataxin-8; ATXN8OS, Ataxin-8 opposite strand | Noncoding CTA/CTG; Coding CAG; Normal 16-91; Affected 110-130 | CUG RNA foci containing MBNL1 in nuclei of neurons; loss of MBNL1 exaggerates phenotype in mouse model; GABT4 mRNA is mis-spliced; toxicity in Drosophila is modulated by MBNL1 and other RNA-binding proteins
SCA10, Spinocerebellar ataxia 10 | Cerebellar ataxia and seizures | E46L, Ataxin-10 | Intronic ATTCT; Normal 10-29; Affected 800-4500 | Expression of AUUCU repeat RNA in cell culture leads to nuclear foci
SCA31, Spinocerebellar ataxia 31 | Progressive cerebellar ataxia mainly affecting Purkinje cells | BEAN, Brain associated Nedd4 | Intronic TGGAA; Large (>2.5 kb) microsatellite insertion | RNA foci seen in Purkinje cells. CGG transcripts UGGAA rpts bind splicing factors
FXTAS, Fragile X-associated tremor/ataxia syndrome | Age-dependent progressive intention tremor and ataxia, white matter abnormality | FMR1, Fragile X mental retardation 1 | 5’UTR CGG; Normal 5-45; FXTAS 55-200; Fragile X >200 | CGG transcripts and MBNL1 in intranuclear astroglial inclusions. Long untranslated CGG-repeat RNAs are toxic to cultured cells. Overexpression of RNA-binding proteins hnRNPA2/B1, CUGBP1 or Pur-α can suppress neurodegeneration in Drosophila

Group II: Diseases where toxic RNA has not been ruled out

HD, Huntington’s disease | Uncontrolled movements, emotional instability, loss of memory, neurodegeneration | HTT, Huntingtin | Coding CAG; Normal 9-37; Affected 37-121 | None reported
DRPLA, Dentatorubral-pallidoluysian atrophy | Ataxia, epilepsy, and dementia | ATN1, Atrophin-1 | Coding CAG; Normal 7-23; Affected 49-88 | None reported
SCA2, Spinocerebellar ataxia 2 | Ataxia and peripheral neuropathy | ATXN2, Ataxin-2 | Coding CAG; Normal 15-24; Affected 35-59 | None reported
SBMA, Kennedy disease/spinal bulbar muscular atrophy | Myopathy, progressive proximal muscle weakness, lower motor neuron loss and denervation-induced atrophy in skeletal muscle | AR, Androgen receptor | Coding CAG; Normal 10-36; Affected 38-62 | Aberrant splicing of Clcn1 mRNA and elevated CUGBP1 expression but these are linked with polyQ toxicity
SCA6, Spinocerebellar ataxia 6 | Cerebellar ataxia | CACNA1A, Calcium channel, voltage-dependent, P/Q type, alpha 1A subunit | Coding CAG; Normal 5-20; Affected 21-25 | None reported
SCA7, Spinocerebellar ataxia 7 | Ataxia with ophthalmologic disorders | ATXN7, Ataxin-7 | Coding CAG; Normal 4-35; Affected 36-306 | None reported
SCA12, Spinocerebellar ataxia 12 | Neurodegeneration, cerebellar ataxia, seizures and dementia | PPP2R2B, Protein phosphatase 2 reg. subunit B, isoform β | 5’UTR CAG; Normal 9-28; Affected 55-78 | None reported
SCA17, Spinocerebellar ataxia 17 | Cerebellar ataxia, similar to Huntington’s disease | TBP, TATA-binding protein | Coding CAG; Normal 29-42; Affected 47-55 | None reported

Although several diseases associated with repeat expansions had been reported by the mid 1990s, it had not been elucidated what role these expansions played and whether they contributed to the pathogenesis of such diseases. In 1995, it was demonstrated for the first time that these repeat expansions formed hairpin loop-like secondary RNA configurations (Gacy et al. 1995), which provided the structural basis for their maintenance. It was also shown that the stability of these repeat sequences as hairpin loops was directly correlated with the length of their expansions (Figure 2.3). In 1997, the very first evidence of formation of hairpin loop-like secondary structures by the trinucleotide repeat expansions (CUGs) in type 1 myotonic dystrophy patients was provided (Napierała and Krzyzosiak 1997), while similar evidence was provided for tetranucleotide repeat expansions (CCTGs) in type 2 myotonic dystrophy in 2004 (Dere et al. 2004).

Figure 2.3: Hairpin loop like DNA structures formed by expandable repeats in various genetically inherited diseases such as spinocerebellar ataxia type 1 (Sca1); myotonic dystrophy (MyD). Flanking sequence is indicated in brackets. Calculated free energy (Δ G) in kilocalories per mole of each hairpin loop is indicated below repeat expansions (Gacy et al. 1995) reprinted with permission.


2.5: Fundamental aspects of Myotonic Dystrophy

Myotonic dystrophy (DM) is a chronic, slowly progressing, multisystemic disease involving primarily the muscular and neurological systems. It is the second most common type of muscular dystrophy in humans after Duchenne Muscular Dystrophy (DMD) (Ricker et al. 1995; Kornblum et al. 2006; Turner and Hilton-Jones 2010; Udd et al. 2011; Udd and Krahe 2012). DM shows autosomal dominant inheritance across generations, with STR expansions at the centerpiece of its pathophysiology (Day et al. 1999; Ranum and Day 2002b). Based on the type of microsatellite repeat expansion, two distinct forms of DM have been identified: Type 1 Myotonic dystrophy (DM1, OMIM 160900), which is also known as Steinert or Batten or Gibb disease (Batten and Gibb 1909; Case and Atrophica 1924), and Type 2 Myotonic dystrophy (DM2, OMIM 602668) (Thornton, Griggs, and Moxley 1994; Liquori, Ricker, and Moseley 2001), originally known as proximal myotonic myopathy (PROMM) (Ricker et al. 1995; III 1996; Udd et al. 1997; Moxley, Udd, and Ricker 1998; Day et al. 1999). DM1 is caused by a trinucleotide [CTG]n repeat expansion (~50-1000) in the 3' UTR of the dystrophia myotonica-protein kinase (DMPK) gene, which is located on chr:19q13.3 (Fu et al. 1992; Warren and Nelson 1993; Ashley and Warren 1995), while DM2 is caused by a tetranucleotide [CCTG]n repeat expansion (~75-11,000) in intron 1 of the cellular nucleic acid-binding protein (CNBP) gene, also known as zinc finger 9 (ZNF9), located on chr:3q21.3 (Kress et al. 2000; Liquori, Ricker, and Moseley 2001). DM1 has been shown to have congenital, childhood and adult onsets with moderate to severe symptoms, while DM2 is reported to have only adult onset with mild to moderate symptoms (Table 2.3) (Rönnblom 1996; Kamsteeg et al. 2012; Udd and Krahe 2012).


Table 2.3: Clinical Manifestations of Myotonic dystrophies (Udd and Krahe 2012) reprinted with permission.

Feature | Myotonic dystrophy type 1 | Myotonic dystrophy type 2

Genetics
Inheritance | Autosomal dominant | Autosomal dominant
Anticipation | Pronounced | Exceptionally rare
Congenital form | Yes | No
Chromosome | 19q13.3 | 3q21.3
Gene | DMPK | CNBP
Expansion mutation | (CTG)n | (CCTG)n
Location of the expansion | 3’ untranslated region | Intron 1

Core features
Clinical myotonia | Typical in adult onset | Present in less than 50%
Myotonia on electromyography | Generally present | Absent and variable in many patients, needs detailed investigation
Muscle weakness | Disability often by age 30-50 years | Disability at age 60-85 years
Cataracts | Generally present | Present in a few patients at diagnosis

Localization of muscle weakness
Face or jaw | Generally present | Usually absent
Ptosis | Often present | Rare, mild, or moderate
Bulbar (dysphagia) | Generally present later in life | Not present
Respiratory muscles | Generally present later in life | Exceptionally rare cases
Distal limb muscle | Generally prominent | Flexor digitorum profundus in some patients
Proximal limb muscle | Can be absent for many years | Main disability in most patients late onset
Sternocleidomastoid muscle | Generally prominent | Prominent in few patients

Muscular symptoms
Myalgic pain | Absent or moderate | Most disabling symptom in most patients
Muscle strength variations | Occasional | Can be considerable
Visible muscle atrophy | Face, temporal, distal hands, and legs | Usually absent
Calf hypertrophy | Absent | Present in at least 50%

Laboratory findings
Concentration of creatine kinase in serum | Normal-to-moderate increase | Normal-to-moderate increase

Muscle biopsy findings
Fiber atrophy | Smallness of type 1 fibers | Highly atrophic type 2 fibers
Nuclear clump fibers | In late stage only | Scattered early before weakness
Sarcoplasmic masses | Very frequent in distal muscles | Very rare
Ring fibers | Frequent | May occur
Internal nuclei | Massive in distal muscle | Variable and mainly in type 2 fibres

Cardiac symptoms
Conduction defects | Common | Highly variable, absent to severe

Other neurological symptoms
Tremors | Absent | Prominent in many patients
Behavioral changes | Common | Not apparent
Hypersomnia | Prominent | Infrequent

Other features
Manifest diabetes | Occasional | Infrequent
Frontal balding in men | Generally present | Exceptional
Incapacity (work and activities of daily living) | Typically after age 30-35 years | Rarely younger than 60 years, unless severe pain
Life expectancy | Reduced | Normal


The hallmark symptoms of this disease include myopathy and muscle weakness, myotonia (hyperexcitability of muscles), and multiorgan involvement, especially of the cardiac system (such as cardiac arrhythmia and hypertrophy), along with development of cataracts, diabetes mellitus, abnormalities of the brain such as cognition defects, and hypogonadism (Day et al. 1999; Finsterer 2002; Udd and Krahe 2012).

2.6: Prevalence of myotonic dystrophies

While the collective frequency of myotonic dystrophies is predicted to be as high as approximately 1 in 8,000 individuals worldwide (Suominen et al. 2011; Udd and Krahe 2012), exact incidence rates for DM1 and DM2 separately are not available, mainly for three reasons: prevalence varies between geographical and ethnic population groups, the symptoms of the two diseases overlap, and data are insufficient because most cases go undiagnosed for too long. A recent study (Suominen et al. 2011) reported that in Finland DM1 has a population frequency of one in 2760 people whereas DM2 has a frequency of one in 1830. Death in cases of DM is mainly due to cardiac and respiratory failure as a result of cardiac conduction defects. Other than this, DM is not an overall life-threatening condition, and life expectancy ranges between 50-60 years.
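For orientation, the two Finnish frequencies can be combined into a single figure with simple arithmetic. The sketch below assumes that the two mutations are carried by different individuals, so that the frequencies simply add; this additivity is an assumption of the example and not a result reported by Suominen et al. (2011).

```python
from fractions import Fraction


def combined_one_in(*one_in_values):
    """Combine '1 in N' frequencies, assuming the groups do not overlap."""
    combined = sum(Fraction(1, n) for n in one_in_values)
    return combined, 1 / combined


# Frequencies quoted in the text for Finland: DM1 1 in 2760, DM2 1 in 1830
toy_combined, toy_one_in = combined_one_in(2760, 1830)
toy_one_in_rounded = round(float(toy_one_in))  # roughly 1 in 1100
```

Under this assumption the combined frequency in the Finnish data would be roughly 1 in 1100, noticeably higher than the approximate worldwide figure of 1 in 8,000 quoted above.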

Nevertheless, the socio-economic impact of this disease is found to be much greater than estimated. Severe myalgic pain along with cardiac conduction defects are seen as the most disabling symptoms, severely limiting patients' ability to work. In many cases these symptoms have been reported to keep the patient completely bedridden for most of their productive life (von zur Mühlen et al. 1998; Meola et al. 2002; Schara and Schoser 2006).


2.7: Type 1 Myotonic Dystrophy (DM1)

Type 1 myotonic dystrophy (DM1), also known as Steinert's disease, was first described in 1909 (Batten and Gibb 1909), and since then four different forms of DM1 have been identified: adult-onset (classical), congenital (most severe), childhood-onset (juvenile form) and late-onset (mild form) (Arsenault et al. 2006; Kamsteeg et al. 2012). Of these, the congenital form is the most severe manifestation, with developmental defects and mental retardation in newborn babies. DM1 patients primarily experience muscular pain in the distal parts of their bodies, such as hands and legs, with accompanying atrophy of facial muscles. Cardiac conduction defects are also common, along with enlargement of the heart. A majority of patients also develop diabetes and show signs of balding, dysarthria, dysphagia and immobility. Respiratory and cardiac insufficiencies are the major causes of death among DM1 patients (Udd and Krahe 2012).

Anticipation (progressively earlier onset of a disease with worsening of the symptoms in subsequent generations) has been commonly observed among many DM1 families (Mahadevan et al. 1992; Mitas 1997; Kamsteeg et al. 2012). In such families the number of repeats, and hence the size of the expansion, increases from one generation to the next, which leads not only to earlier onset of DM1 in subsequent generations but also to much more severe symptoms. Since the largest expansions are transmitted only via the eggs, congenital myotonic dystrophy is virtually always transmitted by an affected mother (Longman 2009). It has also been shown that the repeat expansion increases with the patient's age, which is reflected in the progressive clinical symptoms of DM1 patients (Martorell et al. 1995).


Briefly, the dystrophia myotonica-protein kinase (DMPK, OMIM: 605377, ENSG00000104936) gene, also known as myotonin-protein kinase (Mt-Pk), is located on Chromosome 19: 46,272,975-46,285,810, reverse strand (Fu et al. 1993). DMPK protein is detected in skeletal and cardiac muscles and central nervous tissues. Immunohistochemical staining has revealed that DMPK localizes to the sarcoplasmic reticulum (Salvatori et al. 1997) and neuromuscular junction in skeletal muscles, to intercalated discs in heart and Purkinje fibers (Maeda et al. 1995) and to gap junctions (Mussini et al. 1999), suggesting a role in calcium and sodium ion homeostasis and also in the regulation of signal transduction systems. Electron microscopy has revealed that in the central nervous system DMPK is localized in the cytoplasm of cerebellar Purkinje cells, hippocampal interneurons and spinal motoneurons (Balasubramanyam et al. 1998), suggesting a role in cognition, memory and control of motor functions and reflexes.

DMPK encodes a protein of 71 kDa in Purkinje fibers and 80 kDa in skeletal muscles (Maeda et al. 1995). The protein encoded by the DMPK gene belongs to the serine/threonine protein kinase family (Manning et al. 2002; Wansink et al. 2003) and is a member of the Rho kinase family (Fu et al. 1993; Maeda et al. 1995; Leung et al. 1998; Mastaglia et al. 1998; Pham et al. 1998; Amano, Fukata, and Kaibuchi 2000). Alternative splice sites give rise to 16 known splice variants of this gene, of which 12 are protein coding (Jansen et al. 1992; Mahadevan et al. 1992). CTG repeats in healthy individuals range from 5-37; repeat lengths of 38-49 are considered pre-mutation expansions, while in DM1 patients the repeats range from 50-4000 (Ashizawa T and Baiget M 2000). No mutations have been found in the coding regions of the DMPK gene in DM1 patients.
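The repeat-length thresholds quoted above can be expressed as a small lookup. The sketch below is only an illustration of those ranges (5-37 normal, 38-49 pre-mutation, 50 or more in DM1 patients); the function name and category labels are assumptions, and it is not a diagnostic tool.

```python
def classify_ctg_repeats(repeat_count):
    """Map a DMPK CTG repeat count onto the ranges quoted in the text."""
    if repeat_count < 0:
        raise ValueError("repeat count cannot be negative")
    if repeat_count < 5:
        return "below reported normal range"  # the text gives 5 as the lower bound
    if repeat_count <= 37:
        return "normal"
    if repeat_count <= 49:
        return "pre-mutation"
    return "DM1 range"  # 50 to ~4000 repeats reported in patients


# Hypothetical repeat counts for illustration
toy_calls = {n: classify_ctg_repeats(n) for n in (20, 45, 800)}
```

Counts at the boundaries follow the text literally, so 37 repeats are classed as normal, 38 as pre-mutation and 50 as falling in the DM1 range.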


Various studies have proposed haploinsufficiency of the DMPK gene and protein as an underlying cause of DM1, meaning that down-regulation of DMPK mRNA, and hence a reduced protein level, causes DM1 pathophysiology (Fu et al. 1992; Hofmann-Radvanyi et al. 1993; Bhagavati et al. 1996). Homozygous knockout (DMPK-/-) mouse models displayed mild defects in cardiac conduction and skeletal muscle physiology, with late onset of myopathy (Jansen et al. 1996; Reddy et al. 1996). Animals with very high DMPK transgene expression showed cardiac hypertrophy and conduction defects. However, these animal models lacked other frequently observed DM1 symptoms such as myotonia, cataract and male infertility. Abnormal sodium channel gating was detected in heterozygous (DMPK+/-) mouse muscle and cardiomyocytes (Mounsey et al. 2000; Reddy et al. 2002; Lee et al. 2003). Taken together, simple haploinsufficiency of the DMPK transcript is not sufficient to explain the multisystemic disorders seen in DM1 patients.

2.8: Type 2 Myotonic Dystrophy (DM2)

While DM1 has been studied extensively since it was first described in 1909, it was not until 1994 that patients showing symptoms similar to DM1 without the notable CTG repeat expansion in the DMPK gene were first reported (Thornton, Griggs, and Moxley 1994). Soon after this, 27 cases in a group of 14 German families with late onset of myotonia were reported (Ricker et al. 1995), in which the symptoms were less severe and proximal parts of the patients' bodies such as legs and hands were affected. Notably, CTG repeat expansion in the DMPK gene was also absent in these patients, and the disease was termed proximal myotonic myopathy (PROMM). The authors concluded at that time that "PROMM is a new genetic disorder similar to, but distinct from, DM. Patients suspected of having DM but with negative DNA studies may have PROMM. The gene defect for PROMM awaits discovery". With evidence mounting for a novel type of myotonic dystrophy, diagnostic criteria for such cases were presented at the 54th European Neuromuscular Center International Workshop in 1997. These included confirmation of autosomal dominant inheritance, evidence of proximal weakness, especially of the thighs and limbs, and, most notably, a normal-sized CTG repeat in the DMPK gene.

In the same year, a group from Finland reported a very similar late-onset, multisystemic disorder with proximal muscle weakness in Finnish patients (Udd et al. 1997). They termed this disorder proximal myotonic dystrophy (PDM). Blood samples and muscle biopsies from 63 individuals across five generations of a family in Minnesota exhibiting muscle weakness and myotonia were then studied, and these patients were shown to have variability in muscle fiber size along with nuclear foci and atrophied muscle fibers (Day et al. 1999). Most notably, this was the first time that the disease locus in these patients was mapped to a 10 cM region of chromosome 3q, in the absence of the chromosome 19 CTG expansion in the DMPK gene. The investigators were confident that this disease was different from classical DM, and they were the first to call it "type 2 myotonic dystrophy" (DM2). Liquori et al. (2001) provided the much-needed evidence by reporting that DM2 is in fact caused by an uninterrupted tetranucleotide (CCTG)n repeat expansion located in intron 1 of the Zinc Finger 9 (ZNF9) gene, also known as cellular nucleic acid binding protein (CNBP, Chr3:128,886,657-128,902,809, OMIM: 116955). They also showed that this repeat expansion is expressed as part of the unprocessed transcript and accumulates as nuclear RNA foci in affected DM2 muscle.

Briefly, the Zinc Finger 9 (ZNF9) gene, also known as Cellular Nucleic Acid Binding Protein (CNBP, OMIM: 116955, ENSG00000169714), is located at Chr3:128,886,657-128,902,809 on the reverse strand (Lusis et al. 1990). It encodes a 17 kDa protein (Figure 2.4) containing seven zinc finger domains with the consensus sequence Cys-X2-Cys-X4-His-X4-Cys (Kothekar 1990) and is suggested to be an RNA-binding protein (Flink and Morkin 1995a; Calcaterra et al. 1999; Chen et al. 2003).

Figure 2.4: Amino acid sequence of cellular nucleic acid binding protein (CNBP), shown as seven repeats of 18 amino acids each (Kothekar 1990); reprinted with permission.

To date, 10 splice variants of this gene are known, 8 of which are protein-coding transcripts (Flink and Morkin 1995). CNBP is highly conserved across species, with a striking 100% similarity between human and mouse and 99% between human and chicken (van Heumen, Claxton, and Pickles 1997). Early reports indicated that CNBP had a role in cholesterol metabolism (Rajavashisth et al. 1989), which could not be confirmed later. Subsequently, CNBP was reported to bind single-stranded DNA containing the sequence CCCTCCCA, also known as the CT element (Michelotti et al. 1995), a segment of DNA that enhances c-Myc promoter activity. CNBP has also been shown to be essential for forebrain development in the mouse (Chen et al. 2003). Homozygous knockout (CNBP-/-) mice had severe forebrain truncation and decreased expression of Myc; all embryos died around embryonic day 10.5 (E10.5), and no Cnbp-/- pups were born. Among heterozygous (CNBP-/+) embryos, 40% of newborns had severe birth defects, including growth retardation, craniofacial defects, smaller jaws and lack of eyes, and died shortly after birth.

2.9: Molecular pathomechanism of DM1 and DM2 disease: RNA-gain-of-function model

Several hypotheses have been put forward to explain the pathogenesis of the repeat expansions. One proposes a reduction in the steady-state mRNA (Pieretti et al. 1991; Sutcliffe et al. 1992; Knight et al. 1993) and/or protein (Feng et al. 1995) levels of the genes affected by these repeat expansions. The second is an RNA-dominant mutation model, in which the expanded repeat forms a duplex hairpin that accumulates as foci in the nuclei of DM1 (Taneja et al. 1995; Davis et al. 1997) and DM2 (Liquori, Ricker, and Moseley 2001) affected cells. These foci act as protein sinks, sequestering certain nuclear proteins and making them unavailable for their normal functions (Timchenko et al. 1996). Study of nuclear foci by fluorescence in situ hybridization (FISH) in skeletal muscle tissues from both DM1 and DM2 patients revealed that a nuclear protein known as muscleblind-like (MBNL1/EXP, Chr3:151,985,828-152,183,568, OMIM: 606516) co-localized with the repeat expansions in both DM1 and DM2 patients (Mankodi et al. 2001). MBNL1 is the human homolog of a protein required for terminal differentiation of muscle and photoreceptor cells in Drosophila (Artero et al. 1998). It was thereafter hypothesized that the sequestration of MBNL1 protein in nuclear foci plays a critical role in DM pathogenesis, raising the possibility of an "RNA gain of function" (Figure 2.5) as a unifying mechanism underlying both forms of DM.

Figure 2.5: Proposed RNA-gain-of-function molecular mechanism in type 1 (DM1) and type 2 (DM2) myotonic dystrophies in human patients (Udd and Krahe 2012); reprinted with permission.


It was later reported that two more MBNL proteins, MBLL (MBNL2, Chr13:97,874,550-98,046,374, OMIM: 607327) and MBXL (MBNL3, ChrX:131,503,341-131,623,995, OMIM: 300413), co-localize with the expanded repeats in DM1 and DM2 transcripts in addition to MBNL1 (Fardaei et al. 2002). The same study reported that MBNL1 and MBNL2 were expressed in many adult tissues, while MBNL3 was expressed predominantly in placental tissue. Later, a yeast three-hybrid system was used to confirm that MBNL1 binds to the CTG and CCTG repeats of DM1 and DM2, respectively (Kino et al. 2004). The authors also observed that MBNL1 binds to longer repeats about as well as to shorter repeat expansions, and that its binding affinity for CCTG repeats is much greater than for CTG repeats.

Under normal conditions, genes are transcribed and the transcripts are transported into splicing speckles before being exported to the cytoplasm. Immunolocalization studies coupled with in situ hybridization showed that nuclear foci in DM1 patients localized around the periphery of the nuclear splicing speckles, whereas in DM2 patients the foci were widely dispersed in the nucleoplasm and did not accumulate around either speckles or exosomes, and the spliced CNBP transcript passed out to the cytoplasm (Holt et al. 2009). They also confirmed that foci in both DM1 and DM2 sequester MBNL1, as reported earlier. To determine whether CTG repeats alone are sufficient to produce toxic effects or whether the DMPK gene is also necessary for the pathogenesis of DM1, the first transgenic mice (HSALR) were generated, in which ~250 CTG repeats were inserted in the final exon of the human skeletal actin (HSA) gene between the stop codon and the polyadenylation site (Mankodi et al. 2000). The idea was to keep the HSALR repeat constructs free of any association with the DM locus. The authors observed that the actin coding sequence in the transgenic mice was fully intact and that the transcript carrying the long repeat was fully spliced, suggesting that the repeat expansion had no effect on the splicing of the gene itself (in this case, actin). Symptomatically, the progeny of these transgenic mice exhibited high mortality (41%), myotonic discharges as early as 4 weeks and abnormal hind limb posture. Histologically, the myopathy was characterized by increased variability of muscle fiber size and abundant central nuclei. Fluorescent in situ hybridization (FISH) detected long-repeat transcripts retained in the nucleus in multiple discrete foci reminiscent of those seen in fibroblasts and myoblasts from DM patients. Since the only feature shared by the HSALR transgene and the mutant mRNA from DM patients is the expanded repeat, the authors concluded that the repeat sequences are sufficient to trigger DM disease pathogenesis.

Similar evidence was provided for DM2, where 15 fluorescently labeled oligonucleotide probes designed to recognize unique sequences in intron 1 failed to detect any portion of intron 1 associated with the intranuclear foci (Margolis et al. 2006). This led the authors to conclude that, as in DM1, the DM2 repeat sequences are sufficient to sequester MBNL1 protein and that no sequence flanking the repeat expansion contributes to this process.

2.10: Role of MBNL protein in muscle development

The role of MBNL in muscle development and differentiation was first described in 1998 (Artero et al. 1998). The authors catalogued various developmental stages of Drosophila embryos and reported that MBNL was localized in the nucleus, which strongly suggested a regulatory role. They further examined MBNL mutant embryos and found that although MBNL was not required during the early phases of muscle formation, it was essential for the later stages of muscle differentiation. MBNL mutant flies lacked muscle striations owing to the absence of the reticular matrix at the Z band. This was accompanied by hypercontraction of muscles, partial paralysis, and late embryonic and early larval lethality. The expression of MBNL1 is high in blood, eye, cardiac and skeletal muscles (Miller et al. 2000). An RNA-protein binding assay coupled with electron microscopy revealed that the unbound purified repeat expansion [CUG]136 formed a rod-like structure, but when incubated with MBNL1 it formed a ring-like structure with MBNL1 bound to the stem of the pathogenic RNA (Yuan et al. 2007). The authors also noted that ~70% of the RNA was bound to the protein when incubated at a ratio of 1:2.5, increasing to >90% at a higher protein level of 1:10. MBNL3 expression varies between myogenic and non-myogenic cells (Lee, Squillace, and Wang 2007): while MBNL3 is enriched in spleen, lungs and testes, it is expressed at lower levels in heart, skeletal muscle and fibroblasts. MBNL3 levels decreased in muscle cells under differentiating conditions whereas MBNL1 levels were unaffected, suggesting that muscle differentiation involves complex signalling pathways that eventually inhibit MBNL3 expression.

A knock-out mouse model with a targeted deletion of exon 3 of the MBNL1 gene (MbnlΔ3/Δ3) shows symptoms such as myotonia and cataract, together with aberrant splicing of the chloride channel 1 (CLCN1), cardiac troponin T (TNNT2) and skeletal muscle troponin T (TNNT3) genes (Kanadia et al. 2003). The authors also observed that the level of CUGBP1 was unchanged, as suggested earlier (Faustino and Cooper 2003; Timchenko et al. 2001). These results indicated that loss of MBNL1 alone, without the repeat expansion, can produce some symptoms and splicing abnormalities similar to DM; however, not all of the symptoms of DM were observed in this mouse model. This led them to propose that the pathogenesis of DM is a multi-level phenomenon and that the loss of MBNL1 alone is not sufficient to produce every aspect of the disease. Hao et al. (2008) subsequently generated a knock-out mouse model of the MBNL2 gene that developed myotonia and skeletal myopathy, and demonstrated that MBNL2 deficiency affects RNA and protein expression of CLCN1. Meanwhile, Ho et al. (2004) reproduced the aberrant splicing pattern of DM by using siRNA to target endogenous MBNL1 in cell cultures, reducing MBNL1 by 80-90%.

Later, Ho et al. (2005) provided the first evidence that the relocalization of MBNL1 from mobile to immobile fractions in RNA foci and the splicing disruption of MBNL1 pre-mRNA targets are two separate events. More recently, Suenaga et al. (2009) studied aberrant splicing in DM1 human brain samples and compared it with brain tissue from the earlier homozygous MBNL1 knock-out mice (MBNLΔ3/Δ3) using splice-sensitive microarrays; they reported that not all of the splicing defects seen in DM1 patient brain samples were observed in MBNLΔ3/Δ3 brain. They concluded that factors other than MBNL1 are likely to contribute to mis-splicing events in brain tissue and need to be investigated.

2.11: Transcriptional and microRNA misregulation in DM patients

Recently, several studies have recorded the changes in gene expression in muscle biopsy samples from both DM1 and DM2 patients (Botta et al. 2007; Vihola et al. 2010; Udd et al. 2011; Rhodes et al. 2012; Bachinski et al. 2013b), primarily using microarray and RT-PCR based analysis. These studies provide a rare glimpse into the altered gene expression in DM patients, and they suggest a large overlap in gene expression changes between DM1 and DM2 patients, that is, highly similar transcriptional profiles. Most notably, key genes regulating calcium flux and metabolism, the insulin receptor gene, and genes regulating mitochondrial functions have been reported to be altered in both DM1 and DM2 patients.

MicroRNAs (miRNAs) are 22-23 nucleotide long non-coding RNAs that primarily repress the expression of target mRNAs, thereby decreasing their translational efficiency (Ambros 2001; Guo et al. 2010). Although miRNAs account for only ~1% of the total cellular RNA pool, they have been reported to regulate the expression of more than 60% of known protein-coding genes (Lewis, Burge, and Bartel 2005). Since microRNA expression had already been reported to be misregulated in several myopathies (Eisenberg et al. 2007; Gambardella et al. 2010; Zacharewicz, Lamon, and Russell 2013), various studies have recently investigated the expression of muscle-specific microRNAs in DM1 (Gambardella et al. 2010; Fernandez-Costa et al. 2013) and DM2 (Greco et al. 2012) patients.

2.12: Microarray vs. Next generation sequencing technology

While microarrays provide a relative measurement of gene expression, recently introduced ultra-high-throughput next generation sequencing (NGS) techniques such as RNA sequencing (RNA-seq) and small RNA sequencing provide a much more comprehensive and absolute measurement of transcript abundance. Microarrays have been a technology of choice for gene expression studies since the mid-1990s, as they are high-throughput compared to conventional reverse transcription polymerase chain reaction (RT-PCR) based methods and are also cost-effective. Microarray studies have provided valuable insight into gene regulation in various diseases as well as in pharmacological studies.

Nonetheless, this approach has several limitations. First and foremost, arrays are limited by the number of probes used, which allows interrogation of only a limited number of transcripts (Schulze and Downward 2000; Murphy 2002; Marioni et al. 2008; Wang, Gerstein, and Snyder 2009). This is a major drawback of the microarray technique because array design relies heavily on prior knowledge of expressed genes. Second, microarray experiments suffer from high background noise owing to cross-hybridization and to variation in hybridization properties among probes; this limits the dynamic range of detection and requires complicated normalization methods during data analysis. For this reason microarray data still need to be validated by conventional RT-PCR assays, which adds to the cost of the project. Third, RNA quantity and quality are critical for hybridization performance, as a large amount of RNA is required to produce sufficient signal. This is a problem when RNA samples are available only in limited quantity, as with biopsy samples; moreover, even a small amount of cellular protein or lipid mixed with the RNA can mediate non-specific binding of cDNA to the matrix.

In contrast, sequencing-based approaches such as RNA-seq have the potential to overcome these limitations. RNA-seq not only enables global quantification of the entire transcriptome in a matter of days (Wang, Gerstein, and Snyder 2009; Nowrousian 2010) but also captures the transcriptomic landscape of a particular sample; gene expression levels can be estimated from the total number of reads that map to the exons of a gene, normalized by the length of the exons (Mortazavi et al. 2008). Gene expression levels determined by these methods correlate closely with RT-PCR and RNA spike-in controls. Recently, researchers have been able to obtain whole-transcriptome gene expression information even from a single cell (Tang et al. 2010; Hashimshony et al. 2012). With the advent of various NGS techniques it has become much easier to identify splice variants (E.T. Wang et al. 2008b; Richard et al. 2010), single nucleotide polymorphisms (Lalonde et al. 2011), novel genes and transcripts (Mortazavi et al. 2008; Trapnell et al. 2010) and gene fusions (Edgren et al. 2011a; Nacu et al. 2011), opening up the opportunity to analyze allele-specific expression and changes in RNA editing in depth.
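The exon-length normalization of read counts mentioned above is commonly reported as reads per kilobase of exon per million mapped reads (RPKM; Mortazavi et al. 2008). The following is a minimal sketch of that calculation only; the gene names, read counts and library size are hypothetical values chosen for illustration and are not data from this study.

```python
def rpkm(exon_reads, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return exon_reads * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical example: two genes with equal read counts but different
# total exon lengths, sequenced in a library of 10 million mapped reads.
library_size = 10_000_000
genes = {"GENE_A": (500, 2000), "GENE_B": (500, 4000)}  # (reads, exon bp)
values = {name: rpkm(reads, length, library_size)
          for name, (reads, length) in genes.items()}
```

With equal read counts, the gene with the longer exon model receives the lower RPKM value, which is the purpose of the length normalization.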


Chapter 3

MATERIALS AND METHODS

3.1: Procurement of human muscle biopsy samples

DM1 and DM2 muscle biopsies had previously been collected by our collaborator Dr. Bjarne Udd at Tampere University, Finland. Enrollment of patients was approved by the local institutional review boards, and muscle biopsies and blood samples were collected from the enrolled persons only after informed consent had been obtained. Clinical and histopathological details for the patients recruited in this study were evaluated based on the parameters detailed in Table 3.1. Based on these findings, a set of 6 DM2 and 7 DM1 muscle biopsy samples along with 3 muscle biopsies from healthy unrelated individuals was collected. A licensed physician performed the sample collection as per the guidelines of the Neuromuscular Research Unit, Tampere University and University Hospital, Tampere, Finland. Briefly, the skin around the subject's biopsy area was shaved and cleaned as for surgical preparation to maintain sterile conditions, and local anesthesia was injected into the skin and subcutaneous tissue. Muscle biopsies were collected by inserting a percutaneous needle into the muscle to obtain a plug of approximately 2 cm, which corresponds to approximately 75-100 mg of muscle. Post-procedure care was provided to the donors, and a routine follow-up check-up was scheduled to assess wound healing at the site of sample collection. Samples were snap frozen in isopentane precooled to -160°C in liquid nitrogen; this rapid freezing minimized degradation of RNA and protein. Frozen biopsy samples were stored at -80°C until further analysis and were shipped to us at the HudsonAlpha Institute for Biotechnology, Huntsville, Alabama, by overnight delivery in a leak-proof box containing adequate dry ice.

Table 3.1: Parameters for skeletal muscle biopsy sample collection for the current study.

Name of individual
Date of birth
Gender (M/F)
Family history of DM2 (Y/N)
Age first symptoms observed by individuals
Age of clinical diagnosis of DM2
Detailed clinical symptoms
Presence of myotonia (Y/N)
Grade of muscle pain and weakness (1-10)
EMG findings
Blood creatine kinase levels
MRI findings (if any)

3.2: Procurement of muscle biopsy samples from mouse models

Muscle biopsy samples from mouse models of DM were obtained from Dr. Ralf Krahe, Professor, Department of Genetics, University of Texas MD Anderson Cancer Center, Houston, TX. One gastrocnemius muscle biopsy sample was obtained from each of the three transgenic mouse models of DM constructed in his lab (unpublished work). DM2 knock-in (DM2-KI) is a transgenic mouse model with 189 CCTG repeats inserted in intron 1 of the CNBP gene, located on chromosome 6:87843082-87851106 bp, reverse strand (Figure 3.1a). The DM2-HSA transgenic mouse had 121 CCTG repeats inserted into the first intron of the actin gene. The DM1-HSA transgenic mouse had 225 CTG repeats inserted into the 3'UTR of the actin gene. One control mouse muscle biopsy sample was processed along with these three transgenic constructs.

Figure 3.1: a. Schematic representation of the generation of the DM2 knock-in (DM2-KI) mouse model by inserting 189 (CCTG) repeats in the mouse Cellular nucleic acid-binding protein (CNBP) gene. b. Muscle histopathology of control (+/+) and DM2-KI (ki/ki) mice with fluorescence in situ hybridization (FISH); arrows point to intranuclear inclusion bodies of muscleblind-like (MBNL) proteins.

3.3: RNA extraction and its quality assessment

Muscle biopsy tissues were collected from clinically defined human DM2 (n=6) and DM1 (n=7) patients with varying degrees of clinical severity and histopathological findings, as well as from normal healthy individuals (n=3). In addition, muscle biopsy samples from three transgenic mice and one control mouse were used in the current investigation. Total RNA containing both mRNA and miRNA fractions was extracted from approximately 10 mg of each muscle tissue using the miRNeasy Kit with DNase treatment (Qiagen Inc., Valencia, California, USA) according to the manufacturer's protocol. Final elution was made in 30 µl of RNase-free sterile distilled water. The concentration and integrity of the extracted total RNA were estimated using a Qubit® 2.0 Fluorometer (Invitrogen, Carlsbad, California, USA) and an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA), respectively. RNA samples with an RNA Integrity Number (RIN) of at least 7.5 were used for further processing.

3.4: RNA-seq library preparation and sequencing

Total RNA (approximately 40 ng) from each sample was amplified using the Ovation® RNA-Seq System V2 (NuGen Technologies, San Carlos, CA) according to the manufacturer's instructions (Figure A.1). The amplified product was purified using a QIAquick PCR cleanup column, and final elution was made in 40 µl of elution buffer (EB) (Qiagen Inc., Valencia, California, USA). Final yield and quality of the amplified product were checked using a Qubit® 2.0 Fluorometer and a DNA 1000 chip on the Agilent 2100 Bioanalyzer. The double-stranded DNA amplified with the Ovation® RNA-Seq System V2 kit was then ready for the construction of RNA-Seq libraries. Approximately 2.5 µg of amplified DNA from each sample was sheared on a Covaris S200 focused-ultrasonicator (Woburn, MA, USA) in a 6°C water bath for 3 minutes at 200 cycles per burst, targeting an average fragment size of 200 bp. The sheared material was analyzed on a DNA 1000 chip on the Agilent 2100 Bioanalyzer to confirm the targeted fragmentation range. The fragmented DNA was then taken into a standard library preparation protocol using the NEBNext® DNA Library Prep Master Mix Set for Illumina® (New England BioLabs Inc., Ipswich, MA, USA) with slight modifications. Briefly, end-repair was followed by dA-tailing and custom adapter ligation. Post-ligated materials were individually barcoded with unique in-house genomics service lab (GSL)

primers and amplified through 8 cycles of PCR using KAPA HiFi HotStart Ready Mix (Kapa Biosystems, Inc., Woburn, MA, USA). Final libraries were purified using the QIAquick PCR purification kit with MinElute cleanup columns (Qiagen Inc., Valencia, California, USA). The concentration and quality of the libraries were assessed using a Qubit® 2.0 Fluorometer and a DNA 1000 chip on an Agilent 2100 Bioanalyzer, respectively.

Accurate quantification for sequencing applications was performed using the qPCR-based KAPA Biosystems Library Quantification kit (Kapa Biosystems, Inc., Woburn, MA, USA). Each library was then diluted to a final concentration of 12.5 nM and pooled in equimolar amounts prior to clustering. Cluster generation was carried out on a cBot v1.4.36.0 using the TruSeq Paired-End (PE) Cluster Kit v3.0 (Illumina, Inc., San Diego, CA, USA). Paired-end (PE) sequencing was performed using a 200 cycle TruSeq SBS HS v3 kit on an Illumina HiSeq2000 running HiSeq Control Software (HCS) v1.5.15.1 (Illumina, Inc., San Diego, CA, USA). Image analysis and base calling were performed using the standard Illumina pipeline consisting of Real Time Analysis (RTA) v1.13. Raw reads were demultiplexed using the bcl2fastq conversion software v1.8.3 (Illumina, Inc., San Diego, CA, USA) with default settings.
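Dilution of each library to 12.5 nM follows the usual relation C1 x V1 = C2 x V2. The sketch below assumes that the qPCR quantification returns the stock concentration of each library in nM; the function name, the final volume of 20 µl and the stock concentration in the example are hypothetical and were not part of the protocol described above.

```python
def dilution_volumes(stock_nM, target_nM=12.5, final_volume_ul=20.0):
    """Return (library volume, diluent volume) in microliters for C1*V1 = C2*V2."""
    if stock_nM <= 0:
        raise ValueError("stock concentration must be positive")
    if stock_nM < target_nM:
        raise ValueError("stock is already below the target concentration")
    library_ul = target_nM * final_volume_ul / stock_nM
    return library_ul, final_volume_ul - library_ul


# Hypothetical library measured at 50 nM by qPCR.
library_ul, diluent_ul = dilution_volumes(50.0)
```

Equal volumes of the diluted libraries can then be combined to obtain an equimolar pool.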

3.5: Small RNA (miRNA) library preparation and sequencing

Approximately 300 ng of total RNA from each sample was taken into the small RNA library preparation protocol using the NEBNext® Small RNA Library Prep Set for Illumina® (New England BioLabs Inc., Ipswich, MA, USA) according to the manufacturer's protocol. Briefly, 3' adapters were ligated to the total input RNA, followed by hybridization of multiplex SR RT primers and ligation of multiplex 5' SR adapters. Reverse transcription (RT) was done using SuperScript III RT (Life Technologies, Grand Island, NY, USA) for 1 hour at 50°C. Immediately after the RT reaction, indexed primers were added to uniquely barcode each sample, and PCR amplification was done for 15 cycles using LongAmp Taq 2X master mix. The post-PCR material was then purified using the QIAquick PCR purification kit (Qiagen Inc., Valencia, California, USA). Post-PCR yield and concentration of the prepared libraries were assessed using a Qubit® 2.0 Fluorometer and a DNA 1000 chip on the Agilent 2100 Bioanalyzer.

Size selection of the small RNA libraries, with a target size of 140 bp, was done on a 6% PAGE gel run for 1 hour at 120 V. Accurate quantification for sequencing applications was performed using the qPCR-based KAPA Biosystems Library Quantification kit. Each library was diluted to a final concentration of 12.5 nM and pooled in equimolar amounts prior to clustering. Cluster generation was carried out on a cBot v1.4.36.0 using Illumina's TruSeq Single Read (SR) Cluster Kit v3.0. Single-end (SE) sequencing was performed on an Illumina HiSeq2000 running HiSeq Control Software (HCS) v1.5.15.1, using a 50 cycle TruSeq SBS HS v3 reagent kit. The clustered flowcells were sequenced for 56 cycles, consisting of a 50 cycle read followed by a 6 cycle index read. Image analysis and base calling were performed using the standard Illumina pipeline consisting of Real Time Analysis (RTA) v1.13, and reads were demultiplexed using the bcl2fastq converter with default settings.

3.6: Processing of RNA-seq Reads

Post-processing of the sequencing reads from the RNA-seq experiments for each sample was performed as per our in-house pipeline (Figure 3.2). Briefly, quality control checks on the raw sequence data from each sample were performed using FastQC (Babraham Bioinformatics, Cambridge, UK). Raw reads were mapped to the human reference genome hg19/GRCh37 using TopHat v1.4 (Langmead et al. 2009; Trapnell, Pachter, and Salzberg 2009) with two mismatches allowed and otherwise default parameters. TopHat is a splice junction mapping tool for RNA-seq reads that uses the ultra-fast, high-throughput short read aligner Bowtie (Langmead et al. 2009) in the background and then analyzes the mapping results to identify splice junctions. The alignment metrics of the mapped reads were estimated using SAMtools (Li et al. 2009). Aligned reads were then imported into the commercial data analysis platform Avadis NGS (Strand Scientifics, CA, USA). After quality inspection, the aligned reads were filtered on the basis of read quality metrics: reads with a base quality score less than 30, alignment score less than 95, or mapping quality less than 40 were removed. The remaining reads were then filtered on the basis of their read statistics, removing reads with missing mates as well as translocated, unaligned and flipped reads. The read list was then filtered to remove duplicates. Samples were grouped as patients and controls, and transcript abundance was quantified on this final read list using Trimmed Mean of M-values (TMM) (Robinson and Oshlack 2010) as the normalization method. Differential expression of genes was calculated on the basis of fold change (default cut-off ≥ ±2.0) observed between the defined conditions, and the p-values of the differentially expressed genes were estimated by z-score calculations with Benjamini-Hochberg false discovery rate (FDR) correction at 0.05 (Benjamini and Hochberg 1995).
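The fold-change and multiple-testing steps above can be illustrated with a minimal sketch. This is not the Avadis NGS implementation, whose internals are not described here. The function names and example p-values are hypothetical, and the conversion from z-score to p-value assumes a two-sided test against the standard normal distribution.

```python
import math


def signed_fold_change(mean_case, mean_control):
    """Ratio if >= 1, otherwise the negative reciprocal (0.5 becomes -2.0)."""
    if mean_case <= 0 or mean_control <= 0:
        raise ValueError("expression values must be positive")
    ratio = mean_case / mean_control
    return ratio if ratio >= 1 else -1.0 / ratio


def z_to_two_sided_p(z):
    """Two-sided p-value of a z-score under the standard normal (assumption)."""
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four genes; call a gene significant when the
# adjusted p-value is at most 0.05 and the absolute fold change is at least 2.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
```

The signed representation makes the "≥ ±2.0" cut-off symmetric: a two-fold decrease is reported as -2.0 rather than 0.5.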


Figure 3.2: In-house data analysis pipeline followed for processing of RNA-seq, miRNA-seq and protein expression data from DM1 and DM2 samples


3.7: Processing of small RNA-seq Reads

Post-processing of the sequencing reads from the miRNA-seq experiments for each sample was performed as per our in-house pipeline (Figure 3.2). Briefly, quality control checks on the raw sequence data from each sample were performed using FastQC (Babraham Bioinformatics, Cambridge, UK). Raw reads were imported into the commercial data analysis platform Avadis NGS (Strand Scientifics, CA, USA). Adapter trimming (GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT) was done to remove the ligated adapter from the 3' end of the sequenced reads with only one mismatch allowed; poorly aligned 3' ends were also trimmed. Sequences shorter than 15 nucleotides were excluded from further analysis. Trimmed reads of low quality (base quality score less than 30, alignment score less than 95, mapping quality less than 40) were removed. The filtered reads were then used to extract and count the small RNAs, which were annotated with microRNAs from the miRBase release 20 database (Griffiths-Jones et al. 2008; Kozomara and Griffiths-Jones 2011).
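The adapter trimming and length filtering described above can be sketched as follows. This is an illustrative stand-in and not the Avadis NGS algorithm: the leftmost-match strategy, the minimum overlap of 8 nucleotides and the function names are assumptions, while the adapter sequence, the single allowed mismatch and the 15-nucleotide length cut-off are taken from the text. The example insert is an arbitrary 22-nucleotide sequence.

```python
ADAPTER = "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"  # adapter sequence from the text


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def trim_3prime_adapter(read, adapter=ADAPTER, max_mismatches=1, min_overlap=8):
    """Cut the read at the leftmost position where the rest of the read
    matches the start of the adapter with at most max_mismatches."""
    for start in range(len(read) - min_overlap + 1):
        window = read[start:start + len(adapter)]
        if hamming(window, adapter[:len(window)]) <= max_mismatches:
            return read[:start]
    return read  # no adapter found; read returned unchanged


def keep_read(trimmed, min_length=15):
    """Discard sequences shorter than 15 nucleotides after trimming."""
    return len(trimmed) >= min_length


# Hypothetical 42 nt read: a 22 nt insert followed by 20 nt of adapter.
insert = "TGAGGTAGTAGGTTGTATAGTT"
read = insert + ADAPTER[:20]
trimmed = trim_3prime_adapter(read)
```

Requiring a minimum overlap is one way to avoid trimming a few trailing bases that resemble the adapter by chance; the value used here is arbitrary.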

Quantification was carried out at both the gene level and the active region level. Active region quantification considers only reads whose 5' end matches the 5' end of the mature miRNA annotation. The expression values thus obtained are called 'raw counts'; candidate miRNAs with fewer than 10 counts were filtered out. Samples were then grouped as patients and controls, and the differential expression of miRNAs was calculated on the basis of their fold change (default cut-off ≥ ±1.5) observed between individual patients and the averaged control samples. The p-values of the differentially expressed miRNAs were estimated by z-score calculations with Benjamini-Hochberg FDR correction at 0.05 (Benjamini and Hochberg 1995).
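The active-region rule (a read is counted only when its 5' end coincides with the 5' end of the annotated mature miRNA) and the minimum-count filter can be sketched as below. The data layout, function names, miRNA names and coordinates are hypothetical and serve only to illustrate the rule; they do not reproduce the Avadis NGS implementation.

```python
from collections import Counter


def five_prime(strand, start, end):
    """5' coordinate of a feature: start on '+', end on '-'."""
    return start if strand == "+" else end


def active_region_counts(reads, mature_mirnas):
    """Count reads whose 5' end equals the 5' end of a mature miRNA.

    reads: iterable of (chrom, strand, start, end)
    mature_mirnas: dict name -> (chrom, strand, start, end)
    """
    index = {}
    for name, (chrom, strand, start, end) in mature_mirnas.items():
        index[(chrom, strand, five_prime(strand, start, end))] = name
    counts = Counter()
    for chrom, strand, start, end in reads:
        name = index.get((chrom, strand, five_prime(strand, start, end)))
        if name is not None:
            counts[name] += 1
    return counts


def filter_low_counts(counts, min_count=10):
    """Keep only miRNAs with at least min_count raw counts."""
    return {name: c for name, c in counts.items() if c >= min_count}


# Hypothetical annotations and reads.
mature = {"miR-A": ("chr1", "+", 100, 121), "miR-B": ("chr2", "-", 500, 521)}
reads = ([("chr1", "+", 100, 120)] * 12    # 5' end matches miR-A
         + [("chr1", "+", 101, 121)] * 5   # shifted 5' end, not counted
         + [("chr2", "-", 502, 521)] * 3)  # 5' end matches miR-B (minus strand)
counts = active_region_counts(reads, mature)
kept = filter_low_counts(counts)
```

Reads with a shifted 5' end are ignored even though they overlap the mature miRNA, which is what distinguishes active-region counts from gene-level counts.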


3.8: iTRAQ quantitation of the proteome

Protein extracts were prepared by ultrasonicating the skeletal muscle biopsies from patient and control samples in lysis buffer (50% trifluoroethanol, 50 mM HEPES). The amount of protein in the samples was determined using the bicinchoninic acid assay (BCA assay), also known as the Smith assay. 10 µg of protein was aliquoted and reduced with tris(2-carboxyethyl)phosphine; cysteines were then blocked with methyl methanethiosulfonate, and the protein was digested with trypsin (trypsin:protein, 1:25) overnight. The peptides were desalted by solid phase extraction on a reverse phase microtrap column (Michrom Bioresources Inc, Auburn, CA, USA) and resolubilized in 7 µl of 500 mM triethylammonium bicarbonate (TEAB). 85 µl of isopropyl alcohol was added to the iTRAQ reagents. 12 µl of iTRAQ reagent was added to each sample, followed by incubation with shaking for 2 hours. The samples were then pooled, frozen, lyophilized and resolubilized in buffer A (5% acetonitrile, 0.1% formic acid in HPLC grade water), and finally stored at -80°C.

The iTRAQ-labeled samples were analyzed by MudPIT essentially as described (Ibb et al. 2013). The precursor ions were analyzed in the Orbitrap® LC/MS system (Thermo Fisher Scientific, San Jose, CA, USA), followed by 4 CID fragment ion scans in the ion trap to identify the peptides and then 4 HCD fragment ion scans of the same precursors to obtain the reporter ion intensities in the Orbitrap. Raw files generated by the LC-MS/MS experiments were searched using the Sequest database search engine (Eng, McCormack, and Yates 1994) running in Proteome Discoverer v1.3 (Thermo Fisher Scientific, San Jose, CA, USA) to identify peptides. Protein assembly, reporter ion quantitation and statistical analysis were done using ProteoIQ (NuSep, Bogart, GA, USA), which combines results from mass spectrometer platforms and performs robust relative quantitation by spectral counting.

A pseudo spectral count value was calculated by ProteoIQ from the reporter ion intensities of each identified protein and was used as the quantitative measure of protein abundance. A threshold spectral count of at least 1 in all samples was set, and differentially expressed proteins between patient and control samples were identified using a fold change cut-off of ≥ ±2.0. The p-value of differential expression was estimated by t-test, with Benjamini-Hochberg false discovery rate (FDR) multiple-test correction at a cut-off of 0.05.
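The filtering rule described above (a signed fold change of at least ±2.0 together with a Benjamini-Hochberg adjusted p-value of at most 0.05) can be sketched as follows. This is a minimal illustration in Python, not the ProteoIQ implementation; the function names and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def is_differential(fold_change, adjusted_p, fc_cutoff=2.0, fdr_cutoff=0.05):
    """Signed fold change of at least +/- fc_cutoff and adjusted p <= fdr_cutoff."""
    return abs(fold_change) >= fc_cutoff and adjusted_p <= fdr_cutoff


# Hypothetical raw p-values for four proteins.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
```

Adjusting before filtering means the 0.05 cut-off applies to the expected proportion of false discoveries across all tested proteins rather than to each test separately.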

3.9: Functional enrichment analysis of differentially expressed mRNAs and miRNAs

Gene Ontology (GO) analysis was performed on the lists of differentially expressed mRNAs and miRNAs between patient and control samples, using the Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.7 (http://david.abcc.ncifcrf.gov/). DAVID is maintained by the NIH and is a web-based functional annotation tool that takes a list of unique differentially expressed genes in a condition and compares it against a background of all the genes expressed in that same condition. The FDR cut-off was set at 5%. Functional annotation enrichment analysis and functional annotation clustering were also performed to rank the biological processes, cellular components and molecular functions affected in the dataset. Ingenuity Pathway Analysis (IPA, Ingenuity® Systems, www.ingenuity.com) software was used to analyze the unique canonical pathways, biological functions and networks affected. IPA was also used to predict the upstream regulators, including transcription factors, affected in patients. Gene target analyses between differentially expressed miRNAs and mRNAs in DM1 and DM2 patients were also performed in IPA.

3.10: Validation studies for mRNA expression by Real-time quantitative PCR (RT-qPCR)

To validate the findings of the RNA-seq analysis, a set of genes was identified and their relative expression was quantified by RT-qPCR.

For data normalization, two housekeeping genes, TBP (TATA box binding protein) and HMBS (hydroxymethylbilane synthase), which represent different biological pathways and expression abundance levels, were selected. For RNA-seq data validation, a total of 150 ng of RNA from each sample was used for first-strand cDNA synthesis with SuperScript III (Invitrogen, Carlsbad, California, USA) using random primers. RT-qPCR reactions (10 μl) were performed in triplicate using KAPA SYBR® FAST qPCR Kits (Kapa Biosystems, Inc., Woburn, MA, USA) with 0.3 μM of each primer, and were run on a 7900HT Fast Real-Time PCR System (Applied Biosystems, Foster City, CA, USA). Relative expression of the genes of interest was estimated by the comparative CT method (Schmittgen and Livak 2008), also known as the $2^{-\Delta\Delta C_T}$ method [where fold change = $2^{-\Delta\Delta C_T}$]. The CT is defined as the PCR cycle at which the fluorescence signal crosses the arbitrary threshold; the numerical value of CT is inversely related to the amount of amplicon in the reaction.
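As a worked illustration of the comparative CT calculation, the sketch below computes a fold change from CT values. It assumes that the two housekeeping genes are combined by taking the arithmetic mean of their CT values, which the text does not specify; the function name and the CT values are hypothetical.

```python
from statistics import mean


def fold_change_ddct(target_ct_patient, ref_cts_patient,
                     target_ct_control, ref_cts_control):
    """Fold change by the comparative CT method: 2 ** (-ddCT).

    dCT  = CT(target) - mean CT(reference genes), computed per sample
    ddCT = dCT(patient) - dCT(control)
    Averaging the reference CTs is an assumption made for this sketch.
    """
    dct_patient = target_ct_patient - mean(ref_cts_patient)
    dct_control = target_ct_control - mean(ref_cts_control)
    ddct = dct_patient - dct_control
    return 2.0 ** (-ddct)


# Hypothetical CT values: the target crosses the threshold two cycles
# earlier in the patient than in the control after normalization.
fc = fold_change_ddct(24.0, [20.0, 22.0], 26.0, [20.0, 22.0])
```

Because each cycle corresponds to a doubling of amplicon, a ΔΔCT of -2 corresponds to a four-fold higher relative expression in the patient sample.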


Chapter 4

RESULTS

4.1: Unbiased, novel integrated comparative analysis strategy adapted for studying transcriptional, miRNA and proteomics expression layout in skeletal muscle biopsies from DM1 and DM2 human patients.

The intention of this research was to study the mRNA, small RNA (microRNA) and protein expression profiles of muscle biopsy samples from human DM1 and DM2 patients and to compare them with those of muscle biopsy samples obtained from healthy individuals living an active life. All of the muscle biopsy samples were obtained from Finland, and the three data types were compared to better understand the underlying pathophysiology of both types of myotonic dystrophy. To this end, a total of 16 individual human muscle biopsy samples (Table 4.1) were obtained (see methods for the sample collection protocol): six from clinically diagnosed DM2 patients, seven from DM1 patients and three from healthy individuals.

For the transcriptomic profiling, I sequenced mRNA libraries prepared from all 16 samples using the Illumina HiSeq 2000 platform (see methods), generating more than 1.7 billion total raw reads (Table 4.2). Likewise, to investigate changes in microRNA expression and their role in DM pathophysiology, I performed microRNA sequencing on the same 16 samples (see methods), generating more than 344 million raw reads (Table 4.3).


After quantification of mRNA and microRNA expression, together with large-scale protein expression profiling, the following analyses were performed. First, the association between DM patients and control samples in their mRNA and microRNA expression was determined using unsupervised hierarchical clustering and Principal Component Analysis (PCA). Second, differential mRNA and miRNA expression was calculated between the DM1 and DM2 patient groups and averaged control samples. Third, I investigated post-transcriptional genomic hotspots by identifying the genomic loci with the highest gene expression activity in each patient. Fourth, mRNA and miRNA expression profiles from DM1 and DM2 patients were compared with each other to determine overlapping and uniquely expressed mRNAs and miRNAs in the two patient groups; based on that, I identified biological processes, canonical pathways and transcription factors that were commonly or uniquely enriched in DM1 and DM2 patients. Fifth, miRNA expression profiles from DM1 and DM2 patients and control samples were integrated with the mRNA expression profiles from the same patients to identify genes targeted by differentially expressed miRNAs and to evaluate how miRNA expression influenced mRNA expression in these patients. Sixth, I analyzed the protein expression profiles of DM1 and DM2 patients. Finally, I compared the protein expression profiles of DM1 and DM2 with their respective mRNA expression profiles to identify characteristic elements of the underlying pathophysiology of these two diseases.

Taken together, this unbiased, novel approach of integrating mRNA, microRNA and protein expression allowed us to better understand the underlying pathophysiology of DM1 and DM2. More broadly, this approach could serve as an experimental design for in-depth investigation of other related diseases.


Table 4.1: Description of the skeletal muscle biopsy samples from myotonic dystrophy patients and control individuals used in the current investigation.

Patient Type               Internal    Presence of  Mutation  Sex     Current Age  Clinical
                           Identifier  Myotonia     verified          (years)      Symptoms

Type 2 Myotonic dystrophy  P1          YES          YES       Female  62           Mild
Type 2 Myotonic dystrophy  P2          YES          YES       Male    51           Mild
Type 2 Myotonic dystrophy  P3          YES          YES       Male    52           Mild
Type 2 Myotonic dystrophy  P4          YES          YES       Female  57           Moderate
Type 2 Myotonic dystrophy  P5          YES          YES       Female  56           Severe
Type 2 Myotonic dystrophy  P6          YES          YES       Female  80           Most Severe

Type 1 Myotonic dystrophy  P1          YES          YES       Male    48           Severe
Type 1 Myotonic dystrophy  P2          YES          YES       Male    46           Moderate
Type 1 Myotonic dystrophy  P3          YES          YES       Female  56           Moderate
Type 1 Myotonic dystrophy  P4          YES          YES       Male    66           Moderate
Type 1 Myotonic dystrophy  P5          YES          YES       Female  38           Moderate
Type 1 Myotonic dystrophy  P6          YES          YES       Female  41           Moderate
Type 1 Myotonic dystrophy  P7          YES          YES       Male    46           Moderate

Control                    C1          NO           NO        Male    64           Normal
Control                    C2          NO           NO        Male    55           Normal
Control                    C3          NO           NO        Female  50           Normal

Table 4.2: Alignment statistics overview of RNA-seq from human muscle biopsy samples.

Patient type               Internal    Total raw    Total Aligned  %          Non-spliced  Spliced
                           Identifier  reads        reads          Alignment  reads        reads

Control                    C1          72,985,966   61,992,448     84.94      50,540,818   2,628,298
Control                    C2          77,681,724   52,577,358     67.68      42,055,592   1,590,506
Control                    C3          78,349,640   57,358,906     73.21      44,893,802   1,910,732

Type 2 Myotonic dystrophy  DM2-P1      84,977,894   67,558,728     79.50      56,018,638   2,763,560
Type 2 Myotonic dystrophy  DM2-P2      75,626,326   60,416,080     79.89      48,248,252   1,727,402
Type 2 Myotonic dystrophy  DM2-P3      72,605,194   61,913,264     85.27      50,746,626   2,109,496
Type 2 Myotonic dystrophy  DM2-P4      120,404,218  93,414,336     77.58      77,378,794   3,443,416
Type 2 Myotonic dystrophy  DM2-P5      134,215,246  115,466,712    86.03      94,425,140   5,695,078
Type 2 Myotonic dystrophy  DM2-P6      117,215,380  95,802,400     81.73      77,575,340   4,723,244

Type 1 Myotonic dystrophy  DM1-P1      145,523,256  92,571,750     63.61      76,542,142   3,888,378
Type 1 Myotonic dystrophy  DM1-P2      141,214,946  101,812,280    72.10      83,774,644   3,939,194
Type 1 Myotonic dystrophy  DM1-P3      107,124,338  90,500,716     84.48      76,477,822   4,973,162
Type 1 Myotonic dystrophy  DM1-P4      154,052,900  129,457,984    84.03      106,265,660  7,068,574
Type 1 Myotonic dystrophy  DM1-P5      110,645,580  94,264,080     85.19      78,320,640   4,400,564
Type 1 Myotonic dystrophy  DM1-P6      134,443,340  112,353,884    83.57      95,085,734   5,522,464
Type 1 Myotonic dystrophy  DM1-P7      146,238,934  124,532,206    85.16      104,488,486  5,769,512

Table 4.3: Alignment statistics overview of microRNA sequencing of all human skeletal muscle biopsy samples.

Patient type               Internal    Total raw   Trimmed reads (%)   Average length  Annotated reads  % Annotated
                           identifier  reads                           after trim      (miRBase 20)     (miRBase 20)

Control                    C1          22,442,090  19,441,767 (86.63)  23              9,298,697        47.8
Control                    C2          27,456,705  23,279,336 (84.79)  24              7,592,152        32.6
Control                    C3          12,475,020  11,324,048 (90.77)  24              6,544,843        57.8

Type 2 Myotonic dystrophy  DM2-P1      19,788,483  17,215,246 (87.00)  24              9,909,282        57.8
Type 2 Myotonic dystrophy  DM2-P2      25,449,128  22,923,252 (90.07)  26              8,949,482        39
Type 2 Myotonic dystrophy  DM2-P3      20,842,102  18,572,999 (89.11)  23              10,934,574       58.9
Type 2 Myotonic dystrophy  DM2-P4      23,133,588  17,054,531 (80.02)  23              7,138,207        41.9
Type 2 Myotonic dystrophy  DM2-P5      22,932,233  20,859,609 (90.96)  26              6,887,801        33
Type 2 Myotonic dystrophy  DM2-P6      19,588,158  17,907,693 (91.42)  23              11,475,051       64.1

Type 1 Myotonic dystrophy  DM1-P1      20,066,195  19,061,439 (94.99)  26              8,253,901        43.3
Type 1 Myotonic dystrophy  DM1-P3      21,980,951  19,982,278 (90.91)  25              10,278,537       51.4
Type 1 Myotonic dystrophy  DM1-P4      17,133,444  15,882,533 (92.70)  23              11,346,876       71.4
Type 1 Myotonic dystrophy  DM1-P5      27,279,399  24,600,282 (90.18)  24              13,511,508       54.9
Type 1 Myotonic dystrophy  DM1-P6      22,711,042  20,020,853 (88.15)  25              9,016,020        45
Type 1 Myotonic dystrophy  DM1-P7      21,244,539  18,282,167 (86.06)  25              8,744,057        47.8

4.2: Transcriptomics layout of the DM and control patients as identified by RNA-seq.

Post-alignment gene expression quantification was performed using the hg19 transcript annotation. Because of its low false positive rate and low coefficient of variation, the Trimmed Mean of M-values (TMM) normalization method (Robinson and Oshlack 2010) was chosen over the Reads Per Kilobase per Million mapped reads (RPKM) method (Mortazavi et al. 2008) for quantifying expression from RNA-seq reads (Dillies et al. 2012).
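For reference, the RPKM measure that was set aside in favor of TMM is a simple per-gene scaling of read counts by gene length and sequencing depth. The sketch below shows only that standard formula; it does not implement TMM, which additionally estimates a between-sample scaling factor from trimmed log-ratios. The function name and the example numbers are hypothetical.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = reads * 1e9 / (gene length in bp * total mapped reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on a 2,000 bp transcript in a library
# of 10 million mapped reads.
value = rpkm(500, 2000, 10_000_000)
```

Because RPKM rescales each sample only by its own total read count, a few very highly expressed genes can shift the values of all other genes in that sample, which is one motivation for between-sample methods such as TMM.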

Due to high 3' sequencing bias, DM2-P2 was dropped from further analysis. After quantification of the mRNA expression levels of all 16 samples, I first examined their layout using unsupervised hierarchical clustering of control and DM patients. On the unsupervised hierarchical clustering plot, the averaged DM1 (n=7) and DM2 (n=6) patients clustered as separate entities from control (n=3) individuals (Figure 4.1a). When clustering was done at the individual patient level, without any averaging of expression data, DM1 and DM2 patients still clustered as two separate groups, with the exception of DM1 patient 1 (Figure 4.1b).

These findings were also evaluated and confirmed on a three-dimensional principal component analysis (PCA) plot (Figure 4.2), which examines the pairwise relationships among all samples and simultaneously ranks and quantifies the degree of similarity between each pair of samples. We also observed separate clustering within the DM2 patient group, with P4, P5 and P6 clustering separately from P1 and P3. These results suggested that not only were there unique differences in the collective mRNA expression levels of DM1 and DM2 patients relative to the averaged mRNA expression levels of control individuals, but, most notably, there were varying degrees of difference in mRNA expression levels between individual DM2 patients. This was attributed to the varying clinical presentations among DM2 patients: patients 4, 5 and 6 had the most severe symptoms, while patients 1, 2 and 3 experienced very mild clinical symptoms. All the DM1 patients examined in this study presented very similar clinical symptoms.

[Figure 4.1, panels a and b. Sample groups labeled in the plot: Type 1 Myotonic Dystrophy, Control, Type 2 Myotonic Dystrophy.]

Figure 4.1: Spearman correlation of mRNA expression from all DM samples compared with control. a. Averaged expression layout. b. Non-averaged expression layout.


Figure 4.2: Principal Component Analysis (PCA) plot of mRNA expression from all DM samples compared with control.

4.3: Analysis of mRNA expression reveals unique as well as overlapping differentially expressed genes (DEGs) in both DM1 and DM2 patient groups.

I analyzed mRNA expression levels in 7 muscle biopsy samples from DM1 patients, 6 from DM2 patients and 3 from control individuals (Table 4.1) using RNA-seq (see methods). Close to 940 million and 605 million raw reads were generated from the 7 DM1 and 6 DM2 patients, respectively, and approximately 230 million raw reads were generated from the 3 control muscle biopsies. After a quality check of the raw sequencing data, each sample was processed with our in-house pipeline as detailed in Figure 3.1. Gene expression was quantified using the TMM algorithm built into the Avadis NGS data analysis software. A threshold of 1.0 was set on the log2 scale for quantification purposes.

Differentially expressed genes (DEGs) between DM patients and control samples were estimated by averaging the normalized expression values of all 5 DM2 patients, 7 DM1 patients and 3 control samples within each group. A fold change cut-off of ≥ ±2.0 was set, and the p-value of DEGs was estimated by z-score calculations with Benjamini-Hochberg false discovery rate (FDR) multiple-test correction at a cut-off of 0.05. Furthermore, I also estimated differential expression of genes in individual patients to assess whether changes in gene expression correlate with the clinical severity of symptoms in the DM patients studied in this analysis.

Taken as a whole, the majority of the 20,637 protein-coding genes had low expression values (Figure 4.3a). Genes with expression values of less than 1 were removed from the differential expression calculations. We found 3,074 (11.5%) and 4,180 (15.6%) DEGs in DM1 and DM2 patients, respectively, when compared with averaged control individuals (Figure 4.3 b, c). Among these, the majority of genes were up regulated, i.e., 2,130 (69.2%) in DM1 and 3,566 (85.3%) in DM2 patients, while the remaining 944 (30.8%) genes in DM1 and 614 (14.7%) genes in DM2 patients were down regulated (Table 4.4).


Table 4.4: Differentially expressed gene (DEG) statistics across DM1 and DM2 patients as compared to control individuals, as estimated by TMM normalization of RNA-seq data. Differential expression was calculated on the basis of fold change (cut-off ≥ ±2.0); the p-value was estimated by z-score calculations with Benjamini-Hochberg FDR correction at a cut-off of 0.05.

                                         DM1 patients   DM2 patients

Overall differentially expressed genes   3074           4180
    Up regulated genes                   2130           3566
    Down regulated genes                 944            614

Common differentially expressed genes            1803

Unique differentially expressed genes    1271           2377
    Up regulated genes                   675            2072
    Down regulated genes                 596            305


Figure 4.3: The distribution of overall gene expression levels in DM patients a. Averaged expression values of 20,637 protein coding genes from each of three subsets: DM1 (n=7), DM2 (n=5) and control (n=3) of human muscle biopsy samples were plotted as integrated log2 values. The distribution is based on the number of genes falling in each log2 gene expression category. b. Volcano plot of averaged mRNA expression values from DM1 patients compared against control samples. c. Volcano plot of averaged mRNA expression values from DM2 patients compared against control samples.


I then examined the chromosome-wide genomic hotspots of differentially expressed genes in DM1 and DM2 patients. Although a greater number of DEGs was identified in DM2 patients than in DM1 patients overall, the pattern of mRNA expression followed a similar trend across both patient types (Figure 4.4a). Genes located on chromosomes 1, 6, 11, 17 and 19 were differentially expressed in greater numbers in both DM1 and DM2 patients.

DM1 is caused by (CTG)n repeat expansion in the 3'UTR of the dystrophia myotonica-protein kinase (DMPK) gene, and DM2 is caused by (CCTG)n repeat expansion in the first intron of the cellular nucleic acid-binding protein (CNBP) gene. These repeat expansions sequester various nuclear proteins such as muscleblind-like splicing regulator (MBNL) and two members of the CUGBP, Elav-like family (CELF) of RNA binding proteins: CUG-binding protein 1 (CUGBP1) and embryonically lethal abnormal vision-type RNA binding protein 3 (ETR-3). We checked whether any of these genes were differentially expressed in either DM1 or DM2 patients; in the RNA-seq data of averaged DM1 and DM2 patients, we observed no differential expression of the DMPK, CNBP, MBNL1, MBNL2, MBNL3, CUGBP1 and ETR3 genes (Figure 4.4b).

To identify the uniqueness of gene expression between DM1 and DM2 patients, we checked the overlap of differentially expressed genes in the two patient groups using a Venn diagram. Overlaying the respective gene expression data from DM1 and DM2 patients revealed 1803 DEGs common to both DM1 and DM2 patients; I also observed 1271 and 2377 DEGs unique to DM1 and DM2 patients, respectively (Figure 4.4c).
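The Venn partition described above amounts to three set operations on the two DEG lists. The following minimal sketch illustrates the idea; the function name and the placeholder gene identifiers are hypothetical and are not taken from the dataset.

```python
def partition_degs(degs_dm1, degs_dm2):
    """Split two DEG lists into common, DM1-only and DM2-only sets."""
    dm1, dm2 = set(degs_dm1), set(degs_dm2)
    return {
        "common": dm1 & dm2,       # differentially expressed in both groups
        "unique_dm1": dm1 - dm2,   # differentially expressed only in DM1
        "unique_dm2": dm2 - dm1,   # differentially expressed only in DM2
    }


# Placeholder identifiers, for illustration only.
parts = partition_degs(["GENE_A", "GENE_B", "GENE_C"],
                       ["GENE_B", "GENE_C", "GENE_D", "GENE_E"])

# Consistency check on the reported counts: common plus unique DEGs
# should reproduce the per-group totals given in Table 4.4.
common, unique_dm1, unique_dm2 = 1803, 1271, 2377
totals = (common + unique_dm1, common + unique_dm2)
```

With the reported counts, the two sums come to 3074 and 4180, matching the overall DEG totals for DM1 and DM2 patients.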


Figure 4.4: a. Chromosome-wide genomic hotspots of differentially expressed genes (DEGs) in DM1 and DM2 patients. The number of DEGs (x-axis) is plotted against chromosomes (y-axis). b. Absolute expression, as log2 expression (y-axis), of the DMPK, CNBP, MBNL1, MBNL2, MBNL3, CUGBP1 and ETR3 genes implicated in the molecular pathology of DM1 and DM2. c. Venn diagram of overlapping differentially expressed genes between DM2 (brown) and DM1 (blue) human patient skeletal muscle biopsy samples. Enriched canonical pathways identified in overlapping as well as unique DEGs from DM1 and DM2 patients are also indicated.


I then examined the enriched canonical pathways among the DEGs of DM1 and DM2 patients individually using Ingenuity Pathway Analysis (IPA) software. IPA uses a right-tailed Fisher's exact test to rank the highly enriched canonical pathways in a given gene expression dataset. I found distinct canonical pathways to be affected in DM1 and DM2 patients. As shown in Figure 4.5a, I identified specific negative enrichment (red color) of eukaryotic translation initiation factor 2 (EIF2) signaling, mitochondrial dysfunction, mammalian target of rapamycin (mTOR) signaling, regulation of eukaryotic translation initiation factor 4 (EIF4) and p70S6K, and calcium signaling among DM1 patients. All of these pathways are critically important for organismal growth and development and are required for normal cellular function and maintenance. Both EIF2 and EIF4 play critical roles in translational regulation, and mTOR regulates cytoskeletal dynamics. Aberrant mTOR signaling is involved in many diseases, including cancer, cardiovascular disease, and metabolic and neuromuscular disorders. Negative enrichment of these three critically important signaling pathways, along with negative regulation of calcium signaling, suggests down regulation of protein synthesis together with defective skeletal muscle structure and physiology in DM1 patients.
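A right-tailed Fisher's exact test for pathway enrichment asks how likely it is to see at least the observed number of pathway genes among the DEGs by chance. The sketch below computes that tail probability from the hypergeometric distribution. It illustrates the general test only; the reference gene set and other internals used by IPA are not described here, and the function name and example counts are hypothetical.

```python
from math import comb


def right_tailed_fisher(overlap, n_degs, n_pathway, n_background):
    """P(X >= overlap) under the hypergeometric null.

    n_background: genes in the reference set
    n_pathway:    reference genes annotated to the pathway
    n_degs:       differentially expressed genes drawn from the reference set
    overlap:      DEGs that are annotated to the pathway
    """
    upper = min(n_degs, n_pathway)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_pathway, i) * comb(n_background - n_pathway, n_degs - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 10 background genes, 4 in the pathway, 3 DEGs, 2 of which
# fall in the pathway.
p = right_tailed_fisher(overlap=2, n_degs=3, n_pathway=4, n_background=10)
```

Only the upper tail is summed because the question of interest is over-representation of pathway genes among the DEGs; smaller p-values rank a pathway as more strongly enriched.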

In DM2 patients, on the other hand (Figure 4.5b), I observed positive enrichment of Rho GTPases along with axonal guidance and several G protein (Gαq, Gα12/13) signaling pathways. Rho GTPase proteins are found in all eukaryotes and belong to the Ras-like GTPases, which have been shown to regulate actin filament activity along with various other cellular functions such as gene expression and cellular proliferation. Rho family proteins and G protein 12/13 induce massive actin cytoskeletal remodeling events in cells. This was validated by the histopathological findings previously obtained on the same muscle biopsy samples in Dr. Bjarne Udd's lab in Finland. They found numerous highly atrophic myosin heavy chain (MyHC) neonatal and MyHC fast IIA reactive fibers, with an increased number of internalized nuclei and nuclear clump fibers (Table 4.1). They also observed some fiber type grouping in the muscle biopsy samples used for analysis in this study.

I also investigated the protein classes predicted to be affected by the DEGs in DM1 and DM2 patients using the Panther gene list classification system (Mi, Muruganujan, and Thomas 2013). The trend of affected protein classes was quite similar between DM1 and DM2 patients, and on closer inspection I found that proteins regulating cell adhesion, the cytoskeleton, signaling, transcription factors and nucleic acid binding were highly enriched in both patient types (Figure 4.6).


Figure 4.5: Ranked enriched canonical pathways in a. DM1 patients and b. DM2 patients, identified using Ingenuity Pathway Analysis (IPA) software. Red indicates down regulated genes and green indicates up regulated genes affecting a pathway. The y-axis shows the percentage of genes found to be differentially regulated in the predicted pathways (x-axis). The number on top of each horizontal bar represents the total number of genes known to affect that particular pathway.

Figure 4.6: Prediction of affected protein classes (y-axis) from the differentially expressed genes (x-axis) in DM1 and DM2 patients as identified by the Panther gene list classification system. The Panther protein class ID is given in parentheses.


4.4: Analysis of overlapping differentially expressed genes across DM1 and DM2 patients.

I identified 1803 differentially expressed genes overlapping between DM1 and DM2 patients (Figure 4.4c) and sorted these common DEGs into separate groups of up regulated and down regulated genes. The top 10 common up regulated and down regulated genes in DM1 and DM2 patients are shown in Table 4.5. I first looked for enriched cluster groups in both the up and down regulated lists using DAVID functional annotation analysis. As summarized in Table 4.6, muscular contraction, cellular respiration, ionic homeostasis, calcium binding, actin filament-based processes such as stress fibers, and regulation of neurological system processes were the most significantly enriched processes among the common down regulated genes in DM1 and DM2 patients. In contrast, cell membrane, cell junction, biological adhesion, cell-cell signaling, neurological system process, ionic channel activity and cadherins were the most significantly enriched among the common up regulated genes in DM1 and DM2 patients (Table 4.7).

4.4.1: Overlapping down regulated gene set from DM1 and DM2 patients identifies unique genes affecting decrease in muscular contraction and muscle mass.

Patients with DM1 and DM2 clinically experience a decrease in skeletal muscle contractility and muscle mass, together with increased muscular fatigue. Ingenuity Pathway Analysis (IPA) software was used to inspect the list of common DEGs in DM1 and DM2 patients to identify the top biological pathways affected (Table 4.8).


Table 4.5: Top 10 common up & down regulated genes in both DM2 and DM1 patients.

Gene Symbol  Description                                               FC DM2    FC DM1

YTHDC1       YTH domain containing 1                                   60.20     1069.73
DDX39B       DEAD (Asp-Glu-Ala-Asp) box polypeptide 39B                91.69     1002.04
CD99         CD99 molecule                                             69.43     834.35
PPP1R10      Protein phosphatase 1, regulatory subunit 10              58.94     824.25
TUBB         Tubulin, beta class I                                     118.67    797.46
FLOT1        Flotillin 1                                               66.63     788.10
BAG6         BCL2-associated athanogene 6                              80.86     773.18
VPS52        Vacuolar protein sorting 52 homolog                       46.68     586.21
HSPA1A       Heat shock 70kDa protein 1A                               45.08     502.97
ZNF84        Zinc finger protein 84                                    37.78     497.06

EEF1G        Eukaryotic translation elongation factor 1 gamma          -154.41   -183.54
ZNF550       Zinc finger protein 550                                   -31.39    -18.67
EIF3CL       Eukaryotic translation initiation factor 3, subunit C-like  -18.49  -40.93
FAM166B      Family with sequence similarity 166, member B             -13.48    -5.52
TNNC2        Troponin C type 2 (fast)                                  -11.03    -5.69
IGHM         Immunoglobulin heavy constant mu                          -10.18    -9.94
PLA2G2A      Phospholipase A2, group IIA                               -9.27     -13.18
FAM57B       Family with sequence similarity 57, member B              -9.05     -6.30
CRIP1        Cysteine-rich protein 1                                   -9.04     -11.03
MYH1         Myosin, heavy chain 1,                                    -8.70     -20.32

FC = Fold change


Table 4.6: Top 10 enriched cluster results from common down regulated genes in both DM1 and DM2 patients as identified by DAVID functional annotation clustering analysis.

Cluster terms                                    Enrichment Score   p-value     Fold Enrichment

Muscle system                                    5.04
    Skeletal muscle contraction                                     3.50E-07    35.98
    Troponin complex                                                1.46E-04    35.31
Respiratory chain                                3.64
    Electron transfer                                               3.67E-06    12.21
    Oxidative phosphorylation                                       8.57E-05    13.09
Myosin                                           1.94
    Thick filament                                                  2.26E-02    12.72
    Myosin filament                                                 2.88E-02    11.15
Muscle Contraction                               1.84
    Regulation of muscle contraction                                5.95E-05    8.00
    Regulation of striated muscle contraction                       3.30E-03    13.08
Ionic homeostasis                                1.54
    Cellular calcium ion homeostasis                                3.93E-03    3.54
    Cation homeostasis                                              1.80E-02    2.52
Calcium binding                                  1.45
    Calcium ion binding                                             5.05E-02    1.54
    Calcium-binding EF-hand                                         1.29E-02    3.62
Ribosomal unit                                   1.40
    Cytosolic ribosome                                              1.41E-04    6.97
    Translational elongation                                        2.76E-03    4.99
Oxygen binding                                   1.38
    Electron transfer                                               3.67E-06    12.21
    Heme binding                                                    3.31E-02    3.35
Antigen Binding                                  1.33
    Immunoglobulin v-set, subgroup                                  4.00E-03    5.69
    Immunoglobulin c region                                         5.75E-03    25.44
Cell signaling                                   1.31
    Synaptic transmission                                           2.28E-02    2.41
    Transmission of nerve impulse                                   5.42E-02    2.06


Table 4.7: Top 10 enriched cluster results from common up regulated genes in both DM1 and DM2 patients as identified by DAVID functional annotation clustering analysis.

Cluster term                                     Enrichment Score   p-value     Fold enrichment

Cell membrane                                    19.42
    Glycoprotein                                                    2.69E-31    1.61
    Glycosylation site:N-linked                                     3.76E-31    1.62
Extra cellular region                            18.62
    Signal peptide                                                  2.47E-21    1.59
    Secreted                                                        1.04E-08    1.53
Plasma membrane part                             10.56
    integral to plasma membrane                                     3.92E-10    1.66
Cell junction                                    7.85
    Synapse                                                         7.12E-11    3.06
    Postsynaptic membrane                                           2.42E-09    3.30
Biological adhesion                              7.82
    Cell adhesion                                                   5.34E-09    2.21
    Cell-cell adhesion                                              8.79E-06    2.12
Neurological system process                      6.32
    Transmission of nerve impulse                                   1.83E-07    2.16
    Cell-cell signaling                                             5.02E-05    1.64
Neurotransmitter receptor                        4.94
    Ligand-gated channel activity                                   2.18E-09    3.49
    Gamma-aminobutyric acid A receptor                              1.26E-05    6.23
    Acetylcholine receptor                                          4.43E-05    4.42
EGF-like domain                                  4.31
    EGF-like region, conserved site                                 2.68E-06    2.15
    EGF-like 3                                                      8.41E-06    3.54
Ionic channel                                    4.00
    Ion channel activity                                            3.02E-09    2.26
    Cation channel activity                                         3.91E-05    2.03
    Potassium channel                                               1.53E-02    2.28
Cadherin                                         3.78
    Cadherin cytoplasmic region                                     2.05E-05    6.79
    Homophilic cell adhesion                                        3.65E-05    2.61


Table 4.8: Top biological functions predicted to be affected in both DM1 and DM2 patients from overlapping down regulated genes as identified by IPA analysis.

Diseases or Functions Annotation         p-value     Activation z-score   Predicted Activation State

Contractility of muscle                  1.64E-05    -2.69                Decreased
Contractility of skeletal muscle         1.11E-08    -2.59                Decreased
Adhesion of blood cells                  2.52E-02    -2.39                Decreased
Metabolism of nucleotide                 7.02E-03    -2.36                Decreased
Metabolism of nucleic acid component     3.18E-03    -2.36                Decreased
Aggregation of blood platelets           2.31E-02    -2.19                Decreased
Hypersensitive reaction                  1.47E-02    -2.18                Decreased
Quantity of neurons                      1.88E-02    -2.18                Decreased
Mass of muscle                           6.77E-03    -2.17                Decreased
Mass of skeletal muscle                  1.83E-02    -2.00                Decreased
Quantity of hdl cholesterol in blood     7.47E-03    -2.00                Decreased

Quantity of creatinine in blood          1.05E-03    2.00                 Increased
Fatigue of muscle                        8.28E-06    2.00                 Increased
Movement Disorders                       5.75E-05    2.42                 Increased
Bacterial Infection                      1.24E-03    2.58                 Increased
Heart Disease                            3.01E-03    3.10                 Increased


Elements of skeletal and muscular system development and function, organ morphology, cell-to-cell signaling and interaction, hematological system development and function, tissue development and cellular movement were predicted to be predominantly decreased, while elements of cardiac disease, skeletal and muscular disorders, fatigue of muscles, developmental disorders and neurological diseases were increased in the common down regulated gene sets from both patient groups. Upon further investigation I identified a set of muscle-specific genes that are commonly down regulated in both DM1 and DM2 patients. These include: myosin heavy chain 1 (MYH1), creatine kinase (CKM), cyclin-dependent kinase inhibitor (CDKN1C), calcium-transporting ATPase sarcoplasmic reticulum type, fast twitch skeletal muscle isoform (ATP2A1), tripartite motif-containing 72 (TRIM72), potassium inwardly-rectifying channel, subfamily J, member 11 (KCNJ11), nitric oxide synthase 1 (NOS1), parvalbumin (PVALB), insulin receptor substrate 1 (IRS1), cytochrome-c-oxidase subunit VIIa polypeptide 1, muscle (COX7A1), myoglobin (MB) and adenylate kinase 1 (AK1). These genes play a crucial role in muscle contraction and in maintaining muscle mass, specifically in skeletal muscle. As shown in Figure 4.7, down regulation of these 13 genes is collectively responsible for the predicted decrease in skeletal muscle contraction and muscle mass in DM patients. Closer investigation revealed that changes in the expression of these 13 genes correlate with the clinical severity of DM patients. I propose to use these genes as biomarkers for grading muscle weakness and severity in DM patients.


Figure 4.7: Predicted inhibition of skeletal muscle contractility and mass of muscles from the 13 uniquely common down regulated genes in DM patients. Red fill indicates down regulated genes and biological processes. For figure legend see Appendix.

4.4.2: Common up regulated gene set in DM1 and DM2 patients identifies unique genes affecting increased microtubule dynamics.

From the overlapping up regulated gene sets of DM1 and DM2 patients, I identified characteristic genes involved in the regulation of microtubule dynamics in both DM patient groups. Microtubules are integral components of cilia and flagella and are also essential for regulating molecular transport, maintaining cytoskeletal structure and reorganization, and cell division. Elements of cellular signaling, development, growth and proliferation, and cellular assembly and organization were identified to be increased in DM patients, while elements of neurological disease, nervous system development and function, and cellular and organ morphology were decreased.

I identified a set of muscle-specific genes that are commonly up regulated in both DM1 and DM2 patients. On closer inspection, I found that both the formation and the disassembly of microtubules are highly elevated in DM patients. These genes play a crucial role in maintaining overall microtubule dynamics. As shown in Figure 4.8, up regulation of these genes is collectively responsible for the predicted increase in microtubule dynamics in DM patients. Further investigation revealed that changes in the expression of these 14 uniquely up regulated genes correlate directly with the clinical severity of DM patients. I propose to use these genes as biomarkers for grading disease severity in DM patients.


Figure 4.8: Predicted activation of a. formation and b. disassembly of microtubules identified from the overlapping up regulated genes sets between DM1 and DM2 patients. Green fill indicates up regulated genes and biological process. For figure legend see Appendix D.

4.4.3: Identification of enriched canonical pathways in common DEGs in DM1 and DM2 patients.

I performed enrichment analysis to rank canonical pathways from both the common up and down regulated DEGs between DM1 and DM2 patients using 4 major pathway enrichment analysis tools: Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa et al. 2004), Panther classification (Mi, Muruganujan, and Thomas 2013), Reactome (Croft et al. 2013; D’Eustachio 2013) and Ingenuity Pathway Analysis (IPA, Ingenuity Systems).

As summarized in Table 4.9, the major canonical pathways predicted to be enriched among the overlapping down regulated genes in DM1 and DM2 patients notably include processes involved in muscular contraction and actin cytoskeletal reorganization, such as calcium signaling, oxidative phosphorylation, Rho GTPase, RhoGDI, gluconeogenesis and diabetes pathways. Table 4.10 gives the detailed list of predicted enriched canonical pathways from the common up regulated DEGs between DM1 and DM2 patients; these include neuroactive ligand-receptor interaction, Type I diabetes mellitus, glutamate receptor signaling, axonal guidance, cadherin and epidermal growth factor receptor signaling pathways.


Table 4.9: Enriched canonical pathways in common down regulated genes in both DM1 and DM2 patients as identified by Kyoto Encyclopedia of Genes and Genomes (KEGG), Panther classification, Reactome and Ingenuity Pathway Analysis (IPA, Ingenuity Systems) software.

Annotation       Enriched Pathways                                           p-value
KEGG             Oxidative phosphorylation                                   9.40E-05
KEGG             Cardiac muscle contraction                                  1.70E-04
KEGG             Ribosome                                                    2.70E-01
KEGG             Calcium signaling pathway                                   4.20E-01
Panther pathway  Cytoskeleton regulation by Rho GTPase                       2.70E-02
Panther pathway  Inflammation mediated by chemokine and cytokine signaling   9.80E-02
Reactome         Integration of energy metabolism                            2.30E-06
Reactome         Diabetes pathways                                           3.10E-05
Reactome         3'-UTR-mediated translational regulation                    1.80E-04
IPA              Mitochondrial Dysfunction                                   1.43E-09
IPA              Calcium Signaling                                           1.55E-09
IPA              Cellular Effects of Sildenafil (Viagra)                     1.41E-03
IPA              Agranulocyte Adhesion and Diapedesis                        2.69E-03
IPA              nNOS Signaling in Neurons                                   3.09E-03
IPA              Gluconeogenesis I                                           3.47E-03
IPA              Hematopoiesis from Pluripotent Stem Cells                   4.17E-03
IPA              Dopamine-DARPP32 Feedback in cAMP Signaling                 5.25E-03
IPA              Calcium Transport I                                         6.92E-03
IPA              RhoGDI Signaling                                            7.24E-03


Table 4.10: Enriched canonical pathways in common up regulated genes in both DM1 and DM2 patients as identified by Kyoto Encyclopedia of Genes and Genomes (KEGG), Panther classification, Reactome and Ingenuity Pathway Analysis (IPA, Ingenuity Systems) software.

Annotation       Enriched Pathways                                           p-value
KEGG             Neuroactive ligand-receptor interaction                     1.70E-08
KEGG             Cell adhesion molecules (CAMs)                              1.10E-07
KEGG             Type I diabetes mellitus                                    9.40E-04
KEGG             Autoimmune thyroid disease                                  1.30E-03
KEGG             Axon guidance                                               3.40E-02
KEGG             ErbB signaling pathway                                      6.40E-02
KEGG             Glycosphingolipid biosynthesis                              8.50E-02
Panther          Ionotropic glutamate receptor pathway                       3.50E-04
Panther          Cadherin signaling pathway                                  2.40E-02
Reactome         Synaptic Transmission                                       5.30E-04
Reactome         Signaling by GPCR                                           1.20E-02
Reactome         Hormone biosynthesis                                        1.70E-02
IPA              Antigen Presentation Pathway                                2.51E-16
IPA              Graft-versus-Host Disease Signaling                         9.33E-08
IPA              Autoimmune Thyroid Disease Signaling                        1.74E-07
IPA              Allograft Rejection Signaling                               3.63E-06
IPA              Glutamate Receptor Signaling                                1.07E-05
IPA              OX40 Signaling Pathway                                      2.04E-05
IPA              Type I Diabetes Mellitus Signaling                          4.37E-05
IPA              B Cell Development                                          1.10E-04
IPA              Cytotoxic T Lymphocyte-mediated Apoptosis of Target Cells   1.35E-04
IPA              Crosstalk between Dendritic Cells and Natural Killer Cells  1.55E-04


4.4.4: Transcriptional factors predicted to be affected by the common DEGs in DM1 and DM2 patients.

Transcription factors (TFs) are proteins that bind DNA, often in complex with other proteins, and regulate the expression of their target genes when they are activated or inhibited. Different TFs are implicated in the pathophysiology of various diseases such as cancers, diabetes, and cardiac, neuromuscular and autoimmune disorders. From the RNA-seq data I identified unique TFs predicted to be either activated or inhibited in both DM1 and DM2 patients (Table 4.11).

From the common up regulated DEGs between DM1 and DM2 patients I identified three notable TFs predicted to be activated in both patient types. Nuclear factor κB (NF-κB) primarily regulates the immune response to inflammation and activates cell survival responses (Figure 4.9). Hypoxia-inducible factor-1 alpha (HIF1A) is a key TF responsible for the expression of genes that facilitate adaptation and survival of cells in distress (Figure 4.10). β-catenin (CTNNB1) acts through the canonical WNT signaling pathway and plays an important role in the induction of myogenesis (Figure 4.11); the downstream effects of CTNNB1 are also predicted to contribute to the increased microtubule dynamics in DM patients described in the preceding section.

Notable TFs predicted to be inhibited in both DM1 and DM2 patients include pleiomorphic adenoma gene 1 (PLAG1); SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 4 (SMARCA4); peroxisome proliferator-activated receptor gamma, coactivator 1 alpha (PPARGC1A); and peroxisome proliferator-activated receptor gamma (PPARG). As shown in Figure 4.12, their inhibition greatly influences the expression of notable genes such as troponin T type 3 (TNNT3), sarcoplasmic/endoplasmic reticulum calcium ATPase (ATP2A1/SERCA1), actin alpha 1 (ACTA1), tropomyosin 1 and 2 (TPM1, TPM2), myosin light chain, phosphorylatable, fast skeletal muscle (MYLPF), troponin C type 2 (TNNC2) and insulin receptor substrate 1 (IRS1). The down regulation of these genes, as validated by our RNA-seq experiments, is predicted to lead to increased insulin resistance, decreased muscle contraction, increased myopathy and increased hypertrophy of the left ventricle in both DM1 and DM2 patients.

From the common DEGs between DM1 and DM2 patients I also observed the predicted inhibition of the DNA-binding v-myc myelocytomatosis viral related oncogene, neuroblastoma derived (MYCN), which is a member of the c-Myc/N-myc family of TFs (Figure 4.13). These proteins regulate the expression of various genes required for normal growth and differentiation and are a critical component of the mitogen-activated protein kinase (MAPK) signaling pathway. Misregulation of Myc is implicated in cell cycle disruption and in several diseases, including various types of cancer.


Table 4.11: Transcriptional factors predicted to be activated or inhibited from the overlapping up and down regulated genes from DM1 and DM2 patients as identified by IPA analysis.

Transcriptional factor  Molecule type            Predicted state  z-score  p-value

From up regulated genes
NFkB (complex)          complex                  Activated        5.23     4.86E-04
CTNNB1                  transcription regulator  Activated        4.08     1.15E-02
HIF1A                   transcription regulator  Activated        3.78     4.81E-02
RELA                    transcription regulator  Activated        3.78     1.59E-02
SP1                     transcription regulator  Activated        3.77     1.26E-02
FOXO1                   transcription regulator  Activated        3.35     6.86E-03
ETS1                    transcription regulator  Activated        3.22     4.93E-02
CREB1                   transcription regulator  Activated        3.04     1.33E-02
CREBBP                  transcription regulator  Activated        2.89     1.77E-03
SMAD4                   transcription regulator  Activated        2.77     5.57E-03
PML                     transcription regulator  Activated        2.76     1.18E-03
KAT5                    transcription regulator  Activated        2.70     2.19E-03
ASCL1                   transcription regulator  Activated        2.62     4.66E-02
SOX9                    transcription regulator  Activated        2.61     2.09E-02
NLRC5                   transcription regulator  Activated        2.60     9.49E-05
CEBPA                   transcription regulator  Activated        2.59     4.19E-02
NR5A1                   nuclear receptor         Activated        2.54     4.85E-02
E2F1                    transcription regulator  Activated        2.45     3.90E-02
TFEB                    transcription regulator  Activated        2.42     1.96E-03
NR3C2                   nuclear receptor         Activated        2.41     5.03E-01
CIITA                   transcription regulator  Activated        2.41     1.45E-03
TLX3                    transcription regulator  Activated        2.21     5.37E-03
KAT2B                   transcription regulator  Activated        2.16     1.51E-02
EGR1                    transcription regulator  Activated        2.15     2.93E-02
TFDP1                   transcription regulator  Activated        2.00     2.03E-02
Ncoa-Nr1i2-Rxra         complex                  Activated        2.00     3.40E-02
REST                    transcription regulator  Inhibited        -3.52    1.71E-08
ZNF217                  transcription regulator  Inhibited        -2.33    7.80E-03
HDAC2                   transcription regulator  Inhibited        -2.57    3.03E-02

From down regulated genes
PLAG1                   transcription regulator  Inhibited        -2.43    8.13E-05
SMARCA4                 transcription regulator  Inhibited        -2.81    1.50E-04
PPARGC1A                transcription regulator  Inhibited        -2.17    1.61E-02
PPARG                   nuclear receptor         Inhibited        -2.09    3.12E-02
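The z-score column of Table 4.11 can be read with the help of a simplified activation z-score. The sketch below assumes the unweighted form, in which each target gene contributes +1 if its observed change is consistent with an activated regulator and -1 if it is consistent with an inhibited one, and it assumes a cut-off of |z| ≥ 2 for calling a state, which agrees with the values listed in the table. The function names are hypothetical, and the IPA software itself may apply weights and bias corrections that are not reproduced here.

```python
from math import sqrt


def activation_z_score(signs):
    """Simplified activation z-score.

    signs: one entry per target gene, +1 if the observed expression change
    agrees with an activated regulator, -1 if it agrees with an inhibited one.
    """
    n = len(signs)
    if n == 0:
        raise ValueError("at least one target gene is required")
    return sum(signs) / sqrt(n)


def call_state(z, threshold=2.0):
    """Translate a z-score into a predicted state (threshold is an assumption)."""
    if z >= threshold:
        return "Activated"
    if z <= -threshold:
        return "Inhibited"
    return "No prediction"


# Hypothetical regulator with 16 targets, 14 consistent with activation.
z = activation_z_score([+1] * 14 + [-1] * 2)
print(round(z, 2), call_state(z))
```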

Figure 4.9: Increased activity of nuclear factor κB (NF-κB) transcription factor in skeletal muscles of DM patients, resulting in the experimentally observed up regulation of genes as shown below. The up regulated genes are hallmarks for development of neuromuscular diseases in DM patients. Green fill indicates up regulated genes and biological process. For figure legend see Appendix D.

Figure 4.10: Increased activity of hypoxia-inducible factor-1 A (HIF1A) transcription factor in skeletal muscles of DM patients, resulting in the experimentally observed downstream up regulation of genes regulating stress response and organization of cytoskeleton. Green fill indicates up regulated genes and biological process. Red fill indicates down regulated genes. For figure legend see Appendix D.

Figure 4.11: Increased activity of catenin (cadherin-associated protein), beta 1 (CTNNB1) transcription factor in skeletal muscles of DM patients, resulting in the experimentally observed up regulation of genes regulating microtubule dynamics. Green fill indicates up regulated genes and biological process. For figure legend see Appendix D.

Figure 4.12: Collective down regulation of pleiomorphic adenoma gene 1 (PLAG1), SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 4 (SMARCA4), peroxisome proliferator-activated receptor gamma, coactivator 1 alpha (PPARGC1A) and peroxisome proliferator-activated receptor gamma (PPARG) transcriptional factors (TFs) from the common down regulated DEGs and their affected biological functions in DM patients. Green fill indicates up regulated genes and biological process. Red fill indicates down regulated genes. For figure legend see Appendix D.

Figure 4.13: Inhibition of v-myc myelocytomatosis viral related oncogene, neuroblastoma derived (MYCN) transcriptional factors (TFs) from the overlapping DEGs in DM patients. Green fill indicates up regulated genes and biological process. Red fill indicates down regulated genes. For figure legend see Appendix D.


4.5: Analysis of uniquely differentially expressed genes in DM1 and DM2 patients reveals considerable insight into the unique gene expression attributes in these two patient groups.

I identified 1271 and 2377 genes to be uniquely differentially expressed in DM1 and DM2 patients, respectively (Figure 4.4c). The top 10 common up regulated and down regulated genes in both DM1 and DM2 patients are shown in Tables A.1 and A.2, respectively. I separately analyzed the uniquely expressed genes in DM1 and DM2 patients to identify distinctive molecular signatures in these patients. Using IPA I found unique canonical pathways enriched in each of the DM1 and DM2 patient groups.

Major cytoskeleton signaling pathways were found to be significantly enriched among the unique DEGs from DM2 patients; these include signaling by Rho family GTPases, axonal guidance, RhoA signaling, integrin and Ephrin B signaling pathways. By contrast, among the unique DEGs in DM1 patients I found enrichment of signaling pathways that regulate protein synthesis, cellular growth and proliferation, endocrine system function and mitochondrial function. These include eukaryotic initiation factor-2 (EIF2) signaling, mTOR signaling and mitochondrial dysfunction (Table 4.12). Analysis of the affected biological networks from the unique DEGs from DM1 and DM2 patients (Table 4.13) ranked cell-to-cell signaling and interaction, nervous system development and function as the top affected biological network in DM1 patients, whereas connective tissue disorders, developmental disorder, hereditary disorder was ranked as the top affected biological network in DM2 patients.


Table 4.12: Top 10 enriched canonical pathways from uniquely expressed genes in DM1 and DM2 patients as identified by Ingenuity pathway analysis.

Patient type  Ingenuity Canonical Pathways                   p-value
DM1           EIF2 Signaling                                 5.01E-14
DM1           Retinoate Biosynthesis I                       2.75E-05
DM1           mTOR Signaling                                 1.78E-04
DM1           Bile Acid Biosynthesis, Neutral Pathway        3.55E-04
DM1           Methylglyoxal Degradation III                  1.07E-03
DM1           Mitochondrial Dysfunction                      1.23E-03
DM1           Ethanol Degradation II                         1.38E-03
DM1           Regulation of eIF4 and p70S6K Signaling        2.00E-03
DM1           Noradrenaline and Adrenaline Degradation       2.34E-03
DM1           Serotonin Degradation                          4.37E-03
Remark (DM1): Enrichment of signaling pathways that regulate protein synthesis, cellular growth and proliferation, endocrine system and mitochondrial function.

DM2           Signaling by Rho Family GTPases                5.89E-06
DM2           Glycogen Degradation II                        1.95E-04
DM2           Breast Cancer Regulation by Stathmin1          2.19E-04
DM2           Cell Cycle Control of Chromosomal Replication  3.55E-04
DM2           Axonal Guidance Signaling                      3.63E-04
DM2           RhoA Signaling                                 4.17E-04
DM2           Reelin Signaling in Neurons                    4.27E-04
DM2           RhoGDI Signaling                               5.13E-04
DM2           Ephrin B Signaling                             5.25E-04
DM2           Remodeling of Epithelial Adherens Junctions    5.25E-04
Remark (DM2): Enrichment of major cytoskeleton signaling, cell-cell junction signaling pathways.


Table 4.13: Top 10 ranked enriched biological networks from unique DEGs in DM1 and DM2 patients as compared to control individuals as identified by IPA.

Patient type: DM1. Enriched biological networks, in rank order:
1. Cell-To-Cell Signaling and Interaction, Nervous System Development and Function, Cell Signaling
2. Molecular Transport, Carbohydrate Metabolism, Small Molecule Biochemistry
3. Cellular Growth and Proliferation, Embryonic Development, Connective Tissue Development and Function
4. Gene Expression, Protein Synthesis, RNA Post-Transcriptional Modification
5. Renal Damage, Renal Tubule Injury, Inflammatory Response
6. Cardiovascular System Development and Function, Connective Tissue Development and Function, Embryonic Development
7. Drug Metabolism, Glutathione Depletion In Liver, Small Molecule Biochemistry
8. Cellular Movement, Embryonic Development, Organ Development
9. Developmental Disorder, Hematological Disease, Hereditary Disorder
10. Cell-To-Cell Signaling and Interaction, Tissue Morphology, Amino Acid Metabolism

Patient type: DM2. Enriched biological networks, in rank order:
1. Connective Tissue Disorders, Developmental Disorder, Hereditary Disorder
2. Cell-To-Cell Signaling and Interaction, Embryonic Development, Organismal Development
3. DNA Replication, Recombination, and Repair, Cancer, Embryonic Development
4. Cell Morphology, Nervous System Development and Function, Endocrine System Development and Function
5. Gastrointestinal Disease, Hepatic System Disease, Behavior
6. Molecular Transport, RNA Trafficking, Cell Death and Survival
7. Connective Tissue Development and Function, Embryonic Development, Nervous System Development and Function
8. RNA Post-Transcriptional Modification, Cell Morphology, Reproductive System Development and Function
9. Developmental Disorder, Hereditary Disorder, Neurological Disease
10. DNA Replication, Recombination, and Repair, Cell Cycle, Connective Tissue Disorders


4.6: Expression layout of micro RNAs (miRNAs) in DM1 and DM2 patients as identified by small RNA-seq.

After quantification of the miRNA expression levels (see methods) of all 16 samples, I grouped the samples by their patient identifiers and examined their expression layout across all samples using unsupervised hierarchical clustering among control and all DM patients. Library preparation for DM1_Patient 2 failed; hence it was dropped from further analysis of miRNA expression. I observed that the averaged miRNA profiles of DM1 (n=6) and DM2 (n=6) patients clustered as separate entities from control (n=3) individuals (Figure 4.14a). When clustering was done at the individual patient level, without any averaging of expression data, both DM1 and DM2 patients still clustered as separate groups from the control individuals (Figure 4.14b).
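The similarity measure behind this clustering (Spearman correlation, per the caption of Figure 4.14) can be sketched in a few lines. This is an illustration only: the expression values below are hypothetical, the helper names are not part of the pipeline described in the methods, and the hierarchical clustering step itself (linkage choice, software) is not reproduced.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical normalized miRNA counts for two samples (same miRNA order).
sample_a = [120.0, 15.0, 980.0, 43.0, 7.0]
sample_b = [300.0, 22.0, 2100.0, 18.0, 9.0]
rho = spearman(sample_a, sample_b)
distance = 1 - rho  # a common choice of distance for hierarchical clustering
print(round(rho, 3), round(distance, 3))
```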

These findings were also evaluated and confirmed on a three-dimensional PCA plot (Figure 4.15). These results suggested not only that there were distinct differences in the collective miRNA expression levels of DM1 and DM2 patients relative to the averaged miRNA expression levels in control individuals but also, most notably and similarly to the mRNA expression layout, that the miRNA expression levels differed to varying degrees between individual DM patients.

[Figure 4.14 image, panels a and b. Cluster labels: Type 2 Myotonic Dystrophy, Type 1 Myotonic Dystrophy, Control.]

Figure 4.14: Spearman correlation of microRNA expression from all DM samples compared with control: (a) averaged expression layout; (b) non-averaged expression layout.

Figure 4.15: Principal component analysis (PCA) plot of microRNA expression from all DM samples compared with control: (a) averaged expression layout; (b) non-averaged expression layout.

4.7: Analysis of miRNA expression reveals unique as well as overlapping miRNAs in both DM1 and DM2 patient groups.

I analyzed miRNA expression levels in 6 muscle biopsy samples from DM1 patients, 6 muscle biopsies from DM2 patients and 3 control individuals (Table 4.3) (see methods). Close to 150 and 131 million raw reads were generated from the 6 DM1 and 6 DM2 patients, respectively, and approximately 62 million raw reads were generated for the 3 control muscle biopsies. After a quality check of the raw sequencing data, each sample was processed with our in-house pipeline as detailed in Figure 3.1.

Taken as a whole, out of 2104 mature miRNAs quantified I found 453 and 369 statistically significant differentially expressed miRNAs in DM1 and DM2 patients, respectively. Surprisingly, the majority of the miRNAs were up regulated, i.e., 428 (94.4%) in DM1 and 344 (93.2%) in DM2 patients, while the remaining 25 miRNAs in each of the DM1 and DM2 patient groups were down regulated (Table 4.14).

I first looked into the enriched biological networks targeted in DM1 and DM2 patients separately using Ingenuity Pathway Analysis (IPA) software. I found that the enriched miRNA pools in both DM1 and DM2 patients targeted genes involved in the regulation of the skeletal and muscular system, endocrine system, reproductive system and connective tissues (Table 4.15). Comparative evaluation of the differentially expressed miRNAs in DM1 and DM2 patients using a Venn diagram (Figure 4.16) revealed 332 common differentially expressed miRNAs. I also observed 121 and 37 uniquely differentially expressed miRNAs in DM1 and DM2 patients, respectively. A literature search of the miRNAs overlapping between these two patient groups also revealed their predominant involvement in various hereditary, skeletal and developmental disorders such as muscular dystrophy, azoospermia and myopathy, which correlates with the clinical presentation of the patients studied in this analysis.
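The Venn diagram counts follow directly from set arithmetic on the two lists of differentially expressed miRNAs. The sketch below first shows the set operation on hypothetical identifiers and then checks the reported totals (453, 369 and 332) against the unique counts and the overall total of 490; the function name is an assumption made for illustration.

```python
def venn_counts(set_a, set_b):
    """Return (only_a, shared, only_b) sizes for two sets of identifiers."""
    shared = set_a & set_b
    return len(set_a - shared), len(shared), len(set_b - shared)


# Hypothetical identifiers, only to show the operation.
example = venn_counts({"miR-x1", "miR-x2", "miR-x3"}, {"miR-x2", "miR-x3", "miR-x4", "miR-x5"})

# Counts reported in the text for DM1, DM2 and their overlap.
dm1_total, dm2_total, common = 453, 369, 332
dm1_unique = dm1_total - common                   # miRNAs changed only in DM1
dm2_unique = dm2_total - common                   # miRNAs changed only in DM2
union_total = dm1_unique + common + dm2_unique    # all distinct miRNAs in the diagram
print(dm1_unique, dm2_unique, union_total)
```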


Table 4.14: Differentially expressed microRNAs in type 1 and type 2 myotonic dystrophy patients on the basis of their fold change (cut-off ≥ ±1.5). The p-values of the differentially expressed miRNAs were estimated from z-score calculations with a Benjamini-Hochberg FDR correction of 0.05.

Category                                 DM1 patients  DM2 patients
Overall differentially expressed miRNAs  453           369
Overall up regulated miRNAs              428           344
Overall down regulated miRNAs            25            25
Common differentially expressed miRNAs   332 (shared by DM1 and DM2)
Unique differentially expressed miRNAs   121           37
Unique up regulated miRNAs               106           21
Unique down regulated miRNAs             15            16
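The two filters named in the caption of Table 4.14, a Benjamini-Hochberg FDR of 0.05 and a fold-change cut-off of ±1.5, can be sketched as follows. This is a minimal illustration and not the in-house pipeline: the p-values are hypothetical, the function names are assumptions, and the signed fold-change convention (negative values for down regulation) is assumed rather than stated in the text.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def classify(fold_change, q_value, fc_cutoff=1.5, fdr=0.05):
    """Label one miRNA as 'up', 'down' or 'not significant'."""
    if q_value > fdr:
        return "not significant"
    if fold_change >= fc_cutoff:
        return "up"
    if fold_change <= -fc_cutoff:
        return "down"
    return "not significant"


# Hypothetical raw p-values for four miRNAs.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print([round(q, 3) for q in q_values])
print(classify(2.4, q_values[0]), classify(-1.8, q_values[3]))
```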


Table 4.15: Top biological effects in type 1 and type 2 myotonic dystrophy patients predicted to be targeted by differentially expressed miRNAs as identified by IPA analysis.

DM2 patients DM1 patients

Skeletal and Muscular Disorders Hereditary Disorder

Cancer Skeletal and Muscular Disorders

Hereditary Disorder Developmental Disorder

Developmental Disorder Cancer

Reproductive System Disease Reproductive System Disease

Endocrine System Disorders Gastrointestinal Disease

Connective Tissue Disorders, Cell Cycle

Inflammatory Disease and Response Endocrine System Disorders

[Figure 4.16 image. Venn diagram regions: DM2 only = 37, shared = 332, DM1 only = 121; total = 490. Boxed annotations in the image: azoospermia, uterine cancer, cervical carcinoma, liver cancer, lupus nephritis, liver hyperplasia, nasopharyngeal cancer, connective tissue disorder, psoriasis, inflammatory response, muscular dystrophy, LMNA-MD, liposarcoma, Miyoshi myopathy, LGMD.]

Figure 4.16: Venn diagram of overlapping differentially expressed microRNAs (miRNAs) between DM2 (red) and DM1 (blue) human patient muscle biopsy samples. miRNAs implicated in various relevant diseases and biological functions, as identified by literature search, are listed in boxes.

4.8: Analysis of the differentially expressed miRNAs identifies unique biomarkers for muscular dystrophy and insulin resistance in DM1 and DM2 patients.

Differentially expressed miRNAs from both DM1 and DM2 patients were analyzed using the Ingenuity Pathway Analysis (IPA) tool. The IPA software merges several target prediction algorithms into one environment. Hereditary disorders, skeletal and muscular disorders and developmental disorders were ranked top among the processes affected in both DM1 and DM2 patients (Table 4.15). Upon closer investigation of the miRNAs from these top ranked affected biological processes I identified 16 uniquely up regulated miRNAs common to DM1 and DM2 patients (Table 4.16) that are implicated in various forms of muscular dystrophy, including Duchenne muscular dystrophy, Miyoshi myopathy, limb girdle muscular dystrophy type 2B, nemaline myopathy, facioscapulohumeral muscular dystrophy and LMNA-related congenital muscular dystrophy. These 16 miRNAs (Figure 4.17) may serve as unique biomarkers for identification of type 1 and type 2 myotonic dystrophy patients.

The list of these 16 differentially up regulated miRNAs was then uploaded into DIANA miRPath version 2.1, a web-based tool that predicts enriched signaling pathways targeted by the submitted list of miRNAs using the KEGG database. Conservative parameters were used for acquiring the union list of predicted target genes: the p-value cut-off was set to 0.05 and a Benjamini-Hochberg FDR correction of 0.05 was used. Apart from pathways regulating various forms of cancer, several common signaling pathways regulating muscular physiology and function were found to be enriched among the targets of these 16 unique miRNAs. As shown in A.4, these pathways include PI3K-Akt signaling, focal adhesion, axonal guidance signaling, insulin signaling, gap junction, tight junction, cardiomyopathy and regulation of actin cytoskeletal signaling pathways.

From the common up regulated miRNAs between DM1 and DM2 patients I also identified 30 uniquely expressed miRNAs that have been implicated in the development of insulin resistance in various previously published studies (Table 4.17). These 30 miRNAs, shown in Figure 4.18, may serve as unique biomarkers for identifying the development of insulin resistance in DM patients. This list of 30 differentially up regulated miRNAs was then uploaded into DIANA miRPath version 2.1. Conservative parameters were used for acquiring the union list of predicted target genes: the p-value cut-off was set to 0.05 and a Benjamini-Hochberg FDR correction of 0.05 was used. Apart from pathways regulating various forms of cancer, several common signaling pathways regulating muscular physiology and function, including insulin signaling and axonal guidance signaling, were found to be enriched among the targets of these unique miRNAs (A.5).


Table 4.16: Expression values of 16 uniquely identified miRNAs from DM1 and DM2 patients that have been implicated in various forms of muscular dystrophy.

miRNA            FC DM1  FC DM2  Predicted in other NMDs           Predicted in DM
hsa-let-7c       4.06    2.74    FSHD, LGMD, MM, NM                NOVEL
hsa-miR-100-5p   3.89    2.45    FSHD, LGMD, MM, NM, PM            NOVEL
hsa-mir-103      2.81    2.30    FSHD, LGMD, MM, NM, IBM           NOVEL
hsa-miR-103a-3p  2.42    2.25    FSHD, LGMD, MM, NM                NOVEL
hsa-miR-128-3p   4.71    3.13    LMNA                              NOVEL
hsa-miR-30a-3p   1.82    3.08    DMD, LGMD, MM, NM, DM, PM         YES
hsa-mir-154      5.01    3.33    DMD, FSHD, LGMD, MM, NM, PM, DM   YES
hsa-mir-320      2.85    2.49    LGMD, MM, NM, IBM, PM             NOVEL
hsa-mir-329      6.29    3.09    DMD, LGMD, MM,                    NOVEL
hsa-miR-501-3p   2.59    2.33    LMNA                              NOVEL
hsa-miR-505-5p   2.60    2.20    LMNA                              NOVEL
hsa-miR-130      1.51    1.51    DMD, FSHD, LGMD, MM, NM, DM       YES
hsa-miR-206      4.09    3.16    DMD                               NOVEL
hsa-miR-379      5.35    2.69    DMD, FSHD, LGMD                   NOVEL
hsa-miR-299a-5p  2.98    1.94    DMD, LGMD, MM, NM, DM             YES
hsa-miR-532-3p   2.83    2.55    LMNA                              NOVEL

FC = Fold change; FSHD = Facioscapulohumeral muscular dystrophy; MM = Miyoshi myopathy; LGMD = Limb girdle muscular dystrophy; PM = Polymyositis; NM = Nemaline myopathy; DMD = Duchenne muscular dystrophy; LMNA = LMNA-related congenital muscular dystrophy.


Figure 4.17: Uniquely differentially expressed miRNA in both type 1 and type 2 myotonic dystrophy patients may serve as biological markers for development of muscular dystrophy as identified by using IPA analysis. Green fill indicates up regulated genes and biological process. Red fill indicates down regulated genes. For figure legend see Appendix D.


Table 4.17: Expression values of 30 uniquely identified miRNAs from both type 1 and type 2 myotonic dystrophy patients that have been implicated in the development of insulin resistance in various previous studies.

Symbol        FC DM1  FC DM2
let-7a-5p     2.67    2.06
miR-103-3p    2.42    2.26
miR-125b-5p   2.66    1.78
miR-126a-3p   1.80    1.59
miR-127-3p    5.86    3.08
miR-129-5p    9.15    3.08
miR-130a-3p   6.05    3.09
miR-132-3p    4.03    2.98
miR-141-3p    3.88    2.01
miR-142-5p    1.96    2.29
miR-148b-3p   2.37    1.96
miR-150-5p    1.64    1.51
mir-154       5.02    3.33
miR-16-5p     3.24    2.60
miR-183-5p    7.10    6.40
miR-194-5p    2.35    2.53
miR-21-5p     2.29    1.79
miR-221-3p    1.82    1.74
miR-223-3p    2.72    2.69
miR-23a-3p    2.93    2.49
miR-24-3p     3.35    2.40
miR-27a-3p    2.11    1.82
miR-299a-5p   2.98    1.94
miR-31-5p     3.18    2.12
miR-320b      3.12    2.80
miR-335-5p    3.98    2.55
miR-34a-5p    5.39    6.25
miR-379-5p    4.18    2.70
miR-450a-5p   3.04    2.00
miR-664-3p    2.06    1.88

FC = Fold change


Figure 4.18: Uniquely differentially expressed 30 miRNA in both type 1 and type 2 myotonic dystrophy patients may serve as biological markers of development of insulin resistance in DM patients as identified by using IPA analysis. Green fill indicates up regulated genes and biological process. Red fill indicates down regulated genes. For figure legend see Appendix D.


4.9: Analysis of overlapping differentially expressed miRNAs across DM1 and DM2 patients along with integration of RNA-seq data.

I identified 332 differentially expressed miRNAs overlapping between DM1 and DM2 patients (Figure 4.16). Of these, 322 miRNAs were up regulated in both DM1 and DM2 patients, while only 10 miRNAs were down regulated. The top 10 up regulated miRNAs in DM1 and DM2 patients are shown in Table 4.18. Besides observing a number of overlapping miRNAs that were differentially expressed in DM1 and DM2 patients, I was interested in the genes they target and ultimately the signaling pathways and biological functions that could potentially be affected by them. I used data from our own RNA-seq NGS experiment to perform the enrichment of miRNA targets using IPA software. The IPA software merges several target prediction resources, such as TargetScan (Lewis, Burge, and Bartel 2005), TarBase (Vergoulis et al. 2012), miRecords (Xiao et al. 2009) and its own Ingenuity Expert Findings records, into a single environment, and it simultaneously overlays the canonical pathways and biological functions that are targeted by the miRNA list being investigated.

I performed miRNA target analysis using IPA software for 322 overlapping up regulated miRNAs between DM1 and DM2 patients with common differentially expressed RNA-seq data from all 7 DM1 patients and 5 DM2 patients. I observed significant targeting of 4 crucial canonical signaling pathways regulating actin cytoskeletal signaling, calcium signaling, cardiac hypertrophy, and axonal guidance from the cross-comparison of miRNA-seq and RNA-seq data between both DM1 and DM2 patients.
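The logic of this integration, keeping only predicted miRNA-mRNA pairs in which the miRNA is up regulated and its target mRNA is down regulated in the same patients, can be sketched as below. All identifiers and fold changes are placeholders, the function name is hypothetical, and the ±1.5 cut-off applied to the mRNA side is an assumption; the actual analysis was performed inside the IPA software, which is not reproduced here.

```python
from collections import defaultdict


def anticorrelated_targets(predicted_pairs, mirna_fc, mrna_fc, cutoff=1.5):
    """Map each down regulated gene to the up regulated miRNAs predicted to target it.

    predicted_pairs: iterable of (miRNA, gene) target predictions
    mirna_fc:        signed fold change per miRNA (positive = up regulated)
    mrna_fc:         signed fold change per gene (negative = down regulated)
    """
    kept = defaultdict(set)
    for mirna, gene in predicted_pairs:
        mirna_up = mirna_fc.get(mirna, 0.0) >= cutoff
        gene_down = mrna_fc.get(gene, 0.0) <= -cutoff
        if mirna_up and gene_down:
            kept[gene].add(mirna)
    return dict(kept)


# Placeholder predictions and fold changes, for illustration only.
pairs = [("miR-A", "GENE1"), ("miR-B", "GENE1"), ("miR-A", "GENE2"), ("miR-C", "GENE3")]
mirna_changes = {"miR-A": 4.1, "miR-B": 2.3, "miR-C": 1.1}
mrna_changes = {"GENE1": -3.0, "GENE2": 2.0, "GENE3": -2.5}

result = anticorrelated_targets(pairs, mirna_changes, mrna_changes)
for gene, mirnas in result.items():
    print(gene, sorted(mirnas))
```

Grouping the retained genes by canonical pathway then gives counts of the kind reported in the following subsections, for example 18 miRNAs targeting 6 mRNAs in actin cytoskeletal signaling.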


Table 4.18: Top 10 differentially up regulated miRNAs in skeletal muscles of type 1 and type 2 myotonic dystrophy patients.

DM1 miRNA         FC     DM2 miRNA         FC
hsa-miR-375-5p    27.22  hsa-miR-323b-3p   8.93
hsa-miR-196a-5p   23.94  hsa-miR-196a-5p   8.84
hsa-miR-323a-3p   22.65  hsa-miR-183-5p    6.40
hsa-miR-323b-3p   9.93   hsa-miR-34a-5p    6.25
hsa-miR-129-5p    9.15   hsa-miR-23a-5p    5.76
hsa-miR-432-5p    8.16   hsa-miR-432-5p    5.75
hsa-miR-23a-5p    7.74   hsa-miR-96-5p     5.66
hsa-miR-493-5p    7.74   hsa-miR-4490      5.64
hsa-miR-183-5p    7.10   hsa-miR-3615      5.48
hsa-miR-654-5p    7.09   hsa-miR-182-5p    5.27

FC = Fold change


4.9.1: Targeting of Actin cytoskeletal signaling pathway.

Gene target analysis of the common overlapping miRNAs against the common differentially expressed RNA-seq data between DM1 and DM2 patients revealed specific targeting of genes regulating actin cytoskeletal signaling in both DM1 and DM2 patients. I identified 18 unique miRNAs targeting 6 specific mRNAs (A.6) whose expression was observed to be significantly decreased in the same patients by the RNA-seq experiment. These genes are: actin, alpha 1, skeletal muscle (ACTA1), myosin heavy chain 1 (MYH1), myosin heavy chain 13 (MYH13), myosin heavy chain 14 (MYH14), myosin light chain 9 (MYL9) and myosin light chain kinase 2 (MYLK2).

As shown in Figure 4.19, the up regulated miRNAs (green) target the 6 mRNAs, which leads to their down regulation (red). All 6 mRNAs are critically important for maintaining actin cytoskeletal signaling, which is required for normal organismal growth and development and for the regulation of cellular assembly, movement and organization. The down regulation of these 6 mRNAs, along with the down regulation of actin cytoskeletal signaling, is predicted to lead to decreased muscle contraction and increased myopathy, as experienced by DM patients.


Figure 4.19: Gene targeting of 6 mRNA regulating actin cytoskeleton signaling pathway by 18 uniquely up regulated miRNAs in both type 1 and type 2 myotonic dystrophy patients. Green fill represents up regulated event, red fill represents down regulated event. For figure legend see Appendix D.

4.9.2: Targeting of Calcium signaling pathway.

Gene target analysis of the overlapping miRNAs against the RNA-seq data from DM1 and DM2 patients revealed significant targeting of genes regulating calcium signaling pathways in these patients. I identified 42 unique up regulated miRNAs in DM patients that target 13 mRNAs (A.7) whose expression was observed to be significantly decreased in the same patients by the RNA-seq experiment. These genes include: actin, alpha 1, skeletal muscle (ACTA1), ATPase, Ca++ transporting, cardiac muscle, fast twitch 1 (ATP2A1), cholinergic receptor, nicotinic, beta 3 (CHRNB3), myosin heavy chain 1 (MYH1), myosin heavy chain 13 (MYH13), myosin heavy chain 14 (MYH14), myosin light chain 9 (MYL9), calcineurin B (PPP3R2), protein kinase, cAMP-dependent, catalytic, gamma (PRKACG), and tropomyosin 1 and 2 (TPM1, TPM2).

As shown in Figure 4.20, all 13 mRNAs are critically important for maintaining the calcium signaling pathway, in which calcium acts as an intracellular second messenger and which is required for cell-cell signaling. The down regulation of these 13 mRNAs, along with the down regulation of calcium signaling, is predicted to lead to increased muscle weakness and hypertrophy of the heart, as experienced by DM patients.


Figure 4.20: Gene targeting of 13 mRNA regulating calcium signaling pathway by 42 uniquely up regulated miRNAs in both type 1 and type 2 myotonic dystrophy patients. Green fill represents up regulated event, red fill represents down regulated event. For figure legend see Appendix D.

4.9.3: Targeting of Cardiac hypertrophy signaling pathway.

Gene target analysis of the overlapping miRNAs against the RNA-seq data from DM1 and DM2 patients revealed significant targeting of genes regulating cardiac hypertrophy signaling pathways in these patients. I identified 28 unique up regulated miRNAs in DM patients that target 6 mRNAs (Table 4.25) whose expression was observed to be significantly decreased in the same patient groups by the RNA-seq experiment. These genes are: guanine nucleotide binding protein (G protein), alpha 15 (GNA15), insulin receptor substrate 1 (IRS1), myosin light chain 9 (MYL9), calcineurin B (PPP3R2), protein kinase, cAMP-dependent, catalytic, gamma (PRKACG) and ras homolog family member D (RHOD).

As shown in Figure 4.21, the up regulated miRNAs (green) target the 6 mRNAs, which leads to their down regulation (red). All 6 mRNAs are critically important for maintaining the cardiac hypertrophy signaling pathway, an intracellular signaling pathway that transduces the hypertrophic response in the heart of affected individuals. The down regulation of these 6 mRNAs is predicted to lead to increased cardiac hypertrophy, as experienced by DM patients.

Figure 4.21: Gene targeting of 6 mRNA regulating cardiac hypertrophy signaling pathway by 28 uniquely up regulated miRNAs in both type 1 and type 2 myotonic dystrophy patients. Green fill represents up regulated event, red fill represents down regulated event. For figure legend see Appendix D.

4.9.4: Targeting of Axonal guidance signaling pathway.

Gene target analysis of the overlapping miRNAs with the RNA-seq data from DM1 and DM2 patients revealed significant targeting of genes regulating the axonal guidance signaling pathway in these patients. Axonal guidance receptors signal as multimeric complexes that regulate the formation and extension of axons as they migrate to reach their synaptic targets. I identified 29 unique up regulated miRNAs in DM patients that target 8 mRNAs (A.8) whose expression was significantly decreased in the same patients in the RNA-seq experiment. These genes are: guanine nucleotide binding protein (G protein), alpha 15 (GNA15), kinesin light chain 1 (KLC1), myosin light chain 9 (MYL9), nerve growth factor (beta polypeptide) (NGF), calcineurin B (PPP3R2), protein kinase, cAMP-dependent, catalytic, gamma (PRKACG), ras homolog family member D (RHOD) and wingless-type MMTV integration site family, member 2 (WNT2).

As shown in Figure 4.22, the up regulated miRNAs (green) target the 8 mRNAs, which leads to their down regulation (red). All 8 mRNAs are critically important for regulating axonal guidance signaling and for various functions including cellular assembly and organization, maintenance of cellular functions and cell morphology. The down regulation of these 8 mRNAs is predicted to lead to decreased excitation and stimulation of sensory neurons in DM patients.

Figure 4.22: Gene targeting of 8 mRNA regulating Axonal guidance signaling pathway by 29 uniquely up regulated miRNAs in both type 1 and type 2 myotonic dystrophy patients. Green fill represents up regulated event, red fill represents down regulated event. For figure legend see Appendix D.

4.10: Integration of RNA-seq data with common down regulated miRNAs in DM1 and DM2 patients reveals up regulation of various synaptic pathways.

Patients with myotonic dystrophy experience tremors and shaking of the hands and legs (Table 2.3). Target analysis of the 9 unique common down regulated miRNAs (A.9) was performed using DIANA miRPath version 2.1 to identify the canonical pathways predicted to be enriched for targets of this common down regulated miRNA list. As shown in Table 4.19, the major pathways identified involve neurotransmitters that regulate the normal functioning of nervous system signaling, such as glutamatergic, dopaminergic, GABAergic, neurotrophin and chemokine signaling. Glutamate receptors are located on the membranes of neuronal cells; glutamate is the main excitatory neurotransmitter in the brain and also serves as the precursor for GABA, which is an inhibitory neurotransmitter in the brain. Dopamine receptors belong to the G-protein coupled receptor family and are also abundant in the brain. Misregulation of these synaptic receptor pathways has been implicated in various neurodegenerative diseases and in hyperactivity of heart and muscles.

These canonical signaling pathways are predicted to be increased in DM patients. The up regulation of the genes regulating the glutamate receptor signaling pathway (Figure 4.23) was validated in our RNA-seq data using IPA. As identified from our sequencing studies, all 9 of these miRNAs are critically important for maintaining synaptic signaling, which is critical for various functions including the maintenance of normal nervous system signaling. The down regulation of these 9 miRNAs leads to the manifestation of tremors and spasms in DM patients.

Table 4.19: List of top 20 KEGG pathways predicted to be targeted by 9 common down regulated miRNAs in DM patients according to DIANA miRPath v.2.1.

| KEGG pathway | FDR adjusted p-value | No. of targeted genes | No. of miRNAs targeting pathway |
|---|---|---|---|
| Wnt signaling pathway | 3.70E-14 | 52 | 8 |
| Glutamatergic synapse | 9.22E-10 | 38 | 7 |
| Dopaminergic synapse | 9.22E-10 | 43 | 7 |
| Prion diseases | 1.79E-08 | 7 | 4 |
| Long-term potentiation | 1.06E-07 | 24 | 7 |
| Retrograde endocannabinoid signaling | 1.06E-07 | 35 | 7 |
| Ubiquitin mediated proteolysis | 1.93E-07 | 41 | 8 |
| Hepatitis B | 2.86E-06 | 43 | 8 |
| GABAergic synapse | 3.13E-06 | 29 | 8 |
| Melanogenesis | 4.78E-06 | 31 | 6 |
| Pathways in cancer | 5.59E-06 | 84 | 9 |
| Colorectal cancer | 5.73E-06 | 21 | 9 |
| Neurotrophin signaling pathway | 1.64E-05 | 34 | 8 |
| Circadian rhythm | 2.04E-05 | 12 | 5 |
| mRNA surveillance pathway | 6.25E-05 | 26 | 7 |
| Adipocytokine signaling pathway | 6.84E-05 | 21 | 6 |
| Chemokine signaling pathway | 7.96E-05 | 47 | 8 |
| Pancreatic cancer | 1.64E-04 | 21 | 7 |
| RNA degradation | 1.64E-04 | 22 | 6 |
| Protein processing in endoplasmic reticulum | 1.72E-04 | 45 | 6 |

Figure 4.23: Up regulation of Glutamate receptor signaling pathway in both type 1 and type 2 myotonic dystrophy patients. Green fill represents up regulated event, red fill represents down regulated event.

4.11: Analysis of uniquely up regulated miRNAs reveals specific enrichment of transforming growth factor-beta (TGF-β) and mitogen-activated protein kinase (MAPK) signaling pathways in DM2 patients.

I identified 121 and 37 uniquely differentially expressed miRNAs in DM1 and DM2 patients, respectively (Figure 4.16). First, I looked for biomarkers among these uniquely expressed miRNAs in both DM1 and DM2 patients. DM2 patients experience testicular atrophy and a decrease in sperm production. I identified two uniquely down regulated miRNAs, miR-159a-3p and miR-29c-5p, both of which have been implicated in cases of azoospermia. I then analyzed the uniquely up regulated miRNAs in DM2 patients using DIANA miRPath version 2.1 to identify targeted canonical pathways predicted to be enriched uniquely in DM2 patients. I identified the transforming growth factor-beta (TGF-β) and mitogen-activated protein kinase (MAPK) signaling pathways as specifically enriched in DM2 patients. The TGF-β and MAPK pathways regulate skeletal muscle homeostasis, including growth, differentiation and regeneration.

I further examined the mRNA expression dataset from DM2 patients to identify genes that are part of these two very important pathways. Indeed, as shown in Figure 4.24, I identified uniquely up regulated mRNAs showing elevated expression of genes responsible for up regulation of the MAPK and TGF-β signal transduction pathways in DM2 patients. Among the various pathways that participate in the pathogenesis of atrophy and impaired regeneration of muscles, MAPK and TGF-β play crucial roles. Our findings suggest that these pathways may serve as potential therapeutic targets to combat skeletal muscle atrophy in myotonic dystrophy patients.

MAPK signaling (p-value=4.27E-27)

TGF-β signaling (p-value=7.42E-26)

Figure 4.24: Up-regulation of genes regulating mitogen-activated protein kinases (MAPK) and transforming growth factor-beta (TGF-β) signaling pathways identified in response to 21 uniquely up regulated miRNAs in type 2 myotonic dystrophy patients.

4.12: Transcriptomics layout of the transgenic mouse models as identified by RNA-seq.

I intended to compare the mRNA and protein expression profiles of three transgenic mouse models with those of human DM1 and DM2 patient muscle biopsy samples, to identify the similarities as well as the differences between these mouse models and actual human patients. All of the mouse muscle biopsy samples were obtained from Dr. Ralf Krahe (see methods). A total of 4 muscle biopsy tissues from the transgenic models were obtained (see methods for the sample collection protocol). The clinical presentation of the three transgenic mouse models is presented in Table 4.30. All three mouse models exhibited clear symptoms of classical DM disease in human patients, such as the presence of ribonuclear inclusion bodies with co-localization of MBNL 1 and 2 proteins; myopathy, cardiac hypertrophy and retinal atrophy were also seen in all three animal models.

A total of more than 307 million raw reads (Table 4.31) were generated from all 4 mouse tissues. Post-alignment processing was performed as per our unique in-house data analysis pipeline. Gene expression quantification was performed using the mouse mm9 transcript annotation. The Trimmed Mean of M-values (TMM) normalization method (Robinson and Oshlack 2010) was chosen for quantification of mRNA expression. After quantification, I first examined the layout of expression of all 4 samples using unsupervised hierarchical clustering. I observed that the mRNA expression from the skeletal muscle biopsy of the DM2-KI mouse clustered as a separate entity (Figure 4.25a) from the control and the other mouse models. These findings were also evaluated and confirmed on a 3-dimensional PCA plot (Figure 4.25b), where I observed DM2-KI and HSA-CCTG grouping close to each other while the HSA-CTG and control samples were distantly located. These results suggested that not only was I observing unique differences in the collective mRNA expression levels between the DM2-KI and control mouse biopsies, but most notably I observed varying degrees of difference in the mRNA expression levels between the individual DM2-KI, HSA-CTG and HSA-CCTG transgenic mouse muscle biopsies.

Taken as a whole, out of 22,850 protein coding genes I observed 4,591, 3,786 and 2,563 DEGs in the DM2-KI, HSA-CTG and HSA-CCTG mouse models, respectively (Figure 4.26), when compared with controls. Comparative evaluation of the DEGs in the three mouse models using a Venn diagram (Figure 4.27) revealed 1,017 DEGs that are common to all three transgenic mice. The DM2-KI mouse model had the greatest number of uniquely differentially expressed genes (2,182), followed by HSA-CTG (1,327) and then HSA-CCTG (590). I then looked for enriched canonical pathways in both the unique and the overlapping gene sets using KEGG pathway analysis software. Among the overlapping up regulated DEGs from all three transgenic mice, I observed pathways regulating cell-cell adhesion, cardiac muscle contraction and cardiac hypertrophy to be highly enriched; these findings correlated closely with the clinical presentation of these transgenic mice as presented in Table 4.30.
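The DEG counts reported above also determine the sizes of the pairwise-only overlaps of the three-way Venn diagram, which are not stated in the text. The Python sketch below derives them from the reported totals, unique counts and three-way overlap. The derived values are an arithmetic consequence of those numbers, offered as an illustration rather than as figures read from Figure 4.27, and the variable names are assumptions of this sketch.

```python
# Sketch: derive pairwise-only Venn overlaps from the counts reported in the text.
totals = {"DM2-KI": 4591, "HSA-CTG": 3786, "HSA-CCTG": 2563}
unique = {"DM2-KI": 2182, "HSA-CTG": 1327, "HSA-CCTG": 590}
common_all = 1017

# DEGs that each model shares with exactly one other model.
shared_one = {m: totals[m] - unique[m] - common_all for m in totals}

# With a = KI&CTG only, b = KI&CCTG only, c = CTG&CCTG only:
#   a + b = shared_one["DM2-KI"]
#   a + c = shared_one["HSA-CTG"]
#   b + c = shared_one["HSA-CCTG"]
half_sum = sum(shared_one.values()) // 2  # equals a + b + c
ki_ctg_only = half_sum - shared_one["HSA-CCTG"]
ki_cctg_only = half_sum - shared_one["HSA-CTG"]
ctg_cctg_only = half_sum - shared_one["DM2-KI"]

print(ki_ctg_only, ki_cctg_only, ctg_cctg_only)
```

The three derived overlaps are non-negative integers, so the reported totals, unique counts and common count are mutually consistent.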

Analysis of the unique DEGs in the DM2-KI mouse revealed enrichment of pathways regulating fatty acid metabolism and amino acid degradation. The peroxisome proliferator-activated receptor (PPAR) pathway regulates fat burning in adipose tissue. From the gene expression data I found that PPAR-alpha (PPARA) is up regulated 5.5-fold in the DM2-KI mouse. I further examined the mRNA expression data from DM2-KI mice and observed that the production of ketone bodies was indeed up regulated (A.10). Using IPA, I also found that canonical pathways such as Fatty acid oxidation, γ-linolenate Biosynthesis II and Stearate Biosynthesis I were among the top 15 pathways enriched in DM2-KI mice (Table 4.32). These findings suggest that energy deprivation conditions exist in DM2-KI skeletal muscles, whereby energy requirements have to be fulfilled by the process of ketogenesis. Although the HSA-CTG and HSA-CCTG transgenic mice display various similar symptoms in their clinical presentation (Table 4.30), the expression layout of these two mouse models does not show the unique expression features exhibited by the DM2-KI mouse.

Table 4.30: Clinical and histopathological manifestations of the three transgenic mouse models used in this study.

| Manifestations | DM2-KI | HSA-CCTG | HSA-CTG | Control |
|---|---|---|---|---|
| Ribonuclear inclusions | Present | Present | Present | No |
| MBNL 1/2 co-localization | Present | Present | Present | No |
| Cardiac hypertrophy | Present | Present | Present | Absent |
| Myotonia | No | Moderate | Severe | Absent |
| Myopathy | Present | Present | Present | Absent |
| Retinal Atrophy | Present | Present | Present | No |
| Testis | Azoospermia | Oligospermia | No information | Normal |

Table 4.31: Alignment statistics overview of mRNA sequencing of transgenic mouse muscle biopsy samples.

| Internal Identifier | Total raw reads | Total Aligned reads | % Alignment | Non-spliced reads | Spliced reads |
|---|---|---|---|---|---|
| DM2-KI | 88,829,840 | 75,285,589 | 84.75 | 67,223,230 | 3,928,901 |
| HSA-CCTG | 67,439,758 | 57,919,205 | 85.88 | 51,436,576 | 3,270,019 |
| HSA-CTG | 73,463,574 | 62,580,802 | 85.19 | 56,157,738 | 3,276,224 |
| Control | 77,552,034 | 65,452,471 | 84.40 | 57,132,906 | 3,864,802 |
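As a quick consistency check on Table 4.31, the short Python sketch below recomputes the alignment percentage of each sample and the total number of raw reads from the tabulated counts. It only restates arithmetic on the values in the table; the variable names are assumptions of this sketch.

```python
# Sketch: recompute "% Alignment" and the total raw read count from Table 4.31.
reads = {
    # sample: (total raw reads, total aligned reads)
    "DM2-KI": (88_829_840, 75_285_589),
    "HSA-CCTG": (67_439_758, 57_919_205),
    "HSA-CTG": (73_463_574, 62_580_802),
    "Control": (77_552_034, 65_452_471),
}

pct_aligned = {
    sample: round(100 * aligned / raw, 2) for sample, (raw, aligned) in reads.items()
}
total_raw = sum(raw for raw, _ in reads.values())

print(pct_aligned)
print(total_raw)
```

The recomputed percentages match the table, and the raw reads sum to 307,285,206, in line with the "more than 307 million" reads stated in the text.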

Figure 4.25: Expression layout of mRNA from skeletal muscle biopsies of mouse models for DM. a. Unsupervised Spearman correlation. b. Principal component analysis (PCA) plot.

Figure 4.26: Volcano plot of differentially expressed genes in three transgenic mouse models as compared to control mouse mRNA expression. Up regulated genes (green), down regulated genes (red) and no change (black). a. DM2-KI vs. control b. HSA-CTG vs. control c. HSA-CCTG vs. control.

Figure 4.27: Venn diagram of overlapping differentially expressed genes between DM2-KI (red), HSA-CTG (blue) and HSA-CCTG (green) mouse muscle biopsy samples. KEGG enriched canonical pathways identified from the common DEGs are shown in boxes.

Table 4.32: Top 15 enriched canonical pathways in skeletal muscle biopsies of three transgenic mouse models as compared with control mouse skeletal muscle using IPA analysis.

DM2-KI HSA-CCTG HSA-CTG

Fatty Acid β-oxidation I Atherosclerosis Signaling LXR/RXR Activation Hepatic Fibrosis / Hepatic Stellate Hepatic Fibrosis / Hepatic Stellate Cell Atherosclerosis Signaling Cell Activation Activation γ-linolenate Biosynthesis II Granulocyte Adhesion and Diapedesis Nicotine Degradation II (Animals) Agranulocyte Adhesion and Hepatic Fibrosis / Hepatic Stellate Cell Agranulocyte Adhesion and Diapedesis Diapedesis Activation Cellular Effects of Sildenafil (Viagra) G-Protein Coupled Receptor Signaling Nicotine Degradation III B Cell Development Dendritic Cell Maturation Melatonin Degradation I Regulation of Actin-based Motility LXR/RXR Activation Granulocyte Adhesion and Diapedesis by Rho Calcium Signaling cAMP-mediated signaling Superpathway of Melatonin Degradation Role of Macrophages, Fibroblasts and Role of Osteoblasts, Osteoclasts and Atherosclerosis Signaling Endothelial Cells in Rheumatoid Arthritis Chondrocytes in Rheumatoid Arthritis Stearate Biosynthesis I (Animals) Complement System Acute Phase Response Signaling Signaling by Rho Family GTPases Cellular Effects of Sildenafil (Viagra) Estrogen Biosynthesis Epithelial Adherens Junction Role of Osteoblasts, Osteoclasts and Bupropion Degradation Signaling Chondrocytes in Rheumatoid Arthritis Acetone Degradation I (to Basal Cell Carcinoma Signaling Nicotine Degradation II Methylglyoxal) α-Adrenergic Signaling MSP-RON Signaling Pathway Dendritic Cell Maturation Colorectal Cancer Metastasis LPS/IL-1 Mediated Inhibition of RXR Nicotine Degradation III Signaling Function

4.13: Comparison of human DM2 and DM2-KI transgenic mouse RNA-seq data reveals actin cytoskeletal signaling controlled by Rho GTPases, as the centerpiece for the pathophysiology in type 2 myotonic dystrophy disease in human patients.

I looked for standalone enriched pathways in all three transgenic mouse muscle samples using IPA (Table 4.32). I identified unique enrichment of fatty acid oxidation pathways, signaling by Rho family GTPases and actin signaling pathways in DM2-KI transgenic mice. However, these pathways were not highly enriched in either HSA-CCTG or HSA-CTG animals. I further compared the mRNA expression data from the human DM2 patients previously analyzed in this study with those of the DM2-KI mouse. To my surprise, I identified very similar enrichment of the Signaling by Rho Family GTPases canonical pathway (Figure 4.28) in DM2 patients (p-value = 2.51E-06) and in the DM2-KI transgenic mouse (p-value = 2.75E-05).

Surprisingly, I identified a very mild effect on these signaling pathways in DM1 patients, but it was not significant (p-value = 0.181). These strikingly similar findings between the DM2-KI mouse and human DM2 patients strongly suggest that modulation of actin cytoskeletal signaling by Rho family proteins is a unique attribute of DM2 patients, and I propose that this signaling pathway is the centerpiece of DM2 disease pathophysiology. Components of this pathway can thus be used in further investigations to identify potential diagnostic biomarkers and in future drug target studies.

Figure 4.28: Enriched canonical Signaling by Rho Family GTPases pathway regulating microtubule dynamics in skeletal muscle biopsies of a. human DM2 patients (p-value = 2.51E-06), b. DM2-KI transgenic mouse (p-value = 2.75E-05) and c. human DM1 patients (p-value = 0.181).

4.14: Proteomics layout of the DM1 and DM2 skeletal muscle biopsy samples as identified by iTRAQ.

We identified the proteins expressed in the same skeletal muscle biopsies of DM1 and DM2 patients that were used for RNA-seq and miRNA-seq, using the iTRAQ technique (see methods). After quantification of the protein expression levels of all 16 samples, I first examined their layout using unsupervised hierarchical clustering among control and DM patients. I observed that DM1 and DM2 patients clustered as two separate groups (Figure 4.29).

We observed separate clustering within the DM2 patient group, with P4, P5 and P6 clustering separately from P1, P2 and P3. These results agree with the RNA-seq clustering analysis (Figure 4.1b), confirming that not only were we observing unique differences in the collective mRNA and protein expression levels between DM1 and DM2 patients relative to the averaged expression levels in control individuals, but most notably we observed varying degrees of difference in the protein expression levels between individual DM2 patients. This was attributed to the varying clinical presentations among DM2 patients, where patients 4, 5 and 6 had the most severe symptoms while patients 1, 2 and 3 were experiencing very mild clinical symptoms. All the DM1 patients that we examined in this study presented very similar clinical symptoms.

We identified 94 and 98 differentially expressed proteins in DM1 and DM2 patients, respectively, when compared with averaged control individuals. Among these, the majority of the proteins were found to be up regulated, i.e., 58 in DM1 and 84 in DM2 patients, while the remaining 36 proteins in DM1 and 14 proteins in DM2 patients were found to be down regulated. To identify the uniqueness of protein expression between DM1 and DM2 patients, we checked for the overlap of differentially expressed proteins in both patient groups. We identified 29 differentially expressed proteins common to both DM1 and DM2 patients; I also observed 65 and 69 uniquely expressed proteins in DM1 and DM2 patients, respectively (Table 4.33).

I analyzed the affected protein classes among the differentially expressed proteins from skeletal muscle biopsies of DM1 and DM2 patients using the Panther protein classification platform. We observed significant enrichment of cytoskeletal, structural and calcium binding proteins among the down regulated proteins in both DM1 and DM2 patients (Figure 4.30). I then looked into the enriched protein classes in DM1 and DM2 patients individually using Ingenuity Pathway Analysis (IPA) software. IPA uses a right-tailed Fisher's exact test to rank the highly enriched canonical pathways from a given expression dataset. Indeed, IPA identified specific negative enrichment of actin-based motility by Rho GTPase signaling, mitochondrial dysfunction, epithelial adherens junction signaling and calcium signaling in both DM1 and DM2 patients. All of these pathways are critically important for organismal growth and development and for the maintenance of cytoskeletal dynamics. The disruption of these critically important signaling pathways, along with negative regulation of calcium signaling, suggests defective skeletal muscle structure and physiology in DM patients (Table 4.34).
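The right-tailed Fisher's exact test mentioned above can be illustrated with a minimal Python sketch. It computes the probability of observing at least k pathway members in a submitted list under the hypergeometric distribution. This is a generic textbook formulation, not IPA's implementation; how IPA defines its reference set is not described here, and the numbers in the example are hypothetical.

```python
from fractions import Fraction
from math import comb

def right_tailed_fisher(k, n, K, N):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n) as an exact fraction.

    N: genes in the reference set
    K: reference genes annotated to the pathway
    n: genes in the submitted (differentially expressed) list
    k: submitted genes annotated to the pathway
    """
    upper = min(n, K)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return Fraction(numerator, comb(N, n))

# Hypothetical toy numbers: 10 reference genes, 4 in the pathway,
# 3 genes submitted, 2 of which fall in the pathway.
p_value = right_tailed_fisher(k=2, n=3, K=4, N=10)
print(p_value, float(p_value))
```

A smaller p-value indicates that the overlap between the submitted list and the pathway is less likely to have arisen by chance, which is the basis for ranking pathways.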

Figure 4.29: Spearman correlation of protein expression from skeletal muscle biopsies of all DM samples compared with control.

Table 4.33: Differentially expressed proteins in DM1 and DM2 patients as compared to control individuals. Differential expression was calculated on the basis of fold change (cut-off ≥ ±2.0); the p-value was estimated by t-test with a Benjamini-Hochberg FDR correction of 0.05.

| Proteomics | DM1 patients | DM2 patients |
|---|---|---|
| Overall differentially expressed proteins | 94 | 98 |
| Up regulated | 58 | 84 |
| Down regulated | 36 | 14 |
| Common differentially expressed | 29 | 29 |
| Unique differentially expressed | 65 | 69 |
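The selection criteria stated in the caption of Table 4.33 (a two-sided fold-change cut-off of 2.0 and a Benjamini-Hochberg FDR of 0.05) can be sketched in Python as follows. This is a minimal illustration under assumptions: fold changes are taken as signed values (for example, -3.0 meaning 3-fold down), the protein values are hypothetical, and the code is not the analysis software used in this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def is_differential(fold_change, adj_p, fc_cutoff=2.0, fdr=0.05):
    """Apply a two-sided fold-change cut-off and an FDR threshold."""
    return abs(fold_change) >= fc_cutoff and adj_p <= fdr

# Hypothetical proteins: signed fold changes and raw t-test p-values.
fold_changes = [2.5, -3.0, 1.2, 4.0]
raw_p = [0.01, 0.04, 0.03, 0.005]

adj_p = benjamini_hochberg(raw_p)
calls = [is_differential(fc, p) for fc, p in zip(fold_changes, adj_p)]
print(adj_p, calls)
```

In this toy input the third protein passes the FDR threshold but fails the fold-change cut-off, so it is not called differentially expressed.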

Figure 4.30: Bar chart distribution of major affected protein classes from common down regulated proteins in DM1 and DM2 skeletal muscles.

Table 4.34: Top 10 significantly enriched canonical pathways, along with their p-values, from differentially expressed proteins in skeletal muscle biopsies of DM1 and DM2 patients as identified by IPA.

| Canonical Pathways | DM1 | DM2 |
|---|---|---|
| Calcium Signaling | 2.04E-10 | 8.32E-07 |
| ILK Signaling | 2.95E-10 | 1.12E-03 |
| Epithelial Adherens Junction Signaling | 7.08E-10 | 5.01E-10 |
| Actin Cytoskeleton Signaling | 1.10E-09 | 2.63E-07 |
| Tight Junction Signaling | 2.45E-08 | 2.63E-02 |
| Agranulocyte Adhesion and Diapedesis | 1.07E-07 | 4.27E-02 |
| Regulation of Actin-based Motility by Rho | 6.17E-06 | 5.13E-03 |
| RhoA Signaling | 3.39E-05 | 1.51E-03 |
| Mitochondrial Dysfunction | 1.82E-03 | 3.02E-08 |
| Remodeling of Epithelial Adherens Junctions | 1.26E-03 | 1.32E-08 |

Chapter 5

DISCUSSION

Myotonic dystrophy is the second most common genetically inherited (autosomal dominant) muscular disorder in humans, with an estimated worldwide prevalence of about 1 in 8,000 (Larkin and Fardaei 2001; Ranum and Day 2002b; Suominen et al. 2008). DM patients experience highly variable clinical symptoms that primarily involve the muscular, cardiovascular and endocrine systems. In addition, DM patients experience involvement of the eyes, brain and testis (Cho and Tapscott 2007; Udd and Krahe 2012). Although DM1 and DM2 patients share numerous overlapping clinical features, as detailed in Table 2.2, the symptoms of DM2 patients are generally milder and the overall prognosis better than in DM1 patients (Meola and Moxley 2004; Day and Ranum 2005).

Microsatellite repeat expansions in DM1 (CTG) and DM2 (CCTG) patients lead to an RNA gain-of-function mechanism in which the expanded repeats remain trapped in the nucleus and sequester several RNA binding proteins, which misregulates their downstream functions, primarily causing splicing abnormalities (Fu et al. 1992; Kimura et al. 2005; Botta et al. 2006; Margolis et al. 2006; O’Rourke and Swanson 2009). It has previously been shown that the ribonuclear foci in both DM1 and DM2 patients uniquely sequester certain nucleic acid binding proteins: muscleblind-like splicing regulator (MBNL) and two members of the CUGBP, Elav-like family (CELF) of RNA binding proteins, CUG-binding protein 1 (CUGBP1) and embryonically lethal abnormal vision-type RNA binding protein 3 (ETR-3), thus resulting in loss of function of these proteins (Fardaei et al. 2002; Mweller et al. 2000; N. A Timchenko et al. 2001).

Transgenic mouse model studies involving CUGBP1 gene over expression (Wang et al. 2007; Ward et al. 2010) and MBNL1 gene knockout (Kanadia et al. 2003; Machuca-Tzili et al. 2011; Suenaga et al. 2012) have been able to recapitulate some, but not all, of the classical features of DM1 and DM2. Although RNA toxicity has a major role in the pathophysiology of this disease, spliceopathy alone does not seem to fully account for all aspects of the multisystemic disorder of myotonic dystrophy (Udd and Krahe 2012; Bachinski et al. 2013). This has led the research community studying DM1 and DM2 over the years to suggest that, although different animal models have helped clarify distinct aspects of DM, a more in-depth investigation of the transcriptomics layout as well as of microRNA expression is required to fully comprehend the functional changes in DM patients.

Several recently published studies have used microarray-based techniques to study the transcriptomics (Botta et al. 2007), microRNA (Greco et al. 2012) and splicing aberration (Vihola et al. 2010; Vihola et al. 2013) layouts in DM patients. Although these studies were able to capture some of the altered gene and miRNA expression in DM patients, none of them was able to provide comprehensive and comparative information between DM1 and DM2 patients.

In the current study I present, to our knowledge for the first time, unbiased, novel, in-depth differential mRNA, miRNA and protein expression profiles of muscle biopsy samples from type 1 and type 2 myotonic dystrophy patients using the latest NGS technologies, namely RNA sequencing (RNA-seq) and microRNA sequencing (miRNA-seq). I also compared the mRNA expression profiles of skeletal muscle biopsies from three different transgenic mouse models of myotonic dystrophy with the human DM1 and DM2 mRNA expression data. I am able to present a detailed mRNA and miRNA expression layout in DM patients that provides comprehensive information on the underlying pathophysiology of DM1 and DM2 disease in humans via gene expression modulations. This work also represents the discovery of unique biological networks and transcription factors that are affected in DM patients and the identification of potential biomarkers. Mass-scale protein expression profiling using the Isobaric Tags for Relative and Absolute Quantification (iTRAQ) technique aided our analysis by providing ample support for translational changes in tandem with transcriptional changes.

I sequenced mRNA and small RNA libraries prepared from skeletal muscle biopsies of DM1 (n=7) and DM2 (n=6) patients along with control (n=3) individuals using the Illumina HiSeq 2000 platform (see methods), generating more than 1.7 billion RNA-seq reads (Table 4.2) and 344 million miRNA-seq reads (Table 4.3). More than 307 million RNA-seq reads (Table 4.31) were generated from the 4 muscle biopsy tissues from the transgenic mouse models. I devised and used a unique in-house pipeline (Figure 3.1) for analyzing and quantifying the expression data generated in our study. I have used this pipeline for quantifying mRNA and miRNA expression data for various published and in-submission collaborative projects (Powell et al. 2012; Bohbot et al. 2013; Sparks, Vinyard, and Dickens 2013; Brissova et al. 2014; Reinert et al. 2014). I quantified and profiled the expression levels of 20,637 protein coding genes and 2,104 human mature miRNAs using RNA-seq and miRNA-seq. The results showed that a substantial proportion of the reads of each RNA-seq library (approximately 80.0%) mapped to the hg19 genome build (Table 4.2); of those, approximately 5% were identified as spliced reads. From the small RNA sequencing, approximately 50% of the reads were annotated to human miRNAs in miRBase 20.

Our mRNA and miRNA expression data from the human patient samples clustered DM1 and DM2 patients as separate entities from the control individuals. Overall, I detected a higher number of differentially expressed genes (DEGs) in DM2 patients (4,180) than in DM1 patients (3,074), and I also found that the majority of the differentially expressed genes were up regulated in both DM1 (69.2%) and DM2 (85.3%) patients. Likewise, I was able to detect a significant number of differentially expressed miRNAs in DM1 (453) and DM2 (369) patients. Among these, the majority of the miRNAs were up regulated, i.e., 428 (94.4%) in DM1 and 344 (93.2%) in DM2 patients, while only 25 miRNAs each in DM1 and DM2 patients were down regulated. This is by far the greatest number of mRNAs and miRNAs profiled in a single study of skeletal muscle biopsies from myotonic dystrophy patients, compared with any previously published report (Botta et al. 2007; Greco et al. 2012; Vihola et al. 2013).

From the RNA-seq data set in our study I did not observe any significant alteration in the expression of either the DMPK or the CNBP gene itself in any DM1 or DM2 patient (Figure 4.4b). This result agrees with some recent findings (Botta et al. 2006; Margolis et al. 2006) and contradicts a role for DMPK and CNBP insufficiency in DM1 and DM2 disease pathogenesis (Davis et al. 1997). My findings indicate that the repeat expansions do not appear to have any role in altering the expression of the gene itself, and that the downstream molecular effects of the DM1 and DM2 mutations are triggered by the accumulation of the repeat tracts alone. This supports the RNA gain-of-function model as the most accredited mechanism for DM pathogenesis (Timchenko and Caskey 1996; Tian et al. 2000; Ranum and Day 2004; Udd and Krahe 2012). Our results confirm that the expression of the DMPK and CNBP mRNAs is unaffected in both DM patient groups.

Earlier published reports using various mouse, fly and zebrafish models have suggested that either MBNL gene knockout or CUGBP1 gene over expression leads to the development of DM-like symptoms in these animals (L. E. Machuca-Tzili et al. 2011; L. Machuca-Tzili et al. 2006; Kanadia et al. 2003; Suenaga et al. 2012; G.S. Wang et al. 2007). From the RNA-seq data set in our study, I did not observe any significant alteration in the expression of the MBNL1, MBNL2, MBNL3, CUGBP1 or EIF2A genes themselves in any DM1 or DM2 patient (Figure 4.4b). Our results agree with recent findings (Nezu et al. 2007) showing that the transcript levels of MBNL and CUGBP1 remain unaltered in a mouse model expressing expanded CUG repeats in the 3'UTR of the DMPK gene. The expression of these genes does not change in any of the DM patients analyzed in this study, and my results contradict earlier findings of increased CUGBP1 and EIF2A expression and decreased MBNL family gene expression in DM1 and DM2 patients. Hence, no compensatory mechanism is observed at the transcriptional level in response to nuclear sequestration of these proteins, and their misregulation at the protein level might be due to post-transcriptional activities in DM patients.

Based on these findings, I suggest that the downstream molecular events are not primarily affected by changes in the expression of these genes but are rather more complex, and therefore require a more intensive investigative approach. This requires going beyond investigating the expression levels of only a handful of genes in DM patients, which has been the focus of most past studies. We need a much broader analysis of the transcriptomics layout in DM patients to uncover the largely unknown gene expression landscape in these two patient groups. The results of such an analysis are discussed below.

I was able to identify unique similarities between DM1 and DM2 patients using the RNA-seq data. As a novel finding, I report enrichment of three critically important canonical pathways, namely Cytoskeletal signaling, Type 1 and 2 diabetes mellitus and Calcium signaling, in both DM1 and DM2 patients (Figure 4.4c). The enrichment of these pathways helped us comprehend the complex pathophysiology affecting skeletal muscles, the endocrine system and cardiac physiology in DM patients. Due to impairment of endocrine functions, DM patients are highly susceptible to diabetes mellitus (Bjarne Udd and Krahe 2012; Ranum and Day 2004). From our RNA-seq data I confirmed earlier findings (Savkur, Philips, and Cooper 2001; Savkur et al. 2004) that report lower inclusion of exon 11 of the insulin receptor (INSR) gene in DM patients as compared to control individuals. However, I did not observe any differential expression of the INSR gene as reported earlier (Morrone et al. 1997). As a novel finding, I observed enrichment of the Neuroactive ligand-receptor interaction pathway in DM patients; this pathway has recently been shown to be enriched in obese patients with a family history of type 2 diabetes (Das and Rao 2007). These results provide an explanation for the unusual insulin resistance observed in DM patients.


I observed unique negative regulation of the calcium signaling pathway, which points to disturbances of calcium homeostasis in DM patients. Our results confirm recent findings in which the calcium signaling pathway was also shown to be negatively affected in transgenic mice that expressed CTG repeats in the 3'UTR of the DMPK gene (Osborne et al. 2009). Integration of miRNA-seq data from the same muscle biopsy samples confirms these findings (Figure 4.20). I identified 13 potential novel genes targeted by 42 uniquely upregulated miRNAs in DM patients. Down regulation of these 13 genes in both DM1 and DM2 patients accounts for defective skeletal muscle structure and physiology in DM patients, resulting in decreased muscle contractility and skeletal mass and increased fatigue.

Both DM1 and DM2 patients experience many overlapping clinical manifestations (Table 2.2). After noticing no change in expression of the DMPK, CNBP (ZNF9), MBNL1 and CUGBP1 (CELF1) genes in any of the human DM patients studied in this analysis, I performed enrichment analysis on the RNA-seq derived gene expression data to identify affected canonical pathways in DM1 and DM2 patients separately. As a novel finding, I identified specific negative enrichment of eukaryotic translation initiation factor 2 (EIF2) signaling, mitochondrial dysfunction, mammalian target of rapamycin (mTOR) signaling and regulation of eukaryotic translation initiation factor 4 (EIF4) and p70S6K in DM1 patients (Figure 4.5a). All of these pathways are critically important for organismal growth and development and are required for normal cellular function and maintenance. Both EIF2 and EIF4 play critical roles in translational regulation: EIF2 bound to GTP transfers the initiator methionine-transfer RNA to the 40S ribosomal subunit-mRNA complex, EIF4F is required for the formation of the 48S pre-initiation complex, and p70S6K is a serine-threonine kinase that participates in the control of protein synthesis (Chaudhuri, Si, and Maitra 1997).

mTOR is required for the phosphorylation of p70S6K on threonine 389 and is therefore necessary for full activation of p70S6K. In addition, mTOR complex 2 (mTORC2) regulates cytoskeletal dynamics (Bodine et al. 2001; Gingras, Raught, and Sonenberg 2001; Zoncu, Efeyan, and Sabatini 2011). Aberrant mTOR signaling is involved in many diseases, including cancer, cardiovascular disease, and metabolic and neuromuscular disorders. Negative enrichment of these critically important signaling pathways suggests down regulation of protein synthesis along with defective skeletal muscle structure and physiology in DM1 patients.

As a novel finding, I report extensive up regulation of actin cytoskeleton associated canonical pathways in DM2 patients (Figure 4.5b), mediated via activation of Rho family GTPases. This was also confirmed by RNA-seq results from skeletal muscle biopsies of the mouse DM2 knock-in model (Figure 4.27). Rho family proteins play a unique role in cytoskeleton reorganization and induce massive actin cytoskeletal remodeling events in atrophic muscles (Leung et al. 1998; Chockalingam et al. 2002; Sit and Manser 2011; Wallace, Lamon, and Russell 2012). Muscular weakness and atrophy are clinical hallmarks of DM2. Activation of these critically important pathways suggests that the substantial actin filament based activity might be a response to muscular atrophy in DM2 patients. Although the muscular symptoms experienced by DM1 patients are conventionally much more severe than those of DM2 patients, these pathways were surprisingly only minimally enriched in DM1 patients.


Here I present an original effort to define the transcription factors (TFs) regulating skeletal muscle pathophysiology in DM patients using a multifaceted approach that combines gene expression profiling and computational methods. To our knowledge, this is the first time that such an approach has been used to dissect the genetic landscape in DM patients. TFs are essential regulators that control expression of specific genes. When examining gene expression data from DM patients, it is reasonable to assume that muscular TFs play a critical role because they directly regulate a number of muscle specific genes that are differentially expressed in skeletal muscles of DM patients (Blais et al. 2005; Braun and Gautel 2011).

Nuclear factor κB (NF-κB) primarily regulates the immune response to inflammation and activates cell survival responses; however, its role in muscle development, physiology, and disease has only begun to be elucidated (Mourkioti and Rosenthal 2008). It has been proposed that NF-κB is a negative regulator of myogenesis and skeletal muscle differentiation (Lee, Squillace, and Wang 2007). NF-κB activity has been shown to increase greatly in various muscular dystrophies such as Duchenne muscular dystrophy (DMD) (Monici et al. 2003) and limb-girdle muscular dystrophy type 2A (LGMD2A) (Baghdiguian et al. 1999). I report for the first time the activation of the NF-κB transcription factor in skeletal muscles of both DM1 and DM2 patients. This predicted increase in NF-κB activity is attributed to the increased incidence of muscular atrophy in DM patients. Our findings suggest that DM belongs to the class of inflammatory myopathies characterized by elevated expression of pro-inflammatory cytokines, with predicted development of neuromuscular disorders in DM patients.


Hypoxia-inducible factor-1 alpha (HIF1A) is a key transcriptional regulator responsible for the expression of genes that facilitate adaptation and survival of cells in distress (Ke and Costa 2006). HIF-1 has been shown to be activated in cancer patients (Zhong et al. 1999; Talks et al. 2000), myocardial ischemia and infarction (Lee et al. 2000), wound healing (Elson et al. 2000) and neuromuscular diseases such as dermatomyositis and vasculitic neuropathies (Probst-Cousin, Neundörfer, and Heuss 2010). I report predicted activation of the HIF1A TF in skeletal muscle of DM patients. This activation is attributed to an increased response to the ongoing atrophy in the skeletal muscles of DM patients.

Canonical β-catenin (CTNNB1) signaling mediated via the WNT signaling pathway plays an important role in the induction of myogenesis (Chen, Ginty, and Fan 2005), in tandem with protein kinase A (PKA) and cAMP-responsive element-binding protein (CREB). Here I describe evidence of post-atrophic muscular remodeling and functional differentiation mediated via activation of the CTNNB1, forkhead box O1 (FOXO1) and CREB transcription factors, which are observed to be up regulated in DM patients. The predicted effect of the up regulated downstream genes is activation of microtubule dynamics, which is essential for establishing molecular transport and regulation of cytoskeletal reorganization.

The MYC family of transcription factors consists of three major members: c-myc, N-myc and L-myc. Their roles have been well studied in cell growth and proliferation, maintenance of cytoskeletal structure, cellular adhesion and motility, tumorigenesis, and apoptosis (Eisenman 2001; Levens 2002; Levens 2003; Lin et al. 2012). Our data predict that the activity of MYCN is inhibited in DM patients without any change in expression of the CNBP gene itself. This differs from a previous report in which a CNBP knockout mouse model showed reduced expression of the myc transcription factor, resulting in forebrain truncation and various craniofacial defects in mouse embryos (Chen et al. 2003).

SMARCA4 (also known as BRG1) belongs to the ATP-dependent chromatin-remodeling Switch/Sucrose Non-Fermentable (SWI/SNF) family of proteins (de la Serna, Ohkawa, and Imbalzano 2006), which play an important role in smooth muscle development. BRG1 knockout mouse models exhibit cardiopulmonary defects resulting in neonatal mortality (M. Zhang et al. 2011). Peroxisome proliferator-activated receptor gamma coactivator 1 (PGC-1) family transcription factors are highly expressed in organs that have high oxidative requirements such as heart, brown adipose tissue, skeletal muscle, and kidney. They play a key role in the regulation of mitochondrial biogenesis and metabolism by modulating expression of genes regulating oxidative phosphorylation and fatty acid oxidation (Riehle and Abel 2012). Misregulation of fatty acid metabolism is common in diabetic patients. It has been extensively documented that diabetic patients suffer from alterations in cardiac function and are at 3 times greater risk of congestive heart failure (Rodrigues, Cam, and McNeill 1995; Avogaro et al. 2004). PPAR-gamma TFs play important roles in transcriptional regulation in the heart and contribute significantly to the changes that occur in cardiac physiology, especially in diabetic patients (Duncan 2011).

I predict unique inhibition of the SMARCA4, peroxisome proliferator-activated receptor gamma, coactivator 1 alpha (PPARGC1A) and peroxisome proliferator-activated receptor gamma (PPARG) TFs in DM patients. The collective inhibitory effect of these TFs leads to increased insulin resistance and development of cardiac hypertrophy in DM patients, along with decreased muscle contraction and increased symptoms of myopathy.

Using our genome-wide expression data coupled with previous observations regarding TF activity, I am able to provide comprehensive information about the predicted misregulation of TFs in DM patients in response to atrophy, stress and regeneration. Regardless of the precise role of these TFs, our approach illustrates a powerful strategy for the identification of novel TFs involved in DM pathophysiology and suggests future studies of stress and damage response pathways in these patients.

Profiling RNA expression patterns by RNA-seq in DM patients helped us understand much previously unknown information about the misregulation of signaling cascades and TFs in human and mouse skeletal muscle biopsy samples from myotonic dystrophy. However detailed and comprehensive our knowledge of the mRNA expression layout in these patients became, we also needed to understand their microRNA (miRNA) expression layout. miRNAs are critical post-transcriptional regulators of mRNA expression that have been demonstrated to regulate the spatial pattern of gene expression (Ambros 2001; Buckingham 2003). miRNAs have been shown to have a large impact on our understanding of the progression of various diseases including cancer (John et al. 2013; Kim and Kim 2013; Nana-Sinkam and Croce 2013), neurological diseases (Dogini et al. 2013; Rao, Benito, and Fischer 2013), age related diseases and ageing (Smith-Vikos and Slack 2012; Dimmeler and Nicotera 2013), diabetes, heart failure (Fernandes-Silva et al. 2012; Asrih and Steffens 2013; Kumarswamy and Thum 2013) and several neuromuscular disorders (Eisenberg et al. 2007; Greco et al. 2012; Mahadevan 2012).


High throughput miRNA profiling of neuromuscular disorders has rarely been published. At the time of writing, a PubMed search for "miRNA sequencing + muscles" returned only 69 publications, none of them on a neuromuscular disorder. To the best of our knowledge, ours is the first published miRNA sequencing study of skeletal muscle biopsies from either type 1 or type 2 myotonic dystrophy patients. Given the great similarities in gene expression between DM1 and DM2 patients discussed earlier, I was not surprised to observe that a far greater number of differentially expressed miRNAs (332) were common to DM1 and DM2 patients than were unique to either group. At the same time, I was also able to identify miRNAs uniquely differentially expressed in DM1 (121) and DM2 (37) patients. Target analysis of the overlapping differentially expressed miRNAs in DM1 and DM2 patients clearly showed targeting of genes regulating the skeletal and muscular system, developmental processes, and the endocrine and reproductive systems.
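The shared and disease-specific counts quoted above amount to a partition of two lists of differentially expressed miRNAs. A minimal Python sketch follows, assuming each list has already been reduced to a set of miRNA identifiers. The function name and the tiny example sets are illustrative assumptions, not the pipeline used in this study, and `hypothetical-miR-X` is a placeholder rather than a real miRNA.

```python
def partition_shared_unique(dm1: set, dm2: set):
    """Split two sets of differentially expressed miRNAs into
    (shared by both, unique to DM1, unique to DM2)."""
    shared = dm1 & dm2
    return shared, dm1 - dm2, dm2 - dm1


# Toy input for illustration only; the real lists hold hundreds of miRNAs.
dm1_mirnas = {"miR-206", "miR-323b-3p", "miR-375"}
dm2_mirnas = {"miR-206", "miR-323b-3p", "hypothetical-miR-X"}

shared, dm1_only, dm2_only = partition_shared_unique(dm1_mirnas, dm2_mirnas)
```

Applied to the full lists, the sizes of the three returned sets would correspond to the common, DM1-specific and DM2-specific counts reported in the text.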

Our results show that muscle specific miR-1 was indeed the most highly expressed miRNA in skeletal muscles of controls as well as DM patients. However, as previously suggested, our analysis does not show miR-1 to be either up or down regulated in any DM patients (Perbellini et al. 2011). miR-206 is unique in that it is expressed only in skeletal muscles (McCarthy 2008) and belongs to the class of myomiRs (myo = muscle + miR = miRNA). I observed up regulation of this miRNA in both DM1 and DM2 patients. This finding agrees with previous reports of higher expression of miR-206 in Duchenne muscular dystrophy (DMD) patients (Mizuno et al. 2011; W. Zhang et al. 2011). Recently, expression levels of miR-206 have also been reported to be elevated in the blood of a dystrophin-deficient muscular dystrophy mouse model and of the canine X-linked muscular dystrophy in Japan dog model (CXMD(J)) (Mizuno et al. 2011). These results suggest that miR-206 expression could serve as a reliable biomarker for diagnosis of myotonic dystrophy.

In comparison with previous publications, there is a clear difference in the expression levels of abundant miRNAs in DM1 and DM2 patients, with miR-323b-3p and miR-196-5p identified as highly expressed in both DM1 and DM2 patients. Both of these miRNAs are novel predictions for expression in DM patients. miR-196a-5p has been shown to be up regulated in patients with glioma related epilepsy (You et al. 2012) and prostate cancer (Ambs et al. 2008). Although not much information is available on miR-323b-3p expression, it has been predicted to be a unique biomarker for detection of ectopic pregnancy (Zhao et al. 2012), and it has also been shown to regulate embryonic ectoderm development in mouse embryonic stem cells (Zhang et al. 2013). Recently, the first report of up regulation of miR-323b-3p was made in synovial fibroblast biopsies of rheumatoid arthritis patients (Pandis et al. 2012); the authors predicted that miR-323b-3p is a positive regulator of the Wnt/beta-catenin pathway in these patients. Indeed, from our RNA-seq data I saw up regulation of the beta-catenin 1 TF, and I also predicted that DM patients have a higher chance of acquiring arthritis via activation of the NF-kB TF pathway. Jointly, these results suggest that elevated levels of miR-323-3p can serve as a potential candidate for therapeutic intervention to relieve arthritis in DM patients.

Our results also show unique up regulation of miR-375, miR-122-5p, miR-377-5p, and miR-378 in DM1 patients; however, these miRNAs were not differentially expressed in DM2 patients. miR-375 has recently been shown to be up regulated during breast lobular neoplasia progression (Giricz et al. 2012); the authors predicted that increased expression of miR-375 in normal breast epithelial cell lines resulted in the loss of cellular organization and acquisition of a hyperplastic phenotype. miR-377-5p has been shown to be elevated in heart tissues of patients with heart failure (Thum et al. 2007) and in diabetic nephropathy (Q. Wang et al. 2008). miR-122 is a liver specific miRNA that specifically targets the AKT3 gene (Nana-Sinkam and Croce 2013; Nassirpour, Mehta, and Yin 2013) and has been shown to restore tumor cell proliferation. miR-378a has been reported to be over expressed in the serum of renal cell carcinoma patients (Hauser et al. 2012), and miR-378-3p has recently been shown to be down regulated in DM2 patients (Greco et al. 2012). Our data suggest otherwise: I saw no change in expression of this miRNA in DM2 patients, but instead a mild up regulation in our DM1 patient subset. Going forward, I envision that the expression of these miRNAs in skeletal muscle of DM1 patients should be further investigated for their potential role as biomarkers for differential diagnosis of DM1 and DM2 disease.

Muscular dystrophy and insulin resistance are central to the pathophysiology of DM patients. In this study, I provide evidence that miRNA-seq is a promising tool with the potential to identify crucial pathophysiological features of myotonic dystrophy. I was able to identify a collection of 16 known miRNAs with altered expression in DM patients that are implicated in the development of various forms of muscular dystrophy. Expression of these miRNAs has been predicted in different neuromuscular diseases; I predict 13 of the 16 miRNAs as novel in DM patients. Likewise, I present 30 uniquely expressed miRNAs that are implicated in the development of insulin resistance in human and animal model studies. These collections of miRNAs could serve as unique biomarkers for identification of type 1 and type 2 myotonic dystrophies.

In addition to providing a high throughput expression profile of miRNAs in DM patients, I was interested in pooling miRNA data with mRNA expression data to understand their functional interactions. Since most miRNAs negatively regulate the expression of mRNAs (Ambros 2001; Buckingham 2003), and a single miRNA might have multiple mRNA targets, analyzing a group of miRNAs according to the ontology profiles of their targeted mRNAs is a common approach (Griffiths-Jones et al. 2006; Liang et al. 2007; Deng et al. 2011; Osanto et al. 2012; Nana-Sinkam and Croce 2013). Further, owing to the variable rates of false positive and false negative results of miRNA target prediction tools, it is highly advisable to use mRNA data from the same system for comparison, to achieve higher confidence in targets estimated by computational methods (Deng et al. 2011; Velthut-Meikas et al. 2013).

Since the miRNA database is rapidly expanding, finding a suitable miRNA target prediction tool is another challenge. I therefore used Ingenuity Pathway Analysis (IPA) software for target prediction. IPA merges various databases, such as TargetScan, TarBase and miRecords, with its own curated database for miRNA/mRNA target predictions. The mRNA expression signatures from the same patients provided the basis to look for target genes of the differentially expressed miRNAs whose misregulation may contribute to the pathology of these disorders. To minimize the number of false positive hits, the following criteria were used: a) only significantly differentially expressed miRNAs and mRNAs were used for target analysis; b) only miRNA/mRNA pairs showing reciprocal relationships were included; c) target prediction was performed using IPA software only on candidates passing both criteria. Using expression pairing analysis, I found more than 2400 and 3400 miRNA/mRNA interactions in DM1 and DM2 patients, respectively. It is worth noting that several mRNAs were found to be targeted by more than one miRNA.
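The reciprocity criterion (b) can be illustrated with a short Python sketch. It assumes signed fold changes, positive for up regulation and negative for down regulation, as in the appendix tables. The `Pair` type, the function names and the numeric values are hypothetical and for illustration only; this is not the IPA implementation, and the third candidate is a made-up pair included to show a rejected case.

```python
from typing import List, NamedTuple


class Pair(NamedTuple):
    mirna: str
    mirna_fc: float  # signed fold change of the miRNA
    mrna: str
    mrna_fc: float   # signed fold change of the predicted target mRNA


def is_reciprocal(pair: Pair) -> bool:
    """True when the miRNA and its predicted target change in opposite directions."""
    return pair.mirna_fc * pair.mrna_fc < 0


def reciprocal_pairs(pairs: List[Pair]) -> List[Pair]:
    """Keep only the miRNA/mRNA pairs with a reciprocal expression relationship."""
    return [p for p in pairs if is_reciprocal(p)]


# Illustrative values only (not study data).
candidates = [
    Pair("hsa-miR-23a-3p", 2.5, "MYH1", -8.7),
    Pair("hsa-miR-485-5p", 4.3, "MYLK2", -2.9),
    Pair("hypothetical-miR", 2.0, "HYPOTHETICAL_GENE", 3.1),  # same direction, dropped
]
kept = reciprocal_pairs(candidates)
```

Only pairs in which an up regulated miRNA is matched with a down regulated target, or the reverse, survive the filter, which is the sense in which expression pairing reduces false positive target predictions.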

Our target prediction approach identified crucial miRNA/mRNA interactions and outlined distinct targeting of genes regulating actin cytoskeletal signaling, calcium signaling, cardiac hypertrophy signaling and axonal guidance signaling. All four of these critically important signaling pathways are predicted to be similarly affected in both DM1 and DM2 patients. Targeting of these pathways in both patient types is not surprising because of their crucial roles in maintaining cytoskeletal structure and function and cardiac physiology in normal individuals. This modification of signaling pathways has a significant impact on DM disease pathogenesis. These findings reinforce our mRNA expression findings and help us understand the underlying pathophysiology in DM patients. Specific targeting of these pathways is a novel prediction and has not yet been reported in any previous study involving DM patients (Eisenberg et al. 2007; Perbellini et al. 2011; Greco et al. 2012). A direct comparison of our study with these published findings is limited by the fact that those reports were restricted by the number of probe sets used for microarray analysis.

Although the assayed tissue was skeletal muscle biopsies from DM patients, I observed significant misregulation of mRNAs and miRNAs associated with the development of cardiac hypertrophy, which is commonly observed in these patients. Another hallmark of myotonic dystrophy is testicular dysfunction resulting in infertility; it was interesting to note that the "azoospermia" function was also represented in this analysis.

I showed similarities and differences in mRNA expression between DM1 and DM2 skeletal muscle biopsy samples. I also provided supplemental support for my results by using RNA-seq information derived from transgenic mouse skeletal muscle biopsy samples. Using the same muscle biopsy samples, the iTRAQ experiment corroborated the RNA-seq findings, confirming that structural proteins, cytoskeletal proteins and calcium binding proteins are indeed the major classes of misregulated proteins in both DM1 and DM2 patients. Our analysis showed that myotonic dystrophy is a complex disorder with distinct mRNA and miRNA expression patterns that can be used to distinguish affected muscle from normal muscle, pointing to the diagnostic potential of the expression profiling presented in this study in muscular diseases.


Chapter 6

CONCLUSIONS

Repeat expansion mutations have been shown to be the histopathological hallmark of type 1 and type 2 myotonic dystrophy. Other than the congenital form of DM1, myotonic dystrophies are difficult to diagnose because the symptoms in early stages are nonspecific. Despite similar clinical and genetic features, DM1 and DM2 are two separate diseases that require a clear differential diagnosis and their own management strategies. I undertook an unbiased, intensive investigative approach to comprehend the molecular events participating in the pathophysiology of this disease. To do so, I profiled the global mRNA, miRNA and protein expression landscape in these patients.

Given the recent advances in the field of genomics with the advent of ultra high-throughput sequencing technologies such as RNA-seq and miRNA-seq, I was able to study the global expression profiles in skeletal muscle biopsies from DM1 and DM2 patients. This study is the first of its kind to leverage the power of whole transcriptome profiling coupled with whole miRNA profiling and protein expression, which provided valuable insight into the differential expression of genes, miRNAs and proteins. I was able to identify unique as well as overlapping gene expression patterns in skeletal muscle biopsy samples from DM1 and DM2 patients. Using RNA-seq, I was able to confirm that the repeat expansion does not appear to have any role in altering the expression of either the DMPK or the CNBP gene itself. Also, I was able to ascertain that no compensatory mechanisms were observed at the transcriptional level in response to nuclear sequestration of the MBNL1, MBNL2, MBNL3, CUGBP1 and EIF2A proteins. Thus I support the "RNA gain-of-function" hypothesis for DM disease pathogenesis.

I was able to evaluate affected biological processes, transcription factors and canonical pathways in these patients. I found that cellular processes regulating cell adhesion, cytoskeletal signaling, diabetes and calcium signaling were commonly misregulated in DM patients. While EIF2 and mTOR signaling were found to be uniquely enriched in skeletal muscles of DM1 patients, I was able to provide conclusive evidence that actin cytoskeletal signaling mediated via Rho GTPase family proteins is a characteristic finding in DM2 patients. These findings were also validated using RNA-seq data from skeletal muscle biopsies of transgenic mouse models exhibiting phenotypic attributes of DM1 and DM2 disease.

Using the genome-wide expression data coupled with previous observations regarding transcription factor activity in DM1 and DM2 patients, I was able to provide detailed information about novel misregulation of the NF-kB, HIF1A, CTNNB1 and myc transcription factors in DM patients. Enrichment analysis of these misregulated transcription factors helped us identify molecular signatures of muscular atrophy, irregularities in muscular contraction, development of insulin resistance, hypertrophy of the heart and loss of muscle mass. I was also able to provide detailed evidence of a muscular remodeling response in atrophied muscle fibers in DM patients.

microRNA (miRNA) expression analysis of the same skeletal muscle biopsy tissues from DM1 and DM2 patients helped me comprehend the post-transcriptional regulation of mRNA expression in these patients. Most of the differentially expressed miRNAs are commonly misregulated in DM patients; I was also able to identify uniquely expressed miRNAs in both DM1 and DM2 patients. Target analysis of the differentially expressed miRNAs in DM1 and DM2 patients identified crucial miRNA/mRNA interactions and outlined distinct targeting of genes regulating the skeletal and muscular system, developmental processes, and the endocrine and reproductive systems. This adaptation of signaling pathways has a significant impact on DM disease pathogenesis. These findings reinforce our mRNA expression findings.

Our study indicates that myotonic dystrophy is a complex disorder with distinct mRNA, miRNA and protein expression patterns that can be used to distinguish affected muscle from normal muscle, pointing to the diagnostic and therapeutic potential of the expression profiling presented in this study in muscular diseases.


Chapter 7

FUTURE DIRECTIONS

7.1: Larger study with more samples and data integration

Myotonic dystrophy belongs to a family of neuromuscular disorders (NMDs). Recently, these disorders have been the subject of intense research because of their intriguing pathophysiology. Over the years, sufficient clinical, histological and medical information has been gathered from myotonic dystrophy patients. However, due to technological limitations and a lack of adequate funding support, large scale transcriptomic and proteomic profiling data were missing for this disease. The integrated analysis approach proposed in this study opens that door by providing a comprehensive database from DM1 and DM2 patients that will allow decoding of the complex biological networks associated with this disease. Applying this technique has helped us better understand the differences between DM1 and DM2 patients at the genomic level.

Additionally, examining other closely related NMDs such as facioscapulohumeral muscular dystrophy (FSHD), Miyoshi myopathy (MM), limb girdle muscular dystrophy (LGMD), polymyositis (PM), nemaline myopathy (NM), Duchenne muscular dystrophy (DMD) and LMNA-related congenital muscular dystrophy A and B (LMNA/LMNB) using a similar approach is likely to uncover common and unique differences among patients with these diseases. At the same time, recruiting more patients in future integrated studies will provide more power to the analysis and thus help improve the predictability of these studies.


7.2: Development of diagnostic panels

In this study I reported various unique mRNA and miRNA expression patterns in DM1 and DM2 patients. These candidate genes and miRNAs can be used as biomarkers for identification of patients with these diseases. A future study including analysis of patients' blood and cardiac muscle biopsies should broaden our knowledge and help establish the utility of these biomarkers. Also, a comparison of various closely related NMDs (as mentioned in the previous section) should help us narrow down the list of unique biomarkers for early detection of DM.

7.3: Cell lines and drug testing

Recently, various animal models have been used to mimic phenotypic attributes of DM1 and DM2 patients. We also studied transcriptomic and proteomic expression in skeletal muscles of 3 transgenic models. Although all of the transgenic studies, including ours, have provided significant insight into the molecular mechanism of DM disease, none of the animal models has been able to reproduce every characteristic feature exhibited by human patients. In DM patients, we identified enrichment of novel canonical pathways and transcription factors such as nuclear factor κB (NF-κB), hypoxia-inducible factor-1 alpha (HIF1A), the β-catenin (CTNNB1) mediated WNT signaling pathway, the MYC family of transcription factors, and the actin cytoskeletal and calcium signaling pathways. Future studies in human myoblast cell lines characterized for these alterations could provide better insight into the molecular mechanism of misregulation of these TFs and pathways. Various drugs targeting these candidate TFs and pathways are already available, so drug modulation can be readily studied in these models.


Appendix A

A.1: Ovation® RNA-Seq System V2 library preparation protocol.


A.2: Top 10 uniquely differentially expressed genes in DM1 patients as compared to control individuals as determined by RNA-seq.

Regulation  Gene Symbol   Description                                                           FC
Up          AIF1          Allograft inflammatory factor 1                                       10.55
Up          KCNA1         Potassium voltage-gated channel, shaker-related subfamily, member 1   8.17
Up          ANKRD30BL     Ankyrin repeat domain 30B-like                                        7.76
Up          AMY1B         Amylase, alpha 1B                                                     6.29
Up          C6orf25       Chromosome 6 open reading frame 25                                    6.02
Up          STH           Saitohin                                                              5.92
Up          LHX9          LIM homeobox 9                                                        5.91
Up          CLIC6         Chloride intracellular channel 6                                      5.61
Up          CFHR3         Complement factor H-related 3                                         5.59
Up          TPTE2         Transmembrane phosphoinositide 3-phosphatase and tensin homolog 2     5.55
Down        CCL21         Chemokine (C-C motif) ligand 21                                       -17.00
Down        DOHH          Deoxyhypusine hydroxylase/monooxygenase                               -11.72
Down        PRAF2         PRA1 domain family, member 2                                          -9.18
Down        C19orf68      Chromosome 19 open reading frame 68                                   -8.79
Down        CCDC9         Coiled-coil domain containing 9                                       -8.43
Down        ENTHD1        ENTH domain containing 1                                              -8.15
Down        CCNJL         Cyclin J-like                                                         -7.45
Down        MZT2B         Mitotic spindle organizing protein 2B                                 -7.41
Down        GNG5P2        Guanine nucleotide binding protein (G protein), gamma 5 pseudogene 2  -7.11
Down        POLR2J2       Polymerase (RNA) II (DNA directed) polypeptide J2                     -6.99

FC = Fold change


A.3: Top 10 uniquely differentially expressed genes in DM2 patients as compared to control individuals as determined by RNA-seq.

Regulation  Gene Symbol   Description                                                       FC
Up          CAPG          Capping protein actin filament, gelsolin-like                     36.32
Up          SOX10         SRY sex determining region Y-box 10                               30.83
Up          FCRLA         Fc receptor-like A                                                28.42
Up          TOP2A         Topoisomerase DNA II alpha 170kDa                                 27.61
Up          KCNAB2        Potassium voltage-gated channel, beta member 2                    26.44
Up          CDKN2A        Cyclin-dependent kinase inhibitor 2A                              24.66
Up          APOE          Apolipoprotein E                                                  22.06
Up          YDJC          Ydjc homolog                                                      20.41
Up          IQGAP3        IQ motif containing GTPase activating protein 3                   20.03
Up          GTSE1         G-2 and S-phase expressed 1                                       19.65
Down        TMED7-TICAM2  TMED7-TICAM2 readthrough                                          -11.77
Down        NLGN4Y        Neuroligin 4, Y-linked                                            -7.11
Down        HAUS5         HAUS augmin-like complex, subunit 5                               -6.30
Down        TUBA8         Tubulin, alpha 8                                                  -5.05
Down        SYT13         Synaptotagmin XIII                                                -5.04
Down        UTY           Ubiquitously transcribed tetratricopeptide repeat gene, Y-linked  -4.97
Down        GSDMC         Gasdermin C                                                       -4.86
Down        C1orf127      Chromosome 1 open reading frame 127                               -4.76
Down        PPP1R3C       Protein phosphatase 1, regulatory subunit 3C                      -4.71
Down        CCDC160       Coiled-coil domain containing 160                                 -4.64

FC = Fold change


A.4: List of top 20 KEGG pathways regulating muscular physiology and functions predicted to be targeted by 16 unique miRNAs implicated in various forms of muscular dystrophy in DM patients according to DIANA miRPath v.2.1.

KEGG pathway                                               FDR adjusted p-value  No. of targeted genes  No. of miRNAs targeting pathways
Wnt signaling pathway                                      7.40E-18              66                     14
Focal adhesion                                             2.21E-17              81                     14
PI3K-Akt signaling pathway                                 7.60E-17              124                    14
Axon guidance                                              6.92E-16              59                     13
MAPK signaling pathway                                     8.33E-16              99                     14
Adherens junction                                          3.06E-13              40                     12
Insulin signaling pathway                                  1.57E-12              55                     13
Regulation of actin cytoskeleton                           2.08E-10              80                     13
Dorso-ventral axis formation                               6.17E-08              13                     10
Gap junction                                               8.69E-08              35                     12
Arrhythmogenic right ventricular cardiomyopathy (ARVC)     1.72E-06              29                     11
Tight junction                                             1.80E-06              51                     13
Osteoclast differentiation                                 6.47E-06              46                     12
Dilated cardiomyopathy                                     5.20E-04              31                     11
Hypertrophic cardiomyopathy (HCM)                          2.18E-03              28                     11
VEGF signaling pathway                                     2.87E-03              23                     13
Calcium signaling pathway                                  6.86E-03              55                     11
Vascular smooth muscle contraction                         2.05E-02              39                     12
Type II diabetes mellitus                                  2.51E-02              16                     11
Endocrine and other factor-regulated calcium reabsorption  3.62E-02              19                     13

A.5: List of top 20 KEGG pathways predicted to be targeted by 30 unique miRNAs implicated in development of insulin resistance in DM patients according to DIANA miRPath v.2.1.

KEGG pathway                                                FDR adjusted p-value  No. of targeted genes  No. of miRNAs targeting pathways
Pathways in cancer                                          6.69E-54              184                    29
PI3K-Akt signaling pathway                                  3.41E-48              183                    29
MAPK signaling pathway                                      1.02E-39              147                    27
Focal adhesion                                              7.17E-30              114                    27
Wnt signaling pathway                                       4.30E-27              95                     25
Regulation of actin cytoskeleton                            4.29E-25              113                    26
Hepatitis B                                                 7.23E-25              82                     27
Axon guidance                                               1.14E-22              76                     25
Neurotrophin signaling pathway                              2.20E-21              78                     28
Ubiquitin mediated proteolysis                              4.34E-20              74                     27
Insulin signaling pathway                                   9.98E-20              74                     26
Chemokine signaling pathway                                 1.75E-19              93                     25
Dopaminergic synapse                                        2.92E-19              70                     27
Transcriptional misregulation in cancer                     3.53E-19              94                     26
Retrograde endocannabinoid signaling                        8.98E-19              62                     24
ErbB signaling pathway                                      7.25E-18              60                     25
TGF-beta signaling pathway                                  2.12E-16              54                     25
Melanogenesis                                               2.59E-16              58                     26
HIF-1 signaling pathway                                     2.59E-16              60                     27

A.6: List of 18 miRNAs targeting 6 mRNAs that regulate the actin cytoskeleton signaling pathway in both DM1 and DM2 patients, along with their expression values.

miRNA                               mRNA
Symbol            FC DM2   FC DM1   Targeted gene symbol  FC DM2   FC DM1
hsa-let-7a-5p     2.06     2.65     ACTA1                 -2.99    -2.08
hsa-miR-200a-3p   2.01     3.88     ACTA1                 -2.99    -2.08
hsa-miR-155-5p    2.67     2.95     ACTA1                 -2.99    -2.08
hsa-miR-191-5p    1.99     2.47     ACTA1                 -2.99    -2.08
hsa-miR-223-3p    2.69     2.72     ACTA1                 -2.99    -2.08
hsa-miR-3158-3p   2.38     2.23     ACTA1                 -2.99    -2.08
hsa-miR-23a-3p    2.49     2.93     MYH1                  -8.70    -20.32
hsa-miR-342-3p    1.78     1.99     MYH13                 -6.86    -8.81
hsa-miR-377-3p    1.86     2.82     MYH13                 -6.86    -8.81
hsa-miR-30b-3p    1.87     2.11     MYH14                 -2.52    -2.09
hsa-miR-483-3p    3.11     2.61     MYH14                 -2.52    -2.09
hsa-miR-125b-5p   1.78     2.66     MYL9                  -2.43    -2.25
hsa-miR-204-3p    1.51     1.86     MYL9                  -2.43    -2.25
hsa-miR-146b-3p   2.89     3.55     MYLK2                 -2.92    -2.20
hsa-miR-342-3p    1.78     1.99     MYLK2                 -2.92    -2.20
hsa-miR-485-5p    4.31     6.63     MYLK2                 -2.92    -2.20
hsa-miR-486-5p    2.02     2.33     MYLK2                 -2.92    -2.20
hsa-miR-542-5p    1.70     2.37     MYLK2                 -2.92    -2.20
hsa-miR-615-3p    2.78     4.59     MYLK2                 -2.92    -2.20

FC = Fold change
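
As an illustrative aid only, and not part of the original analysis, the miRNA-mRNA pairings in Table A.6 can be grouped by target gene to check the counts quoted in the caption. The Python sketch below transcribes only the pairings (not the fold changes) from the table; the names PAIRS and group_by_target are hypothetical and chosen for this example.

```python
from collections import defaultdict

# (miRNA, target gene) pairs transcribed from Table A.6.
PAIRS = [
    ("hsa-let-7a-5p", "ACTA1"), ("hsa-miR-200a-3p", "ACTA1"),
    ("hsa-miR-155-5p", "ACTA1"), ("hsa-miR-191-5p", "ACTA1"),
    ("hsa-miR-223-3p", "ACTA1"), ("hsa-miR-3158-3p", "ACTA1"),
    ("hsa-miR-23a-3p", "MYH1"),
    ("hsa-miR-342-3p", "MYH13"), ("hsa-miR-377-3p", "MYH13"),
    ("hsa-miR-30b-3p", "MYH14"), ("hsa-miR-483-3p", "MYH14"),
    ("hsa-miR-125b-5p", "MYL9"), ("hsa-miR-204-3p", "MYL9"),
    ("hsa-miR-146b-3p", "MYLK2"), ("hsa-miR-342-3p", "MYLK2"),
    ("hsa-miR-485-5p", "MYLK2"), ("hsa-miR-486-5p", "MYLK2"),
    ("hsa-miR-542-5p", "MYLK2"), ("hsa-miR-615-3p", "MYLK2"),
]


def group_by_target(pairs):
    """Map each target gene to the sorted list of miRNAs paired with it."""
    targets = defaultdict(list)
    for mirna, gene in pairs:
        targets[gene].append(mirna)
    return {gene: sorted(mirnas) for gene, mirnas in targets.items()}


targets = group_by_target(PAIRS)
# A miRNA that targets more than one gene is counted once.
unique_mirnas = {mirna for mirna, _ in PAIRS}

print(len(PAIRS), "pairs,", len(unique_mirnas), "unique miRNAs,",
      len(targets), "target genes")
```

The table has 19 rows but 18 distinct miRNAs because hsa-miR-342-3p is paired with both MYH13 and MYLK2.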

A.7: List of 42 miRNAs targeting 13 mRNAs that regulate the calcium signaling pathway in both DM1 and DM2 patients, along with their expression values.

miRNA                               mRNA
Symbol            FC DM2   FC DM1   Target gene symbol    FC DM2   FC DM1
hsa-let-7a-5p     2.06     2.65     ACTA1                 -3.00    -2.08
hsa-miR-200a-3p   2.01     3.88     ACTA1                 -3.00    -2.08
hsa-miR-155-5p    2.67     2.95     ACTA1                 -3.00    -2.08
hsa-miR-191-5p    1.99     2.47     ACTA1                 -3.00    -2.08
hsa-miR-223-3p    2.69     2.72     ACTA1                 -3.00    -2.08
hsa-miR-3158-3p   2.38     2.23     ACTA1                 -3.00    -2.08
hsa-miR-24-3p     2.40     3.35     ATP2A1                -4.04    -2.92
hsa-miR-542-3p    2.36     3.44     ATP2A1                -4.04    -2.92
hsa-miR-130b-3p   3.09     6.05     ATP2B2                -3.69    -3.26
hsa-miR-139-5p    1.62     1.92     ATP2B2                -3.69    -3.26
hsa-miR-144-3p    2.50     1.89     ATP2B2                -3.69    -3.26
hsa-miR-16-5p     2.60     2.66     ATP2B2                -3.69    -3.26
hsa-miR-20b-5p    4.74     4.15     ATP2B2                -3.69    -3.26
hsa-miR-181c-5p   3.94     4.99     ATP2B2                -3.69    -3.26
hsa-miR-191-5p    1.99     2.47     ATP2B2                -3.69    -3.26
hsa-miR-324-5p    2.58     1.99     ATP2B2                -3.69    -3.26
hsa-miR-361-3p    1.54     1.83     ATP2B2                -3.69    -3.26
hsa-miR-4646-3p   4.27     5.22     ATP2B2                -3.69    -3.26
hsa-miR-539-5p    3.35     5.56     ATP2B2                -3.69    -3.26
hsa-miR-204-3p    1.51     1.86     CHRNB3                -2.40    -2.40
hsa-miR-23a-3p    2.49     2.93     MYH1                  -8.71    -20.32
hsa-miR-342-3p    1.78     1.99     MYH13                 -6.86    -8.81
hsa-miR-377-3p    1.86     2.82     MYH13                 -6.86    -8.81
hsa-miR-30b-3p    1.87     2.11     MYH14                 -2.53    -2.10
hsa-miR-483-3p    3.11     2.61     MYH14                 -2.53    -2.10
hsa-miR-125b-5p   1.78     2.66     MYL9                  -2.43    -2.25
hsa-miR-204-3p    1.51     1.86     MYL9                  -2.43    -2.25
hsa-miR-125b-5p   1.78     2.66     PPP3R2                -7.56    -5.17
hsa-miR-28-3p     2.03     1.82     PPP3R2                -7.56    -5.17
hsa-miR-31-5p     2.12     3.18     PPP3R2                -7.56    -5.17
hsa-miR-4646-3p   4.27     5.22     PPP3R2                -7.56    -5.17
hsa-miR-526b-5p   1.60     3.95     PPP3R2                -7.56    -5.17
hsa-miR-183-5p    6.40     7.10     PRKACG                -3.55    -3.72
hsa-miR-361-3p    1.54     1.83     PRKACG                -3.55    -3.72
hsa-miR-550a-5p   3.27     1.97     PRKACG                -3.55    -3.72
hsa-miR-532-3p    2.55     2.83     TNNC2                 -11.04   -5.70
hsa-miR-128       3.14     4.72     TPM1                  -3.83    -3.64
hsa-miR-146b-5p   3.93     5.85     TPM1                  -3.83    -3.64
hsa-miR-149-5p    2.32     2.88     TPM1                  -3.83    -3.64
hsa-miR-151a-3p   1.84     2.13     TPM1                  -3.83    -3.64
hsa-miR-183-5p    6.40     7.10     TPM1                  -3.83    -3.64
hsa-miR-21-5p     1.79     2.29     TPM1                  -3.83    -3.64
hsa-miR-320c      2.80     3.12     TPM1                  -3.83    -3.64
hsa-miR-3591-5p   1.64     2.30     TPM1                  -3.83    -3.64
hsa-miR-4732-3p   3.06     3.44     TPM1                  -3.83    -3.64
hsa-miR-542-3p    2.36     3.44     TPM1                  -3.83    -3.64
hsa-miR-550a-5p   3.27     1.97     TPM1                  -3.83    -3.64
hsa-miR-96-5p     5.66     5.89     TPM1                  -3.83    -3.64
hsa-let-7a-5p     2.06     2.65     TPM2                  -3.47    -3.80
hsa-miR-16-5p     2.60     2.66     TPM2                  -3.47    -3.80
hsa-miR-185-5p    2.27     2.04     TPM2                  -3.47    -3.80
hsa-miR-654-5p    3.49     7.09     TPM2                  -3.47    -3.80

FC = Fold change

A.8: List of 29 miRNAs targeting 8 mRNAs that regulate the axonal guidance signaling pathway in both DM1 and DM2 patients, along with their expression values.

miRNA                               mRNA
Symbol            FC DM2   FC DM1   Targeted gene symbol  FC DM2   FC DM1
hsa-miR-103a-3p   2.26     2.28     GNA15                 -2.79    -5.84
hsa-miR-185-5p    2.27     2.04     GNA15                 -2.79    -5.84
hsa-miR-3180-3p   2.27     2.70     GNA15                 -2.79    -5.84
hsa-miR-324-5p    2.58     1.99     GNA15                 -2.79    -5.84
hsa-miR-185-5p    2.27     2.04     KLC1                  -2.60    -3.60
hsa-miR-221-3p    1.74     1.82     KLC1                  -2.60    -3.60
hsa-miR-339-3p    2.09     2.44     KLC1                  -2.60    -3.60
hsa-miR-377-3p    1.86     2.82     KLC1                  -2.60    -3.60
hsa-miR-411-5p    2.52     4.84     KLC1                  -2.60    -3.60
hsa-miR-125b-5p   1.78     2.66     MYL9                  -2.43    -2.25
hsa-miR-204-3p    1.51     1.86     MYL9                  -2.43    -2.25
hsa-let-7a-5p     2.06     2.65     NGF                   -3.87    -2.93
hsa-miR-423-5p    2.10     2.59     NGF                   -3.87    -2.93
hsa-miR-125b-5p   1.78     2.66     PPP3R2                -7.56    -5.17
hsa-miR-28-3p     2.03     1.82     PPP3R2                -7.56    -5.17
hsa-miR-31-5p     2.12     3.18     PPP3R2                -7.56    -5.17
hsa-miR-4646-3p   4.27     5.22     PPP3R2                -7.56    -5.17
hsa-miR-526b-5p   1.60     3.95     PPP3R2                -7.56    -5.17
hsa-miR-183-5p    6.40     7.10     PRKACG                -3.55    -3.72
hsa-miR-361-3p    1.54     1.83     PRKACG                -3.55    -3.72
hsa-miR-550a-5p   3.27     1.97     PRKACG                -3.55    -3.72
hsa-miR-455-3p    1.54     1.62     RHOD                  -7.69    -4.60
hsa-miR-4746-5p   2.03     2.82     RHOD                  -7.69    -4.60
hsa-miR-654-5p    3.49     7.09     RHOD                  -7.69    -4.60
hsa-miR-136-5p    2.40     3.47     WNT2                  -3.71    -3.04
hsa-miR-144-3p    2.50     1.89     WNT2                  -3.71    -3.04
hsa-miR-199a-5p   1.51     1.94     WNT2                  -3.71    -3.04
hsa-miR-204-5p    2.68     3.94     WNT2                  -3.71    -3.04
hsa-miR-361-3p    1.54     1.83     WNT2                  -3.71    -3.04
hsa-miR-409-5p    3.62     5.38     WNT2                  -3.71    -3.04
hsa-miR-486-3p    2.14     2.47     WNT2                  -3.71    -3.04
hsa-miR-532-5p    2.12     2.89     WNT2                  -3.71    -3.04

FC = Fold change

A.9: List of 9 miRNAs commonly down-regulated in DM1 and DM2 patients, along with their expression values.

miRNA             miRBase ID   FC DM2   FC DM1
hsa-miR-561-5p    MI0003567    -2.02    -1.68
hsa-miR-4705      MI0017338    -3.31    -1.86
hsa-miR-4524a-3p  MI0016891    -1.79    -2.14
hsa-miR-3688-3p   MI0016089    -2.08    -2.24
hsa-miR-3688-3p   MI0017447    -2.13    -2.59
hsa-miR-4699-3p   MI0017332    -2.70    -3.08
hsa-miR-3681-5p   MI0016082    -2.53    -3.83
hsa-miR-4432      MI0016772    -5.65    -5.38
hsa-miR-4485      MI0016846    -2.70    -6.06

FC = Fold change

A.10: Up regulation of production of ketone bodies in DM2-KI transgenic mice skeletal muscles as predicted by KEGG analysis and validated by RNA-seq expression data.

Appendix B

Oral Presentations and Posters presented at various conferences and Seminars

Abstract Title 1: Defining the Lactocrine-Sensitive Neonatal Porcine Uterine Transcriptome

Kathleen M. Rahman1, Meredith E. Camp1, Nripesh Prasad2,3, Anthony K. McNeel4, Shawn E. Levy3, Frank F. Bartol5 and Carol A. Bagnell1

1Department of Animal Sciences, Endocrinology and Animal Biosciences Program, Rutgers University, New Brunswick, NJ, USA; 2Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL, USA; 3HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA; 4USDA ARS, US Meat Animal Research Center, Clay Center, NE, USA; 5Department of Anatomy, Physiology & Pharmacology, Cellular and Molecular Biosciences Program, College of Veterinary Medicine, Auburn University, Auburn, AL, USA.

Abstract to be presented at the Annual Meeting of the Society for the Study of Reproduction, Grand Rapids, Michigan, July 19-23, 2014.

Milk-borne bioactive factors, delivered from mother to nursing offspring via a lactocrine mechanism, affect development of somatic tissues including the uterus. In the pig, lactocrine-sensitive gene expression events associated with the onset of endometrial adenogenesis between birth (postnatal day = PND 0) and PND 2 define the uterine developmental program and can determine developmental trajectory and function. Neither the neonatal porcine uterine developmental transcriptome, nor associated lactocrine-sensitive elements of this transcriptome, have been defined during this period. Here, objectives were to determine effects of age and imposition of the lactocrine-null condition for 48 h from birth by milk replacer feeding, on the uterine transcriptome at PND 2 using whole transcriptome RNA sequencing (RNAseq). Crossbred gilts (Sus scrofa domesticus, n = 4/group) were assigned at birth: (1) to have uteri collected prior to nursing, within 1 h of birth; or to be (2) nursed ad libitum for 48 h, or (3) gavage-fed commercial porcine milk-replacer for 48 h (30 ml/kg BW/2 h). Uteri were obtained from nursed and replacer-fed gilts at 50 h. For each uterus, total RNA was extracted and both RNA concentration and integrity were estimated. For RNAseq, 500 ng of RNA from each uterus was used for library preparation. Each library was sequenced (paired end) at > 90 million reads per sample, demultiplexed, and raw reads were mapped to the latest pig Sscrofa10.2 build using the Avadis NGS (Strand Scientifics) software package. Quantification was accomplished using TMM normalization and further validated by comparing RNAseq results to those obtained for targeted genes using quantitative RT-PCR. Gene enrichment and functional analyses were accomplished using the Database for Annotation, Visualization and Integrated Discovery (DAVID) and Ingenuity Pathway Analysis (IPA). Uterine gene expression was affected (P < 0.05; corrected for false discovery rate) by both age and treatment. For nursed gilts, 3283 differential gene expression events were identified between birth and PND 2. By contrast, 4662 differential gene expression events were identified between birth and PND 2 for replacer-fed gilts. On PND 2, 896 lactocrine-sensitive genes were differentially expressed in nursed as compared to replacer-fed gilts. Thus, effects of both age and treatment (nursing vs replacer feeding) on uterine gene expression patterns were evident by PND 2. Based on DAVID analyses, the top up- and down-regulated genes affected by age and treatment included genes associated with immune response, cell adhesion, ion transport, and DNA packaging. IPA analyses revealed alterations in multiple signaling pathways associated with age and/or treatment, including those linked to the matrix metalloproteinase family of molecules and the estrogen receptor signaling cascade. Neonatal uterine transcriptomic analyses can be used to refine and focus hypotheses related to identification of cellular, molecular and lactocrine mechanisms regulating endometrial development. [Support: USDA-NIFA 2013-67016-20523. USDA is an equal opportunity provider and employer.]
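
The abstract above reports significance at P < 0.05 after correction for false discovery rate but does not name the correction procedure. Purely as an illustration, the Python sketch below assumes the Benjamini-Hochberg step-up procedure; the function name and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values
    # stay monotone with respect to rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values for five hypothetical genes (assumption, not study data).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print(significant)
```

In this toy input the third and fourth genes have raw p-values below 0.05 but are no longer significant after adjustment, which is the behaviour an FDR correction is meant to provide.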

Abstract Title 2: Macrophages recruited to pancreatic islets in response to vascular endothelial growth factor-A promote β cell proliferation.

Kristie I. Aamodt1, Marcela Brissova1, Radhika Aramandla1, Nripesh Prasad2,3, Shawn E. Levy3, Alvin C. Powers1

1Vanderbilt University, Nashville, TN, USA; 2Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL, USA; 3Genomic Services Laboratory, HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.

Poster and oral presentation at the "Molecular cell biology of macrophages in human disease" meeting, which will be held in Santa Fe from Feb 9-14, 2014.

Reduced pancreatic β cell mass is a hallmark of diabetes, which makes the ability to increase or restore β cell mass a major therapeutic goal. However, factors that effectively stimulate β cell proliferation have not been identified. While testing the hypothesis that increased endothelial cell (EC) signaling would increase β cell mass using a model of inducible vascular endothelial growth factor-A (VEGF-A) overexpression in β cells (βVEGF-A mouse), we found that increased VEGF-A leads to reduced, not increased, β cell mass. Surprisingly, withdrawal of the VEGF-A stimulus is followed by robust β cell proliferation, leading to islet regeneration, normalization of β cell mass, and reestablishment of the intra-islet capillary network. Using islet and bone marrow (BM) transplantation approaches, we found that β cell proliferation is dependent on the local microenvironment of ECs, β cells, and BM-derived macrophages (Mphs) recruited to the islets upon VEGF-A induction. Blocking Mph recruitment by partial BM ablation greatly reduces β cell proliferation, indicating that infiltration of these Mphs is required for the β cell proliferative response in regenerating islets. Transcriptome analysis of whole islets and FACS-sorted EC and Mph populations shows that Mphs recruited to βVEGF-A islets express phenotypic M2 markers, and some M1 markers, along with high levels of MMPs associated with tissue restoration and remodeling, suggesting that these Mphs have a unique regenerative phenotype. These data also show that intra-islet ECs produce several modulators of cell proliferation and may secrete factors directing Mph phenotype activation. Based on these data, we propose a new paradigm for β cell regeneration in which β cell self-renewal is mediated by coordinated interactions between Mphs recruited to the site of β cell injury and intra-islet ECs.

Abstract Title 3: Differential expression of Phox2b marks distinct progenitor cell populations that differ in developmental potential and gene expression in the fetal mouse enteric nervous system.

Dennis P. Buehler1, Nripesh Prasad2,3, Stephanie L. Byers1, Jennifer Rosebrock1, Shawn E. Levy3, Travis A. Clark1, Michelle Southard-Smith1

1Department of Medicine, Vanderbilt University, Nashville, TN, USA; 2Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL, USA; 3Genomic Services Laboratory, HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.

Poster and oral presentation at the annual meeting of the Digestive Disease Week (DDW) conference, which will be held in Chicago, IL, May 3-6, 2014.

Background and aims: Intestinal motility requires a normal complement of cell types within ganglia of the enteric nervous system (ENS) to mediate appropriate propulsion of the bowel contents. While individual genes are known that cause aganglionosis of the distal bowel or mispatterning of enteric ganglia, the gene networks that regulate generation of cellular diversity in enteric ganglia have remained elusive, primarily because tools to capture distinct populations of differentiating enteric progenitors have not been available. The transcription factor Phox2b is expressed in enteric progenitors as they first enter the fetal intestine, exhibits heterogeneous expression among progenitors as they colonize the developing gut, and is maintained at high levels in mature enteric neurons and lower levels in adult enteric glia. We postulated that distinct populations of enteric progenitors could be captured on the basis of differential Phox2b expression levels during ENS development and sought to use this as a means to identify signaling pathways that regulate formation of discrete cell types. Methods: Phox2b-CFP transgenic lines that drive expression of the fluorescent reporter Cerulean from regulatory regions of Phox2b were used to examine gene expression in developing ENS progenitors from 11 days post coitus (dpc) to 14 dpc. Expression levels of Phox2b-CFP relative to lineage-specific markers were evaluated by whole mount immunohistochemistry. Fluorescence activated cell sorting was used to isolate Phox2b-CFP+ progenitors from fetal gut at 14 dpc that expressed either high or low levels of Phox2b. These populations were compared using low density clonal cultures to assess their capacity to form distinct lineages in vitro and by RNASeq to obtain whole transcriptome profiles. Results: At multiple stages of fetal gut development, ENS progenitors expressing high levels of Phox2b were consistently positive for Hu, a marker of differentiating neurons. At 14 dpc these Phox2b-high progenitors differed significantly from Phox2b-low progenitors in the types of colonies they generated in clonal cultures. Moreover, the transcriptional profiles from Phox2b-high progenitors indicated consistently higher expression of neuron-specific genes. Ingenuity pathway analysis of RNASeq data from these high and low populations confirmed expression of genes known to participate in ENS neurogenesis and identified novel signaling pathways. These novel pathways are high priority candidates for directed differentiation of progenitors to generate enteric neurons in compensation for enteric ganglia deficits.

Abstract Title 4: Systems Biology Assessment of Human Immune Responses after Seasonal Trivalent Inactivated Influenza Vaccine.

Kristen L Hoek1, Leigh M Howard2, Tara M Allos1, Parimal Samir3, Kirsten E Diggins1, Qi Liu4, Nripesh Prasad5,6, Megan Shuey1, Xinnan Niu1, C. Buddy Creech2, Shawn Levy6, Sebastian Joyce1, Kathryn M Edwards2, and Andrew J Link1

Departments of 1Pathology, Microbiology and Immunology, 2Pediatrics, 3Biochemistry, 4Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA; 5Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL, USA; 6HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA

Poster and oral presentation at the annual meeting of the American Association of Immunologists, Honolulu, Hawaii, May 3-7, 2013.

Systems biology represents a novel approach to comprehensively study the human immune response to vaccines at the global transcriptional and proteomic level. However, most systems vaccinology approaches utilize total PBMCs in their analyses. In this context, responses of under-represented immune cell types in the blood are potentially obscured by the predominant cells in the PBMC fraction, and the contribution of PMNs is completely ignored. To investigate the contribution of individual cell types in the immune response following vaccination, we developed a rapid and efficient method for purifying large numbers of T cells, B cells, monocytes, NK cells, myeloid DCs and neutrophils from fresh venous human blood for systems vaccinology studies. This optimized protocol was applied to adult volunteers vaccinated with 2011-12 seasonal TIV. 100mL blood was obtained prior to and on days 1, 3, and 7 post-vaccination. Whole blood, PBMC and PMN fractions were subjected to phenotypic analysis by flow cytometry. Immune cells were fractionated and processed for RNA and protein extraction in a single day. RNA-Seq and quantitative proteomics were performed on purified cells in order to determine individual expression profiles. Our results show significant variation in the phenotypes and expression profiles of immune cells at each time point. This innovative systems approach is currently being utilized to evaluate vaccine safety and efficacy in an adjuvanted influenza clinical trial.

Abstract Title 5: Systems Analysis of Inactivated Influenza Vaccine Responses in Distinct Immune Cell Types.

Leigh Howard, MD1, Andrew Link, PhD2, Kristen Hoek, PhD2, Tara Allos2, Parimal Samir2, Kirsten Diggins3, Qi Liu, PhD3, Nripesh Prasad4, Megan Shuey3, Xinnan Niu5, C. Buddy Creech, MD, MPH6, Shawn Levy, PhD7, Sebastian Joyce, PhD8 and Kathryn Edwards, MD, FIDSA9

(1)Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, (2)Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, TN, (3)Vanderbilt University, Nashville, TN, (4) Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL, USA; HudsonAlpha, Huntsville, AL, (5)Vanderbilt University Medical Center, Nashville, TN, (6)Pediatric Infectious Diseases and Vanderbilt Vaccine Research Program, Vanderbilt University School of Medicine, Nashville, TN, (7)HudsonAlpha, Huntsville, AL, (8)Vanderbilt University, Nashville, TN, (9)Div of ID, Vanderbilt University Medical Center, Nashville, TN

Oral presentation at the annual meeting of the Infectious Diseases Society of America (IDSA), San Francisco, CA, Oct 2-6, 2013.

Background: Systems biology represents a new approach to studying human immune responses to vaccines. We used this approach to determine if the biological pathways induced by trivalent influenza vaccination (TIV) in individual immune cell subsets are distinct from those in the pooled peripheral blood mononuclear cell (PBMC) population at early time points after vaccination. Methods: Two healthy adult volunteers received a single dose of 2011-2012 seasonal trivalent IIV. Blood was obtained prior to vaccination, and on Days 1, 3, and 7 after vaccination. Using cell isolation techniques, including magnetic-activated and fluorescence-activated cell sorting, purified populations of T cells (CD3+), B cells (CD19+), neutrophils (CD15+), monocytes (CD14+), natural killer (NK) cells (CD56+), and myeloid dendritic cells (mDCs; CD56+) were obtained from fresh blood samples at >98% purity. A fraction of the cells was homogenized in RNA stabilization buffer and stored at -80 °C. Total RNA was extracted from the PBMCs and immune cell subsets using an automated procedure. PolyA-enriched RNA-Seq libraries were prepared and 25 million, 50 bp, paired-end sequencing was performed. These data were processed and analyzed using open-source and commercial software packages. Hierarchical clustering analyses and Ingenuity Pathway Analysis (IPA) were performed. Results: Hierarchical clustering analysis of the transcriptomes revealed that purified individual immune cell subsets have distinct RNA expression profiles compared to the pooled PBMC population. Using shared transcripts that were up-regulated two-fold from Day 0 to Day 1, IPA identified distinct gene networks induced by vaccination in each immune cell subset.
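
The selection of "shared transcripts that were up-regulated two-fold from Day 0 to Day 1" can be sketched as a simple filter followed by a set intersection. The Python example below is a minimal illustration under stated assumptions: "shared" is taken to mean shared between the two volunteers, a pseudocount of 1 is added to avoid division by zero, and all transcript names and expression values are invented for the example rather than taken from the study.

```python
def upregulated(day0, day1, min_fold=2.0, pseudocount=1.0):
    """Return transcripts whose Day 1 / Day 0 ratio is at least min_fold.

    The pseudocount is an assumption of this sketch, not a documented
    step of the original analysis.
    """
    return {
        transcript
        for transcript in day0
        if transcript in day1
        and (day1[transcript] + pseudocount) / (day0[transcript] + pseudocount)
        >= min_fold
    }


# Hypothetical normalized expression values for two subjects (not study data).
subject_a = {
    "day0": {"T1": 10.0, "T2": 50.0, "T3": 5.0},
    "day1": {"T1": 30.0, "T2": 55.0, "T3": 20.0},
}
subject_b = {
    "day0": {"T1": 12.0, "T2": 40.0, "T3": 8.0},
    "day1": {"T1": 40.0, "T2": 90.0, "T3": 9.0},
}

# Keep only transcripts that pass the two-fold filter in both subjects.
shared = upregulated(**subject_a) & upregulated(**subject_b)
print(sorted(shared))
```

A list produced this way would then be the input to a pathway tool such as IPA; that step is not reproduced here.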

Abstract Title 6: Defining the Lactocrine-Sensitive Neonatal Porcine Uterine Transcriptome and Refining Tools for Evaluation of Developmentally Critical Endometrial Gene Expression Events In Situ.

Frank F. Bartol(a), Carol A. Bagnell(b), Anne A. Wiley(a), Dori J. Miller(a), Kimberly E. Roberts(a), Meghan L. Davolt(a), Meredith E. Camp(b), Kathleen M. Rahman(b), Shawn Levy(c), Nripesh Prasad(c,d)

(a)Department of Anatomy, Physiology and Pharmacology, Cellular and Molecular Biosciences Program, Auburn University, Auburn, AL; (b)Department of Animal Sciences, Endocrinology and Animal Biosciences Program, Rutgers University, New Brunswick, NJ; (c)HudsonAlpha Institute for Biotechnology, Huntsville, AL; (d)Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL

Presented at the National Institute for Food and Agriculture, USDA meeting, Montreal, Canada, July 21, 2013.

The lactocrine hypothesis for maternal programming of neonatal development was proposed to describe a mechanism through which milk-borne bioactive factors, delivered from mother to offspring as a consequence of nursing, could affect the development of somatic tissues, including the uterus. The goal of on-going research is to test the lactocrine hypothesis. Here, objectives were to: (1) employ RNAseq analysis to determine effects of age and nursing for 48 h from birth (postnatal day = PND 0) on the uterine transcriptome at PND 2; and (2) establish procedures for semi-automated assessment and quantification of neonatal porcine endometrial cell compartment-specific gene expression events in situ using multispectral imaging (MSI) and digital image processing (DIP). For objective 1, crossbred gilts (n = 4/group) were assigned randomly at birth to have uteri collected within 1 h of birth, or to be nursed ad libitum or fed commercial pig milk-replacer for 48 h, with uterine tissues obtained on PND 0 or on PND 2 from nursed and replacer-fed gilts. In the pig, endometrial adenogenesis is initiated between birth and PND 2. RNAseq analyses revealed significant (P < 0.05) numbers of differential gene expression events (DGE) associated with both age (PND 0 vs PND 2-nursed: 3283 DGE) and nursing [(PND 0 vs PND 2-replacer: 4662 DGE) (PND 2-nursed vs replacer-fed: 896 DGE)]. A protocol for automated identification of neonatal porcine endometrial cell compartments, including total epithelium, luminal epithelium (LE), glandular epithelium (GE) and stroma (St), was established using fluorescence immunohistochemistry, MSI and DIP. This MSI/DIP protocol enabled collection of target- and cell compartment-specific, quantifiable fluorescent signals. Procedures will permit in situ investigations of lactocrine effects regulating endometrial development and programming of endometrial structure and function. Such studies will be facilitated by development of hypotheses informed by in silico pathway and network analyses of normal (nursed) versus lactocrine-null (replacer-fed) neonatal uterine transcriptome data generated by RNAseq.

Abstract Title 7: Splicing anomalies as potential prognostic markers for severity of Myotonic Dystrophy Type 2.

Shawn E. Levy1, Nripesh Prasad1,3, Parimal Samir2, Andrew J. Link2, Tyson DeAngelis1, Cynthia L. Hendrickson1, Angela L. Jones1, Kruti Patel1, Braden E. Boone1, Melanie P. Robinson1, Shaterrika Pointer1, Jack Wimbish1.

1HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806; 2Dept of Pathology, Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, TN 37232; 3Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL, USA

Presented at the 14th annual Advances in Genome Biology and Technology (AGBT 2013) meeting, Marco Island, Florida, February 20-23, 2013.

Myotonic dystrophy type 2 (DM2) is a dominantly inherited multisystemic disorder caused by expansion of a tetranucleotide (CCTG)n microsatellite repeat in intron 1 of the Zinc Finger 9 (ZNF9) gene on chromosome 3q21.3. DM2 is characterized by progressive myopathy, myotonia, and multiorgan involvement. The microsatellite expansion leads to a hairpin-loop-like structural abnormality that causes nuclear sequestration of the RNA-binding proteins Mbnl1 and Mbnl2, which in turn causes missplicing of various downstream effector genes. No study has yet used a whole-transcriptome approach for DM2 disease profiling. We propose the integration of massive-scale, high-throughput RNA-sequencing in a clinically diagnosed set of 6 DM2 patients. This approach provided us with much-needed information about differential gene expression, along with identifying novel splicing differences, affected canonical pathways and enriched 's. Using deep RNA-sequencing methods, we are able to correlate expression and splicing data with gene set enrichment and pathway analysis to better understand the underlying molecular mechanism of DM2. Splicing aberration in this disease has been vastly under-reported. We successfully profiled the transcriptomic landscape of this disease in patients with varying degrees of severity and used that information to predict that splicing aberrations in these patients are exacerbated with increasing disease severity. Since the “anticipation” phenomenon is absent in DM2 cases, forecasting the onset and severity of this disease is a challenging task. We put forward that splicing anomalies, along with a panel of unique gene expression patterns, can act as potential prognostic markers for early identification of this disease as well as for predicting its severity.

Abstract Title 8: Acquired resistance to inhibitor, Alisertib, in melanoma is associated with inhibition of tumor immune surveillance.

Anna Vilgelm1,2, Nripesh Prasad3,4, Oriana Hawkins1,2, Shawn E. Levy2,3,4, and Ann Richmond1,2.

1Department of Veteran’s Affairs, Nashville, TN; 2Vanderbilt University School of Medicine, Department of Cancer Biology; 3Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, Alabama; 4HudsonAlpha Institute for Biotechnology, Huntsville, AL.

Presented at annual meeting of Society of Melanoma Congress, Nov 8-11, 2012, California.

We recently demonstrated that the Aurora A kinase (AurkA) inhibitor Alisertib (MLN8237) effectively blocked growth of xenotransplants of patients’ melanoma and mouse melanoma tumors. However, we also found that most tumors re-grew after the treatment was paused and some of them acquired resistance to a second round of therapy. To understand the molecular events facilitating Alisertib resistance, we performed whole transcriptome sequencing in 6 mouse melanoma tumors that were either sensitive to the treatment or had acquired resistance to it. As a result, we identified 544 genes whose expression correlated with the level of Alisertib resistance. GO annotation inquiry demonstrated a highly significant enrichment of the “immune system processes” term in this set (67 genes, P < 10^-18). KEGG pathway analysis revealed a strong association of Alisertib resistance with low expression of cytokines (including chemokines CXCL5, CXCL9, XCL1, CCL1, CCL22, TNF family cytokines TNFSF8, TNFSF11, TNFSF14, FASLG and others) and their receptors (P < 10^-30). In concert with these expression changes, we observed a marked reduction of immune infiltrate (both lymphoid and myeloid fractions) in tumors that acquired resistance. We also found that inhibition of immune surveillance was required for Alisertib resistance, because cells from sensitive and resistant tumors responded similarly to the treatment when they were grown in NSG mice, which are characterized by compromised immune response and cytokine signaling. Taken together, our results demonstrate that acquired resistance to the AurkA inhibitor is associated with evasion of tumor immune surveillance and provide important insight into the interplay of melanoma tumor cells and the host’s immune system.

This work was supported by grants from the NIH (CA116021, CA90625) and VA (1I01BX000196).

Abstract Title 9: Transcriptome profiling of Rat Embryonic Stem Cells (ESCs) and comparison with human and mouse ESCs to identify differences in gene expression and differential splicing.

Nathan T. Johnson1, Nripesh Prasad2,3, Shawn E. Levy3, Elizabeth C. Bryda1

1Department of Veterinary Pathobiology, College of Veterinary Medicine, University of Missouri, Columbia, MO; 2Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL; 3HudsonAlpha Institute for Biotechnology, Huntsville, AL.

Presented at the 28th Missouri Life Sciences Week, April 16-22, 2012, University of Missouri, Columbia, MO, USA.

Embryonic stem cells (ESCs) are a unique cell type defined by their ability to self-renew and to differentiate into all adult tissues, including the germ line. While ESCs from many species have been available for a long time, rat ESCs (rESCs) have only recently been isolated. The first Rattus norvegicus (rat) ESCs were isolated in 2008, and they promise to become an important tool for producing genetically engineered rat models for biomedical research. Despite their usefulness, little characterization of rESCs has been done and the transcriptome has not been defined. We used RNA-sequencing to determine the gene expression pattern, or transcriptome, of rESCs. The goals of our study were to characterize the rESC transcriptome, to determine which genes are expressed, and to compare the rESC transcriptome with Homo sapiens and Mus musculus ESC transcriptomes to gain insight into ESC expression patterns across species. To establish an rESC transcriptome, mRNA from rESC line DAc8, the first male germline-competent rat ESC line to be described and the first to be used to generate a knockout rat model, was characterized using RNA sequencing (RNA-seq) analysis. The gene expression profile of these rESCs was determined, novel isoforms were identified, and expression profiles for human, mouse, and rat ESCs were compared. The availability of the rat ESC transcriptome will allow future studies to better understand the molecular and cellular mechanisms that give these cells their unique pluripotent properties.

Abstract Title 10: RNA-Seq: A pioneering tool to uncover the intricate transcriptomics.

Nripesh Prasad*# and Shawn Levy#

*Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL; #HudsonAlpha Institute for Biotechnology, Huntsville, AL.

Presented at UAH Bioretreat, Feb 24-25, 2012, UAH, Huntsville, AL, USA.

Since 1958, when Francis Crick first postulated the “Central Dogma,” biologists have been fascinated by the transfer of genetic information from DNA to RNA and finally to protein. Over the years, many researchers, each taking small steps, have helped to investigate and clarify this flow of information in the cell. Throughout, RNA was considered a “bridge” in this transfer of biological information, and this is where the term “transcriptomics” arose: in short, it provides direct access to gene expression, regulation, and protein information in a cell or tissue at a given time. Biologists have long accepted that the ability to explore the whole-genome transcriptome profile at any given time would give them an opportunity to probe the underlying disease or condition. The advent of massively parallel sequencing methods (MPSM) in recent years has dramatically changed the landscape and revolutionized the way we approach whole-genome transcriptome profiling. Of the many tools available, “RNA-Seq,” also known as “Whole Transcriptome Shotgun Sequencing,” is an extraordinarily powerful tool for whole-genome transcriptome profiling. This technique lets us quantify the levels of genes and their transcripts more precisely than any other method available. Over the past couple of years, researchers have published information derived from this technique that is so deep and extensive that it is beginning to radically change the general view of the intricacy of the eukaryotic transcriptome.
Since no technique is immune from challenges and caveats, in this presentation I will discuss the issues and challenges associated with RNA-seq approaches and the advances made so far, presenting some exciting data from investigations of type 2 myotonic dystrophy and from the transcriptome profiling of a single Purkinje cell.

Abstract Title 11: Differential Expression of Phox2b marks distinct enteric progenitor cell populations and facilitates analysis of regulatory pathways in ENS ontogeny.

E.M. Southard-Smith*, B.P. Buehler*, S.B. Skelton*, J.C. Corpening*, N. Prasad#+, S. Levy+ and T.A. Clark@.

*Department of Medicine, Vanderbilt University, Nashville, TN USA; #Department of Biological Sciences, University of Alabama in Huntsville, Huntsville, AL USA; +Genomic Services Laboratory, HudsonAlpha Institute for Biotechnology, Huntsville, AL USA; @Genome Sciences Resource, Vanderbilt University, Nashville, TN USA.

Oral presentation at the 3rd International Symposium on Development of the Enteric Nervous System: Cell, Signals and Genes, held at The University of Hong Kong, Feb 25-18, 2012.

Normal gastrointestinal motility relies upon formation of a balanced complement of cell types from enteric neural crest-derived progenitors (ENPs) that populate the fetal intestine during development. Regulatory processes that control generation of cellular diversity within enteric ganglia have remained elusive in part because tools to capture populations of ENPs during lineage segregation have not been available. Phox2b is an essential transcription factor that is required for normal development of enteric ganglia. Heterogeneous expression of Phox2b is present in enteric progenitors from the time these cells first enter the foregut and is maintained into post-natal stages with higher levels in enteric neurons and lower levels in enteric glia. The Phox2b-H2BCerulean BAC transgene line (Phox2b-CFP) recapitulates this heterogeneous expression and facilitates capture of progenitors based on fluorescent CFP reporter expression. We postulated that differential Phox2b expression coincides with distinct lineage potential and defines different populations of cell types during development of enteric ganglia. To examine this hypothesis, we evaluated expression of multiple lineage markers in situ relative to Phox2b-CFP transgene expression. We found that enteric progenitors expressing high levels of Phox2b-CFP (bright) exhibited up-regulation of neuronal markers in situ concurrent with down-regulation of glial markers. Enteric progenitors expressing high levels of Phox2b-CFP maintained these high levels even over extended live cell imaging in catenary cultures in vitro and were observed to undergo cell division. Populations of bright Phox2b-CFP progenitors flow-sorted into low density cultures gave rise to discrete colonies that differed in composition by comparison to colonies derived from Phox2b-CFP dim progenitors, further suggesting that these progenitor pools differ in their developmental potential.
Whole transcriptome profiling of the bright and dim Phox2b-CFP enteric populations by next-generation sequencing identified increased expression of multiple genes associated with neuronal cell types in the bright pool. Comparative analysis of Phox2b-CFP bright and dim RNA-Seq profiles has identified multiple pathways that are differentially regulated as these two populations diverge. Pathways up-regulated in Phox2b-CFP bright may be targeted to stimulate production of enteric neurons to compensate for deficiencies of this lineage in gastrointestinal motility disorders.

Appendix C

Abstracts of peer-reviewed research articles

Article 1: The Pan-ErbB Negative Regulator Lrig1 Is an Intestinal Stem Cell Marker that Functions as a Tumor Suppressor.

Anne E. Powell, Yang Wang, Yina Li, Emily J. Poulin, Anna L. Means, Mary K. Washington, James N. Higginbotham, Alwin Juchheim, Nripesh Prasad, Shawn E. Levy, Yan Guo, Yu Shyr, Bruce J. Aronow, Kevin M. Haigis, Jeffrey L. Franklin, Robert J. Coffey

Cell, Volume 149, Issue 1, 146-158, 30 March 2012

Abstract

Lineage mapping has identified both proliferative and quiescent intestinal stem cells, but the molecular circuitry controlling stem cell quiescence is incompletely understood. By lineage mapping, we show Lrig1, a pan-ErbB inhibitor, marks predominately noncycling, long-lived stem cells that are located at the crypt base and that, upon injury, proliferate and divide to replenish damaged crypts. Transcriptome profiling of Lrig1+ colonic stem cells differs markedly from the profiling of highly proliferative, Lgr5+ colonic stem cells; genes upregulated in the Lrig1+ population include those involved in cell cycle repression and response to oxidative damage. Loss of Apc in Lrig1+ cells leads to intestinal adenomas, and genetic ablation of Lrig1 results in heightened ErbB1-3 expression and duodenal adenomas. These results shed light on the relationship between proliferative and quiescent intestinal stem cells and support a model in which intestinal stem cell quiescence is maintained by calibrated ErbB signaling with loss of a negative regulator predisposing to neoplasia.

Article 2: Islet Microenvironment, Modulated by Vascular Endothelial Growth Factor-A Signaling, Promotes β Cell Regeneration.

Brissova, M, K Aamodt, P Brahmachary, N Prasad, JY Hong, C Dai, M Mellati, A Shostak, G Poffenberger, R Aramandla, SE Levy, and AC Powers.

Cell Metabolism, Volume 19, Issue 3, 498-511, February 2014

Abstract

Pancreatic islet endocrine cell and endothelial cell (EC) interactions mediated by vascular endothelial growth factor-A (VEGF-A) signaling are important for islet differentiation and the formation of highly vascularized islets. To dissect how VEGF-A signaling modulates intra-islet vasculature, islet microenvironment, and β cell mass, we transiently increased VEGF-A production by β cells. VEGF-A induction dramatically increased the number of intra-islet ECs but led to β cell loss. After withdrawal of the VEGF-A stimulus, β cell mass, function, and islet structure normalized as a result of a robust, but transient, burst in proliferation of pre-existing β cells. Bone marrow-derived macrophages (MΦs) recruited to the site of β cell injury were crucial for the β cell proliferation, which was independent of pancreatic location and circulating factors such as glucose. Identification of the signals responsible for the proliferation of adult, terminally differentiated β cells will improve strategies aimed at β cell regeneration and expansion.

Article 3: Vascular endothelial growth factor coordinates islet innervation via vascular scaffolding.

Reinert, RB, Q Cai, JY Hong, JL Plank, K Aamodt, N Prasad, R Aramandla, C Dai, SE Levy, A Pozzi, PA Labosky, CVE Wright, M Brissova, and AC Powers.

Development, Volume 141, Issue 7, 1480-1491, February 2014

Abstract

Neurovascular alignment is a common anatomical feature of organs, but the mechanisms leading to this arrangement are incompletely understood. Here, we show that vascular endothelial growth factor (VEGF) signaling profoundly affects both vascularization and innervation of the pancreatic islet. In mature islets, nerves are closely associated with capillaries, but the islet vascularization process during embryonic organogenesis significantly precedes islet innervation. Although a simple neuronal meshwork interconnects the developing islet clusters as they begin to form at E14.5, the substantial ingrowth of nerve fibers into islets occurs postnatally, when islet vascularization is already complete. Using genetic mouse models, we demonstrate that VEGF regulates islet innervation indirectly through its effects on intra-islet endothelial cells. Our data indicate that formation of a VEGF-directed, intra-islet vascular plexus is required for development of islet innervation, and that VEGF-induced islet hypervascularization leads to increased nerve fiber ingrowth. Transcriptome analysis of hypervascularized islets revealed an increased expression of extracellular matrix components and axon guidance molecules, with these transcripts being enriched in the islet-derived endothelial cell population. We propose a mechanism for coordinated neurovascular development within pancreatic islets, in which endocrine cell-derived VEGF directs the patterning of intra-islet capillaries during embryogenesis, forming a scaffold for the postnatal ingrowth of essential autonomic nerve fibers.

Article 4: Characterization of the Merkel Cell Carcinoma miRNome.

Matthew S. Ning, Annette S. Kim, Nripesh Prasad, Shawn E. Levy, Huiqiu Zhang, and Thomas Andl.

Journal of Skin Cancer, Volume 2014 (2014), Article ID 289548, 9 pages.

Abstract

MicroRNAs have been implicated in various skin cancers, including melanoma, squamous cell carcinoma, and basal cell carcinoma; however, the expression of microRNAs and their role in Merkel cell carcinoma (MCC) have yet to be explored in depth. To identify microRNAs specific to MCC (MCC-miRs), next-generation sequencing (NGS) of small RNA libraries was performed on different tissue samples including MCCs, other cutaneous tumors, and normal skin. Comparison of the profiles identified several microRNAs upregulated and downregulated in MCC. For validation, their expression was measured via qRT-PCR in a larger group of MCC and in a comparison group of non-MCC cutaneous tumors and normal skin. Eight microRNAs were upregulated in MCC: miR-502-3p, miR-9, miR-7, miR-340, miR-182, miR-190b, miR-873, and miR-183. Three microRNAs were downregulated: miR-3170, miR-125b, and miR-374c. Many of these MCC-miRs, the miR-183/182/96a cistron in particular, have connections to tumorigenic pathways implicated in MCC pathogenesis. In situ hybridization confirmed that the highly expressed MCC-miR, miR-182, is localized within tumor cells. Furthermore, NGS and qRT-PCR reveal that several of these MCC-miRs are highly expressed in the patient-derived MCC cell line, MS-1. These data indicate that we have identified a set of MCC-miRs with important implications for MCC research.

Appendix D

Ingenuity pathways figure legends

REFERENCES

Albrecht A, Mundlos S. 2005. The other trinucleotide repeat: polyalanine expansion disorders. Curr. Opin. Genet. Dev. 15:285–93.

Amano M, Fukata Y, Kaibuchi K. 2000. Regulation and functions of Rho-associated kinase. Exp. Cell Res. 261:44–51.

Ambros V. 2001. microRNAs: tiny regulators with great potential. Cell 107:823–6.

Ambs S, Prueitt RL, Yi M, Hudson RS, Howe TM, Petrocca F, Wallace TA, Liu C-G, Volinia S, Calin GA, et al. 2008. Genomic profiling of microRNA and messenger RNA reveals deregulated microRNA expression in prostate cancer. Cancer Res. 68:6162–70.

Arsenault M-E, Prévost C, Lescault A, Laberge C, Puymirat J, Mathieu J. 2006. Clinical characteristics of myotonic dystrophy type 1 patients with small CTG expansions. Neurology 66:1248–50.

Artero R, Prokop A, Paricio N, Begemann G, Pueyo I, Mlodzik M, Perez-Alonso M, Baylies MK. 1998. The muscleblind gene participates in the organization of Z-bands and epidermal attachments of Drosophila muscles and is regulated by Dmef2. Dev. Biol. 195:131–43.

Ashizawa T, Baiget M. 2000. New nomenclature and DNA testing guidelines for myotonic dystrophy type 1 (DM1). The International Myotonic Dystrophy Consortium (IDMC). Neurology 54:1218–21.

Ashley CT, Warren ST. 1995. Trinucleotide repeat expansion and human disease. Annu. Rev. Genet. 29:703–28.

Asrih M, Steffens S. 2013. Emerging role of epigenetics and miRNA in diabetic cardiomyopathy. Cardiovasc. Pathol. 22:117–25.

Avogaro A, Vigili de Kreutzenberg S, Negut C, Tiengo A, Scognamiglio R. 2004. Diabetic cardiomyopathy: a metabolic perspective. Am. J. Cardiol. 93:13A–16A.

Baghdiguian S, Martin M, Richard I, Pons F, Astier C, Bourg N, Hay RT, Chemaly R, Halaby G, Loiselet J, et al. 1999. Calpain 3 deficiency is associated with myonuclear apoptosis and profound perturbation of the IkappaB alpha/NF-kappaB pathway in limb-girdle muscular dystrophy type 2A. Nat. Med. 5:503–11.

Balasubramanyam A, Iyer D, Stringer JL, Beaulieu C, Potvin A, Neumeyer AM, Avruch J, Epstein HF. 1998. Developmental changes in expression of myotonic dystrophy protein kinase in the rat central nervous system. J. Comp. Neurol. 394:309–25.

Batten F, Gibb H. 1909. Myotonia atrophica. Brain 32:187–205.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B … 57:289–300.

Bhagavati S, Bhagwati S, Ghatpande A, Leung B. 1996. Normal levels of DM RNA and myotonin protein kinase in skeletal muscle from adult myotonic dystrophy (DM) patients. Biochim. Biophys. Acta 1317:155–7.

Bhargava A, Fuentes FF. 2010. Mutational dynamics of microsatellites. Mol. Biotechnol. 44:250–66.

Blais A, Tsikitis M, Acosta-Alvear D, Sharan R, Kluger Y, Dynlacht BD. 2005. An initial blueprint for myogenic differentiation. Genes Dev. 19:553–69.

Bodine SC, Stitt TN, Gonzalez M, Kline WO, Stover GL, Bauerlein R, Zlotchenko E, Scrimgeour A, Lawrence JC, Glass DJ, et al. 2001. Akt/mTOR pathway is a crucial regulator of skeletal muscle hypertrophy and can prevent muscle atrophy in vivo. Nat. Cell Biol. 3:1014–9.

Bohbot JD, Durand NF, Vinyard BT, Dickens JC. 2013. Functional Development of the Octenol Response in Aedes aegypti. Front. Physiol. 4:39.

Botta A, Caldarola S, Vallo L, Bonifazi E, Fruci D, Gullotta F, Massa R, Novelli G, Loreni F. 2006. Effect of the [CCTG]n repeat expansion on ZNF9 expression in myotonic dystrophy type II (DM2). Biochim. Biophys. Acta 1762:329–34.

Botta A, Vallo L, Rinaldi F, Bonifazi E, Amati F, Biancolella M, Gambardella S, Mancinelli E, Angelini C, Meola G, et al. 2007. Gene expression analysis in myotonic dystrophy: indications for a common molecular pathogenic pathway in DM1 and DM2. Gene Expr. 13:339–51.

Braun T, Gautel M. 2011. Transcriptional mechanisms regulating skeletal muscle differentiation, growth and homeostasis. Nat. Rev. Mol. Cell Biol. 12:349–61.

Brook JD, McCurrach ME, Harley HG, Buckler AJ, Church D, Aburatani H, Hunter K, Stanton VP, Thirion JP, Hudson T. 1992. Molecular basis of myotonic dystrophy: expansion of a trinucleotide (CTG) repeat at the 3’ end of a transcript encoding a protein kinase family member. Cell 68:799–808.

Brouwer JR, Willemsen R, Oostra BA. 2009. Microsatellite repeat instability and neurological disease. Bioessays 31:71–83.

Buckingham S. 2003. THE MAJOR WORLD OF microRNAs. :1–3.

Calcaterra NB, Palatnik JF, Bustos DM, Arranz SE, Cabada MO. 1999. Identification of mRNA-binding proteins during development: characterization of Bufo arenarum cellular nucleic acid binding protein. Dev. Growth Differ. 41:183–91.

Carango P, Noble JE, Marks HG, Funanage VL. 1993. Absence of myotonic dystrophy protein kinase (DMPK) mRNA as a result of a triplet repeat expansion in myotonic dystrophy. Genomics 18:340–8.

Cardani R, Mancinelli E, Rotondo G, Sansone V, Meola G. 2006. Muscleblind-like protein 1 nuclear sequestration is a molecular pathology marker of DM1 and DM2. Eur. J. Histochem. 50:177–82.

Case A, Atrophica M. 1924. A Case of Myotonia Atrophica. Proc. R. Soc. Med. 17:28.

Chaudhuri J, Si K, Maitra U. 1997. Function of eukaryotic translation initiation factor 1A (eIF1A) (formerly called eIF-4C) in initiation of protein synthesis. J. Biol. Chem. 272:7883–91.

Chen AE, Ginty DD, Fan C-M. 2005. Protein kinase A signalling via CREB controls myogenesis induced by Wnt proteins. Nature 433:317–22.

Chen W, Liang Y, Deng W, Shimizu K, Ashique AM, Li E, Li Y-P. 2003. The zinc-finger protein CNBP is required for forebrain formation in the mouse. Development 130:1367–79.

Chen W, Wang Y, Abe Y, Cheney L, Udd B, Li Y-P. 2007. Haploinsufficiency for Znf9 in Znf9+/- mice is associated with multiorgan abnormalities resembling myotonic dystrophy. J. Mol. Biol. 368:8–17.

Chen W. 2003. The zinc-finger protein CNBP is required for forebrain formation in the mouse. Development 130:1367–1379.

Cho DH, Tapscott SJ. 2007. Myotonic dystrophy: emerging mechanisms for DM1 and DM2. Biochim. Biophys. Acta 1772:195–204.

Chockalingam PS, Cholera R, Oak SA, Zheng Y, Jarrett HW, Thomason DB. 2002. Dystrophin-glycoprotein complex and Ras and Rho GTPase signaling are altered in muscle atrophy. Am. J. Physiol. Cell Physiol. 283:C500–11.

Croft D, Mundo AF, Haw R, Milacic M, Weiser J, Wu G, Caudy M, Garapati P, Gillespie M, Kamdar MR, et al. 2013. The Reactome pathway knowledgebase. Nucleic Acids Res. :1–6.

D’Eustachio P. 2013. Pathway databases: making chemical and biological sense of the genomic data flood. Chem. Biol. 20:629–35.

Das UN, Rao AA. 2007. Gene expression profile in obesity and type 2 diabetes mellitus. Lipids Health Dis. 6:35.

Davis BM, McCurrach ME, Taneja KL, Singer RH, Housman DE. 1997. Expansion of a CUG trinucleotide repeat in the 3’ untranslated region of myotonic dystrophy protein kinase transcripts results in nuclear retention of transcripts. Proc. Natl. Acad. Sci. U. S. A. 94:7388–93.

Day JW, Ranum LPW. 2005. RNA pathogenesis of the myotonic dystrophies. Neuromuscul. Disord. 15:5–16.

Day JW, Roelofs R, Leroy B, Pech I, Benzow K, Ranum LP. 1999. Clinical and genetic characteristics of a five-generation family with a novel form of myotonic dystrophy (DM2). Neuromuscul. Disord. 9:19–27.

Deng N, Puetter A, Zhang K, Johnson K, Zhao Z, Taylor C, Flemington EK, Zhu D. 2011. Isoform-level microRNA-155 target prediction using RNA-seq. Nucleic Acids Res. 39:e61.

Dere R, Napierala M, Ranum LPW, Wells RD. 2004. Hairpin structure-forming propensity of the (CCTG.CAGG) tetranucleotide repeats contributes to the genetic instability associated with myotonic dystrophy type 2. J. Biol. Chem. 279:41715–26.

Dickson AM, Wilusz CJ. 2010. Repeat expansion diseases: when a good RNA turns bad. Wiley Interdiscip. Rev. RNA 1:173–92.

Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J, et al. 2012. A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief. Bioinform. .

Dimmeler S, Nicotera P. 2013. MicroRNAs in age-related diseases. EMBO Mol. Med. 5:180–90.

Dogini DB, Avansini SH, Vieira AS, Lopes-Cendes I. 2013. MicroRNA regulation and dysregulation in epilepsy. Front. Cell. Neurosci. 7:172.

Duncan JG. 2011. Peroxisome proliferator activated receptor-alpha (PPARα) and PPAR gamma coactivator-1alpha (PGC-1α) regulation of cardiac metabolism in diabetes. Pediatr. Cardiol. 32:323–8.

Echeverria G V, Cooper TA. 2012. RNA-binding proteins in microsatellite expansion disorders: mediators of RNA toxicity. Brain Res. 1462:100–11.

Edgren H, Murumagi A, Kangaspeska S, Nicorici D, Hongisto V, Kleivi K, Rye IH, Nyberg S, Wolf M, Borresen-Dale A-L, et al. 2011a. Identification of fusion genes in breast cancer by paired-end RNA-sequencing. Genome Biol. 12:R6.

Edgren H, Murumagi A, Kangaspeska S, Nicorici D, Hongisto V, Kleivi K, Rye IH, Nyberg S, Wolf M, Borresen-Dale A-L, et al. 2011b. Identification of fusion genes in breast cancer by paired-end RNA-sequencing. Genome Biol. 12:R6.

Eisenberg I, Eran A, Nishino I, Moggio M, Lamperti C, Amato AA, Lidov HG, Kang PB, North KN, Mitrani-Rosenbaum S, et al. 2007. Distinctive patterns of microRNA expression in primary muscular disorders. Proc. Natl. Acad. Sci. U. S. A. 104:17016–21.

Eisenman RN. 2001. Deconstructing myc. Genes Dev. 15:2023–30.

Elson DA, Ryan HE, Snow JW, Johnson R, Arbeit JM. 2000. Coordinate up-regulation of hypoxia inducible factor (HIF)-1alpha and HIF-1 target genes during multi-stage epidermal and wound healing. Cancer Res. 60:6189–95.

Eng JK, McCormack AL, Yates JR. 1994. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J. Am. Soc. Mass Spectrom. 5:976–989.

Eriksson M, Hedberg B, Carey N, Ansved T. 2001. Decreased DMPK transcript levels in myotonic dystrophy 1 type IIA muscle fibers. Biochem. Biophys. Res. Commun. 286:1177–82.

Fardaei M, Rogers MT, Thorpe HM, Larkin K, Hamshere MG, Harper PS, Brook JD. 2002. Three proteins, MBNL, MBLL and MBXL, co-localize in vivo with nuclear foci of expanded-repeat transcripts in DM1 and DM2 cells. Hum. Mol. Genet. 11:805–14.

Feng Y, Zhang F, Lokey LK, Chastain JL, Lakkis L, Eberhart D, Warren ST. 1995. Translational suppression by trinucleotide repeat expansion at FMR1. Science 268:731–4.

Fernandes-Silva MM, Carvalho VO, Guimarães GV, Bacal F, Bocchi EA. 2012. Physical exercise and microRNAs: new frontiers in heart failure. Arq. Bras. Cardiol. 98:459–66.

Fernandez-Costa JM, Garcia-Lopez A, Zuñiga S, Fernandez-Pedrosa V, Felipo-Benavent A, Mata M, Jaka O, Aiastui A, Hernandez-Torres F, Aguado B, et al. 2013. Expanded CTG repeats trigger miRNA alterations in Drosophila that are conserved in myotonic dystrophy type 1 patients. Hum. Mol. Genet. 22:704–16.

Finsterer J. 2002. Myotonic dystrophy type 2. Eur. J. Neurol. 9:441–7.

Flink IL, Morkin E. 1995a. Organization of the gene encoding cellular nucleic acid-binding protein. Gene 163:279–82.

Flink IL, Morkin E. 1995b. Alternatively processed isoforms of cellular nucleic acid-binding protein interact with a suppressor region of the human beta-myosin heavy chain gene. J. Biol. Chem. 270:6959–65.

Frisch R, Singleton KR, Moses PA, Gonzalez IL, Carango P, Marks HG, Funanage VL. 2001. Effect of triplet repeat expansion on chromatin structure and expression of DMPK and neighboring genes, SIX5 and DMWD, in myotonic dystrophy. Mol. Genet. Metab. 74:281–91.

Fu YH, Friedman DL, Richards S, Pearlman JA, Gibbs RA, Pizzuti A, Ashizawa T, Perryman MB, Scarlato G, Fenwick RG. 1993. Decreased expression of myotonin-protein kinase messenger RNA and protein in adult form of myotonic dystrophy. Science 260:235–8.

Fu YH, Pizzuti A, Fenwick RG, King J, Rajnarayan S, Dunne PW, Dubel J, Nasser GA, Ashizawa T, de Jong P. 1992. An unstable triplet repeat in a gene related to myotonic muscular dystrophy. Science 255:1256–8.

Furling D, Lam LT, Agbulut O, Butler-Browne GS, Morris GE. 2003. Changes in myotonic dystrophy protein kinase levels and muscle development in congenital myotonic dystrophy. Am. J. Pathol. 162:1001–9.

Gacy AM, Goellner G, Juranić N, Macura S, McMurray CT. 1995. Trinucleotide repeats that expand in human disease form hairpin structures in vitro. Cell 81:533–40.

Gambardella S, Rinaldi F, Lepore SM, Viola A, Loro E, Angelini C, Vergani L, Novelli G, Botta A. 2010. Overexpression of microRNA-206 in the skeletal muscle from myotonic dystrophy type 1 patients. J. Transl. Med. 8:48.

Gingras AC, Raught B, Sonenberg N. 2001. Regulation of translation initiation by FRAP/mTOR. Genes Dev. 15:807–26.

Giricz O, Reynolds PA, Ramnauth A, Liu C, Wang T, Stead L, Childs G, Rohan T, Shapiro N, Fineberg S, et al. 2012. Hsa-miR-375 is differentially expressed during breast lobular neoplasia and promotes loss of mammary acinar polarity. J. Pathol. 226:108–19.

Greco S, Perfetti A, Fasanaro P, Cardani R, Capogrossi MC, Meola G, Martelli F. 2012. Deregulated MicroRNAs in Myotonic Dystrophy Type 2. PLoS One 7:e39732.

Griffiths-Jones S, Grocock RJ, van Dongen S, Bateman A, Enright AJ. 2006. miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res. 34:D140–4.

Griffiths-Jones S, Saini HK, van Dongen S, Enright AJ. 2008. miRBase: tools for microRNA genomics. Nucleic Acids Res. 36:D154–8.

Guo H, Ingolia NT, Weissman JS, Bartel DP. 2010. Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature 466:835–40.

Hashimshony T, Wagner F, Sher N, Yanai I. 2012. CEL-Seq: Single-Cell RNA-Seq by Multiplexed Linear Amplification. Cell Rep. 2:666–73.

Hauser S, Wulfken LM, Holdenrieder S, Moritz R, Ohlmann C-H, Jung V, Becker F, Herrmann E, Walgenbach-Brünagel G, von Ruecker A, et al. 2012. Analysis of serum microRNAs (miR-26a-2*, miR-191, miR-337-3p and miR-378) as potential biomarkers in renal cell carcinoma. Cancer Epidemiol. 36:391–4.

Van Heumen WR, Claxton C, Pickles JO. 1997. Sequence and tissue distribution of chicken cellular nucleic acid binding protein cDNA. Comp. Biochem. Physiol. B. Biochem. Mol. Biol. 118:659–65.

Hofmann-Radvanyi H, Lavedan C, Rabès JP, Savoy D, Duros C, Johnson K, Junien C. 1993. Myotonic dystrophy: absence of CTG enlarged transcript in congenital forms, and low expression of the normal allele. Hum. Mol. Genet. 2:1263–6.

Holt I, Jacquemin V, Fardaei M, Sewry CA, Butler-Browne GS, Furling D, Brook JD, Morris GE. 2009. Muscleblind-like proteins: similarities and differences in normal and myotonic dystrophy muscle. Am. J. Pathol. 174:216–27.

De Hoon MJL, Imoto S, Nolan J, Miyano S. 2004. Open source clustering software. Bioinformatics 20:1453–4.

III RM. 1996. Proximal myotonic myopathy: mini-review of a recently delineated clinical disorder. Neuromuscul. Disord. 6:87–93.

Jansen G, Groenen PJ, Bächner D, Jap PH, Coerwinkel M, Oerlemans F, van den Broek W, Gohlsch B, Pette D, Plomp JJ, et al. 1996. Abnormal myotonic dystrophy protein kinase levels produce only mild myopathy in mice. Nat. Genet. 13:316–24.

Jansen G, Mahadevan M, Amemiya C, Wormskamp N, Segers B, Hendriks W, O’Hoy K, Baird S, Sabourin L, Lennon G. 1992. Characterization of the myotonic dystrophy region predicts multiple protein isoform-encoding mRNAs. Nat. Genet. 1:261–6.

Jasinska A, Michlewski G, de Mezer M, Sobczak K, Kozlowski P, Napierala M, Krzyzosiak WJ. 2003. Structures of trinucleotide repeats in human transcripts and their functional implications. Nucleic Acids Res. 31:5463–8.

Jeffreys A, Wilson V, Thein S. 1985. Hypervariable 'minisatellite' regions in human DNA. Nature 314:67–73.

John K, Wu J, Lee B-W, Farah CS. 2013. MicroRNAs in Head and Neck Cancer. Int. J. Dent. 2013:650218.

Kamsteeg E-J, Kress W, Catalli C, Hertz JM, Witsch-Baumgartner M, Buckley MF, van Engelen BGM, Schwartz M, Scheffer H. 2012. Best practice guidelines and recommendations on the molecular diagnosis of myotonic dystrophy types 1 and 2. Eur. J. Hum. Genet. 20:1203–8.

Kanadia RN, Johnstone KA, Mankodi A, Lungu C, Thornton CA, Esson D, Timmers AM, Hauswirth WW, Swanson MS. 2003. A muscleblind knockout model for myotonic dystrophy. Science 302:1978–80.

Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M. 2004. The KEGG resource for deciphering the genome. Nucleic Acids Res. 32:D277–80.

Kashi Y, King D, Soller M. 1997. Simple sequence repeats as a source of quantitative genetic variation. Trends Genet. 13:74–8.

Kashi Y, King DG. 2006. Simple sequence repeats as advantageous mutators in evolution. Trends Genet. 22:253–9.

Ke Q, Costa M. 2006. Hypoxia-inducible factor-1 (HIF-1). Mol. Pharmacol. 70:1469–80.

Kim WT, Kim W. 2013. MicroRNAs in prostate cancer. 1:3–9.

Kimura T, Nakamori M, Lueck JD, Pouliquin P, Aoike F, Fujimura H, Dirksen RT, Takahashi MP, Dulhunty AF, Sakoda S. 2005. Altered mRNA splicing of the skeletal muscle ryanodine receptor and sarcoplasmic/endoplasmic reticulum Ca2+-ATPase in myotonic dystrophy type 1. Hum. Mol. Genet. 14:2189–200.

Kino Y, Mori D, Oma Y, Takeshita Y, Sasagawa N, Ishiura S. 2004. Muscleblind protein, MBNL1/EXP, binds specifically to CHHG repeats. Hum. Mol. Genet. 13:495–507.

Knight SJ, Flannery AV, Hirst MC, Campbell L, Christodoulou Z, Phelps SR, Pointon J, Middleton-Price HR, Barnicoat A, Pembrey ME. 1993. Trinucleotide repeat amplification and hypermethylation of a CpG island in FRAXE mental retardation. Cell 74:127–34.

Kornblum C, Lutterbey G, Bogdanow M, Kesper K, Schild H, Schröder R, Wattjes MP. 2006. Distinct neuromuscular phenotypes in myotonic dystrophy types 1 and 2: a whole body highfield MRI study. J. Neurol. 253:753–61.

Kothekar V. 1990. Computer simulation of zinc finger motifs from cellular nucleic acid binding protein and their interaction with consensus DNA sequences. FEBS Lett. 274:217–22.

Kozomara A, Griffiths-Jones S. 2011. miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res. 39:D152–7.

Krahe R, Ashizawa T, Abbruzzese C, Roeder E, Carango P, Giacanelli M, Funanage VL, Siciliano MJ. 1995. Effect of myotonic dystrophy trinucleotide repeat expansion on DMPK transcription and processing. Genomics 28:1–14.

Kress W, Mueller-Myhsok B, Ricker K, Schneider C, Koch MC, Toyka K V, Mueller CR, Grimm T. 2000. Proof of genetic heterogeneity in the proximal myotonic myopathy syndrome (PROMM) and its relationship to myotonic dystrophy type 2 (DM2). Neuromuscul. Disord. 10:478–80.

Kumarswamy R, Thum T. 2013. Non-coding RNAs in cardiac remodeling and heart failure. Circ. Res. 113:676–89.

De la Serna IL, Ohkawa Y, Imbalzano AN. 2006. Chromatin remodelling in mammalian differentiation: lessons from ATP-dependent remodellers. Nat. Rev. Genet. 7:461–73.

Ladd AN, Charlet N, Cooper TA. 2001. The CELF family of RNA binding proteins is implicated in cell-specific and developmentally regulated alternative splicing. Mol. Cell. Biol. 21:1285–96.

Lalonde E, Ha KCH, Wang Z, Bemmo A, Kleinman CL, Kwan T, Pastinen T, Majewski J. 2011. RNA sequencing reveals the role of splicing polymorphisms in regulating human gene expression. Genome Res. 21:545–54.

Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10:R25.

Larkin K, Fardaei M. 2001. Myotonic dystrophy--a multigene disorder. Brain Res. Bull. 56:389–95.

Lee HC, Patel MK, Mistry DJ, Wang Q, Reddy S, Moorman JR, Mounsey JP. 2003. Abnormal Na channel gating in murine cardiac myocytes deficient in myotonic dystrophy protein kinase. Physiol. Genomics 12:147–57.

Lee K-S, Squillace RM, Wang EH. 2007. Expression pattern of muscleblind-like proteins differs in differentiating myoblasts. Biochem. Biophys. Res. Commun. 361:151–5.

Lee SH, Wolf PL, Escudero R, Deutsch R, Jamieson SW, Thistlethwaite PA. 2000. Early expression of angiogenesis factors in acute myocardial ischemia and infarction. N. Engl. J. Med. 342:626–33.

Leung T, Chen XQ, Tan I, Manser E, Lim L. 1998. Myotonic dystrophy kinase-related Cdc42-binding kinase acts as a Cdc42 effector in promoting cytoskeletal reorganization. Mol. Cell. Biol. 18:130–40.

Levens D. 2002. Disentangling the MYC web. Proc. Natl. Acad. Sci. U. S. A. 99:5757–9.

Levens DL. 2003. Reconstructing MYC. Genes Dev. 17:1071–7.

Lewis BP, Burge CB, Bartel DP. 2005. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120:15–20.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–9.

Liang Y, Ridzon D, Wong L, Chen C. 2007. Characterization of microRNA expression profiles in normal human tissues. BMC Genomics 8:166.

Lin CY, Lovén J, Rahl PB, Paranal RM, Burge CB, Bradner JE, Lee TI, Young RA. 2012. Transcriptional amplification in tumor cells with elevated c-Myc. Cell 151:56–67.

Liquori C, Ricker K, Moseley M. 2001. Myotonic dystrophy type 2 caused by a CCTG expansion in intron 1 of ZNF9. Science 293:864–7.

Longman C. 2009. Myotonic dystrophy. J. R. Coll. Physicians Edinburgh 36:51–55.

Lukáš Z, Falk M, Feit J, Souček O, Falková I, Stefančíková L, Janoušová E, Fajkusová L, Zaorálková J, Hrabálková R. 2012. Sequestration of MBNL1 in tissues of patients with myotonic dystrophy type 2. Neuromuscul. Disord. 22:604–16.

Lusis AJ, Rajavashisth TB, Klisak I, Heinzmann C, Mohandas T, Sparkes RS. 1990. Mapping of the gene for CNBP, a finger protein, to human chromosome 3q13.3-q24. Genomics 8:411–4.

Machuca-Tzili L, Thorpe H, Robinson TE, Sewry C, Brook JD. 2006. Flies deficient in Muscleblind protein model features of myotonic dystrophy with altered splice forms of Z-band associated transcripts. Hum. Genet. 120:487–99.

Machuca-Tzili LE, Buxton S, Thorpe A, Timson CM, Wigmore P, Luther PK, Brook JD. 2011. Zebrafish deficient for Muscleblind-like 2 exhibit features of myotonic dystrophy. Dis. Model. Mech. 4:381–92.

Maeda M, Taft CS, Bush EW, Holder E, Bailey WM, Neville H, Perryman MB, Bies RD. 1995. Identification, tissue-specific expression, and subcellular localization of the 80- and 71-kDa forms of myotonic dystrophy kinase protein. J. Biol. Chem. 270:20246–9.

Mahadevan M, Tsilfidis C, Sabourin L, Shutler G, Amemiya C, Jansen G, Neville C, Narang M, Barceló J, O’Hoy K. 1992. Myotonic dystrophy mutation: an unstable CTG repeat in the 3’ untranslated region of the gene. Science 255:1253–5.

Mahadevan MS. 2012. Myotonic dystrophy: is a narrow focus obscuring the rest of the field? Curr. Opin. Neurol. 25:609–13.

Mankodi A, Logigian E, Callahan L, McClain C, White R, Henderson D, Krym M, Thornton CA. 2000. Myotonic dystrophy in transgenic mice expressing an expanded CUG repeat. Science 289:1769–73.

Mankodi A, Urbinati CR, Yuan QP, Moxley RT, Sansone V, Krym M, Henderson D, Schalling M, Swanson MS, Thornton CA. 2001. Muscleblind localizes to nuclear foci of aberrant RNA in myotonic dystrophy types 1 and 2. Hum. Mol. Genet. 10:2165–70.

Manning G, Plowman GD, Hunter T, Sudarsanam S. 2002. Evolution of protein kinase signaling from yeast to man. Trends Biochem. Sci. 27:514–20.

Margolis JM, Schoser BG, Moseley ML, Day JW, Ranum LPW. 2006. DM2 intronic expansions: evidence for CCUG accumulation without flanking sequence or effects on ZNF9 mRNA processing or protein expression. Hum. Mol. Genet. 15:1808–15.

Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y. 2008. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 18:1509–17.

Martorell L, Martinez JM, Carey N, Johnson K, Baiget M. 1995. Comparison of CTG repeat length expansion and clinical progression of myotonic dystrophy over a five year period. J. Med. Genet. 32:593–6.

Massa R, Panico MB, Caldarola S, Fusco FR, Sabatelli P, Terracciano C, Botta A, Novelli G, Bernardi G, Loreni F. 2010. The myotonic dystrophy type 2 (DM2) gene product zinc finger protein 9 (ZNF9) is associated with sarcomeres and normally localized in DM2 patients’ muscles. Neuropathol. Appl. Neurobiol. 36:275–84.

Mastaglia FL, Harker N, Phillips BA, Day TJ, Hankey GJ, Laing NG, Fabian V, Kakulas BA. 1998. Dominantly inherited proximal myotonic myopathy and leukoencephalopathy in a family with an incidental CLCN1 mutation. J. Neurol. Neurosurg. Psychiatry 64:543–7.

McCarthy JJ. 2008. MicroRNA-206: the skeletal muscle-specific myomiR. Biochim. Biophys. Acta 1779:682–91.

Meola G, Moxley RT. 2004. Myotonic dystrophy type 2 and related myotonic disorders. J. Neurol. 251:1173–82.

Meola G, Sansone V, Marinou K, Cotelli M, Moxley RT, Thornton CA, De Ambroggi L. 2002. Proximal myotonic myopathy: a syndrome with a favourable prognosis? J. Neurol. Sci. 193:89–96.

Mi H, Muruganujan A, Thomas PD. 2013. PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucleic Acids Res. 41:D377–86.

Michelotti EF, Tomonaga T, Krutzsch H, Levens D. 1995. Cellular nucleic acid binding protein regulates the CT element of the human c-myc protooncogene. J. Biol. Chem. 270:9494–9.

Miller JW, Urbinati CR, Teng-Umnuay P, Stenberg MG, Byrne BJ, Thornton CA, Swanson MS. 2000. Recruitment of human muscleblind proteins to (CUG)(n) expansions associated with myotonic dystrophy. EMBO J. 19:4439–48.

Mitas M. 1997. Trinucleotide repeats associated with human disease. Nucleic Acids Res. 25:2245–54.

Mizuno H, Nakamura A, Aoki Y, Ito N, Kishi S, Yamamoto K, Sekiguchi M, Takeda S, Hashido K. 2011. Identification of muscle-specific microRNAs in serum of muscular dystrophy animal models: promising novel blood-based markers for muscular dystrophy. PLoS One 6:e18388.

Monici MC, Aguennouz M, Mazzeo A, Messina C, Vita G. 2003. Activation of nuclear factor-kappaB in inflammatory myopathies and Duchenne muscular dystrophy. Neurology 60:993–7.

Morrone A, Pegoraro E, Angelini C, Zammarchi E, Marconi G, Hoffman EP. 1997. RNA metabolism in myotonic dystrophy: patient muscle shows decreased insulin receptor RNA and protein consistent with abnormal insulin resistance. J. Clin. Invest. 99:1691–8.

Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. 2008. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5:621–8.

Mostafavi S, Ray D, Warde-Farley D, Grouios C, Morris Q. 2008. GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function. Genome Biol. 9 Suppl 1:S4.

Mounsey JP, Mistry DJ, Ai CW, Reddy S, Moorman JR. 2000. Skeletal muscle sodium channel gating in mice deficient in myotonic dystrophy protein kinase. Hum. Mol. Genet. 9:2313–20.

Mourkioti F, Rosenthal N. 2008. NF-kappaB signaling in skeletal muscle: prospects for intervention in muscle diseases. J. Mol. Med. (Berl). 86:747–59.

Moxley RT, Udd B, Ricker K. 1998. Proximal myotonic myopathy (PROMM) and other proximal myotonic syndromes. Neuromuscul. Disord. 8:519–20.

Murphy D. 2002. Gene expression studies using microarrays: principles, problems, and prospects. Adv. Physiol. Educ. 26:256–70.

Mussini I, Biral D, Marin O, Furlan S, Salvatori S. 1999. Myotonic Dystrophy Protein Kinase Expressed in Rat Cardiac Muscle Is Associated with Sarcoplasmic Reticulum and Gap Junctions. J. Histochem. Cytochem. 47:383–392.

Nacu S, Yuan W, Kan Z, Bhatt D, Rivers CS, Stinson J, Peters BA, Modrusan Z, Jung K, Seshagiri S, et al. 2011. Deep RNA sequencing analysis of readthrough gene fusions in human prostate adenocarcinoma and reference samples. BMC Med. Genomics 4:4-11.

Nana-Sinkam SP, Croce CM. 2013. Clinical applications for microRNAs in cancer. Clin. Pharmacol. Ther. 93:98–104.

Napierała M, Krzyzosiak WJ. 1997. CUG repeats present in myotonin kinase RNA form metastable “slippery” hairpins. J. Biol. Chem. 272:31079–85.

Nassa G, Tarallo R, Giurato G, De Filippo MR, Ravo M, Rizzo F, Stellato C, Ambrosino C, Baumann M, Lietzen N, et al. 2014. Post-transcriptional regulation of human breast cancer cell proteome by unliganded Estrogen Receptor beta via microRNAs. Mol. Cell. Proteomics :1–48.

Nassirpour R, Mehta PP, Yin M-J. 2013. miR-122 Regulates Tumorigenesis in Hepatocellular Carcinoma by Targeting AKT3. PLoS One 8:e79655.

Nezu Y, Kino Y, Sasagawa N, Nishino I, Ishiura S. 2007. Expression of MBNL and CELF mRNA transcripts in muscles with myotonic dystrophy. Neuromuscul. Disord. 17:306–12.

Nowrousian M. 2010. Next-generation sequencing techniques for eukaryotic microorganisms: sequencing-based solutions to biological problems. Eukaryot. Cell 9:1300–10.

O’Rourke JR, Swanson MS. 2009. Mechanisms of RNA-mediated disease. J. Biol. Chem. 284:7419–23.

Osanto S, Qin Y, Buermans HP, Berkers J, Lerut E, Goeman JJ, van Poppel H. 2012. Genome-wide microRNA expression analysis of clear cell renal cell carcinoma by next generation deep sequencing. PLoS One 7:e38298.

Osborne RJ, Lin X, Welle S, Sobczak K, O’Rourke JR, Swanson MS, Thornton CA. 2009. Transcriptional and post-transcriptional impact of toxic RNA in myotonic dystrophy. Hum. Mol. Genet. 18:1471–81.

Pandis I, Ospelt C, Karagianni N, Denis MC, Reczko M, Camps C, Hatzigeorgiou AG, Ragoussis J, Gay S, Kollias G. 2012. Identification of microRNA-221/222 and microRNA-323-3p association with rheumatoid arthritis via predictions using the human tumour necrosis factor transgenic mouse model. Ann. Rheum. Dis. 71:1716–23.

Perbellini R, Greco S, Sarra-Ferraris G, Cardani R, Capogrossi MC, Meola G, Martelli F. 2011. Dysregulation and cellular mislocalization of specific miRNAs in myotonic dystrophy type 1. Neuromuscul. Disord. 21:81–8.

Pham YC, Man N, Lam LT, Morris GE. 1998. Localization of myotonic dystrophy protein kinase in human and rabbit tissues using a new panel of monoclonal antibodies. Hum. Mol. Genet. 7:1957–65.

Pieretti M, Zhang FP, Fu YH, Warren ST, Oostra BA, Caskey CT, Nelson DL. 1991. Absence of expression of the FMR-1 gene in fragile X syndrome. Cell 66:817–22.

Powell AE, Wang Y, Li Y, Poulin EJ, Means AL, Washington MK, Higginbotham JN, Juchheim A, Prasad N, Levy SE, et al. 2012. The pan-ErbB negative regulator Lrig1 is an intestinal stem cell marker that functions as a tumor suppressor. Cell 149:146–58.

Probst-Cousin S, Neundörfer B, Heuss D. 2010. Microvasculopathic neuromuscular diseases: lessons from hypoxia-inducible factors. Neuromuscul. Disord. 20:192–7.

Raheem O, Olufemi S, Bachinski L. 2010. Mutant (CCTG)n Expansion Causes Abnormal Expression of Zinc Finger Protein 9 (ZNF9) in Myotonic Dystrophy Type 2. Am. J. Pathol. 177:3025–36.

Rajavashisth TB, Taylor AK, Andalibi A, Svenson KL, Lusis AJ. 1989. Identification of a zinc finger protein that binds to the sterol regulatory element. Science 245:640–3.

Ranum LPW, Day JW. 2002a. Dominantly inherited, non-coding microsatellite expansion disorders. Curr. Opin. Genet. Dev. 12:266–71.

Ranum LPW, Day JW. 2002b. Myotonic dystrophy: clinical and molecular parallels between myotonic dystrophy type 1 and type 2. Curr. Neurol. Neurosci. Rep. 2:465–70.

Ranum LPW, Day JW. 2004. Myotonic dystrophy: RNA pathogenesis comes into focus. Am. J. Hum. Genet. 74:793–804.

Rao P, Benito E, Fischer A. 2013. MicroRNAs as biomarkers for CNS disease. Front. Mol. Neurosci. 6:39.

Reddy S, Mistry DJ, Wang QC, Geddis LM, Kutchai HC, Moorman JR, Mounsey JP. 2002. Effects of age and gene dose on skeletal muscle sodium channel gating in mice deficient in myotonic dystrophy protein kinase. Muscle Nerve 25:850–74.

Reddy S, Smith DB, Rich MM, Leferovich JM, Reilly P, Davis BM, Tran K, Rayburn H, Bronson R, Cros D, et al. 1996. Mice lacking the myotonic dystrophy protein kinase develop a late onset progressive myopathy. Nat. Genet. 13:325–35.

Rhodes JD, Lott MC, Russell SL, Moulton V, Sanderson J, Wormstone IM, Broadway DC. 2012. Activation of the innate immune response and interferon signalling in myotonic dystrophy type 1 and type 2 cataracts. Hum. Mol. Genet. 21:852–62.

Richard H, Schulz MH, Sultan M, Nürnberger A, Schrinner S, Balzereit D, Dagand E, Rasche A, Lehrach H, Vingron M, et al. 2010. Prediction of alternative isoforms from exon expression levels in RNA-Seq experiments. Nucleic Acids Res. 38:e112.

Ricker K, Koch MC, Lehmann-Horn F, Pongratz D, Speich N, Reiners K, Schneider C, Moxley RT. 1995. Proximal myotonic myopathy. Clinical features of a multisystem disorder similar to myotonic dystrophy. Arch. Neurol. 52:25–31.

Riehle C, Abel ED. 2012. PGC-1 proteins and heart failure. Trends Cardiovasc. Med. 22:98–105.

Robinson MD, Oshlack A. 2010. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 11:R25.

Rodrigues B, Cam MC, McNeill JH. 1995. Myocardial substrate metabolism: implications for diabetic cardiomyopathy. J. Mol. Cell. Cardiol. 27:169–79.

Rönnblom A. 1996. Gastrointestinal symptoms in myotonic dystrophy. Scand. J. Gastroenterol. 31:654–7.

Ross PL, Huang YN, Marchese JN, Williamson B, Parker K, Hattan S, Khainovski N, Pillai S, Dey S, Daniels S, et al. 2004. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell. Proteomics 3:1154–69.

Rusconi F, Mancinelli E, Colombo G, Cardani R, Da Riva L, Bongarzone I, Meola G, Zippel R. 2010. Proteome profile in Myotonic Dystrophy type 2 myotubes reveals dysfunction in protein processing and mitochondrial pathways. Neurobiol. Dis. 38:273–80.

Saldanha AJ. 2004. Java Treeview--extensible visualization of microarray data. Bioinformatics 20:3246–8.

Salvatori S, Biral D, Furlan S, Marin O. 1997. Evidence for localization of the myotonic dystrophy protein kinase to the terminal cisternae of the sarcoplasmic reticulum. J. Muscle Res. Cell Motil. 18:429–40.

Salvatori S, Fanin M, Trevisan CP, Furlan S, Reddy S, Nagy JI, Angelini C. 2005. Decreased expression of DMPK: correlation with CTG repeat expansion and fibre type composition in myotonic dystrophy type 1. Neurol. Sci. 26:235–42.

Sarkar PS, Han J, Reddy S. 2004. In situ hybridization analysis of Dmpk mRNA in adult mouse tissues. Neuromuscul. Disord. 14:497–506.

Savkur RS, Philips AV, Cooper TA, Dalton JC, Moseley ML, Ranum LPW, Day JW. 2004. Insulin receptor splicing alteration in myotonic dystrophy type 2. Am. J. Hum. Genet. 74:1309–13.

Savkur RS, Philips AV, Cooper TA. 2001. Aberrant regulation of insulin receptor alternative splicing is associated with insulin resistance in myotonic dystrophy. Nat. Genet. 29:40–7.

Schara U, Schoser BGH. 2006. Myotonic dystrophies type 1 and 2: a summary on current aspects. Semin. Pediatr. Neurol. 13:71–9.

Schmittgen TD, Livak KJ. 2008. Analyzing real-time PCR data by the comparative C(T) method. Nat. Protoc. 3:1101–8.

Schulze A, Downward J. 2000. Analysis of gene expression by microarrays: cell biologist’s gold mine or minefield? J. Cell Sci. 113 Pt 23:4151–6.

Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13:2498–504.

Shin J, Charizanis K, Swanson MS. 2009. Pathogenic RNAs in microsatellite expansion disease. Neurosci. Lett. 466:99–102.

Sit S-T, Manser E. 2011. Rho GTPases and their role in organizing the actin cytoskeleton. J. Cell Sci. 124:679–83.

Smith-Vikos T, Slack FJ. 2012. MicroRNAs and their roles in aging. J. Cell Sci. 125:7–17.

Smoot ME, Ono K, Ruscheinski J, Wang P-L, Ideker T. 2011. Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics 27:431–2.

Sparks JT, Vinyard BT, Dickens JC. 2013. Gustatory receptor expression in the labella and tarsi of Aedes aegypti. Insect Biochem. Mol. Biol. 43:1161–71.

Suenaga K, Lee K-Y, Nakamori M, Tatsumi Y, Takahashi MP, Fujimura H, Jinnai K, Yoshikawa H, Du H, Ares M, et al. 2012. Muscleblind-like 1 knockout mice reveal novel splicing defects in the myotonic dystrophy brain. PLoS One 7:e33218.

Suominen T, Bachinski LL, Auvinen S, Hackman P, Baggerly KA, Angelini C, Peltonen L, Krahe R, Udd B. 2011. Population frequency of myotonic dystrophy: higher than expected frequency of myotonic dystrophy type 2 (DM2) mutation in Finland. Eur. J. Hum. Genet. 19:776–82.

Suominen T, Schoser B, Raheem O, Auvinen S, Walter M, Krahe R, Lochmüller H, Kress W, Udd B. 2008. High frequency of co-segregating CLCN1 mutations among myotonic dystrophy type 2 patients from Finland and Germany. J. Neurol. 255:1731–6.

Sutcliffe JS, Nelson DL, Zhang F, Pieretti M, Caskey CT, Saxe D, Warren ST. 1992. DNA methylation represses FMR-1 transcription in fragile X syndrome. Hum. Mol. Genet. 1:397–400.

Talks KL, Turley H, Gatter KC, Maxwell PH, Pugh CW, Ratcliffe PJ, Harris AL. 2000. The expression and distribution of the hypoxia-inducible factors HIF-1alpha and HIF-2alpha in normal human tissues, cancers, and tumor-associated macrophages. Am. J. Pathol. 157:411–21.

Taneja KL, McCurrach M, Schalling M, Housman D, Singer RH. 1995. Foci of trinucleotide repeat transcripts in nuclei of myotonic dystrophy cells and tissues. J. Cell Biol. 128:995–1002.

Tang F, Barbacioru C, Bao S, Lee C, Nordman E, Wang X, Lao K, Surani MA. 2010. Tracing the derivation of embryonic stem cells from the inner cell mass by single-cell RNA-Seq analysis. Cell Stem Cell 6:468–78.

Thornton CA, Griggs RC, Moxley RT. 1994. Myotonic dystrophy with no trinucleotide repeat expansion. Ann. Neurol. 35:269–72.

Thum T, Galuppo P, Wolf C, Fiedler J, Kneitz S, van Laake LW, Doevendans PA, Mummery CL, Borlak J, Haverich A, et al. 2007. MicroRNAs in the human heart: a clue to fetal gene reprogramming in heart failure. Circulation 116:258–67.

Tian B, White RJ, Xia T, Welle S, Turner DH, Mathews MB, Thornton CA. 2000. Expanded CUG repeat RNAs form hairpins that activate the double-stranded RNA-dependent protein kinase PKR. RNA 6:79–87.

Timchenko L, Caskey C. 1996. Trinucleotide repeat disorders in humans: discussions of mechanisms and medical issues. FASEB J. 10:1589–97.

Timchenko LT, Miller JW, Timchenko NA, DeVore DR, Datar KV, Lin L, Roberts R, Caskey CT, Swanson MS. 1996. Identification of a (CUG)n triplet repeat RNA-binding protein and its expression in myotonic dystrophy. Nucleic Acids Res. 24:4407–14.

Timchenko LT, Timchenko NA, Caskey CT, Roberts R. 1996. Novel proteins with binding specificity for DNA CTG repeats and RNA CUG repeats: implications for myotonic dystrophy. Hum. Mol. Genet. 5:115–21.

Timchenko NA, Cai ZJ, Welm AL, Reddy S, Ashizawa T, Timchenko LT. 2001. RNA CUG repeats sequester CUGBP1 and alter protein levels and activity of CUGBP1. J. Biol. Chem. 276:7820–6.

Tóth G, Gáspári Z, Jurka J. 2000. Microsatellites in different eukaryotic genomes: survey and analysis. Genome Res. 10:967–81.

Trapnell C, Pachter L, Salzberg SL. 2009. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25:1105–11.

Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. 2010. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28:511–5.

Tu T, Zhou S, Liu Z, Li X, Liu Q. 2014. Quantitative Proteomics of Changes in Energy Metabolism-Related Proteins in Atrial Tissue From Valvular Disease Patients With Permanent Atrial Fibrillation. Circ. J.

Turner C, Hilton-Jones D. 2010. The myotonic dystrophies: diagnosis and management. J. Neurol. Neurosurg. Psychiatry 81:358–67.

Udd B, Krahe R, Wallgren-Pettersson C, Falck B, Kalimo H. 1997. Proximal myotonic dystrophy--a family with autosomal dominant muscular dystrophy, cataracts, hearing loss and hypogonadism: heterogeneity of proximal myotonic syndromes? Neuromuscul. Disord. 7:217–28.

Udd B, Krahe R. 2012. The myotonic dystrophies: molecular, clinical, and therapeutic challenges. Lancet Neurol. 11:891–905.

Udd B, Meola G, Krahe R, Wansink DG, Bassez G, Kress W, Schoser B, Moxley R. 2011. Myotonic dystrophy type 2 (DM2) and related disorders report of the 180th ENMC workshop including guidelines on diagnostics and management 3-5 December 2010, Naarden, The Netherlands. Neuromuscul. Disord. 21:443–50.

Usdin K. 2008. The biological effects of simple tandem repeats: lessons from the repeat expansion diseases. Genome Res. 18:1011–9.

Velthut-Meikas A, Simm J, Tuuri T, Tapanainen JS, Metsis M, Salumets A. 2013. Research resource: small RNA-seq of human granulosa cells reveals miRNAs in FSHR and aromatase genes. Mol. Endocrinol. 27:1128–41.

Vergoulis T, Vlachos IS, Alexiou P, Georgakilas G, Maragkakis M, Reczko M, Gerangelos S, Koziris N, Dalamagas T, Hatzigeorgiou AG. 2012. TarBase 6.0: capturing the exponential growth of miRNA targets with experimental support. Nucleic Acids Res. 40:D222–9.

Vihola A, Bachinski LL, Sirito M, Olufemi S-E, Hajibashi S, Baggerly KA, Raheem O, Haapasalo H, Suominen T, Holmlund-Hampf J, et al. 2010. Differences in aberrant expression and splicing of sarcomeric proteins in the myotonic dystrophies DM1 and DM2. Acta Neuropathol. 119:465–79.

Vihola A, Sirito M, Bachinski LL, Raheem O, Screen M, Suominen T, Krahe R, Udd B. 2013. Altered expression and splicing of Ca(2+) metabolism genes in myotonic dystrophies DM1 and DM2. Neuropathol. Appl. Neurobiol. 39:390–405.

Wallace MA, Lamon S, Russell AP. 2012. The regulation and function of the striated muscle activator of rho signaling (STARS) protein. Front. Physiol. 3:469.

Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB. 2008a. Alternative isoform regulation in human tissue transcriptomes. Nature 456:470–6.

Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB. 2008b. Alternative isoform regulation in human tissue transcriptomes. Nature 456:470–6.

Wang G-S, Kearney DL, De Biasi M, Taffet G, Cooper TA. 2007. Elevation of RNA-binding protein CUGBP1 is an early event in an inducible heart-specific mouse model of myotonic dystrophy. J. Clin. Invest. 117:2802–11.

Wang Q, Wang Y, Minto AW, Wang J, Shi Q, Li X, Quigg RJ. 2008. MicroRNA-377 is up-regulated and can lead to increased fibronectin production in diabetic nephropathy. FASEB J. 22:4126–35.

Wang Z, Gerstein M, Snyder M. 2009. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10:57–63.

Wansink DG, van Herpen REMA, Coerwinkel-Driessen MM, Groenen PJTA, Hemmings BA, Wieringa B. 2003. Alternative splicing controls myotonic dystrophy protein kinase structure, enzymatic activity, and subcellular localization. Mol. Cell. Biol. 23:5489–501.

Ward AJ, Rimer M, Killian JM, Dowling JJ, Cooper TA. 2010. CUGBP1 overexpression in mouse skeletal muscle reproduces features of myotonic dystrophy type 1. Hum. Mol. Genet. 19:3614–22.

Warde-Farley D, Donaldson SL, Comes O, Zuberi K, Badrawi R, Chao P, Franz M, Grouios C, Kazi F, Lopes CT, et al. 2010. The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 38:W214–20.

Warren ST, Nelson DL. 1993. Trinucleotide repeat expansions in neurological disease. Curr. Opin. Neurobiol. 3:752–9.

Webb KJ, Xu T, Park SK, Yates JR. 2013. Modified MuDPIT separation identified 4488 proteins in a system-wide analysis of quiescence in yeast. J. Proteome Res. 12:2177–84.

White NMA, Masui O, Desouza LV, Krakovska O, Metias S, Romaschin AD, Honey RJ, Stewart R, Pace K, Lee J, et al. 2014. Quantitative proteomic analysis reveals potential diagnostic markers and pathways involved in pathogenesis of renal cell carcinoma. Oncotarget 5.

Wojciechowska M, Krzyzosiak WJ. 2011. Cellular toxicity of expanded RNA repeats: focus on RNA foci. Hum. Mol. Genet. 20:3811–21.

Xiao F, Zuo Z, Cai G, Kang S, Gao X, Li T. 2009. miRecords: an integrated resource for microRNA-target interactions. Nucleic Acids Res. 37:D105–10.

You G, Yan W, Zhang W, Wang Y, Bao Z, Li S, Li S, Li G, Song Y, Kang C, et al. 2012. Significance of miR-196b in tumor-related epilepsy of patients with gliomas. PLoS One 7:e46218.

Yuan Y, Compton SA, Sobczak K, Stenberg MG, Thornton CA, Griffith JD, Swanson MS. 2007. Muscleblind-like 1 interacts with RNA hairpins in splicing target and pathogenic RNAs. Nucleic Acids Res. 35:5474–86.

Zacharewicz E, Lamon S, Russell AP. 2013. MicroRNAs in skeletal muscle and their regulation with exercise, ageing, and disease. Front. Physiol. 4:266.

Zhang M, Chen M, Kim J-R, Zhou J, Jones RE, Tune JD, Kassab GS, Metzger D, Ahlfeld S, Conway SJ, et al. 2011. SWI/SNF complexes containing Brahma or Brahma-related gene 1 play distinct roles in smooth muscle development. Mol. Cell. Biol. 31:2618–31.

Zhang W, Wang T, Su Y, Li W, Frame LT, Ai G. 2011. Recombinant adenoviral microRNA-206 induces myogenesis in C2C12 cells. Med. Sci. Monit. 17:BR364–71.

Zhang Y, Teng F, Luo G-Z, Wang M, Tong M, Zhao X, Wang L, Wang X-J, Zhou Q. 2013. MicroRNA-323-3p regulates the activity of polycomb repressive complex 2 (PRC2) via targeting the mRNA of embryonic ectoderm development (Eed) gene in mouse embryonic stem cells. J. Biol. Chem. 288:23659–65.

Zhao Z, Zhao Q, Warrick J, Lockwood CM, Woodworth A, Moley KH, Gronowski AM. 2012. Circulating microRNA miR-323-3p as a biomarker of ectopic pregnancy. Clin. Chem. 58:896–905.

Zhong H, De Marzo AM, Laughner E, Lim M, Hilton DA, Zagzag D, Buechler P, Isaacs WB, Semenza GL, Simons JW. 1999. Overexpression of hypoxia-inducible factor 1alpha in common human cancers and their metastases. Cancer Res. 59:5830–5.

Zoncu R, Efeyan A, Sabatini DM. 2011. mTOR: from growth signal integration to cancer, diabetes and ageing. Nat. Rev. Mol. Cell Biol. 12:21–35.

Von zur Mühlen F, Klass C, Kreuzer H, Mall G, Giese A, Reimers CD. 1998. Cardiac involvement in proximal myotonic myopathy. Heart 79:619–21.
