Bioinformatics Approaches to Identify Pain Mediators, Novel LncRNAs and Distinct Modalities of Neuropathic Pain

by Georgios Baskozos

A thesis submitted to University College London for the degree of Doctor of Philosophy

Institute of Structural and Molecular Biology
University College London
September 2016

Declaration

I, Georgios Baskozos, confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis.

Georgios Baskozos
29 September 2016

Abstract

This thesis presents a number of studies in bioinformatics and functional genomics. The studies were carried out in collaboration with experimental scientists of the London Pain Consortium (LPC), an initiative that has promoted collaborations between experimental and computational scientists to further the understanding of pain. The studies are mainly concerned with the molecular biology of pain and deal with data gathered from high-throughput technologies that assess the transcriptional changes involved in well-induced pain states, in both animal models of pain and human patients. We have analysed next-generation sequencing (NGS) data to assess the transcriptional changes in rodent dorsal root ganglia under well-induced pain states. We have also developed a customised computational pipeline that analyses RNA-sequencing data to identify novel long non-coding RNAs (LncRNAs), which may function as mediators of neuropathic pain. Our analyses detected hundreds of novel LncRNAs that are significantly dysregulated between sham-operated animals and animal models of pain. In addition, to gain insights into neuropathic pain, including its molecular signature, somatosensory profiles and clusters of individuals related to pain severity, we analysed clinical data together with data obtained from quality-of-life pain questionnaires.
Based on this study, we were able to identify distinct pain modalities associated with the intensity of neuropathic pain. Our results will be useful for the understanding of neuropathic pain and its future treatment.

Acknowledgments

I would like to thank my supervisor Christine Orengo for all her irreplaceable guidance, help and support throughout this PhD and for giving me the opportunity to work in such a friendly and prestigious group. I would like to thank David Bennett for all his guidance and support throughout my research and for giving me the opportunity to collaborate with such prestigious scientists and groups. Their help and advice throughout these years have been truly immeasurable. I would also like to thank Steve MacMahon and all scientists of the London Pain Consortium for their advice and for educating me about pain and the nervous system. I would also like to acknowledge my subsidiary supervisor Andrew Martin and the chair of the thesis committee Kevin Bryson for all their help and guidance.

This work has been done in collaboration with many groups. First, I would like to thank all members of the Orengo group, past and present, for all their advice and help and for creating the friendly and supportive lab in which I have always enjoyed working. I would also like to thank all the members of David Bennett's group and Steve MacMahon’s group. In particular, I would like to thank Jim Perkins, Ana Antunes-Martins, John Dawes, John Lees and Andreas Themistocleous. It has always been a pleasure to work with them. Many thanks also to Jeffrey Mogil and everyone in his group, with whom I collaborated and who have been really hospitable and supportive.

Finally, special thanks to all my friends and family, both here in the UK and back in Greece. Thank you for being there and for giving me the courage to take this path. This PhD would not have been possible without your support.
Contents

Abstract
Acknowledgments
Contents
List of Figures
List of Tables
Introduction
    Pain
        Pain at the molecular level
        Neuropathic Pain
        Animal models of pain
    Gene Expression
        The Central Dogma
        Long Non-coding RNAs (LncRNAs)
        Functional repertoires of LncRNAs
        Known pain-related LncRNAs
        Overview of computational pipelines for identifying LncRNAs
    RNA-Sequencing
        RNA isolation and library construction
        Potentials and drawbacks
        Analysing RNA-sequencing data
    Explain complex interactions of many variables
        Principal Components Analysis and varimax rotation
    Overview of thesis chapters
Methods for identifying LncRNAs and analyse RNA-sequencing data
    Overview of computational identification and DE of LncRNAs
    Methods
        RNA-Seq and library preparation
        Aligning reads to the genome
        Selecting reads according to overlapping genomic features
        Identify expressed regions outside known gene models
        Reconstruct genes of putative LncRNAs
        Calculate DE and associate expression profiles of putative LncRNAs and genes
        Comparing conditions using Generalized Linear Models (GLMs)
        Annotation of predicted LncRNAs
        Calculate counts and DE of known genes
        Functional enrichments
Transcriptional changes of protein coding genes and novel LncRNAs in rat’s DRG after the SNT pain model
    Overview
    Background
        The Spinal Nerve Transection pain model
        RNA-Seq and library preparation
        Aligning RNA-seq reads to genome
        Experimental Design
        Further quality control
    Results
        Differential Expression analysis of known genes
        Functional enrichment
        Expression patterns of ion channels and pain genes
        Identification of LncRNAs
        Expression of LncRNAs in rat’s DRG
        LncRNAs and pain-related protein coding genes
    Discussion
Transcriptional changes of LncRNAs and protein coding genes in DRG of two mouse strains experiencing high and low induced hypersensitivity
    Overview
    Background
        The Spared Nerve Injury pain model
        Behavioural tests
        Mouse strains and phenotypes
        Dissections
        RNA isolation and extraction
        Dataset
