Bioinformatics Approaches to Identify Pain Mediators, Novel lncRNAs and Distinct Modalities of Neuropathic Pain
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
-
Improving the Prediction of Transcription Factor Binding Sites To
Improving the prediction of transcription factor binding sites to aid the interpretation of non-coding single nucleotide variants. Narayan Jayaram, Research Department of Structural and Molecular Biology, University College London. A thesis submitted to University College London for the degree of Doctor of Philosophy. Declaration: I, Narayan Jayaram, confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis. Narayan Jayaram. Abstract: Single nucleotide variants (SNVs) that occur in transcription factor binding sites (TFBSs) can disrupt the binding of transcription factors and alter gene expression, which can cause inherited diseases and act as driver SNVs in cancer. The identification of SNVs in TFBSs has historically been challenging given the limited number of experimentally characterised TFBSs. The recent ENCODE project has resulted in the availability of ChIP-Seq data that provides genome-wide sets of regions bound by transcription factors. These data have the potential to improve the identification of SNVs in TFBSs. However, as the ChIP-Seq data identify a broader range of DNA in which a transcription factor binds, computational prediction is required to identify the precise TFBS. Prediction of TFBSs involves scanning a DNA sequence with a Position Weight Matrix (PWM) using a pattern-matching tool.
This thesis focusses on the prediction of TFBSs by: (a) evaluating a set of locally-installable pattern-matching tools and identifying the best performing tool (FIMO), (b) using the ENCODE ChIP-Seq data to evaluate a set of de novo motif discovery tools that are used to derive PWMs which can handle large volumes of data, (c) identifying the best performing tool (rGADEM), (d) using rGADEM to generate a set of PWMs from the ENCODE ChIP-Seq data and (e) by finally checking that the selection of the best pattern matching tool is not unduly influenced by the choice of PWMs. -
Applied Category Theory for Genomics – an Initiative
Applied Category Theory for Genomics – An Initiative. Yanying Wu (1,2). (1) Centre for Neural Circuits and Behaviour, University of Oxford, UK; (2) Department of Physiology, Anatomy and Genetics, University of Oxford, UK. 06 Sept, 2020. arXiv:2009.02822v1 [q-bio.GN] 6 Sep 2020. Abstract: The ultimate secret of all lives on earth is hidden in their genomes – a totality of DNA sequences. We currently know the whole genome sequence of many organisms, while our understanding of the genome architecture on a systematic level remains rudimentary. Applied category theory opens a promising way to integrate the huge amount of heterogeneous information in genomics, to advance our knowledge regarding genome organization, and to provide us with a deep and holistic view of our own genomes. In this work we explain why applied category theory carries such a hope, and we move on to show how it could actually do so, albeit in baby steps. The manuscript is intended to be readable by both mathematicians and biologists, so no prior knowledge is required from either side. 1 Introduction: DNA, the genetic material of all living beings on this planet, holds the secret of life. The complete set of DNA sequences in an organism constitutes its genome – the blueprint and instruction manual of that organism, be it a human or a fly [1]. Therefore, genomics, which studies the contents and meaning of genomes, has stood at the centre stage of scientific research since its birth. The twentieth century witnessed three milestones of genomics research [1]. It began with the discovery of Mendel's laws of inheritance [2], reached a climax in the middle with the revelation of the DNA double helix structure [3], and ended with the completion of a first draft of the complete human genome sequence [4]. -
A Community Proposal to Integrate Structural
F1000Research 2020, 9(ELIXIR):278 Last updated: 11 JUN 2020 OPINION ARTICLE A community proposal to integrate structural bioinformatics activities in ELIXIR (3D-Bioinfo Community) [version 1; peer review: 1 approved, 3 approved with reservations] Christine Orengo1, Sameer Velankar2, Shoshana Wodak3, Vincent Zoete4, Alexandre M.J.J. Bonvin 5, Arne Elofsson 6, K. Anton Feenstra 7, Dietland L. Gerloff8, Thomas Hamelryck9, John M. Hancock 10, Manuela Helmer-Citterich11, Adam Hospital12, Modesto Orozco12, Anastassis Perrakis 13, Matthias Rarey14, Claudio Soares15, Joel L. Sussman16, Janet M. Thornton17, Pierre Tuffery 18, Gabor Tusnady19, Rikkert Wierenga20, Tiina Salminen21, Bohdan Schneider 22 1Structural and Molecular Biology Department, University College, London, UK 2Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, CB10 1SD, UK 3VIB-VUB Center for Structural Biology, Brussels, Belgium 4Department of Oncology, Lausanne University, Swiss Institute of Bioinformatics, Lausanne, Switzerland 5Bijvoet Center, Faculty of Science – Chemistry, Utrecht University, Utrecht, 3584CH, The Netherlands 6Science for Life Laboratory, Stockholm University, Solna, S-17121, Sweden 7Dept. Computer Science, Center for Integrative Bioinformatics VU (IBIVU), Vrije Universiteit, Amsterdam, 1081 HV, The Netherlands 8Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, L-4367, Luxembourg 9Bioinformatics center, Department of Biology, University of Copenhagen, Copenhagen, DK-2200, -
SD Gross BFI0403
Janet Thornton: Bioinformatician avant la lettre. Michael Gross. Bioinformatics is very much a buzzword of our time, with new courses and institutes dedicated to it sprouting up almost everywhere. Most significantly, the flood of genome data has raised the general awareness of the need to develop new computational approaches to make sense of all the raw information collected. Professor Janet Thornton, the current director of the European Bioinformatics Institute (EBI), an EMBL outpost based at the Hinxton campus near Cambridge, has been in the field even before there was a word for it. Coming to structural biology with a physics degree from the University of Nottingham, she was already involved with computer-generated structural images in the 1970s, when personal computers and user-friendly programs had yet to be invented. The Early Years: From there to the EBI, her remarkable career appears to be organised in decades. During the 1970s, she did doctoral and post-doctoral research at the Molecular Biophysics Laboratory in Oxford and at the National Institute for Medical Research in Mill Hill, near London. [Photo: Janet Thornton] [...] software to compare structures to each other, recognise known folds and spot new ones. Such work provides both fundamental insights into the workings of evolution on a molecular level, and [...] larities. Within 15 minutes, the software can check all 2.4 billion possible relationships and pick the ones relevant to the question at hand. In comparison to publicly available bioinformatics packages such as Blast or Psiblast, Biopendium can provide an additional 30 % of annotation, according to Inphar[...] -
Functional Effects Detailed Research Plan
GeCIP Detailed Research Plan Form. Background: The Genomics England Clinical Interpretation Partnership (GeCIP) brings together researchers, clinicians and trainees from both academia and the NHS to analyse, refine and make new discoveries from the data from the 100,000 Genomes Project. The aims of the partnerships are: 1. To optimise: • clinical data and sample collection • clinical reporting • data validation and interpretation. 2. To improve understanding of the implications of genomic findings and improve the accuracy and reliability of information fed back to patients. To add to knowledge of the genetic basis of disease. 3. To provide a sustainable thriving training environment. The initial wave of GeCIP domains was announced in June 2015 following a first round of applications in January 2015. On the 18th June 2015 we invited the inaugurated GeCIP domains to develop more detailed research plans working closely with Genomics England. These will be used to ensure that the plans are complementary and add real value across the GeCIP portfolio and address the aims and objectives of the 100,000 Genomes Project. They will be shared with the MRC, Wellcome Trust, NIHR and Cancer Research UK as existing members of the GeCIP Board to give advance warning and manage funding requests to maximise the funds available to each domain. However, formal applications will then be required to be submitted to individual funders. They will allow Genomics England to plan shared core analyses and the required research and computing infrastructure to support the proposed research. They will also form the basis of assessment by the Project’s Access Review Committee, to permit access to data. -
WO 2014/135655 A1 12 September 2014 (12.09.2014)
(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)
(19) World Intellectual Property Organization, International Bureau
(10) International Publication Number: WO 2014/135655 A1
(43) International Publication Date: 12 September 2014 (12.09.2014)
(51) International Patent Classification: C12Q 1/68 (2006.01)
(21) International Application Number: PCT/EP2014/054384
(22) International Filing Date: 6 March 2014 (06.03.2014)
(25) Filing Language: English
(26) Publication Language: English
(30) Priority Data: 13305253.0, 6 March 2013 (06.03.2013), EP
(71) Applicants: INSTITUT CURIE [FR/FR]; 26 rue d'Ulm, F-75248 Paris cedex 05 (FR). CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE [FR/FR]; 3 rue Michel Ange, F-75016 Paris (FR).
(81) Designated States (unless otherwise indicated, for every kind of national protection available): AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BN, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IR, IS, JP, KE, KG, KN, KP, KR, KZ, LA, LC, LK, LR, LS, LT, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SA, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.
(84) Designated States (unless otherwise indicated, for every kind of regional protection available): ARIPO (BW, GH, GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, SZ, TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, RU, TJ, -
ChIP-seq Annotation and Visualization: How to Add Biological Meaning to Peaks
ChIP-seq Annotation and Visualization: How to add biological meaning to peaks. M. Defrance, C. Herrmann, D. Puthier, M. Thomas-Chollier, S Le Gras, J van Helden.
Our data in the context: custom track uploaded by the user (here ESR1 peaks in siGATA3 context), shown with public UCSC annotation/data tracks.
Typical questions: What are the genes associated with the peaks? Are some genomic categories over-represented? Are some functional categories over-represented? Are the peaks close to the TSS, …?
ChIP-seq peaks: annotation (annotated peaks), visualisation, enrichment profiles (genomic and functional).
Annotated peaks (chr, start, end, gene):
chr15  65294195  65295186
chrX  19635923  19638359  Chst7
chr8  33993863  33995559
chr10  114236977  114239326  Trhde
chrX  69515082  69516482  Gabre
chr4  49857142  49858913  Grin3a
chr16  7352861  7353410  Rbfox1
chr7  64764156  64765421  Gabra5
chrX  83436881  83438330  Nr0b1
chr10  120288598  120289143  Msrb3
chr5  67446361  67446855  Limch1
[Figure panels: Genomic location; Relation to CpG island; Average Profile near TSS; Average Profile near TTS; Distribution of Peak Heights; ChIP Regions (Peaks) over Chromosomes; Average Gene Profile; Average Concatenated Exon Profile; Average Concatenated Intron Profile. Axes: Relative Distance to TSS/TTS (bp), Chromosome Size (bp), Relative Location (%), # of regions.] W. Huang et al. -
Supplementary Information For
Supplementary Information for Increased Muscleblind levels by chloroquine treatment improve myotonic dystrophy type 1 phenotypes in in vitro and in vivo models. Ariadna Bargiela, Maria Sabater-Arcis, Jorge Espinosa-Espinosa, Miren Zulaica, Adolfo Lopez de Munain and Ruben Artero. Corresponding author: Ruben Artero Email: [email protected] This PDF file includes: Supplementary text Figs. S1 to S13 Tables S1 References for SI reference citations 1 www.pnas.org/cgi/doi/10.1073/pnas.1820297116 Supplementary Information Text Materials and methods. Fly strains and crosses w1118 line was obtained from the Bloomington Drosophila Stock Center (Indiana University, Bloomington, IN, USA). Mhc-Gal4 flies were described in (1). Mhc-Gal4 UAS-(CTG)480 flies were generated in (2). All crosses were carried out at 25 °C with standard fly food. For oral administration of chloroquine (Chloroquine diphosphate salt solid, ≥98%, C6628 Sigma Aldrich), a maximum of 25 one-day adult flies were collected in tubes containing standard food supplemented with chloroquine (10 or 100 μM). Flies were transferred to tubes containing fresh food every 2-3 days for a total administration time of 7 days. Determination of caspase-3 and caspase-7 activity Ten adult female flies of the indicated genotypes were homogenized in 100 μl of cold PBS buffer using TissueLyser LT (Qiagen, Hilden, Germany). After a 10 min centrifugation, the supernatant was transferred into a white 96-well plate. Caspase-3 and caspase-7 activity was measured using the Caspase-Glo 3/7 Assay Systems (Promega, Fitchburg, WI, USA). Briefly, 100 μl of Caspase-Glo 3/7 reagent was added per well and the plate was incubated at room temperature for 30 min. -
ProLanGO: Protein Function Prediction Using Neural Machine
ProLanGO: Protein Function Prediction Using Neural Machine Translation Based on a Recurrent Neural Network. Renzhi Cao (Department of Computer Science, Pacific Lutheran University, Tacoma, WA 98447); Colton Freitas (Department of Computer Science, Pacific Lutheran University, Tacoma, WA 98447); Leong Chan (School of Business, Pacific Lutheran University, Tacoma, WA 98447); Miao Sun (Baidu Inc., 1195 Bordeaux Dr, Sunnyvale, CA 94089, USA); Haiqing Jiang (Hiretual Inc., San Jose, CA 95131, USA); Zhangxin Chen (School of electronic engineering, University of Electronic Science and Technology of China, Chengdu, 610051, China). arXiv:1710.07016v1 [q-bio.QM] 19 Oct 2017. Abstract: With the development of next generation sequencing techniques, it is fast and cheap to determine protein sequences but relatively slow and expensive to extract useful information from protein sequences because of limitations of traditional biological experimental techniques. Protein function prediction has been a long-standing challenge to fill the gap between the huge amount of protein sequences and the known function. In this paper, we propose a novel method that converts the protein function problem into a language translation problem, from the newly proposed protein sequence language “ProLan” to the protein function language “GOLan”, and build a neural machine translation model based on recurrent neural networks to translate the “ProLan” language into the “GOLan” language. We blindly tested our method by participating in the third Critical Assessment of Function Annotation (CAFA 3) in 2016, and also evaluated the performance of our method on selected proteins whose function was released after the CAFA competition. -
100000 Protein Structures for the Biologist
© 1998 Nature America Inc. • http://structbio.nature.com meeting review. 100,000 protein structures for the biologist. Andrej Šali. Structural genomics promises to deliver experimentally determined three-dimensional structures for many thousands of protein domains. These domains will be carefully selected, so that the methods of fold assignment and comparative protein structure modeling will result in useful models for most other protein sequences. The impact on biology will be dramatic. Recently, some 200 structural biologists from both academia and industry gathered in Avalon, New Jersey to explore the science and organization of the new field of structural genomics [1]. The aim of the structural genomics project is to deliver structural information about most proteins. While it is not feasible to determine the structure of every protein by experiment, useful models can be obtained by fold assignment and comparative model[...] [...] archiving and annotation of the new structures. Several pilot projects in structural genomics that implement all of these steps are already underway (Table 1). It is feasible to determine at least 10,000 protein structures within the next five years. No new discoveries or methods are needed for this task. The average cost per structure is expected to decrease significantly from ~$200,000 per structure to perhaps less than $20,000 per structure. [...] on the order of 10,000. As an alternative, it has also been suggested that 100,000 protein structures need to be determined by experiment (David Eisenberg, UCLA); this would allow calculation of models with higher accuracy than is possible with only 10,000 known structures. After a list of the 30-seq families is obtained, the representatives will be picked and prioritized. A variety of different criteria likely will be used for this [...] -
Microarray and Pattern Miner Analysis of AXL and VIM Gene Networks in MDA‑MB‑231 Cells
MOLECULAR MEDICINE REPORTS 18: 4147-4155, 2018. Microarray and pattern miner analysis of AXL and VIM gene networks in MDA-MB-231 cells. SUDHAKAR NATARAJAN (1), VENIL N SUMANTRAN (2), MOHAN RANGANATHAN (1) and SURESH MADHESWARAN (1). (1) Department of Biotechnology, Faculty of Engineering and Technology; (2) Dr. A.P.J. Abdul Kalam Centre for Excellence in Innovation and Entrepreneurship, Dr. M.G.R. Educational and Research Institute, Tamil Nadu, Chennai 600095, India. Received March 20, 2018; Accepted August 2, 2018. DOI: 10.3892/mmr.2018.9404. Abstract. MDA-MB-231 cells represent malignant triple-negative breast cancer, which overexpress epidermal growth factor receptor (EGFR) and two genes (AXL and VIM) associated with poor prognosis. The present study aimed to identify novel therapeutic targets and elucidate the functional networks for the AXL and VIM genes in MDA-MB-231 cells. We identified 71 genes upregulated in MDA-MB-231 vs. MCF7 cells using BRB-Array tool to re-analyse microarray data from six GEO datasets. Gene ontology and STRING analysis showed that 43/71 genes upregulated in MDA-MB-231 compared with MCF7 cells, regulate cell survival and migration. Another 19 novel genes regulate migration, metastases, senescence, autophagy and chemoresistance. [...] migration, metastasis and chemoresistance, whereas the VIM gene network regulates novel tumorigenic processes, such as lipogenesis, senescence and autophagy. Notably, these two networks contain 12 genes not reported for TNBC. Introduction. Triple negative breast cancers (TNBC) lack expression of three important receptors (ER, PR, and HER2). These cancers account for 10-15% of breast cancers, and are characterized by overexpression of epidermal growth factor receptor (EGFR), high proliferative rate, and mutations in the p53 and BRCA1 tumour -
Gaëlle GARET Classification Et Caractérisation De Familles Enzy
Order no.: 000. Year 2015 60. THESIS / UNIVERSITÉ DE RENNES 1, under the seal of the Université Européenne de Bretagne, for the degree of DOCTOR OF THE UNIVERSITÉ DE RENNES 1. Specialization: Computer Science. Doctoral school: Matisse. Presented by Gaëlle GARET. Prepared at the research unit Inria/Irisa – UMR6074, Institut de Recherche en Informatique et Système Aléatoires. University component: ISTIC. Title: Classification and characterization of enzyme families using formal methods (Classification et caractérisation de familles enzymatiques à l’aide de méthodes formelles). Thesis to be defended in Rennes on 16 December 2014 before a jury composed of: Jean-Christophe JANODET, Professor at the Université d’Evry-Val-d’Essonne / Reviewer; Amedeo NAPOLI, Research Director at Loria, Nancy / Reviewer; Colin DE LA HIGUERA, Professor at the Université de Nantes / Examiner; Olivier RIDOUX, Professor at the Université de Rennes 1 / Examiner; Mirjam CZJZEK, CNRS Research Director, Roscoff / Examiner; Jacques NICOLAS, Research Director at Inria, Rennes / Thesis supervisor; François COSTE, Research Scientist at Inria, Rennes / Thesis co-supervisor. So it had always been. The more knowledge men accumulated, the more they grasped the extent of their ignorance. Dan Brown, The Lost Symbol. Tell me and I forget. Teach me and I remember. Involve me and I learn. Benjamin Franklin. Acknowledgements: I first thank the Brittany region and Inria, which made it possible to fund this thesis project. Thanks to Jean-Christophe Janodet and Amedeo Napoli, who agreed to review this thesis, and to Olivier Ridoux, Colin De La Higuera and Mirjam Czjzek for serving on the jury. I would also like to say a big thank you to my two thesis supervisors, Jacques Nicolas and François Coste, who have always given me their support, both scientifically and personally.