Glycoinformatics: Data Mining-Based Approaches

Total Page:16

File Type:pdf, Size:1020Kb

Glycoinformatics: Data Mining-Based Approaches 10 CHIMIA 2011, 65, No. 1/2 Glycochemistry today doi:10.2533/chimia.2011.10 Chimia 65 (2011) 10–13 © Schweizerische Chemische Gesellschaft Glycoinformatics: Data Mining-based Approaches Hiroshi Mamitsuka* Abstract: Carbohydrates or glycans are major cellular macromolecules, working for a variety of vital biological functions. Due to long-term efforts by experimentalists, the current number of structurally different, determined carbohydrates has exceeded 10,000 or more. As a result data mining-based approaches for glycans (or trees in a computer science sense) have attracted attention and have been developed over the last five years, presenting new techniques even from computer science viewpoints. This review summarizes cutting-edge techniques for glycans in each of the three categories of data mining: classification, clustering and frequent pattern mining, and shows results obtained by applying these techniques to real sets of glycan structures. Keywords: Data mining · Frequent subtrees · Glycan structures · Machine learning · Probabilistic models 1. Introduction glycans can be trees, consisting of nodes 2. Mining from Glycan Structures: and edges (which connect nodes), corre- Data Mining Techniques for Trees Oligosaccharides and glycans are major sponding to monosaccharides and chemi- cellular macromolecules associated with cal linkages, respectively. Moreover gly- In general, current approaches for min- a variety of important biological phenom- cans are rooted directed ordered trees be- ing from data or machine learning can ena, including antigen-antibody interac- cause i) a glycan connects to a protein (an be classified into roughly three types: i) tion[1] and cell fate controlling.[2] Pro- amino acid) by only one monosaccharide, classification (or supervised learning), teins and DNAs consist of twenty types called the root, ii) connections are directed ii) clustering (or unsupervised learning) of amino acids and four types of nucleo- from the root to leaves, meaning that two and iii) frequent pattern mining.[7,8] Here tides, respectively. Likewise glycans have connected monosaccharides can be a par- we briefly explain the difference between building blocks, called monosaccharides, ent (closer to the root) and a child (closer these concepts, under which the input is which however are more diverse, the ma- to a leaf), ancestors and descendants being always called examples (which are in our jor ones being fructose, galactose, glucose able to be defined in a similar manner, and case rooted directed ordered trees). In clas- and mannose.[3] The uniqueness of carbo- iii) monosaccharides (children) can be or- sification, examples are labeled or class hydrates is in the connection of monosac- dered by carbon numbers attached to link- labels are attached to examples, and the charides, where two or more monosaccha- ages to another monosaccharide (parent), purpose is to make a classifier/predictor rides can connect to one monosaccharide meaning that children are ordered from which can predict a class to be assigned to without any cycle, forming branch-shaped the oldest sibling to the youngest sibling. a newly given example. This is supervised extensions. In a computer science sense, Hereafter a subtree means a node in a tree learning. On the other hand, in clustering, and its all descendants and edges from that examples are not labeled, and the purpose node to leaves, while a supertree of tree T here is to assign examples to some groups is a tree having T as a subtree. The unique by which we can see the similarity between structure of carbohydrates makes them examples, such as if two examples are in different from other macromolecules such the same group, they are similar. This is as proteins and DNAs which are simpler unsupervised learning, which can give la- sequences of building blocks. Furthermore bels to examples, while labels are given in this uniqueness has made it hard to de- supervised learning. In frequent pattern termine the tree structures of glycans ex- mining, the input is unlabeled examples perimentally by which the size of glycan (so in some sense this is unsupervised structure databases has been kept small learning, but in this review we do not cat- and its speed to develop has been forced egorize frequent pattern mining into unsu- to be slow. However, due to the develop- pervised learning). The purpose is to find ment of carbohydrate research, the current patterns which appear a larger number of number of structurally different glycans in times than a prefixed number, by which we a major database on glycans reaches more can see what patterns appear frequently. than ten thousands, which will be further increased in the near future by using high- 2.1 Classification over Trees throughput techniques.[4] Thus in glycoin- Currently the most popular approach *Correspondence: Prof. H. Mamitsuka formatics, developing efficient approaches in supervised learning is so-called sup- Kyoto University Institute for Chemical Research for mining rules or patterns from trees and port vector machines (SVMs), in which Bioinformatics Center applying the approaches to a glycan data- we use a kernel function that represents a Gokasho, Uji 611-0011, Japan base would be reasonable and promising similarity between two examples.[9] The Tel.: +81 774 38 3023 Fax: +81 774 38 3037 to find biological significance embedded objective here is to find a classifier which E-mail: [email protected] in glycan structures.[5,6] can separate examples in one class from Glycochemistry today CHIMIA 2011, 65, No. 1/2 11 those in the other class (if class labels (a): Sample rooted directed ordered tree (b):Probablistic dependencies of HTMM are binary). The simplest way is to use a linear function, corresponding to us- ing an inner product as a kernel function. However, it would be hard to separate examples depending upon class labels by using a simple linear function. This means that it is important to design a good ker- nel function by which we can separate ex- amples easily. For the case that examples are trees, a kernel function for computing a similarity between two input trees has been already proposed.[10] The idea be- hind the ‘tree’ kernel is to check subtrees, which appear in the two input trees com- monly, and if they are bigger and/or the number of common subtrees is larger, the two input trees should be more similar. This can be computed in a very efficient manner by using dynamic programming, and a tree kernel specialized for glycans (c): Probablistic dependencies of OTMM (d):Probablistic dependencies of PSTMM was also proposed.[11] 2.2 Clustering: Probabilistic Models for Trees There are a variety of approaches in unsupervised learning. In this review, we focus on probabilistic models, which are models with probability parameters, to be estimated from given data. Rather than clustering, probabilistic models allow bi- nary classification if binary classes are like the examples in question and others. This case, after training a probabilistic model from examples in question, we can com- pute the likelihood for any given example, showing how likely the given example can be in the class in question. A standard Fig. 1. (a) A sample rooted directed ordered tree and probabilistic dependencies of (b) HTMM, probabilistic model for time-series data (c) OTMM and (d) PSTMM or sequences is the hidden Markov model (HMM), which has been used in a lot of Figure 1: (a) A sample rooted directed ordered tree and probabilistic dependencies of (b)HTMM, applications, including speech recogni- (c) OTMM and (d)PSTMM. tion,[12] natural language processing[13] and transition diagram). Training (or learning) tree, where the top square means the root, analyzing biological sequences, e.g. amino probability parameters of HMM is usu- solid lines show directed edges (chemi- acid sequences.[14] A HMM can be defined ally based on maximum likelihood, which cal linkages) from parents to children and by a state transition diagram, in which means that probability parameters are es- dotted lines show ordered children. Fig. states are connected by edges. A HMM timated so that the likelihoods of given 1 (b) shows a HTMM over the tree in (or its state transition diagram) has two sequences are maximized. There are a va- (a), where you can see that thick lines of types of probability parameters, one being riety of extensions of HMMs, major ones state transitions are always directed from letter generation probabilities attached to being context free grammars[15] and tree parents to children. Fig. 1 (c) shows an states to generate letters, i.e. amino acid grammars[16] to deal with complex depen- OTMM over the tree in (a), where you can types, and the other being state transition dencies in sequences. Another direction is see that thick lines of state transition are probabilities attached to edges. A standard for trees, the first trial being the hidden tree from parents to children if the children are assumption on the Markov process is the Markov model (HTMM)[17] in which left- the oldest; otherwise thick lines are from first-order Markov property, which is very to-right on a sequence is simply applied to older siblings to younger siblings. That simple for sequences and means that the parent-to-child over a tree. This model can is, OTMM considers ordered children as current state depends upon only one state be applied to glycans. However, glycans well as parent-to-child dependencies via away. Thus given a sequence, we can just are rooted directed ordered trees, while parents and the oldest siblings. Fig. 1 (d) repeat the following two steps, from left ordered children are not considered in HT- shows a PSTMM over the tree in (a), where to right on the sequence: we first generate MM at all. Thus the ordered tree Markov you can see that thick lines of state transi- the corresponding letter, i.e.
Recommended publications
  • Bioinformatics Study of Lectins: New Classification and Prediction In
    Bioinformatics study of lectins : new classification and prediction in genomes François Bonnardel To cite this version: François Bonnardel. Bioinformatics study of lectins : new classification and prediction in genomes. Structural Biology [q-bio.BM]. Université Grenoble Alpes [2020-..]; Université de Genève, 2021. En- glish. NNT : 2021GRALV010. tel-03331649 HAL Id: tel-03331649 https://tel.archives-ouvertes.fr/tel-03331649 Submitted on 2 Sep 2021 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. THÈSE Pour obtenir le grade de DOCTEUR DE L’UNIVERSITE GRENOBLE ALPES préparée dans le cadre d’une cotutelle entre la Communauté Université Grenoble Alpes et l’Université de Genève Spécialités: Chimie Biologie Arrêté ministériel : le 6 janvier 2005 – 25 mai 2016 Présentée par François Bonnardel Thèse dirigée par la Dr. Anne Imberty codirigée par la Dr/Prof. Frédérique Lisacek préparée au sein du laboratoire CERMAV, CNRS et du Computer Science Department, UNIGE et de l’équipe PIG, SIB Dans les Écoles Doctorales EDCSV et UNIGE Etude bioinformatique des lectines: nouvelle classification et prédiction dans les génomes Thèse soutenue publiquement le 8 Février 2021, devant le jury composé de : Dr. Alexandre de Brevern UMR S1134, Inserm, Université Paris Diderot, Paris, France, Rapporteur Dr.
    [Show full text]
  • Toolboxes for a Standardised and Systematic Study of Glycans
    Campbell et al. BMC Bioinformatics 2014, 15(Suppl 1):S9 http://www.biomedcentral.com/1471-2105/15/S1/S9 RESEARCH Open Access Toolboxes for a standardised and systematic study of glycans Matthew P Campbell1, René Ranzinger2, Thomas Lütteke3, Julien Mariethoz4, Catherine A Hayes5, Jingyu Zhang1, Yukie Akune6, Kiyoko F Aoki-Kinoshita6, David Damerell7,11, Giorgio Carta8, Will S York2, Stuart M Haslam7, Hisashi Narimatsu9, Pauline M Rudd8, Niclas G Karlsson4, Nicolle H Packer1, Frédérique Lisacek4,10* From Integrated Bio-Search: 12th International Workshop on Network Tools and Applications in Biology (NETTAB 2012) Como, Italy. 14-16 November 2012 Abstract Background: Recent progress in method development for characterising the branched structures of complex carbohydrates has now enabled higher throughput technology. Automation of structure analysis then calls for software development since adding meaning to large data collections in reasonable time requires corresponding bioinformatics methods and tools. Current glycobioinformatics resources do cover information on the structure and function of glycans, their interaction with proteins or their enzymatic synthesis. However, this information is partial, scattered and often difficult to find to for non-glycobiologists. Methods: Following our diagnosis of the causes of the slow development of glycobioinformatics, we review the “objective” difficulties encountered in defining adequate formats for representing complex entities and developing efficient analysis software. Results: Various solutions already implemented and strategies defined to bridge glycobiology with different fields and integrate the heterogeneous glyco-related information are presented. Conclusions: Despite the initial stage of our integrative efforts, this paper highlights the rapid expansion of glycomics, the validity of existing resources and the bright future of glycobioinformatics.
    [Show full text]
  • PROGRAM and ABSTRACTS for 2020 ANNUAL MEETING of the SOCIETY for GLYCOBIOLOGY November 9–12, 2020 Phoenix, AZ, USA 1017 2020 Sfg Virtual Meeting Preliminary Schedule
    Downloaded from https://academic.oup.com/glycob/article/30/12/1016/5948902 by guest on 25 January 2021 PROGRAM AND ABSTRACTS FOR 2020 ANNUAL MEETING OF THE SOCIETY FOR GLYCOBIOLOGY November 9–12, 2020 Phoenix, AZ, USA 1017 2020 SfG Virtual Meeting Preliminary Schedule Mon. Nov 9 (Day 1) TOKYO ROME PACIFIC EASTERN EASTERN SESSION TIME TIME TIME START END TIME TIME 23:30 15:30 6:30 9:30 9:50 Welcome and Introduction - Michael Tiemeyer, CCRC UGA Downloaded from https://academic.oup.com/glycob/article/30/12/1016/5948902 by guest on 25 January 2021 23:30 15:30 6:30 9:50 – 12:36 Session 1: Glycobiology of Normal and Disordered Development | Chair: Kelly Ten-Hagen, NIH/NIDCR 23:50 15:50 6:50 9:50 10:10 KEYNOTE: “POGLUT1 mutations cause myopathy with reduced Notch signaling and α-dystroglycan hypoglycosylation” - Carmen Paradas Lopez, Biomedical Institute Sevilla 0:12 16:12 7:12 10:12 10:24 Poster Talk: “Regulation of Notch signaling by O-glycans in the intestine” – Mohd Nauman, Albert Einstein 0:26 16:26 7:26 10:26 10:38 Poster Talk: “Generation of an unbiased interactome for the tetratricopeptide repeat domain of the O-GlcNAc transferase indicates a role for the enzyme in intellectual disability” – Hannah Stephen, University of Georgia 0:40 16:30 7:30 10:40 10:50 Q&A 10:52 11:12 7:52 10:52 11:12 KEYNOTE: “Aberrations in N-cadherin Processing Drive PMM2-CDG Pathogenesis” - Heather Flanagan-Steet, Greenwood Genetics Center 1:14 11:26 8:14 11:14 11:26 Poster Talk: “Functional analyses of TMTC-type protein O-mannosyltransferases in Drosophila model
    [Show full text]
  • Introduction to Bioinformatics (Elective) – SBB1609
    SCHOOL OF BIO AND CHEMICAL ENGINEERING DEPARTMENT OF BIOTECHNOLOGY Unit 1 – Introduction to Bioinformatics (Elective) – SBB1609 1 I HISTORY OF BIOINFORMATICS Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biologicaldata. As an interdisciplinary field of science, bioinformatics combines computer science, statistics, mathematics, and engineering to analyze and interpret biological data. Bioinformatics has been used for in silico analyses of biological queries using mathematical and statistical techniques. Bioinformatics derives knowledge from computer analysis of biological data. These can consist of the information stored in the genetic code, but also experimental results from various sources, patient statistics, and scientific literature. Research in bioinformatics includes method development for storage, retrieval, and analysis of the data. Bioinformatics is a rapidly developing branch of biology and is highly interdisciplinary, using techniques and concepts from informatics, statistics, mathematics, chemistry, biochemistry, physics, and linguistics. It has many practical applications in different areas of biology and medicine. Bioinformatics: Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data. Computational Biology: The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems. "Classical" bioinformatics: "The mathematical, statistical and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information.” The National Center for Biotechnology Information (NCBI 2001) defines bioinformatics as: "Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline.
    [Show full text]
  • Representing Glycophenotypes: Semantic Unification of Glycobiology Resources for Disease Discovery
    Representing glycophenotypes: semantic unification of glycobiology resources for disease discovery Authors: Jean-Philippe F. Gourdine1,2,3*, Matthew H. Brush1,3, Nicole A. Vasilevsky1,3, Kent Shefchek3,4, Sebastian Köhler3,5, Nicolas Matentzoglu3,6, Monica C. Munoz-Torres3,4, Julie A. McMurry3,4, Xingmin Aaron Zhang3,7, Melissa A. Haendel1,3,4 Affiliations: 1 Oregon Clinical & Translational Research Institute, Oregon Health & Science University, Portland, OR 97239, USA. 2 Oregon Health & Science University Library, Portland, OR 97239, USA. 3 Monarch Initiative, monarchinitiative.org. 4 Linus Pauling institute, Oregon State University, Corvallis, OR, USA. 5 Charité Centrum für Therapieforschung, Charité-Universitätsmedizin Berlin Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin 10117, Germany. 6 European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, UK. 7 The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA. Correspondence: [email protected] Abstract While abnormalities related to carbohydrates (glycans) are frequent for patients with rare and undiagnosed diseases as well as in many common diseases, these glycan-related phenotypes (glycophenotypes) are not well represented in knowledge bases (KBs). If glycan related diseases were more robustly represented and curated with glycophenotypes, these could be used for molecular phenotyping to help realize the goals of precision medicine. Diagnosis of rare diseases by computational cross-species comparison of genotype- phenotype data has been facilitated by leveraging ontological representations of clinical phenotypes, using Human Phenotype Ontology (HPO), and model organism ontologies such as Mammalian Phenotype Ontology (MP) in the context of the Monarch Initiative. In this article, we discuss the importance and complexity of glycobiology, and review the structure of glycan-related content from existing KBs and biological ontologies.
    [Show full text]
  • Science& Technology Trends
    ����������������������������������������������� � ������������ ��������������������������� �������� � Life Sciences ������������� � �� ����� ������ Necessity of Research and Development � ���������������������������������������� ������� of the Structural Genome Analysis Technology � ���� � ���������������������� Necessity to Promote Glycoinformatics � ������� � ���������������������������� Information and Communication Technologies � Towards Human-Centered Ubiquitous Computing � ������ ������ – Using a Paradigm Change as a Chance to Strengthen � International Technological Competitiveness – ������������������������ � Research and Development Trends in Robot � �������������� Technology – Toward Promoting the Commercialization of Personal Robots – � �������� ������� Nanotechnology and Materials � ��������������������������� ����� Current Conditions and Issues in International Strategy from the Perspective of the International ������ Standardization of Materials Energy The Necessity, Actual Situation and Challenges ����������� of Human Resources Training in the Nuclear Field Manufacturing Technology �������� Trend in the Introduction and Development of Robotic Surgical Systems � ���������������������������� ��������������������������������������������������������� ��������������������������������� ��������� ����������� ������������������������������������� ������������������������������������������������������������ ��������������������������������������������������������������������� QUARTERLY REVIEW No.10 / January 2004 F o r e w o
    [Show full text]
  • Meeting Booklet
    LS2 Annual Meeting 2019 “Cell Biology from Tissue to Nucleus” Meeting Booklet 14-15 February, University of Zurich, Campus Irchel LS 2 - Life Sciences Switzerland Design by Dagmar Bocakova - [email protected] 2019 WELCOME ADDRESS Dear colleagues and friends It is with great pleasure that we invite you to the LS² Annual Meeting 2019, held on the 14th and 15th of February, 2019 at the Irchel Campus of the University of Zurich. This is a special year, as it is the 50th anniversary of USGEB/LS2, which we will celebrate with a Jubilee Apéro on Thursday 14th. The LS² Annual Meeting brings together scientists from all nations and backgrounds to discuss a variety of Life Science subjects. The meeting this year focusses on the cell in health and disease, with plenary talks on different aspects of cell biology and a symposium on live cell imaging approaches. You will be able to hear the latest, most exciting findings in several fields, from Molecular and Cellular Biosciences, Proteomics, Chemical Biology, Physiology, Pharmacology and more, presented by around 30 invited speakers and 45 speakers selected from abstracts in one of the seven scientific symposia and five plenary lectures. Two novelties highlight the meeting this year: The "PIs of Tomorrow" session, in which selected postdocs will present their research to a jury of professors, will be for the first time a plenary session. In addition, poster presenters will be selected to give flash talks in the symposia with the spirit to further promote young scientists. Join us for the poster session with more than 130 posters, combined with a large industry exhibition and the Jubilee Apéro.
    [Show full text]
  • Semantics of Data for Life Sciences and Reproducible Research[Version 1
    F1000Research 2020, 9:136 Last updated: 16 AUG 2021 OPINION ARTICLE BioHackathon 2015: Semantics of data for life sciences and reproducible research [version 1; peer review: 2 approved] Rutger A. Vos 1,2, Toshiaki Katayama 3, Hiroyuki Mishima4, Shin Kawano 3, Shuichi Kawashima3, Jin-Dong Kim3, Yuki Moriya3, Toshiaki Tokimatsu5, Atsuko Yamaguchi 3, Yasunori Yamamoto3, Hongyan Wu6, Peter Amstutz7, Erick Antezana 8, Nobuyuki P. Aoki9, Kazuharu Arakawa10, Jerven T. Bolleman 11, Evan Bolton12, Raoul J. P. Bonnal13, Hidemasa Bono 3, Kees Burger14, Hirokazu Chiba15, Kevin B. Cohen16,17, Eric W. Deutsch18, Jesualdo T. Fernández-Breis19, Gang Fu12, Takatomo Fujisawa20, Atsushi Fukushima 21, Alexander García22, Naohisa Goto23, Tudor Groza24,25, Colin Hercus26, Robert Hoehndorf27, Kotone Itaya10, Nick Juty28, Takeshi Kawashima20, Jee-Hyub Kim28, Akira R. Kinjo29, Masaaki Kotera30, Kouji Kozaki 31, Sadahiro Kumagai32, Tatsuya Kushida 33, Thomas Lütteke 34,35, Masaaki Matsubara 36, Joe Miyamoto37, Attayeb Mohsen 38, Hiroshi Mori39, Yuki Naito3, Takeru Nakazato3, Jeremy Nguyen-Xuan40, Kozo Nishida41, Naoki Nishida42, Hiroyo Nishide15, Soichi Ogishima43, Tazro Ohta3, Shujiro Okuda44, Benedict Paten45, Jean-Luc Perret46, Philip Prathipati38, Pjotr Prins47,48, Núria Queralt-Rosinach 49, Daisuke Shinmachi9, Shinya Suzuki 30, Tsuyosi Tabata50, Terue Takatsuki51, Kieron Taylor 28, Mark Thompson52, Ikuo Uchiyama 15, Bruno Vieira53, Chih-Hsuan Wei12, Mark Wilkinson 54, Issaku Yamada 36, Ryota Yamanaka55, Kazutoshi Yoshitake56, Akiyasu C. Yoshizawa50, Michel
    [Show full text]
  • Proposal of the Annotation of Phosphorylated Amino Acids and Peptides Using Biological and Chemical Codes
    molecules Article Proposal of the Annotation of Phosphorylated Amino Acids and Peptides Using Biological and Chemical Codes Piotr Minkiewicz * , Małgorzata Darewicz , Anna Iwaniak and Marta Turło Department of Food Biochemistry, University of Warmia and Mazury in Olsztyn, Plac Cieszy´nski1, 10-726 Olsztyn-Kortowo, Poland; [email protected] (M.D.); [email protected] (A.I.); [email protected] (M.T.) * Correspondence: [email protected]; Tel.: +48-89-523-3715 Abstract: Phosphorylation represents one of the most important modifications of amino acids, peptides, and proteins. By modifying the latter, it is useful in improving the functional properties of foods. Although all these substances are broadly annotated in internet databases, there is no unified code for their annotation. The present publication aims to describe a simple code for the annotation of phosphopeptide sequences. The proposed code describes the location of phosphate residues in amino acid side chains (including new rules of atom numbering in amino acids) and the diversity of phosphate residues (e.g., di- and triphosphate residues and phosphate amidation). This article also includes translating the proposed biological code into SMILES, being the most commonly used chemical code. Finally, it discusses possible errors associated with applying the proposed code and in the resulting SMILES representations of phosphopeptides. The proposed code can be extended to describe other modifications in the future. Keywords: amino acids; peptides; phosphorylation; phosphate groups; databases; code; bioinformatics; cheminformatics; SMILES Citation: Minkiewicz, P.; Darewicz, M.; Iwaniak, A.; Turło, M. Proposal of the Annotation of Phosphorylated Amino Acids and Peptides Using 1. Introduction Biological and Chemical Codes.
    [Show full text]
  • Annual Scientific Report 2011 Annual Scientific Report 2011 Designed and Produced by Pickeringhutchins Ltd
    European Bioinformatics Institute EMBL-EBI Annual Scientific Report 2011 Annual Scientific Report 2011 Designed and Produced by PickeringHutchins Ltd www.pickeringhutchins.com EMBL member states: Austria, Croatia, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Israel, Italy, Luxembourg, the Netherlands, Norway, Portugal, Spain, Sweden, Switzerland, United Kingdom. Associate member state: Australia EMBL-EBI is a part of the European Molecular Biology Laboratory (EMBL) EMBL-EBI EMBL-EBI EMBL-EBI EMBL-European Bioinformatics Institute Wellcome Trust Genome Campus, Hinxton Cambridge CB10 1SD United Kingdom Tel. +44 (0)1223 494 444, Fax +44 (0)1223 494 468 www.ebi.ac.uk EMBL Heidelberg Meyerhofstraße 1 69117 Heidelberg Germany Tel. +49 (0)6221 3870, Fax +49 (0)6221 387 8306 www.embl.org [email protected] EMBL Grenoble 6, rue Jules Horowitz, BP181 38042 Grenoble, Cedex 9 France Tel. +33 (0)476 20 7269, Fax +33 (0)476 20 2199 EMBL Hamburg c/o DESY Notkestraße 85 22603 Hamburg Germany Tel. +49 (0)4089 902 110, Fax +49 (0)4089 902 149 EMBL Monterotondo Adriano Buzzati-Traverso Campus Via Ramarini, 32 00015 Monterotondo (Rome) Italy Tel. +39 (0)6900 91402, Fax +39 (0)6900 91406 © 2012 EMBL-European Bioinformatics Institute All texts written by EBI-EMBL Group and Team Leaders. This publication was produced by the EBI’s Outreach and Training Programme. Contents Introduction Foreword 2 Major Achievements 2011 4 Services Rolf Apweiler and Ewan Birney: Protein and nucleotide data 10 Guy Cochrane: The European Nucleotide Archive 14 Paul Flicek:
    [Show full text]
  • MSB 2017 Program Book
    MSB2017 33rd International Symposium on Microscale Separations and Bioanalysis Noordwijkerhout The Netherlands March 26-29 Programme Book SYMPOSIUM CO-CHAIRS Govert W. Somsen (Vrije Universiteit Amsterdam) Rawi Ramautar (Leiden University) www.msb2017.org Content Program Book MSB 2017 3 Sponsors 4 Welcome from the co-chairs of MSB 2017 5 Committees 6 Symposium history 7 Previous HPCE and MSB meetings 8 Arnold O. Beckman Medal and Award 9 Venue 10 General information 11 Short Courses 12 Scientific Programme 20 Science Cafe 22 Panel Discussion 23 Plenary speakers Abstracts & Biographies 28 Keynote speakers Abstracts & Biographies 42 Posters 47 Poster pitches 48 MSB 2017 Young Scientist Award 48 MSB 2017 Poster Awards 49 Social Program 2 Sponsors We wish to thank our sponsors for their generous support. Platinum sponsors Gold sponsors Silver sponsors Bronze sponsors Sponsors 3 Content Welcome from the co-chairs of MSB 2017 It is our great pleasure to welcome you to the 33rd International Symposium on Microscale Separations and Bioanalysis (MSB 2017) being held March 26-29, 2017 at the Conference Center Leeuwenhorst in Noordwijkerhout, The Netherlands. Originally started in 1989 as HPCE symposium, over the years MSB has evolved into an annual interactive forum for the discussion of research on the frontiers of microscale separation science and bioanalysis. Continuing this development, MSB 2017 will span the full range of microscale separation research, from fundamental technology development to high-impact applications relevant to health, medicine, food, forensics and the environment. The meeting is structured with short courses, plenary and parallel oral sessions, poster sessions and pitches, Science Cafe semi- nars, and a panel discussion.
    [Show full text]
  • Three-Dimensional Structures of Carbohydrates and Where to Find Them
    International Journal of Molecular Sciences Review Three-Dimensional Structures of Carbohydrates and Where to Find Them Sofya I. Scherbinina 1,2,* and Philip V. Toukach 1,* 1 N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Science, Leninsky prospect 47, 119991 Moscow, Russia 2 Higher Chemical College, D. Mendeleev University of Chemical Technology of Russia, Miusskaya Square 9, 125047 Moscow, Russia * Correspondence: [email protected] (S.I.S.); [email protected] (P.V.T.) Received: 26 September 2020; Accepted: 16 October 2020; Published: 18 October 2020 Abstract: Analysis and systematization of accumulated data on carbohydrate structural diversity is a subject of great interest for structural glycobiology. Despite being a challenging task, development of computational methods for efficient treatment and management of spatial (3D) structural features of carbohydrates breaks new ground in modern glycoscience. This review is dedicated to approaches of chemo- and glyco-informatics towards 3D structural data generation, deposition and processing in regard to carbohydrates and their derivatives. Databases, molecular modeling and experimental data validation services, and structure visualization facilities developed for last five years are reviewed. Keywords: carbohydrate; spatial structure; model build; database; web-tool; glycoinformatics; structure validation; PDB glycans; structure visualization; molecular modeling 1. Introduction Knowledge of carbohydrate spatial (3D) structure is crucial for investigation of glycoconjugate biological activity [1,2], vaccine development [3,4], estimation of ligand-receptor interaction energy [5–7] studies of conformational mobility of macromolecules [8], drug design [9], studies of cell wall construction aspects [10], glycosylation processes [11], and many other aspects of carbohydrate chemistry and biology. Therefore, providing information support for carbohydrate 3D structure is vital for the development of modern glycomics and glycoproteomics.
    [Show full text]