© American Society for Mass Spectrometry, 2019
J. Am. Soc. Mass Spectrom. (2019) 30:2608-2616
DOI: 10.1007/s13361-019-02314-3

RESEARCH ARTICLE

Kendrick Mass Defect Approach Combined to NORINE Database for Molecular Formula Assignment of Nonribosomal Peptides

Mickaël Chevalier,1 Emma Ricart,2 Emeline Hanozin,3 Maude Pupin,4,5 Philippe Jacques,6 Nicolas Smargiasso,3 Edwin De Pauw,3 Frédérique Lisacek,2 Valérie Leclère,1 Christophe Flahaut1

1Univ. Lille, INRA, ISA, Univ. Artois, Univ. Littoral Côte d’Opale, EA 7394-Institut Charles Viollette (ICV), F-59000, Lille, France
2Proteome informatics Group, SIB Swiss Institute of Bioinformatics (SIB), and Computer Science Department, University of Geneva, Geneva, Switzerland
3Mass Spectrometry Laboratory, Molecular Systems - MolSys Research Unit, University of Liège, Liège, Belgium
4Univ. Lille, CNRS, Centrale Lille, UMR 9189 - CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, F-59000, Lille, France
5Inria-Lille Nord Europe, Bonsai team, F-59655, Villeneuve d’Ascq Cedex, France
6TERRA Research Centre, Microbial Processes and Interactions (MiPI), Gembloux Agro-Bio Tech University of Liège, B-5030, Gembloux, Belgium

Abstract. The identification of known (dereplication) or unknown nonribosomal peptides (NRPs) produced by microorganisms is a time consuming, expensive, and challenging task where mass spectrometry and nuclear magnetic resonance play a key role. The first step of the identification process always involves the establishment of a molecular formula. Unfortunately, the number of potential molecular formulae increases significantly with higher molecular masses and the lower precision of their measurements. In the present article, we demonstrate that molecular formula assignment can be achieved by a combined approach using the regular Kendrick mass defect (RKMD) and NORINE, the reference curated database of NRPs.
We observed that, irrespective of the molecular formula, the addition or subtraction of a given atom or atom group always leads to the same RKMD variation and nominal Kendrick mass (NKM). Graphically, these variations, translated into a vector mesh, can be used to connect an unknown molecule to a known NRP of the NORINE database and establish its molecular formula. We explain and illustrate this concept through the high-resolution mass spectrometry analysis of a commercially available mixture composed of four surfactins. The Kendrick approach enriched with the NORINE database content is a fast, useful, and easy-to-use tool for molecular mass assignment of known and unknown NRP structures.

Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s13361-019-02314-3) contains supplementary material, which is available to authorized users.

Correspondence to: Christophe Flahaut; e-mail: christophe.flahaut@univ-artois.fr

M. Chevalier et al.: KMD for Molecular Formula Assignment of NRPs 2609

Keywords: Kendrick map, Mass defect, Molecular formula, Nonribosomal peptides, NORINE

Abbreviations: ESI, Electrospray ionization; FT, Fourier transform; ICR, Ion cyclotron resonance; KMD, Kendrick mass defect; NRPs, Nonribosomal peptides; NRPS, Nonribosomal peptide synthetases; NKM, Nominal Kendrick mass; MALDI, Matrix-assisted laser desorption/ionization

Received: 12 January 2019 / Revised: 3 July 2019 / Accepted: 10 August 2019 / Published Online: 28 October 2019

Introduction

Nonribosomal peptides (NRPs) are secondary metabolites usually produced by microorganisms. They represent very large families of natural products with a peptidic moiety. Belonging to the class of peptide secondary metabolites, NRPs are organic molecules that are not directly involved in the growth of an organism. Their absence is not lethal but may impact the survival, appearance, or growth of the microorganism in a given ecological niche. NRP production provides an advantage to those microorganisms that synthesize these molecules by boosting competitiveness. In contrast with ribosomal peptides, the molecular structure of NRPs cannot be directly deduced from the genome because their biosynthesis does not result from the translation of mRNA. In fact, NRP synthesis is performed via large enzymatic complexes called nonribosomal peptide synthetases (NRPS) and produces linear, semi-cyclic, cyclic, or branched polymeric structures of masses ranging from 200 to 3000 Da.

Most NRPs are metabolites including both a peptide core and a nonpeptidic moiety. They can be modified during or after synthesis (N-formylated, N-methylated, acetylated, glycosylated, reduced, oxidized), which increases their structural biodiversity. Currently, more than 500 monomers, among them proteogenic and non-proteogenic amino acids but also aliphatic chains, chromophores, and many others, are known and referenced in the NORINE database, which gathers more than 1187 curated NRP structures [1, 2]. NRPs display an extremely broad range of biological activities and pharmacological properties, including antibacterial, anti-inflammatory, surfactant, and iron-chelating (siderophore) activities. Hence, the interest in identifying new NRPs and developing effective screening tools is high, considering potential applications in many fields such as health, cosmetics, agrofood, or biocontrol.

In the “omics” cascade [3] (genomics, transcriptomics, proteomics, lipidomics, glycomics…), metabolomics and metabonomics [4] designate the comprehensive, dynamic, qualitative, and quantitative study of all the small molecules (up to about 1500 Da) in biological samples [5, 6]. Therein, metabolomics encompasses the study of secondary metabolites such as NRPs. However, the mass range and structure of NRPs do not fully qualify for processing with any of the metabolomics analytical workflows. This specificity warrants the definition of “NRPomics” as the systematic study of NRPs that entails the comprehensive, dynamic, qualitative, and quantitative characterization of NRPs present in environmental or biological samples.

Concomitantly with the massive and non-controversial use of ultraviolet-visible (UV) and infrared (IR) spectrophotometry, mass spectrometry (MS) (all coupled to separative techniques) is also a frequently used analytical method for secondary metabolite characterization [7, 8]. In this regard, high-resolution mass spectrometry (HRMS) technologies such as very high field Fourier transform ion cyclotron resonance (FT-ICR) allow sub-ppm measurements for the computer-assisted deduction of molecular formulae [9]. This can be achieved computationally with software that usually relies on detecting the isotopic pattern, the protonated or alkali metal adducts, and the charge state of the molecule. However, some of the most popular mass spectrometers, based on Orbitrap and hybrid Q-TOF technologies, do not have the necessary resolving power and mass accuracy to assign a molecular formula to ions of a given m/z [10]. Nonetheless, the molecular formula is a first piece of information contributing to the identification of a compound. Overall, when the mass exceeds 500 Da, several candidate molecular formulae co-exist, and the greater the measured mass, the greater the number of possibilities. As a result, a range of strategies and algorithms has been developed. Kind and Fiehn were among the first to propose a set of tools for the calculation of elementary composition called “the seven golden rules” [11]. This toolset relies on a combination, defined by Senior and Lewis, of rules of elementary ratios for the CHONPS elements (respecting the valence of atoms) and rules of isotopic abundance. This software is coded in Visual Basic, usable from Excel, and is freely available. In metabolomics [12, 13] and more generally in chemistry [14, 15], one of the strategies for obtaining a molecular formula consists in using the isotopic profile. Most MS software (Sirius [16] and Brain [17]) can simulate MS signals with respect to both molecular formula and a characteristic terrestrial isotope composition [18] while taking into account the resolving power of the generating device (Chemcalc) [19]. Such an approach significantly reduces the number of candidates and eliminates over 90% of incorrect molecular formulae for masses greater than 1000 Da [16].

Long before this software era, Edward Kendrick had proposed an elegant mathematical method based on the determination of a mass defect (now commonly named Kendrick mass defect (KMD)) to facilitate the discrimination between homologous compounds having different numbers of the same base units. Briefly, the mass defect of a single element or chemical compound is calculated as the difference between the exact mass of the corresponding isotope and its nominal mass, which is simply the total number of protons and neutrons in a given formula or elemental isotope [20]. By convention, carbon-12 (12C) has been defined [18] as the element with zero mass defect, and therefore its atomic mass is 12 Da, while hydrogen (1H) has an atomic mass of 1.00783 for a nominal mass of 1, and hence a mass defect of 0.00783.

The nominal Kendrick mass (NKM) uses an atom group as a building block (or base unit) while applying the principle […]

[…] FT-ICR-MS. These experimental masses were processed and plotted on the Kendrick-based NORINE map to identify their molecular formulae. As the results matched expected compositions, the approach holds promise for identifying new high […]