UC San Diego UC San Diego Electronic Theses and Dissertations
Total Page:16
File Type:pdf, Size:1020Kb
UC San Diego UC San Diego Electronic Theses and Dissertations Title Bioinformatics Methods for Natural Product Discovery / Permalink https://escholarship.org/uc/item/1cp8f9f1 Author Mohimani, Hosein Publication Date 2013 Peer reviewed|Thesis/dissertation eScholarship.org Powered by the California Digital Library University of California UNIVERSITY OF CALIFORNIA, SAN DIEGO Bioinformatics Methods for Natural Product Discovery A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Electrical Engineering (Communications Theory and Systems) by Hosein Mohimani Committee in charge: Professor Pavel A. Pevzner, Chair Professor Alexander Vardy, Co-Chair Professor Vineet Bafna Professor Nuno Bandeira Professor Pieter C. Dorrestein Professor William Hodgkiss Professor Paul Siegel 2013 Copyright Hosein Mohimani, 2013 All rights reserved. The dissertation of Hosein Mohimani is approved, and it is acceptable in quality and form for publication on microfilm and electronically: Co-Chair Chair University of California, San Diego 2013 iii DEDICATION To my parents. iv TABLE OF CONTENTS SignaturePage ................................. iii Dedication.................................... iv TableofContents................................ v ListofFigures ................................. vii ListofTables .................................. viii Acknowledgements ............................... ix Vita....................................... xi AbstractoftheDissertation . xii Chapter1 Introduction........................... 1 Chapter 2 Multiplex De Novo Sequencing of Peptide Antibiotics .... 4 2.1 Introduction........................ 4 2.2 Results........................... 8 2.3 Methods .......................... 16 2.4 Acknowledgement..................... 17 Chapter 3 Sequencing Cyclic Peptides by Multistage Mass Spectrometry 23 3.1 Introduction........................ 23 3.2 Materialsandmethods . 25 3.3 Results........................... 29 3.4 Discussion ......................... 31 3.5 Acknowledgemet ..................... 33 Chapter4 Cycloquest............................ 34 4.1 Introduction........................ 34 4.2 MaterialsandMethods . 37 4.3 Results........................... 43 4.4 Discussion ......................... 49 4.5 Acknowledgement..................... 50 Chapter5 MS-DPR............................. 51 5.1 Introduction........................ 51 5.2 MaterialsandMethod . 55 5.3 Results........................... 59 v 5.4 Discussion ......................... 64 5.5 Acknowledgement..................... 65 Bibliography .................................. 70 vi LIST OF FIGURES Figure 1.1: Natural product discovery pipeline . .... 3 Figure 2.1: Structures of Tyrocidine A, Cyclomarin A, and Reginamide A. 5 Figure2.2: SpectralprofileoftyrocidineA . ... 18 Figure2.3: Multitagalgorithm . 20 Figure2.4: Spectralnetowrks . 22 Figure 3.1: Illustration of experimental ion tree. ....... 27 Figure3.2: Branchandboundapproach. 28 Figure3.3: Distributionofscores . 31 Figure 4.1: Ribosomal cyclopeptides appear in all domains oflife..... 35 Figure 4.2: Head-to-tail cyclization versus concatenated cyclization. 37 Figure4.3: StepsofCycloquest . 47 Figure 4.4: Target decoy analysis for cycloquest . ..... 49 Figure5.1: Thepipeline ........................... 54 Figure5.2: MarkovChain .......................... 57 Figure5.3: Illustrationofsisterpeptides . ..... 58 Figure5.4: Thealgorithm .......................... 66 Figure 5.5: Illustration of theoric spectrum . ..... 67 Figure 5.6: Evloution of µ and p ....................... 67 Figure5.7: ComparisonofMS-DPR . 68 Figure 5.8: Estimating the score distribution for PSMs . ...... 69 vii LIST OF TABLES Table 1.1: .418pt14.5Comparing different natural product discovery approaches. ........... 3 Table 2.1: Multiplex de novo sequencing of Tyrocidines . ...... 19 Table 2.2: Sequence reconstruction by spectral networks . ........ 21 Table 2.3: Dereplication of Reginamide variants . ..... 21 Table3.1: Multistagesequencingresults. .... 30 Table3.2: Previousreconstructions . .. 32 Table 3.3: Comparison of Single Stage and MultiStage . ..... 32 Table 4.1: Top score reconstructions for SFT-L1 . .... 45 Table 4.2: Top score reconstructions for SFTI-1 . .... 45 Table 4.3: Top score reconstructions for SFTI-1(K,S) . ...... 46 Table4.4: TopscorereconstructionsofSKF . .. 46 Table4.5: TopscorereconstructionsofSDP . .. 46 Table 4.6: Top score reconstructions of defensin . ..... 48 Table 5.1: Comparison of theoretical p-value . .... 61 Table5.2: Topscorereconstructions . .. 63 viii ACKNOWLEDGEMENTS Without te help of many others, it would have been impossible for me to get to where I am now. I would first like to thank my advisor, Dr Pavel Pevzner, for teaching me how to be a scientist. What I learned from him is not limited to science, but it covers writing skills, how to express ideas to scientific community, and many more. I’d also like to thank my co-advisor, Dr. Pieter Dorrestein, for motivating me with so many fascinating ideas during my graduate school years. I would also like to thank Dr Vineet Bafna, Dr Nuno Bandeira, and all my labmates during the last five years, in particular Boyko Kakaradov, Kyowon jeong, Mingxun Wang and Sangtae Kim . I also thank my ECE advisor Dr Alexander Vardy, and my committee members Dr Paul Siegel and Dr William Hodgkiss. Finally, I would not even be in this position without the sacrifices made by my family over the years. I would like to thank my parents, my sister, and my brother. Chapter 2 is in part adapted from Hosein Mohimani, Wei-Ting Liu, Yu- Liang Yang, Susana P. Gaudenico, William Fenical, Pieter C. Dorrestein, and Pavel Pevzner, Multiplex De Novo Sequencing of Peptide Antibiotics, Journal of Computational Biology, 2011, 18(11): 1371-1381. The dissertation author was the primary author of this paper responsible for the research. Chapter 3 is in part adapted from Hosein Mohimani, Yu-Liang Yang, Wei- Ting Liu, Pei-Wen Hsieh, Pieter C. Dorrestein, and Pavel Pevzner, Sequencing Cyclic Peptides by Multistage Mass Spectrometry, 2011, Journal of Proteomics, 11(18), 3642-50. The dissertation author was the primary author of this paper responsible for the research. Chapter 4 is in part adapted from Hosein Mohimani, Wei-Ting Liu, Joshua S. Mylne, Aaron G. Poth, Michelle Colgrave, Michael Selsted, Pieter C. Dor- restein, and Pavel A. Pevzner, Cycloquest: Identification of cyclopeptides via database search of their mass spectra against genome databases, 2011, Journal of Proteome Research, 10 (10), pp 4505-4512. The dissertation author was the primary author of this paper responsible for the research. Chapter 5 is in part adapted from Hosein Mohimani, Sangtae Kim, and ix Pavel A. Pevzner, A new approach to evaluating statistical significance of spec- tral identifications, 2013, Journal of Proteome Research, 12 (4), pp 1560-1568. The dissertation author was the primary author of this paper responsible for the research. x VITA 2008 B.S. in Electrical Engineering and Mathematical Sciences, Sharif University of Technology, Iran 2013 Ph. D. in Electrical Engineering (Communications Theory and Systems), University of California, San Diego, USA PUBLICATIONS Hosein Mohimani, Roland Kersten, Wei-Ting Liu, Pieter C. Dorrestein, and Pavel A. Pevzner, NRPquest: Coupling Mass Spectrometry and Genome Mining for Non Ribosomal Peptide Discovery (submitted). Hosein Mohimani, Roland Kersten, Wei-Ting Liu, Samuel O. Purvine, Si Wu, Heather M. Brewer, Ljiljana Pasa-Tolic, Bradley S. Moore, Pavel A. Pevzner, and Pieter C. Dorrestein, Informatimicin, a novel Streptomyces viridochromogenes lanthipeptide discovered by automated peptidogenomics (in preparation). Hosein Mohimani, Sangtae Kim, and Pavel A. Pevzner, A New Approach to Evaluating Statistical Significance of Spectral Identifications, Journal of Proteome Research, 2013, 12 (4), pp 15601568 Hosein Mohimani, Wei-Ting Liu, Joshua S. Mylne, Aaron G. Poth, Michelle Colgrave, Michael Selsted, Pieter C. Dorrestein, and Pavel A. Pevzner, Cycloquest: Identification of cyclopeptides via database search of their mass spectra against genome databases, Journal of Proteome Research, 2011, 10 (10), pp 4505-4512. Hosein Mohimani, Yu-Liang Yang, Wei-Ting Liu, Pei-Wen Hsieh, Pieter C. Dorrestein, and Pavel Pevzner, Sequencing Cyclic Peptides by Multistage Mass Spectrometry, 2011, Journal of Proteomics, 11(18), 3642-50. Hosein Mohimani, Wei-Ting Liu, Yu-Liang Yang, Susana P. Gaudenico, William Fenical, Pieter C. Dorrestein, and Pavel Pevzner, Multiplex De Novo Sequencing of Peptide Antibiotics, Journal of Computational Biology, 2011, 18(11), 1371-1381 Emily Mevers,Wei-Ting Liu, Niclas Engene, Hosein Mohimani, Tara Byrum, Pavel A. Pevzner, Pieter C. Dorrestein, Carmenza Spadafora, and William H. Gerwick, Cytotoxic veraguamides, alkynyl bromide-containing cyclic depsipeptides from the marine cyanobacterium cf. Oscillatoria margaritifera, 2011, Journal of Natural Product, 74 (5), pp 928-936. xi ABSTRACT OF THE DISSERTATION Bioinformatics Methods for Natural Product Discovery by Hosein Mohimani Doctor of Philosophy in Electrical Engineering (Communications Theory and Systems) University of California, San Diego, 2013 Professor Pavel A. Pevzner, Chair Professor Alexander Vardy, Co-Chair Most of new chemical entities introduced as antibacterials over the last decades are derivated from natural products produced by living organisms. Some of the most effective antibiotics are peptidic natural products. The traditional process of natural products