PARSING AND ANALYSIS OF MASS SPECTROMETRY DATA OF COMPLEX BIOLOGICAL AND ENVIRONMENTAL MIXTURES

by

Kevin Kovalchik

B.S., Oregon State University, 2014
B.M., The University of Idaho, 2007

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES
(CHEMISTRY)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)

August 2019

© Kevin Kovalchik, 2019

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the dissertation entitled:

PARSING AND ANALYSIS OF MASS SPECTROMETRY DATA OF COMPLEX BIOLOGICAL AND ENVIRONMENTAL MIXTURES

submitted by Kevin A Kovalchik in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Chemistry

Examining Committee:

David DY Chen, Co-supervisor
John V Headley, Co-supervisor
Roman Krems, Supervisory Committee Member
Ed Grant, University Examiner
Keng Chou, University Examiner

Abstract

The chemical characterization of biological samples and that of environmental samples are areas of research that involve the analysis of highly complex chemical mixtures. While the samples from these two fields differ greatly in composition, they present similar challenges. Complex mixtures challenge the analytical chemist because compounds in the mixture can cause matrix effects which interfere with the analysis. Indeed, these interfering compounds may even be analytes themselves. High resolution mass spectrometry, which separates and detects ions based on their mass-to-charge ratio, is a powerful tool in the analysis of such mixtures. The amount of data resulting from such analyses, however, can be intractable to manual analysis, necessitating the use of computational tools.
Furthermore, for the data to be reliable it is important that the performance of the mass spectrometer is optimal and consistent, but the complexity of the data again makes manual interpretation of the quality difficult. Thus, there is a need for computational assistance in analysis as well as in method optimization and quality control.

In Chapter 2, we present a review of considerations toward the design of a standard mass spectrometry-based method for the quantification of naphthenic acids. The study provides recommendations for how these considerations can be addressed.

In Chapter 3, we describe a computational method of resolving dicarboxylic acids in high resolution mass spectrometry data of mixtures of derivatized naphthenic acid fraction compounds. The study is a proof of concept and demonstrates that derivatization-based methods of analyzing these diacid components are feasible but require further investigation.

In Chapter 4 and Chapter 5, we present two computational tools which assist in method optimization and quality control of Thermo Orbitrap mass spectrometer systems. Chapter 4 presents RawQuant, a software tool which extracts scan quantification and metadata from data-dependent analysis data files from Orbitrap mass spectrometer systems. The tool is designed to guide the user in method optimization. Chapter 5 presents RawTools, which builds upon RawQuant by adding the ability to track important measures of mass spectrometer performance longitudinally across a multi-run experiment. The tool is demonstrated using a 140-file dataset and provides easy visual monitoring of instrument performance.

Lay Summary

Mass spectrometry is a powerful and prevalent tool in the chemical analysis of complex samples. The technology both drives and is driven by an increasing depth of analysis found in fields as diverse as petroleum analysis, environmental monitoring, and cancer research.
This thesis will demonstrate the utility of mass spectrometry and the development of new mass spectrometry data analysis tools in two areas: environmental and biological analysis. Toward environmental analysis, we will present a case for the development of a mass spectrometry-based standard analysis method for naphthenic acids and demonstrate how computational analysis of mass spectrometry data can deepen the analysis of such samples. Toward biological analysis, we describe a new software tool for processing and analysis of protein mass spectrometry data and instrument performance, which aids in method development and quality control of mass spectrometer operation and of the resulting data.

Preface

Except as indicated below, all results presented in this thesis are my own work. My research program was designed by me and my graduate supervisor. Chapters 2, 3, 4 and 5 have been published. Publication details and author contributions are as follows:

Chapter 2 was published in Frontiers of Chemical Science and Engineering: Kovalchik KA, MacLennan M, Peru K, Headley J, Chen DDY. Standard method design considerations for semi-quantification of total naphthenic acids in oil sands process affected water by mass spectrometry: A review. Frontiers of Chemical Science and Engineering. 2017;11(3):497-507. KAK carried out the majority of the literature review and wrote the manuscript. MM contributed to the literature review. MM, KP, JH and DC contributed to writing the manuscript.

Chapter 3 was published in Rapid Communications in Mass Spectrometry: Kovalchik KA, MacLennan MS, Peru KM, Ajaero C, McMartin DW, Headley JV, Chen DDY. Characterization of dicarboxylic naphthenic acid fraction compounds utilizing amide derivatization: Proof of concept. Rapid Commun Mass Sp. 2017;31(24):2057-2065. KAK carried out the data analysis and wrote the manuscript. KAK, MSM, KMP and CA carried out the experimental work. MSM, KMP, CA, DWM, JVH and DDYC contributed to writing the manuscript.
Chapter 4 was published in The Journal of Proteome Research: Kovalchik KA, Moggridge S, Chen DDY, Morin GB, Hughes CS. Parsing and Quantification of Raw Orbitrap Mass Spectrometer Data Using RawQuant. J Proteome Res. 2018;17(6):2237-2247. CSH and KAK conceived the idea, carried out the data analysis, and wrote the manuscript. KAK developed and wrote all code for the computational tool. SM performed data analysis and contributed to writing the manuscript. DDYC and GBM contributed to writing the manuscript.

Chapter 5 was published in The Journal of Proteome Research: Kovalchik KA, Colborne S, Spencer SE, Sorensen PH, Chen DDY, Morin GB, Hughes CS. RawTools: Rapid and Dynamic Interrogation of Orbitrap Data Files for Mass Spectrometer System Management. J Proteome Res. 2019;18(2):700-708. KAK and CSH conceived the idea, carried out the data analysis, and wrote the manuscript. KAK developed and wrote all code for the computational tool. SC and SES helped with data acquisition and tool design. PHS, DDYC, and GBM contributed to writing the manuscript.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Dedication
Introduction
  1.1 Naphthenic acids and naphthenic acid fraction components
  1.2 Proteomics
    1.2.1 Quality of data in mass spectrometry-based proteomics
  1.3 Mass Spectrometry
    1.3.1 The Orbitrap mass analyzer
    1.3.2 Coupling mass spectrometry to liquid chromatography
  1.4 Untargeted mass spectrometry methods
    1.4.1 Direct injection
    1.4.2 Data-dependent acquisition
    1.4.3 Isobaric labelling
  1.5 Research objectives