Using the US EPA's Comptox Chemistry Dashboard for Structure Identification and Non-Targeted Analyses
Total Page:16
File Type:pdf, Size:1020Kb
http://www.orcid.org/0000-0002-2668-4821 Using the US EPA’s CompTox Chemistry Dashboard for structure identification and non-targeted analyses Antony Williams1, Andrew D. McEachran3, Seth Newton2, Kristin Isaacs2, Katherine Phillips2, Nancy Baker1, Chris Grulke1 and Jon R. Sobus2 1) National Center for Computational Toxicology, U.S. Environmental Protection Agency, RTP, NC 2) National Exposure Research Laboratory, U.S. Environmental Protection Agency, RTP, NC 3) Oak Ridge Institute of Science and Education (ORISE) Research Participant, Research Triangle Park, NC The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA March 2018 ACS Spring Meeting, New Orleans The CompTox Chemistry Dashboard • A publicly accessible website delivering access: – ~760,000 chemicals with related property data – Experimental and predicted physicochemical property data – Experimental Human and Ecological hazard data – Integration to “biological assay data” for 1000s of chemicals – Information regarding consumer products containing chemicals – Links to other agency websites and public data resources – “Literature” searches for chemicals using public resources – “Batch searching” for thousands of chemicals – DOWNLOADABLE Open Data for reuse and repurposing 1 CompTox Chemistry Dashboard https://comptox.epa.gov/dashboard 2 1 of ~761,000 Chemical Pages 3 Access to Chemical Hazard Data 4 Sources of Exposure to Chemicals 5 Dashboard for Structure ID • Structure Identification using the dashboard – Formula/mass-based searching – 1 chemical at a time 6 Advanced Searches 7 Advanced Searches Mass Based Search 8 Advanced Searches 9 Formula Searches 10 Exact Formula Search: C12H17NO 298 Chemicals 11 Dashboard for Structure ID • Structure Identification using the dashboard – Formula/mass-based searching – 1 chemical at a time – Distilling structures into “MS-Ready form” 12 Specific Data-Mappings “MS-Ready Structures” 13 Diphenhydramine 15 Total MS-Ready Mappings 14 “MS Ready” Formula Search C12H17NO 354 Chemicals 15 Dashboard for Structure ID • Structure Identification using the dashboard – Formula/mass-based searching – 1 chemical at a time – Distilling structures into “MS-Ready form” – Ranking based on metadata 16 Identifying Known Unknowns by reference ranking 17 Data source ranking using the Dashboard 18 DOI: 10.1007/s00216-016-0139-z Additional Metadata Ranking • US EPA CompTox Chemistry Dashboard Data Sources • “CPDat” Consumer Product Database • PubChem Data Source Count • PubMed Reference Count 19 Additional Metadata Ranking C12H17NO: 354 Chemicals 20 Additional Metadata Ranking C12H17NO: 354 Chemicals 21 Top Ranked Chemical 22 Additional data streams in development • US EPA CompTox Chemistry Dashboard Data Sources • “CPDat” Consumer Product Database • PubChem Data Source Count • PubMed Reference Count • Retention Time Prediction = + + + + • Predicted Environmental Media Occurrence ⋯ • Presence in Lists C7H7NO3 0 1 2 3 4 DTXSID5024506 DTXSID3020962 DTXSID0026961 DTXSID2022591 DTXSID9059208 DTXSID1052298 DTXSID5075365 DTXSID2062535 DTXSID0046066 DTXSID90197716 Data Sources Retention Time Media Occurrence Method Compatibility 23 “Chemicals Detected in Water” 24 Dashboard for Structure ID • Structure Identification using the dashboard – Formula/mass-based searching – 1 chemical at a time – Distilling structures into “MS-Ready form” – Ranking based on metadata – Batch searching of formulae and masses 25 Batch Search 26 Batch Search 27 Excel Output 28 Batch Search Integration to MetFrag http://c-ruttkies.github.io/MetFrag/projects/metfragweb/ 29 MetFrag Input File 30 Batch Search Integration to MetFrag http://c-ruttkies.github.io/MetFrag/projects/metfragweb/ 31 The Dashboard to Support MS-Analysis MS-Ready Structures Underpin Analysis 32 Downloadable Data 33 Future Work: Combined Substructure/Formula Searching 34 Future Work: Searching Against Predicted Spectra 35 Future Work: Searching Against Predicted Spectra • CFM-ID predicted spectra generated for 700,000 chemicals – Positive ion, Negative ion, Electron Impact – Three energies 36 Future Work Scoring scheme into results = + + + + ⋯37 Conclusion • The CompTox Chemistry Dashboard provides access to data for ~760,000 chemicals • High quality curated data and rich metadata facilitates mass spec analysis • “MS-Ready” processed data enables structure identification 38 Acknowledgments • The CompTox Chemistry Dashboard team • NERL colleagues: – Jon Sobus, Elin Ulrich, Mark Strynar, Seth Newton (NTA Analysis) – Katherine Phillips, Kathie Dionisio, Kristin Isaacs (Consumer Products Database) • Emma Schymanski – Luxembourg Center for Systems Biomedicine (MS-ready/NTA) 39 Contact Antony Williams US EPA Office of Research and Development National Center for Computational Toxicology (NCCT) [email protected] ORCID: https://orcid.org/0000-0002-2668-4821 40.