Exploring in Silico Protein Toxicity Prediction Methods to Support the Food and Feed Risk Assessment

EXTERNAL SCIENTIFIC REPORT APPROVED: 28 May 2020 doi:10.2903/sp.efsa.2020.EN-1875 Literature search – Exploring in silico protein toxicity prediction methods to support the food and feed risk assessment L. Palazzolo1, E. Gianazza1, I. Eberini1 1Dipartimento di Scienze Farmacologiche e Biomolecolari, Università degli Studi di Milano, Via Balzaretti 9, 20133 Milano (IT) Abstract This report is the outcome of an EFSA procurement (NP/EFSA/GMO/2018/01) reviewing relevant scientific information on in silico prediction methods for protein toxicity, that could support the food and feed risk assessment. Several proteins are associated with adverse (toxic) effects in humans and animals, by a variety of mechanisms. These are produced by plants, animals and bacteria to prevail in hostile environments. In the present report, we present an integrated pipeline to perform a comprehensive literature and database search applied to proteins with toxic effects. “Toxin activity” and “toxin-antitoxin system” strings were used as inputs for this pipeline. UniProtKB was considered as the reference database, and only the UniProtKB curator-reviewed proteins were considered in the pipeline. Experimentally- determined structures and homology-based in silico 3D models were retrieved from protein structures repositories; family-, domain-, motif- and other molecular signature-related information was also obtained from specific databases which are part of the InterPro consortium. Protein aggregation associated with adverse effects was also investigated using different search strategies. This work can serve as the basis for further exploring novel risk assessment strategies for new proteins using in silico predictive methods. © European Food Safety Authority, 2020 Key words: (protein toxicity, bioinformatics, toxic activity, toxin-antitoxin system, aggregates, predictive toxicity) Question number: EFSA-Q-2019-00038 Correspondence: [email protected] www.efsa.europa.eu/efsajournal 1 EFSA Supporting publication 2020: EN-1875 Literature search – Exploring in silico protein toxicity prediction methods to support food/feed RA Disclaimer: The present document has been produced and adopted by the bodies identified above as author(s). This task has been carried out exclusively by the author(s) in the context of a contract between the European Food Safety Authority and the author(s), awarded following a tender procedure. The present document is published complying with the transparency principle to which the Authority is subject. It may not be considered as an output adopted by the Authority. The European Food Safety Authority reserves its rights, view and position as regards the issues addressed and the conclusions reached in the present document, without prejudice to the rights of the authors. Suggested citation: Palazzolo L, Gianazza E and Eberini I, 2020. Literature search – Exploring in silico protein toxicity prediction methods to support the food and feed risk assessment. EFSA supporting publication20 20:EN-1875. 89 pp. doi:10.2903/sp.efsa.2020.EN-1875 ISSN: 2397-8325 © European Food Safety Authority, 2020 www.efsa.europa.eu/publications 2 EFSA Supporting publication 2020: EN-1875 The present document has been produced and adopted by the bodies identified above as authors. This task has been carried out exclusively by the authors in the context of a contract between the European Food Safety Authority and the authors, awarded following a tender procedure. The present document is published complying with the transparency principle to which the Authority is subject. It may not be considered as an output adopted by the Authority. The European Food Safety Authority reserves its rights, view and position as regards the issues addressed and the conclusions reached in the present document, without prejudice to the rights of the authors). Literature search – Exploring in silico protein toxicity prediction methods to support food/feed RA Summary Abstract .........................................................................................................................................1 Summary .......................................................................................................................................3 1. Introduction ........................................................................................................................5 1.1. Background and Terms of Reference as provided by the requestor ........................................5 1.1.1. Task 1 ................................................................................................................................5 1.1.2. Task 2 ................................................................................................................................5 1.2. Interpretation of the Terms of Reference as provided by EFSA ..............................................5 1.3. Protein toxicity – Background information .............................................................................5 1.3.1. Toxic proteins .....................................................................................................................6 1.3.2. Proteins with putative toxic activity as aggregates .................................................................8 2. Data and Methodologies ......................................................................................................9 2.1. Databases ........................................................................................................................ 10 2.1.1. Literature databases .......................................................................................................... 10 2.1.2. Protein Databases - Primary structure and function ............................................................. 11 2.1.3. Protein databases - Modelling tools .................................................................................... 12 2.1.4. Families, domains and signatures ....................................................................................... 13 2.1.5. Toxin predicting tools ........................................................................................................ 15 2.2. Methodologies .................................................................................................................. 16 2.2.1. Compilation of a comprehensive collection of information for toxic proteins .......................... 16 2.2.2. Proteins with putative toxic activity as aggregates ............................................................... 23 2.3. Three-dimensional structures ............................................................................................. 26 2.4. Families, domains and signatures ....................................................................................... 26 2.5. Automated information retrieval and management tool ....................................................... 26 3. Results ............................................................................................................................. 27 3.1. Toxic proteins ................................................................................................................... 27 3.1.1. Main Collection ................................................................................................................. 27 3.1.2. TAS Collection ................................................................................................................... 46 3.2. Proteins with putative toxic activity as aggregates ............................................................... 60 4. Quality assurance of 3D structures ..................................................................................... 63 4.1. Protein Data Bank ............................................................................................................. 63 4.1.1. Models downloaded from SM repository ............................................................................. 64 5. Evaluation of toxin prediction tools ..................................................................................... 66 5.1. NTXpred ........................................................................................................................... 67 5.2. BTXpred ........................................................................................................................... 70 5.3. Knottin ............................................................................................................................. 73 5.4. CLANTOX ......................................................................................................................... 76 5.5. ConoServer ....................................................................................................................... 79 5.6. ToxinPred ......................................................................................................................... 82 6. Evaluation of other freely available tools ............................................................................. 83 6.1. Meme ............................................................................................................................... 83 7. Discussion, conclusions and future perspectives .................................................................. 84 References ................................................................................................................................... 86 Annex A

Exploring in Silico Protein Toxicity Prediction Methods to Support the Food and Feed Risk Assessment

The Role of Streptococcal and Staphylococcal Exotoxins and Proteases in Human Necrotizing Soft Tissue Infections

Manual of Bacteriology

Poisoning by Medical Plants

Meeting Key Synthetic Challenges in Amanitin Synthesis with a New Cytotoxic Analog: 50-Hydroxy- Cite This: Chem

Biosafety Level 1 and Rdna Training

Clinical Laboratory Preparedness and Response Guide

Penetration of Stratified Mucosa Cytolysins Augment Superantigen

Chapter 26 BIOSAFETY Appendix B. Pathogen and Toxin Lists B.1

Denotations & Old Terminologies Used in Homopathy

Response of Cellular Innate Immunity to Cnidarian Pore-Forming Toxins

A Novel Superantigen Isolated from Pathogenic Strains of Streptococcus Pyogenes with Aminoterminal Homology to Staphylococcal Enterotoxins B and C

A Report on Mushrooms Poisonings in 2018 at the Apulian Regional Poison Center