Food Allergy Research and Resource Program    Study No. BIO-02-2006
University of Nebraska    Page 1 of 24

Study Title
    Bioinformatic Analysis of Proteins in Golden Rice 2 to Assess Potential Allergenic Cross-Reactivity

Authors
    Richard E. Goodman, John Wise

Study Completed On
    May 2, 2006

Performing Laboratory
    Food Allergy Research and Resource Program
    Food Science and Technology
    University of Nebraska
    143 Food Science & Technology
    Lincoln, NE 68583-0955

Laboratory Project ID
    Study Number: BIO-02-2006

Study Number: BIO-02-2006

Title: Bioinformatic Analysis of Proteins in Golden Rice 2 to Assess Potential Allergenic Cross-Reactivity

Facility:
    Food Allergy Research and Resource Program
    Food Science and Technology
    University of Nebraska
    143 Food Industry Complex
    Lincoln, NE 68583-0955 USA

Authors: Richard E. Goodman, John Wise, University of Nebraska

Study Start Date: April 4, 2006

Study Completion Date: June 5, 2006

Records Retention: All study specific raw data and a copy of the final report will be retained at the Food Allergy Research and Resource Program, University of Nebraska.

Signatures of Final Report Approval:
    Richard E. Goodman, Ph.D., Study Author           5 June 2006
    Stephen L. Taylor, Ph.D., FARRP Co-Director       5 June 2006

Table of Contents

Title Page
Signatures of Final Report Approval
Table of Contents
Abbreviations and Definitions
1.0 Summary
2.0 Introduction
3.0 Purpose
4.0 Methods
    4.1 Protein databases
        4.1.1 AllergenOnline version 6.0 database
        4.1.2 NCBI Entrez Protein database
    4.2 Sequence database search strategies
        4.2.1 FASTA3 overall search of AllergenOnline
        4.2.2 FASTA3 of AllergenOnline by 80 aa segments
        4.2.3 BLASTP of NCBI Entrez “allergen”
5.0 Results and Discussion
    5.1 Bioinformatics results for CTP-CRTI modified from Erwinia (GR2)
    5.2 Bioinformatics results for PSY from maize (GR2)
    5.3 Bioinformatics results for PMI from E. coli (GR2)
    5.4 Bioinformatics results for positive control Ber e 1
6.0 Conclusions
7.0 References

Tables
    Table 1. CTP-CRTI overall FASTA3 search of AllergenOnline version 6.0
    Table 2. CTP-CRTI BLASTP of NCBI Entrez “allergen”
    Table 3. PSY overall FASTA3 search of AllergenOnline version 6.0
    Table 4. PSY BLASTP of NCBI Entrez “allergen”
    Table 5. PMI overall FASTA3 search of AllergenOnline version 6.0
    Table 6. PMI BLASTP of NCBI Entrez “allergen”
    Table 7. Ber e 1 overall FASTA3 search of AllergenOnline version 6.0
    Table 8. Ber e 1 80 amino acid FASTA search of AllergenOnline version 6.0
    Table 9. Ber e 1 BLASTP of NCBI Entrez “allergen”

Figures
    Figure 1. CTP-CRTI query sequence
    Figure 2. PSY query sequence
    Figure 3. PMI query sequence
    Figure 4. Ber e 1 query sequence

Appendices
    Appendix 1. AllergenOnline version 6.0 sequence database
    Appendix 2. CTP-CRTI sequence and bioinformatics data
    Appendix 3. PSY sequence and bioinformatics data
    Appendix 4. PMI sequence and bioinformatics data
    Appendix 5. Ber e 1 sequence and bioinformatics data

Abbreviations and Definitions

aa            Amino acid
AO6           http://www.AllergenOnline.com/ database version 6.0
Ber e 1       An allergenic 2S albumin from Bertholletia excelsa (positive control)
BLASTP        Algorithm used to find local high scoring alignments between a pair of protein sequences (using databases on Entrez)
CRTI          Carotene desaturase I from Erwinia uredovora
CTP           Chloroplast transit peptide from garden pea
CTP-CRTI      Fused CTP-CRTI
Entrez NCBI   A public genetic database maintained by the National Center for Biotechnology Information (NCBI) at the National Institutes of Health, Bethesda, MD. Protein entries in the Entrez search and retrieval system are maintained by the NCBI of the National Institutes of Health (U.S.A.)
FASTA3        Algorithm used to find local high scoring alignments between a pair of protein sequences (using the AllergenOnline database)
GI            A unique identification number assigned by NCBI to each sequence in the database
GR2           Golden Rice version 2, a genetically modified rice that includes three genes introduced through biotechnology (see http://www.goldenrice.org)
PMI           Phosphomannose isomerase from Escherichia coli
PSY           Phytoene synthase from Zea mays
1.0 Summary

The three proteins expressed by the genes introduced into Golden Rice 2 (GR2) through genetic engineering were evaluated using bioinformatic approaches to identify any potential sequence matches to allergenic proteins that might indicate an elevated risk of allergic cross-reactivity in consumers. Two sequence alignment and similarity scoring algorithms were used in these comparisons. The FASTA3 algorithm (Pearson, 2000) was used with the default scoring matrix (BLOSUM 50) to evaluate the overall alignment of each query sequence against all sequences in AllergenOnline, looking for matches with low E score values (< 1e-7) and/or greater than 50% identity as an indication of potential cross-reactivity. FASTA3 was also used to search for any segment of 80 or more amino acids that aligned with 35% identity or more to any sequence in AllergenOnline, the level suggested as a lower limit for considering potential cross-reactivity (Codex, 2003). Finally, BLASTP was used to identify any significant similarity to newly reported “allergen” sequences not found in AllergenOnline version 6.0, by searching the non-redundant (nr) sequences in the NCBI-Entrez Protein Database.

The FASTA and BLASTP algorithms perform relatively similar comparisons, although their scoring matrices and scoring penalties are slightly different. Both programs compare amino acid sequences (i.e., primary protein structure), and the alignment data may be used to infer higher order structural similarities (i.e., secondary and tertiary protein structures). Proteins that share a high degree of similarity throughout their entire length usually share secondary structure, common three-dimensional folds and functions. Because of this structural similarity, closely related homologues may share immunological cross-reactivity, including IgE binding.
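The screening criteria stated in this summary can be illustrated with a minimal Python sketch. It is not the study's own software: the names `Hit`, `sliding_windows`, `flag_overall` and `flag_window` are hypothetical, and two details are assumptions rather than statements from the report, namely that 80-residue segments overlap with a one-residue offset and that a query shorter than 80 residues is searched as a single segment. The alignment statistics themselves (E score, percent identity, overlap length) are taken as given, as they would be reported by FASTA3.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple

# Thresholds quoted in the Summary; the constant names are illustrative.
E_SCORE_CUTOFF = 1e-7            # overall search: E score below this is flagged
OVERALL_IDENTITY_CUTOFF = 50.0   # overall search: percent identity above this is flagged
SEGMENT_LENGTH = 80              # length of each query segment, in amino acids
SEGMENT_IDENTITY_CUTOFF = 35.0   # segment search: percent identity at or above this is flagged


@dataclass
class Hit:
    """Alignment statistics for one query/allergen pair (hypothetical container)."""
    e_score: float
    percent_identity: float
    overlap: int  # number of aligned residues


def sliding_windows(sequence: str, length: int = SEGMENT_LENGTH) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start, segment) pairs.

    Assumption: consecutive segments are offset by one residue; a sequence
    no longer than `length` is returned whole as a single segment.
    """
    if len(sequence) <= length:
        yield 1, sequence
        return
    for start in range(len(sequence) - length + 1):
        yield start + 1, sequence[start:start + length]


def flag_overall(hit: Hit) -> bool:
    """Overall search criterion: E score < 1e-7 and/or identity > 50%."""
    return hit.e_score < E_SCORE_CUTOFF or hit.percent_identity > OVERALL_IDENTITY_CUTOFF


def flag_window(hit: Hit) -> bool:
    """Segment criterion: at least 35% identity over 80 or more aligned residues."""
    return hit.overlap >= SEGMENT_LENGTH and hit.percent_identity >= SEGMENT_IDENTITY_CUTOFF
```

Under these assumptions, a 100-residue query yields 21 overlapping segments, each of which would be searched against AllergenOnline separately and its best alignments passed to `flag_window`.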
Proteins that contain two or more IgE epitopes, or proteins with single epitopes that are cross-linked, may bind IgE on the surface of mast cells in sensitized (allergic) individuals. If a sufficient number of allergens are bound on a mast cell, it will degranulate, releasing immune mediators such as histamine and leukotrienes that induce the allergic reaction. Highly similar homologues are more likely to be bound by the same IgE, because of the increased likelihood that the