Human Proteinpedia Enables Sharing of Human Protein Data
Suresh Mathivanan, Mukhtar Ahmed, Natalie Ahn, Hainard Alexandre, Ramars Amanchy, Philip Andrews, Joel Bader, Brian Balgley, Marcus Bantscheff, Keiryn Bennett, et al.

To cite this version: Suresh Mathivanan, Mukhtar Ahmed, Natalie Ahn, Hainard Alexandre, Ramars Amanchy, et al. Human Proteinpedia enables sharing of human protein data. Nature Biotechnology, Nature Publishing Group, 2008, 26 (2), pp.164-167. doi:10.1038/nbt0208-164. hal-02883742

HAL Id: hal-02883742
https://hal.inrae.fr/hal-02883742
Submitted on 29 Jun 2020

Distributed under a Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 International License

© 2008 Nature Publishing Group http://www.nature.com/naturebiotechnology

CORRESPONDENCE

Human Proteinpedia enables sharing of human protein data

To the editor:
Proteomic technologies, such as yeast two-hybrid, mass spectrometry (MS), protein/peptide arrays and fluorescence microscopy, yield multi-dimensional data sets, which are often quite large and either not published or published as supplementary information that is not easily searchable. Without a system in place for standardizing and sharing data, it is not fruitful for the biomedical community to contribute these types of data to centralized repositories. Even more difficult is the annotation and display of pertinent information in the context of the corresponding proteins. Wikipedia, an online encyclopedia that anyone can edit, has already proven quite successful1 and can be used as a model for sharing biological data. However, the need for experimental evidence, data standardization and ownership of data creates scientific obstacles.

Here, we describe Human Proteinpedia (http://www.humanproteinpedia.org/) as a portal that overcomes many of these obstacles to provide an integrated view of the human proteome. Human Proteinpedia also allows users to contribute and edit proteomic data with two significant differences from Wikipedia: first, the contributor is expected to provide experimental evidence for the data annotated; and second, only the original contributor can edit their data.

Human Proteinpedia’s annotation system provides investigators with multiple options for contributing data including web forms and annotation servers (Supplementary Fig. 1 online). Although registration is required to contribute data, anyone can freely access the data in the repository. The web forms simplify submission through the use of pull-down menus for certain data fields and pop-up menus for standardized vocabulary terms. Distributed annotation servers2 using modified protein DAS (distributed annotation system) protocols developed by us (DAS protocols were originally developed for sharing mRNA and DNA data) permit contributing laboratories to maintain protein annotations locally. All protein annotations are visualized in the context of corresponding proteins in the Human Protein Reference Database (HPRD)3. Figure 1 shows tissue expression data for alpha-2-HS glycoprotein derived from three different types of experiments.

Figure 1 Display of tissue expression data obtained from immunohistochemistry, mass spectrometry and western blot analysis that were submitted to Human Proteinpedia. The molecule page for α-2-HS glycoprotein in Human Protein Reference Database is shown. Annotations pertaining to sites of tissue expression from three types of experimental platforms contributed to Human Proteinpedia are shown. The meta-annotation of a mass spectrometry experiment confirming expression in serum is shown (lower left) along with the corresponding MS/MS spectrum for a peptide displayed using a spectrum viewer obtained from PRIDE. Expression of α-2-HS glycoprotein in liver cancer is provided by two entries: one from western blotting (top right) and the other from immunohistochemical labeling, each detailing the antibody used. Expression in kidney is hyperlinked to the corresponding page in the Human Protein Atlas, which contains data from immunohistochemical labeling experiments (lower right).

Our unique effort differs significantly from existing repositories, such as PeptideAtlas4 and PRIDE5, in several respects. First, most proteomic repositories are restricted to one or two experimental platforms, whereas Human Proteinpedia can accommodate data from diverse platforms, including yeast two-hybrid screens, MS, peptide/protein arrays, immunohistochemistry, western blots, co-immunoprecipitation and fluorescence microscopy–type experiments.

Second, Human Proteinpedia allows contributing laboratories to annotate data pertaining to six features of proteins (post-translational modifications, tissue expression, cell line expression, subcellular localization, enzyme substrates and protein-protein interactions; Supplementary Fig. 2 online). No existing repository currently permits annotation of all these features in proteins. Third, all data submitted to Human Proteinpedia are viewable through HPRD in the context of other features of the corresponding proteins. To aid comparison and interpretation, meta-annotations pertaining to samples, method of isolation and experimental platform-specific information are provided (e.g., labeling method, protease used, ionization method, details of primary antibody used). And fourth, in spite of accommodating multiple data types, the data submission is simplified. This means that a biologist with no technical expertise can log in and contribute data.

Thus far, a considerable body of data has been contributed to Human Proteinpedia by the community (see Table 1), with a total of >1.8 million peptides and >4 million MS/MS spectra deposited. The above-mentioned data were derived from 2,695 individual experiments (single experiments are defined as immunohistochemistry performed with a specific antibody, a single MS run or a yeast two-hybrid screen). We have imported MS data from two Human Proteome Organization initiatives, including the human plasma proteome project (HPPP)6 and the human liver proteome project (HLPP)7.

such as eVOC11, Gene Ontology12, RESID13, PSI-MI14 and PSI-MS.

Tranche file-sharing network supporting ProteomeCommons.org15 (http://www.

Data from other initiatives like the Human Protein Atlas8, Human Unidentified Gene-Encoded