bioRxiv preprint doi: https://doi.org/10.1101/056564; this version posted December 21, 2016. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

iSUMO - integrative prediction of functionally relevant SUMOylated proteins

Xiaotong Yao1,2, Shashank Gandhi1,3, Rebecca Bish1, Christine Vogel1*

1 Center for Genomics and Systems Biology, New York University, New York, USA
2 Tri-Institutional Program in Computational Biology and Medicine, New York, USA
3 Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, USA

* Corresponding author:
[email protected]

Abstract

Post-translational modifications by the Small Ubiquitin-like Modifier (SUMO) are essential for many eukaryotic cellular functions. Several large-scale experimental datasets and sequence-based predictions identify SUMOylated proteins. However, the overlap between these datasets is small, suggesting that they contain many false positives with low functional relevance. Therefore, we applied machine learning techniques to a diverse set of large-scale SUMOylation studies, combined with protein characteristics such as cellular function and protein-protein interactions, to provide integrated SUMO predictions for human and yeast cells (iSUMO).