Designing and Evaluating the MULTICOM Protein Local and Global
Total Page:16
File Type:pdf, Size:1020Kb
Cao et al. BMC Structural Biology 2014, 14:13 http://www.biomedcentral.com/1472-6807/14/13 RESEARCH ARTICLE Open Access Designing and evaluating the MULTICOM protein local and global model quality prediction methods in the CASP10 experiment Renzhi Cao1, Zheng Wang4 and Jianlin Cheng1,2,3* Abstract Background: Protein model quality assessment is an essential component of generating and using protein structural models. During the Tenth Critical Assessment of Techniques for Protein Structure Prediction (CASP10), we developed and tested four automated methods (MULTICOM-REFINE, MULTICOM-CLUSTER, MULTICOM-NOVEL, and MULTICOM- CONSTRUCT) that predicted both local and global quality of protein structural models. Results: MULTICOM-REFINE was a clustering approach that used the average pairwise structural similarity between models to measure the global quality and the average Euclidean distance between a model and several top ranked models to measure the local quality. MULTICOM-CLUSTER and MULTICOM-NOVEL were two new support vector machine-based methods of predicting both the local and global quality of a single protein model. MULTICOM- CONSTRUCT was a new weighted pairwise model comparison (clustering) method that used the weighted average similarity between models in a pool to measure the global model quality. Our experiments showed that the pairwise model assessment methods worked better when a large portion of models in the pool were of good quality, whereas single-model quality assessment methods performed better on some hard targets when only a small portion of models in the pool were of reasonable quality. Conclusions: Since digging out a few good models from a large pool of low-quality models is a major challenge in protein structure prediction, single model quality assessment methods appear to be poised to make important contributions to protein structure modeling. The other interesting finding was that single-model quality assessment scores could be used to weight the models by the consensus pairwise model comparison method to improve its accuracy. Keywords: Protein model quality assessment, Protein model quality assurance program, Protein structure prediction, Support vector machine, Clustering Background tries to fold a protein from scratch without using known Predicting protein tertiary structure from amino acid se- template structures. The two kinds of methods were often quence is of great importance in bioinformatics and combined to handle a full spectrum of protein structure computational biology [1,2]. During the last few decades, prediction problems ranging from relatively easy hom- a lot of protein tertiary structure prediction methods ology modeling to hard de novo prediction [10-13]. have been developed. One category of methods adopts a During protein structure prediction, one important template-based approach [3-7], which uses experimentally task is to assess the quality of structural models pro- determined structures as templates to build structural duced by protein structure prediction methods. A model models for a target protein without known structure. An- quality assessment (QA) method employed in a protein other category uses a template-free approach [8,9], which structure prediction pipeline is critical for ranking, refin- ing, and selecting models [3]. A model quality assess- * Correspondence: [email protected] ment method can generally predict a global quality score 1Computer Science Department, University of Missouri, Columbia, Missouri measuring the overall quality of a protein structure 65211, USA 2Informatics Institute, University of Missouri, Columbia, Missouri 65211, USA model and a series of local quality scores measuring Full list of author information is available at the end of the article the local quality of each residue in the model. A global © 2014 Cao et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Cao et al. BMC Structural Biology 2014, 14:13 Page 2 of 12 http://www.biomedcentral.com/1472-6807/14/13 quality score can be a global distance test (GDT-TS) assessment method were used to benchmark model score [14-16] that is predicted to be the structural simi- quality assessment methods’ capability of distinguishing larity between a model and the unknown native struc- relatively small differences between more similar models. ture of a protein. A local quality score of a residue can The new settings provided us a good opportunity to assess be the Euclidean distance between the position of the the strength and weakness of our multi-model and single- residue in a model and that in the unknown native model protein model quality assessment methods in terms structure after they are superimposed. of accuracy, robustness, consistency and efficiency in In general, protein model quality assessment methods order to identify the gaps for further improvement. can be classified into two categories: multi-model In addition to evaluating our four servers on the methods [17-21] and single-model methods [13-17]. CASP10 benchmark, we compare our methods with three Multi-model methods largely use a consensus or cluster- popular multi-model clustering-based methods (Davis- ing approach to compare one model with other models QAconsensus [28], Pcons [29], and ModFOLDclust2 [21]). in a pool of input models to assess its quality. Generally, Our clustering-based methods (MULTICOM-REFINE, a model with a higher similarity with the rest of models MULTICOM-CONSTRUCT) performed comparably to in the pool receives a higher global quality score. The the three external tools in most cases. Our single-model methods tend to work well when a large portion of methods (MULTICOM-CLUSTER, MULTICOM-NOVEL) models in the input pool are of good quality, which is had a lower accuracy than the clustering-based methods, often the case for easy to medium hard template-based but performed considerably better than them on the modeling. Multi-model methods tend to work particularly models of hard template-free targets. Besides the reason- well if a large portion of good models were independently able performance and a comprehensive comparative generated by a number of independent, diverse protein study, our methods have some methodological innova- structure prediction methods as seen in the CASP (the tions such as using single-model quality scores to weight Critical Assessment of Techniques for Protein Structure models for clustering methods, repacking side chains Prediction) experiments, but they worked less well when before model evaluation, and improved machine learning being applied to the models generated by one single pro- methods for single-model quality assessment for template- tein structure prediction method because they prefer the free targets. average model of the largest model cluster in the model The rest of the paper is organized as follows. In the pool. And multi-model methods tend to completely fail if Results and discussions section, we analyze and discuss a significant portion of low quality modes are similar to the performance of the methods on the CASP10 bench- each other and thus dominate the pairwise model com- mark. In the Conclusion section, we summarize this work parison as seen in some cases during the 10th CASP ex- and conclude it with the directions of future work. In the periment (CASP10) held in 2012. Single-model methods Methods section, we introduce the methods in our protein strive to predict the quality of a single protein model with- model quality assessment servers tested in CASP10. out consulting any other models [22-26]. The perform- ance of single-model methods is still lagging behind the Results and discussions multi-model methods in most cases when most models in Results of global quality predictions the pool are of good quality [23,27]. However, because of We evaluated the global quality predictions using five their capability of assessing the quality of one individual measures (see the detailed descriptions of the evaluation model, they have potential to address one big challenge in methods in the Evaluation methods section). The results protein structure modeling – selecting a model of good of the global quality evaluation on Stage 1 of CASP10 quality from a large pool consisting of mostly irrelevant are shown in Table 1. The weighted pairwise model models. Furthermore, as the performance of multi-model comparison method MULTICOM-CONSTRUCT per- quality assessment methods start to converge, single- formed best among all our four servers according to all model methods appear to have a large room of improve- the five measures, suggesting using single-model quality ment as demonstrated in the CASP10 experiment. prediction scores as weights can improve the multi- In order to critically evaluate the performance of model pairwise comparison based quality prediction multi-model and single-model protein model quality as- methods such as MULTICOM-REFINE. The two multi- sessment methods, the CASP10 experiment was de- model global quality assessment methods had the better signed to assess them in two stages. On Stage 1, 20 average performance than the two single-model global models of each target spanning a wide range of quality quality assessment methods (MULTICOM-NOVEL and were used to assess the sensitivity of quality assessment MULTICOM-CLUSTER) on average on Stage 1, suggest- methods with respect to the size of input model pool ing that the advantage of multi-model methods over and the quality of input models. On Stage 2, about top single-model methods was not much affected by the 150 models selected by a naïve consensus model quality relatively small size of input models (i.e. 20). Instead, the Cao et al. BMC Structural Biology 2014, 14:13 Page 3 of 12 http://www.biomedcentral.com/1472-6807/14/13 Table 1 The average correlation (Ave.