Progress in Protein Structure Prediction

© 2001 Nature Publishing Group http://structbio.nature.com
nature structural biology • volume 8 number 2 • february 2001 • page 110
meeting review

Alexey G. Murzin

The series of four CASP experiments has helped to transform the field of protein structure prediction. The state of the art in protein structure prediction has undoubtedly changed, but has there been progress over the years?

“Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!”
The Red Queen to Alice in Through the Looking-Glass by Lewis Carroll

Unlike many of the conferences in the past year, this meeting did not have ‘2000’ in its name. The CASP4* organizers, John Moult (CARB, University of Maryland, USA), Tim Hubbard (Sanger Centre, Cambridge, UK), Krzysztof Fidelis and Adam Zemla (both of the Prediction Center, Lawrence Livermore National Laboratory, USA), did not need to follow the millennium year fashion to attract due attention. CASP4 was more than just a scientific conference. It was a convention of dedicated people who spent their summer working on the predictions of a new batch of protein structures and then gathered together to take one more step in the ongoing CASP experiment. This experiment, designed by John Moult and colleagues seven years ago, has been run every other year, in 1994, 1996, 1998 and 2000. The details of all CASP experiments are available online (http://PredictionCenter.llnl.gov/). The CASP4 proceedings will be published in a special issue of the journal Proteins later this year, where the proceedings of the first three CASP experiments have been published (refs 1–3).

Description of the experiment

The main objective of CASP was to subject the available prediction methods to a blind test. The participants in the experiment were asked to predict a number of structures that were about to be determined by X-ray crystallography and NMR spectroscopy. The target protein sequences were solicited from experimentalists. The targets were divided into three different categories: comparative modeling, fold recognition and ab initio methods.

This division reflected the status of the field at the beginning of the experiment in 1994. At that time, it had been well recognized that the structure of a protein is fully determined by the sequence of its amino acid residues, but all earlier attempts to compute the structure ab initio using nothing else but the sequence and the laws of physics and chemistry had very limited if any success. However, there were other methods available that used additional information. The observation of homologous proteins having very similar structures allowed the comparative modeling of protein structure by sequence homology to known structure. The notion that Nature may operate with a limited repertoire of protein folds gave rise to the fold recognition methods, which search for a correlation between a given sequence and a known fold. A substantial fraction of the target proteins were expected to have new protein folds, thus providing the opportunity of a double blind test of the ab initio methods by eliminating possible insights from the fold recognition methods.

The boundaries between the categories were somewhat arbitrary and have changed from one CASP experiment to another. Also, the CASP process stimulated the development of new approaches to protein structure prediction, for example, the knowledge-based methods that could predict across the different categories. To make the assessment as fair as possible to all prediction methods, after CASP2, the prediction formats were unified for all categories, and the category of each target was decided upon the assessment of submitted predictions. The assessment was carried out by independent assessors invited to analyze the predictions in each category and to nominate the best predictors for the presentation of their methods at the final meeting.

The original CASP design has withstood the test of time. It has not changed significantly since the first CASP experiment, neither has the format, timing or location of the final CASP meetings. The experiment has been a success among the predictor community, each time attracting more and more participants and generating more and more predictions (Table 1). In contrast, target collection, which depends on the generosity of experimentalists, failed again to reach an optimal total number of 100 targets in all categories. In CASP4, there were just 43 targets, the same number as in CASP3. On the positive side, 40 of the CASP4 targets had their experimental structures determined in time and thus were available for the assessment, four more than were available in CASP3.

Table 1 CASP process in numbers

CASP number   Year   Number of targets   Number of predictor teams   Total number of predictions
CASP1         1994   33                  35                          135
CASP2         1996   42                  72                          947
CASP3         1998   43                  98                          3,807
CASP4         2000   43                  163                         11,136

Assessment of predictions

The assessors’ burden of interpreting the results was heavier than ever this year. To deal with it, each of the three principal assessors requested the assistance of his/her colleagues. Anna Tramontano (IRBM, Pomezia, Italy), the co-organizer of two FEBS advanced courses on the Frontiers of Protein Structure Prediction in 1995 and 1997, was in charge of the comparative modeling assessment. Manfred Sippl (University of Salzburg, Austria), the leader of one of the most successful predictor teams of the three previous CASPs, carried out the fold recognition assessment with the help of his group. The former CASP2 assessor in the ab initio methods category, Arthur Lesk (University of Cambridge, UK), made a comeback as the principal assessor in the renamed category of new fold methods. The renaming from ab initio was prompted by the development of new prediction methods that assemble a protein fold from small parts using both the first principles and empirical rules derived from known structures. The assessors had just two months before the final meeting to complete their analyses of more than 11,000 predictions submitted by 163 predictor teams and automated servers. Although the assessment was facilitated by the numerical evaluation data generated in the Prediction Center and by the assessors’ own software, it was partly manual labor and extremely time-consuming (M. Sippl estimated that the fold recognition assessment had required ~25 person-weeks working time). In the fold recognition and new fold methods categories, manual inspection was needed for the identification of partially correct predictions that do not stand out in the numerical evaluation tables and plots. In the comparative modeling category, where almost all predicted structures were sufficiently close to the experimental structure, the focus of manual inspection was on the prediction of fine details.

The assessors were also asked to address the following question: has there been progress in comparison to the earlier CASPs? Two of the three principal assessors answered positively. The comparative modeling assessor was less certain, as there was no substantial difference in the quality of CASP4 and CASP3 models. But the same could probably be said about the predictions in the other two categories. At first glance the CASP4 and CASP3 results look quite similar. In both CASP3 and CASP4, there were many correct predictions for some targets, whereas for other targets there were just a few good predictions, and a few targets were missed completely.

The non-uniform distribution of successful predictions could be explained by the variation of prediction difficulty. With increasing prediction difficulty, both the number of correct predictions and the [...]

[...] targets, neither are the statistical results of different CASPs.

Development in structure prediction

The CASP process has brought a strong element of competition to the field, in particular to the fold recognition category, where the number of correctly assigned folds is a simple criterion for a team’s success. In the earlier CASPs, only a few prediction teams consistently made a substantial number of correct assignments. In CASP4, there were many more teams that showed similarly good performances over a wide range of targets, so this time the assessors measured the team’s success by quality rather than quantity of the team’s correct predictions. In pursuit of the best result rather than the best method, the approaches used by different teams have begun to converge over the years. Many predictors used combinations of different techniques rather than a single method to improve their performances.

The CASP4 predictors were also able to [...]

[...] major developments outside the CASP process that transformed the field of protein structure prediction, particularly in the fold recognition category, arguably the most interesting in the field. Unlike the ab initio/new fold methods category, successful predictions in the fold recognition category could result in substantially complete structures; unlike comparative modeling, correct fold assignment was far from trivial. In the early days of CASP, the fold recognition category was dominated by methods that thread the sequence in question through a library of known protein folds, and the terms ‘threading’ and ‘fold recognition’ were synonymous. In CASP2, a major blow to the superiority of threading methods came from the knowledge-based methods brought up by the creation of databases of proteins of known structure, such as SCOP (the Structural Classification of Proteins database). The coming of multiple sequence alignment-based similarity searches, like PSI-BLAST, made a big impact in CASP3.
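
The growth in participation reported in Table 1 can be made concrete with a little arithmetic. The following is a minimal sketch in Python, not part of the review itself: the figures are transcribed from Table 1, while the helper name `growth_summary` and the derived ratios (predictions per team, predictions per target, growth over the previous round) are illustrative assumptions chosen here.

```python
# Figures transcribed from Table 1 of the review.
# Each tuple: (name, year, targets, predictor teams, total predictions)
CASP_ROUNDS = [
    ("CASP1", 1994, 33, 35, 135),
    ("CASP2", 1996, 42, 72, 947),
    ("CASP3", 1998, 43, 98, 3807),
    ("CASP4", 2000, 43, 163, 11136),
]


def growth_summary(rounds):
    """Derive simple ratios per round (hypothetical helper, not from the review)."""
    summary = []
    previous_predictions = None
    for name, year, targets, teams, predictions in rounds:
        summary.append({
            "name": name,
            "year": year,
            "predictions_per_team": predictions / teams,
            "predictions_per_target": predictions / targets,
            # Ratio to the previous round; None for the first round.
            "growth_vs_previous": (
                predictions / previous_predictions
                if previous_predictions is not None
                else None
            ),
        })
        previous_predictions = predictions
    return summary


if __name__ == "__main__":
    for row in growth_summary(CASP_ROUNDS):
        growth = row["growth_vs_previous"]
        growth_text = "n/a" if growth is None else f"{growth:.2f}x"
        print(
            f'{row["name"]} ({row["year"]}): '
            f'{row["predictions_per_team"]:.1f} predictions/team, '
            f'{row["predictions_per_target"]:.1f} predictions/target, '
            f"growth {growth_text}"
        )
```

On these figures the total number of predictions grows roughly threefold from CASP3 to CASP4 while the number of targets stays at 43, which is consistent with the heavier assessment burden described in the text.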
