© 2001 Nature Publishing Group http://structbio.nature.com meeting review

Progress in prediction

Alexey G. Murzin

The series of four CASP experiments has helped to transform the field of protein structure prediction. The state of the art in protein structure prediction has undoubtedly changed, but has there been progress over the years?

“Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!”
The Red Queen to Alice in Through the Looking-Glass by Lewis Carroll

Unlike many of the conferences in the past year, this meeting did not have ‘2000’ in its name. The CASP4* organizers, John Moult (CARB, University of Maryland, USA), Tim Hubbard (Sanger Centre, Cambridge, UK), Krzysztof Fidelis and Adam Zemla (both of the Prediction Center, Lawrence Livermore National Laboratory, USA), did not need to follow the millennium year fashion to attract due attention. CASP4 was more than just a scientific conference. It was a convention of dedicated people who spent their summer working on the predictions of a new batch of protein structures and then gathered together to take one more step in the ongoing CASP experiment. This experiment, designed by John Moult and colleagues seven years ago, has been run every other year, in 1994, 1996, 1998 and 2000. The details of all CASP experiments are available online (http://PredictionCenter.llnl.gov/). The CASP4 proceedings will be published in a special issue of the journal Proteins later this year, where the proceedings of the first three CASP experiments have been published¹⁻³.

Description of the experiment

The main objective of CASP was to subject the available prediction methods to a blind test. The participants in the experiment were asked to predict a number of structures that were about to be determined by X-ray crystallography and NMR spectroscopy. The target protein sequences were solicited from experimentalists. The targets were divided into three different categories: comparative modeling, fold recognition and ab initio methods.

This division reflected the status of the field at the beginning of the experiment in 1994. At that time, it had been well recognized that the structure of a protein is fully determined by the sequence of its amino acid residues, but all earlier attempts to compute the structure ab initio using nothing else but the sequence and the laws of physics and chemistry had very limited if any success. However, there were other methods available that used additional information. The observation of homologous proteins having very similar structures allowed the comparative modeling of protein structure by sequence homology to known structure. The notion that Nature may operate with a limited repertoire of protein folds gave rise to the fold recognition methods searching for a correlation between a given sequence and a known fold. A substantial fraction of the target proteins were expected to have new protein folds, thus providing the opportunity of a double blind test of the ab initio methods by eliminating possible insights from the fold recognition methods.

The boundaries between the categories were somewhat arbitrary and have changed from one CASP experiment to another. Also, the CASP process stimulated the development of new approaches to protein structure prediction, for example, the knowledge-based methods that could predict across the different categories. To make the assessment as fair as possible to all prediction methods, after CASP2, the prediction formats were unified for all categories, and the category of each target was decided upon the assessment of submitted predictions. The assessment was carried out by independent assessors invited to analyze the predictions in each category and to nominate the best predictors for the presentation of their methods at the final meeting.

The original CASP design has withstood the test of time. It has not changed significantly since the first CASP experiment, and neither have the format, timing or location of the final CASP meetings. The experiment has been a success among the predictor community, each time attracting more and more participants and generating more and more predictions (Table 1). In contrast, target collection, which depends on the generosity of experimentalists, failed again to reach an optimal total number of 100 targets in all categories. In CASP4, there were just 43 targets, the same number as in CASP3. On the positive side, 40 of the CASP4 targets had their experimental structures determined in time and thus were available for the assessment, four more than were available in CASP3.

Table 1 CASP process in numbers

CASP number   Year   Number of targets   Number of predictor teams   Total number of predictions
CASP1         1994   33                  35                          135
CASP2         1996   42                  72                          947
CASP3         1998   43                  98                          3,807
CASP4         2000   43                  163                         11,136

Assessment of predictions

The assessors’ burden of interpreting the results was heavier than ever this year. To deal with it, each of the three principal assessors requested the assistance of his/her colleagues. Anna Tramontano (IRBM, Pomezia, Italy), the co-organizer of two FEBS advanced courses on the Frontiers of Protein Structure Prediction in 1995 and 1997, was in charge of the comparative modeling assessment. Manfred Sippl (University of Salzburg, Austria), the leader of one of the most successful predictor teams of the three previous CASPs, carried out the fold recognition assessment with the help of his group. The former CASP2 assessor in the ab initio methods category Arthur Lesk (, UK) made a comeback as the principal assessor in the renamed category of new fold methods. The renaming from ab initio was prompted by the development of new prediction methods that assemble a protein fold from small parts using both the first principles and empirical rules derived from known structures. The assessors had just two months before the final meeting to complete their analyses of more than 11,000 predictions submitted by 163 predictor teams and automated servers. Although the assessment was facilitated by the numerical evaluation data generated in the Prediction Center and by the assessors’ own software, it was partly manual labor and extremely time-consuming (M. Sippl estimated that the fold recognition assessment had required ~25 person-weeks working time). In the fold recognition and new fold methods categories, manual inspection was needed for the identification of partially correct predictions that do not stand out in the numerical evaluation tables and plots. In the comparative modeling category, where almost all predicted structures were sufficiently close to the experimental structure, the focus of manual inspection was on the prediction of fine details.

The assessors also were asked to address the following question: has there been progress in comparison to the earlier CASPs? Two of the three principal assessors answered positively. The comparative modeling assessor was less certain, as there was no substantial difference in the quality of CASP4 and CASP3 models. But the same could probably be said about the predictions in the other two categories. At first glance the CASP4 and CASP3 results look quite similar. In both CASP3 and CASP4, there were many correct predictions for some targets, whereas for other targets there were just a few good predictions, and a few targets were missed completely.

The non-uniform distribution of successful predictions could be explained by the variation of prediction difficulty. With increasing prediction difficulty, both the number of correct predictions and the mean accuracy decrease. For each target, however, the prediction difficulty is subjective and depends on the prediction method. For example, the targets with many known structural homologs are considered to be easier for the fold recognition methods than targets with few similar known structures. Other specific factors include the sequence similarity in the structural alignment and the amount of common structure shared by the target protein and the protein of most similar known structure. For the sequence similarity-based methods, the targets from large sequence families are generally easier than the targets from small families or with orphan sequences. The knowledge of target protein function can be of great help for the knowledge-based methods. Previous CASP experience also comes into play; for example, in CASP4, many predictors readily recognized the structural relationships of several fold recognition targets to some previous CASP targets considered rather difficult at the time. This illustrates that the prediction difficulty cannot be measured on an absolute scale; therefore the difficulties of the CASP4 targets are not directly comparable to the difficulties of the previous CASP targets, and neither are the statistical results of different CASPs.

Development in structure prediction

The CASP process has brought a strong element of competition to the field, in particular to the fold recognition category, where the number of correctly assigned folds is a simple criterion for a team’s success. In the earlier CASPs, only a few prediction teams consistently made a substantial number of correct assignments. In CASP4, there were many more teams that showed similarly good performances over a wide range of targets, so this time the assessors measured the team’s success by quality rather than quantity of the team’s correct predictions. In pursuit of the best result rather than the best method, the approaches used by different teams have begun to converge over the years. Many predictors used combinations of different techniques rather than a single method to improve their performances.

The CASP4 predictors were also able to benefit from the predictions submitted to the CAFASP2 (critical assessment of fully automated structure prediction) experiment⁴, which was run in parallel with CASP4 on the same set of targets. The CAFASP participants were 33 fully automated servers and computer programs. The CAFASP metaserver automatically submitted the targets to each server and collected the server predictions within 48 hours after submission of the target. The server predictions were then made available online, providing a clear indication of the target difficulties and saving many human predictors from embarrassing mistakes on easy targets. The CAFASP2 predictions were assessed along with the CASP4 predictions. There were at least four servers that did quite well in CASP4, but the separate, manually submitted predictions by their developers clearly showed significant improvements by human intervention.

The fact that predictors have been learning from each other’s experience and expertise was probably the most important factor that improved protein structure prediction within the CASP process. This improvement, however, was not due to this factor alone. There were major developments outside the CASP process that transformed the field of protein structure prediction, particularly in the fold recognition category, arguably the most interesting in the field. Unlike the ab initio/new fold methods category, successful predictions in the fold recognition category could result in substantially complete structures; unlike comparative modeling, correct fold assignment was far from trivial. In the early days of CASP, the fold recognition category was dominated by methods that thread the sequence in question through a library of known protein folds, and the terms ‘threading’ and ‘fold recognition’ were synonymous. In CASP2, a major blow to the superiority of threading methods came from the knowledge-based methods brought about by the creation of databases of proteins of known structure, such as SCOP (the Structural Classification of Proteins database). The coming of multiple sequence alignment-based similarity searches, like PSI-BLAST, made a big impact in CASP3. PSI-BLAST allowed the confident detection of sequence homology between a target protein and a protein of known structure with sequence similarity much less than the pairwise method threshold of 30%, thus moving many of the would-be fold recognition targets into the comparative modeling category. This move affected only the subset of fold recognition targets known as distant homology recognition targets. The distant homology recognition targets are not only structurally similar to the proteins of known folds, but also probably evolutionarily related to some of those proteins. The genome sequencing projects, which are revealing many novel sequences, will help further the advancement of multiple alignment methods. This process will eventually deplete the distant homology fold recognition targets and add them to the comparative modeling targets. It should be emphasized that in both CASP3 and CASP4 even the best fold recognition methods were successful mainly because of the distant homology recognition targets that were considered to be the easier ones. For the difficult targets, the fold recognition methods started to lose ground to the new fold methods. In CASP4, no new success came from the
threading methods. One of the top threading teams in CASP3 (that of David Jones, Brunel University, London, UK) fully automated their program Threader, which was one of the top CAFASP2 servers, but skipped the manual post-processing of its results. The other two CASP3 leading threading teams did not predict in CASP4; Sippl’s team carried out the assessment and therefore they were not allowed to submit their predictions, whereas Bryant’s team (NCBI, NIH, USA) did not participate at all.

CASP4 highlights

The selection of main speakers showed that most of the CASP3 leading teams retained or, in some cases, strengthened their leading positions in CASP4. There were fewer speakers selected to present their results in detail in CASP4, allowing more predictors to make short presentations. At least four of the selected speakers did well in more than one category, but each of them was allowed only one main presentation.

David Baker (University of Washington, Seattle, USA) and his team performed outstandingly well across all three categories. In CASP3, their fragment assembly method produced several good predictions including arguably the best CASP3 prediction. In CASP4, this new fold method predicted complete folds of at least four new fold targets. Baker and colleagues successfully extended their approach to the prediction of new structural features in fold recognition and comparative modeling targets.

After serving as one of the CASP3 assessors, I teamed again with Alex Bateman (Sanger Centre, Cambridge, UK). Using essentially the same knowledge-based approach to distant homology recognition that performed best in CASP2, we produced the most accurate models for several fold recognition targets and achieved the top averaged score in this category. We also applied a knowledge-based approach to the prediction of new folds and had some successes.

Michael Sternberg’s team (ICRF, London, UK) retained its leading position in the comparative modeling category (the talk was presented by Paul Bates) and improved on the prediction of fold recognition targets. The team’s fold recognition server, 3D-PSSM, was the best of all CAFASP servers that participated in this category.

Leszek Rychlewski and Janusz Bujinicki (International Institute of Molecular and Cell Biology, Warsaw, Poland), the organizers of the CAFASP-like LiveBench experiment⁴, did well in comparative modeling and the prediction of distant homology targets. Their prediction strategies utilized sequence profile-profile methods and modern threading approaches, which had been carefully benchmarked beforehand in LiveBench.

Also, L. Rychlewski together with Arne Elofsson (Stockholm University, Sweden) and Daniel Fischer (Ben Gurion University, Beer-Sheva, Israel) compiled the consensus predictions by CAFASP servers. In the fold recognition category, CAFASP consensus (presented by D. Fischer) performed better than any single server, but, unlike other servers, reaching a consensus was not fully automated.

Other selected speakers included: in the comparative modeling category, Ceslovas Venclovas (Lawrence Livermore National Laboratory, Berkeley, California, USA); in the fold recognition category, Kevin Karplus (University of California, Santa Cruz, USA), ’s team (University of Cambridge, Cambridge, UK, presented by Jiye Shi) and SB-fold (SmithKline Beecham Pharmaceuticals, Philadelphia, USA, presented by Andrej Lupas); in the new fold methods category, Jeffrey Skolnick (Danforth Plant Science Center, St. Louis, Missouri, USA), Richard Friesner (Columbia University, New York, USA), David Shortle (Johns Hopkins University, Baltimore, USA) and Rita Casadio (University of Bologna, Italy). The selected speakers presented new developments in their methods and/or found new areas of application for these methods.

Future challenges

The sustained success of many individual predictors demonstrates the progress made since earlier CASPs. The collective progress is less evident, due to rapid changes in and outside the CASP process. Like Alice in Wonderland who had to run as fast as she could to keep her place, the predictors must perform better each time just to keep their place in the CASP league table. The next big changes in the field almost certainly will come from structural genomics projects aimed at the experimental determination of a large number of novel protein structures. The CASP process has to respond quickly to forthcoming changes to keep the prediction field going when structural genomics projects gain full speed. It can be anticipated that the role of fold recognition methods will eventually diminish, while the development of comparative modeling and new fold methods will get a new lease on life. Structural genomics is in a very good position now to help in speeding up the CASP process by providing more prediction targets. In return, CASP could probably help in the functional annotation of the target structures by sharing the gathered information that helped in making the correct predictions.

One of the ongoing structural genomics projects (http://s2f.carb.nist.gov) already provided several CASP4 targets. The structures of all these targets were successfully predicted, including two correctly predicted new folds that were among the CASP4 highlights. I would like to conclude with an appeal to structural biologists not to miss a good opportunity of fruitful collaboration between CASP and structural genomics.

Alexey G. Murzin is in the Centre for Protein Engineering, MRC Centre, Hills Road, Cambridge CB2 2QH, UK. email: [email protected]

*CASP4, the Fourth Meeting on the Critical Assessment of Techniques for Protein Structure Prediction, Asilomar, California, December 3–7, 2000.

1. Proteins Struct. Func. Genet. 23 (1995).
2. Proteins Struct. Func. Genet. Suppl. 1 (1997).
3. Proteins Struct. Func. Genet. Suppl. 3 (1999).
4. Fischer, D., Elofsson, A. & Rychlewski, L. Protein Eng. 13, 667–669 (2000).

nature structural biology • volume 8 number 2 • february 2001