Structural Genomics in North America
© 2000 Nature America Inc. • http://structbio.nature.com
perspectives

Thomas C. Terwilliger
Bioscience Division, Mail Stop M888, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA. email: [email protected]
nature structural biology • structural genomics supplement • november 2000 935

Structural genomics in North America has moved remarkably quickly from ideas to pilot projects. Just three years ago, the field was only a concept, independently being discussed by its many inventors. Now it is already a well-organized, increasingly-funded, consortium-based effort to determine protein structures on a large scale.

The pivotal point for the North American efforts was the 1998 Argonne meeting (see Table 1 for a timeline of structural genomics in North America). This meeting brought together over 80 researchers and representatives of funding agencies who thought that improvements in technology, combined with the successes of the genome sequencing projects, had set the stage for a large-scale structure determination project. Experts in all the required steps, from protein expression to X-ray and NMR structure determination, described recent progress and how the obstacles remaining in their areas could be overcome. The Argonne meeting led to a reinforced conviction on the part of many participants that the time was indeed right for structural genomics. It set a tone of excitement and promise for the structural genomics field that has propelled it ever since. Just as importantly, however, discussions at the Argonne meeting indicated that small-scale testing of the ideas of structural genomics and considerable additional technology development were necessary before a full-scale project could be carried out.

Table 1 Timeline of North American structural genomics efforts
January, 1998                   Argonne Structural Genomics Meeting
1998–1999                       Starts of pilot projects funded by US Department of Energy (DOE), University of California, NIH, Ontario Cancer Institute; numerous workshops and meeting sessions on structural genomics
February, 1998–January, 1999    NIH holds three workshops on a Protein Structure Initiative
June, 1999                      NIH calls for center proposals and technology development proposals as part of the Protein Structure Initiative
September 30, 2000              NIH centers begin operation

Pilot projects begin

A number of independent and mostly small pilot projects were initiated right after the Argonne meeting, complementing some existing projects that had begun before then (see Table 2 for a list of the major pilot projects). The projects have been supported by diverse sources, with funding for several coming from the US Department of Energy, and additional funding coming from the National Institutes of Health (NIH), the Ontario Cancer Institute, the New Jersey Commission on Science and Technology Initiative, and the University of California. Funding for these projects in 1998–2000 ranged from minimal funding to ~$1.5 million per year. The Ontario project is to be funded at ~$3.4 million per year beginning in the fall of 2000, and the Scripps/Genomics Institute of the Novartis Research Foundation (GNF) effort, involved in both public and private sectors, is funded at ~$6 million per year.

Most of these pilot projects had two principal goals: to demonstrate the overall feasibility of structural genomics, and to develop some of the technology necessary for large-scale structure determination. For feasibility demonstrations, participants in several of these pilot projects chose proteins from thermophilic organisms, hoping to start with the simplest possible case, and reasoning that these proteins would be relatively easy to work with. Other pilot projects chose proteins expected to have novel folds, or proteins from fully-sequenced organisms such as yeast or Haemophilus influenzae that held the promise of yielding important functional information from protein structures.

All the pilot projects have used more or less the same initial strategy: cloning many genes and finding the ones that were highly suitable for structure determination. This process consisted of finding those genes that express well in a simple system, selecting from this set ones that produce protein in soluble form, testing these for crystallization or NMR spectra, and determining the structures of the ones that pass all these tests. All of the projects have found that this process leads to a huge attrition rate at each step, with three-dimensional structures obtained for just 5–20% of the genes that have been cloned. (To be fair, these projects are ongoing and many of the other cloned genes may yield structures later.) Table 2 lists the numbers of structures determined over the past two years by some of these pilot projects. Altogether, these pilot projects have produced over 70 new protein structures, with the Toronto-based consortium alone contributing ~20 of these.

Although the pilot projects have already generated an impressive number of structures, obtaining them required considerable resources and effort, and most people in the field feel that substantial improvements in technology will be required to make structure determination a high-throughput process. Several of the pilot projects also have major components focused on technology development. The Rutgers group, for example, is focusing on development of high-throughput methods for NMR structure determination. The Argonne group, on the other hand, is focusing on developing high-throughput methods for synchrotron-based X-ray crystallography, and the Scripps/GNF group has been entirely focused on technology development up to this point.

NIH workshops

During the year following the Argonne workshop, the NIH held a series of three workshops to discuss the possibility of a large-scale publicly-funded effort in structural genomics (see http://www.nigms.nih.gov/funding/psi.html for a comprehensive discussion of the NIH program). The first workshop focused on the scope of a possible structural genomics project. The participants concluded that a project to determine a representative set of a few thousand protein structures would be the right scale to be useful in understanding the structures and functions of most other proteins. Importantly, the workshop conclusions also noted that the infrastructure and technologies that would be developed in the course of such a project would transform the way structure determination is done in the future.

The second workshop focused on the appropriateness of a structural genomics project, concluding that such a project would indeed be important, that the technology is nearly ready, and that pilot projects of a substantial size should be supported to assess feasibility and to develop additional technology. There was discussion of whether such a project would take funding away from existing projects; the NIH representatives assured participants that new funding would be sought.

The third workshop centered on target selection. Two general approaches to selecting targets for structural genomics were discussed. The first was to organize protein sequences into families at a level of ~30% sequence identity, and to determine just one representative of each family. The other approach was to focus on proteins with clear biological importance. Although there was not a general consensus on one approach or the other, a goal for the global structural genomics efforts of 10,000 structures was thought by many to be reasonable.

Many meetings during 1998–1999 focused on or contained sessions on structural genomics. A particularly influential meeting was held in the fall of 1998 in Avalon, New Jersey. At this

rather than focus. Even in this age of rapid communication and remote X-ray and NMR data collection, geographical proximity remains important.

The seven new NIH structural genomics centers have substantially varying emphases even though all of them are designed to carry out all aspects of structural genomics. Each of the centers plans to obtain hundreds of new protein structures and expects these to contain many structures that represent families of protein structure that previously had no representatives with known structure. Each of the centers also plans to develop new technologies that will allow a full-scale structural genomics effort to succeed. The technologies to be developed by the different centers address different bottlenecks in structure determination. The Joint Center for Structural Genomics and the Northeast Consortium for Structural Genomics, for example, plan to develop and use high-throughput robotics crystallization devices to set up and analyze tens of thousands to as many as 130,000 crystallization experiments in a day, hoping that even a tiny percentage of these will yield crystals. They are also developing robotics equipment for all other steps in high-throughput protein production. The TB Structural Genomics Consortium, in contrast, plans to place more emphasis on the earlier bottleneck of protein expression, and expects to use in vitro evolution-based methods to engineer its protein targets to increase solubility. Almost all of the centers plan to develop automated procedures for X-ray data collection and analysis, and the Northeast Consortium for Structural Genomics plans additionally to automate NMR data analysis.

The choice of targets