Structural Genomics: an Approach to the Protein Folding Problem
Commentary

Gaetano T. Montelione*

Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers University, and Department of Biochemistry, Robert Wood Johnson Medical School, University of Medicine and Dentistry of New Jersey, Piscataway, NJ 08854-5638

PNAS | November 20, 2001 | vol. 98 | no. 24 | 13488–13489 | www.pnas.org/cgi/doi/10.1073/pnas.261549098

See companion article on page 12896 in issue 23 of volume 98.

*To whom reprint requests should be addressed at: Center for Advanced Biotechnology and Medicine, Rutgers University, 679 Hoes Lane, Piscataway, NJ 08854-5638. E-mail: [email protected].

The large-scale genome sequencing projects present tremendous new opportunities for structural biology and molecular biophysics. This explosion of biological information provides novel insights into molecular evolution and molecular genetics, new reagents for molecular biology, and exciting new avenues for molecular medicine. However, to fully realize the value of these genetic blueprints, further investment is required to characterize the biological functions and three-dimensional structures of the corresponding gene products. These efforts, broadly characterized as functional and structural genomics, have the potential to provide a unified understanding of molecular biology from atomic to cellular levels.

During the last few years, several international efforts have been initiated with the common goal of genomic-scale three-dimensional (3D) protein structure determination (for a summary of international structural genomics centers and consortia, see http://www.rcsb.org/pdb/strucgen.html#Worldwide). Driven by the availability of many complete genome sequences, recent technological advances in rapid 3D structure analysis (1–5), and the integrative thinking of bioinformatics (6–10), these efforts aim to provide a coarse sampling of the space of 3D protein structures. On the basis of clustering proteins into homologous sequence families, it has been estimated that high-resolution structure determinations of some 15,000–20,000 carefully selected proteins will enable accurate modeling of hundreds of thousands of protein structures (10). As well as being useful in their own right, such models can provide the basis for rapid analysis of x-ray crystallographic or NMR data, facilitating experimental high-resolution structure determinations. A recent issue of PNAS includes a report (11) from the New York Structural Genomics Research Consortium (NYSGRC) describing the x-ray crystal structures of two proteins involved in sterol/isoprenoid biosynthesis and the amplification of these structural data by homology modeling. This study is particularly noteworthy as a model of the kinds of information and analyses that will be available as recently funded structural genomics centers and consortia around the world (12–15) come up to speed.

Although the vision of structural genomics is laudable, the feasibility of such an undertaking is, at the very least, controversial. It remains to be demonstrated that “high throughput” protein production and 3D structure analysis are feasible, that the resulting structures and biological insights are unique relative to ongoing traditional structural biology efforts, and that approaches for amplifying the resulting structural information are valuable. Fundamental to the scientific validity and impact of structural genomics is the target selection process (6–10). Ideally, structural genomics efforts focus on targets with high leverage value, either as members of large protein families across which the structural information can be amplified, or selected on the basis of functional genomics criteria for which broad biological information is available. Obviously, targets should be selected for which structural information is limited or which complement in valuable ways the structural information already available for that family. This target selection process generally involves significant input from bioinformatics and/or functional genomics analyses (9, 10).

In their current work, Bonanno et al. (11) describe high-quality x-ray crystal structures of the 396-residue yeast mevalonate-5-diphosphate decarboxylase (MDD) and 182-residue Escherichia coli isopentenyl diphosphate isomerase (IDI) enzymes. Both structures were determined by using multiwavelength anomalous diffraction (MAD; ref. 1), a rapid method of x-ray structure determination that exploits multiwavelength synchrotron x-ray radiation together with unique diffraction properties of certain atoms to efficiently determine the phases of the diffraction data required to determine the protein structure. In this study, MAD was enabled by biosynthetic incorporation of selenomethionine (SeMet) residues into the proteins, and data were collected at the National Synchrotron Light Source at Brookhaven National Laboratory in Upton, NY, or the Cornell High Energy Synchrotron Source in Ithaca, NY. MAD techniques using synchrotron radiation (1–3) represent a critical enabling technology for high throughput structure analysis by x-ray crystallography, underpinning the feasibility of the proposed genomic-scale structure projects (12–15).

MDD and IDI function at sequential steps in the biosynthetic pathway of sterols and other natural products. MDD catalyzes the last of three sequential ATP-dependent reactions that convert mevalonate to isopentenyl diphosphate, whereas IDI catalyzes interconversion of isopentenyl diphosphate and dimethylallyl diphosphate, which condense in the next step of this biosynthetic pathway. Other enzymes in this pathway exhibit sequence similarities with MDD, suggesting potential structural and evolutionary relationships.

It is especially noteworthy that the NYSGRC has focused on multiple proteins from a common biosynthetic pathway. This is an important theme for structural genomics activities. Having structures and protein reagents in hand, the group is now in a position to further leverage their structural studies in experimental and computational analyses of protein–protein interactions, studies of the functional complementarity of these enzymes, and in structural/functional studies of other members of the pathway.

Another important feature of the current analysis involves the use of homologous structures to circumvent practical challenges presented by sample preparation. Although the yeast IDI protein could be produced and purified, it exhibited aggregation properties that confounded crystallization efforts. The group was able to circumvent these problems by producing and crystallizing the E. coli IDI homologue. This structure provided a useful model of the yeast IDI protein. Moreover, although the yeast IDI protein was not crystallized, the group now has access to both MDD and IDI yeast protein samples, facilitating downstream biochemistry efforts that might require multiple members of the pathway from the same organism.

The structural information for MDD and IDI was amplified by homology modeling, using […] quality scores” were generated for 379 proteins, spanning a substantial fraction of both superfamilies.

Particularly instructive was the modeling of the GHMP kinase superfamily based on the MDD and homologous Methanococcus jannaschii homoserine kinase (HSK) x-ray crystal structures. These two x-ray crystal structures allow generation of good quality models for 181 proteins, belonging to 2 of 19 discrete GHMP subfamilies. To provide models for most of the remaining members of this superfamily, an additional 17 or more experimental x-ray or NMR structures will be required. These results clearly demonstrate that in some cases it will be necessary to intelligently sample multiple experimental structures from each protein superfamily. On the other hand, reliable models could be generated for two other enzymes of the same yeast sterol/isoprenoid biosynthesis pathway, mevalonate kinase and phosphomevalonate kinase, which work […]tures from high-quality experimental structures. The expanding database of high-resolution experimental protein structures, combined with improved methods for accurate homology modeling and/or rapid experimental data analysis using these structures as starting points, provides a bona fide avenue for generating reliable 3D structural information on a genomic scale.

Although this vision is very exciting, it by no means addresses some of the more challenging systems for which structural information is required. Large swaths of the structural landscape are inaccessible to the rapid data collection and analysis strategies currently being developed for structural genomics. For example, these opportunistic methods cannot yet be applied to the very important classes of integral membrane proteins or to structure analysis of large macromolecular complexes that require dedicated research efforts to reconstitute. Even some “simple” proteins will not be tractable to “high throughput” crystallization or NMR screening methods, and can only be addressed by specific research programs