ANNUAL REPORT for July 2004 – June 2005
Total Page:16
File Type:pdf, Size:1020Kb
ANNUAL REPORT for July 2004 – June 2005 RESEARCH COLLABORATORY FOR STRUCTURAL BIOINFORMATICS: Rutgers, The State University of New Jersey San Diego Supercomputer Center, University of California, San Diego About the Cover The data that the RCSB PDB tracks from structural genomics projects – from target selection to structure determination – are illustrated here. The structures shown are from the nine pilot centers participating in the first phase of the NIH-funded Protein Structure Initiative. 1nkt Crystal structure of the SecA protein 1t3u translocation ATPase4 (interior) Mycobacterium tuberculosis Unknown conserved bacterial (TB) Structural 1rjj protein from Pseudomonas Genomics Consortium Solution structure of a aeruginosa PAO1 homodimeric hypothet- New York Structural ical protein from Genomics Research Arabidopsis thaliana1 1nc7 Consortium Center for Eukaryotic Crystal structure of Structural Genomics Thermotoga 1sq4 maritima 1070 Crystal structure of Midwest Center for the putative Structural Genomics glyoxylate induced protein from Pseudomonas aeruginosa Northeast Structural Genomics Consortium 1xn4 Putative Mar1 ribonuclease from Leishmania major Structural Genomics of Pathogenic Protozoa Consortium 1n0e Crystal structure of a cell division and cell wall biosynthesis protein 1o4v UPF0040 from Crystal structure of phospho- Mycoplasma pneumoniae 5 ribosylaminoimidazole Berkeley Structural mutase PurE (Tm0446) from 1pry 3 Genomics Center Thermotoga maritima Structure determination of Joint Center for fibrillarin homologue from Structural Genomics hyperthermophilic archaeon Pyrococcus furiosus2 Southeast Collaboratory for Contents Structural Genomics About the Cover . .2 Data Query and Reporting . .9 Message from the Director . .3 PDBML/XML Data Uniformity Files . .10 What is the RCSB PDB? . .4 Time-stamped Copies of PDB Archives . .10 PDB Users . .4 Browsing for Structures... .10 The RCSB PDB Team Mission . .5 Related Resources . .11 RCSB PDB Management . .5 Structural Genomics . .11 Funding . .5 Physical Archives . .11 wwPDB . .6 Recent Publications . .11 Advisory Committee . .6 Data Dictionaries: mmCIF and PDB . .12 RCSB PDB Services: Data Deposition . .7 NMR Structures . .12 Data Input: Deposition, Validation, and Annotation . .7 Community Interactions . .12-13 Software for Deposition and Validation . .7 Molecule of the Month Series . .13 Statistics . .8 Other Educational Resources . .13 Annotators work to represent PDB data... .8 Workshops . .14 ADIT Depositions . .8 Important Web Addresses & Help Desks . .14 RCSB PDB Services: Data Query, Reporting and Access . .9 References . .15 Data Distribution and Access . .9 2 RCSB PDB Annual Report Message from the Director The RCSB PDB is uniquely positioned to track the growth of structural biology. In terms of shear volume, the numbers are impressive. More than one hundred structures are deposited every week, each with its own story to tell. Some come from small laboratories that focus on a single aspect of biology aimed at cracking the mystery of how proteins function. Others come from structural genomics groups where high throughput methods make it possible to rapidly determine structures on a genomic scale and give us an even broader view of the many possible shapes of proteins. Now we can see structures of large macro- molecular machines that have been determined using a combi- nation of powerful methods, including cryo-electron microscopy. The RCSB PDB is up to the challenge of handling the data produced by all of these initiatives with methods for managing both its volume and complexity. It is both daunting and excit- ing to be part of this era in which the secrets of biology are being revealed at the molecular level. The RCSB is not alone in managing the data from these new initiatives. Through the worldwide PDB (wwPDB) we are working with our partners in Japan (the PDBj) and Europe (MSD-EBI) to ensure that the data remain uniform and freely available. This report presents the many different RCSB PDB activities in data deposition, data access and education. We hope you find that it reflects the excitement we feel about our contribution to the burgeoning field of structural biology. Helen M. Berman Director, RCSB PDB 1y0q Golden, B. L., Kim, H., Chase, E.: (2005) Crystal Structure of a phage Twort group I ribozyme-product complex. Nat Struct Mol Biol 12, 82-89. July 2004 – June 2005 3 What is the RCSB PDB? SNAPSHOT – July 1, 2005 31535 released atomic coordinate entries The structures housed in the PDB range from small pieces of Molecule Type protein or DNA to complex machines, such as viruses and 28739 proteins, peptides, and viruses ribosomes. Each molecule plays a part in at least one biological 1481 nucleic acids process and each has value in helping scientists unravel the 1302 protein/nucleic acid complexes mysteries of life. Their structural shapes provide insight about 13 carbohydrates fundamental biological processes and in some cases, about disease Experimental Technique or drug interactions. 26932 diffraction and other 4603 NMR The PDB archives include a wide variety of medically important structures, including enzymes and other proteins associated with influenza, HIV, SARS and other viruses; parts of prion pro- The RCSB stores these data in a relational database that can be teins (including the bovine form implicated in Mad Cow accessed using several different tools for searching and creating Disease); the amyloid peptide associated with Alzheimer’s graphical reports. The RCSB PDB website also offers other disease; and the p53 tumor-suppressor protein associated with a resources for associated aspects of structural biology, including wide variety of human cancers. structural genomics, data representation formats, and down- loadable software. The website includes a variety of The information in the PDB about materials for learning about structural biology these biological macromole- and the PDB, and for interacting with cules holds significant the multifaceted interests of the promise for the PDB user communities. pharmaceutical and biotechnology Data residing in the PDB industries in the results from research projects search for new all over the world. In recog- drugs and in the nition of the truly interna- effort to understand the tional nature of the PDB mystery of human disease. and the critical importance of maintaining that data in a single The data stored in the PDB archive, the worldwide PDB (wwPDB) 8 include the three-dimensional was established in 2003 . All wwPDB coordinates for the structures of sites share responsibilities for data these biological macromolecules deposition, processing, and distribution of the PDB archives, and support the standardization of structure data that have been determined experimen- 1u0n tally using X-ray crystallography, Fukuda, K., Doggett, T., (see also page 6). Laurenzi, I. J., Liddington, NMR, and cryo-electron microscopy. R. C., Diacovo, T. G. Understanding what these structures (2005) The snake venom PDB Users look like and being able to ‘see’ how protein botrocetin acts as a biological brace to promote As the sole repository for three-dimensional structure data of they bind to various ligands greatly aids dysfunctional platelet aggre- biological macromolecules, the PDB is a critical resource for in understanding how they function. gation. Nat Struct Mol Biol 12, 152-159. academic, pharmaceutical, and biotechnology research. Structural biologists in industry and academia who focus on The importance of the data related to experimental structure determinations work with wwPDB staff three-dimensional structures was recognized early in the history to make these data available in the archives. Researchers from a of structural biology. The scientific community felt strongly wide variety of disciplines then use these data in their own that these data needed to be freely available to people in all studies, which can range from structural genomics to computa- fields of research. As a result, in 1971 the Protein Data Bank tional biology to structural biology and beyond. (PDB) was founded at Brookhaven National Laboratory6 as the sole international repository for three-dimensional structure Students and teachers also depend upon the RCSB PDB's collec- data of biological macromolecules. In 1999, management was tion of resources to help elucidate various subjects related to bio- assumed by the non-profit consortium the Research Collaboratory logical structure. Resources available include RCSB tools to find for Structural Bioinformatics7. and visualize proteins, an education-centered webpage with materials for all levels of understanding, and the popular This online repository contains the coordinates and related Molecule of the Month feature that regularly highlights different information for more than 31,000 structures. molecules from the PDB (page 13). 4 RCSB PDB Annual Report What is the RCSB PDB? RCSB PDB Management The RCSB PDB is managed by Rutgers, The State University of New Jersey and the San Diego Supercomputer Center (SDSC), an organized research unit of the University of California, San Diego (UCSD) – two member institutions of the RCSB. The group from the Center for Advanced Research in Biotechnology/University of Maryland Biotechnology Institute/National Institute of Standards and Technology is no longer part of the RCSB consortium. The Project Leaders manage the overall operations. Helen M. Berman, a Board of Governors Professor of Chemistry and Chemical Biology at Rutgers, is the Director of the RCSB