Population statistics of protein structures: lessons from structural classifications

Steven E Brenner∗§, Cyrus Chothia† and Tim JP Hubbard‡

Structural classifications aid the interpretation of proteins by describing degrees of structural and evolutionary relatedness. They have also recently revealed strikingly skewed distributions at all levels; for example, a small number of folds are far more common than others, and just a few superfamilies are known to have diverged widely. The classifications also provide an indication of the total number of superfamilies in nature.

Addresses
∗Structural Biology Centre, National Institute for Bioscience and Human-Technology, Higashi 1-1, Tsukuba, Ibaraki 305, Japan and MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK; e-mail: [email protected]
†MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK; e-mail: [email protected]
‡Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK; e-mail: [email protected]
§Address from August 1997: Department of Structural Biology, Stanford University, Stanford, CA 94305-5400, USA

... classification hierarchy and indicates how levels in the different databases relate. See Brenner et al. [10•] for a detailed description of the hierarchical levels in the scop classification.

Whereas most proteins in the PDB have just a single domain, many larger proteins have two or more. Most databases split such proteins into their constituent domains, as these are typically the fundamental units of both structure and evolution. Individual domains within a large protein may therefore be considered independently.

The principal structural similarity between two proteins is the 'fold', which indicates that two proteins (or domains of proteins) share the same core secondary-structure elements in the same arrangement. Although proteins with the same fold must have the same core, the peripheral structural elements may differ greatly, including elaboration and insertion of other domains.
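
As a rough illustration of treating domains, rather than whole proteins, as the unit of classification, the following minimal Python sketch records each domain separately and counts how many domains carry each fold label. The `Domain` record, its field names, the `fold_populations` helper and the toy labels (`fold_A`, `fold_B`, `P1` to `P3`) are all hypothetical and are not drawn from scop or any other database; real classifications define folds by expert or automated structural comparison, which this sketch does not attempt.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """One structural domain of a protein (all fields are illustrative)."""
    protein_id: str  # hypothetical identifier of the parent protein
    start: int       # first residue of the domain (inclusive)
    end: int         # last residue of the domain (inclusive)
    fold: str        # placeholder fold label assigned by a classification


def fold_populations(domains):
    """Count the domains carrying each fold label, most common first."""
    return Counter(d.fold for d in domains).most_common()


# Toy records, not real database entries: protein "P2" has two domains,
# and each one is classified on its own.
domains = [
    Domain("P1", 1, 120, "fold_A"),
    Domain("P2", 1, 95, "fold_A"),
    Domain("P2", 96, 210, "fold_B"),
    Domain("P3", 1, 150, "fold_A"),
]

populations = fold_populations(domains)
print(populations)
```

Counting per domain means that a two-domain protein contributes to two fold populations, in keeping with the view that individual domains may be considered independently.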