<<

Research Article 287

Crystal structure of throws light on a superfamily of adenylate-forming Elena Conti, Nick P Franks and Peter Brick*

Background: Firefly luciferase is a 62 kDa protein that catalyzes the production Address: Biophysics Section, Blackett Laboratory, of light. In the presence of MgATP and molecular , the oxidizes its Imperial College, London SW7 2BZ, UK. substrate, firefly , emitting yellow-green light. The reaction proceeds *Corresponding author. through activation of the substrate to form an adenylate intermediate. Firefly luciferase shows extensive sequence homology with a number of enzymes that Key words: acyl- , adenylate, utilize ATP in adenylation reactions. firefly luciferase, peptide synthetase, X-ray crystallography

Results: We have determined the crystal structure of firefly luciferase at 2.0 Å Received: 30 Nov 1995 resolution. The protein is folded into two compact domains. The large N-terminal Revisions requested: 21 Dec 1995 domain consists of a ␤-barrel and two ␤-sheets. The sheets are flanked by Revisions received: 15 Jan 1996 ␣-helices to form an ␣␤␣␤␣ five-layered structure. The C-terminal portion of the Accepted: 31 Jan 1996 molecule forms a distinct domain, which is separated from the N-terminal domain Structure 15 March 1996, 4:287–298 by a wide cleft. © Current Biology Ltd ISSN 0969-2126 Conclusions: Firefly luciferase is the first member of a superfamily of homologous enzymes, which includes acyl-coenzyme A and peptide synthetases, to have its structure characterized. The residues conserved within the superfamily are located on the surfaces of the two domains on either side of the cleft, but are too far apart to interact simultaneously with the substrates. This suggests that the two domains will close in the course of the reaction. Firefly luciferase has a novel structural framework for catalyzing adenylate-forming reactions.

Introduction The firefly uses this enzyme to emit flashes of light to Bioluminescent organisms are widely distributed through- attract its mate. The enzyme is a 62 kDa molecular weight out terrestrial and aquatic environments and include lumi- [2] but, unlike most other , no nous bacteria, insects, marine coelenterates and crustacea prosthetic group is involved in the reaction. The enzyme [1]. The biological function of luminescence varies from requires ATP, molecular oxygen and the heterocyclic com- species to species, and ranges from distracting predators to pound to generate light in a two-step attracting prey or mating partners. The production of light process [3] (Fig. 1). The reaction has unusual kinetics in arises from the conversion of chemical energy into an that firefly luciferase turns over very slowly [4]. After an excited electronic state, energetic enough to result in the initial flash of light, the luminescence rapidly decreases to emission of a photon of visible light. This is achieved by a low level of emission, probably due to product inhibition oxidation of a substrate (S) to an excited-state product of the enzyme. The quantum yield is the highest known (P*), which then decays to the ground state (P) emitting for any bioluminescent reaction [5], with nearly one photon light (h␯). of light emitted for every luciferin molecule oxidized. S + O → P* → P + h␯ 2 Firefly luciferase has been extensively used in molecular The reaction is catalyzed by enzymes called . and cell biology, in particular for the efficient detection While luciferase enzymes from different species all utilize and quantification of ATP and as a reporter of genetic an oxidation reaction, these reactions involve a variety of function [6]. Luciferase has also been studied as a model different cofactors and unrelated substrates and proceed for possible protein–anaesthetic interactions, being one of through distinct reaction pathways. Indeed, the biological the few soluble proteins sensitive to a wide range of and biochemical diversity of the bioluminescent systems general anaesthetics [7]. suggests that the ability to make light arose from many separate origins during evolution [1]. The catalytic formation of an enzyme-bound adenylate intermediate is a common mechanism used by both Luciferase from the firefly (EC 1.13.12.7) is prokaryotic and eukaryotic organisms to activate sub- a peroxisomal protein found in the light-emitting organ strates. Enzymes as diverse as aminoacyl-tRNA syn- known as the lantern within the abdomen of the insect. thetases (aaRSs) and fatty-acyl coenzyme A (CoA) ligases 288 Structure 1996, Vol 4 No 3

Figure 1 Figure 2

The reaction catalyzed by firefly luciferase. In the first (activation) step, the substrate D-luciferin (Luc-COOH; 4,5-dihydro-2-[6-hydroxy-2-benzo- thiazolyl]-4-thiazolecarboxylic acid) is activated by acylation of its carboxylate group with the ␣-phosphate of ATP, resulting in the formation of an enzyme-bound luciferyl adenylate (Enzyme:Luc-CO-AMP) and release of pyrophosphate (PPi). In the second (oxidation) step, luciferyl adenylate reacts with stoichiometric amounts of molecular oxygen, via an ␣-peroxylactone intermediate [61], producing an enzyme-bound excited- state product (Enzyme:Luc=O*). This excited-stated product then decays to the ground state (oxyluciferin; Luc=O) emitting yellow-green light. link the carboxyl group of their substrates to the phospho- ryl moiety of AMP, and subsequently transfer the acti- vated substrate to an acceptor such as tRNA or CoA. The primary sequence of firefly luciferase is unrelated to that of the aaRSs, but shares extensive sequence similarity with acyl-CoA ligases [8] and with enzymes involved in the synthesis of linear and cyclic polypeptides in fungi and bacteria [9]. On the other hand, firefly luciferase shares no sequence homology with the light-emitting enzymes from bacteria, coelenterates and crustacea.

Interest in luciferases was stimulated by Harvey’s discov- ery, 70 years ago [10], that virtually all bioluminescent reactions require oxygen. However, until quite recently, Ribbon representations of the firefly luciferase molecule shown in two little structural information was available on light-emitting orthogonal views. The three subdomains of the large N-terminal domain enzymes. Last year, Fisher and co-workers [11] deter- are shown in blue (␤-sheet A), purple (␤-sheet B) and green (␤-barrel) and the small C-terminal domain is shown in yellow. Disordered loops mined the crystal structure of luciferase from the bac- are drawn in violet. (Generated using the program MOLSCRIPT [62].) terium Vibrio harveyi. This bacterial luciferase is a flavin monooxygenase that catalyzes the oxidation of a long- chain aldehyde and reduced flavin mononucleotide. We Results and discussion present the three-dimensional crystal structure of the The three-dimensional crystal structure was determined firefly luciferase from Photinus pyralis, which has been by the multiple isomorphous replacement (MIR) method, determined at 2 Å resolution. despite significant non-isomorphism upon binding of Research Article Crystal structure of firefly luciferase Conti, Franks and Brick 289

Figure 3

Stereo ␣-carbon trace viewed in a similar orientation to Figure 2a, with some sequence numbering for reference.

heavy atoms. The structure was refined to 2 Å resolution consists of six parallel and two antiparallel ␤-strands and, with a crystallographic R-factor of 22.4% and a free together with six helices, is built up from two non-con- R-factor of 26.5%. The molecular model has good stereo- tiguous portions of the polypeptide chain: the polypeptide chemistry and includes 523 out of 550 residues. segments 22–70 and 236–351.

Overall structure The core of the each ␤-sheet consists of parallel strands The firefly luciferase molecule folds into two distinct joined to ␣-helices with standard right-handed cross-over domains (Figs 2,3). The major portion of the structure, com- connections, resulting in the arrangement of helices on prising residues 4–436, consists of a compact domain con- either side of the sheet, antiparallel to the strands. A taining a distorted antiparallel ␤-barrel and two ␤-sheets, similar topology to that of the ␤-sheet core is present in which are flanked on either side by ␣-helices. The C termi- other proteins, as detected by comparison of the firefly nus of the protein (440–544) forms a small separate ␣+␤ luciferase molecule with known protein structures using domain. The secondary structure has been assigned on the the program DALI [15]. In particular, the superposition of basis of main-chain hydrogen bonding using the algorithm the chemotactic protein CheY [16] on ␤-sheet A results in of Kabsch and Sander [12]. Figure 4 shows a schematic 56 topologically equivalent ␣-carbons with an overall root representation of the molecular topology and the location of mean square (rms) separation of 1.8 Å. the subdomains in the primary sequence. N-terminal ␤-barrel subdomain N-terminal ␤-sheet subdomains The two ␤-sheets pack together and create a long surface The two ␤-sheet subdomains in the large N-terminal groove, shaped by the C termini of the strands and by the domain are assembled to form a five-layered ␣␤␣␤␣ tertiary N termini of the helices between the sheets. The groove structure, so that two ␣-helices are sandwiched between the is closed at one end by the presence of the antiparallel sheets and the other helices are packed against the outer ␤-barrel. The barrel is distorted in that it is formed by faces. A similar arrangement has been reported for the three distinct faces. Two sides each consist of three- structures of 3-ketoacyl-CoA thiolase [13] and inositol stranded antiparallel ␤-sheets with one of the ␤-strands monophosphatase [14], but the topology and the relative (C2) also hydrogen bonding to the last strand of ␤-sheet B. orientation of the sheets is different in each case. In the The third side of the barrel is formed by strands 7 and 8 of luciferase structure, the two ␤-sheets are tilted so that the ␤-sheet B and by the loop connecting them. The packing C termini of the parallel strands are closer together. of the barrel against the side of the two sheets forms two shallow depressions on the concave surface of the mol- The two ␤-sheet subdomains share a similar topology ecule. The two depressions and the groove, which arise (Fig. 4a). Each sheet is composed of eight strands, and dis- from the packing of the three N-terminal subdomains, plays the right-handed twist characteristic of parallel form a Y-shaped system of ‘valleys’ on the surface of the ␤-sheets when viewed along the strands. The ␤-sheet A, large domain opposite to the small C-terminal domain. which includes five parallel and three antiparallel ␤-strands with six associated helices, is constructed from a C-terminal domain single portion of the polypeptide chain (77–222), except A wide cleft separates the large domain from the small for strand A8 and helix 12 (399–405). The ␤-sheet B C-terminal domain, which forms a ‘lid’ over the ␤-barrel 290 Structure 1996, Vol 4 No 3

Figure 4

Overall topology of the firefly luciferase enzyme. (a) Topological diagram illustrating the secondary structure elements coloured by subdomains. The circles represent ␣-helices, which have been numbered sequentially, and the arrows represent ␤-strands, which have been numbered sequentially for each of the five ␤-sheets, A–E. The secondary structural elements have been assigned using the program DSSP [12]. (b) Schematic drawing showing the location of the subdomains in the sequence, using the same colouring scheme.

and the last strands (7 and 8) of the ␤-sheets (Fig. 2). The is believed to have arisen as a result of gene duplication. hinge connecting the two distinct portions of the molecule However, in the firefly luciferase structure, the symmetry- is disordered in the crystal structure and is presumably related central regions of the two sheets correspond to flexible. The C-terminal domain contains two short non-contiguous stretches of sequence (see Fig. 5b) and antiparallel ␤-strands and a three-stranded mixed ␤-sheet, therefore these two modules could not have arisen by with three helices packed against the sides. This type of simple gene duplication without at least one translocation fold has been classified as an ␣+␤ structure. event. If, alternatively, a repeated structural module is taken to consist of ␤-strands B1, B2, A1, A2, A3, A4 and Internal structural homology ␣-helices 2, 3, 4 (Fig. 5c), then the corresponding ele- In the large N-terminal domain of the molecule, the ments A6, A7, B3, B4, B5, B6 and 7, 8, 9 are again related central regions of the two ␤-sheet subdomains share a by twofold symmetry, with a rotation of 178.5° and a trans- similar structure. These regions can be superposed such lation of 0.4 Å. In this case the rms separation for the 87 that 87 pairs of topologically equivalent ␣-carbon atoms, corresponding C␣ atoms increases to 2.1 Å, but the two have a separation ≤2.5 Å, giving an overall rms separation modules are arranged linearly along the sequence. of 1.6 Å. The two ‘modules’ are approximately related by twofold symmetry (Fig. 5a), with a rotational component A comparison of the 87 structurally related amino acids in of 178.4° and a translational component of 2.0 Å. the luciferase structure shows only a 16% identity. This is in line with other examples of internal structural homology and Examples of internal twofold symmetry have been is consistent with the view that the tertiary structure is con- reported for a number of proteins in which the symmetry served to a greater degree than the primary sequence [17]. Research Article Crystal structure of firefly luciferase Conti, Franks and Brick 291

Figure 5

Internal structural homology. (a) Stereo representation of ␤-sheets A and B of the large N-terminal domain showing the twofold- related secondary structure elements from sheet A (blue) and sheet B (purple), viewed down the local molecular dyad axis. (Generated using the program MOLSCRIPT [62].) (b) Topological diagram of ␤-sheets A and B, highlighting in different patterns the structural elements of the two symmetry- related modules. Secondary structural elements that do not obey the twofold axis are left as open circles and arrows. The figure includes a schematic drawing showing the location of the two modules along the sequence. (c) Topological diagram, as in (b), showing the preferred choice of symmetry- related modules, with their respective locations along the sequence.

Flexible regions density include the 356–358 loop connecting the last

A (2Fobs–Fcalc) electron-density map calculated using data strand of ␤-sheet B to strand C2 of the ␤-barrel sub- between 10 Å and 2 Å resolution shows well-defined elec- domain, and the three N-terminal and six C-terminal tron density for most of the protein. Figure 6 shows a residues of the sequence. The N terminus of the protein region of electron density in the N-terminal domain. The lies in the barrel subdomain, whereas the C terminus is density throughout the major domain, in particular the two located in the small domain, pointing away from the body ␤-sheet subdomains, is generally of better quality than for of the protein into the solvent. the rest of the molecule. This variation is correlated with the distribution of temperature factors along the polypep- The C-terminal Ser-Lys-Leu tripeptide is responsible for tide chain. The average isotropic atomic temperature peroxisomal targeting [18]. Peroxisomal proteins are synthe- factor for all protein atoms in the ␤-sheet subdomains is sized in the cytosol and transported into the organelle post- 23.2 Å2, whereas it increases to 36.2 Å2 and 43.7 Å2 in the translationally, by recognition of a targeting sequence. In the ␤-barrel and the C-terminal domain respectively. luciferase structure this sequence is disordered and exposed to solvent, presumably allowing interaction of the targeting There is no interpretable electron density for a few small signal with its receptor. An analogous observation has been flexible regions. The polypeptide segment 436–440 con- reported for a different N-terminal peroxisomal targeting necting the two domains is disordered in the structure, peptide in the 3-ketoacyl-CoA thiolase structure [13]. together with the 199–203 loop connecting strand A6 to strand A7 in ␤-sheet A and the 524–528 loop connecting Sequence comparison strand E3 to helix 15 in the small C-terminal domain. Firefly luciferase shares significant primary sequence and These three flexible loops include some of the most mechanistic similarities with peptide synthetases and acyl- conserved residues in the homologous enzymes, and CoA ligases. These enzymes catalyze analogous activation are exposed to solvent in the large cleft separating the reactions between ATP and the carboxyl moiety of their N-terminal and C-terminal domains. Other regions of poor substrates [19,20]. 292 Structure 1996, Vol 4 No 3

Figure 6 representative set of 38 proteins that catalyze the acylation reaction using 15 different substrates in a variety of organ- isms (Fig. 7). Despite the high level of similarity, only seven residues (Gly200, Lys206, Glu344, Asp422, Arg437, Gly446 and Glu455) are absolutely conserved in the sequences ana- lyzed, and they are therefore likely to play a crucial role in the binding of ATP and in adenylate formation. The invari- ant residues occur in short, highly motifs. The motif 198[STG]-[STG]-G-[ST]-[TSE]-[GS]-x- [PALIVM]-K206 (where square brackets enclose alternative residues and x represents a hypervariable position) is the signature sequence for the enzyme superfamily [25]. This nine-residue long peptide contains the invariant Gly200 and Lys206 residues. Site-directed mutagenesis of the cor- responding lysine to an residue in a peptide syn- thetase, dramatically decreases the protein activity, while alterations at other positions in the signature motif have a less significant effect [26]. Two other motifs, 340[YFW]- A region of the (2F –F ) electron-density map in the large obs calc [GASW]-x-[TSA]-E344 and 420[STA]-[GRK]-D422, contain N-terminal domain, which has been contoured at the 1␴ level. The electron density is shown in blue and the refined model is coloured the invariant residues Glu344 and Asp422 respectively. according to atom type, with carbon atoms in yellow, in red, Mutations of the residue corresponding to Asp422 in the nitrogens in blue and sulphurs in green. related tyrocidine synthetase results in loss of activity, although this is less severe when an asparagine residue is The most closely related enzymes are the acyl-CoA introduced at this position [26]. ligases. These enzymes activate a variety of different sub- strates, such as acetic acid, aromatic acids and long-chain fatty acids, to the corresponding enzyme-bound acyl- The probable location of the active site in the luciferase adenylates, which are then transferred to the thiol group structure can be identified from the position of the of CoA. The catalysis of acyl transfer to CoA is used by residues which are highly conserved in the three families plants and bacteria for various metabolic functions such as of related enzymes, including firefly luciferase, acyl-CoA the biodegradation of halogenated hydrocarbons [21]. In ligases and peptide synthetases. Many of the well-con- higher eukaryotes, activation of fatty acids is the first served residues are located in the C-terminal portion of step in fatty-acid metabolism. Long-chain fatty-acyl-CoA the sequence, and constitute the core of the ␤-barrel and synthetases catalyze the esterification of fatty acids into of the small C-terminal domain in the three-dimensional metabolically active CoA thioesters, which are then used structure. The surface of the molecule shows distinct either for the synthesis of cellular lipids, or as an energy patches of conserved residues, in particular the surfaces of source following degradation by ␤-oxidation [22]. the N-terminal and C-terminal domains facing each other across the large cleft. The invariant residues in the N-ter- Some regions of sequence conservation are also found in minal domain are located on the surface of the molecule, the family of peptide synthetases, enzymes which are lining the depression caused by the packing of ␤-sheet A involved in antibiotic production in microorganisms [23]. against the ␤-barrel (Fig. 8a). Enzymes such as gramicidin S synthetase, tyrocidine syn- thetase and the first enzyme of the penicillin biosynthetic The signature motif [STG]-[STG]-G-[ST]-[TSE]-[GS]-x- pathway, ␦-(L-␣-aminoadypoil)-L-cysteinyl-D-valine syn- [PALIVM]-K is found in the loop connecting the antiparal- thetase, carry out the non-ribosomal synthesis of small lel strands 6 and 7 of ␤-sheet A, in close proximity to the peptides by activation of the constituent amino, imino and barrel subdomain. No structural analogy can be drawn with hydroxy acids with ATP. The same chemical reaction is other glycine-rich loops found in nucleotide-binding used in the ribosomal synthesis of polypeptides, where proteins. The G-x-x-x-x-G-K-[STG] motif responsible for aaRSs activate amino acids before the specific attachment phosphate-binding in various ATP- and GTP-binding to their cognate tRNA species. In spite of their common proteins [27] has a characteristic structure and connects the catalytic functions, peptide synthetases do not share first strand and the first helix of the classical mono- the sequence motifs that characterize the two distinct nucleotide-binding (or Rossmann) fold. In the luciferase structural classes of aaRSs [24]. structure most of the signature motif is disordered, although interpretable electron density is present for the N-terminal Extensive regions of sequence similarity can be iden- serine (198) and the C-terminal proline (205) and lysine tified from a comparison of the primary sequences of a (206) residues. Lys206 is exposed to solvent and points Research Article Crystal structure of firefly luciferase Conti, Franks and Brick 293

Figure 7

Amino acid sequence of firefly luciferase with the relative positions of the secondary structure elements specified with the same name and colour coding used in Figure 4a. The boxed residues represent the invariant (red) and homologous (more than 50% identical; pink) amino acids, as derived from an alignment of 38 sequences including 9 firefly luciferases, 18 acyl-CoA ligases and 11 activating domains of peptide synthetases. All ligase and peptide synthetase sequences were taken from version 32 of the SWISSPROT database. The firefly luciferase sequences were taken from the SWISSPROT database, the EMBL database and [63].

towards the cleft separating the two domains. The side well-conserved Tyr340 (Fig. 8b). The hydroxyl group of chain of Ser198 is hydrogen bonded to a carboxylate oxygen Ser420 is within hydrogen bonding distance of the back- of Glu344, which in turn is 2.9 Å from the hydroxyl group of bone nitrogens of both Asp422 and Gly421. The latter, in Tyr401 (Fig. 8b). In the homologous tyrocidine synthetase, turn, interacts with the carboxylate of Glu389 (Fig. 8b). the tyrosine residue corresponding to position 401 in luciferase has been identified by photoaffinity labelling as Only a few other residues are found to be invariant in possibly interacting with the adenine ring of ATP [28]. the compared sequences. Arg437 is located in the loop connecting the barrel to the small C-terminal domain The invariant residue Glu344 is present in the 340[YFW]- and is disordered in the structure. In the C-terminal [GASW]-x-[TSA]-E344 motif, which is part of the loop domain, Gly446 is part of the loop joining ␤-strands D1 connecting the antiparallel strands 7 and 8 of ␤-sheet B. and D2, but the electron density is rather poorly defined The well-defined electron density for the 341–348 loop is for this residue. Glu455 is present on the surface of the consistent with a strained conformation at Thr346. In the domain pointing towards the cleft, with its carboxylate luciferase structure, Thr346 is the only residue moiety hydrogen bonded to the backbone nitrogen of presenting dihedral angles (␾=74°, ␺=–65°) outside the Val469. The nearby and well-conserved 523–529 loop is allowed regions of the Ramachandran plot [29]. Energeti- disordered and exposed to solvent in the cleft. cally unfavourable main-chain dihedral angles are rare in proteins and are usually associated with regions of the All the invariant residues in this superfamily of adeny- molecule having a functional role [30]. late-forming enzymes are located on the surfaces of the two domains on the opposite sides of the cleft or in the The third motif 420[STA]-[GRK]-D422 contains the invari- connecting hinge. The cleft is far too big to accommo- ant Asp422, which is exposed to solvent with its carboxy- date the substrates and to allow the simultaneous inter- late groups hydrogen bonded to the side chain of the action of the conserved surfaces, and it seems likely that 294 Structure 1996, Vol 4 No 3

Figure 8

Location of the active site. (a) Stereo representation of the concave molecular surface of the large N-terminal domain of firefly luciferase, looking down onto the Y- shaped system of valleys with ␤-sheet A on the left-hand side, ␤-sheet B on the right-hand side and the ␤-barrel at the top. The colouring scheme represents the level of similarity observed between the sequences of 38 acylating enzymes. The colours range from blue for the hypervariable residues, through white, to red for the three invariant residues Lys206, Glu344 and Asp422. (The surface was calculated and displayed using the program GRASP [64].) (b) Stereo representation of the active site of firefly luciferase, viewed in a similar orientation to (a), showing some of the well-conserved residues in the superfamily of adenylate- forming enzymes. (Generated using the program MOLSCRIPT [62].)

the two domains will come together during the course of The exact location of the luciferin- is not the reaction to sandwich the substrates. This would immediately apparent from the structure. The depression provide a suitable environment for the production of between ␤-sheet B and the barrel subdomain is well con- light, as water molecules must be excluded from the served and a corresponding patch of invariant residues active site to avoid intermolecular quenching of the among the firefly luciferase enzymes lies on the surface of excited-state product. Indeed, rapid-mixing kinetics [4], the C-terminal domain opposite the depression. The hydrogen–tritium exchange experiments [31] and the groove formed by the two ␤-sheets is conserved to a lesser fact that the binding of ATP to the enzyme has a extent. The residues lining the internal cavity are con- significant effect on the binding of luciferin (and vice served amongst luciferase enzymes, but the entrance versa) [32] all point to a substantial conformational appears to be rather too small and the pocket rather too change following substrate binding. hydrophilic to accommodate the luciferin molecule [33]. A number of mutations have been reported to affect the A number of putative substrate-binding clefts are present colour of the emitted light [34,35]. Most of them are in the in close proximity to the positions of the invariant residues ␤-sheet B subdomain, but they are not localized in one in the major domain. Besides the groove and the depres- region of the molecule, and could well cause colour sion formed by the packing of sheet B against the other changes by relatively long-range interactions. two subdomains, a hole leading to an internal cavity is present in the groove near the intersection with the other Firefly luciferase has also been shown to bind CoA two valleys. The cavity contains a dozen ordered water [36,37]. Although the physiological function of CoA molecules and, although buried within the interior of the binding is unclear, addition of CoA to the reaction signifi- molecule, is lined by a number of charged residues. cantly affects the kinetics of light emission, with an Research Article Crystal structure of firefly luciferase Conti, Franks and Brick 295

increase in the glow that follows the initial flash of light. chemical energy of an oxidation reaction into light, As the firefly luciferases show extensive sequence similar- they are biologically and chemically very diverse, ity with CoA ligases, it seems likely that both families of supporting the view that they originated indepen- enzymes bind CoA in a similar manner. dently in the course of evolution.

Adenylate-forming enzymes The luciferase from is an ATP-dependent It has long been known that enzymes involved in a wide enzyme. The light-producing reaction is initiated by range of metabolic functions use the same adenylate- the activation of the substrate, firefly luciferin, through forming reaction to activate substrates [38,39]. Whether a the adenylation of its carboxylate group, and proceeds similar structural framework is used to perform the cataly- in the presence of molecular oxygen to yield a photon sis has been the subject of much speculation. of yellow-green light. Firefly luciferase shares exten- sive sequence similarity with a number of enzymes Structural studies on both classes of aaRSs have shown that use ATP to form adenylate intermediates. Closely that, despite the mechanistic similarities, there is no uni- related enzymes are acyl-coenzyme A ligases, which versal structural fingerprint for adenylate formation [40]. activate various substrates prior to their transfer to The catalytic domain of class I aaRSs has the classical Ross- coenzyme A, and peptide synthetases, which activate mann fold, based on a parallel ␤-sheet. The Rossmann fold amino acids in the non-ribosomal synthesis of poly- is shared by numerous ATP- and GTP-binding proteins, peptides. Firefly luciferase is the first member of this although the position of the bound nucleotide is somewhat enzyme superfamily to have its structure elucidated. variable. Class II aaRSs catalyze the same reaction, but have a catalytic domain containing an uncommon antiparal- Firefly luciferase consists of two distinct domains lel ␤-sheet structure, characteristic of this class of enzymes. separated by a wide cleft. The invariant residues con- The same active-site architecture has been reported served in the three families of related enzymes are all for biotin synthetase [41,42], a protein which activates clustered on the surfaces of the two domains facing vitamin H. A similar structure has also been proposed for each other across the cleft, suggesting that this is the another activating enzyme, asparagine synthetase, on the location of the active site of the enzyme. The two basis of sequence homology with some class II aaRSs [43]. conserved surfaces, however, are too far apart to Firefly luciferase is the first representative of a superfamily interact simultaneously with the substrates without of related enzymes, including acyl-CoA ligases and peptide there being a significant change in the relative posi- synthetases, to be crystallographically characterized. The tions of the two domains. This suggests that during luciferase enzyme represents another structural framework the course of the reaction the two domains will come that has evolved to catalyze adenylation reactions. together sandwiching the substrates.

The formation of an adenylate intermediate involves the The activation of substrates to enzyme-bound adeny- attack of a fully ionized nucleophile (the substrate car- late intermediates is a common mechanism used in a boxylate ion) and the departure of a good leaving group wide range of metabolic functions, such as ribosomal (the pyrophosphate moiety of ATP). Crystallographic and non-ribosomal peptide synthesis, degradation of studies on the two classes of aaRSs [44,45] have shown fatty acids and light emission. The structure of firefly that the role of the enzyme is to position the ATP and the luciferase bears no relationship to the structures of amino acid substrates correctly for the in-line attack of the class I or class II aminoacyl-tRNA synthetases which ␣-phosphate of ATP by the amino acid carboxylate ion, are involved in the ribosomal synthesis of polypep- and then to stabilize the pentavalent transition state. tides. It therefore represents a third structural frame- Despite having unrelated structures, both classes of aaRSs work, which has evolved independently and is capable use chemically equivalent residues in the binding of ATP of binding ATP and catalyzing adenylate formation. and for stabilization of the negatively charged transition state [40]. The few glycine or charged residues invariant Materials and methods in the superfamily represented by firefly luciferase are Crystallization likely to be involved in similar types of interactions in the The crystals of firefly luciferase used for the structure determination other superfamily members. Further insights into the were obtained by the microbatch technique under oil [46]. Recombi- nant firefly luciferase [2] was purchased from Promega Corporation enzymatic mechanism will require the structure determi- (Southampton, UK) and crystallized as clusters of needles, usually with nation of enzyme–ligand complexes. a cross-section of 0.04–0.08 mm and up to 1.5 mm long. Larger crys- tals with a cross-section of 0.15–0.20 mm could be obtained occa- Biological implications sionally, but they were very fragile and cracked easily upon handling. The emission of light from living organisms is cat- The best crystals grew at 4°C or 10°C when 2 ␮l droplets of the recombinant protein (20 mg ml–1 in 200 mM ammonium sulphate, alyzed by enzymes generically called luciferases. 1 mM EDTA, 1 mM DTT, 10% (v\v) glycerol, 25% (v\v) ethylene glycol, Although the luminescent systems all convert the 25 mM Tris-HCl pH 7.8) were mixed with 2 ␮l of 500–540 mM lithium 296 Structure 1996, Vol 4 No 3

sulphate, 26% (w/v) PEG 8000, 100 mM Tris-HCl pH 7.8. The diffrac- One data set (nat1) was chosen as the reference native for phasing. tion pattern is consistent with space group P41212 and cell dimensions The iridium derivative and native data were combined following a proce- a=b=119.5 Å, c=95.4 Å. The asymmetric unit contains one protein dure similar to that described by Blow and Matthews [50], yielding a molecule, corresponding to a solvent content of about 56% [47]. modified set of derivative structure factors (F*der). The linear combina- tion of two native data sets which most closely matched the derivative Data collection structure-factor amplitudes (Fder) was used to calculate differences Luciferase crystals are extremely susceptible to X-ray radiation which were then added to the chosen reference native structure-factor damage, and room temperature data collection did not prove feasible, amplitudes: F*der=Fnat1+(Fder–Fnatcomb) with Fnatcomb=xFnat1+(1–x)Fnat2 the crystals lasting less than 30 min in the X-ray beam. To perform data where x is the fraction used. This procedure further decreased the Riso collection at cryogenic temperatures, the crystals were introduced for values and improved the phasing statistics for the iridium derivative and a few minutes into a cryoprotectant solution containing 8% w/v for the platinum/iridium double derivative. Both the K2PtCl4 and the PEG 8000, 10% glycerol, 12.5% ethylene glycol and 100 mM Tris-HCl Ir(NH3)5Cl2 derivatives had one heavy atom bound per asymmetric unit, pH 7.8. Crystals were frozen using standard techniques [48] in a enabling the heavy-atom sites to be easily located in the difference Pat- stream of nitrogen gas at 100 K produced by an Oxford Cryosystem terson and difference Fourier maps. The probabilistic approach of the (Oxford, UK). Screening for heavy-atom derivatives was carried out to program MLPHARE [49,51] was used to refine the heavy-atom para- 5 Å resolution on a MarResearch (Hamburg, Germany) image plate meters. The anomalous contributions for the platinum and the plat- detector, with graphite-monochromatized CuK␣ radiation from an Elliott inum/iridium derivatives were included in the refinement. The calculated GX21 rotating anode X-ray generator. The signal-to-noise ratio for the phases gave an overall figure of merit of 0.61 for data between 10.0 Å X-ray intensities scattered by these small and weakly diffracting crys- and 3.0 Å resolution. The 3.0 Å MIR phases were improved by solvent tals greatly improved using intense and highly collimated synchrotron flattening and solvent flipping, a density modification technique in which radiation. Data at 2.8 Å resolution were obtained at the SRS (Dares- the features of the solvent are inverted rather than flattened [52,53]. bury, UK) for the native and the derivative data sets used for phasing. Inspection of the resulting electron-density map showed definite Diffracted intensities out to 2.0 Å resolution were measured at DESY molecular boundaries and extensive regions of secondary structure. (Hamburg, Germany), using an unusually large crystal, frozen directly from the crystallization drop to avoid the handling damage upon har- Model building and crystallographic refinement vesting. The images were evaluated using a modified version of The 3.0 Å resolution MIR map showed two distinct regions of electron MOSFLM (A Leslie, personal communication) for processing image density separated by a wide solvent channel. Interpretation of the elec- plate data and the CCP4 suite [49] was used in the data reduction. tron density allowed chain tracing for most parts of the polypeptide chain, although the quality of the electron density varied significantly Structure determination between the two regions of the map. The interactive graphics program Initial phases were determined by multiple isomorphous replacement O [54,55] was used to display a skeletonized representation of the using a platinum derivative, an iridium derivative and a platinum/iridium electron density, and a polyalanine model of the luciferase molecule double derivative (Table 1). The isomorphism of the native diffraction was built by fitting fragments from a data base of highly refined struc- data was found to be critically dependent upon the conditions used to tures [56]. The amino acid sequence could be fitted unambiguously for cryoprotect the crystals. On binding of heavy metals to the protein, sig- most of the chain tracing, to give an initial model which included 492 nificant non-isomorphism was detected by comparison of the diffracted residues out of a total of 550. intensities, which resulted in Riso values >34% at 3.0 Å resolution. The platinum and iridium derivative data sets were subsequently found to be The model was refined with the program X-PLOR [57], using the Engh more similar to two native data sets (nat1 and nat2 in Table 1, respec- and Huber [58] stereochemical parameters against a 2.0 Å resolution tively), which had been obtained using different freezing conditions. native data set which had not been used for MIR phasing (nat 3;

Table 1

Data collection and phasing statistics.

Data set Native Native Native K2Pt(NO2)4 Ir(NH3)5Cl2 K2Pt(NO2)4/ nat1 nat2 nat3 Ir(NH3)5Cl2 Synchrotron Daresbury Daresbury Hamburg Daresbury Daresbury Hamburg Beamline SRS 9.6 SRS 9.6 DESY BW7B SRS 9.6 SRS 9.5 DESY BW7B Wavelength (Å) 0.87 0.87 0.86 0.87 1.13 0.86 Maximum resolution (Å) 2.8 2.8 2.0 3.0 2.9 2.7 Number of measurements 82 725 58 969 161 801 56 914 33 185 66 700 Independent reflections 17 311 17 198 45 349 13 494 14 929 19 518 Completeness (%)* 98.4 (99.6) 97.1 (97.2) 96.5 (97.3) 98.6 (98.5) 94.5 (92.9) 99.7 (99.3) † Rmerge (%)* 5.4 (15.9) 5.8 (15.9) 4.4 (18.6) 7.5 (16.3) 4.7 (6.4) 5.8 (14.1) Heavy-atom concentration (mM) 1.5 7.0 1.5/7.0 Soak time (h) 12 12 24 Number of sites 112 ‡§ # # Riso (%) 19.3 (23.9) 12.3 (14.0) 22.0 (25.4) Phasing power‡** 1.11 (1.08) 1.35# (1.33) 1.47# (1.48) ‡†† # # RCullis (%) 77 (89) 73 (94) 72 (83)

*Values for the outermost_ resolution shell are given in parentheses. using F*derh=Fderh+(1–x)(Fnat1h–Fnat2h) with x=0.5. Fderh, Fnat1h and † Rmerge=⌺h⌺iԽIh,i–IhԽ/ ⌺h⌺iIh,i, _ Fnat2h are the structure-factor amplitudes for the derivative and the two native where Ih,i is the intensity of a measured reflection i and Ih is the average data sets respectively. **Phasing power=(rms heavy-atom structure ‡ †† intensity for this reflection. Values for the 3 Å outer resolution shell are given factor)/(rms lack of closure). RCullis=(rms lack of closure)/(rms isomorphous § # in parentheses. Riso=⌺hԽFderh–Fnat1hԽ/ ⌺hFnat1h. Statistics calculated difference) for centric reflections only. Research Article Crystal structure of firefly luciferase Conti, Franks and Brick 297

Table 1). Initially, only 3 Å resolution data were included and a round of 10. Harvey, E.N. (1926). Oxygen and luminescence. Bio. Bull. 51, 89–97. rigid-body refinement carried out, with the molecule divided into the two 11. Fisher, A.J., Raushel, F.M., Baldwin, T.O. & Rayment, I. (1995). Three- domains. The crystallographic R-factor for the starting model was 44.0%. dimensional structure of bacterial luciferase from Vibrio harveyi at A random sample containing 5% of the total data (2266 out of 45349 2.4 Å resolution. Biochemistry 34, 6581–6586. 12. Kabsch, W. & Sander, C. (1983). Dictionary of protein secondary unique reflections) was excluded from the refinement and the agreement structure: pattern recognition of hydrogen-bonded and geometrical between calculated and observed structure factors for these reflections features. Biopolymers 22, 2577–2637. (Rfree) was used to monitor the course of the refinement procedure [59]. 13. Mathieu, M., Zeelin, J.P., Pauptit, R.A., Erdmann, R., Kunau, W.-H. & Positional refinement at 3 Å resolution was followed by rounds of refine- Wierenga, R.K. (1994). The 2.8 Å crystal structure of peroxisomal ment in which the maximum resolution of the data was gradually 3-ketoacyl-CoA thiolase of Saccharomyces cerevisiae: a five-layered increased. Beyond 2.4 Å resolution the refinement of the atomic posi- ␣␤␣␤␣ structure constructed from two core domains of identical tions was alternated with the refinement of the atomic temperature topology. Structure 2, 797–808. factors. The resulting molecular model had an R-factor of 32.1% and an 14. Bone, R., Springer, J.P. & Atack, J.R. (1992). Structure of inositol R of 37.8% at 2 Å resolution. Simulated annealing using the 2 Å reso- monophosphatase, the putative target of lithium therapy. Proc. Natl. free Acad. Sci. USA 89, 10031–10035. lution data was followed by conventional positional and B-factor refine- 15. Holm, L. & Sander, C. (1993). Protein structure comparison by ment. This lowered the R-factor to 30.3% and the Rfree to 36.2%, and alignment of distance matrices. J. Mol. Biol. 233, 123–138. improved the stereochemistry of the model. An electron-density map with 16. Volz, K. & Matsumura, P. (1991). Crystal structure of Escherichia coli coefficients (2Fobs–Fcalc) and model phases was calculated, and exami- CheY refined at 1.7 Å resolution. J. Biol. Chem. 266, 15511–15519. nation of the map enabled two wrong connections in the small C-terminal 17. Matthews, B.W. & Rossmann, M.G. (1985). Comparison of to be corrected and a few gaps in the polypeptide chain to be structures. Methods Enzymol. 115, 397–420. closed. Iterative cycles of model building and refinement resulted in a 18. Gould, S.J., Keller, G.A., Hosken, N., Wilkinson, J. & Subramani, S. model which included 95% of the total residues. When the R-factor had (1989). A conserved tripeptide sorts proteins to peroxisomes. J. Cell 108 dropped to 28%, water molecules were added at positions with density Biol. , 1657–1664. 19. Turgay, K., Krause, M. & Marahiel, M.A. (1992). Four homologous ␴ higher than 3 in the (Fobs–Fcalc) map. The last round of refinement was domains in the primary structure of GrsB are related to domains in a carried out including the 5% of data previously omitted. superfamily of adenylate-forming enzymes. Mol. Microbiol. 6, 529–546. The refined model 20. Wood, K.V. (1995). The chemical mechanism and evolutionary 62 The refined model contains 523 residues out of 550, and includes 360 development of beetle . Photochem. Photobiol. , 662–673. water molecules. The omitted residues are located at the N and C termini 21. Scholten, J.D., Chang, K.-H., Babbitt, P.C., Charest, H., Sylvestre, M. & and in a few surface loops. The crystallographic free R-factor is 26.5% Dunaway-Mariano, D. (1991). Novel enzymic hydrolytic using a random sample of 2266 reflections with no ␴ cutoff between dehalogenation of a chlorinated aromatic. Science 253, 182–185. 10 Å and 2 Å resolution. The conventional R-factor is 22.4% using all 22. Suzuki, H., et al., & Yamamoto, T. (1990). Structure and regulation of data between 10 Å and 2 Å resolution. The rms deviation from ideality is rat long-chain acyl-CoA synthetase. J. Biol. Chem. 265, 8681–8685. 0.010 Å for bond distances and 1.6° for bond angles. The main-chain 23. Kleinkauf, H. & von Döhren, H. (1990). Nonribosomal biosynthesis of dihedral angles for the majority of the residues in the refined model peptide antibiotics. Eur. J. Biochem. 192, 1–15. (93%) lie in the most favoured regions of the Ramachandran plot [29,60], 24. Eriani, G., Delarue, M., Poch, O., Gangloff, J. & Moras, D. (1990). Only one amino acid residue (Thr346) has energetically unfavourable Partition of tRNA synthetases into two classes based on mutually exclusive sets of sequence motifs. Nature 347, 203–206. dihedral angles, corresponding to a disallowed region in the Ramachan- 25. Babbitt, P.C., et al., & Dunaway-Mariano, D. (1992). Ancestry of 2 dran plot. The average temperature factor for all protein atoms is 29.9 Å . 4-chlorobenzoate dehalogenase: analysis of amino acid sequence The coordinates are being deposited in the Protein Data Bank. identities among families of acyl:adenyl ligases, enoyl-CoA hydratases/ and acyl-CoA thioesterases. Biochemistry 31, Acknowledgements 5594–5604. 26. Gocht, M. & Marahiel, M.A. (1994). Analysis of core sequences in the We are very grateful to David Blow and Alan Wonacott (Glaxo) for helpful D-Phe activating domain of the multifunctional peptide synthetase discussions and to Lesley Lloyd, John Akins and Siân Rowsell for their TycA by site-directed mutagenesis. J. Bacteriol. 176, 2654–2662. assistance. We thank the staff at both the Synchrotron Radiation Source, 27. Saraste, M., Sibbald, P.R. & Wittinghofer, A. (1990). The P-loop — a Daresbury Laboratory and at DESY, Hamburg. The European Union sup- Trends Biochem. ported the work at EMBL Hamburg through the HCMP Access to Large common motif in ATP- and GTP-binding proteins. Sci. 15 Installations Project, Contract Number CHGE-CT93-0040. EC wishes to , 430–434. acknowledge the support of a Glaxo studentship. 28. Pavela-Vrancic, M., Pfeifer, E., van Liempt, H., Schafer, H.J., von Döhren, H. & Kleinkauf, H. (1994). ATP binding in peptide synthetases: determination of contact sites of the adenine moiety by References photoaffinity labeling of tyrocidine synthetase 1 with 2-azidoadenosine 1. Hastings, J.W. (1983). Biological diversity, chemical mechanisms, and triphosphate. Biochemistry 33, 6276–6283. the evolutionary origins of bioluminescent systems. J. Mol. Evol. 19, 29. Ramakrishnan, C. & Ramachandran, G.N. (1965). Stereochemical 309–321. criteria for polypeptide chain conformations. Allowed conformations 2. De Wet, J.R., Wood, K.V., Helinski, D.R. & DeLuca, M. (1985). Cloning of for a pair of peptide units. Biophys. J. 5, 909–933. firefly luciferase cDNA and the expression of active luciferase in 30. Herzberg, O. & Moult, J. (1991). Analysis of the steric strain in the Escherichia coli. Proc. Natl. Acad. Sci. USA 82, 7870–7873. polypeptide backbone of protein molecules. Proteins 11, 223–229. 3. DeLuca, M. (1976). Firefly luciferase. Adv. Enzymol. 44, 37–68. 31. DeLuca, M. & Marsh, M. (1967). Conformational changes of luciferase 4. DeLuca, M. & McElroy, W.D. (1974). Kinetics of the firefly luciferase during catalyzes. Tritium–hydrogen exchange and optical rotation catalyzed reactions. Biochemistry 13, 921–925. studies. Arch. Biochem. Biophys. 121, 233–240. 5. Seliger, H.H. & McElroy, W.D. (1960). Spectral emission and quantum 32. Moss, G.W.J., Franks, N.P. & Lieb, W.R. (1991). Modulation of the yield of firefly bioluminescence. Arch. Biochem. Biophys. 88, 136–141. general anesthetic sensitivity of a protein: a transition between two 6. Gould, S.J. & Subramani, S. (1988). Firefly luciferase as a tool in forms of firefly luciferase. Proc. Natl. Acad. Sci USA 88, 134–138. molecular and cell biology. Anal. Biochem. 175, 5–13. 33. DeLuca, M. (1969). Hydrophobic nature of the active site of firefly 7. Franks, N.P. & Lieb, W.R. (1984). Do general anaesthetics act by luciferase. Biochemistry 8, 160–166. competitive binding to specific receptors? Nature 310, 599–601. 34. Kajiyama, N. & Nakano, E. (1991). Isolation and characterization of 8. Schröder, J. (1989). Protein-sequence homology between plant mutants of firefly luciferase which produce different colors of light. 4-coumarate:CoA ligase and firefly luciferase. Nucleic Acids Res. 17, Protein Eng. 4, 691–693. 460. 35. Wood, K.V. (1990). Luc genes: introduction of colour into 9. Toh, H. (1990). N-terminal halves of gramicidin S synthetase 1, and bioluminescence assays. J. Biolumin. Chemilum. 5, 107–114. tyrocidine synthetase 1 as novel members of the firefly luciferase 36. Airth, R.L., Rhodes, W.C. & McElroy, W.D. (1958). The function of family. Protein Seq. Data Anal. 3, 517–521. coenzyme A in luminescence. Biochim. Biophys. Acta 27, 519–532. 298 Structure 1996, Vol 4 No 3

37. Pazzagli, M., Devine, J.H., Peterson, D.O. & Baldwin, T.O. (1992). Use of bacterial and firefly luciferases as reporter genes in DEAE-dextran- mediated transfection of mammalian cells. Anal. Biochem. 204, 315–323. 38. Berg, P. (1956). Acyl adenylates: an enzymatic mechanism of acetate activation. J. Biol. Chem. 222, 991–1013. 39. McElroy, W.D., DeLuca, M. & Travis, J. (1967). Molecular uniformity in biological catalyzes. The enzymes concerned with firefly luciferin, amino acid, and utilization are compared. Science 157, 150–160. 40. Delarue, M. (1995). Aminoacyl-tRNA synthetases. Curr. Opin. Struct. Biol. 5, 48–55. 41. Wilson, K.P., Shewchuk, L.M., Brennan, R.G., Otsuka, A.J. & Matthews, B.W. (1992). Escherichia coli biotin holoenzyme synthetase/bio repressor crystal structure delineates the biotin- and DNA-binding domains. Proc. Natl. Acad. Sci. USA 89, 9257–9261. 42. Artymiuk, P.J., Rice, D.W., Poirrette, A.R. & Willet, P. (1994). A tale of two synthetases. Nat. Struct. Biol. 1, 758–760. 43. Gatti, D.L. & Tzagoloff, A. (1991). Structure and evolution of a group of related aminoacyl-tRNA synthetases. J. Mol. Biol. 218, 557–568. 44. Perona, J.J., Rould, M.A. & Steitz, T.A. (1993). Structural basis for transfer RNA aminoacylation by Escherichia coli glutaminyl-tRNA synthetase. Biochemistry 32, 8758–8771. 45. Cavarelli, J., et al., & Moras, D. (1994). The active site of yeast aspartyl-tRNA synthetase: structural and functional aspects of the aminoacylation reaction. EMBO J. 13, 327–337. 46. Conti, E., Lloyd, L.F., Akins, J., Franks, N.P. & Brick, P. (1996). Crystallization and preliminary diffraction studies of firefly luciferase from Photinus pyralis. Acta Cryst. D, in press. 47. Matthews, B.W. (1968). Solvent content of protein crystals. J. Mol. Biol. 33, 491–497. 48. Teng, T.Y. (1990). Mounting of crystals for macromolecular crystallography in a free-standing thin film. J. Appl. Cryst. 23, 387–391. 49. Collaborative Computing Project Number 4 (1994). The CCP4 suite: programs for protein crystallography. Acta Cryst. D 50, 760–767. 50. Blow, D.M. & Matthews, B.W. (1973). Parameter refinement in the multiple isomorphous-replacement method. Acta Cryst. A 29, 56–62. 51. Otwinowski, Z. (1991). Maximum likelihood refinement of heavy-atom parameters. In Isomorphous Replacement and Anomalous Scattering: Proceedings of the CCP4 Study Weekend 25–26 January 1991. (Wolf, W., Evans, P.R. & Leslie., A.G.W., eds), pp. 23–38, SERC Daresbury Laboratory, Warrington, UK. 52. Leslie, A.G.W. (1987). A reciprocal-space method for calculating a molecular envelope using the algorithm of B.C. Wang. Acta Cryst. A 43, 134–136. 53. Abrahams, J.P. & Leslie, A.G.W. (1996). Methods used in the structure determination of bovine mitochondrial F1 ATPase. Acta Cryst. D 52, 30–42. 54. Jones, T.A. (1978). A graphics model building and refinement system for macromolecules. J. Appl. Cryst. 11, 268–272. 55. Jones, T.A., Zou, J.-Y., Cowan, S.W. & Kjeldgaard, M. (1991). Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Cryst. A 47, 110–119. 56. Jones, T.A. & Thirup, S. (1986). Using known substructures in protein model building and crystallography. EMBO J. 5, 819–822. 57. Brünger, A.T., Kuriyan, J. & Karplus, M. (1987). Crystallographic R-factor refinement by molecular dynamics. Science 235, 458–460. 58. Engh, R.A. & Huber, R. (1991). Accurate bond and angle parameters for X-ray protein structure refinement. Acta Cryst. A 47, 392–400. 59. Brünger, A.T. (1992). The free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature 355, 472–474. 60. Laskowski, R.A., MacArthur, M.W., Moss, D.S. & Thornton, J.M. (1993). PROCHECK: a program to check the stereochemistry of protein structures. J. Appl. Cryst. 26, 283–291. 61. White, E.H., Miano, J.D. & Umbreit, M. (1975). On the mechanism of firefly luciferin luminescence. J. Am. Chem. Soc. 97, 198–200. 62. Kraulis, P.J. (1991). MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Cryst. 24, 946–950. 63. Wood, K.V., Lam, Y.A. Seliger, H.H. & McElroy, W.D. (1989). Complementary DNA coding for click beetle luciferases can elicit bioluminescence of different colors. Science 244, 700–702. 64. Nicholls, A., Sharp, K.A. & Honig, B. (1991). Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins 11, 281–296.