Proc. Natl. Acad. Sci. USA
Vol. 91, pp. 12046-12050, December 1994
Biochemistry

Covalent catalysis in nucleotidyl transfer reactions: Essential motifs in Saccharomyces cerevisiae RNA capping enzyme are conserved in Schizosaccharomyces pombe and viral capping enzymes and among polynucleotide ligases

STEWART SHUMAN*, YIZHI LIU†, AND BEATE SCHWER†

*Molecular Biology Program, Sloan-Kettering Institute, New York, NY 10021; and †Department of Biochemistry, Robert Wood Johnson Medical School, Piscataway, NJ 08854

Communicated by Bernard Moss, August 1, 1994

ABSTRACT Formation of the 5' cap structure of eukaryotic mRNAs occurs via transfer of GMP from GTP to the 5' terminus of the RNA. RNA guanylyltransferase, the enzyme that catalyzes this reaction, has been isolated from many viral and cellular sources. Though differing in molecular weight and subunit structure, the various guanylyltransferases employ a common catalytic mechanism involving a covalent enzyme-(Lys-GMP) intermediate. Saccharomyces cerevisiae CEG1 is the sole example of a cellular capping enzyme gene. In this report, we describe the identification and characterization of the PCE1 gene encoding the capping enzyme from Schizosaccharomyces pombe. PCE1 was isolated from a cDNA library by functional complementation in Sa. cerevisiae. Induced expression of PCE1 in bacteria and in yeast confirmed that the 47-kDa Sc. pombe protein was enzymatically active. The amino acid sequence of PCE1 is 38% identical (152 of 402 residues) to that of the 52-kDa capping enzyme from Sa. cerevisiae. Comparison of the two cellular capping enzymes with guanylyltransferases encoded by DNA viruses revealed local sequence similarity at the enzyme's active site and at four additional collinear motifs. Mutational analysis of yeast CEG1 demonstrated that four of the five conserved motifs are essential for capping enzyme function in vivo. Remarkably, the same motifs are conserved in the polynucleotide ligase family of enzymes that employ an enzyme-(Lys-AMP) intermediate. These findings illuminate a shared structural basis for covalent catalysis in nucleotidyl transfer and suggest a common evolutionary origin for capping enzymes and ligases.

RNA guanylyltransferase (capping enzyme) catalyzes transfer of GMP from GTP to the 5' end of mRNA. The hallmark of the RNA capping reaction is the formation of an enzyme-guanylate intermediate in which GMP is linked covalently to a Lys residue of the protein via a phosphoamide bond (1). Analysis of virus-encoded RNA guanylyltransferases points to at least two families of capping enzymes, based on overall sequence conservation and on the nature of the enzyme's active site (2-8). The capping enzymes encoded by double-strand DNA viruses (vaccinia, Shope fibroma, and African swine fever) display local sequence similarities that suggest a common evolutionary origin; this includes a conserved KXDG motif at the site of covalent guanylylation (5, 6). Guanylyltransferase encoded by reovirus, a double-strand RNA virus, displays no obvious sequence similarity to the vaccinia protein despite their essentially identical enzymatic functions (7, 8). Although guanylyltransferases have been isolated from numerous cellular sources (9), a cellular gene (CEG1) encoding an RNA capping enzyme has only been identified in Saccharomyces cerevisiae (10). Initial computer-based comparisons of the yeast and viral proteins uncovered no conservation of primary sequence. However, mapping of the yeast enzyme's active site to Lys-70 within the motif KTDG (11, 12), which is identical to the KTDG element at the vaccinia active site (7, 8), argues that the cellular and DNA virus-encoded enzymes might have a common evolutionary origin.

Covalent catalysis during nucleotidyl transfer is not unique to the RNA capping reaction. DNA and RNA ligation reactions were the first examples of reactions that employ an enzyme-adenylate intermediate (13). The AMP moiety is linked covalently via a phosphoamide bond to a Lys residue of the ligase polypeptide (14). The bound AMP is transferred to the 5' monophosphate end of a polynucleotide to form an activated nucleic acid intermediate, A(5')pp(5')N, that is reminiscent of the G(5')ppp(5')N RNA cap. Mapping of the active sites of DNA and RNA ligases to Lys residues within KXDG motifs (15, 16) suggests that capping enzymes and polynucleotide ligases may be related structurally and functionally.

To gain further insight into the mechanism and molecular evolution of nucleotidyltransferases, we have identified and characterized a cDNA encoding the mRNA capping enzyme from the fission yeast Schizosaccharomyces pombe.‡ The Sc. pombe PCE1 gene encodes a 402-aa polypeptide with extensive sequence identity to the guanylyltransferase from Sa. cerevisiae. The cellular enzymes and the DNA virus capping enzymes display local sequence similarity at the active site and at four additional collinear motifs not appreciated previously. Remarkably, the same five motifs are conserved in the polynucleotide ligase family of enzymes. Mutational analysis demonstrates that the conserved motifs are essential for capping enzyme function.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Abbreviation: 5-FOA, 5-fluoroorotic acid.
‡The sequence reported in this paper has been deposited in the GenBank data base (accession no. U16143).

MATERIALS AND METHODS

cDNA Encoding Sc. pombe Capping Enzyme: Cloning by Genetic Complementation. The galactose-dependent yeast strain YBS3 [MATa, leu2, lys2, trp1, ura3, ceg1::hisG, pGAL-CEG1 (CEN, TRP1, GAL10-CEG1)] was used to screen a Sc. pombe cDNA library in the vector pDB20 (2 μm, URA3) (17). Ura+ transformants were selected at 30°C on medium containing glucose. Plasmids with cDNA inserts were recovered from viable yeast colonies, amplified by transformation in Escherichia coli, and then retested for complementation of YBS3 growth on glucose. All secondary transformants were incapable of growth on glucose in the presence of 5-fluoroorotic acid (5-FOA), indicating that the growth on glucose required the library plasmid. DNA prepared from five secondary colonies was amplified in bacteria and screened by restriction digestion with EcoRI, HindIII, and BamHI. All five plasmids contained inserts with identical restriction patterns. HindIII and EcoRI fragments from one cDNA clone were inserted into pBS-KS+ (Stratagene). The sequence of the inserts was determined by dideoxynucleotide sequencing. Sequencing of overlapping restriction fragments and of the intact cDNA clone made clear the order of the fragments and the 5' and 3' margins of the cDNA insert. The cloned Sc. pombe gene was designated PCE1.

Vectors for Expression of Sc. pombe Capping Enzyme.
The PCE1 coding sequence was isolated from the cDNA insert by PCR amplification using oligonucleotides corresponding to the 5' end of the open reading frame and the 3' untranslated region. The 5' oligonucleotide introduced an Nco I restriction site at the start codon and the 3' oligonucleotide included a BamHI site. The amplified DNA fragment was digested with Nco I and Bgl II (which cuts at a single site in the cDNA immediately downstream of the stop codon) and then ligated into the bacterial expression vector pET14b. The resultant plasmid pET-PCE was transformed into E. coli BL21(DE3).
An Nco I-BamHI PCR fragment was also used to create a yeast expression plasmid pGal-PCE (2 μm, LEU2, GAL10-PCE1) in which the PCE1 gene (obtained by PCR amplification) is driven by a GAL10 promoter. pGal-YCE, containing the Sa. cerevisiae CEG1 gene in the same vector background, has been described (11). The plasmids were transformed into YBS2 and Leu+/Ura- transformants were selected by growth on plates containing 5-FOA and galactose. Growth of the GAL-CEG1 and GAL-PCE1 strains was galactose-dependent.

Site-Directed Mutagenesis of CEG1. Oligonucleotide-directed mutations in the CEG1 gene were created as described (11). Mutagenic DNA primers were designed to generate single or clustered Ala substitutions at residues indicated in Tables 1 and 2. All mutations were confirmed by dideoxynucleotide sequencing. Restriction fragments of each CEG-Ala mutant were exchanged with the corresponding segment in the yeast plasmid pGYCE-358 (CEN, TRP1, CEG1) and the mutations were rechecked by sequencing. The function of the mutated CEG1 alleles in vivo was tested by plasmid shuffle into strain YBS2 as described (11).

FIG. 1. Sc. pombe cDNA encoding RNA capping enzyme. The nucleotide sequence of the PCE1 cDNA insert selected by genetic complementation is shown along with the predicted amino acid sequence of the PCE1 protein. (Sequence listing not reproduced; GenBank accession no. U16143.)

RESULTS

Cloning the Sc. pombe PCE1 Gene Encoding mRNA Capping Enzyme. A conditional lethal growth phenotype in Sa. cerevisiae can be elicited by placing the essential CEG1 gene encoding yeast capping enzyme under the transcriptional control of a galactose-inducible promoter. The GAL-CEG1 cells grow well on galactose but are unable to grow in the presence of glucose, when expression of the CEG1 gene is transcriptionally repressed (11). We have exploited this conditional phenotype to isolate the gene encoding the capping enzyme from the fission yeast Sc. pombe. The GAL-CEG1 strain was transformed with a Sc. pombe cDNA 2 μm library (17). Transformants capable of growth on glucose medium were obtained at a frequency of about 1 in 10,000. Retransformation with plasmid DNA recovered from these isolates confirmed their ability to complement the growth defect of GAL-CEG1 on glucose. These isolates also complemented the Δceg1 null mutation in a plasmid shuffle experiment. Restriction analysis of five clones indicated that a single gene was responsible for complementation. The complete nucleotide sequence of the 1.7-kbp cDNA insert was determined (Fig. 1). The cDNA contains a single long open reading frame encoding a predicted 402-aa polypeptide that initiates at the first available ATG codon. The open reading frame is preceded by a 50-nt 5' untranslated region; following the open reading frame is a 438-nt 3' untranslated region and a 24-nt poly(A) tail. The 402-aa polypeptide from Sc. pombe is obviously related to the Sa. cerevisiae capping enzyme, with 152 of 402 identical residues, as indicated in the primary sequence alignment shown in Fig. 2. The alignment, which extends nearly the entire length of the two proteins, is punctuated by several discontinuities in which the sequence present in the CEG1 protein is not represented in the Sc. pombe polypeptide. We have designated the gene encoding the pombe capping enzyme as PCE1.

Sp  MAPSEKDIEEVSVPGVLAPRDDVRVLKTRIAKLLGTSPD---TFPGSQPVSFSKKHLQA-
Sc  MVLAMESRVAPEIPGLIQPGNVTQDLKMMVCKLLN-SPKPTKTFPGSQPVSFQHSDVEEK

Sp  LKEKNYFVCEKSDGIRCLLYMTEHPRY-ENRPSVYLFDRKMNFYHVEKIFYP-VENDKSG
Sc  LLAHDYYVCEKTDGLRVLF IVINPVTGEQGC--FMIDRENNYYLVNGFRFPRLPQKKKE

Sp  --KKYHVD-TLLDGELVLDIYPGGKKQ-LRYLVFDCLACDG-----IVYMSRLLDKRLGI
Sc  ELLETLQDGTLLDGELVIQTNPMTKLQELRYLMFDCLAINGRCLTQSPTSSRLAH--LGK

Sp  FAKSIQKPLDEYTKTHM-RETAIFPFLTSLKKMELGHGILKLFNEVIPRLRHGNDGLIFT
Sc  EFF---KPYFDLRAAYPNRCT-TFPFKISMKHMDFSYQLVKVAKSLD-KLPHLSDGLIFT

Sp  CTETPYVS-GTDQSLL-KWKPKEMNTIDFMLKLEFAQPEE------GDIDYSAMPE
Sc  PVKAPYTAGGKD-SLLLKWKPEQENTVDFKLILDIPMVEDPSLPKDDRNRWYYNYDVKPV

Sp  FQLGVWEG-RNMYS-FFAFMYV-DEKE------WEKLKSFNVPLSE
Sc  FSLYVWQGGADVNSRLKHFDQPFDRKEFEILERTYRKFAELSVSDEEWQNLKNLEQPLNG

Sp  RIVACYLDENR--WRFLRFRDDKRDANHISTVKSVLQSIEDGVSKEDLLKEMPIIREAYY
Sc  RIVECAKNQETGAWEMLRFRDDKLNGNHTSVVQKVLESINDSVSLEDLEEIIVGDIKRCWD

Sp  NRKKPSVTK--RKLDETSNDDAPAIKKVAKESEKEI (402)
Sc  (459)

FIG. 2. Sequence alignment of mRNA capping enzymes from Sc. pombe (Sp) and Sa. cerevisiae (Sc). Identical amino acids are indicated by a colon and conserved residues are denoted by a period. Discontinuities in the alignment are indicated by dashes. The active-site Lys residues are indicated in boldface type. Clustered charged residues in the Sa. cerevisiae protein that were targeted for Ala substitution are denoted by asterisks.

Heterologous Expression of Sc. pombe Capping Enzyme.
We -W a 0 41M have confirmed that the cDNA isolated by complementation AMP actually encodes a functional guanylyltransferase by express- ing the PCEI open reading frame in E. coli. The coding sequence was inserted into a T7-based pET vector. Isopropyl ,3-D-thiogalactoside induction of T7 RNA polymerase in cultures of E. coli BL21(DE3)/pET-PCR resulted in the accumulation of an abundant 47-kDa polypeptide (Fig. 3A). .4 This protein was absent when cells lacking pET-PCE were (data not shown). induced with isopropyl 3-D-thiogalactoside FIG. 3. Expression of catalytically active PCE1 in bacteria and Although the PCE1 polypeptide was recovered predomi- yeast. (A) Induced expression of the PCE1 protein in bacteria was nantly in the insoluble protein fraction of crude lysates, achieved by addition of 0.6 mM isopropyl P-D-thiogalactoside to a significant amount remained soluble. Indeed, PCE1 was 50-ml cultures of BL21(DE3)pET-PCE in LB medium (ampicillin at one of the most abundant polypeptides in the soluble fraction 0.1 mg/ml) at 370C when the A6w value reached 0.8. Cells were (Fig. 3A Left). Incubation of the soluble extracts in the harvested by centrifugation 4 h after induction. The bacteria were presence of [a-32P]GTP and manganese resulted in the for- resuspended in 4 ml of lysis buffer [150 mM NaCl/50 mM Tris HCI, mation of an SDS-stable nucleotidyl-protein adduct that pH 7.5/10% (wt/vol) sucrose], then lysozyme was added to 0.3 migrated as a discrete 48-kDa species during SDS/PAGE mg/ml, and the mixture was incubated on ice for 30-45 min. Triton X-100 was added to 0.1% and the lysates were sonicated to reduce (Fig. 3A Right). Labeling ofthis polypeptide was not detected viscosity. Soluble (S) and insoluble pellet (P) fractions were sepa- in extracts prepared from bacterial that lacked the PCEI gene rated by centrifugation. The insoluble pellets were resuspended in 4 (data not shown). ml of lysis buffer. 
Aliquots (15 .d) of the protein samples were Induced expression of the Sc. pombe capping enzyme in denatured and analyzed by SDS/PAGE. (Left) Coomassie blue- Sa. cerevisiae was achieved by placing the PCE1 gene under stained gel. The polypeptide corresponding to the capping enzyme the control of a 2 ,um-based GAL promoter in a Acegl null (PCE) is indicated. Guanylyltransferase assay mixtures (50 mM strain background. GAL-PCEJ cells grew readily on galac- Tris HCl, pH 8.0/10mM MnCl2/0.16 AtM [a-32P]GTP/1 pl of soluble tose but formed microcolonies on medium containing glucose bacterial extract) were incubated for 5 min at 37°C. (Right) Label (data not shown). Whole-cell extracts prepared from GAL- transfer to the capping enzyme was detected by autoradiography containing galactose) after SDS/PAGE. (B) Extracts were prepared from 1.5-liter cultures PCEI yeast (grown in liquid medium of Sa. cerevisiae (GAL-PCE1 and GAL-CEGI strains) harvested at formed a 48-kDa enzyme-GMP complex that was clearly an A6w value of 1.5. Cell pellets were washed and resuspended in 10 distinct from the 52-kDa enzyme-GMP complex formed by mM Hepes, pH 7.9/0.2 M KCl/1.5 mM MgCl2/0.5 mM dithiothrei- extracts of GAL-CEGI cells (Fig. 3B). Thus, the PCE1 tol/10%o (vol/vol) glycerol. Lysis was achieved by mixing the cells protein was enzymatically active in Sa. cerevisiae. Note that with liquid nitrogen and grinding the suspension with a mortar and the PCE1 protein sequence includes the element KSDG, pestle. The cell powder was thawed and clarified by centrifugation at which is related to the KTDG motif at the active site of the 30,000 x g. The supernatant was centrifuged for 1 h at 100,000 x g, guanylyltransferases from Sa. cerevisiae and vaccinia (5, 6, followed by dialysis of the S-100 against 20 mM Hepes, pH 7.9/50 11, 12). We inferred therefore that Lys-67 is the active site of mM KC1/0.2 mM EDTA/0.5 mM dithiothreitol/20%o glycerol. The our final volume ofthe preparation was 4 ml. 
Aliquots (1 t4) were assayed the Sc. pombe capping enzyme. This was supported by for enzyme-GMP complex formation as described above for bacte- finding that the Lys-67 Ala mutation abrogated PCEI func- rial lysates. The positions and sizes (in kDa) of marker proteins are tion in vivo (data not shown). indicated at the right. Sequence Conservation Among Capping Enzymes and Poly- nucleotide Ligases. The conserved KTDG element at the spacing in all capping enzymes and in most of the polynu- guanylyltransferase active site was first noted when we cleotide ligases. Motif I encompasses the KXDG element at scanned "by eye" for sequence similarities between capping the active site of covalent enzyme-NMP adduct formation. enzyme and various polynucleotide ligases (7). Shuman et al. The X residue is not strictly conserved, but there is a (18) had predicted that capping enzyme and ligase would preference for Thr among the guanylyltransferases and for share a common mechanism of covalent catalysis. That the Tyr in the ligases. Within the capping enzyme family, a Tyr active sites are so similar suggested that other structural features may be conserved, thus prompting further sequence located 4 residues upstream of the active-site Lys is also analysis to obtain candidate motifs. This was done by first conserved. Motif II, consisting of RFP, or closely related inspecting the regions ofconservation between the CEG1 and triplets, is found in some of the family members, but not in PCE1 proteins and then searching by eye for similar elements others, as indicated in Fig. 4. Motifs III and IV are highly in the capping enzymes of vaccinia virus (2), Shope fibroma conserved. Motif V, which displays a more subtle pattern of virus (3), and African swine fever virus (4). In addition to the conservation, can be viewed as bipartite. 
The KWKP se- active site KTDG (referred to as motif I), we discerned four quence in the upstream half of motif V is identical in the other conserved sequence elements, which we refer to as capping enzymes from the two yeasts and the African swine motifs II-V, situated within the CEG1 polypeptide as shown fever virus-the closely related sequence KLKP is found in in Fig. 4. Remarkably, these motifs are also conserved among the African swine fever virus DNA ligase. The poxvirus the numerous members of the polynucleotide ligase family capping enzymes and the other DNA ligases contain an (13). The aligned amino acid sequences of the five conserved invariant Lys in this region (XXKX). The downstream por- regions are shown in Fig. 4 for capping enzymes and poly- tion of motif V includes the sequence (E/D)NTVD, which is nucleotide ligases (DNA or RNA) from the indicated sources. highly conserved among the five capping enzymes. The What is most striking about these sequence motifs is that ligases have an invariant Asp residue in this region they are arranged in the same order and with nearly identical (DXXXX). Downloaded by guest on September 28, 2021 Biochemistry: Shuman et al. Proc. Natl. Acad. Sci. USA 91 (1994) 12049 Conserved Motifs Are Essential for Capping Enzyme Func- E132A, were lethal. The G131A mutant was strongly tem- tion in Vivo. We posit that primary sequence conservation perature-sensitive. In motif IV, D225A and G226A substitu- between capping enzyme and ligases is relevant to the tions were lethal. Replacement of universally conserved common catalytic mechanism. To test this, we created a Lys-249 in motif V with Ala was lethal, as was substitution series of Ala substitution mutations of the CEG1 protein at at Asp-257, a residue conserved only in the capping enzyme residues within conserved motifs I-V. (Residues that were family. The T255A mutant (affecting a residue common to all mutated are indicated by asterisks in Fig. 4.) 
The Ala- capping enzymes) caused a temperature-sensitive pheno- scanning technique provides a simple approach to gauge the type. The E253A and N254A mutants were fully viable. essentiality ofa given amino acid for protein function, in that In summary, the mutational analysis indicates that con- Ala substitution eliminates the side chain beyond the (3car- served motifs I, III, VI, and V are essential for capping bon, yet usually does not alter the main-chain conformation enzyme function. Seventeen residues in these motifs were or impose extreme electrostatic or steric effects (19-21). singly substituted (not counting the RFP mutation in motifII). CEGI-Ala alleles in CEN:TRPJ plasmids were tested for in Mutations at eight residues were lethal, three were temper- vivo function using the plasmid shuffle procedure. Inability of ature sensitive, and only six were viable. CEGI-Ala alleles to sustain cell growth on medium contain- Charge-to-Ala Cluster Mutants of CEGI. The high fre- ing 5-FOA (which selects against a resident CEG): URA3 quency with which growth phenotypes were elicited by single plasmid) indicates that the side chain of the affected residue (or, for RFP -- AAA, clustered) Ala substitutions in the five is essential for protein function. We anticipated that some of conserved motifs stand in stark contrast to the low frequency the Ala-substitution mutations in the conserved motifs might of growth phenotypes produced by Ala cluster mutations in be tolerated, whereas others would be lethal, and still others regions ofthe CEG1 protein other than the conserved motifs might confer a conditional growth defect. Consequently, all (Table 2). We constructed eight charge-to-Ala cluster muta- mutated CEGI alleles were screened initially for growth at tions at sites indicated by asterisks in Fig. 2. In this proce- 250C. CEGI-Ala strains viable at 250C were screened sec- dure, two or three locally clustered acidic or basic residues ondarily for growth at 370C and at 18'C. 
Although several were simultaneously replaced with Ala. Not a single one of CEGJ-Ala alleles caused a temperature-sensitive growth the charge cluster mutants had a lethal phenotype. Seven of phenotype, none of the mutants were cold-sensitive. The the eight were fully viable, and one mutant-D95A/R96A/ results are shown in Table 1 and are discussed below. E97A-was temperature-sensitive (Table 2). Previously described single Ala-substitution mutations in motifI were supplemented by two mutated alleles. Y66A and K70R. In agreement with earlier results (11), the K70A and DISCUSSION G73A mutations were lethal at 250C, whereas the T71A and A model ofnucleotidyl transfer involving covalent phosphor- D72A mutants were viable at both 250C and 370C. Replace- amidate intermediates was proposed by Shabarova (22). This ment of the active-site Lys with Arg (K70R) was also lethal, prescient hypothesis has since been substantiated, first for suggesting a stringent requirement for Lys as the nucleophile DNA and RNA ligases and, subsequently, for mRNA cap- during attack by enzyme on the a-phosphate of GTP. The Y66A substitution in motif I caused a temperature-sensitive Table 1. Mutations in conserved motifs I-V affect CEGI defect, seen as normal growth at 25°C, but severely slowed function in vivo growth at 37°C. The RFP triplet of motif II was substituted Growth simultaneously at all three positions; this caused a slow growth defect at 25°C and complete lethality at 37°C. Five Motif Mutation 250C 370C single Ala mutations in motifIII were examined. Two ofthese I Y66A + involving aliphatic residues, L129A and V134A, were viable, K70A Lethal whereas the alterations of charged residues, D130A and K70R Lethal T71A ;1 H1I 1V D72A +++ N- . G73A Lethal II RFP-AAA III L129A D130A Lethal G131A E132A Lethal V134A IV D225A Lethal G226A Lethal V K249A Lethal E253A N254A T255A + Lethal FIG. 4. Regions of conservation between capping enzymes and D257A polynucleotide ligases. 
Five colinear conserved sequence elements, YBS2 was transformed with derivatives ofpGYCE-358 containing designated motifs I-V, were discerned by visual inspection of the the indicated amino acid substitution mutants of CEGI. Trp+ trans- amino acid sequences of capping enzymes (CEs), DNA ligases formants were plated on medium containing 5-FOA (0.75 mg/ml) and (DNA), and RNA ligases (RNA) from Sa. cerevisiae (Sc), Sc. pombe incubated at 25°C for 4 days. Lethal mutations were those that (Sp), African swine fever virus (ASF), vaccinia, virus (VAC), Shope precluded growth under counterselective conditions. Strains that fibroma virus (SFV), humans (Hu), and bacteriophage T4. The grew on FOA were streaked on YPD plates at 25°C. Single colonies number ofintervening amino acid residues is indicated -n-). Residues were restreaked on YPD plates and incubated at either 25°C or 3rC in the CEGi protein that were targeted for mutational analysis are for 3 days. Growth was assessed as follows: + + +, colony size and indicated by asterisks above the aligned sequence. The location and number were indistinguishable from strains bearing wild-type CEGI; spacing ofthe motifs within the CEGi protein are depicted above the + +, slowed growth was manifest by reduced colony size; +, only alignment. pinpoint colonies were detected; -, no growth was detected at 37°C. Downloaded by guest on September 28, 2021 12050 Biochemistry: Shuman et al. Proc. Nad. Acad. Sci. USA 91 (1994) Table 2. Charge-to-Ala cluster mutations of CEGI The conservation of essential motifs among ligases and Growth capping enzymes has important evolutionary implications. Both types of enzymes catalyze single nucleotide transfer Mutation 250C 370C reactions to activate the ends of polynucleotide chains. 
We E57A,E58A,K59A + + + + + + propose that the ligases and guanylyltransferases evolved D95A,R96A,E97A +- from an ancestral that employed a K114A,K115A,K116A +++ +++ phosphoramidate intermediate, but may have lacked NTP- specificity. Indeed, single-step nucleotidyltransferases may E117A,E118A +++ +++ have antedated the evolution ofprocessive template-directed D218A,K219A +++ +++ DNA and RNA polymerases as agents of polynucleotide K241A,D242A +++ +++ synthesis. Phosphoramidate catalysis in nucleotidyl transfer K274A,D275A,D276A + + + + + + is not merely a molecular fossil-this mechanism is likely to E395A,D396A +++ +++ pertain to many other nucleotidyl transfer reactions forwhich YBS2 was transformed with derivatives ofpGYCE-358 containing a covalent intermediate has been demonstrated or proposed. the indicated amino acid substitution mutants of CEGI. Plasmid For example, histidine tRNA guanylyltransferase catalyzes shuffle and assessment of growth were performed as described in ATP-dependent addition of a nontemplated GTP moiety to Table 1. + + +, Colony size and number were indistinguishable from the 5' terminus of tRNAHiS molecules. This is a multistep strains bearing wild-type CEG1; -, no growth was detected at 370C. ligase-like reaction in which ATP binds enzyme to form a covalent protein-AMP intermediate, AMP is transferred to ping enzymes. Lys residues have been identified as the site the 5' end of the tRNA to form an activated A(5')pp(5')N of covalent protein-NMP adduct formation in these cases. structure that is attacked by the 3' OH ofGTP (27). In another The cloning of genes encoding DNA ligases from numerous case, GTP-GTP guanylyltransferase from brine shrimp cat- sources has illuminated extensive sequence conservation alyzes synthesis of a GppppG dinucleotide from two GTP within this enzyme family (13, 23). 
In contrast, RNA ligases molecules via a capping-enzyme-like mechanism employing (from T4 and yeast) show little sequence similarity to each an enzyme-GMP phosphoramidate intermediate (28). An other or to the DNA ligases, except in the vicinity of the ATP-dependent RNA ligase from kinetoplastid mitochondria active site (15, 24). Although relatively few genes encoding is thought to play a role in RNA editing (29). The cloning of capping enzymes have been cloned, the mapping of their genes encoding these proteins, and of additional members of active sites has highlighted similarities to the ligase family (5, the guanylyltransferase family, will undoubtedly shed light 6, 11, 12). By cloning the Sc. pombe capping enzyme gene, on the structural basis for covalent catalysis. we have elucidated a set ofsequence motifs conserved among cellular and viral guanylyltransferases and polynucleotide 1. Shuman, S. & Hurwitz, J. (1981) Proc. Nati. Acad. Sci. USA 78,187-191. 2. Niles, E. G., Condit, R. C., Caro, P., Davidson, K., Matusick, L. & ligases. Most important, we have demonstrated that these Seto, J. (1986) Virology 153, 96-112. motifs are essential for capping enzyme function in vivo. 3. Upton, C., Stuart, S. & McFadden, G. (1991) Virology 183, 773-777. The Sc. pombe PCEI gene is only the second example of 4. Pena, L., Yanez, R., Revilla, Y., Vinuela, E. & Salas, M. L. (1992) Virology 193, 319-328. a cloned guanylyltransferase from a cellular source. The 5. Cong, P. & Shuman, S. (1993) J. Biol. Chem. 268, 7256-7260. 47-kDa PCE1 protein (402 aa) is the smallest guanylyltrans- 6. Niles, E. G. & Christen, L. (1993) J. Biol. Chem. 268, 24986-24989. ferase identified thus far. The sizes of capping enzymes from 7. Seliger, L. S., Zheng, K. & Shatkin, A. J. (1987) J. Biol. Chem. 262, other sources, which are estimated from gene sequence or 16289-16293. 8. Fausnaugh, J. & Shatkin, A. J. (1990) J. Biol. Chem. 265, 7669-7672. from the mobility of the enzyme-GMP adduct during SDS/ 9. 
Mizumoto, K. & Kaziro, Y. (1987) Prog. NucleicAcidRes. Mol. Biol. 34, PAGE, are 52 kDa for yeast, 65-69 kDa for mammals, 73 kDa 1-28. for brine shrimp, 77 kDa for wheat germ, 95 kDa for vaccinia, 10. Shibagaki, Y., Itoh, N., Yamada, H., Nagata, S. & Mizumoto, K. (1992) J. Biol. Chem. 267, 9521-9528. 100 kDa for African swine fever, and 167 kDa for reovirus (1, 11. Schwer, B. & Shuman, S. (1994) Proc. Natl. Acad. Sci. USA 91, 4, 7, 9, 10). Although PCE1 is most closely related to yeast 4328-4332. CEG1 (38% identity), the two proteins have diverged con- 12. Fresco, L. D. & Buratowski, S. (1994) Proc. Nat!. Acad. Sci. USA 91, siderably. Regional conservation provides obvious clues for 6624-6628. 13. Lindlahl, T. & Barnes, D. E. (1992) Annu. Rev. Biochem. 61, 251-281. structure-function analysis and for efforts to isolate genes 14. Gumport, R. I. & Lehman, I. R. (1971) Proc. Natl. Acad. Sci. USA 68, encoding capping enzymes from other cellular sources. The 2559-2563. relatedness of the cellular and DNA virus capping enzymes 15. Thogerson, H. C., Morris, H. R., Rand, K. N. & Gait, M. J. (1985) Eur. is limited to the five motifs we have described. J. Biochem. 147, 325-329. 16. Tomkinson, A. E., Totty, N. F., Ginsburg, M. & Lindahl, T. (1991) Previous mutational analyses of DNA ligase, RNA ligase, Proc. Nat!. Acad. Sci. USA 88, 400-404. and RNA guanylyltransferase have focused exclusively on 17. Fikes, J. D., Becker, D. M., Winston, F. & Guarente, L. (1990) Nature the immediate region ofthe active site (5, 11, 12, 25, 26). The (London) 346, 291-294. 18. Shuman, S., Surks, M., Furneaux, H. & Hurwitz, J. (1980)J. Biol. Chem. present study provides a functional assessment of other 255, 11588-11598. protein segments. Our finding that motifs I-V include a high 19. Cunningham, B. C. & Wells, J. A. (1989) Science 244, 1081-1085. proportion of essential residues argues that these elements 20. Bennett, W. F., Paoni, N. F., Keyt, B. A., Botstein, D., Jones, A. 
J., are relevant to the shared mechanism of covalent catalysis. Presta, L., Wurm, F. M. & Zoller, M. J. (1991) J. Biol. Chem. 266, 5191-5201. (Preliminary studies of the mutated CEG1 proteins indicate 21. Gibbs, C. S. & Zoller, M. J. (1991) J. Biol. Chem. 266, 8923-8931. that loss offunction in vivo correlates with inactivation or, in 22. Shabarova, Z. A. (1970) Prog. Nucleic AcidRes. Mol. Biol. 10, 145-182. some cases, thermolability of guanylyltransferase activity in 23. Barnes, D. E., Johnston, L. H., Kodama, K., Tomkinson, A. E., Lasko, vitro.) We anticipate that our structure-function analysis of D. D. & Lindahl, T. (1990) Proc. Nat!. Acad. Sci. USA 87, 6679-6683. 24. Xu, G., Teplow, D., Lee, T. D. & Abelson, J. (1990) Biochemistry 29, CEG1 will be applicable to other enzymes in the guanylyl- 6132-6138. transferase and polynucleotide ligase families. Of course, a 25. Heaphy, S., Singh, M. & Gait, M. J. (1987) Biochemistry 26, 1688-1696. genetic "map" ofcapping enzyme structure and function will 26. Kodama, K., Barnes, D. E. & Lindahl, T. (1991) Nucleic Acids Res. 19, be more meaningful in the context of a known crystal 6093-6099. structure for this type of enzyme. To our knowledge, no 27. Jahn, D. & Pande, S. (1991) J. Biol. Chem. 266, 22832-22836. 28. Liu, J. J. & McLennan, A. G. (1994) J. Biol. Chem. 269, 11787-11794. structure has been reported for any member of the RNA and 29. Bakalara, N., Simpson, A. M. & Simpson, L. (1989) J. Biol. Chem. 264, DNA ligase family. 18679-18686. Downloaded by guest on September 28, 2021