Reducing the genetic code induces massive rearrangement of the proteome

Patrick O’Donoghue(a,b), Laure Prat(c), Martin Kucklick(d), Johannes G. Schäfer(c), Katharina Riedel(e), Jesse Rinehart(f,g), Dieter Söll(c,h,1), and Ilka U. Heinemann(a,1)

Departments of (a) Biochemistry and (b) Chemistry, The University of Western Ontario, London, ON N6A 5C1, Canada; Departments of (c) Molecular Biophysics and Biochemistry, (f) Cellular and Molecular Physiology, and (h) Chemistry, and (g) Systems Biology Institute, Yale University, New Haven, CT 06520; (d) Department of Microbiology, Technical University of Braunschweig, Braunschweig 38106, Germany; and (e) Division of Microbial Physiology and Molecular Biology, University of Greifswald, Greifswald 17487, Germany

Contributed by Dieter Söll, October 22, 2014 (sent for review September 29, 2014; reviewed by John A. Leigh)

Expanding the genetic code is an important aim of synthetic biology, but some organisms developed naturally expanded genetic codes long ago over the course of evolution. Less than 1% of all sequenced genomes encode an operon that reassigns the stop codon UAG to pyrrolysine (Pyl), a genetic code variant that results from the biosynthesis of Pyl-tRNAPyl. To understand the selective advantage of genetically encoding more than 20 amino acids, we constructed a markerless tRNAPyl deletion strain of Methanosarcina acetivorans (ΔpylT) that cannot decode UAG as Pyl or grow on trimethylamine. Phenotypic defects in the ΔpylT strain were evident in minimal medium containing methanol. Proteomic analyses of wild type (WT) M. acetivorans and ΔpylT cells identified 841 proteins from >7,000 significant peptides detected by MS/MS. Protein production from UAG-containing mRNAs was verified for 19 proteins. Translation of UAG codons was verified by MS/MS for eight proteins, including identification of a Pyl residue in PylB, which catalyzes the first step of Pyl biosynthesis. Deletion of tRNAPyl globally altered the proteome, leading to >300 differentially abundant proteins. Reduction of the genetic code from 21 to 20 amino acids led to significant down-regulation in translation initiation factors, amino acid metabolism, and methanogenesis from methanol, which was offset by a compensatory (100-fold) up-regulation in dimethyl sulfide metabolic enzymes. The data show how a natural proteome adapts to genetic code reduction and indicate that the selective value of an expanded genetic code is related to carbon source range and metabolic efficiency.

evolution | genetic code expansion | methanogenesis | pyrrolysine | tRNAPyl

Significance

Expanding the genetic code is an important aim of synthetic biology, but some organisms developed naturally expanded genetic codes over the course of evolution. To understand the selective advantage of genetically encoding more than 20 amino acids, we investigated the proteome-wide response to reducing the genetic code of Methanosarcina acetivorans from 21 to 20 amino acids. The data show how a natural proteome adapts to genetic code reduction and indicate that the selective value of an expanded genetic code is related to carbon source range and metabolic efficiency.

Author contributions: P.O., L.P., K.R., J.R., D.S., and I.U.H. designed research; P.O., M.K., J.G.S., K.R., J.R., and I.U.H. performed research; P.O., D.S., and I.U.H. analyzed data; and P.O., D.S., and I.U.H. wrote the paper.
Reviewers included: J.A.L., University of Washington.
The authors declare no conflict of interest.
(1) To whom correspondence may be addressed. Email: [email protected] or ilka.[email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1420193111/-/DCSupplemental.

Synthesizing whole genomes (1) and eliminating codons (2) are novel methods for rewriting the genetic code that may dramatically alter the repertoire of genetically encoded amino acids. Expansion of the genetic code has led to exciting technologies, including site-directed protein labeling and production of proteins with hardwired posttranslational modifications (3). The current approaches to cotranslationally insert noncanonical amino acids (ncAAs) into proteins rely on reassigning one of the three stop codons (4).

Although these approaches were highly successful in incorporating over 100 ncAAs into proteins (3), they limit the expansion of the code to no more than 2 additional amino acids at a time and significantly challenge the cellular production host by unnaturally extending proteins and reducing growth rate (5). Alternate methods focus on quadruplet codons (6, 7) and recoding (8) or reassigning sense codons (9–13). Attempts to reassign a sense codon in Mycoplasma capricolum were defied by tRNA misacylation by endogenous aminoacyl-tRNA synthetases (9). This result indicates that, although extensively rewriting the genetic code may be possible, it comes with unexpected challenges related to cellular fitness and translation fidelity. These considerations will impact efforts to engineer cells to synthesize proteins with multiple ncAAs or create biologically contained strains that require an expanded code for survival (14).

Opening codons by reducing the genetic code is highly promising, but it is unknown how removing 1 amino acid from the genetic code might impact the proteome or cellular viability. Many genetic code variations are found in nature (15), including stop or sense codon reassignments, codon recoding, and natural code expansion (16). Pyrrolysine (Pyl) is a rare example of natural genetic code expansion. Evidence for genetically encoded Pyl is found in <1% of all sequenced genomes (17). In these organisms, Pyl is encoded by the UAG codon, which requires tRNAPyl, pyrrolysyl-tRNA synthetase (PylRS), and the products of three genes (pylBCD) that synthesize Pyl from two molecules of lysine (18). The PylRS enzyme was engineered to genetically encode >100 ncAAs (19). The Pyl encoding system has already been used to expand the genetic codes of Escherichia coli (20–22), mammalian cells, and animals (23).

Despite the use of Pyl in synthetic biology, little is known about the role of Pyl in its native environment or the evolutionary pressures that sustain expanded genetic codes in nature. The Pyl-decoding trait is found in methanogenic archaea of the orders Methanosarcinales and Methanomassiliicoccales (24) and certain anaerobic bacteria (17). In addition to producing 74% of global methane emissions, methanogens are remarkable for their ability to survive with only the most basic carbon and energy sources (25). Methanosarcina shows the greatest substrate range among methanogens and survives on acetate, carbon monoxide, methylamines, methanol, or dimethyl sulfide (DMS). Their broad substrate range depends, in part, on the presence of Pyl in the active site of several methylamine methyltransferases (26). Hundreds of Methanosarcina genes contain in-frame TAG codons (27), but natural Pyl incorporation was only shown in methylamine methyltransferases (17, 28) and tRNAHis guanylyltransferase (Thg1) (29).

17206–17211 | PNAS | December 2, 2014 | vol. 111 | no. 48 | www.pnas.org/cgi/doi/10.1073/pnas.1420193111

Methanosarcina acetivorans provides an ideal model system to identify Pyl-containing proteins and study the impact of genetic code reduction on the proteome and physiology of the cell. We constructed a markerless tRNAPyl deletion (ΔpylT) strain of M. acetivorans C2A and used three independent mass spectrometry (MS) approaches to characterize soluble proteomes from M. acetivorans grown on minimal medium containing trimethylamine (TMA) or methanol and ΔpylT cells grown on methanol. The data reveal previously unidentified biochemical roles for Pyl and Pyl-containing proteins and indicate that the expanded genetic code of Methanosarcina is intricately linked with cellular metabolism and the composition of the proteome.

Results

M. acetivorans with a Reduced Genetic Code. There are 267 ORFs in the M. acetivorans genome with one or multiple in-frame UAG codon(s) (Figs. S1 and S2 and Table S1). Except for Thg1 and the methylamine methyltransferase (mtxB), it is unknown if these ORFs are expressed or the resulting protein contains Pyl. To uncover more Pyl-containing proteins and investigate the role of Pyl in the M. acetivorans proteome, we constructed and characterized a tRNAPyl deletion strain of M. acetivorans C2A (Fig. 1). We monitored the growth rate of three independently obtained markerless tRNAPyl deletion mutants and compared these cells with wild type (WT) cells grown on minimal medium containing TMA or methanol (Fig. 1 and Table 1). As expected (26), the tRNAPyl deletion strain cannot use TMA as a growth substrate. In rich medium containing yeast extract, previous studies indicated that tRNAPyl deletion did not affect growth on methanol (26). In minimal medium containing methanol, ΔpylT shows significant increases in lag time (10%) and doubling time (30%) (Fig. 1). The data show that the expanded genetic code of M. acetivorans confers a selective advantage.

Table 1. Growth statistics for M. acetivorans strains

Strain   Carbon source   Doubling time (h)   Maximum A578   Lag time (h)
WT       MeOH            5.9 ± 0.7           1.02 ± 0.01    39.9 ± 1.6
ΔpylT    MeOH            7.9 ± 0.5           0.99 ± 0.02    45.2 ± 1.5
WT       TMA             7.5 ± 0.3           1.06 ± 0.40    52.4 ± 0.9

Fig. 1. Characterization of M. acetivorans variants. (A) Structure of the Pyl operon showing StyI sites flanking the tRNAPyl gene. (B) The Southern blot shows the genotype of WT M. acetivorans (lane 1) and three independently obtained tRNAPyl markerless deletion mutants (lanes 2–4). In the deletion strains, only 73 bases of the tRNAPyl gene were deleted. As anticipated, the digoxigenin-labeled molecular weight marker (M) migrates slower than the digested genomic DNA. (C) Growth curve (OD vs. time in hours) of WT and ΔpylT strains in minimal media containing methanol (MeOH) or TMA as the sole carbon source. Data are based on triplicate measurements; ΔpylT data are based on triplicate measurements of three independently obtained deletion strains. Error bars show 1 SD.

From 21 to 20: Proteome Adaptation to Genetic Code Reduction. To better understand the nature of the selective value of Pyl, we characterized the soluble proteomes of WT and ΔpylT strains. Of 4,721 potential protein coding genes in M. acetivorans, 841 proteins were identified, including ∼300 proteins identified by gel-based methods; the liquid chromatography (LC)-MS/MS approach identified 583 proteins. Proteins were considered identified if two or more significant peptides (peptide score > 35) were detected and verified by MS/MS spectra. All peptides identified by LC-MS/MS are listed in Dataset S1.

The ΔpylT strain has a globally altered proteome (Fig. 2). We identified 347 differentially regulated proteins showing more than twofold change (Tables S2 and S3), most of which are proteins that do not contain Pyl. The most affected pathways include stress response, methanogenesis, methylsulfide metabolism, translation, and amino acid metabolism (Table 2, Figs. S3 and S4, and Tables S2 and S3). In ΔpylT, proteins involved in heat shock (Hsp60 and GroEL/GroES) and oxidative stress response pathways were significantly up-regulated (approximately fivefold). We observed enhanced expression of eight distinct methanosarcina disulfide reductases (Table S3), which were suggested to play a major role in oxidative defense (30). Prematurely truncated or mistranslated proteins may activate this stress response. Protein mistranslation and misfolding elicit similar stress responses in other organisms, such as E. coli (31).

Ribosomal proteins S13 and L18 are >10-fold more abundant in ΔpylT. S13 is 40-fold up-regulated in ΔpylT and one of the most perturbed proteins identified. S13 interacts with the tRNA binding site and central protuberance of the ribosome, and it is also in contact with L5 (32). L18 is essential in E. coli, and its association with L5 and the 5S rRNA is required for proper ribosomal assembly (33). Because deletion of tRNAPyl creates a cell with up to 267 new stop codons, overexpression of S13 and L18 may help reassemble or stabilize stalled ribosomes.

Metabolic Adaptations in Methanogenesis Without Pyl. The proteomic analysis indicates the tRNAPyl deletion strain is metabolically less efficient than WT. In methanogenesis from methanol, the MtaB protein abstracts a methyl group from the methanol substrate and transfers it to a cognate corrinoid protein (MtaC). In the next step, the MtaA protein transfers the methyl group from MtaC to convert coenzyme M (CoM) into methyl-CoM, which is a substrate for methanogenesis. Proteins responsible for methanogenesis from methanol are two- to fourfold down-regulated in ΔpylT (MtaB1, MtaC, MtaB2, MtaA, MtaC2), including a newly identified Pyl-containing corrinoid:CoM methyltransferase (MtaA; MA0855) (Table 2). The full-length Pyl-containing MtaA was observed in a gel that contained only the WT proteome. Because of its low abundance, the observed value of 1.8 ± 0.4-fold up-regulated in ΔpylT may be inaccurate, simply reflecting the fact that the spot intensity is close to background. The molecular mass of the protein in the gel indicates that Pyl is present (Table 3 and Table S4).

Proteins catalyzing methanogenesis from methanol are less abundant in ΔpylT compared with WT. Global changes in the proteome reduce the fraction of soluble protein devoted to methanol methanogenesis in ΔpylT cells (Table S5). Methanol methanogenesis proteins account for 33% of the soluble protein in the WT cell, but this fraction is reduced to 21% in ΔpylT. The most striking examples are the MtaB2 (MA1616) and MtaC2 (MA4391) proteins, which each make up 6% of the soluble proteome in WT cells and ≤0.4% of the proteome in ΔpylT cells (Table S5).
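The percentage comparisons quoted in this part of the Results follow from simple arithmetic on the reported means. The sketch below is illustrative only: it uses the Table 1 means for methanol-grown cells and the overall proteome shares quoted in the text, ignores the reported uncertainties, and the names `growth` and `percent_change` are introduced here for illustration rather than taken from the study.

```python
# Means transcribed from Table 1 for methanol-grown cells (± terms ignored).
growth = {
    "WT": {"doubling_h": 5.9, "lag_h": 39.9},
    "ΔpylT": {"doubling_h": 7.9, "lag_h": 45.2},
}


def percent_change(reference, value):
    """Relative change of value with respect to reference, in percent."""
    return 100.0 * (value - reference) / reference


doubling_change = percent_change(growth["WT"]["doubling_h"],
                                 growth["ΔpylT"]["doubling_h"])
lag_change = percent_change(growth["WT"]["lag_h"], growth["ΔpylT"]["lag_h"])

# Share of soluble protein devoted to methanol methanogenesis (percent),
# as quoted in the text: 33% in WT and 21% in ΔpylT.
share_change = percent_change(33.0, 21.0)

print(f"doubling time change: {doubling_change:+.0f}%")
print(f"lag time change: {lag_change:+.0f}%")
print(f"methanol methanogenesis share change: {share_change:+.0f}%")
```

With these inputs the doubling time rises by roughly 34% and the lag time by roughly 13%, consistent with the rounded 30% and 10% given in the text, and the methanol methanogenesis share falls by roughly 36% in relative terms.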

Fig. 2. Deletion of tRNAPyl leads to a globally altered proteome. Overlapped 2D gel image showing soluble proteomes (pH range 3–11) of the WT (magenta) and ΔpylT (cyan) M. acetivorans strains. Protein spots that overlap exactly are shown in blue. Proteins that showed high levels of differential expression are marked. Mr, molecular mass. Marked spots: MA4384, MA0859, MA0857, MA4392, MA0455, MA4391, MA0855, MA2718, MA2813, MA0456, MA4547, MA1091, MA2699, and MA1108.

Reduced abundance of methanol methanogenesis enzymes in ΔpylT may be responsible for the observed growth defects, but the largest protein abundance changes in the ΔpylT proteome involve a related metabolism: methanogenesis from methylsulfides. Methylsulfide methanogenesis proceeds similarly to the methanol route, including a methylsulfide methyltransferase (MtsX), a cognate corrinoid protein, and transfer of the methyl group to CoM with subsequent entry into the generic methanogenesis pathway. M. acetivorans was known to grow on DMS (34). Characterization of the methyltransferases was completed recently (35). MtsX enzymes are distinct from their relatives involved in methanol and methylamine methanogenesis, because both methyltransferase and corrinoid protein exist in a single polypeptide. M. acetivorans encodes three MtsX genes (mtsD, mtsF, and mtsH). All three proteins were identified by MS. According to LC-MS/MS, MtsH is distinct in ΔpylT cells. MtsF (30-fold) and MtsD (112-fold) were the most up-regulated proteins in ΔpylT. MtsF increased to 1.2% of the soluble proteome in ΔpylT from 0.03% in WT, and MtsD increased to 5.0% in ΔpylT compared with 0.04% in WT. The data indicate a compensatory switch away from methanol and toward methyl sulfide metabolism in M. acetivorans cells that are only able to encode 20 amino acids.

There are ArsR-type transcriptional regulators in the genomic context of each of the three mtsX genes (35, 36). We confirmed expression of one of these putative regulators (msrC; MA4383), which is twofold more abundant in WT (Table S2). We detected one significant peptide each for MsrB (MA0460) and MsrD (MA4397) in ΔpylT. An M. acetivorans msrC deletion mutant showed no phenotype during TMA growth, but methanol-grown cells showed twofold increases in both generation and lag time. Expression of methanol methyltransferase genes was also inhibited (36). This observation gave rise to the name methanol-specific regulators (msrs). The msrC gene is located adjacent to mtsF, and MtsF protein expression was abolished in the msrC deletion mutant, suggesting a regulatory role for MsrC in DMS metabolism (37). The fact that we observe reduced abundance of MsrC in ΔpylT along with increased MtsX expression indicates that the regulatory system associated with DMS metabolism is complex and may involve multiple regulatory components. Indeed, MA4561 [methyl sulfide methyltransferase-associated sensor (MsmS)] was implicated as another component in MtsF regulation. An msmS deletion mutant constitutively expresses MtsF, whereas in WT cells, MtsF is only expressed in the presence of DMS (38). We hypothesize that, in the ΔpylT mutant, MsrC is one component of a regulatory system that senses reduced metabolic efficiency from methanol methanogenesis and stimulates production of MtsF and MtsD. Expression profiling of the MtsX genes indicated that their abundance is low in methanol-grown cells, but their mRNA abundance increases between 2- and 50-fold when DMS is added to the media (37). The MtsX enzymes are essential for DMS catabolism when DMS is the sole carbon source, but the same enzymes are involved in DMS formation in cells grown on carbon monoxide (35, 39). These reports suggest an alternative possibility that ΔpylT produces increased levels of DMS.

Proteomes of Methanol- and TMA-Grown M. acetivorans. To gain insight into the metabolic shift in WT cells as they adapt from methanol to TMA substrates, we compared proteomes from WT grown in minimal medium containing either methanol or TMA. We found that the methanol methanogenesis enzymes MtaB1, MtaC1, and MtaC2 all show 10-fold up-regulation in methanol-containing media (Table S6). In a proteomic analysis comparing M. acetivorans grown on methanol vs. acetate, methanol methyltransferase components MtaB1 (500-fold), MtaC1 (500-fold), and MtaC2 (90-fold) were also significantly up-regulated in methanol-grown cells (40). A central component of the methanogenesis pathway, formyl-methanofuran dehydrogenase (Fmd) subunits A–F, was 10- to 40-fold up-regulated in methanol compared with acetate media (40). We found that FmdA is up-regulated a further 20-fold in TMA-metabolizing cells. A flavodoxin (MA2699) was 10-fold up-regulated in methanol vs. TMA. The MA2699 transcript was 29-fold up-regulated on methanol compared with acetate (40), suggesting a functional role in electron transport specific to methanol metabolism. MtsD showed 2.5-fold higher abundance in methanol-grown cells, suggesting that DMS methanogenesis is somewhat up-regulated normally during growth on methanol compared with TMA.

Table 2. Most affected proteins in ΔpylT compared with WT M. acetivorans

Locus    Gene name        Function            x-Fold change
MA3790   ilvC             Ile biosynthesis    0.03
MA2273   S-layer domain   ABC transporter     0.1
MA0821   pgi              Glycolysis          0.1
MA1778   Exosome          RNA metabolism      0.1
MA0182   eif2b            Protein synthesis   0.1
MA3052   valS             Protein synthesis   0.1
MA0076   eif2B1           Protein synthesis   0.2
MA4391   mtaC2            Methanogenesis      0.3
MA0855   mtaA             Methanogenesis      1.8
MA1567   pdxS             Cofactor synthesis  4.1
MA4043   Hypothetical     ?                   4.5
MA4127   Recombinase      Mobile element      4.5
MA1317   gatD             Protein synthesis   4.9
MA1275   ahcY             Cys biosynthesis    5.0
MA3564   argJ             Arg biosynthesis    5.8
MA4386   hsp60-4          Stress response     6.8
MA0857   hsp60-5          Stress response     6.9
MA1091   rpl18p           Protein synthesis   10.3
MA1108   rps13p           Protein synthesis   37.7
MA4384   mtsF             Methanogenesis      35.5
MA0859   mtsD             Methanogenesis      112.2
MA4558   mtsH             Methanogenesis      Distinct in ΔpylT*

*Distinct peptides observed in the LC-MS/MS experiment refer to proteins identified with more than two significant peptides in which the peptides observed in one strain (methanol-grown WT or ΔpylT) are completely absent in the other strain. Fig. S5 shows typical spectra for distinct peptides. x-Fold change indicates the differential abundance ΔpylT/WT.
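To make the fold-change convention of Table 2 concrete, the following minimal sketch labels a handful of Table 2 entries against a symmetric twofold threshold, matching the "more than twofold change" criterion stated in the Results. It is an illustration under stated assumptions, not the authors' analysis pipeline: the function `classify`, the dictionary `TABLE2_SUBSET`, and the use of `None` to stand for a "distinct" protein are all introduced here.

```python
import math

# Selected rows of Table 2: gene -> fold change (ΔpylT/WT).
# None stands for a protein reported as "distinct", i.e. without a ratio.
TABLE2_SUBSET = {
    "ilvC": 0.03,
    "mtaC2": 0.3,
    "mtaA": 1.8,
    "rpl18p": 10.3,
    "mtsF": 35.5,
    "rps13p": 37.7,
    "mtsD": 112.2,
    "mtsH": None,
}


def classify(fold, threshold=2.0):
    """Label a ΔpylT/WT ratio relative to a symmetric fold-change threshold."""
    if fold is None:
        return "distinct"
    if fold > threshold:
        return "up"
    if fold < 1.0 / threshold:
        return "down"
    return "below threshold"


for gene, fold in TABLE2_SUBSET.items():
    # log2 places up- and down-regulation on a symmetric scale around zero.
    log2_fold = "n/a" if fold is None else f"{math.log2(fold):+.2f}"
    print(f"{gene:8s} log2 fold = {log2_fold:>6s}  {classify(fold)}")
```

On this convention a ratio of 0.03 and a ratio of 35.5 are changes of comparable magnitude in opposite directions, which is easier to see on the log2 scale than in the raw ratios.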

Table 3. Newly identified Pyl proteins in M. acetivorans

Group/locus       Name           Carbon source   UAG translation   Mr stop   Mr Pyl   Mr observed
i Essential
  MA0528          MttB           TMA             Mr                36.1      53.8     50
  MA0855          MtaA           MeOH            Mr                9.7       38.4     38
  MA0932          MttB           TMA             Mr                36.2      54.1     52
ii Read through
  MA0864          His kinase     TMA             1 peptide         25.2      111.9    110
  MA3625          Endonuclease   MeOH            6 peptides        9.3       51.9     50
iii Extension
  MA1887          Hypo           TMA             1 peptide         35.2      36.2     36
  MA2509          Hypo           TMA             Mr                18.6      19.8     20
iv Unassigned
  MA0154          PylB           MeOH            Pyl peptide       40.3      42.9     42

Molecular mass (Mr) is in kilodaltons. The UAG translation column indicates the data source that confirmed read through of UAGs: Mr indicates UAG translation confirmed by the mass of the protein observed in gel; number of peptides indicates the number of significant peptides identified downstream of the Pyl locus; Pyl peptide indicates MS/MS identification of the Pyl-containing peptide. Hypo, hypothetical protein.
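The mass-based evidence in Table 3 amounts to asking which predicted mass the gel observation sits closer to. The sketch below shows that comparison for the four rows whose UAG translation column reads "Mr". It is a simplified illustration: the helper `closer_prediction` and the nearest-value rule are assumptions made here, and the study's actual assignment criteria are not described in this section.

```python
# Rows of Table 3 with mass-based evidence:
# locus -> (Mr if UAG is a stop, Mr if UAG is read as Pyl, Mr observed), in kDa.
TABLE3_MR_ROWS = {
    "MA0528": (36.1, 53.8, 50),
    "MA0855": (9.7, 38.4, 38),
    "MA0932": (36.2, 54.1, 52),
    "MA2509": (18.6, 19.8, 20),
}


def closer_prediction(mr_stop, mr_pyl, mr_observed):
    """Return which predicted mass lies nearer to the observed mass."""
    if abs(mr_observed - mr_pyl) < abs(mr_observed - mr_stop):
        return "read-through"
    return "stop"


for locus, (mr_stop, mr_pyl, mr_obs) in TABLE3_MR_ROWS.items():
    call = closer_prediction(mr_stop, mr_pyl, mr_obs)
    gap = mr_pyl - mr_stop  # separation between the two predictions
    print(f"{locus}: observed {mr_obs} kDa, predictions differ by {gap:.1f} kDa -> {call}")
```

All four rows come out nearer to the read-through mass. For MA2509 the two predictions differ by only about 1.2 kDa, so a nearest-value rule is far less decisive there than for MA0855, where they differ by almost 29 kDa.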

Proteins up-regulated in TMA-grown cells included those expressed from loci MA0146 (fivefold) and MA0527 encoding enzymes responsible for methanogenesis from dimethylamine (Table S6). The most differentially regulated gene is MA1362 (O-linked GlcNAc transferase). The enzyme is involved in protein glycosylation and induced over 20-fold in TMA- vs. methanol-grown cells, which could represent an uncharacterized mechanism to modulate protein function in TMA metabolism. Enzymes involved in lysine (lysA; 4-fold) and arginine (argJ; 17-fold) biosynthesis were also significantly more abundant in TMA-grown cells. An archaeal peptide chain release factor 1 paralog (SI Text) and three hypothetical proteins (MA0864, MA1887, and MA3997) are 3- to 10-fold up-regulated in TMA-grown cells. Two of these genes (MA0864 and MA1887) contain in-frame TAGs. We confirmed translation of the UAG codon in the histidine kinase MA0864 (Table 3 and Table S4). Histidine kinases are one of a small number of protein families enriched with Pyl proteins (Fig. S2). Identifying Pyl in MA0864 may indicate that the larger family of histidine kinases (of which there are eight in M. acetivorans) also encodes Pyl and possibly requires the residue for activity.

Expanding the Pyl Proteome. Our proteomic and mRNA expression data (Fig. S1) expanded the known Pyl proteome from a few previously experimentally verified Pyl proteins (28, 29) to the now confirmed expression of 19 UAG-containing ORFs (Table 3, Fig. S1, and Tables S1 and S4). In three cases (histidine kinases MA2732 and MA3962, recombinase MA4127, and Thg1 MA0816), the truncated protein was detected and identified by molecular mass that resulted from decoding a UAG as stop in ΔpylT. The data show that M. acetivorans is capable of terminating peptide chain extension at UAG and indicate that UAG was converted to a stop codon by deletion of pylT. We detected no peptides suggesting that UAG is read as a stop codon in WT.

Several putative Pyl-containing proteins (Table 3, Fig. S2, and Table S4) were identified as expressed by detection of peptides upstream of the Pyl locus. We were unable to identify peptides including or downstream of the Pyl locus in all cases, and therefore, it remains unclear if these UAGs are translated. For many of these proteins, Pyl is located at or near the C terminus; therefore, it is not possible to confirm Pyl insertion based on molecular mass of the protein, and there are few or no downstream peptides to confirm UAG read through.

For three proteins (MA0864, MA1887, and MA3625), MS/MS analysis identified peptides beyond the UAG codon that showed UAG read through and indicated the presence of Pyl in the corresponding protein. Determination of the relative molecular masses confirmed UAG translation for four additional proteins (MA0528, MA0855, MA0932, and MA2509). Among these confirmed full-length proteins are the methanogenesis protein MtaA, two MttB homologs, and three hypothetical proteins (Table 3 and Table S4). We identified a Pyl-containing peptide by MS/MS for the PylB protein, which catalyzes the first step of Pyl biosynthesis (41) (Fig. 3).

Interestingly, we were able to detect the C-terminal fragment of the tRNA editing enzyme Thg1 in ΔpylT. In WT cells, Thg1 is expressed as a full-length Pyl-containing protein (29). Thg1 had initially been annotated as two separate proteins in M. acetivorans, and it seems that, in the absence of Pyl, the C-terminal fragment of Thg1 is expressed independently. We previously showed that the two recombinant purified Thg1 halves could function in trans in in vitro activity assays (29). Our proteomic data showed that the second peptide is expressed independently, and therefore, the protein may reassemble in the cell to form functional Thg1 without Pyl.

Discussion

Roles of Pyl in M. acetivorans. Methanogenesis is the major source of biogenic methane production on the planet. Methane is, at the same time, a potent greenhouse gas and a promising biofuel. These facts are primary reasons why M. acetivorans and its relatives are not only of biological interest but also a likely source of biotechnological applications. It was assumed that the biological role of Pyl was restricted to its catalytic role in the active site of methylamine methyltransferases (28) until we showed that its cotranslational incorporation is essential for formation of full-length Thg1 in M. acetivorans (29). Taking these prior observations into account, Pyl proteins can be classified into three categories (Table 3 and Table S4): (i) Pyl is required for enzymatic activity, (ii) Pyl is noncatalytic but required for translation of full-length protein, and (iii) Pyl is inserted at intended stop codons, resulting in a short peptide extension. Group iv includes PylB, where the role of the extension is unclear. Most potential and experimentally verified Pyl proteins that we found fall into class iii.

Pyl in the Active Site. We confirmed that both MttB homologs (MA0528 and MA0932) contain Pyl, which is in line with the essential catalytic role of Pyl in these enzymes. Excitingly, we also identified a Pyl-containing methylcobamide:CoM methyltransferase (mtaA). Although it is unclear whether Pyl contributes to the catalytic activity of MtaA, Pyl is located within the characteristic triosephosphate isomerase (TIM)-barrel fold, like the catalytically active Pyl residues in MttB, suggesting a catalytic role for Pyl in MtaA. Other Pyl proteins may use the unusual amino acids in their active sites, but because many are uncharacterized, it is not possible to determine the role of Pyl at this time. We observed that Pyl proteins (potential and experimentally verified) are, indeed, overrepresented in methanogenesis-related enzymes. Histidine kinases, recombinases, transposases,

and radical S-adenosylmethionine enzymes are other biochemical activities that are enriched with Pyl proteins (Fig. S2), indicating the potential of a catalytic Pyl in these enzymes.

Pyl for UAG Read Through. Pyl is only required for read through of an in-frame UAG in Thg1 but not its activity (29). Pyl incorporation in two more proteins, MA0864 and MA3625, extends the proteins from 215 to 975 amino acids and from 79 to 446 amino acids, respectively. Pyl incorporation in both MA0864 and MA3625 links an N-terminal hypothetical protein to a C-terminal protein with known homologs. The C-terminal part of MA3625 encodes a hypothetical tRNA splicing endonuclease, and the C terminus of MA0864 encodes a histidine kinase. Although our data clearly show that a full-length protein is formed, the function and potential effect of joining these proteins or the role of Pyl (if any) remains to be characterized.

Pyl at the End. A third group of proteins includes candidates that cannot be easily assigned to one of the other groups (i.e., PylB, the hypothetical protein MA3459, and the putative histidine kinase MA3962). For these proteins, several amino acid residues are added as a result of UAG read through, with the potential of a regulatory function. Although MA3459 and MA3962 are clearly expressed in cells, we do not have experimental evidence that their protein products include Pyl. For PylB, however, we were able to detect the Pyl-containing peptide (Fig. 3A). PylB is the first enzyme in the Pyl biosynthesis pathway catalyzing the isomerization of lysine to 3-methylornithine (18). PylB is a TIM barrel (Fig. 3B), with a central cavity containing the catalytic (4Fe-4S) cluster and S-adenosylmethionine (42). The recombinant PylB is catalytically active without the extension of the 21-amino acid residue Pyl peptide (18). Nevertheless, the Pyl-containing peptide could modulate PylB activity or contribute to a regulatory mechanism in which the level of available Pyl is sensed, reminiscent of the Trp operon (43).

Fig. 3. Pyl incorporation in the Pyl biosynthesis enzyme methylornithine synthase (pylB). (A) MS/MS spectra identified Pyl (O) in the relevant tryptic peptide from WT M. acetivorans cells grown on methanol. In the peptide, Cys (C) is carbamidomethylated, and Met (M) is oxidized. (B) A view of the crystal structure of PylB (42) is shown with its substrate lysine (Lys) and cofactors S-adenosyl methionine (SAM) and an iron-sulfur (Fe4S4) cluster in the active site. The protein is colored to highlight the N (blue) and C (red) termini. The structure does not include the Pyl peptide. When pylB is expressed in WT M. acetivorans cells, the protein contains an additional 21 amino acids, including one Pyl residue, which would extend the C terminus beyond the red sphere.

UAG Is Reassigned to Pyl. It has not been conclusively shown whether UAG has a dual meaning (i.e., stop and Pyl) in M. acetivorans or if all UAG codons are read as Pyl. Earlier work with the Pyl-containing methylamine methyltransferases (26) suggested the existence of an RNA recoding element (Pyl insertion sequence, PYLIS) downstream of the Pyl codon, similar to the selenocysteine (Sec) insertion sequence (SECIS) found in selenoprotein mRNAs (44). In the case of Sec, SECIS designates particular UGA codons as Sec, whereas other UGA codons are stop signals. The putative PYLIS element was not subsequently identified in other Pyl-containing proteins, such as Thg1, or those listed in Table 3. Pyl can be incorporated into normally non-Pyl proteins, such as a recombinant uidA gene product in M. acetivorans (26) or β-galactosidase in E. coli expressing a recombinant Pyl system (22). We found no evidence that UAG is read as stop in WT M. acetivorans, but we did identify peptides from UAG stop codons in ΔpylT. Taken together, this evidence indicates that the UAG codon is reassigned to Pyl and not selectively recoded.

UAG Codon Evolution. In contrast to M. acetivorans, the Pyl-decoding bacterium Acetohalobium arabaticum dynamically expands and reduces its genetic code depending on the carbon source, encoding Pyl only when TMA is present (17). Some bacteria have the genes required to reassign UAG to Pyl but do not express Pyl-containing proteins. Under the conditions tested and even in the presence of TMA, Desulfitobacterium dehalogenans and Desulfitobacterium hafniense did not produce Pyl-tRNAPyl or detectable Pyl-containing protein in vivo (17). In contrast, all archaea examined so far [Methanosarcina barkeri (28), M. acetivorans (26), Methanosarcina mazei (45), and Methanococcoides burtonii (46)] that genetically encode Pyl seem to express Pyl-containing proteins in cells grown on methylamines or methanol (29). Underlying these different interpretations of the UAG codon are clear differences in UAG codon use. M. acetivorans and other Pyl-decoding archaea have significantly fewer UAG codons (∼5%) than both non–Pyl-decoding archaea and Pyl-decoding bacteria (∼20%) (17). The data suggest that, in Pyl-decoding archaea, selective pressure eliminated UAG codons at positions unfavorable to Pyl insertion and protein extension. Conversely, Pyl-decoding bacteria control the meaning of the UAG codon by either silencing the Pyl operon or only synthesizing Pyl-tRNAPyl when TMA is present.

Impact of Expanding the Genetic Code on Cellular Physiology. Microbiome (47) and environmental microbial sequencing (48) work is revealing organisms with new genetic code variations. At the same time, advances in synthetic biology are leading to new engineered genetic codes (3). Despite these achievements, little is known about the impact of an expanded genetic code on the proteome, physiology, or fitness of the cell. Initial studies indicate that there is a fitness cost for engineering genetic code expansion. Expanding the genetic code of E. coli by reassigning the UAG codon to phosphoserine leads to slowly growing cells, which is caused by unnatural extension of proteins beyond their normal UAG stop signal and incorporation of phosphoserine throughout the proteome (5, 49).

We observed that M. acetivorans derives a selective advantage from maintaining an expanded genetic code with Pyl. Methanosarcina are able to survive using more chemically diverse carbon sources [acetate, methanol, methylamines, or DMS (25)] than non–Pyl-decoding methanogens. Furthermore, Pyl confers a fitness advantage in methanol growth. The proteomic comparison of

WT and ΔpylT indicates repression of methanol methanogenesis enzymes (one of which contains Pyl) with a significant metabolic shift toward DMS metabolism. It is possible that, by inducing DMS metabolism, the cells are compensating for the reduced metabolic efficiency of methanol methanogenesis without Pyl. Deletion of the MtsX genes in addition to tRNAPyl should result in more drastic phenotypic defects. Such experiments would help to establish a mechanism for the metabolic compensation suggested by the proteomic data.

M. acetivorans is, in a way, prepared to survive without Pyl but at a phenotypic cost that reduces carbon source range, growth rate, and metabolic efficiency. Although M. acetivorans can survive with a reduced genetic code, we found that Pyl is essential for optimal growth on methanol and that removing Pyl has a tremendous impact on the proteome. The data show that reassignment of a rarely used codon can dramatically alter cellular metabolism. These effects will have to be considered in approaches to reduce the genetic code of organisms to open new codons for genetic code expansion.

Materials and Methods

We conducted three independent proteomic investigations. In two gel-based methods, we separated three soluble proteomes: WT cells grown on TMA, WT cells grown on methanol, and ΔpylT cells grown on methanol. Proteomes were separated either on independent gels (2D gels) that were overlaid to quantitate differential protein abundance or simultaneously by difference gel electrophoresis. For both gel-based methods, spots showing differential regulation >1.8-fold were characterized by MS/MS. We separated proteomes from methanol-grown cells (ΔpylT or WT) by LC, and fractions were analyzed by MS/MS. Proteins identified with two or more significant peptides in only one of the samples (e.g., WT or ΔpylT) but not both were labeled as distinct. Full details of the markerless deletion of tRNAPyl, growth conditions, and proteomic analysis are in SI Materials and Methods.

ACKNOWLEDGMENTS. We thank Dieter Jahn and Martina Jahn for providing laboratory space to perform initial proteomic experiments. We also thank Terrence Wu (Keck MS and Proteomics Resource, Yale University), Manfred Nimtz (Helmholtz Centre for Infection Research), and Thorsten Johl (Helmholtz Centre for Infection Research) for support with MS analysis and Yuchen Liu, Hans Aerni, and Ava Artaiz for critical discussions of the manuscript. We thank Bill Metcalf, Gary Olsen, Claudia Reich, and Carl Woese for reagents, training, discussions, and their enthusiasm for the Archaea. This work was supported by Natural Sciences and Engineering Research Council of Canada Grants RGPIN 04282-2014 (to P.O.) and RGPIN 04776-2014 (to I.U.H.), National Institutes of Health Grant GM22854 (to D.S.), and Defense Advanced Research Projects Agency Contract N66001-12-C-4211 (to D.S.).
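
As an illustrative aside, the "distinct" criterion stated in Materials and Methods (a protein identified with two or more significant peptides in only one of the two samples) can be written as a small set comparison. The following is a minimal sketch under stated assumptions, not the analysis pipeline used in the study: the function names, the dictionary input format, and the example protein identifiers and counts are hypothetical, and the sketch assumes that a protein supported by fewer than two significant peptides counts as not identified in that sample.

```python
from typing import Dict, Set

# Threshold taken from the text: "two or more significant peptides".
MIN_SIGNIFICANT_PEPTIDES = 2


def identified(peptide_counts: Dict[str, int]) -> Set[str]:
    """Return proteins supported by at least MIN_SIGNIFICANT_PEPTIDES peptides.

    peptide_counts maps a protein identifier to its number of significant
    peptides (hypothetical input format, assumed for illustration).
    """
    return {
        protein
        for protein, count in peptide_counts.items()
        if count >= MIN_SIGNIFICANT_PEPTIDES
    }


def distinct_proteins(
    wt: Dict[str, int], mutant: Dict[str, int]
) -> Dict[str, Set[str]]:
    """Label proteins identified in exactly one of the two samples."""
    wt_ids = identified(wt)
    mutant_ids = identified(mutant)
    return {
        "wt_only": wt_ids - mutant_ids,      # identified in WT, not in the mutant
        "mutant_only": mutant_ids - wt_ids,  # identified in the mutant, not in WT
    }


# Hypothetical significant-peptide counts, for illustration only.
wt_counts = {"protA": 3, "protB": 2, "protC": 1}
mutant_counts = {"protB": 4, "protD": 2}
result = distinct_proteins(wt_counts, mutant_counts)
```

Under this reading, a protein seen with a single peptide in one sample (protC above) is treated as unidentified and is not labeled distinct. Whether a protein with two peptides in one sample and one peptide in the other should count as distinct is not specified in the text; the sketch labels it distinct, which is an assumption.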

1. Gibson DG, et al. (2008) Complete chemical synthesis, assembly, and cloning of a Mycoplasma genitalium genome. Science 319(5867):1215–1220.
2. Lajoie MJ, et al. (2013) Genomically recoded organisms expand biological functions. Science 342(6156):357–360.
3. O’Donoghue P, Ling J, Wang YS, Söll D (2013) Upgrading protein synthesis for synthetic biology. Nat Chem Biol 9(10):594–598.
4. Liu CC, Schultz PG (2010) Adding new chemistries to the genetic code. Annu Rev Biochem 79:413–444.
5. Heinemann IU, et al. (2012) Enhanced phosphoserine insertion during Escherichia coli protein synthesis via partial UAG codon reassignment and release factor 1 deletion. FEBS Lett 586(20):3716–3722.
6. Moore B, Persson BC, Nelson CC, Gesteland RF, Atkins JF (2000) Quadruplet codons: Implications for code expansion and the specification of translation step size. J Mol Biol 298(2):195–209.
7. Wang K, et al. (2014) Optimized orthogonal translation of unnatural amino acids enables spontaneous protein double-labelling and FRET. Nat Chem 6(5):393–403.
8. Bröcker MJ, Ho JM, Church GM, Söll D, O’Donoghue P (2014) Recoding the genetic code with selenocysteine. Angew Chem Int Ed Engl 53(1):319–323.
9. Krishnakumar R, et al. (2013) Transfer RNA misidentification scrambles sense codon recoding. ChemBioChem 14(15):1967–1972.
10. Kwon I, Kirshenbaum K, Tirrell DA (2003) Breaking the degeneracy of the genetic code. J Am Chem Soc 125(25):7512–7513.
11. Zeng Y, Wang W, Liu WR (2014) Towards reassigning the rare AGG codon in Escherichia coli. ChemBioChem 15(12):1750–1754.
12. Lajoie MJ, et al. (2013) Probing the limits of genetic recoding in essential genes. Science 342(6156):361–363.
13. Krishnakumar R, Ling J (2014) Experimental challenges of sense codon reassignment: An innovative approach to genetic code expansion. FEBS Lett 588(3):383–388.
14. Hammerling MJ, et al. (2014) Bacteriophages use an expanded genetic code on evolutionary paths to higher fitness. Nat Chem Biol 10(3):178–180.
15. Knight RD, Freeland SJ, Landweber LF (2001) Rewiring the keyboard: Evolvability of the genetic code. Nat Rev Genet 2(1):49–58.
16. Ambrogelly A, Palioura S, Söll D (2007) Natural expansion of the genetic code. Nat Chem Biol 3(1):29–35.
17. Prat L, et al. (2012) Carbon source-dependent expansion of the genetic code in bacteria. Proc Natl Acad Sci USA 109(51):21070–21075.
18. Gaston MA, Zhang L, Green-Church KB, Krzycki JA (2011) The complete biosynthesis of the genetically encoded amino acid pyrrolysine from lysine. Nature 471(7340):647–650.
19. Wan W, Tharp JM, Liu WR (2014) Pyrrolysyl-tRNA synthetase: An ordinary enzyme but an outstanding genetic code expansion tool. Biochim Biophys Acta 1844(6):1059–1070.
20. Longstaff DG, et al. (2007) A natural genetic code expansion cassette enables transmissible biosynthesis and genetic encoding of pyrrolysine. Proc Natl Acad Sci USA 104(3):1021–1026.
21. Nozawa K, et al. (2009) Pyrrolysyl-tRNA synthetase-tRNA(Pyl) structure reveals the molecular basis of orthogonality. Nature 457(7233):1163–1167.
22. Ambrogelly A, et al. (2007) Pyrrolysine is not hardwired for cotranslational insertion at UAG codons. Proc Natl Acad Sci USA 104(9):3141–3146.
23. Chin JW (2014) Expanding and reprogramming the genetic code of cells and animals. Annu Rev Biochem 83:379–408.
24. Borrel G, et al. (2014) Unique characteristics of the pyrrolysine system in the 7th order of methanogens: Implications for the evolution of a genetic code expansion cassette. Archaea 2014:374146.
25. Liu Y, Whitman WB (2008) Metabolic, phylogenetic, and ecological diversity of the methanogenic archaea. Ann N Y Acad Sci 1125:171–189.
26. Mahapatra A, et al. (2006) Characterization of a Methanosarcina acetivorans mutant unable to translate UAG as pyrrolysine. Mol Microbiol 59(1):56–66.
27. Zhang Y, Gladyshev VN (2007) High content of proteins containing 21st and 22nd amino acids, selenocysteine and pyrrolysine, in a symbiotic deltaproteobacterium of gutless worm Olavius algarvensis. Nucleic Acids Res 35(15):4952–4963.
28. Hao B, et al. (2002) A new UAG-encoded residue in the structure of a methyltransferase. Science 296(5572):1462–1466.
29. Heinemann IU, et al. (2009) The appearance of pyrrolysine in tRNAHis guanylyltransferase by neutral evolution. Proc Natl Acad Sci USA 106(50):21103–21108.
30. Lessner DJ, Ferry JG (2007) The archaeon Methanosarcina acetivorans contains a protein disulfide reductase with an iron-sulfur cluster. J Bacteriol 189(20):7475–7484.
31. Ling J, et al. (2012) Protein aggregation caused by aminoglycoside action is prevented by a hydrogen peroxide scavenger. Mol Cell 48(5):713–722.
32. Cukras AR, Green R (2005) Multiple effects of S13 in modulating the strength of intersubunit interactions in the ribosome during translation. J Mol Biol 349(1):47–59.
33. Spierer P, Bogdanov AA, Zimmermann RA (1978) Parameters for the interaction of ribosomal proteins L5, L18, and L25 with 5S RNA from Escherichia coli. Biochemistry 17(25):5394–5398.
34. Ni S, Woese CR, Aldrich HC, Boone DR (1994) Transfer of Methanolobus siciliae to the genus Methanosarcina, naming it Methanosarcina siciliae, and emendation of the genus Methanosarcina. Int J Syst Bacteriol 44(2):357–359.
35. Oelgeschläger E, Rother M (2009) In vivo role of three fused corrinoid/methyl transfer proteins in Methanosarcina acetivorans. Mol Microbiol 72(5):1260–1272.
36. Bose A, Metcalf WW (2008) Distinct regulators control the expression of methanol methyltransferase isozymes in Methanosarcina acetivorans C2A. Mol Microbiol 67(3):649–661.
37. Bose A, Kulkarni G, Metcalf WW (2009) Regulation of putative methyl-sulphide methyltransferases in Methanosarcina acetivorans C2A. Mol Microbiol 74(1):227–238.
38. Molitor B, et al. (2013) A heme-based redox sensor in the methanogenic archaeon Methanosarcina acetivorans. J Biol Chem 288(25):18458–18472.
39. Moran JJ, House CH, Vrentas JM, Freeman KH (2008) Methyl sulfide production by a novel carbon monoxide metabolism in Methanosarcina acetivorans. Appl Environ Microbiol 74(2):540–542.
40. Li L, et al. (2007) Quantitative proteomic and microarray analysis of the archaeon Methanosarcina acetivorans grown with acetate versus methanol. J Proteome Res 6(2):759–771.
41. Krzycki JA (2013) The path of lysine to pyrrolysine. Curr Opin Chem Biol 17(4):619–625.
42. Quitterer F, List A, Eisenreich W, Bacher A, Groll M (2012) Crystal structure of methylornithine synthase (PylB): Insights into the pyrrolysine biosynthesis. Angew Chem Int Ed Engl 51(6):1339–1342.
43. Babitzke P, Gollnick P, Yanofsky C (1992) The mtrAB operon of Bacillus subtilis encodes GTP cyclohydrolase I (MtrA), an enzyme involved in folic acid biosynthesis, and MtrB, a regulator of tryptophan biosynthesis. J Bacteriol 174(7):2059–2064.
44. Yoshizawa S, Böck A (2009) The many levels of control on bacterial selenoprotein synthesis. Biochim Biophys Acta 1790(11):1404–1414.
45. Veit K, Ehlers C, Schmitz RA (2005) Effects of nitrogen and carbon sources on transcription of soluble methyltransferases in Methanosarcina mazei strain Go1. J Bacteriol 187(17):6147–6154.
46. Goodchild A, et al. (2004) A proteomic determination of cold adaptation in the Antarctic archaeon, Methanococcoides burtonii. Mol Microbiol 53(1):309–321.
47. Campbell JH, et al. (2013) UGA is an additional glycine codon in uncultured SR1 bacteria from the human microbiota. Proc Natl Acad Sci USA 110(14):5540–5545.
48. Ivanova NN, et al. (2014) Stop codon reassignments in the wild. Science 344(6186):909–913.
49. Aerni HR, Shifman MA, Rogulina S, O’Donoghue P, Rinehart J (2014) Revealing the amino acid composition of proteins within an expanded genetic code. Nucleic Acids Res, 10.1093/nar/gku1087.

O’Donoghue et al. PNAS | December 2, 2014 | vol. 111 | no. 48 | 17211