Published online 5 November 2014 Nucleic Acids Research, 2015, Vol. 43, No. 2 e8 doi: 10.1093/nar/gku1087 Revealing the amino acid composition of proteins within an expanded genetic code Hans R. Aerni1,2, Mark A. Shifman3, Svetlana Rogulina1,2, Patrick O’Donoghue4 and Jesse Rinehart1,2,*
1Department of Cellular & Molecular Physiology, Yale University, New Haven, CT 06520, USA, 2Systems Biology Institute, Yale University, West Haven, CT 06516, USA, 3Keck Biotechnology Resource Laboratory, Yale University, New Haven, CT 06511, USA and 4Departments of Biochemistry and Chemistry, The University of Western Ontario, London, ON N6A 5C1, Canada
Received August 19, 2014; Revised October 9, 2014; Accepted October 19, 2014
ABSTRACT have even engineered new codes for non-standard amino acids (NSAA) (4–8). Expanding the genetic code beyond The genetic code can be manipulated to reassign the standard amino acids (AAs) requires an aminoacyl- codons for the incorporation of non-standard amino tRNA synthetase capable of efficiently ligating NSAAs to acids (NSAA). Deletion of release factor 1 in Es- a tRNA that can decode an ‘open’ codon. These engi- cherichia coli enhances translation of UAG (Stop) neered orthogonal translation systems (OTSs) typically use codons, yet may also extended protein synthesis the rare amber stop codon UAG as the ‘open’ codon to en- at natural UAG terminated messenger RNAs. The fi- code the NSAA of choice. Recently, new recoding strate- delity of protein synthesis at reassigned UAG codons gies have emerged that have paved the way for converting and the purity of the NSAA containing proteins amber codons into a dedicated NSAA sense codon (6,9– produced require careful examination. Proteomics 11). This advance creates new possibilities for NSAA inser- would be an ideal tool for these tasks, but con- tion into proteins and has also opened up new pitfalls. Over 100 NSAAs have been encoded by diverse OTSs (5)andyet, ventional proteomic analyses cannot readily iden- methods that can quantify the purity and specificity of OTS- tify the extended proteins and accurately discover produced recombinant proteins and proteomes are limited. multiple amino acid (AA) insertions at a single UAG. We and others have observed natural AAs that can com- To address these challenges, we created a new pro- pete for insertion at amber codons (9,12–14) and also iden- teomic workflow that enabled the detection of UAG tified potential open reading frames (ORFs) that areat readthrough in native proteins in E. coli strains in risk for natural extensions of their native protein prod- which UAG was reassigned to encode phosphoser- ucts (6,10,11). These observations raise important concerns ine. The method also enabled quantitation of NSAA about the fidelity and purity of recombinant proteins con- and natural AA incorporation at UAG in a recom- taining NSAAs and require new methods to characterize binant reporter protein. As a proof-of-principle, we NSAA containing proteins and proteomes. For example, measured the fidelity and purity of the phosphoser- NSAA containing proteins used for structural studies or therapeutics would benefit from methods that would help to ine orthogonal translation system (OTS) and used diversify the OTS repertoire while maintaining the fidelity this information to improve its performance. Our re- and purity of NSAA-containing protein synthesis. sults show a surprising diversity of natural AAs at Here we present a new set of proteomic methods that can reassigned stop codons. Our method can be used quantify both standard and NSAAs in proteins and pro- to improve OTSs and to quantify amino acid purity teomes. We applied these tools to compare global amber at reassigned codons in organisms with expanded codon readthrough in the proteomes of a variety of Es- genetic codes. cherichia coli strains. This required the creation of a new workflow that enabled the comprehensive and unbiased dis- covery of peptides reporting UAG readthrough from shot- INTRODUCTION gun proteomic data. We validated this approach by exam- ining UAG readthrough in the proteomes of three E. coli The genetic code is nearly universal in living organisms, strains. The workflow also accommodates the characteriza- highly evolved and was assumed to be fixed (1–3). However, tion of phosphoserine (Sep) incorporation at UAG codons nature has revealed expanded genetic codes and researchers
*To whom correspondence should be addressed. Tel: +1 203 737 3144; Fax: +1 203 785 4951; Email: [email protected]