Structure of the surface layer of the methanogenic archaean Methanosarcina acetivorans

Mark A. Arbing(a,1), Sum Chan(a), Annie Shin(a), Tung Phan(a), Christine J. Ahn(a), Lars Rohlin(a), and Robert P. Gunsalus(a,b,1)

(a) University of California at Los Angeles–US Department of Energy (UCLA-DOE) Institute for Genomics and Proteomics and (b) Department of Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, CA 90095

Edited by Caroline S. Harwood, University of Washington, Seattle, WA, and approved June 1, 2012 (received for review December 14, 2011)

11812–11817 | PNAS | July 17, 2012 | vol. 109 | no. 29 www.pnas.org/cgi/doi/10.1073/pnas.1120595109

MICROBIOLOGY

Author contributions: M.A.A. and R.P.G. designed research; S.C., A.S., T.P., C.J.A., and L.R. performed research; M.A.A., S.C., A.S., T.P., C.J.A., L.R., and R.P.G. analyzed data; and M.A.A. and R.P.G. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Data deposition: The atomic coordinates and structure factors reported in this paper have been deposited in the Protein Data Bank, www.pdb.org [PDB ID codes 3U2H (C2 space group) and 3U2G (P622 space group)].

(1) To whom correspondence may be addressed. E-mail: [email protected] or robg@microbio.ucla.edu.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1120595109/-/DCSupplemental.

Archaea have a self-assembling proteinaceous surface (S-) layer as the primary and outermost boundary of their cell envelopes. The S-layer maintains structural rigidity, protects the organism from adverse environmental elements, and yet provides access to all essential nutrients. We have determined the crystal structure of one of the two “homologous” tandem polypeptide repeats that comprise the Methanosarcina acetivorans S-layer protein and propose a high-resolution model for a microbial S-layer. The molecular features of our hexameric S-layer model recapitulate those visualized by medium-resolution electron microscopy studies of microbial S-layers and greatly expand our molecular view of S-layer dimensions, porosity, and symmetry. The S-layer model reveals a negatively charged molecular sieve that presents both a charge and size barrier to restrict access to the cell periplasmic-like space. The β-sandwich folds of the S-layer protein are structurally homologous to eukaryotic virus envelope proteins, suggesting that Archaea and viruses have arrived at a common solution for protective envelope structures. These results provide insight into the evolutionary origins of primitive cell envelope structures, of which the S-layer is considered to be among the most primitive: it also provides a platform for the development of self-assembling nanomaterials with diverse functional and structural properties.

S-layer structure | X-ray crystallography | surface layer protein | domain of unknown function 1608 | methanogen

The cell envelopes of archaeal and bacterial cells contain, at a minimum, a cytoplasmic membrane (CM) that sequesters its genetic, cellular, and metabolic potential within a confined, concentrated space. External to this membrane the Archaea and bacteria have evolved distinct molecular structures to provide for protection from the environment, to ensure structural rigidity, and to define cell shape that, together, mediate interactions with the external milieu. Most Archaea use a proteinaceous surface (S-) layer, which encloses the CM (Fig. 1), as the outermost cell envelope structure (1). In contrast, bacteria generally possess one or more peptidoglycan layers surrounding the CM, and the Gram-negative types are further encased by an outer membrane (OM). The archaeal S-layer has been postulated to be the most primitive component of the cell envelope structure (2) and may have originated before the evolution of the murein-containing cell envelope/sacculus. Despite their essential biological function and many potential applications in nanotechnology (3), we still have limited knowledge of any microbial S-layer structure.

Electron microscopy studies have provided medium-resolution information on the architecture of S-layers of several Archaea and bacteria. The protein subunits making up the S-layers are arranged in 2D crystalline lattices that have oblique, tetragonal, or hexagonal symmetry with hexagonal symmetry predominating for archaeal S-layers (reviewed in refs. 3, 4). Several high-resolution crystal structures of S-layer–associated proteins have been determined; however, these structures represent but a single component of a multicomponent S-layer (5), isolated domains of multidomain proteins (6, 7), or are computationally annotated proteins not shown to be localized on the cell surface (8, 9).

The Methanosarcina acetivorans cell envelope is typical of archaeal species belonging to the order Methanosarcinales, whereby the cytoplasmic membrane is enclosed by an S-layer (9). It may also be decorated with an external layer of heteropolysaccharide, termed methanochondroitin, depending on the environmental conditions of cell culture (10). We previously identified (9) the S-layer protein of M. acetivorans as MA0829 (671 aa). It has an N-terminal signal peptide, tandem-duplicated DUF1608 domains (11), followed by a negatively charged tether of ∼60 aa and a predicted C-terminal transmembrane helix. The latter presumably anchors the S-layer to the cytoplasmic membrane (Fig. 1C). Protein domains of the domain of unknown function (DUF)1608 protein family have average amino acid lengths of 250–300 aa, are typically found in contiguous pairs within a polypeptide chain, and, with the exception of two extreme halophilic Archaea, are exclusively found in the methanogenic (12).

Since the discovery of microbial S-layers over fifty years ago (13), knowledge of their molecular properties that govern cell integrity and permeability has remained limited. To elucidate the structural features of the M. acetivorans S-layer that provide for these protective and transport functions, we have determined the structure of the C-terminal DUF1608 domain of MA0829 in two crystal forms by X-ray crystallography. The structures allow for a model of the tandem DUF1608 MA0829 protein and for its assembly into the 2D S-layer lattice. Our results provide high-resolution insight into the functional and structural properties of an S-layer required to maintain cell metabolism.

Fig. 1. The M. acetivorans cell envelope. (A) Electron micrograph of the cell envelope of Methanosarcina barkeri, a representative species of the order Methanosarcinales. Cytoplasmic membrane (CM), S-layer (SL), and methanochondroitin (MC) are indicated. Reprinted from ref. 10. Copyright American Society for Microbiology. (B) Cartoon of the cell envelope indicating relative positions of cell envelope structures including the periplasmic-like space (PLS) contained between the S-layer and cytoplasmic membrane. (C) Linear cartoon representation of the domain organization of the M. acetivorans MA0829 S-layer protein including: N-terminal signal sequence (S), N- and C-terminal DUF1608 domains (NTR and CTR), tether region (T), and membrane anchor (M).

Results and Discussion

Crystal Structure of the MA0829 C-Terminal DUF1608 Domain. The ability to determine the M. acetivorans MA0829 tandem DUF1608 S-layer structure has been limited by availability of suitable diffracting crystals. However, the second DUF1608 domain, or C-terminal repeat (CTR), containing amino acids 294–570 of the mature MA0829 amino acid sequence was successfully crystallized in two different crystal forms. The structure (Fig. 2A) was determined by selenomethionine (SeMet) single-wavelength anomalous diffraction (SeMet-SAD) to a resolution of 2.3 Å using SeMet-labeled protein, and the second crystal form obtained with unlabeled protein was solved by molecular replacement at a resolution of 2.36 Å. Both structures have excellent geometry (Table S1) and are virtually identical (Fig. S1). The SeMet-labeled structure contains the complete CTR (Thr-295 through Thr-568), whereas the native protein structure comprises amino acids Glu-297 through Glu-566 with two short disordered loop regions (Ser-447 to Ala-455 and Gly-484 to Ser-488).

The MA0829 CTR structure reveals two structurally related domains composed primarily of β-sheet (Fig. 2 and Fig. S2). Both domains have a β-sandwich fold with two opposed antiparallel β-sheets as their structural core. The N and C termini of the polypeptide are located in close proximity in the first domain (domain I); thus, the chain passes from domain I to domain II and back to complete the bipartite CTR structure. As the chain passes from the first domain to the second, it folds back on itself to form a small ∼40-aa connector subdomain consisting of an elongated three-stranded β-sheet (Fig. 2A, yellow). There is significant structural similarity between domains I and II (3.1 Å rmsd for the alignment of 66 residues with 3% sequence identity), although each has differences in both topology and the number of β-strands (Fig. S3): domain II has an additional three-stranded β-sheet appended to one of the β-sheets. The connector subdomain is not more tightly associated with one domain over the other and appears to limit the range of motion between the two β-sandwiches without completely rigidifying the overall structure.

Fig. 2. The crystal structure of the S-layer protein of M. acetivorans. (A) The MA0829 CTR shown in ribbon representation. Domains I and II are colored red and blue, respectively, and the connector subdomain is colored yellow. (B) Crystallographic CTR dimer that likely represents the physiological MA0829 NTR-CTR protein. Domains I and II of the symmetry-related CTR molecule are colored orange and cyan, respectively, and the connector subdomain is colored dark green. (C) Sequence alignment of the sequences for the MA0829 NTR and CTR DUF1608 domains with identical residues in gray shading. Secondary structure elements above the alignment are based on the MA0829 CTR structure and are colored according to the CTR domains in A. Residues predicted to be involved in the NTR-CTR interface based on the CTR dimer structure are indicated by green circles. Numbering is according to the mature processed sequence upon removal of the signal peptide.

Crystallographic CTR Dimer Structure. The MA0829 DUF1608 CTR is a dimer in solution; however, the asymmetric unit of both crystal lattices contains a single DUF1608 CTR molecule. The Protein Interfaces, Surfaces and Assemblies (PISA) program (14), which identifies possible protein interfaces and quaternary assemblies, was used to determine a possible CTR dimer structure. The top PISA prediction of a MA0829 CTR dimer was the same for both the P622 and C2 space groups, and structural superposition of the predicted dimers (Fig. S1) reveals that they are virtually identical (DALI Z score of 49.6 and 1.6 Å rmsd for the alignment of 511 residues with 100% sequence identity). The intermolecular interface, mediated by residues solely from domain I, buries ∼925 Å² and 875 Å² of each CTR molecule, or 7.1% and 7.2% of the surface of each CTR molecule, in the interface for the P622 and C2 space groups, respectively. The interfaces of the CTR dimers were further compared using the program SC (15) to calculate the geometric shape complementarity (Sc) of the interfaces. Perfectly correlated and uncorrelated surfaces have theoretical Sc values of 1 and 0, respectively, with a range of 0.70–0.76 seen for biologically relevant protein–protein interfaces. The Sc values for the two predicted interfaces are 0.73 and 0.82 for the P622 and C2 dimers, respectively, revealing that the interfaces are highly complementary. In total, 28 aa contribute to the intermolecular interface with an extensive network of 16 hydrogen bonds formed between eight residues from each subunit (see Fig. S5).

DUF1608 N-Terminal Repeat and CTR Domains Are Highly Homologous. A high degree of sequence identity between the MA0829 N- and C-terminal DUF1608 repeats (79% identical and 87% similar; Fig. 2C) suggests that the DUF1608 crystallographic CTR structure is representative of the N-terminal repeat (NTR) structure. A sequence alignment of the NTR and CTR DUF1608 domains (Fig. 2C) and mapping of sequence variability onto the MA0829 CTR surface (Fig. S4) using the program Consurf (16) reveal conservation of the residues that form the CTR dimer interface. Of the 28 aa contributing to the intermolecular interface of the CTR dimer, 21 are identical between the NTR and CTR domains, 2 residues are conservatively substituted, 1 residue is semiconservatively substituted, and 4 residues are nonconserved substitutions. Of the four nonconserved residues, in two instances (CTR residues Met-574 and Ile-575), the intersubunit interactions are mediated by the backbone atoms of the nonconserved residues with conserved residues in the symmetry-related chain, and, thus, the interactions mediated by these two residues would be conserved in the NTR-CTR interface. A third nonconserved interaction occurs at the N terminus of the CTR chain (residue Tyr-296) but is not involved in the primary interaction between the two surfaces. The greatest degree of variability between the NTR and CTR sequences occurs primarily on the extracellular face of MA0829 CTR.

The crystallographic CTR dimer structure would also be representative of the tandem DUF1608 NTR-CTR domain structure of the full-length MA0829 polypeptide because all major structural components are preserved in the two domains. To compare the CTR-CTR interface with that of a predicted NTR-CTR interface, a model of the MA0829 NTR-CTR tandem DUF1608 protein was generated. The NTR DUF1608 domain was homology-modeled using the Phyre server (17) with the CTR domain as a template; the NTR model was superimposed on one DUF1608 domain of the CTR dimer to generate the NTR-CTR tandem DUF1608 protein (Fig. 3). The interfaces of the CTR dimer and the predicted NTR-CTR tandem DUF1608 protein are virtually identical (Fig. 3 C and D): the minor differences in sequence between the NTR and CTR DUF1608 domains, including three short amino acid insertions, are primarily in surface-exposed positions that do not contribute to the dimer interface. The termini of the two CTR molecules in the dimer in both crystal forms are in close proximity (∼9 Å) and easily spanned by the five amino acid linker that connects the NTR and CTR in the tandem DUF1608 MA0829 protein.

The properties of the interfaces, including shape complementarity and buried surface area, and the equivalent dimer structures from two different CTR crystal forms strongly suggest that the crystallographic CTR dimer is representative of the CTR dimer in solution. Moreover, there is strong sequence conservation of structurally important residues that contribute to the intermolecular interactions in the CTR dimer and by extension to the intramolecular interactions in the tandem DUF1608 MA0829 protein. This implies similar structural properties across a large family of paralogous and orthologous tandem DUF1608 S-layer proteins of the (Fig. S5).

Fig. 3. Comparison of the interfaces of the crystallographic CTR dimer and the predicted NTR-CTR protein. (A) Crystallographic CTR dimer with CTR domains labeled and colored as in Fig. 2A. (B) Model of the NTR-CTR tandem DUF1608 protein. The DUF1608 domains are labeled and colored as in Fig. 2B. The position of the 12-aa residues inserted near the N terminus of the NTR model is indicated by an arrow. (C) Crystallographic CTR dimer with residues that contribute to the intermolecular interface shown in space-filling representation. Interfacing amino acid residues from one CTR domain are colored dark gray, whereas those from the other CTR domain are colored light gray. (D) Predicted NTR-CTR tandem DUF1608 protein with residues that contribute to the predicted interdomain interface shown in space-filling representation. Interfacing amino acid residues from the different domains are colored as in C.

Model of the M. acetivorans S-Layer Hexagonal Tile. The CTR crystal form I contains a hexagonal lattice (crystallographic space group P622), and examination of symmetry-related molecules reveals a highly plausible model for the self-assembly of MA0829 into the M. acetivorans S-layer. Three crystallographic CTR dimers are arranged around a threefold crystallographic axis and serve as the basic repeating unit of the S-layer (Fig. 4 and Fig. S6). Translation of this trimeric unit in two dimensions throughout the crystal lattice generates a flat sheet with sixfold symmetry elements consistent with the molecular features of hexagonal archaeal S-layers visualized by electron microscopy (EM) (2, 18, 19). The intermolecular interactions in the trimeric MA0829 CTR assembly occur at the threefold axis of symmetry within the trimer and by interactions at the sixfold axis of symmetry that coincides with the primary pore of the S-layer sheet (discussed below). The interactions at the threefold axis are primarily between residues of domain I of each DUF1608 tandem repeat closest to the threefold axis. The intermolecular interactions at the sixfold axis occur between domain II of two MA0829 protomers. Forty-nine amino acid residues contribute to the interface between adjacent protomers with nine hydrogen bonds and two salt bridges stabilizing the interface. In total, ∼695 Å² of an MA0829 protomer is buried in the interface with the adjacent protomer in the trimeric assembly. The shape complementarity of the interface is high with an average value of 0.70 for the three interprotomer interfaces. Sequence alignment of the MA0829 NTR and CTR sequences reveals that residues making hydrogen bonds and salt bridges in the trimeric assembly are 100% conserved between the MA0829 NTR and CTR tandem DUF1608 repeats (Fig. S5).

In addition to these interactions, an ammonium citrate molecule from the crystallization solution is bound at the intermolecular interface between individual MA0829 molecules at the threefold axis. Two residues from each protomer, Lys-387 and Lys-390 from one CTR protomer and Thr-536 and Glu-541 from the adjacent CTR protomer, make hydrogen bonds to the citrate (Fig. S6B). These residues are conserved in analogous positions (NTR residues: Lys-110, Lys-113, Thr-259, Glu-264) in the N-terminal DUF1608 repeat, suggesting that citrate, or a related molecule, is likely to be bound in this interface and involved in stabilizing interacting proteins in the S-layer.

Two-Dimensional S-Layer Lattice. The overall appearance of the S-layer sheet is made up of a series of hexagonal tiles with each tile containing six complete CTR dimers and halves of an additional six CTR dimers that connect the tile with six surrounding tiles (Fig. 4). The extensive network of intermolecular interactions between S-layer subunits includes hydrogen bonds, salt bridges, and van der Waals interactions (Fig. S5) but no covalent bonds, in agreement with prior observations (2). In total, the interface between a CTR dimer and the surrounding molecules buries 2,700 Å² (11.4%) of the solvent accessible surface of a CTR dimer in the interface with 17% of the surface exposed residues of a CTR dimer participating in the intermolecular interfaces. The orientation of the S-layer relative to the CM is obvious as the N and C termini of all chains are located on the same face of the sheet. An equally striking feature of the 2D sheet is a funnel-like structure on the sixfold axis of symmetry that points toward the CM. The overall morphology of the S-layer, including center-to-center spacing between primary pores, 120 Å, and S-layer height, 45 Å, is consistent with the data from EM studies of other microbial S-layers (2, 18–21). The sequences of putative S-layer proteins from related Methanosarcinales also have a similar architecture with tandem DUF1608 domains; a high degree of conservation of amino acid residues in functionally and structurally important positions suggests that these Archaea will have S-layers with properties similar to that of M. acetivorans (Fig. S5). The inner and outer surfaces of the S-layer are overwhelmingly negative (Fig. 4D). The structure of the NTR-CTR sheet would be essentially identical to the CTR-CTR sheet; however, minor differences occur in sequence between the NTR and CTR DUF1608 domains (Fig. 2C) and would result in a modest increase in negative charge for the S-layer composed of NTR-CTR tandem DUF1608 proteins versus the S-layer model based on the CTR-CTR DUF1608 structure alone.

Fig. 4. The putative M. acetivorans S-layer structure inferred from crystal packing in the hexagonal crystal lattice. (A) Five S-layer tiles from the 2D sheet in the hexagonal crystal form as viewed from the extracellular side. The six CTR crystallographic dimers that contribute to the primary pore are shown in colors of the spectrum. CTR dimers that span adjacent tiles are shown in gray. The positions of primary (P), trimer (T), and asymmetric (A) pores are indicated by arrows. The trimeric building block that is the basic repeating unit of protomers in the S-layer is composed of the gray, green, and yellow crystallographic dimers surrounding the indicated trimer pore. (B) Cutaway side view of the S-layer with the extracellular surface at the top. The distance between the arrowheads is ∼240 Å. (C) A single tile as seen from the extracellular surface (Upper) and from the periplasmic-like space (Lower). A smoothed molecular surface is shown and, on the top, the tile is tilted 40° away from the viewer from the view seen in A. (Lower) The tile is rotated 180°. (D) A single tile colored by electrostatic surface potential with the extracellular surface (Upper) and the view from the periplasmic-like space (Lower). (E) Cutaway views of the primary (P), asymmetric (A), and trimer (T) pores, respectively.

S-Layer Pores. Porosity of the S-layer sheet is evident by the presence of three distinct pore types: P or “primary pores” on the sixfold axis, T or “trimer pores” on the threefold axis at the center of the trimeric building block, and A or “asymmetric pores” at the interface of two adjacent trimeric building blocks (Fig. 4). The P pore is the most prominent structural feature of the S-layer and has the overall appearance of a funnel when viewed from the extracellular side of the S-layer. The primary pore constricts to ∼13 Å before opening slightly at the cytoplasmic face of the pore (Fig. 4 and Fig. S7). The T pore on the threefold axis is roughly triangular in shape and constricts to an approximate diameter of ∼8 Å at the narrowest point. Unlike the P pore, the dimensions of the T pore remain relatively unchanged throughout its vertical profile (Fig. S7). The A pore is an irregularly shaped gap that exists at the interface of the trimeric building blocks (Fig. 4). At its narrowest point, the A pore has dimensions of 5 × 14 Å. The surfaces of the P and T pores are overwhelmingly negative (Fig. 4D), whereas the A pore is slightly less negative because of the presence of several positively charged residues (CTR residues: Lys-387, Lys-401, and Arg-537; corresponding NTR residues: Lys-110, Lys-124, Arg-260) at the periphery of the pore. The dimensions of all three pores could vary considerably depending on the positioning of amino acid side chains, but their limited size and strong negative charge suggest that the S-layer is a formidable charge and size barrier that protects the organism from noxious or chaotropic compounds and/or hydrolytic enzymes. Thus, the pores exhibit distinct morphological differences in shape and dimension.
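The size barrier described above can be illustrated with a minimal Python sketch. It is not part of the original analysis and rests on loud simplifying assumptions: each solute is treated as a rigid sphere, each pore is reduced to the constriction dimensions quoted in the text (P ≈ 13 Å, T ≈ 8 Å, A ≈ 5 × 14 Å), and side-chain motion is ignored even though the text notes that pore dimensions could vary considerably. The names `Pore`, `pores_passable_by_size`, and `impeded_by_charge`, and the probe diameters, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pore:
    name: str
    narrow: float  # smallest opening at the constriction, in angstroms
    wide: float    # largest opening at the constriction, in angstroms


# Constriction dimensions as quoted in the text (approximate values).
PORES = (
    Pore("P", 13.0, 13.0),  # primary pore on the sixfold axis
    Pore("T", 8.0, 8.0),    # trimer pore on the threefold axis
    Pore("A", 5.0, 14.0),   # asymmetric pore, an irregular 5 x 14 gap
)


def pores_passable_by_size(diameter):
    """Names of pores whose narrowest opening admits a rigid sphere.

    Assumption: a sphere fits only if its diameter does not exceed the
    smallest dimension of the constriction.
    """
    return [p.name for p in PORES if diameter <= p.narrow]


def impeded_by_charge(net_charge):
    """Crude charge filter: the pore surfaces are strongly negative, so a
    solute with net negative charge is flagged as impeded (assumption)."""
    return net_charge < 0


if __name__ == "__main__":
    # Hypothetical probe diameters in angstroms, for illustration only.
    for d in (4.0, 7.0, 10.0, 20.0):
        print(f"{d:5.1f} A sphere -> pores: {pores_passable_by_size(d)}")
```

Under these assumptions a 7 Å sphere would pass the P and T pores but not the 5 Å slot of the A pore, which mirrors the qualitative point that the three pore types differ in shape and dimension.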

Arbing et al. PNAS | July 17, 2012 | vol. 109 | no. 29 | 11815 Downloaded by guest on October 1, 2021 All three S-layer pore types would allow passage of small mo- lecular weight nutrients (e.g., amino acids, purines, pyrimidines, and vitamins) needed for cell growth. Additionally, the P pore is sufficiently large to allow passage of siderophores and conceivably, oligopeptides, oligonucleotides, and lipids. The loops that form the primary pore adopt slightly different conformations in the P622 and C2 crystal forms, suggesting that the size of the primary pore may vary slightly (Fig. S8) and could allow passage of larger com- pounds. The intense negative charge of the pore surfaces, and of both the internal and external surfaces of the S-layer, would im- pede the movement of negatively charged molecules which pre- sumably cross the S-layer as salts. As a primitive precursor of more advanced bacterial envelope structures, the molecular properties of which are easily tailored to adapt to changing environmental con- ditions (e.g., through incorporation of substrate-specificporinsand high-affinity receptors), the large pores of the S-layer are a neces- Fig. 5. Structural homology between the M. acetivorans S-layer protein and sary feature. The incorporation of other proteins that contain eukaryotic virus envelope proteins. Superposition of domain II of the DUF1608 domains, nine of which have been identified in the M. MA0829 CTR (colored as in Fig. 2A) with domain I of the envelope proteins acetivorans genome and three of which have been shown to be of West Nile virus (PDB ID code 3I50; colored orange), a member of the surface-exposed (9), could allow modification of the sieve-like Flaviviridae family (A), and Sindbis virus (PDB ID code 1Z8Y; colored green), properties of the S-layer in response to environmental cues. a member of the Togaviridae family, in the alphavirus subfamily (B). 
Domain II of the MA0829 CTR is rotated ∼210° on the y axis relative to the orien- Sequence Identity Suggests a Common S-Layer Architecture for tation shown in Fig. 2A, and the N and C termini of the domains are labeled. Multiple breaks are visible in domain I of the envelope proteins, as there are Methanosarcinaceae. To determine whether the S-layer of related multiple connections between domain I and domains II and III. Methanosarcinales species would form a similar structure as that seen for M. acetivorans, we compared the primary amino acid sequences of the M. acetivorans MA0829 S-layer protein with the structural element flanked by viral domains II and III, and it Methanosarcina mazei S-layer proteins of Gö1 (MM_1976) and mediates a variety of protein–protein interactions with adjacent M barkeri fi . strain Fusaro (Mbar_A1758) identi ed by our molecules in the virus envelope (25–27). Flexibility in the hinge Methanococcoides burtonii previous studies (9, 12), and DSM regions linking domain I with domains II and III is believed to allow fi 6242 (Mbur_1690), identi ed by BLAST searches. Secretion of the large conformational changes required during virion maturation M burtonii Mbur_1690 by . has been demonstrated by experiments and fusion with host cell membranes (28). This suggests that con- M burtonii that found this protein in . culture supernatants (22). All formational changes in the M. acetivorans S-layer protein may also four S-layer sequences have a similar domain organization allow limited bending of the S-layer to fit around the considerably (Fig. S5), with an N-terminal signal sequence followed by tandem larger archaeal cell membrane surface. A lack of fivefold symmetry DUF1608 domains, a tether region, and C-terminal trans- in the M. acetivorans S-layer hexagonal tile structure would pre- fi membrane anchor. 
There is also signi cant sequence conservation clude formation of smaller virus-shaped bodies and, thus, contrib- in the DUF1608 amino acid residues that govern the in- M acetivorans M acetivorans ute to cell envelope stability. Although both the . termolecular interactions between the . S-layer S-layer protein and viral envelope proteins form large protein as- protomers (Fig. S5), suggesting that related Methanosarcinales semblies, the lack of significant primary amino acid sequence sim- species plus two haloarchaeal species, Haloarcula marismortui NP Natronomonas pharaonis ilarity and limited chance for lateral gene exchange between these and DSM2160 (12), have S-layers with phylogenetically diverse organisms suggest that Methanosarcinales structures similar to that proposed for M. acetivorans by this study. β and viruses may have solved the common requirement for pro- More distantly related Archaea also appear to use -sandwich tective barriers through convergent protein evolution. structures in their S-layers. Sulfolobales S-layers are composed of stalk (SlaB) and canopy (SlaA) proteins, both of which are Conclusions. The structure of the M. acetivorans S-layer protein β predicted to contain multiple -sandwich domains (23). Al- DUF1608 domain and the working model for the 2D S-layer though the Sulfolobales S-layers exhibit three and sixfold sym- lattice are instances in which a high-resolution crystal structure metry elements, the overall architecture and pore types are M acetivorans has recapitulated, in high resolution, the symmetry and structural distinct from those of our model of the . S-layer. A features of microbial S-layers visualized by medium-resolution lack of detectable sequence homology between the Meth- EM studies. 
As a representative of a large family of homologous anosarcinales and Sulfolobales S-layer proteins suggests different archaeal Methanosarcinaceae proteins, this previously undescribed solutions for protective structures have been met through the use β S-layer protein structure allows insight into the properties, evo- of structurally related -sandwich domains. lution, and adaptation of the cell envelope structures of key methanogenic species involved in anaerobic carbon cycling. The Structural Homology Between the M. acetivorans S-Layer Protein and atomic level view of the archaeal S-layer reveals the molecular Viral Envelope Proteins. A search of known protein structures de- architecture of a protective, yet porous, barrier: it also has po- posited in the Protein Data Bank was used to find proteins struc- M acetivorans tential to facilitate the design of nanomaterials requiring periodic turally similar to the . S-layer NTR and CTR domains fi (Table S2). Most striking is the structural homology between the spacing and/or well-de ned pore structures. S-layer protein and envelope proteins of eukaryotic RNA viruses of Materials and Methods the Flaviviridae and Togaviridae families. These two virus families Cloning and Purification of M. acetivorans MA0829 C-Terminal DUF1608. The use structurally homologous envelope proteins in their capsids (24, DUF1608 CTR (amino acid residues 294–570 of the mature amino acid sequence) 25), although they differ in the number and organization of these from M. acetivorans C2A genomic DNA was cloned into the NdeI and XhoI sites proteins in the mature virion (24, 26). Similarity between the S-layer of pET22b (Novagen). The resulting construct has a methionine residue added protein and the viral envelope proteins is strongest between to the N terminus of the CTR sequence and a C-terminal hexahistidine tag. MA0829 CTR domain II and domain I of the viral envelope The expression plasmid was transformed into E. 
coli Rosetta (DE3) cells proteins (Fig. 5). In viral envelope proteins, domain I is a central (Novagen). Cells were grown in LB media at 37 °C, and growth continued

for 4 h after protein expression was induced with 1 mM isopropyl-β-D-thiogalactopyranoside (IPTG) at an OD600 of 0.6. SeMet-labeled protein was produced at the same temperature using the autoinduction method (29). For protein purification, cell pellets were resuspended in lysis buffer [20 mM Na2HPO4 (pH 7.4), 0.5 M NaCl, 15 mM imidazole] containing protease inhibitor mixture (Sigma) and PMSF (100 μM). The cell suspension was broken using a French press, and the clarified supernatant was incubated with Ni-nitrilotriacetic acid (Ni-NTA) agarose beads (Qiagen) for 60 min at 4 °C. The beads were washed extensively with lysis buffer, and bound protein was eluted using buffer containing increasing concentrations (50 mM, 100 mM, 300 mM) of imidazole. The proteins were further purified by gel-filtration chromatography using a HiPrep Sephacryl S-300 size exclusion column (GE Life Sciences) equilibrated in 20 mM Na2HPO4 (pH 7.4), 0.5 M NaCl. Fractions containing the MA0829 CTR were concentrated to 10 mg/mL (native protein) or 5 mg/mL (SeMet-labeled protein) for crystallization screening.

Protein Crystallization. Native and SeMet-labeled protein crystals were grown at 18 °C using the hanging drop vapor diffusion method. Crystals of the native protein grew in 0.2 M sodium sulfate, 20% (wt/vol) polyethylene glycol (PEG)3350 using a 1:1 ratio of protein to well solution and were flash frozen for data collection without any additional manipulation. SeMet-labeled protein crystals grew in 0.4 M ammonium citrate, 20% (wt/vol) PEG3350 using a 1:2 ratio of protein to reservoir solution and were cryoprotected in reservoir solution containing 20% (vol/vol) glycerol.

Structure Determination. Single-wavelength datasets for both native and SeMet-labeled MA0829 CTR protein crystals were collected on beam line 24-ID-C at the Advanced Photon Source at Argonne National Laboratory. Datasets were processed and scaled with DENZO and Scalepack (30). The structure of the MA0829 CTR in crystal form I, crystallographic space group P622, was determined by SeMet-SAD at a resolution of 2.30 Å using the program HKL2MAP (31), which uses the SHELX suite of programs (32). Model building was performed with Arp/Warp (33) and Coot (34). The model was refined with Refmac5 (35), using translation, libration, screw (TLS) parameters (36) in the later stages of refinement, which is part of the CCP4 suite (37). A second crystal form of unlabeled MA0829 CTR, in crystallographic space group C2, was solved by molecular replacement using Phaser (38) at a resolution of 2.36 Å using the model from the hexagonal crystal form. The native protein structure was refined as per the SeMet-labeled protein. Refinement statistics are summarized in Table S1. Structural homology searches were performed using DALI (39), and figures were prepared with PyMol. Electrostatic calculations were performed with Adaptive Poisson-Boltzmann Solver (APBS) software (40) using the PDB2PQR server (41) with the following parameters: parameters for solvation energy (PARSE) force field, N and C termini with neutral charge, and ionic strength of 100 mM.

Pore Dimension Calculations. The program HOLE (42) was used to calculate the dimensions of the symmetrical primary and trimer pores. The dimensions of the asymmetric pore were calculated by visual inspection of the pore, and the distance between the closest atoms on either side of the pore was measured and the van der Waals radii (3.4 Å) of the atoms subtracted.

ACKNOWLEDGMENTS. Diffraction data were collected at the Northeastern Collaborative Access Team (NE-CAT) beam line 24-ID-C at the Advanced Photon Source. We thank Duilio Cascio, Mike Sawaya, and Jim Bowie for their valuable contributions. We thank the staff of the UCLA-DOE Protein Expression and X-Ray Crystallography Cores (supported by DOE Grant DE-FC02-02ER63421) and the UCLA Crystallization Core Facility. R.P.G. was funded by the DOE Biosciences Division (Grant DE-FG02-08ER64689).
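
Illustrative note on the asymmetric pore measurement. The manual procedure described under Pore Dimension Calculations (closest atoms on either side of the pore, with 3.4 Å subtracted) may be easier to follow as a short calculation. The Python sketch below is not the procedure used in this work, which relied on visual inspection. The function name `pore_opening`, the constant `VDW_CORRECTION`, and the coordinates are hypothetical, and the sketch assumes that 3.4 Å is the total subtracted from the atom-to-atom distance, which the text does not state explicitly.

```python
import math

# Assumption: 3.4 Å is the total amount subtracted from the atom-to-atom
# distance (for example, two radii of 1.7 Å); the text does not spell this out.
VDW_CORRECTION = 3.4  # Å


def pore_opening(side_a, side_b, vdw_correction=VDW_CORRECTION):
    """Return (opening, atom_a, atom_b) for the closest atom pair across a pore.

    side_a, side_b: sequences of (x, y, z) coordinates in Å, one per pore side.
    """
    # Exhaustive search over all cross-pore pairs; fine for a handful of atoms.
    distance, atom_a, atom_b = min(
        ((math.dist(a, b), a, b) for a in side_a for b in side_b),
        key=lambda item: item[0],
    )
    return distance - vdw_correction, atom_a, atom_b


# Hypothetical coordinates for illustration only (not taken from the structure).
side_a = [(0.0, 0.0, 0.0), (1.0, 5.0, 0.0)]
side_b = [(10.0, 0.0, 0.0), (12.0, 3.0, 4.0)]

opening, atom_a, atom_b = pore_opening(side_a, side_b)
print(f"closest atoms {atom_a} and {atom_b}: opening of {opening:.1f} Å")
```

For these made-up coordinates the closest pair is 10.0 Å apart, so the estimated opening is 6.6 Å. In practice the two atom sets would still have to be chosen by inspecting the pore, as described in the text.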

1. Koenig H (1988) Archaeobacterial cell envelopes. Can J Microbiol 34:395–406.
2. Sleytr UB, Messner P (1983) Crystalline surface layers on bacteria. Annu Rev Microbiol 37:311–339.
3. Sleytr UB, Egelseer EM, Ilk N, Pum D, Schuster B (2007) S-Layers as a basic building block in a molecular construction kit. FEBS J 274:323–334.
4. Engelhardt H (2007) Mechanism of osmoprotection by archaeal S-layers: A theoretical study. J Struct Biol 160:190–199.
5. Fagan RP, et al. (2009) Structural insights into the molecular organization of the S-layer from Clostridium difficile. Mol Microbiol 71:1308–1322.
6. Kern J, et al. (2011) Structure of surface layer homology (SLH) domains from Bacillus anthracis surface array protein. J Biol Chem 286:26042–26049.
7. Pavkov T, et al. (2008) The structure and binding behavior of the bacterial cell surface layer protein SbsC. Structure 16:1226–1237.
8. Jing H, et al. (2002) Archaeal surface layer proteins contain beta propeller, PKD, and beta helix domains and are related to metazoan cell surface proteins. Structure 10:1453–1464.
9. Francoleon DR, et al. (2009) S-layer, surface-accessible, and concanavalin A binding proteins of Methanosarcina acetivorans and Methanosarcina mazei. J Proteome Res 8:1972–1982.
10. Sowers KR, Boone JE, Gunsalus RP (1993) Disaggregation of Methanosarcina spp. and Growth as Single Cells at Elevated Osmolarity. Appl Environ Microbiol 59:3832–3839.
11. Finn RD, et al. (2010) The Pfam protein families database. Nucleic Acids Res 38(Database issue):D211–D222.
12. Rohlin L, et al. (2012) Identification of the major expressed S-layer and cell surface layer related proteins in the model methanogenic archaea, Methanosarcina barkeri Fusaro and Methanosarcina acetivorans C2A. Archaea 2012, 873589.
13. Houwink AL (1953) A macromolecular mono-layer in the cell wall of Spirillum spec. Biochim Biophys Acta 10:360–366.
14. Krissinel E, Henrick K (2007) Inference of macromolecular assemblies from crystalline state. J Mol Biol 372:774–797.
15. Lawrence MC, Colman PM (1993) Shape complementarity at protein/protein interfaces. J Mol Biol 234:946–950.
16. Ashkenazy H, Erez E, Martz E, Pupko T, Ben-Tal N (2010) ConSurf 2010: Calculating evolutionary conservation in sequence and structure of proteins and nucleic acids. Nucleic Acids Res 38(Web Server issue):W529–W533.
17. Kelley LA, Sternberg MJ (2009) Protein structure prediction on the Web: A case study using the Phyre server. Nat Protoc 4:363–371.
18. Cheong G-W, Guckenberger R, Fuchs K-H, Gross H, Baumeister W (1993) The structure of the surface layer of Methanoplanus limicola obtained by a combined electron microscopy and scanning tunneling microscopy approach. J Struct Biol 111:125–134.
19. Trachtenberg S, Pinnick B, Kessel M (2000) The cell surface glycoprotein layer of the extreme halophile Halobacterium salinarum and its relation to Haloferax volcanii: Cryo-electron tomography of freeze-substituted cells and projection studies of negatively stained envelopes. J Struct Biol 130:10–26.
20. Cheong G-W, Cejka Z, Peters J, Stetter KO, Baumeister W (1991) The surface protein layer of Methanoplanus limicola: Three-dimensional structure and chemical characterization. Syst Appl Microbiol 14:209–217.
21. Kessel M, Wildhaber I, Cohen S, Baumeister W (1988) Three-dimensional structure of the regular surface glycoprotein layer of Halobacterium volcanii from the Dead Sea. EMBO J 7:1549–1554.
22. Saunders NF, et al. (2006) Proteomic and computational analysis of secreted proteins with type I signal peptides from the Antarctic archaeon Methanococcoides burtonii. J Proteome Res 5:2457–2464.
23. Veith A, et al. (2009) , and surface layers: Structure, composition and gene expression. Mol Microbiol 73:58–72.
24. Lescar J, et al. (2001) The Fusion glycoprotein shell of Semliki Forest virus: An icosahedral assembly primed for fusogenic activation at endosomal pH. Cell 105:137–148.
25. Rey FA, Heinz FX, Mandl C, Kunz C, Harrison SC (1995) The envelope glycoprotein from tick-borne encephalitis virus at 2 Å resolution. Nature 375:291–298.
26. Kuhn RJ, et al. (2002) Structure of dengue virus: Implications for flavivirus organization, maturation, and fusion. Cell 108:717–725.
27. Mukhopadhyay S, Kim BS, Chipman PR, Rossmann MG, Kuhn RJ (2003) Structure of West Nile virus. Science 302:248.
28. Mukhopadhyay S, Kuhn RJ, Rossmann MG (2005) A structural perspective of the flavivirus life cycle. Nat Rev Microbiol 3:13–22.
29. Studier FW (2005) Protein production by auto-induction in high density shaking cultures. Protein Expr Purif 41:207–234.
30. Otwinowski Z, Minor W (1997) Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol 276:307–326.
31. Pape T, Schneider TR (2004) HKL2MAP: A graphical user interface for macromolecular phasing with SHELX programs. J Appl Crystallogr 37:843–844.
32. Sheldrick GM (2008) A short history of SHELX. Acta Crystallogr A 64:112–122.
33. Langer G, Cohen SX, Lamzin VS, Perrakis A (2008) Automated macromolecular model building for X-ray crystallography using ARP/wARP version 7. Nat Protoc 3:1171–1179.
34. Emsley P, Cowtan K (2004) Coot: Model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr 60:2126–2132.
35. Murshudov GN, Vagin AA, Dodson EJ (1997) Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr D Biol Crystallogr 53:240–255.
36. Painter J, Merritt EA (2006) Optimal description of a protein structure in terms of multiple groups undergoing TLS motion. Acta Crystallogr D Biol Crystallogr 62:439–450.
37. Collaborative Computational Project, Number 4 (1994) The CCP4 suite: Programs for protein crystallography. Acta Crystallogr D Biol Crystallogr 50:760–763.
38. McCoy AJ, et al. (2007) Phaser crystallographic software. J Appl Crystallogr 40:658–674.
39. Holm L, Kääriäinen S, Rosenström P, Schenkel A (2008) Searching protein structure databases with DaliLite v.3. Bioinformatics 24:2780–2781.
40. Baker NA, Sept D, Joseph S, Holst MJ, McCammon JA (2001) Electrostatics of nanosystems: Application to microtubules and the ribosome. Proc Natl Acad Sci USA 98:10037–10041.
41. Dolinsky TJ, Nielsen JE, McCammon JA, Baker NA (2004) PDB2PQR: An automated pipeline for the setup of Poisson-Boltzmann electrostatics calculations. Nucleic Acids Res 32(Web Server issue):W665–W667.
42. Smart OS, Neduvelil JG, Wang X, Wallace BA, Sansom MS (1996) HOLE: A program for the analysis of the pore dimensions of ion channel structural models. J Mol Graph 14:354–360, 376.

Arbing et al. PNAS | July 17, 2012 | vol. 109 | no. 29