Reversible assembly and amyloidogenesis of the

staphylococcal biofilm protein, Aap

A dissertation submitted to the

Graduate School

of the University of Cincinnati

In partial fulfillment of the

requirements for the degree of

Doctor of Philosophy

In the Department of Molecular Genetics,

Biochemistry and Microbiology

of the College of Medicine

2019

Alexander E. Yarawsky

B.S. Biological Sciences and Chemistry

Northern Kentucky University 2013

Committee Chair: Andrew B. Herr, Ph.D.

Abstract

The human skin commensal, Staphylococcus epidermidis, is the bacterium most commonly responsible for hospital-acquired infections. This microbe has a very strong capacity for forming bacterial communities known as biofilms. These communities are well-structured and often involve a slime-like matrix of extracellular polysaccharide which assists in bacterial accumulation. A well-known protein factor, the accumulation-associated protein (Aap), can also mediate intercellular adhesion, contributing to biofilm formation. Interestingly, Aap has been shown to be critical for infection in a rat catheter model, whereas the extracellular polysaccharide was irrelevant in infection.

Aap is a large, multi-functional, cell wall-anchored protein expressed by S. epidermidis. The N-terminus of the protein contains a region of short repeats called the A-repeats, followed by a globular lectin domain. The lectin domain can mediate attachment of bacteria to a surface. A proteolytic cleavage site downstream of the lectin domain leads to the release of the A-repeats and lectin domain, allowing Aap to function in accumulation. There are 5 to 17 B-repeats downstream of the cleavage site, which can assemble with Aap on adjacent cells in the presence of Zn2+. At the C-terminus of Aap lies a region of low complexity, which is rich in proline and glycine residues. After this region is a cell wall-anchoring motif that results in the covalent attachment of Aap to the bacterial cell wall.

Much of the progress made toward understanding the structure and biological function of the B-repeats has utilized a minimal construct containing one and a half B-repeats (Brpt1.5). Previous members of the Herr Lab have determined that Brpt1.5 can assemble into an anti-parallel dimer in the presence of Zn2+. Interestingly, while the Brpt1.5 dimer would dissociate in the presence of a Zn2+ chelator, mature biofilms were unaffected by addition of the chelator. This led members of the Herr Lab to express and characterize longer B-repeat constructs, which more closely resemble the number of B-repeats observed in Aap and show Zn2+-dependent assembly beyond dimer, eventually forming amyloid-like fibers. Amyloid fibers are often associated with toxicity; however, the fibers formed by Aap's B-repeats are utilized by the bacterium in a functionally beneficial way. In several other bacteria, functional amyloids have been shown to provide added strength to the biofilm structure. In the work presented here, we show that Aap forms fibers in S. epidermidis biofilms and that these fibers are responsible for the biofilm's resistance toward Zn2+ chelator. We also characterize a recently discovered protein, small basic protein (Sbp), which is able to reduce the amount of Zn2+ required for B-repeat aggregation. We propose that Sbp is a nucleating or accessory protein for Aap amyloidogenesis.

Finally, a secondary interest of this dissertation work is to characterize the structure of the proline/glycine-rich stalk-like region of Aap and other cell wall-anchored proteins. Interestingly, several of these regions have a very high propensity to remain extended in solution, primarily due to the high polyproline type II helix propensity.

Overall, this dissertation work has led to an increased understanding of the mechanism of Aap-dependent accumulation in biofilm formation.


Acknowledgements

I would like to thank my advisor, Dr. Andrew Herr, for always being enthusiastic about my data and inspiring me to continue down the path of biophysics. I thank my fellow lab members for constant discussions about various aspects of my research, as well as for entertaining outings. I also owe a debt to the previous lab members who set the stage for my projects, especially Deb Conrady and Stefanie Johns. I would also like to thank my undergraduate advisor at Northern Kentucky University, Dr. Heather Bullen, for showing me the excitement of research and being an inspirational mentor.

I am grateful for my many trips to the Gibbs Conference on Biological Thermodynamics, which always inspire me to maintain a high degree of rigor in my work, and which have led me to read many papers from the influential biophysicists who are or once were associated with this meeting.

I would like to thank Beckman Coulter for awarding me a full travel grant to attend the 23rd International Analytical Ultracentrifugation Workshop and Symposium (AUC 2017), where I learned a great deal from many experts in the field, particularly with regard to Walter Stafford's SEDANAL software.

I must acknowledge my family and friends as well, especially my wife Danille, who have supported my journey and who have encouraged me to pursue the highest of goals.


Table of Contents

Title Page ...... i

Abstract ...... ii

Acknowledgements ...... v

Table of Contents ...... vi

List of Figures and Tables ...... x

Chapter I. Literature Review ...... 1
A. The importance of staphylococcal biofilms in the healthcare industry ...... 1
B. Molecular pathways of staphylococcal biofilm formation ...... 2
i. Steps of biofilm formation ...... 3
ii. PNAG-dependent accumulation ...... 8
iii. Protein-dependent accumulation ...... 10
C. The accumulation-associated protein ...... 13
i. Aap domain architecture ...... 14
ii. Role of Aap in attachment ...... 14
iii. Zn2+-mediated self-assembly of Aap (role in accumulation) ...... 17
D. Considerations in Protein Folding and Stability ...... 21
i. The hydrophobic effect ...... 22
ii. Solvent-solute interactions ...... 23
iii. Solute-solute interactions ...... 27
E. Protein misfolding and natively unfolded proteins ...... 31
i. Protein aggregation and amyloid formation ...... 31
a. Amyloid stability ...... 32
b. Pathogenic vs. functional amyloid ...... 33
ii. Intrinsically disordered proteins ...... 34
a. Factors determining conformational propensities ...... 34
b. Functional roles of IDPs ...... 36
F. Goals of dissertation work ...... 39
References ...... 42

Chapter II. The biofilm adhesion protein Aap from Staphylococcus epidermidis forms zinc-dependent amyloid fibers ...... 54
Abstract ...... 55
Author Summary ...... 57
Introduction ...... 58
Solution characterization of tandem B-repeats from Aap ...... 62
Tandem B-repeats assemble into multiple higher-order species in the presence of Zn2+ ...... 64
2D size-and-shape sedimentation analysis indicates formation of fiber-like species ...... 67
Tandem B-repeats form amyloid fibers in the presence of Zn2+ ...... 71
B-repeat fiber assembly is time- and temperature-dependent ...... 76
B-repeat fibers are resistant to acid and chelator treatment ...... 79
Amyloid fibers are structural components in S. epidermidis biofilms ...... 80
S. epidermidis amyloid fibers are composed of processed Aap ...... 84
Amyloid fibers form early in biofilm formation and correlate with DTPA resistance of biofilms ...... 86
Discussion ...... 89
Methods ...... 94
Supporting Information ...... 103
References ...... 110

Chapter III. Tandem B-repeats from Aap show reversible zinc-dependent assembly beyond dimer ...... 120
Abstract ...... 121
Introduction ...... 123
Brpt5.5 exhibits monomer-dimer-tetramer equilibrium ...... 127
Analysis of linked equilibria indicates a similar mechanism of Brpt5.5 dimerization to shorter constructs ...... 130
Formation of the tetramer requires additional Zn2+ ions ...... 130
Chemical modification and sequence mutation to define tetramer assembly ...... 133
Tetramer assembly is required for Zn2+-dependent amyloidogenesis ...... 137
Discussion ...... 141
Materials and methods ...... 145
Supplementary Figures ...... 150
References ...... 151

Chapter IV. Defining the basis of the interaction between Sbp and the B-repeats of Aap in Staphylococcus epidermidis biofilms ...... 153
Abstract ...... 154
Introduction ...... 155
The secondary structure of Sbp is strongly dependent on electrostatic interactions ...... 158
Sbp undergoes compaction upon electrostatic screening ...... 160
Probing for Sbp:B-repeat interactions at low and high NaCl concentrations ...... 160
Sbp enhances Zn2+-dependent Brpt5.5 assembly ...... 165
Sbp cannot interact with the Brpt5.5 monomer or dimer ...... 168
The ability for Sbp to reduce the Zn2+ requirement for B-repeat assembly is biologically relevant ...... 170
Discussion ...... 173
Materials and methods ...... 176
Supplementary Figures ...... 180
References ...... 182

Chapter V. The proline/glycine-rich region of the biofilm adhesion protein Aap forms an extended stalk that resists compaction ...... 184
Abstract ...... 185
Introduction ...... 186
The proline/glycine-rich region shows aberrant mobility ...... 190
PGR sediments as an elongated monomer ...... 193
Determination of the hydrodynamic radius of PGR in solution ...... 194
Predicted disorder based on PGR primary sequence ...... 197
PGR contains polyproline type II helix content ...... 198
Hydrodynamic behavior as a function of temperature ...... 200
Effects of cosolvents on PGR conformation ...... 203
Electrostatic interactions do not affect local or global PGR conformations ...... 207
Predicting PPII propensity and Rh from PGR primary sequence ...... 210
Discussion ...... 212
Materials and methods ...... 218
Supplementary Figures ...... 228
Supplementary Tables ...... 231
References ...... 235

Chapter VI. Comparing intrinsically disordered regions of Staphylococcus surface proteins ...... 248
Abstract ...... 249
Introduction ...... 250
All constructs are predicted to be disordered ...... 255
AUC indicates highly elongated monomers ...... 262
CD confirms random coil/PPII secondary structure content ...... 264
Cosolvents perturb secondary structure to varied degrees ...... 267
Discussion ...... 272
Materials and methods ...... 276
Supplementary Tables ...... 279
References ...... 283

Chapter VII. Future Directions ...... 289
A. The importance of Aap higher-order assembly and amyloidogenesis in Staphylococcus epidermidis biofilm formation and virulence ...... 289
B. Biological significance of the role of Sbp in Aap amyloidogenesis in Staphylococcus epidermidis biofilms ...... 294
C. Investigating the mechanism of B-repeat amyloidogenesis ...... 300
D. Biophysical and structural insights from endogenously expressed full-length Aap ...... 305
E. Defining spatial and temporal parameters of Staphylococcus epidermidis biofilm formation ...... 308
F. Differentiating infectious and commensal S. epidermidis colonization ...... 310
References ...... 313

Appendix I. Biophysical insights into the mechanism of Bap-dependent biofilm formation in Acinetobacter baumannii ...... 317
Abstract ...... 318
Introduction ...... 319
Bap is rich in beta-sheet secondary structure content and highly elongated ...... 323
Bap dimerization occurs in the presence of Zn2+ ...... 323
Zn2+-induced dimerization of Bapice2 results in no significant secondary structural changes ...... 328
Testing for heteroassociation between the Bap ice-repeat regions ...... 328
Discussion ...... 331
Materials and methods ...... 333
References ...... 336


List of Figures and Tables

Chapter I
Figure I-1: Biofilm formation ...... 2
Table I-1: Summary of components involved in attachment ...... 6
Table I-2: Summary of components involved in accumulation ...... 7
Figure I-2: Domain arrangement of Aap ...... 14
Figure I-3: B-repeat Zn2+-coordination ...... 19
Table I-3: Gibbs free energy of transfers from water to other solvents ...... 25
Table I-4: Component contributions to Gibbs free energy of transfers ...... 26

Chapter II
Figure 1: Characterization of tandem B-repeats from S. epidermidis Aap ...... 63
Figure 2: Sedimentation behavior of tandem B-repeats in the presence of Zn2+ ...... 66
Figure 3: AUC c(s,ff0) analysis of early-stage HMBP-Brpt3.5 amyloidogenic intermediates ...... 68
Figure 4: Amyloid properties of tandem B-repeat constructs in the presence of Zn2+ ...... 72
Figure 5: HPLC and turbidity assays to monitor time and temperature dependence of amyloidogenesis ...... 77
Figure 6: Amyloid fibers composed of Aap are important structural components in S. epidermidis biofilms ...... 82
Figure 7: The formation of amyloid fibers is well correlated to DTPA resistance in S. epidermidis biofilms ...... 88
Figure S1: Sequence identity comparison of the tandem Brpt domains of Aap ...... 103
Figure S2: Secondary structure analysis of Brpt3.5 ...... 104
Figure S3: B-repeat fibers are resistant to acid and metal chelator treatment ...... 105
Figure S4: Stability of HMBP-Brpt3.5/Zn2+ fibers after incubation with HCl or the metal chelator DTPA ...... 106
Figure S5: Initial assembly of Brpt5.5 is sensitive to acidification and the metal chelator DTPA ...... 107
Figure S6: Mass spectrometry results of SDS-resistant aggregate present in S. epidermidis RP62A biofilms ...... 108
Table S1: Mass spectrometry results of SDS-resistant aggregate present in S. epidermidis RP62A biofilms ...... 109

Chapter III
Figure 1: Brpt5.5 exhibits a monomer-dimer-tetramer equilibrium in the presence of Zn2+ ...... 129
Figure 2: Analysis of linked equilibria reveals the number of Zn2+ ions bound during each assembly event ...... 132
Figure 3: Chemical modification targets and potential Zn2+-binding residues are highlighted on a structure of Brpt1.5 (PDB: 4FUN) ...... 134
Figure 4: Chemical modifications and H85A mutations inhibit tetramer formation ...... 136
Table 1: Measured equilibrium constants from sedimentation equilibrium experiments shown in Figure 4C and 4D ...... 136
Figure 5: Inhibiting tetramer formation results in weaker aggregation propensity ...... 138
Figure 6: Models of tandem B-repeat reversible assembly ...... 143
Figure S1: Circular dichroism of Brpt5.5 assembly states ...... 150
Figure S2: Secondary structure and thermal denaturation of Brpt5.5 H85A ...... 150

Chapter IV
Figure 1: Sbp is partially folded under standard conditions ...... 159
Table 1: Thermodynamic parameters calculated from thermal denaturation experiments shown in Figure 1C ...... 159
Figure 2: Sbp requires high NaCl concentrations to become fully compacted ...... 163
Table 2: Hydrodynamic parameters from AUC experiments in Figure 2A ...... 163
Figure 3: Interactions between Sbp and Brpt5.5 require the presence of Zn2+ ...... 164
Figure 4: Sbp induces Brpt5.5 assembly and aggregation at lower Zn2+ concentrations ...... 166
Figure 5: Sbp shows a weaker effect toward Brpt5.5 H85A aggregation and does not affect Brpt5.5 H85A assembly ...... 169
Figure 6: Sbp lowers the Zn2+ required for biofilm formation ...... 172
Figure S1: AUC indicates no Sbp:Brpt1.5 assembly ...... 180
Figure S2: Investigating the turbidity behavior of Brpt5.5 ...... 181
Figure S3: Effect of retroSbp on Brpt5.5 turbidity ...... 181

Chapter V
Figure 1: PGR shows aberrant mobility and exists as an elongated monomer in solution ...... 191
Figure 2: Size exclusion chromatography (SEC) confirms an extended conformation in solution ...... 195
Figure 3: PGR contains polyproline type II helix ...... 199
Figure 4: PGR shows weak temperature dependence of Rh ...... 202
Figure 5: The local conformation of PGR shows resilience against chemical perturbants ...... 204
Figure 6: Coulombic effects do not play a role in local or global conformations ...... 209
Figure 7: Model of Aap on the surface of S. epidermidis ...... 217
Table 1: Summary of hydrodynamic parameters determined in this study ...... 217
Figure S1: PGR is predicted to be intrinsically disordered ...... 228
Figure S2: The sequence of PGR is highly conserved among S. epidermidis strains ...... 230
Table S1: Concentration dependence of sedimentation velocity AUC data ...... 231
Table S2: Temperature dependence of sedimentation velocity AUC data ...... 231
Table S3: Salt dependence of sedimentation velocity AUC data ...... 232
Table S4: Comparison of hydrodynamic properties for PGR to a dataset of studied IDPs ...... 233
Table S5: Folded proteins and hydrodynamic measurements from literature ...... 234


Chapter VI
Figure 1: Sequences of constructs ...... 251
Figure 2: Predictions of disorder and classification of constructs ...... 256
Table 1: Parameters calculated by CIDER ...... 257
Table 2: Calculated and predicted parameters of IDP constructs ...... 260
Figure 3: AUC indicates each construct is highly elongated and monomeric ...... 263
Table 3: Sedimentation velocity AUC parameters ...... 263
Figure 4: Circular dichroism wavelength scans show constructs have primarily random coil and PPII helix content ...... 265
Figure 5: Constructs respond to denaturants to different extents ...... 268
Figure 6: Comparing the response to TMAO and TFE ...... 270
Table S1: Sequence-based parameters of IDP dataset ...... 279
Table S2: The sequence of IDPs used in PPII and Rh predictions ...... 280

Chapter VII
Table VII-1: S. epidermidis Aap B-repeat mutations and the predicted effects on biofilm formation ...... 293
Figure VII-1: Dot blot assay performed on Brpt5.5 samples using the amyloid-detecting antibody OC ...... 301
Figure VII-2: Predictions of amyloidogenic regions ...... 303
Figure VII-3: Analysis of Aap by AUC ...... 307

Appendix I
Figure 1: Bap domain arrangement ...... 320
Figure 2: Secondary structure analysis reveals beta-sheet and random coil content ...... 322
Table 1: Secondary structure analysis by DichroWeb ...... 322
Figure 3: Bapice2 is highly extended in solution ...... 324
Table 2: Hydrodynamic parameters determined by AUC ...... 324
Figure 4: Zn2+ induces dimerization of Bapice2 ...... 326
Table 3: Hydrodynamic parameters determined by AUC in the presence of Zn2+ ...... 324
Figure 5: No evidence for heteroassociation of Bap ice-repeat regions is observed ...... 330
Table 4: Sw analysis of AUC data from Figure 5 ...... 324


Chapter I. Literature Review

A. The importance of staphylococcal biofilms in the healthcare industry

The gram-positive bacterium, Staphylococcus epidermidis, is typically beneficial to the human host. S. epidermidis colonizes the skin and mucous membranes, preventing colonization by pathogenic bacteria. Unfortunately, S. epidermidis' ubiquity on the surface of the skin makes it a likely source of contamination during insertion of indwelling medical devices [1, 2]. In fact, S. epidermidis is the most common microbe responsible for hospital-acquired infections [1, 3]. There is a significant cost to the US healthcare system due to intravascular catheter infections caused by S. epidermidis and other coagulase-negative strains [4]. Such infections are very problematic, due primarily to the ability of the bacteria to form biofilms, which are communities of bacterial cells that form a three-dimensional structure typically encased in an extracellular polysaccharide-rich matrix. These biofilms confer a degree of chemical and physical resistance to the bacteria, often resulting in the need for prolonged antibiotic treatment and removal of the contaminated device [5].


B. Molecular pathways of staphylococcal biofilm formation

The ability to form biofilms is considered S. epidermidis' most important virulence factor, as opposed to the toxins and more traditional virulence factors Staphylococcus aureus possesses [6]. Biofilm formation by staphylococci proceeds through several distinct phases, depicted in Figure I-1. First, bacteria must attach to a surface, whether it be biotic (e.g. human cells, extracellular matrix) or abiotic (e.g. medical devices). Once a cluster of bacteria attaches to a surface, maturation occurs, where intercellular attachment holds bacteria together and a build-up of the trademark 3-dimensional structure occurs. Following maturation, bacteria periodically undergo cycles of shedding or detaching from the community, which can lead to the distribution of bacteria to other sites, where new biofilms may form [6].

Figure I-1. Biofilm formation is initiated by the attachment of planktonic bacteria to a surface. Accumulation and maturation of the biofilm can involve various factors, such as polysaccharide intercellular adhesin (PNAG - poly-N-acetylglucosamine), proteins, extracellular DNA and teichoic acids. Bacteria may then be shed during dispersal of the biofilm.


i. Steps of biofilm formation

Attachment: The first step of biofilm formation may be mediated by a variety of proteins which are capable of recognizing and binding to human matrix proteins, carbohydrates, or abiotic surfaces. These proteins, responsible for attachment, are classified as MSCRAMMs (microbial surface components recognizing adhesive matrix molecules), which typically contain the relevant "surface-binding" domain, a (usually) repetitive region leading back to the cell wall, and a motif which mediates attachment to the peptidoglycan layer of the bacterial cell wall [6]. Specific attachment of bacteria to a biotic surface or human matrix proteins is extremely relevant to infection. Indwelling medical devices become coated in proteins such as fibronectin, collagen, fibrinogen, and vitronectin [7, 8]. A number of MSCRAMMs have been demonstrated to bind one or more of these human proteins, initiating biofilm formation [7-10].

The serine-aspartate repeat (Sdr) family of proteins is one class of MSCRAMMs which have been well-studied. These proteins are composed of a ligand-binding region, repeat domains, serine-aspartate repeats, a cell wall-spanning region, and an LPXTG anchoring motif [11]. The LPXTG anchoring motif is a signal sequence that results in the covalent linkage of the protein to the peptidoglycan layer of the cell wall via a Sortase A family member [12]. The S. epidermidis Sdr family includes SdrG, SdrF, and SdrH.
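As a rough illustration of what the LPXTG anchoring motif looks like at the sequence level, the sketch below scans a protein sequence (one-letter amino acid codes) for the pattern L-P-X-T-G, where X is any residue. The function name and the toy sequence are hypothetical and are not taken from any protein discussed here; identifying real sortase substrates involves more than a pattern match, so this is only a minimal sketch.

```python
import re

# LPXTG motif: Leu-Pro-(any residue)-Thr-Gly, written in one-letter codes.
LPXTG_PATTERN = re.compile(r"LP[A-Z]TG")


def find_lpxtg_motifs(sequence):
    """Return (position, motif) pairs for each LPXTG match; positions are 1-based."""
    sequence = sequence.upper()
    return [(m.start() + 1, m.group()) for m in LPXTG_PATTERN.finditer(sequence)]


# Hypothetical toy sequence for illustration only (not a real protein).
toy_sequence = "MKKAAEDSLPETGSNNAALPKTA"
print(find_lpxtg_motifs(toy_sequence))
```

In the toy sequence, the segment LPETG matches the motif, whereas LPKTA does not because the final glycine is missing.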

SdrG, previously Fbe, binds strongly to fibrinogen, with a slow dissociation rate [13]. Furthermore, the abundance of SdrG on the surface of S. epidermidis correlates with the affinity for surfaces coated with the human matrix protein. Together, these characteristics mean SdrG is well-suited to plant S. epidermidis on the surface of indwelling medical devices and resist the shear forces experienced on many such surfaces [14].

SdrF has been well-characterized in its binding to type I collagen [7, 15]. Interestingly, the mechanism of binding differs from that of SdrG and fibrinogen, which uses a 'dock, lock, and latch' mechanism [16]. SdrF instead utilizes strong and weak forces across both the A and B regions of the protein in order to bind collagen [15]. In addition to the protein's role in infection, SdrF has also recently been shown to be important in S. epidermidis' ability to colonize skin, through binding keratin and therefore mediating attachment to human epithelial cells [17].

SdrH differs in its domain organization from SdrG and SdrF. It has a very short putative ligand-binding region, an extra region (termed a C region), and it lacks the LPXTG sortase-dependent cell wall-anchoring motif [18]. Nonetheless, SdrH and SdrG were present in 16/16 strains from patients experiencing S. epidermidis infections, while SdrF was absent from 4 of these strains. Importantly, all patients had developed antibodies against the ligand-binding A regions of SdrH and SdrG, implicating these in S. epidermidis infection [18].

Outside of the Sdr family of surface proteins, autolysin E (AtlE) and autolysin/adhesin (Aae) can also bind matrix proteins. AtlE binds vitronectin, while Aae shows binding to vitronectin, fibronectin, and fibrinogen [9, 10]. Both proteins retain their namesake bacteriolytic activity, but also share adhesive properties towards host matrix proteins. This is a well-defined example of a surface protein exhibiting multiple functions.


While specific attachment is mediated by cell wall-associated proteins, nonspecific attachment of bacteria to abiotic surfaces depends on the cell surface hydrophobicity and the hydrophobicity of the abiotic surface itself [19]. Surface-associated proteins, such as autolysin E (AtlE) in S. epidermidis or Atl from S. aureus, increase the hydrophobic nature of the cell (simply by their presence on the cell surface), and therefore affect the ability of strains to form biofilms on abiotic surfaces [9]. The S. aureus biofilm-associated protein (Bap) and S. epidermidis Bap homologue protein (Bhp) have also been shown to afford the ability to form biofilms on polystyrene, although the mechanism is unclear [20]. Other key components in nonspecific attachment are teichoic acids. Teichoic acids are negatively charged or zwitterionic, linear polymers present at the cell wall of gram-positive bacteria and serve to bind cations to be used by the cell [21, 22]. The charged nature of these compounds has been shown to be a strong determinant of whether or not a strain of S. aureus can colonize polystyrene or glass. A mutant synthesizing teichoic acids with a stronger negative charge was unable to colonize the abiotic surfaces, suggesting electrostatic repulsion could be a valuable means to combat infection of indwelling medical devices [23]. It should also be noted that molecules involved in the next phase of biofilm formation, accumulation, can also play a role in attachment.

Examples of proteins which function in both attachment and accumulation are the accumulation-associated protein (Aap) and small basic protein (Sbp). While the mechanism of Aap-mediated surface attachment is increasingly well understood (discussed in Section C-ii), there are very limited data on how Sbp mediates attachment. Evidence suggests that Sbp is a necessary co-factor for establishing sustained attachment. Given the positively-charged nature of Sbp, this protein could be enhancing attachment to slightly negative glass surfaces via electrostatic interactions. This could explain the abundance of Sbp observed at the base of the biofilm [24].

Table I-1. Summary of components involved in attachment
Component | Target | Mechanism | Ref(s)
SdrG (previously Fbe) | Fibrinogen | Direct binding | [14]
SdrF | Type I collagen, keratin | Direct binding | [7, 15-17]
SdrH | Unknown | Unknown | [18]
AtlE | Vitronectin | Direct binding | [9]
AtlE | Polystyrene | Hydrophobicity | [9]
Aae | Vitronectin, fibronectin, fibrinogen | Direct binding | [10]
Bhp (Bap) | Polystyrene | Unknown | [20]
Aap (N-terminus) | Polystyrene | Protease-induced | [25]
Aap (N-terminus) | Corneocytes | Lectin-binding | [26]
Aap (N-terminus) | Polystyrene | Unknown | [27]
Aap (N-terminus) | Polystyrene | N-terminus-based | [28]
Aap (N-terminus) | Tissue culture-treated polystyrene plates, glass or polycarbonate | SepA-protease activity | [29]
Sbp | Polystyrene | Unknown | [24]
Sbp | Keratinocytes | Unknown | [24]
Teichoic acid | Polystyrene, glass | Electrostatic | [23]

Accumulation: Once attachment has occurred and a monolayer of bacterial cells has been formed on a surface, the next critical step is intercellular aggregation of bacteria. The canonical pathway of accumulation in S. epidermidis utilizes a polysaccharide intercellular adhesin (PIA), specifically poly-N-acetylglucosamine (PNAG) [30, 31]. However, protein components may also be used in accumulation, namely the accumulation-associated protein (Aap) [32]. The redundancy for accumulation in biofilm formation is not totally understood. It has been reported that biofilms formed by both protein and PNAG are more robust than protein-dependent biofilms [33]. This observation suggests both components play an important role in building a strong, effective biofilm. Furthermore, switching between PNAG- and protein-dependent biofilm formation has been observed, suggesting the environment is an important factor, but more importantly, redundancy suggests biofilm formation is a critical ability for this bacterium [34]. Additionally, PNAG- and protein-dependent biofilms differ in their ability to resist shear stress and in their virulence [28]. The accumulation phase will be expanded upon in Section ii and Section iii.

Table I-2. Summary of components involved in accumulation
Component | Target | Mechanism | Reference(s)
PNAG | Bacterial surface | Electrostatic | [35]
Aap | Aap B-repeats | Zn2+-mediated assembly | [36, 37]
Bhp (Bap) | Unknown | PNAG-independent, Aap-independent | [20]
Sbp | Aap B-repeats, PNAG | Direct binding | [24]

Dispersal: Biofilm detachment, or dispersal, is an important factor in the spread of bacteria to other sites, whether for colonization or for infection. In this phase of biofilm formation, single cells or clusters of cells can be released from the mature biofilm structure via mechanical forces, decreased production of accumulation-related factors, and production of matrix-degrading factors [6]. Well-controlled regulation of factors involved in biofilm detachment is crucial for maintaining a well-structured biofilm. The best-understood mechanism of dispersal regulation is the quorum-sensing system, agr [38, 39]. Mutating the agr system results in a thicker biofilm, suggestive of decreased detachment. Expression of agr is focused at the biofilm surface (as opposed to the biofilm interior), where it could be better utilized in detachment regulation [39]. The agr system in S. aureus seems to function similarly [40].

Several proteases are produced by S. epidermidis which could contribute to detachment by degrading surface proteins involved in intercellular attachment, although more work needs to be completed in this area [41-43]. In addition to the degradation of protein factors involved in intercellular aggregation, degradation of PNAG could also be a useful mechanism of dispersal. While no S. epidermidis or S. aureus enzymes responsible for PNAG hydrolysis have been found, the hydrolase, dispersin B, from Actinobacillus actinomycetemcomitans, can indeed disperse S. epidermidis biofilms via this mechanism [33, 44]. PNAG is slightly positively charged, while the bacterial surface is highly negatively charged. Therefore, in a PNAG-dependent biofilm, disruption of the electrostatic attraction between the cells and PNAG could lead to the release of cells. Likely candidates for this role are the phenol-soluble modulins (PSMs). PSMs are under the control of the agr quorum-sensing system, and they are peptides ranging from 20 residues (α- and δ-types) to ~45 residues (β-type) and contain an amphipathic α-helix which could act as a detergent-like molecule to disrupt hydrophobic and electrostatic interactions [1].

ii. PNAG-dependent accumulation

The synthesis of PNAG is under the control of the intercellular adhesion operon, icaADBC [30, 45]. The specific role of each component in this system was determined using a xylose-inducible expression vector transformed into Staphylococcus carnosus. Inactivation of any single component led to loss of cellular aggregation, PNAG production, and biofilm formation on various surfaces [30, 45]. The catalytic enzyme, IcaA, showed a low level of N-acetylglucosaminyltransferase activity when presented with the substrate, UDP-N-acetylglucosamine. However, when co-expressed with IcaD, IcaA activity was significantly increased and synthesis of a 20-residue oligomer occurred. With the co-expression of IcaC, a putative integral membrane protein, synthesis of full-length PNAG (≥130 residues) could occur [30]. IcaB does not appear to be directly involved in PNAG biosynthesis, yet it is an essential part of the icaADBC operon for biofilm formation [30]. IcaB is anchored to the surface of S. epidermidis, where it partially deacetylates PNAG. The bacterial surface itself is negatively charged, likely due in part to teichoic acid, so the positive charge on PNAG allows for intercellular attachment based on electrostatic interactions, whereas neutral, non-deacetylated PNAG (synthesized by a mutant strain lacking icaB) could not support intercellular attachment. The ability of the icaB mutant strain to persist in a murine infection model was diminished compared to the icaB-expressing strain [35]. As PNAG synthesis is closely tied to biofilm formation, the regulation of the icaADBC operon is crucial. In addition to regulation by substrate availability [6], environmental factors can also regulate the system. For instance, a decrease in oxygen concentration increases icaADBC mRNA expression and PNAG synthesis. It is well understood that oxygen and other resources are increasingly depleted in the inner depths of a biofilm; therefore, PNAG synthesis would be increased in cells experiencing lower oxygen concentrations within the biofilm, potentially creating a positive feedback loop [46]. Interestingly, some antibiotics have also been shown to increase transcription of the ica operon [47]. There are also several internal regulators, in addition to these outside influences. Sigma factor SigB upregulates PIA production when induced through the RsbU regulator via NaCl or ethanol [48]. The sarA gene in S. aureus is known to control the production of many virulence factors, and the sarA homolog in S. epidermidis was recently shown to regulate PNAG production leading to biofilm formation [49]. The luxS quorum-sensing system represses PNAG production via a pathway involving autoinducer 2 secretion [50].

iii. Protein-dependent accumulation

Despite the classical connection between biofilm formation and secretion of intercellular adhesins such as PNAG, there is ample evidence for PNAG-independent biofilm formation in S. epidermidis. Early evidence came from an analysis of strains isolated from prosthetic infections [33]. Of the isolates, 89% were aap-positive, while 62% were icaA-positive. In the cases where biofilm formation could occur, 27% lacked the ica operon, and the ability to form biofilms was attributed to Aap [33]. More recently, further evidence has pointed toward the significance of PNAG-independent (and more specifically, Aap-dependent) biofilm formation. Using a flow cell which exposes a glass surface to fluid shear, Schaeffer et al. found that strains lacking aap produced only limited amounts of biofilm mass on the glass surface compared to aap-positive strains. In a rat catheter infection model, strains lacking Aap had a significant decrease in bacterial recovery from the catheter and blood, compared to Aap-expressing strains. Importantly, the presence of the ica operon made no difference to bacterial recovery [28]. These results suggest that in the context of infection inside a body, where fluid shear is present, Aap is critical in maintaining the intercellular aggregation necessary for mature biofilm formation.

In addition to Aap, the Bap homologue protein (Bhp) can also be responsible for biofilm formation. In the ica-negative, aap-negative, biofilm-forming strain C533, when bhp is deleted, the ability to form biofilms on a polystyrene surface is lost, but can be regained via complementation [20]. Bap (biofilm-associated protein) has been well-studied in S. aureus and other staphylococcal strains; however, the S. epidermidis homologue, Bhp, is less well-understood. According to the previously mentioned study of S. epidermidis prosthetic joint infections, bhp was present in 46% of all isolates, compared to aap found in 89% of all isolates, regardless of biofilm formation capability [33]. A more recent study by Giormezis, et al. examined S. epidermidis and S. haemolyticus isolated from patients suffering from bloodstream infections or prosthetic-device-associated infections [51]. S. epidermidis comprised nearly 75% of the strains isolated and tested. Of the total strains tested, 40% contained aap, compared to 20% containing bhp. Considering biofilm-producing isolates only, aap-containing isolates are enriched to 43%, whereas bhp-containing isolates drop to just 16% [51]. Another report investigated an S. epidermidis isolate which switched from PNAG-dependent biofilm formation to protein-dependent biofilm formation. Interestingly, aap transcription was up-regulated after the switch to protein-dependent biofilm formation, while bhp was down-regulated [34]. These results further emphasize the role of Aap in biofilm formation and clinical infections.

As was previously alluded to, Sbp (small basic protein) has also been implicated in the accumulation phase of biofilm formation. Sbp was initially discovered and isolated from crude S. epidermidis biofilm mixture via its affinity toward sepharose beads tagged with the B-repeat superdomain of Aap. Although Aap-dependent biofilms were still formed in a 1457Δsbp strain, the addition of recombinant Sbp recovered biofilm thickness and volume to levels similar to the parent 1457 strain. Expression of Sbp in the biofilm-negative S. carnosus TM300 strain did not result in biofilm formation, indicating Sbp is not sufficient for biofilm formation, but may still be an important co-factor. By confocal microscopy, co-localization of Sbp and B-repeats was observed, while there was also co-localization of Sbp and PNAG. Expression of PNAG correlated with sbp expression levels [24]. In all, Sbp seems to play a role in multiple aspects of biofilm formation, and further studies will be needed to address its mechanisms of action.


C. The accumulation-associated protein

The accumulation-associated protein, Aap, was first observed by Timmerman, et al. in 1991 [52]. Studies leading up to this work had shown that proteases could reduce biofilm formation [53, 54]. This pointed toward protein factors being responsible for biofilm formation. Timmerman, et al. utilized a monoclonal antibody (mAb) raised against surface proteins cleaved from the surface of a strong biofilm-producing strain of S. epidermidis. They observed a 220 kDa protein recognized by the mAb. When the mAb was used to pretreat a polystyrene surface, biofilm formation was greatly inhibited. Further, they used the mAb for immunogold labeling and electron microscopy to show this protein was forming fibril-like appendages protruding outward from the cell surface [52]. The next major characterization of Aap came several years later, when Schumacher-Perdreau, et al. isolated a mitomycin mutant of S. epidermidis RP62A, termed M7. This mutant was unable to form a biofilm, whereas the RP62A parent strain is a strong biofilm producer. Further comparison of these two strains revealed only a loss of 115 kDa and 18 kDa extracellular proteins. Other characteristics remained unchanged, such as growth rate, initial adherence to the surface, antimicrobial susceptibility, and others [55]. Using the same M7 mutant, Hussain, et al. specifically attributed the accumulation phase of biofilm formation to Aap. The extracellular proteins were collected, and the 140 kDa protein (an unprocessed form of the 115 kDa protein above) had antiserum raised against it. The resulting antiserum was able to inhibit biofilm accumulation in RP62A and clinical strains [32], demonstrating an important role for Aap in biofilm formation, as well as clinical infections.


i. Aap domain architecture

Aap is a multi-domain, cell wall-anchored protein. The N-terminus of the protein is composed of 11 short, 16-residue A-repeats and a putative lectin domain. Following the lectin domain is a proteolytic cleavage site and 5-17 B-repeats (strain-dependent), containing 128 residues each. Immediately following the B-repeats is a 135-residue region composed mostly of AEPGKP repeats, referred to as the Proline/Glycine-rich region. This region leads into the LPXTG motif, which becomes covalently attached to Lipid II of the peptidoglycan layer of the cell wall.

Figure I-2. Domain arrangement of Aap from S. epidermidis RP62A. Scissors represent the primary SepA cleavage site at Leu 602 and a secondary SepA cleavage site at Leu 336. The anchor marks the LPXTG Sortase A cell wall-anchoring motif, which covalently links Aap to the peptidoglycan layer of the bacterial cell wall (green half-oval).

ii. Role of Aap in attachment

As was previously mentioned, Aap has a role in the attachment phase of biofilm formation, as well as in accumulation. Given the multi-domain architecture of the protein, it is not surprising that the protein supports different functions. The N-terminus has been attributed to attachment to surfaces, both biotic and abiotic, and may be an important regulator of biofilm formation.


Rohde, et al. [25] investigated the presence of a 140 kDa isoform of Aap (220 kDa) in the clinical strain 5179. Expression of the 140 kDa isoform in the aap-negative strain 1585 led to biofilm formation, while expression of the full-length form of Aap did not result in biofilm formation in strain 1585. This result indicates that Aap can only induce biofilm formation after the N-terminus is proteolytically removed. Experiments with α2-macroglobulin, a protease inhibitor, indicated that staphylococcal proteases were able to induce biofilm formation through proteolytic cleavage of Aap. Host-derived proteases, including trypsin, elastase and cathepsin G, also could induce biofilm formation. These results suggest that the host immune response could induce Aap-dependent biofilm formation, which may well have evolved as a method of immune evasion [25].

S. epidermidis attachment to human corneocytes was shown to be mediated by the N-terminus of Aap. Adhesion to corneocytes could be inhibited using recombinant AapArpt+Lectin, but not by AapArpt, indicating the ability of the lectin region to block adhesion when exposed to cells before addition of bacteria [26]. Additional evidence for the attachment role of Aap came from S. epidermidis strain CSF41498 and its mutant lacking sortase A - the enzyme responsible for Aap's attachment to the cell wall of S. epidermidis. The sortase A mutant showed diminished biofilm-forming capacity compared to the parent strain, due to lower levels of Aap anchoring to the cell wall. Antibodies raised against the N-terminus of Aap were able to specifically inhibit attachment on polystyrene [27]. Schaeffer, et al. produced genetic mutants of strain 1457 to better understand the role of Aap and the processing event. On plastic surfaces, complementing an aap-negative mutant with full-length aap enhanced initial attachment. However, complementing with the B-repeat region showed no effect on attachment [28]. In 2016, Paharik, et al. [29] identified a staphylococcal protease, SepA (controlled by SarA), that seems likely to be S. epidermidis' native mechanism of Aap processing. In strain 1457, a sepA-negative mutant formed biofilms which were able to attach to a surface, but unable to mature into well-structured biofilms. The sepA mutant lacked processed Aap, which prevented intercellular accumulation by the B-repeats. Therefore, processing of the A-repeats and lectin away from the B-repeat superdomain actually allows for the accumulation to occur, presumably by exposing the B-repeats in such a way that they can then interact with B-repeats on adjacent cells, rather than being sterically inhibited by additional protein at the N-terminus.

After identifying SepA, Paharik, et al. [29] proceeded to identify the specific cleavage sites in Aap. Fragments of recombinant Aap were expressed and purified, then incubated with concentrated culture supernatants from strain 1457ΔicaΔsarA, which showed higher expression of SepA and greater processing of Aap. When a fragment containing the lectin and B-repeat superdomain was exposed to the culture supernatant, there was a slight change in the size of the protein by SDS-PAGE, indicative of a cleavage event. N-terminal sequencing of the shifted fragment identified residues 601-608 (residues just before the B-repeat superdomain begins), indicating the removal of the lectin region at a site between the lectin and B-repeats. This is consistent with the expected region of the cleavage site based on previous work by Rohde, et al. [25]. Interestingly, the smaller cleaved fragment containing the lectin region was difficult to observe without slowing kinetics by keeping samples on ice before western blot analysis. This suggests the possibility that the cleaved fragment resulting from Aap processing is rapidly degraded. Another recombinant fragment contained the A-repeats and lectin region. This fragment also was processed by supernatant from strain 1457ΔicaΔsarA. N-terminal sequencing of the larger, processed fragment identified residues corresponding to the beginning of the lectin region, after the A-repeats. The other resulting fragment (containing the A-repeats) was not detected, suggesting rapid degradation of the A-repeats. Performing experiments in strains 1457ΔicaΔsarAΔecp and 1457ΔicaΔsarAΔesp, which lack the respective secreted proteases, Ecp and Esp, showed slowed processing of the lectin/B-repeat fragment, indicating these two proteases contribute to SepA processing at the cleavage site between the lectin region and B-repeat superdomain. These two proteases had no effect on processing of the A-repeat/lectin fragment [29].

iii. Zn2+-mediated self-assembly of Aap (role in accumulation)

After processing of Aap, the absence of the A-repeats and lectin region allows for exposure of the B-repeat superdomain. Depending on the strain of S. epidermidis, there may be 5-17 B-repeats within the superdomain. Each B-repeat contains a G5 domain and a spacer region, with the exception of the C-terminal half-repeat, which contains only a G5 domain. The N-terminal-most B-repeat is consistently the most divergent repeat, with only 81-90% identity to other B-repeats, while the other B-repeats have a very high identity to one another (84-100%) [37, 56]. Biophysical characterization of the B-repeat superdomain of Aap has been performed by the Herr Lab [36, 37, 56, 57], primarily utilizing a minimal Brpt1.5 construct containing the full, most C-terminal B-repeat and the C-terminal half B-repeat, which contains only the G5 domain.


Based on the presence of G5 domains in Zn2+ metalloproteases, Conrady, et al. evaluated the ability of recombinant Brpt1.5 to assemble in the presence of Zn2+. Analytical ultracentrifugation was used to show Brpt1.5 dimerized in the presence of Zn2+, with 1-2 Zn2+ ions required by each G5 domain. The functional relevance of this observation was evaluated using a biofilm formation assay. Interestingly, DTPA, a Zn2+ chelator, was able to prevent biofilm formation of S. epidermidis on polystyrene and S. aureus USA300 (a methicillin-resistant strain) on fibronectin-coated polystyrene. Addition of Zn2+ could restore biofilm formation. Recombinant Brpt1.5 was able to inhibit biofilm formation in the presence of Zn2+, providing further evidence of the involvement of Aap's B-repeats in the accumulation phase of biofilm formation. Given these results, Conrady, et al. proposed the zinc zipper model of accumulation, in which Zn2+ supports intercellular adhesion via overlapping, anti-parallel B-repeat superdomains [36].

After this initial study, Conrady, et al. [37] then determined X-ray crystallographic structures of Brpt1.5 as a dimer in the presence of Zn2+ (Figure I-3). As was expected based on the initial biophysical study, dimerization involved a His, as well as Glu and Asp residues, although several different arrangements were observed across structures, indicating a pleomorphic mechanism of Zn2+-binding. Overall, the structures also confirmed that Brpt1.5 is a highly extended protein rich in β-sheets and random coil [37]. The B-repeats do not contain the typical globular architecture with a hydrophobic core of nonpolar residues. Instead, there is a hydrophobic stack at the intersection of each spacer and G5 domain, which was of high importance for protein stability [37]. Chaton, et al. [57] recently investigated the preference for Zn2+ over other divalent cations. The primary factor in metal specificity seemed to be the coordination number, with Zn2+ nearly always having a coordination number of 4, making it an ideal fit when also considering typical residue coordination distances. In addition to Zn2+, Cu2+ was also able to induce assembly of Brpt1.5 (as well as a Brpt5.5 construct containing an additional four, C-terminal B-repeats). Mn2+, Co2+, and Ni2+ can all competitively inhibit Zn2+-dependent assembly, suggesting they have the ability to bind the Zn2+-binding sites, but cannot support assembly. This study proposed that Aap may be able to respond not only to local Zn2+ concentrations, but also to Cu2+, which is also elevated in areas of immune response [57].

[Figure I-3 image, panels (A)-(C); residue labels shown: H75, D21, E19, D149, E203, with bound Zn2+ ions]

Figure I-3. (A) A crystal structure of Brpt1.5 (PDB: 4FUN) with one protomer shown in magenta and the other shown in cyan. Two Zn2+ ions are shown as gray spheres. (B) A detailed view of the Zn2+-binding region shows H75 and D21 from the magenta protomer and E203 from the cyan protomer interacting with one Zn2+ ion. Panel (C) shows three different sets of residues involved in Zn2+-binding. See Conrady, et al. for additional details [37].


While there is a high degree of identity among the B-repeats, specifically in terms of the G5 domains, there is a set of eight residues that tend to swap in or out as a "consensus" or "variant" cassette. The cassettes contain residues spatially near the Zn2+-binding site, dimer interface and hydrophobic stack. Shelton, et al. [56] expressed minimal Brpt1.5 constructs containing consensus or variant G5 domains. Interestingly, the C-terminal G5 domain determined the thermal stability of the overall fold - with the variant G5 conferring 5-7°C in additional thermal stability over constructs having the consensus cassette in the C-terminal position. Conversely, the consensus cassette was required in the C-terminal G5 domain in order to support Zn2+-dependent dimerization. A significantly higher affinity was observed when both G5 domains were the consensus type, indicating stronger dimerization ability. In summary, variant cassettes offered higher thermal stability to the protein fold, while the consensus cassettes sacrificed thermal stability in exchange for better Zn2+-assembly capacity [56].


D. Considerations in Protein Folding and Stability

Nearly 30 years after Emil Fischer and Franz Hofmeister, in 1902, determined the composition of proteins to be covalently linked amino acids, Hsien Wu proposed a critical theory of protein folding and denaturation. He was the first to define the native (folded) state of a protein as compact and having regular folding patterns and non-covalent linkages. Upon denaturation, Wu stated, these non-covalent linkages are broken, and the protein becomes more diffuse and like a flexible chain. Prior to Wu's highly insightful theory on denaturation, the process was mistaken for depolymerization (via hydrolysis) or protein dehydration [58].

The next leap in understanding protein folding did not happen until 1959, when the physical chemist Walter Kauzmann published his wide-reaching review on factors of protein denaturation [59]. Kauzmann impressively tied together earlier mentions of hydrophobicity by Irving Langmuir and J.D. Bernal and a theoretical physics paper by Henry Frank and Marjorie Evans to conceptualize the hydrophobic effect as we now know it [60].

Protein folding is a delicate balance between a variety of forces, both attractive (favorable) and repulsive (unfavorable). The Gibbs free energy change between the folded, native state and the unfolded, denatured state is relatively small - usually about 5-20 kcal/mol of protein, so even "weaker" forces are important to consider [61]. While the hydrophobic effect is widely accepted as the primary driver of protein folding, electrostatic interactions, hydrogen bonding, and van der Waals forces all play essential roles in protein folding, and therefore, biological function [58, 62, 63]. The following sections are aimed at breaking down the variety of forces and types of interactions that govern protein folding and stability. With this knowledge, one can begin to understand why some proteins might aggregate or form highly stable amyloid fibrils, or in the case of intrinsically disordered proteins, lack a traditional folded native state.
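To put the 5-20 kcal/mol range quoted above into perspective, the following minimal sketch converts a Gibbs free energy of unfolding into the equilibrium fraction of unfolded molecules. It assumes a simple two-state (folded/unfolded) equilibrium at 25 °C, which is an illustrative simplification and not a model taken from the cited works; the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_unfolded(dg_unfold_kcal, temp_k=298.15):
    """Fraction of unfolded protein for an assumed two-state equilibrium.

    dg_unfold_kcal is the free energy of unfolding (positive when the
    folded state is favored). K_unfold = exp(-dG / RT).
    """
    k_unfold = math.exp(-dg_unfold_kcal / (R_KCAL * temp_k))
    return k_unfold / (1.0 + k_unfold)


# Values spanning the range quoted in the text
for dg in (5.0, 10.0, 20.0):
    print(f"dG_unfold = {dg:4.1f} kcal/mol -> "
          f"fraction unfolded = {fraction_unfolded(dg):.2e}")
```

Because the dependence is exponential, a change of only 1-2 kcal/mol (comparable to a single hydrogen bond or a conservative core mutation) shifts the unfolded population several-fold, which is why the "weaker" forces cannot be ignored.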

i. The hydrophobic effect

The hydrophobic effect can be described as the tendency for nonpolar amino acid sidechains to avoid contact with water molecules in the solvent. The result of this effect is that nonpolar residues are buried in the "hydrophobic core" of the globular protein, where they do not make contact with the water in the solvent, but instead are in contact with other nonpolar residues. This leaves polar residues on the surface of the protein to make more favorable contact with water molecules in the solvent.

Additionally, if hydrophobic residues were exposed on the surface of the protein, water molecules would form ordered "cages" around them, resulting in an unfavorable loss of entropy [64]. In Kauzmann's seminal papers explaining the hydrophobic effect, he cites as evidence the ability of nonpolar solvents to denature proteins. We now understand the mechanism for why this occurs - the driving force (polar solvent) that kept the nonpolar residues buried in the hydrophobic core is now gone, and the nonpolar residues can form favorable interactions with the nonpolar solvent (recall "like dissolves like"). The other piece of evidence Kauzmann proposed supporting the major role the hydrophobic effect plays in folding is that the stability of proteins shows a similar temperature dependence to that of nonpolar solutes [58].

Much more evidence supporting the hydrophobic effect's role as a dominant force in protein folding has amassed since Kauzmann's seminal papers [58]. One piece of evidence is the vast number of high-resolution protein structures showing nonpolar residues buried in hydrophobic cores [65]. Along similar lines, Pace, et al. [63] have measured the change in conformational stability of many mutants designed to query the impact of buried hydrophobic residues. They found, for example, that replacing an isoleucine in the hydrophobic core of a protein with the less hydrophobic valine reduces the stability of the protein by 1-2 kcal/mol. Across 22 proteins, the contribution of hydrophobic interactions was determined to be about 60% of the total contributions to protein stability [63]. Furthermore, Lim & Sauer made random mutations to the hydrophobic core of λ-repressor and found that the primary requirement allowing for a native fold and functional activity was for the residues to retain their hydrophobicity [66].

Rather than continuing individually into the variety of forces or interactions important in protein stability, we will examine different forces in the context of two perspectives - interactions between the solvent and solute (protein), and interactions occurring within the protein (solute-solute interactions). It is the combination of these two that is important in the consideration of overall protein folding.

ii. Solvent-solute interactions

Protein solubility was a major hurdle for early protein work [58], and is still a major problem in formulation of protein pharmaceuticals [67]. Proteins are typically not very soluble in pure water, but the presence of salt and buffer will greatly enhance the solubility. Furthermore, a variety of cosolvents are known to greatly stabilize or destabilize proteins. Hence, the solvent is a critical consideration in protein folding.


Much of the early work on understanding, and specifically quantitatively predicting, protein stability and the importance of solvent-protein interactions comes from the work of Charles Tanford [68, 69]. The experimental approach (based on earlier work by Thomas McMeekin, Edwin Cohn, and John Edsall [70]) of interest here is the use of solubility measurements for single amino acids and model compounds representing the protein backbone [68, 71]. To oversimplify the experiments, known amounts of the solute were placed into known amounts of solvent (pure water, urea, alcohol, etc.) and shaken in a constant temperature water bath for 24 hours. The mixture was then filtered and the soluble solute quantified either by dry weight or by titration. The free energies of transfer from water to urea, for example, were then calculated based on these solubility measurements at each specific concentration of urea [68]. A negative (i.e. favorable) Gibbs free energy change of transfer (usually displayed as ΔGtr or ΔFt) indicates greater solubility in the urea solution than in pure water, while a positive ΔGtr results from lower solubility in the urea solution. Conceptually, this means a negative ΔGtr value is indicative of favorable interactions between the amino acid (or peptide backbone) and the solvent. Table I-3 lists the ΔGtr from water to the specified solvent for the peptide group and the leucine sidechain as an example of a hydrophobic residue. Solubility experiments have mostly focused on residues with hydrophobic sidechains, which are mostly buried in the native state, but mostly accessible in the denatured state, creating the opportunity for a major impact on stability [69]. It should be noted that while it was originally believed that polar residues are sequestered to the surface of the protein, the abundance of high-resolution structures has allowed for re-evaluation of this idea. In fact, 37% of polar charged residues and 57% of polar uncharged residues are buried in smaller proteins, while larger proteins can bury even higher percentages [62, 72].
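The relationship between the solubility measurements described above and the sign of ΔGtr can be sketched as follows. This is a simplified illustration that assumes ideal behavior, ΔGtr = -RT ln(S_cosolvent / S_water), and ignores the activity-coefficient and concentration-scale corrections used in the original analyses; the function name and the solubility values are hypothetical.

```python
import math

R_CAL = 1.987  # gas constant in cal/(mol*K)


def transfer_free_energy(sol_water, sol_cosolvent, temp_k=298.15):
    """Approximate dG of transfer (cal/mol) from water to a cosolvent solution.

    Both solubilities must be expressed in the same units. Higher solubility
    in the cosolvent gives a negative (favorable) transfer free energy.
    """
    if sol_water <= 0 or sol_cosolvent <= 0:
        raise ValueError("solubilities must be positive")
    return -R_CAL * temp_k * math.log(sol_cosolvent / sol_water)


# Hypothetical solubilities (arbitrary but matching units), for illustration only
print(transfer_free_energy(1.00, 1.20))  # more soluble in cosolvent -> negative
print(transfer_free_energy(1.00, 0.80))  # less soluble in cosolvent -> positive
```

Under this approximation, a solute that is modestly more soluble in the cosolvent solution than in water yields a transfer free energy on the order of tens to hundreds of cal/mol, the same scale as the urea and GdnHCl entries in Table I-3.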

Table I-3. Gibbs free energy of transfers from water to other solvents

Solvent            Peptide    Leu side chain
Urea (2 M)           -70          -110
GdnHCl (2 M)        -135          -210
Sarcosine (2 M)      +90           +80
TMAO (2 M)          +180           +20
Ethanol            +1400         -1800
Cyclohexane        +7600         -4900
Vacuum             +9800         -2300

See Pace, et al. 2004 for individual references [62]. Values are ΔGtr in cal mol-1.

While it is out of the scope of this discussion to elaborate on all of the above solvents, a detailed example of urea and TMAO (trimethylamine N-oxide) will be presented. Table I-4 summarizes the breakdown of backbone and sidechain interactions and their relative contributions to stability. Urea, of course, is a denaturing osmolyte, causing proteins to unfold. The mechanism for this process is largely that the peptide backbone has a favorable transfer to urea, as evidenced by the negative value in Table I-3, above. Therefore, the unfolded state, where there is more backbone solvent exposure, is preferred. However, while hydrophobic sidechains have largely favorable transfers as well, polar and charged residues show slightly unfavorable transfers [73]. Arguments have been made as to whether the backbone or the sidechains are most responsible for urea's denaturing ability, particularly based on the solvent-accessibility assumed in different models [74, 75]. Urea may also affect water ordering around the protein, providing an indirect, entropic effect [76].

Table I-4. Component contributions to Gibbs free energy of transfers

Columns: Solvent | Net Total | Peptide Group | Sidechain Total | Polar Charged | Polar Uncharged | Hydrophobic
Urea: ------+ ++ + --
TMAO: ++ +++ ------

"+" signs indicate strength of positive ΔGtr (unfavorable interaction).
"-" signs indicate strength of negative ΔGtr (favorable interactions).
Summarized from Auton, et al. 2011 [73].

TMAO is a well-known stabilizing osmolyte. In nature, TMAO can be found in organisms present in extreme environments, where the ability of proteins to maintain a folded state would be otherwise challenging [77, 78]. In fact, TMAO can counteract urea [73, 79]. Like urea, the effect of TMAO is primarily governed by the TMAO-backbone interactions. However, there is a positive ΔGtr for the backbone into TMAO, meaning the transfer is unfavorable. Sidechains tend to have a slightly favorable transfer to TMAO, regardless of type [73]. Therefore, TMAO stabilizes proteins primarily by causing the protein to bury backbone to prevent unfavorable interactions, thereby shifting the equilibrium to the folded, often native and functional state [80].

Another solvent worth mentioning briefly is trifluoroethanol (TFE). Similar to TMAO, the peptide backbone has a positive, unfavorable ΔGtr into TFE (see solubility measurements for ethanol [81]). Generally, this causes folding to be favorable due to preferential exclusion of the alcohol from the backbone. However, TFE-protein hydrogen bonding is weaker than the water-protein hydrogen bonding, resulting in a preference for α-helix formation, which maximizes intramolecular hydrogen bonding, while also minimizing backbone solvent exposure [82].

iii. Solute-solute interactions

With the considerations from solvent-solute interactions in mind, we next turn to solute-solute interactions, specifically intramolecular protein-protein interactions such as those important in some secondary structure elements.

Electrostatics were identified early on as an important force in folding. As is now well-known, protein folding (and function) depends highly on pH and salt concentration of the buffer. The charged state of sidechains, and therefore overall protein net charge, changes with pH. Protonation or deprotonation of sidechains may also result in the breaking of specific ion pairing (salt-bridges).

Protein solubility is also directly affected by electrostatics. A phenomenon known as "salting in" describes increased protein solubility at low salt concentrations. The electrostatic shielding effect of salt weakens the attractive electrostatic interactions that may form between adjacent protein molecules and result in non-specific aggregation. "Salting out" describes the opposite effect occurring at high salt concentrations, where salt ions outcompete charged protein sidechains for solvent interactions. As a result, protein-protein interactions become more favorable than protein-solvent interactions, thereby causing aggregation or precipitation of the protein [64].


The isoelectric point, or pI, is defined as the pH at which a protein has no net charge, and therefore the contribution of net-charge repulsion to stability should be null. As the pH is shifted with the addition of acid, for example, the net charge increases with protonation of sidechains. Proteins generally exhibit lower stability outside the physiological pH range [83]. As the net charge increases, the charge density of the molecule also increases, and an increasing electrostatic repulsion will arise. It will then become more favorable for the protein to expand and unfold, thereby decreasing the effects of high charge density and electrostatic repulsion [58].
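The dependence of net charge on pH, and the resulting isoelectric point, can be illustrated with a short calculation. The sketch below treats each ionizable group independently using the Henderson-Hasselbalch relationship. The pKa values are assumed textbook-style model values and the residue composition is hypothetical; neither comes from the works cited here, and real pKa values shift with the local environment of a folded protein.

```python
# Assumed model pKa values (illustrative only)
PKA_BASIC = {"Nterm": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_ACIDIC = {"Cterm": 2.0, "D": 3.9, "E": 4.2}


def net_charge(counts, ph):
    """Net charge at a given pH, summing independent ionizable groups."""
    # Basic groups carry +1 when protonated
    positive = sum(n / (1.0 + 10.0 ** (ph - PKA_BASIC[g]))
                   for g, n in counts.items() if g in PKA_BASIC)
    # Acidic groups carry -1 when deprotonated
    negative = sum(n / (1.0 + 10.0 ** (PKA_ACIDIC[g] - ph))
                   for g, n in counts.items() if g in PKA_ACIDIC)
    return positive - negative


def isoelectric_point(counts, lo=0.0, hi=14.0, tol=1e-4):
    """Find the pH of zero net charge by bisection.

    Net charge decreases monotonically with pH in this model, so bisection
    converges as long as the charge changes sign between lo and hi.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(counts, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical composition of ionizable groups
example_counts = {"Nterm": 1, "Cterm": 1, "K": 6, "R": 4, "H": 2, "D": 8, "E": 9}
pI = isoelectric_point(example_counts)
for ph in (pI - 2.0, pI, pI + 2.0):
    print(f"pH {ph:5.2f}: net charge {net_charge(example_counts, ph):+.2f}")
```

Moving the pH away from the pI in either direction increases the magnitude of the net charge, which is the origin of the charge-density repulsion described above.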

Electrostatic contributions also appear via specific ion pairing between charged sidechains. While such interactions were once thought to be the dominant folding force, they are now believed to contribute a free energy value 5- or 10-fold smaller than the contribution of the hydrophobic effect. The relatively low number of ion pairs observed in folded proteins, the low conservation of these residues, and the weak effect of mutating charged residues all offer evidence that electrostatics and specifically ion pairing are not the major driving force for folding [58]. Additionally, the formation of an ion pair causes a loss of entropy of the sidechains, and a loss of favorable solvent interactions, such that these will offset a portion of any benefit to folding [64].

Intramolecular hydrogen bonding was also once in the running for primary folding force, especially after Linus Pauling proposed α-helices and β-sheets as major structural elements of folded proteins [58, 62]. Nick Pace and co-workers have quite thoroughly investigated the contribution of hydrogen bonding and its effect on protein stability [62, 84, 85]. By making mutations to buried residues, they have shown that hydrogen bonding in the protein core increases the stability (favors folding), and the burial of a polar residue could actually stabilize the protein more than burial of a nonpolar residue (in certain situations) [62]. Not surprisingly, essentially all buried polar groups in known crystal structures are hydrogen bonded [86]. Hydrogen bonding is considered by others to play only a minor role, because while hydrogen bonding occurs in the folded state, it also occurs in the unfolded state, between the protein and water molecules [64]. Regardless of the magnitude of the stabilizing effect of hydrogen bonding on protein folding, hydrogen bonding has been suggested to contribute specificity, leading to a unique native structure [64, 87]. This is in contrast to the hydrophobic effect, which may lack specificity [88].

While transfer free energy estimations and hydrophobicity contributions provided useful estimates for predicting protein stability, the resulting values of the Gibbs free energy change of unfolding were in the range of 100-200 kcal/mol, about 10-fold larger than the observed Gibbs free energy change, ΔG, of unfolding [58]. It turns out the major missing force was one which strongly opposes folding: entropy. Entropic effects can be classified as local or nonlocal. The local entropic forces deal with the translational, rotational, and vibrational entropies expected for any small molecule. Specific ion pairing and secondary structure formation are features of the folded state which result in unfavorable entropic contributions, because they reduce the ability of those sidechains or the backbone to sample additional configurations [58]. The nonlocal entropic contribution is due to excluded volume or steric restrictions, which are apparent when one considers a protein as a polymer chain, rather than individual amino acids. The nonlocal entropy contribution is a function of the number of chain configurations (which increases with the number of residues) and chain density (related to the occupied volume). Proteins in their native, folded state are usually extremely compact, while the unfolded state is much more expanded (favored by entropy). Important experimental evidence showing that entropy significantly opposes folding includes mutating residues to proline, thereby reducing configurational degrees of freedom in the unfolded state, and observing increased protein stability [89, 90]. Additional support comes from observations that cross-links and disulfide bonds, which reduce the available configurations of the unfolded chain and therefore weaken the entropic contribution, result in significant increases in protein stability [58, 91].
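As a rough illustration of why the magnitude of ΔG matters, the following sketch converts a free energy of unfolding into the fraction of unfolded molecules at equilibrium. It assumes a simple two-state model (folded ⇌ unfolded) with K_unfold = exp(−ΔG/RT). The two-state assumption, the temperature of 298.15 K, the function name, and the example ΔG values are illustrative choices and are not taken from the cited studies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def fraction_unfolded(dg_unfold_kcal, temp_k=298.15):
    """Equilibrium fraction unfolded for a two-state protein.

    dg_unfold_kcal is the Gibbs free energy change of unfolding
    (positive values favor the folded state).
    """
    k_unfold = math.exp(-dg_unfold_kcal / (R_KCAL * temp_k))
    return k_unfold / (1.0 + k_unfold)

# Hypothetical stabilities, chosen only to show the trend.
for dg in (0.0, 5.0, 10.0, 20.0):
    frac = fraction_unfolded(dg)
    print(f"dG_unfold = {dg:5.1f} kcal/mol -> fraction unfolded = {frac:.2e}")
```

Because the population depends exponentially on ΔG, even a few kcal/mol of opposing entropy shifts the folded/unfolded balance by orders of magnitude.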


E. Protein misfolding and natively unfolded proteins

The protein folding pathway in the cell includes a period during which hydrophobic regions of newly translated proteins are solvent-exposed and secondary structure elements are not yet formed. Chaperones and other cellular machinery have evolved to assist in proper protein folding, as well as in the detection and disposal of misfolded and aggregated proteins. Even with the extensive efforts by the cell to prevent misfolding, proteins can still elude quality control, resulting in protein unfolding and/or aggregation that can lead to disease [92]. The first part of this section will focus on protein aggregation and one specific type of aggregation, amyloidogenesis. The second part will focus on intrinsically disordered proteins, whose native state lacks the well-ordered structure seen in folded proteins.

i. Protein aggregation and amyloid formation

Protein aggregation is a generic term referring to the interaction of a large number of protein molecules, usually leading to visible, insoluble particles. Often, protein aggregation is nonspecific and results in loss of protein function. However, amyloidogenesis, or the formation of amyloid, is specific, ordered aggregation which may be associated with a gain or loss of protein function. Features of an amyloid traditionally include binding to dyes such as Congo Red or Thioflavin T, having primarily β-sheet secondary structure, and showing a fibrillar morphology by electron microscopy [92, 93]. Proteins which can form amyloids can also form non-ordered aggregates, termed amorphous or native-like aggregates [94, 95]. Amyloidogenesis typically shows a characteristic lag phase, where a monomer or base unit completely or partially unfolds and takes on a conformation known as the nucleus. This nucleating species is the smallest unit which can initiate fibril elongation. The exponential or polymerization phase begins once the nucleating species is formed (or introduced via "seeding" - addition of pre-formed fibrils), and then aggregation into well-ordered fibrils occurs rapidly by the addition of monomers or single units. Following the exponential phase, saturation occurs and an equilibrium is reached where there is no net change in the length or number of fibrils [95, 96].
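The lag, exponential, and saturation phases described above are often summarized with an empirical sigmoidal curve. The sketch below uses one commonly applied form, F(t) = F0 + A / (1 + exp(−k (t − t_half))), with the lag time conventionally taken as t_half − 2/k. It is an illustrative description only, not a mechanistic model and not a method used in this work; the function names and the parameter values (f0, amplitude, k_app, t_half) are assumptions chosen for the example.

```python
import math

def sigmoid_signal(t, f0, amplitude, k_app, t_half):
    """Empirical sigmoid for an aggregation signal (e.g. dye fluorescence)."""
    return f0 + amplitude / (1.0 + math.exp(-k_app * (t - t_half)))

def lag_time(k_app, t_half):
    """Conventional lag time: where the tangent at t_half meets the baseline."""
    return t_half - 2.0 / k_app

# Hypothetical parameters (time in hours, signal in arbitrary units).
f0, amplitude, k_app, t_half = 0.0, 1.0, 0.5, 10.0

print("lag time (h):", lag_time(k_app, t_half))
for t in (0, 5, 10, 15, 20, 30):
    signal = sigmoid_signal(t, f0, amplitude, k_app, t_half)
    print(f"t = {t:2d} h  signal = {signal:.3f}")
```

In this picture, seeding with pre-formed fibrils would appear as a shorter lag time (smaller t_half) with a similar final plateau.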

a. Amyloid stability

Once amyloid fibrils are formed, they are extremely stable, often more stable than the native state of the protein, and do not appear to be in equilibrium with non-aggregated states [97, 98]. The tensile strength of amyloid fibrils can approach that of steel [99]. The molecular basis for such stability is the extensive intermolecular hydrogen-bonding network along the protein backbone, as well as van der Waals interactions [99-102]. Given the vast sequence diversity among amyloid-forming proteins, it is not surprising that the protein backbone is a major source of the interactions which stabilize the amyloid [95]; this is in stark contrast to protein folding. As previously discussed, the primary determinant of protein folding is the hydrophobic effect, or the preference to bury hydrophobic sidechains. Specific sidechain contacts [99], steric restrictions of proline [103], and amino acid identity [104], however, still play a notable role in amyloidogenesis. For example, hydrophobic residues tend to show higher aggregation propensities than polar residues [104].


b. Pathogenic vs. functional amyloid

Amyloids are often discussed in terms of human neurological diseases, such as Alzheimer's (amyloid-β peptide), Huntington's (Huntingtin exon 1), and Parkinson's (α-synuclein) diseases, but amyloid deposits or inclusions are involved in about 70 different human diseases [95]. In half of these diseases, the onset of disease correlates with aging, where gradual loss in the ability of cells to regulate protein misfolding and aggregation may lead to a greater number of proteins passing through "quality control" unchecked [92, 95, 105]. In other words, the formation of amyloid appears to be accidental, and pathogenicity occurs as a result. Whether the amyloid fibrils or oligomeric intermediates are the toxic species is of particular interest for future therapeutic endeavors. Interestingly, many other cases have been documented in which amyloidogenesis is a highly regulated process which can be induced by the host organism for its own benefit. These amyloids are referred to as functional amyloids [95].

Functional amyloids have been described in bacterial [106-109], fungal [110], human [111], and archaeal [112] systems. Lower-level organisms, such as bacteria, fungi, insects, and archaea, tend to utilize amyloids for their physical strength, adding mechanical and chemical stability to structures such as biofilms, larvae or spores, and cell walls [112-114].

While functional amyloids share similar structural features with "traditional" or "toxic" amyloids, the important difference is in the regulation of polymerization. The most well-understood example of a bacterial functional amyloid is curli produced by Escherichia coli. Two separate operons are dedicated to curli assembly and regulation - csgBAC and csgDEFG [106]. Curli fibrils themselves are composed of CsgA and CsgB, where CsgB acts as the minor subunit which preferentially folds into a nucleating species on the surface of the bacterium. CsgA rapidly polymerizes in the presence of CsgB. CsgC inhibits polymerization in the periplasm, allowing for CsgG to secrete the subunits into the extracellular space. CsgE and CsgF assist CsgG. CsgD is responsible for transcriptional activation of the csgBAC operon [106]. The functions of curli fibrils include colonization and biofilm formation, host tissue interaction, and immune evasion [108, 109, 115]. Other bacterial functional amyloids act as adhesins, modulate surface properties, and attach to extracellular DNA to contribute to biofilm stability [109, 116, 117].

ii. Intrinsically disordered proteins

Intrinsically disordered proteins (IDPs) have taken many names during their exponential rise to fame through the 1990s and 2000s, including natively unfolded, intrinsically unstructured, dancing proteins, protein clouds, partially folded, etc. [118-120]. Regardless of the specific terminology, IDPs are proteins which lack significant secondary structure and are often highly expanded or extended, lacking a compact, globular structure [118].

a. Factors determining conformational propensities

Considering the hydrophobic effect is the driving force leading to compact protein structure in well-folded proteins, it shouldn't be too surprising that hydrophobic residues are severely under-represented in IDPs [121]. The second strongest sequence-based determinant of whether or not a protein is intrinsically disordered is net charge. Charge-hydropathy plots, especially one sometimes known as the "Uversky Plot," segregate IDPs from folded proteins surprisingly well based on only two simple parameters [118].
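To make the two parameters of a charge-hydropathy plot concrete, the sketch below computes the mean hydropathy and the absolute mean net charge of a sequence and compares them to a linear boundary. The boundary ⟨R⟩ = 2.785⟨H⟩ − 1.151 and the Kyte-Doolittle scale rescaled to 0-1 are the values commonly quoted for this type of plot; they are stated here as assumptions rather than taken from the text above, as are the function names and the simplified charge model (Lys/Arg = +1, Asp/Glu = −1, His neutral). Both test sequences are hypothetical: one is an idealized repeat of the AEPGKP motif, the other an arbitrary hydrophobic stretch.

```python
# Kyte-Doolittle hydropathy values (range -4.5 to 4.5).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy(seq):
    """Mean Kyte-Doolittle hydropathy, rescaled so each residue lies in 0-1."""
    return sum((KD[aa] + 4.5) / 9.0 for aa in seq) / len(seq)

def mean_net_charge(seq):
    """Absolute net charge per residue (simplified: K/R = +1, D/E = -1)."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return abs(positive - negative) / len(seq)

def predicted_disordered(seq):
    """True if the sequence lies on the disordered side of the assumed boundary."""
    boundary_charge = 2.785 * mean_hydropathy(seq) - 1.151
    return mean_net_charge(seq) > boundary_charge

for name, seq in (("AEPGKP x5", "AEPGKP" * 5), ("hydrophobic", "MVLLIAVAGLFA")):
    print(name, round(mean_hydropathy(seq), 3), round(mean_net_charge(seq), 3),
          predicted_disordered(seq))
```

A whole-sequence average like this ignores local context, so it is best read as a coarse classifier rather than a residue-level disorder prediction.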

Polar residues themselves are over-represented in IDPs, allowing for favorable interactions with water molecules in the solvent. In the case of sequences composed mostly of charged, polar residues, the main influence on compaction is actually the distribution or mixing of the charges. For instance, a well-mixed (i.e. positive-negative-positive-negative residues) sequence will remain highly extended, resembling self-avoiding random walks due to electrostatic repulsion, whereas a well-segregated sequence will be dominated by attractive electrostatic interactions, forming hairpins and a more compact ensemble of structures [122, 123]. The structural propensity of sequences which contain fewer charged residues overall is determined primarily by polyproline type-II helix propensity and only weakly affected by electrostatic repulsions and α-helix propensity [124, 125].
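The distinction between well-mixed and well-segregated charges can be quantified in a simple way. The sketch below scores how much the local charge asymmetry in short windows deviates from the asymmetry of the whole sequence. It is a simplified score, loosely modeled on window-based charge-patterning analyses; it is not the published parameter from the cited studies, and the window size of 5, the function names, and the two test sequences are assumptions made for illustration.

```python
def charge_asymmetry(segment):
    """(f+ - f-)^2 / (f+ + f-) for a segment; 0 if the segment is uncharged."""
    n = len(segment)
    f_pos = sum(segment.count(aa) for aa in "KR") / n
    f_neg = sum(segment.count(aa) for aa in "DE") / n
    total = f_pos + f_neg
    return 0.0 if total == 0 else (f_pos - f_neg) ** 2 / total

def segregation_score(seq, window=5):
    """Mean squared deviation of windowed asymmetry from whole-sequence asymmetry."""
    overall = charge_asymmetry(seq)
    windows = [seq[i:i + window] for i in range(len(seq) - window + 1)]
    return sum((charge_asymmetry(w) - overall) ** 2 for w in windows) / len(windows)

well_mixed = "EK" * 8            # alternating negative/positive residues
segregated = "E" * 8 + "K" * 8   # all negatives, then all positives

print("well-mixed:", segregation_score(well_mixed))
print("segregated:", segregation_score(segregated))
```

Both sequences have identical composition and zero net charge, so any difference in the score reflects only how the charges are arranged along the chain.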

IDPs have many conformations available to them, each with similarly favorable stability. This is very different from well-folded proteins, which usually have a single specific conformation with much higher stability than other possible conformations, causing that particular high-stability conformation to be strongly preferred. The difference between these two cases is primarily due to the fact that IDPs lack the strong unfavorable interactions between the hydrophobic residues and solvent (i.e. the hydrophobic effect). Instead, IDPs rely heavily on (generally) weaker forces that must be well-balanced to avoid, on one hand, trapping a particular folded state (when not desired) and, on the other hand, aggregation (e.g. α-synuclein and amyloid-β are pathogenic, amyloid-forming IDPs). For example, many IDPs undergo folding upon binding, a process which demonstrates that a favorable enthalpic contribution is able to overcome the entropic penalty of forming a more ordered structure or ensemble [126].

b. Functional roles of IDPs

The function of proteins has long been known to be closely tied to their structure [127], and denaturation of folded proteins causing loss of activity was strong evidence for the case [128]. Evidence that unstructured regions could be functionally relevant started mounting when missing electron density in X-ray crystallography structures was observed for regions which had been shown to be important for function. NMR stepped in as a critical technique for examining these functionally important disordered regions [121]. A revised hypothesis for the relationship between structure and function is called "The Protein Trinity." The three states which compose the trinity are: ordered, molten globule, and random coil. The ordered state refers to the well-folded, compact structure typically associated with function. The molten globule state is a partially folded state which has been proposed to be an intermediate in protein folding, with secondary structure similar to the ordered state, but a higher range of motion and a more expanded ensemble more closely resembling random coil states. The random coil state is the least-structured state, able to exist in the greatest number of conformations. Each of these three states is proposed to be in equilibrium, and function can be related to a state or to the transitions between states [121]. The Protein Trinity has also been extended to "The Protein Quartet" to accommodate both a molten globule and a premolten globule [129]. An important consideration when discussing disordered proteins is that "disorder ≠ disorder." IDPs can have different sequence compositions, degrees of compaction, responses to pH, salts, temperature, and co-solvents, and can even undergo folding upon binding to ligands [130, 131].

The function of IDPs is wide-reaching, and attempts have been made to organize classifications [131]. IDPs or IDRs (intrinsically disordered regions - the terms are often used interchangeably) in the entropic chains and springs category offer a way for proteins to keep domains positioned or spaced appropriately, and their disorder is directly important in this function. Some entropic chains are useful in harsh environments, where extreme temperatures or the presence of other denaturing conditions have no effect on the conformation of the protein/region, because the protein is already disordered [121, 132, 133]. Regions of disorder often are observed at sites of post-translational modifications or other areas where a particular motif needs to be easily accessible. Chaperones often contain disordered regions (over 1/3 disordered in the case of protein chaperones), which can bind multiple different targets, have rapid association and dissociation, and fold upon binding. Effectors can be separate proteins, or they can be disordered regions of a protein, which will bind and modulate activity, often via allosteric mechanisms. Assemblers are usually IDRs which act as scaffolds to allow for protein binding and assembly of higher-order complexes. Finally, scavengers bind small ligands like ATP or tannin molecules, functioning as a sink for storage or to neutralize these ligands [131].

An analysis of the human proteome found that over 20% of all residues are located in regions predicted to be disordered. Furthermore, about 35% of proteins contain disordered regions longer than 30 residues, and 21.9% contain disordered regions greater than 50 residues long [134]. Interestingly, bacteria and archaea average just 5.7% and 3.8% total residues disordered, respectively. This observation supports the importance of IDPs in signaling and regulation, given the larger interaction networks found in humans [135]. Along with the important role of IDPs in signaling and regulatory processes comes the implication of IDPs in disease when dysfunction occurs. In fact, over half of the proteins associated with cancer, cardiovascular disease, neurodegenerative diseases, and diabetes contain disordered regions at least 30 residues long [135]. IDPs have also been of recent interest as therapeutic targets due to their role in disease, their ability to bind multiple partners, and the fact that amino acid sequence is the primary determinant of binding (as opposed to 3D structure in the case of folded proteins) [136, 137].


F. Goals of dissertation work

The Herr Lab has performed extensive biophysical and structural characterization of minimal B-repeat constructs containing one full B-repeat and the C-terminal half repeat cap (Brpt1.5) [36, 37, 56, 57]. Corrigan et al. have demonstrated the importance of tandem B-repeats in S. aureus biofilm formation by showing that at least 5 B-repeats were required in SasG for biofilm formation to be observed [138]. Therefore, it was important to express constructs containing additional B-repeats. Stefanie Johns, while in the Herr Lab, produced a construct which was originally believed to contain five and a half B-repeats, but was later found to be a Brpt3.5 construct after mass spectrometry analysis. The difficulty in working with multiple, nearly identical repeats, along with the increased accessibility of commercial gene synthesis, led to the decision to design well-defined Brpt3.5 and Brpt5.5 constructs to be ordered through LifeTechnologies' GeneArt Synthesis. Furthermore, the DNA sequence could be codon-optimized to introduce sequence uniqueness, allowing for easier mutagenesis downstream. These constructs were essential in creating a more biologically relevant study of the role of B-repeats. In particular, we found that the Brpt3.5 and Brpt5.5 constructs were able to assemble beyond the monomer-dimer association observed using Brpt1.5 constructs, going on to form functional amyloid-like aggregates resembling extracellular material visible in S. epidermidis biofilms. The role of tandem B-repeats in biofilm formation is the focus of Chapter II. An in-depth biophysical characterization of the early, reversible stages of Brpt5.5 assembly can be found in Chapter III.


During the final years of dissertation work, a protein called small basic protein, Sbp, was identified as playing an extensive role in S. epidermidis biofilm formation, from attachment to accumulation [24]. Sbp was isolated by running crude biofilm mixture over B-repeat-coupled sepharose beads. Sbp, therefore, was predicted to be a binding partner of Aap. Additional experiments in this study provided convincing data to support such an interaction. Interestingly, a dissertation from the laboratory of a faculty member involved in this initial study revealed a lack of Brpt1.5-Sbp interactions, based on gel filtration, native mass spectrometry, and microscale thermophoresis. At this point, we were well aware of the fact that tandem B-repeats behave differently than the minimal Brpt1.5 construct. Using our Brpt5.5 construct, we made a striking observation - Sbp rapidly induced aggregation of Brpt5.5 at Zn2+ concentrations nearly 10-fold lower than the concentration at which Brpt5.5 alone was expected to form this aggregate. Chapter IV starts with a biophysical characterization of Sbp before diving into the mechanism of interaction between Sbp and Brpt5.5.

Despite the prevalence of "stalk-like" regions of low complexity among Gram-positive, cell wall-anchored proteins, very little work has been done investigating the structure or function of these regions. The stalk-like region of Aap, the proline/glycine-rich region (PGR), will be a major focus of this dissertation work. In terms of the primary sequence, the 135-residue PGR has repeating patterns of AEPGKP, with some substitutions to AEPGTP, and with more variation leading up to the LPXTG sortase motif. Not surprisingly, given the frequency of Pro residues, there is no predicted secondary structure for this region, and a prediction of the hydrodynamic radius based on predicted polyproline type-II (PPII) helix content indicates a very elongated conformation due to extremely high PPII propensity. This work is discussed in Chapter V. This work was furthered by investigating the PGR of SasG, the S. aureus ortholog of Aap, as well as the Ser/Asp-repeat regions of the Sdr family of proteins (see Chapter VI). In addition, we performed a similar analysis of the A-repeat region of Aap, which is unstructured, but would presumably serve a different function than the stalk-like PGR.


References

1. Otto M. Staphylococcus epidermidis--the 'accidental' pathogen. Nature reviews Microbiology. 2009;7(8):555-67. doi: 10.1038/nrmicro2182. PubMed PMID: 19609257; PubMed Central PMCID: PMC2807625.
2. Uckay I, Pittet D, Vaudaux P, Sax H, Lew D, Waldvogel F. Foreign body infections due to Staphylococcus epidermidis. Annals of medicine. 2009;41(2):109-19. Epub 2008/08/23. doi: 10.1080/07853890802337045. PubMed PMID: 18720093.
3. CDC. National Nosocomial Infections Surveillance (NNIS) system report.
4. Rogers KL, Fey PD, Rupp ME. Coagulase-negative staphylococcal infections. Infectious disease clinics of North America. 2009;23(1):73-98. Epub 2009/01/13. doi: 10.1016/j.idc.2008.10.001. PubMed PMID: 19135917.
5. Costerton JW, Stewart PS, Greenberg EP. Bacterial biofilms: a common cause of persistent infections. Science (New York, NY). 1999;284(5418):1318-22. Epub 1999/05/21. PubMed PMID: 10334980.
6. Otto M. Staphylococcal biofilms. Current topics in microbiology and immunology. 2008;322:207-28. Epub 2008/05/06. PubMed PMID: 18453278; PubMed Central PMCID: PMC2777538.
7. Arrecubieta C, Lee MH, Macey A, Foster TJ, Lowy FD. SdrF, a Staphylococcus epidermidis surface protein, binds type I collagen. The Journal of biological chemistry. 2007;282(26):18767-76. Epub 2007/05/03. doi: 10.1074/jbc.M610940200. PubMed PMID: 17472965.
8. Dickinson GM, Bisno AL. Infections associated with indwelling devices: concepts of pathogenesis; infections associated with intravascular devices. Antimicrobial agents and chemotherapy. 1989;33(5):597-601. Epub 1989/05/01. PubMed PMID: 2665637; PubMed Central PMCID: PMC172496.
9. Heilmann C, Hussain M, Peters G, Gotz F. Evidence for autolysin-mediated primary attachment of Staphylococcus epidermidis to a polystyrene surface. Molecular microbiology. 1997;24(5):1013-24. Epub 1997/06/01. PubMed PMID: 9220008.
10. Heilmann C, Thumm G, Chhatwal GS, Hartleib J, Uekotter A, Peters G. Identification and characterization of a novel autolysin (Aae) with adhesive properties from Staphylococcus epidermidis. Microbiology (Reading, England). 2003;149(Pt 10):2769-78. Epub 2003/10/03. doi: 10.1099/mic.0.26527-0. PubMed PMID: 14523110.
11. Bowden MG, Chen W, Singvall J, Xu Y, Peacock SJ, Valtulina V, et al. Identification and preliminary characterization of cell-wall-anchored proteins of Staphylococcus epidermidis. Microbiology (Reading, England). 2005;151(5):1453-64. doi: 10.1099/mic.0.27534-0.
12. Marraffini LA, Dedent AC, Schneewind O. Sortases and the art of anchoring proteins to the envelopes of gram-positive bacteria. Microbiology and molecular biology reviews : MMBR. 2006;70(1):192-221. Epub 2006/03/10. doi: 10.1128/mmbr.70.1.192-221.2006. PubMed PMID: 16524923; PubMed Central PMCID: PMC1393253.
13. Herman P, El-Kirat-Chatel S, Beaussart A, Geoghegan JA, Foster TJ, Dufrene YF. The binding force of the staphylococcal adhesin SdrG is remarkably strong. Molecular microbiology. 2014;93(2):356-68. Epub 2014/06/06. doi: 10.1111/mmi.12663. PubMed PMID: 24898289.


14. Vanzieleghem T, Herman-Bausier P, Dufrene YF, Mahillon J. Staphylococcus epidermidis Affinity for Fibrinogen-Coated Surfaces Correlates with the Abundance of the SdrG Adhesin on the Cell Surface. Langmuir : the ACS journal of surfaces and colloids. 2015;31(16):4713-21. Epub 2015/03/31. doi: 10.1021/acs.langmuir.5b00360. PubMed PMID: 25821995.
15. Herman-Bausier P, Dufrene YF. Atomic force microscopy reveals a dual collagen-binding activity for the staphylococcal surface protein SdrF. Molecular microbiology. 2016;99(3):611-21. Epub 2015/10/21. doi: 10.1111/mmi.13254. PubMed PMID: 26481199.
16. Ponnuraj K, Bowden MG, Davis S, Gurusiddappa S, Moore D, Choe D, et al. A "dock, lock, and latch" structural model for a staphylococcal adhesin binding to fibrinogen. Cell. 2003;115(2):217-28. Epub 2003/10/22. PubMed PMID: 14567919.
17. Trivedi S, Uhlemann AC, Herman-Bausier P, Sullivan SB, Sowash MG, Flores EY, et al. The Surface Protein SdrF Mediates Staphylococcus epidermidis Adherence to Keratin. The Journal of infectious diseases. 2017;215(12):1846-54. Epub 2017/05/10. doi: 10.1093/infdis/jix213. PubMed PMID: 28482041; PubMed Central PMCID: PMC5853823.
18. McCrea KW, Hartford O, Davis S, Eidhin DN, Lina G, Speziale P, et al. The serine-aspartate repeat (Sdr) protein family in Staphylococcus epidermidis. Microbiology (Reading, England). 2000;146(Pt 7):1535-46. Epub 2000/07/06. doi: 10.1099/00221287-146-7-1535. PubMed PMID: 10878118.
19. Vacheethasanee K, Temenoff JS, Higashi JM, Gary A, Anderson JM, Bayston R, et al. Bacterial surface properties of clinically isolated Staphylococcus epidermidis strains determine adhesion on polyethylene. Journal of biomedical materials research. 1998;42(3):425-32. Epub 1998/10/27. PubMed PMID: 9788506.
20. Tormo MA, Knecht E, Gotz F, Lasa I, Penades JR. Bap-dependent biofilm formation by pathogenic species of Staphylococcus: evidence of horizontal gene transfer? Microbiology (Reading, England). 2005;151(Pt 7):2465-75. Epub 2005/07/08. doi: 10.1099/mic.0.27865-0. PubMed PMID: 16000737.
21. Poxton IR. Chapter 5 - Teichoic Acids, Lipoteichoic Acids and Other Secondary Cell Wall and Membrane Polysaccharides of Gram-Positive Bacteria. In: Tang Y-W, Sussman M, Liu D, Poxton I, Schwartzman J, editors. Molecular Medical Microbiology (Second Edition). Boston: Academic Press; 2015. p. 91-103.
22. Weidenmaier C, Peschel A. Teichoic acids and related cell-wall glycopolymers in Gram-positive physiology and host interactions. Nature reviews Microbiology. 2008;6(4):276-87. Epub 2008/03/11. doi: 10.1038/nrmicro1861. PubMed PMID: 18327271.
23. Gross M, Cramton SE, Gotz F, Peschel A. Key role of teichoic acid net charge in Staphylococcus aureus colonization of artificial surfaces. Infection and immunity. 2001;69(5):3423-6. Epub 2001/04/09. doi: 10.1128/iai.69.5.3423-3426.2001. PubMed PMID: 11292767; PubMed Central PMCID: PMC98303.
24. Decker R, Burdelski C, Zobiak M, Buttner H, Franke G, Christner M, et al. An 18 kDa scaffold protein is critical for Staphylococcus epidermidis biofilm formation. PLoS pathogens. 2015;11(3):e1004735. Epub 2015/03/24. doi: 10.1371/journal.ppat.1004735. PubMed PMID: 25799153; PubMed Central PMCID: PMC4370877.


25. Rohde H, Burdelski C, Bartscht K, Hussain M, Buck F, Horstkotte MA, et al. Induction of Staphylococcus epidermidis biofilm formation via proteolytic processing of the accumulation-associated protein by staphylococcal and host proteases. Molecular microbiology. 2005;55(6):1883-95. doi: 10.1111/j.1365-2958.2005.04515.x. PubMed PMID: 15752207.
26. Macintosh RL, Brittan JL, Bhattacharya R, Jenkinson HF, Derrick J, Upton M, et al. The terminal A domain of the fibrillar accumulation-associated protein (Aap) of Staphylococcus epidermidis mediates adhesion to human corneocytes. Journal of bacteriology. 2009;191(22):7007-16. Epub 2009/09/15. doi: 10.1128/jb.00764-09. PubMed PMID: 19749046; PubMed Central PMCID: PMC2772481.
27. Conlon BP, Geoghegan JA, Waters EM, McCarthy H, Rowe SE, Davies JR, et al. Role for the A domain of unprocessed accumulation-associated protein (Aap) in the attachment phase of the Staphylococcus epidermidis biofilm phenotype. Journal of bacteriology. 2014;196(24):4268-75. Epub 2014/10/01. doi: 10.1128/jb.01946-14. PubMed PMID: 25266380; PubMed Central PMCID: PMC4248850.
28. Schaeffer CR, Woods KM, Longo GM, Kiedrowski MR, Paharik AE, Buttner H, et al. Accumulation-associated protein enhances Staphylococcus epidermidis biofilm formation under dynamic conditions and is required for infection in a rat catheter model. Infection and immunity. 2015;83(1):214-26. Epub 2014/10/22. doi: 10.1128/iai.02177-14. PubMed PMID: 25332125; PubMed Central PMCID: PMC4288872.
29. Paharik AE, Kotasinska M, Both A, Hoang TN, Buttner H, Roy P, et al. The metalloprotease SepA governs processing of accumulation-associated protein and shapes intercellular adhesive surface properties in Staphylococcus epidermidis. Molecular microbiology. 2016. Epub 2016/12/21. doi: 10.1111/mmi.13594. PubMed PMID: 27997732.
30. Gerke C, Kraft A, Sussmuth R, Schweitzer O, Gotz F. Characterization of the N-acetylglucosaminyltransferase activity involved in the biosynthesis of the Staphylococcus epidermidis polysaccharide intercellular adhesin. The Journal of biological chemistry. 1998;273(29):18586-93. Epub 1998/07/11. PubMed PMID: 9660830.
31. Mack D, Nedelmann M, Krokotsch A, Schwarzkopf A, Heesemann J, Laufs R. Characterization of transposon mutants of biofilm-producing Staphylococcus epidermidis impaired in the accumulative phase of biofilm production: genetic identification of a hexosamine-containing polysaccharide intercellular adhesin. Infection and immunity. 1994;62(8):3244-53. Epub 1994/08/01. PubMed PMID: 8039894; PubMed Central PMCID: PMC302952.
32. Hussain M, Herrmann M, von Eiff C, Perdreau-Remington F, Peters G. A 140-kilodalton extracellular protein is essential for the accumulation of Staphylococcus epidermidis strains on surfaces. Infection and immunity. 1997;65(2):519-24. Epub 1997/02/01. PubMed PMID: 9009307; PubMed Central PMCID: PMC176090.
33. Rohde H, Burandt EC, Siemssen N, Frommelt L, Burdelski C, Wurster S, et al. Polysaccharide intercellular adhesin or protein factors in biofilm accumulation of Staphylococcus epidermidis and Staphylococcus aureus isolated from prosthetic hip and knee joint infections. Biomaterials. 2007;28(9):1711-20. doi: 10.1016/j.biomaterials.2006.11.046. PubMed PMID: 17187854.


34. Hennig S, Nyunt Wai S, Ziebuhr W. Spontaneous switch to PIA-independent biofilm formation in an ica-positive Staphylococcus epidermidis isolate. International journal of medical microbiology : IJMM. 2007;297(2):117-22. Epub 2007/02/13. doi: 10.1016/j.ijmm.2006.12.001. PubMed PMID: 17292669.
35. Vuong C, Kocianova S, Voyich JM, Yao Y, Fischer ER, DeLeo FR, et al. A crucial role for exopolysaccharide modification in bacterial biofilm formation, immune evasion, and virulence. The Journal of biological chemistry. 2004;279(52):54881-6. Epub 2004/10/27. doi: 10.1074/jbc.M411374200. PubMed PMID: 15501828.
36. Conrady DG, Brescia CC, Horii K, Weiss AA, Hassett DJ, Herr AB. A zinc-dependent adhesion module is responsible for intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2008;105(49):19456-61. doi: 10.1073/pnas.0807717105. PubMed PMID: 19047636; PubMed Central PMCID: PMC2592360.
37. Conrady DG, Wilson JJ, Herr AB. Structural basis for Zn2+-dependent intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2013;110(3):E202-11. Epub 2013/01/02. doi: 10.1073/pnas.1208134110. PubMed PMID: 23277549; PubMed Central PMCID: PMC3549106.
38. Vuong C, Gerke C, Somerville GA, Fischer ER, Otto M. Quorum-sensing control of biofilm factors in Staphylococcus epidermidis. The Journal of infectious diseases. 2003;188(5):706-18. Epub 2003/08/23. doi: 10.1086/377239. PubMed PMID: 12934187.
39. Vuong C, Kocianova S, Yao Y, Carmody AB, Otto M. Increased colonization of indwelling medical devices by quorum-sensing mutants of Staphylococcus epidermidis in vivo. The Journal of infectious diseases. 2004;190(8):1498-505. Epub 2004/09/21. doi: 10.1086/424487. PubMed PMID: 15378444.
40. Yarwood JM, Bartels DJ, Volper EM, Greenberg EP. Quorum sensing in Staphylococcus aureus biofilms. Journal of bacteriology. 2004;186(6):1838-50. Epub 2004/03/05. PubMed PMID: 14996815; PubMed Central PMCID: PMC355980.
41. Teufel P, Gotz F. Characterization of an extracellular metalloprotease with elastase activity from Staphylococcus epidermidis. Journal of bacteriology. 1993;175(13):4218-24. Epub 1993/07/01. PubMed PMID: 8320236; PubMed Central PMCID: PMC204852.
42. Dubin G, Chmiel D, Mak P, Rakwalska M, Rzychon M, Dubin A. Molecular cloning and biochemical characterisation of proteases from Staphylococcus epidermidis. Biological chemistry. 2001;382(11):1575-82. Epub 2002/01/05. doi: 10.1515/bc.2001.192. PubMed PMID: 11767947.
43. Ohara-Nemoto Y, Ikeda Y, Kobayashi M, Sasaki M, Tajika S, Kimura S. Characterization and molecular cloning of a glutamyl endopeptidase from Staphylococcus epidermidis. Microbial pathogenesis. 2002;33(1):33-41. Epub 2002/07/20. PubMed PMID: 12127798.
44. Kaplan JB, Ragunath C, Velliyagounder K, Fine DH, Ramasubbu N. Enzymatic detachment of Staphylococcus epidermidis biofilms. Antimicrobial agents and chemotherapy. 2004;48(7):2633-6. Epub 2004/06/25. doi: 10.1128/aac.48.7.2633-2636.2004. PubMed PMID: 15215120; PubMed Central PMCID: PMC434209.
45. Heilmann C, Schweitzer O, Gerke C, Vanittanakom N, Mack D, Gotz F. Molecular basis of intercellular adhesion in the biofilm-forming Staphylococcus epidermidis. Molecular microbiology. 1996;20(5):1083-91. Epub 1996/06/01. PubMed PMID: 8809760.
46. Cramton SE, Ulrich M, Gotz F, Doring G. Anaerobic conditions induce expression of polysaccharide intercellular adhesin in Staphylococcus aureus and Staphylococcus epidermidis. Infection and immunity. 2001;69(6):4079-85. Epub 2001/05/12. doi: 10.1128/iai.69.6.4079-4085.2001. PubMed PMID: 11349079; PubMed Central PMCID: PMC98472.
47. Rachid S, Ohlsen K, Witte W, Hacker J, Ziebuhr W. Effect of subinhibitory antibiotic concentrations on polysaccharide intercellular adhesin expression in biofilm-forming Staphylococcus epidermidis. Antimicrobial agents and chemotherapy. 2000;44(12):3357-63. Epub 2000/11/18. PubMed PMID: 11083640; PubMed Central PMCID: PMC90205.
48. Knobloch JK, Bartscht K, Sabottke A, Rohde H, Feucht HH, Mack D. Biofilm formation by Staphylococcus epidermidis depends on functional RsbU, an activator of the sigB operon: differential activation mechanisms due to ethanol and salt stress. Journal of bacteriology. 2001;183(8):2624-33. Epub 2001/03/29. doi: 10.1128/jb.183.8.2624-2633.2001. PubMed PMID: 11274123; PubMed Central PMCID: PMC95180.
49. Tormo MA, Marti M, Valle J, Manna AC, Cheung AL, Lasa I, et al. SarA is an essential positive regulator of Staphylococcus epidermidis biofilm development. Journal of bacteriology. 2005;187(7):2348-56. Epub 2005/03/19. doi: 10.1128/jb.187.7.2348-2356.2005. PubMed PMID: 15774878; PubMed Central PMCID: PMC1065223.
50. Xu L, Li H, Vuong C, Vadyvaloo V, Wang J, Yao Y, et al. Role of the luxS quorum-sensing system in biofilm formation and virulence of Staphylococcus epidermidis. Infection and immunity. 2006;74(1):488-96. Epub 2005/12/22. doi: 10.1128/iai.74.1.488-496.2006. PubMed PMID: 16369005; PubMed Central PMCID: PMC1346618.
51. Giormezis N, Kolonitsiou F, Foka A, Drougka E, Liakopoulos A, Makri A, et al. Coagulase-negative staphylococcal bloodstream and prosthetic-device-associated infections: the role of biofilm formation and distribution of adhesin and toxin genes. Journal of medical microbiology. 2014;63(Pt 11):1500-8. Epub 2014/08/02. doi: 10.1099/jmm.0.075259-0. PubMed PMID: 25082946.
52. Timmerman CP, Fleer A, Besnier JM, De Graaf L, Cremers F, Verhoef J. Characterization of a proteinaceous adhesin of Staphylococcus epidermidis which mediates attachment to polystyrene. Infection and immunity. 1991;59(11):4187-92. Epub 1991/11/01. PubMed PMID: 1682256; PubMed Central PMCID: PMC259015.
53. Pascual A, Fleer A, Westerdaal NA, Verhoef J. Modulation of adherence of coagulase-negative staphylococci to Teflon catheters in vitro. European journal of clinical microbiology. 1986;5(5):518-22. Epub 1986/10/01. PubMed PMID: 3096727.
54. Hogt AH, Dankert J, Hulstaert CE, Feijen J. Cell surface characteristics of coagulase-negative staphylococci and their adherence to fluorinated poly(ethylenepropylene). Infection and immunity. 1986;51(1):294-301. Epub 1986/01/01. PubMed PMID: 3940998; PubMed Central PMCID: PMC261101.
55. Schumacher-Perdreau F, Heilmann C, Peters G, Gotz F, Pulverer G. Comparative analysis of a biofilm-forming Staphylococcus epidermidis strain and its

46 adhesion-positive, accumulation-negative mutant M7. FEMS microbiology letters. 1994;117(1):71-8. Epub 1994/03/15. PubMed PMID: 8181711. 56. Shelton CL, Conrady DG, Herr AB. Functional consequences of B-repeat sequence variation in the staphylococcal biofilm protein Aap: deciphering the assembly code. Biochemical Journal. 2017;474(3):427-43. doi: 10.1042/bcj20160675. 57. Chaton CT, Herr AB. Defining the metal specificity of a multifunctional biofilm adhesion protein. Protein science : a publication of the Protein Society. 2017. Epub 2017/07/15. doi: 10.1002/pro.3232. PubMed PMID: 28707417. 58. Dill KA. Dominant forces in protein folding. . 1990;29(31):7133-55. doi: 10.1021/bi00483a001. 59. Kauzmann W. Some factors in the interpretation of protein denaturation. Advances in protein chemistry. 1959;14:1-63. Epub 1959/01/01. PubMed PMID: 14404936. 60. Tanford C. How protein chemists learned about the hydrophobic factor. Protein Science. 1997;6(6):1358-66. doi: doi:10.1002/pro.5560060627. 61. Pace CN, Hermans J. The Stability of Globular Protein. CRC Critical Reviews in Biochemistry. 1975;3(1):1-43. doi: 10.3109/10409237509102551. 62. Pace CN, Trevino S, Prabhakaran E, Scholtz JM. Protein structure, stability and solubility in water and other solvents. Philosophical transactions of the Royal Society of London Series B, Biological sciences. 2004;359(1448):1225-34; discussion 34-5. Epub 2004/08/13. doi: 10.1098/rstb.2004.1500. PubMed PMID: 15306378; PubMed Central PMCID: PMCPMC1693406. 63. Pace CN, Fu H, Fryar KL, Landua J, Trevino SR, Shirley BA, et al. Contribution of Hydrophobic Interactions to Protein Stability. Journal of molecular biology. 2011;408(3):514-28. doi: https://doi.org/10.1016/j.jmb.2011.02.053. 64. Voet D, Voet JG, Pratt CW. Fundamentals of biochemistry: life at the molecular level. 3rd ed. Hoboken, N.J: Wiley; 2008. 65. Wertz DH, Scheraga HA. Influence of water on protein structure. 
An analysis of the preferences of amino acid residues for the inside or outside and for specific conformations in a protein molecule. Macromolecules. 1978;11(1):9-15. Epub 1978/01/01. PubMed PMID: 621952. 66. Lim WA, Sauer RT. Alternative packing arrangements in the hydrophobic core of lambda repressor. Nature. 1989;339(6219):31-6. Epub 1989/05/04. doi: 10.1038/339031a0. PubMed PMID: 2524006. 67. Laue T. Charge matters. Biophysical reviews. 2016;8(4):287-9. Epub 2017/05/17. doi: 10.1007/s12551-016-0229-3. PubMed PMID: 28510017; PubMed Central PMCID: PMCPMC5425809. 68. Nozaki Y, Tanford C. The solubility of amino acids and related compounds in aqueous urea solutions. The Journal of biological chemistry. 1963;238:4074-81. Epub 1963/12/01. PubMed PMID: 14086747. 69. Tanford C. Isothermal Unfolding of Globular Proteins in Aqueous Urea Solutions. J Am Chem Soc. 1964;86(10):2050-9. 70. Cohn EJ, Edsall JT. Proteins, amino acids and peptides as ions and dipolar ions. New York U6 - ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF- 8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt %3Akev%3Amtx%3Abook&rft.genre=book&rft.title=Proteins%2C+amino+acids+and+pe

47 ptides+as+ions+and+dipolar+ions&rft.au=Cohn%2C+Edwin+Joseph&rft.au=Edsall%2C +John+T&rft.series=American+Chemical+Society.+Monograph+series&rft.date=1943&rf t.pub=Reinhold+Publishing+Corporation&rft.volume=no.+90&rft.externalDocID=b18383 415¶mdict=en-US U7 - Book: Reinhold Publishing Corporation; 1943. 71. Robinson DR, Jencks WP. Effect of denaturing agents of the urea-guanidinium class on the solubility of acetyltetraglycine ethyl ester and related compounds. The Journal of biological chemistry. 1963;238:1558-60. Epub 1963/04/01. PubMed PMID: 13974427. 72. Kajander T, Kahn PC, Passila SH, Cohen DC, Lehtiö L, Adolfsen W, et al. Buried Charged Surface in Proteins. Structure. 2000;8(11):1203-14. doi: 10.1016/S0969- 2126(00)00520-7. 73. Auton M, Rosgen J, Sinev M, Holthauzen LM, Bolen DW. Osmolyte effects on protein stability and solubility: a balancing act between backbone and side-chains. Biophysical chemistry. 2011;159(1):90-9. Epub 2011/06/21. doi: 10.1016/j.bpc.2011.05.012. PubMed PMID: 21683504; PubMed Central PMCID: PMCPmc3166983. 74. Canchi Deepak R, García Angel E. Backbone and Side-Chain Contributions in Protein Denaturation by Urea. Biophysical Journal. 2011;100(6):1526-33. doi: https://doi.org/10.1016/j.bpj.2011.01.028. 75. Moeser B, Horinek D. Unified Description of Urea Denaturation: Backbone and Side Chains Contribute Equally in the Transfer Model. The Journal of Physical Chemistry B. 2014;118(1):107-14. doi: 10.1021/jp409934q. 76. Das A, Mukhopadhyay C. Urea-mediated protein denaturation: a consensus view. The journal of physical chemistry B. 2009;113(38):12816-24. Epub 2009/08/28. doi: 10.1021/jp906350s. PubMed PMID: 19708649. 77. Bolen DW, Baskakov IV. The osmophobic effect: natural selection of a thermodynamic force in protein folding. Journal of molecular biology. 2001;310(5):955- 63. Epub 2001/08/15. doi: 10.1006/jmbi.2001.4819. PubMed PMID: 11502004. 78. Yancey PH, Clark ME, Hand SC, Bowlus RD, Somero GN. 
Living with water stress: evolution of osmolyte systems. Science (New York, NY). 1982;217(4566):1214- 22. Epub 1982/09/24. PubMed PMID: 7112124. 79. Schaub LJ, Campbell JC, Whitten ST. Thermal unfolding of the N-terminal region of p53 monitored by circular dichroism spectroscopy. Protein science : a publication of the Protein Society. 2012;21(11):1682-8. Epub 2012/08/24. doi: 10.1002/pro.2146. PubMed PMID: 22915551; PubMed Central PMCID: PMCPmc3527704. 80. Baskakov I, Bolen DW. Forcing thermodynamically unfolded proteins to fold. The Journal of biological chemistry. 1998;273(9):4831-4. Epub 1998/03/28. PubMed PMID: 9478922. 81. Nozaki Y, Tanford C. The solubility of amino acids and two glycine peptides in aqueous ethanol and dioxane solutions. Establishment of a hydrophobicity scale. The Journal of biological chemistry. 1971;246(7):2211-7. Epub 1971/04/10. PubMed PMID: 5555568. 82. Cammers-Goodwin A, Allen TJ, Oslick SL, McClure KF, Lee JH, Kemp DS. Mechanism of Stabilization of Helical Conformations of Polypeptides by Water Containing Trifluoroethanol. Journal of the American Chemical Society. 1996;118(13):3082-90. doi: 10.1021/ja952900z.

48

83. Privalov PL, Khechinashvili NN. A thermodynamic approach to the problem of stabilization of globular protein structure: a calorimetric study. Journal of molecular biology. 1974;86(3):665-84. Epub 1974/07/05. PubMed PMID: 4368360. 84. Myers JK, Pace CN. Hydrogen bonding stabilizes globular proteins. Biophysical Journal. 1996;71(4):2033-9. doi: https://doi.org/10.1016/S0006-3495(96)79401-8. 85. Pace CN, Horn G, Hebert EJ, Bechert J, Shaw K, Urbanikova L, et al. Tyrosine hydrogen bonds make a large contribution to protein stability. Journal of molecular biology. 2001;312(2):393-404. Epub 2001/09/14. doi: 10.1006/jmbi.2001.4956. PubMed PMID: 11554795. 86. Stickle DF, Presta LG, Dill KA, Rose GD. Hydrogen bonding in globular proteins. Journal of molecular biology. 1992;226(4):1143-59. Epub 1992/08/20. PubMed PMID: 1518048. 87. Fitzkee NC, Fleming PJ, Gong H, Panasik N, Street TO, Rose GD. Are proteins made from a limited parts list? Trends in Biochemical Sciences. 2005;30(2):73-80. doi: https://doi.org/10.1016/j.tibs.2004.12.005. 88. Bertagna AM, Barrick D. Nonspecific hydrophobic interactions stabilize an equilibrium intermediate of apomyoglobin at a key position within the AGH region. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(34):12514-9. Epub 2004/08/18. doi: 10.1073/pnas.0404760101. PubMed PMID: 15314218; PubMed Central PMCID: PMCPMC515089. 89. Matthews BW, Nicholson H, Becktel WJ. Enhanced protein thermostability from site-directed mutations that decrease the entropy of unfolding. Proceedings of the National Academy of Sciences of the United States of America. 1987;84(19):6663-7. Epub 1987/10/01. PubMed PMID: 3477797; PubMed Central PMCID: PMCPMC299143. 90. Nicholson H, Tronrud DE, Becktel WJ, Matthews BW. Analysis of the effectiveness of proline substitutions and glycine replacements in increasing the stability of phage T4 lysozyme. Biopolymers. 1992;32(11):1431-41. Epub 1992/11/01. doi: 10.1002/bip.360321103. 
PubMed PMID: 1457724. 91. Pace CN, Grimsley GR, Thomson JA, Barnett BJ. Conformational stability and activity of ribonuclease T1 with zero, one, and two intact disulfide bonds. The Journal of biological chemistry. 1988;263(24):11820-5. Epub 1988/08/25. PubMed PMID: 2457027. 92. Dobson CM. Protein folding and misfolding. Nature. 2003;426(6968):884-90. Epub 2003/12/20. doi: 10.1038/nature02261. PubMed PMID: 14685248. 93. Nilsson MR. Techniques to study amyloid fibril formation in vitro. Methods (San Diego, Calif). 2004;34(1):151-60. Epub 2004/07/31. doi: 10.1016/j.ymeth.2004.03.012. PubMed PMID: 15283924. 94. Vetri V, Canale C, Relini A, Librizzi F, Militello V, Gliozzi A, et al. Amyloid fibrils formation and amorphous aggregation in concanavalin A. Biophysical chemistry. 2007;125(1):184-90. doi: http://doi.org/10.1016/j.bpc.2006.07.012. 95. Chiti F, Dobson CM. Protein Misfolding, Amyloid Formation, and Human Disease: A Summary of Progress Over the Last Decade. Annual review of biochemistry. 2017;86:27-68. Epub 2017/05/13. doi: 10.1146/annurev-biochem-061516-045115. PubMed PMID: 28498720.

49

96. Zhao R, So M, Maat H, Ray NJ, Arisaka F, Goto Y, et al. Measurement of amyloid formation by turbidity assay-seeing through the cloud. Biophysical reviews. 2016;8(4):445-71. Epub 2016/12/23. doi: 10.1007/s12551-016-0233-7. PubMed PMID: 28003859; PubMed Central PMCID: PMCPMC5135725. 97. Gazit E. The "Correctly Folded" state of proteins: is it a metastable state? Angewandte Chemie (International ed in English). 2002;41(2):257-9. Epub 2002/12/20. PubMed PMID: 12491403. 98. Baldwin AJ, Knowles TP, Tartaglia GG, Fitzpatrick AW, Devlin GL, Shammas SL, et al. Metastability of native proteins and the phenomenon of amyloid formation. J Am Chem Soc. 2011;133(36):14160-3. Epub 2011/06/10. doi: 10.1021/ja2017703. PubMed PMID: 21650202. 99. Knowles TP, Fitzpatrick AW, Meehan S, Mott HR, Vendruscolo M, Dobson CM, et al. Role of intermolecular forces in defining material properties of protein nanofibrils. Science (New York, NY). 2007;318(5858):1900-3. Epub 2007/12/22. doi: 10.1126/science.1150057. PubMed PMID: 18096801. 100. Fandrich M, Dobson CM. The behaviour of polyamino acids reveals an inverse side chain effect in amyloid structure formation. The EMBO journal. 2002;21(21):5682- 90. Epub 2002/11/02. PubMed PMID: 12411486; PubMed Central PMCID: PMCPMC131070. 101. Nelson R, Sawaya MR, Balbirnie M, Madsen AO, Riekel C, Grothe R, et al. Structure of the cross-beta spine of amyloid-like fibrils. Nature. 2005;435(7043):773-8. doi: 10.1038/nature03680. PubMed PMID: 15944695; PubMed Central PMCID: PMC1479801. 102. Erskine E, MacPhee CE, Stanley-Wall NR. Functional Amyloid and Other Protein Fibers in the Biofilm Matrix. Journal of molecular biology. 2018;430(20):3642-56. doi: https://doi.org/10.1016/j.jmb.2018.07.026. 103. Williams AD, Portelius E, Kheterpal I, Guo J-t, Cook KD, Xu Y, et al. Mapping Aβ Amyloid Fibril Secondary Structure Using Scanning Proline Mutagenesis. Journal of molecular biology. 2004;335(3):833-42. doi: https://doi.org/10.1016/j.jmb.2003.11.008. 104. 
Sanchez de Groot N, Pallares I, Aviles FX, Vendrell J, Ventura S. Prediction of "hot spots" of aggregation in disease-linked polypeptides. BMC structural biology. 2005;5:18. Epub 2005/10/04. doi: 10.1186/1472-6807-5-18. PubMed PMID: 16197548; PubMed Central PMCID: PMCPMC1262731. 105. Csermely P. Chaperone overload is a possible contributor to 'civilization diseases'. Trends in genetics : TIG. 2001;17(12):701-4. Epub 2001/11/24. PubMed PMID: 11718923. 106. Deshmukh M, Evans ML, Chapman MR. Amyloid by Design: Intrinsic Regulation of Microbial Amyloid Assembly. Journal of molecular biology. 2018;430(20):3631-41. doi: https://doi.org/10.1016/j.jmb.2018.07.007. 107. Taglialegna A, Lasa I, Valle J. Amyloid Structures as Biofilm Matrix Scaffolds. Journal of bacteriology. 2016;198(19):2579-88. Epub 2016/05/18. doi: 10.1128/jb.00122-16. PubMed PMID: 27185827; PubMed Central PMCID: PMCPMC5019065. 108. Chapman MR, Robinson LS, Pinkner JS, Roth R, Heuser J, Hammar M, et al. Role of Escherichia coli curli operons in directing amyloid fiber formation. Science (New

50

York, NY). 2002;295(5556):851-5. Epub 2002/02/02. doi: 10.1126/science.1067484. PubMed PMID: 11823641; PubMed Central PMCID: PMCPmc2838482. 109. Romero D, Kolter R. Functional amyloids in bacteria. International microbiology : the official journal of the Spanish Society for Microbiology. 2014;17(2):65-73. Epub 2014/06/01. doi: 10.2436/20.1501.01.208. PubMed PMID: 26418850. 110. King CY, Tittmann P, Gross H, Gebert R, Aebi M, Wuthrich K. Prion-inducing domain 2-114 of yeast Sup35 protein transforms in vitro into amyloid-like filaments. Proceedings of the National Academy of Sciences of the United States of America. 1997;94(13):6618-22. Epub 1997/06/24. PubMed PMID: 9192614; PubMed Central PMCID: PMCPMC21207. 111. Fowler DM, Koulov AV, Alory-Jost C, Marks MS, Balch WE, Kelly JW. Functional amyloid formation within mammalian tissue. PLoS biology. 2006;4(1):e6. Epub 2005/11/23. doi: 10.1371/journal.pbio.0040006. PubMed PMID: 16300414; PubMed Central PMCID: PMCPMC1288039. 112. Dueholm MS, Larsen P, Finster K, Stenvang MR, Christiansen G, Vad BS, et al. The Tubular Sheaths Encasing Methanosaeta thermophila Filaments Are Functional Amyloids. The Journal of biological chemistry. 2015;290(33):20590-600. Epub 2015/06/26. doi: 10.1074/jbc.M115.654780. PubMed PMID: 26109065; PubMed Central PMCID: PMCPMC4536462. 113. Fowler DM, Koulov AV, Balch WE, Kelly JW. Functional amyloid--from bacteria to humans. Trends Biochem Sci. 2007;32(5):217-24. Epub 2007/04/07. doi: 10.1016/j.tibs.2007.03.003. PubMed PMID: 17412596. 114. Larsen P, Nielsen JL, Dueholm MS, Wetzel R, Otzen D, Nielsen PH. Amyloid adhesins are abundant in natural biofilms. Environmental microbiology. 2007;9(12):3077-90. Epub 2007/11/10. doi: 10.1111/j.1462-2920.2007.01418.x. PubMed PMID: 17991035. 115. Blanco LP, Evans ML, Smith DR, Badtke MP, Chapman MR. Diversity, biogenesis and function of microbial amyloids. Trends in microbiology. 2012;20(2):66- 73. Epub 2011/12/27. doi: 10.1016/j.tim.2011.11.005. 
PubMed PMID: 22197327; PubMed Central PMCID: PMCPMC3278576. 116. Schwartz K, Syed AK, Stephenson RE, Rickard AH, Boles BR. Functional amyloids composed of phenol soluble modulins stabilize Staphylococcus aureus biofilms. PLoS pathogens. 2012;8(6):e1002744. Epub 2012/06/12. doi: 10.1371/journal.ppat.1002744. PubMed PMID: 22685403; PubMed Central PMCID: PMCPMC3369951. 117. Zheng Y, Joo HS, Nair V, Le KY, Otto M. Do amyloid structures formed by Staphylococcus aureus phenol-soluble modulins have a biological function? International journal of medical microbiology : IJMM. 2017. Epub 2017/09/05. doi: 10.1016/j.ijmm.2017.08.010. PubMed PMID: 28867522. 118. Uversky VN, Gillespie JR, Fink AL. Why are "natively unfolded" proteins unstructured under physiologic conditions? Proteins. 2000;41(3):415-27. Epub 2000/10/12. PubMed PMID: 11025552. 119. Uversky VN. Intrinsically disordered proteins from A to Z. The international journal of biochemistry & cell biology. 2011;43(8):1090-103. Epub 2011/04/20. doi: 10.1016/j.biocel.2011.04.001. PubMed PMID: 21501695.

51

120. Oldfield CJ, Dunker AK. Intrinsically disordered proteins and intrinsically disordered protein regions. Annual review of biochemistry. 2014;83:553-84. Epub 2014/03/13. doi: 10.1146/annurev-biochem-072711-164947. PubMed PMID: 24606139. 121. Dunker AK, Lawson JD, Brown CJ, Williams RM, Romero P, Oh JS, et al. Intrinsically disordered protein. Journal of molecular graphics & modelling. 2001;19(1):26-59. Epub 2001/05/31. PubMed PMID: 11381529. 122. Das RK, Pappu RV. Conformations of intrinsically disordered proteins are influenced by linear sequence distributions of oppositely charged residues. Proceedings of the National Academy of Sciences of the United States of America. 2013;110(33):13392-7. Epub 2013/08/01. doi: 10.1073/pnas.1304749110. PubMed PMID: 23901099; PubMed Central PMCID: PMCPmc3746876. 123. Mao AH, Crick SL, Vitalis A, Chicoine CL, Pappu RV. Net charge per residue modulates conformational ensembles of intrinsically disordered proteins. Proceedings of the National Academy of Sciences of the United States of America. 2010;107(18):8183- 8. Epub 2010/04/21. doi: 10.1073/pnas.0911107107. PubMed PMID: 20404210; PubMed Central PMCID: PMCPmc2889596. 124. Tomasso ME, Tarver MJ, Devarajan D, Whitten ST. Hydrodynamic Radii of Intrinsically Disordered Proteins Determined from Experimental Polyproline II Propensities. PLoS computational biology. 2016;12(1):e1004686. Epub 2016/01/05. doi: 10.1371/journal.pcbi.1004686. PubMed PMID: 26727467. 125. English LR, Tilton EC, Ricard BJ, Whitten ST. Intrinsic alpha helix propensities compact hydrodynamic radii in intrinsically disordered proteins. Proteins. 2017;85(2):296-311. Epub 2016/12/10. doi: 10.1002/prot.25222. PubMed PMID: 27936491; PubMed Central PMCID: PMCPMC5258847. 126. Dyson HJ, Wright PE. Intrinsically unstructured proteins and their functions. Nature Reviews Molecular Cell Biology. 2005;6:197. doi: 10.1038/nrm1589. 127. Fischer E. Einfluss der Configuration auf die Wirkung der Enzyme. 
Berichte der deutschen chemischen Gesellschaft. 1894;27(3):2985-93. doi: 10.1002/cber.18940270364. 128. Mirsky AE, Pauling L. On the Structure of Native, Denatured, and Coagulated Proteins. Proceedings of the National Academy of Sciences of the United States of America. 1936;22(7):439-47. Epub 1936/07/01. PubMed PMID: 16577722; PubMed Central PMCID: PMCPMC1076802. 129. Uversky VN. Natively unfolded proteins: a point where biology waits for physics. Protein science : a publication of the Protein Society. 2002;11(4):739-56. Epub 2002/03/23. doi: 10.1110/ps.4210102. PubMed PMID: 11910019; PubMed Central PMCID: PMCPMC2373528. 130. Uversky VN. What does it mean to be natively unfolded? European journal of biochemistry / FEBS. 2002;269(1):2-12. Epub 2002/01/11. PubMed PMID: 11784292. 131. van der Lee R, Buljan M, Lang B, Weatheritt RJ, Daughdrill GW, Dunker AK, et al. Classification of intrinsically disordered regions and proteins. Chemical reviews. 2014;114(13):6589-631. Epub 2014/04/30. doi: 10.1021/cr400525m. PubMed PMID: 24773235; PubMed Central PMCID: PMCPmc4095912. 132. Uversky VN. Paradoxes and wonders of intrinsic disorder: Stability of instability. Intrinsically disordered proteins. 2017;5(1):e1327757. Epub 2018/09/27. doi:

52

10.1080/21690707.2017.1327757. PubMed PMID: 30250771; PubMed Central PMCID: PMCPMC6149434. 133. Yarawsky AE, English LR, Whitten ST, Herr AB. The Proline/Glycine-Rich Region of the Biofilm Adhesion Protein Aap Forms an Extended Stalk that Resists Compaction. Journal of molecular biology. 2017;429(2):261-79. Epub 2016/11/29. doi: 10.1016/j.jmb.2016.11.017. PubMed PMID: 27890783. 134. Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT. Prediction and Functional Analysis of Native Disorder in Proteins from the Three Kingdoms of Life. Journal of molecular biology. 2004;337(3):635-45. doi: https://doi.org/10.1016/j.jmb.2004.02.002. 135. Uversky VN, Oldfield CJ, Dunker AK. Intrinsically disordered proteins in human diseases: introducing the D2 concept. Annual review of biophysics. 2008;37:215-46. Epub 2008/06/25. doi: 10.1146/annurev.biophys.37.032807.125924. PubMed PMID: 18573080. 136. Metallo SJ. Intrinsically disordered proteins are potential drug targets. Current opinion in chemical biology. 2010;14(4):481-8. Epub 2010/07/06. doi: 10.1016/j.cbpa.2010.06.169. PubMed PMID: 20598937; PubMed Central PMCID: PMCPmc2918680. 137. Uversky VN. Intrinsically disordered proteins and novel strategies for drug discovery. Expert opinion on drug discovery. 2012;7(6):475-88. Epub 2012/05/09. doi: 10.1517/17460441.2012.686489. PubMed PMID: 22559227. 138. Corrigan RM, Rigby D, Handley P, Foster TJ. The role of Staphylococcus aureus surface protein SasG in adherence and biofilm formation. Microbiology (Reading, England). 2007;153(Pt 8):2435-46. Epub 2007/07/31. doi: 10.1099/mic.0.2007/006676- 0. PubMed PMID: 17660408.

53

Chapter II. The biofilm adhesion protein Aap from Staphylococcus epidermidis forms zinc-dependent amyloid fibers*

Authors: Alexander E. Yarawsky1,2¶, Stefanie L. Johns1¶, Peter Schuck3, Deborah G. Conrady1 and Andrew B. Herr2,4,5

¶These authors contributed equally to this work.

Affiliations: 1 - Graduate Program in Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA

2 - Division of Immunobiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, USA

3 - Dynamics of Macromolecular Assembly Section, Laboratory of Cellular Imaging and Bioengineering, National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, Bethesda, Maryland, USA

4 - Division of Infectious Diseases, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, USA

5 - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA

Author Contributions: A.E.Y. collected and analyzed data on Brpt5.5 and Brpt3.5-Cys, purified fibrils from biofilms for mass spectrometry, and performed biofilm assays and electron microscopy of biofilms. S.L.J. collected and analyzed (HMBP-)Brpt3.5 data and performed confocal microscopy. P.S. analyzed c(s,ff0) data. D.G.C. designed and cloned the His-MBP-Brpt3.5 construct. A.E.Y., S.L.J., and A.B.H. conceived experiments, directed the project, and wrote the manuscript.

Funding: NIH R01-GM094363 and U19-AI070235, and funds from the Cincinnati Children’s Hospital Research Foundation awarded to A.B.H. supported this work.

*Notice of Previous Publication: Contributions from S.L.J., P.S. and D.G.C. have been previously published in Chapters 4 and 5 of the Ph.D. Dissertation entitled "Mechanistic insights for protein-dependent biofilm formation in Staphylococcus epidermidis and beyond", by Stefanie L. Johns, 2011, University of Cincinnati. Major updates to the previously published text (primarily in the Results and Discussion sections) have been made in the current chapter by A.E.Y. and A.B.H.


Abstract

The skin-colonizing, commensal bacterium Staphylococcus epidermidis has emerged as a leading cause of hospital-acquired and device-related infections. The primary determinant for S. epidermidis pathogenesis is its ability to form biofilms, which are multi-layered, surface-adherent bacterial accumulations that show remarkable resistance to chemical and physical stresses. Accumulation-associated protein (Aap) from S. epidermidis and its S. aureus homolog SasG have been shown to be necessary and sufficient for mature biofilm formation. These proteins have a repetitive domain architecture, containing up to 17 tandem B-repeats; the presence of at least five tandem repeats in SasG has been shown to be critical for S. aureus biofilm formation. We previously demonstrated that Aap B-repeat constructs self-assemble in the presence of zinc to reversibly form twisted, rope-like filaments between staphylococcal cells in the biofilm. In this study, we demonstrate that longer Aap B-repeat constructs with three to five intact repeats form functional amyloid fibers in the presence of zinc. Fluorescence assays with amyloid-binding dyes, transmission electron microscopy, and confocal fluorescence microscopy experiments were used to analyze the time- and temperature-dependence of amyloid fiber formation. We have also utilized a recently developed analytical approach to resolve multiple amyloidogenic precursors using sedimentation velocity analytical ultracentrifugation. Furthermore, we have demonstrated the presence of amyloid fibers during both early and late stages of S. epidermidis biofilm formation using confocal microscopy and have confirmed that extracellular fibrils from biofilms primarily contain Aap. This work provides new insights into S. epidermidis biofilm formation and architecture that will potentially lead to new therapeutic treatments for persistent staphylococcal infections.


Author summary

S. epidermidis has recently emerged as a leading cause of hospital-acquired infections, bloodstream infections, and medical device-related infections. While the majority of S. epidermidis strains are antibiotic-resistant, the actual basis for the propensity of S. epidermidis to cause persistent infections is its ability to form a biofilm.

A biofilm is a multi-layered bacterial aggregation that is typically encased in a polysaccharide-rich extracellular matrix. Once a mature biofilm-related infection has been established within the body, treatment often requires surgical removal of the biofilm and long-term intravenous antibiotic therapy. Here, we have determined that the accumulation-associated protein, which is critical for the establishment of S. epidermidis biofilms, forms a network of stable protein fibers in the presence of zinc. We determined that these fibers are amyloid, a type of highly stable, ordered protein aggregate that is resistant to chemical denaturants and has been implicated in the pathogenesis of a wide range of diseases. We have demonstrated the use of a new analytical technique to study the early protein assembly events leading to mature amyloid fiber formation that can potentially be applied to other amyloid-forming protein systems. Our study provides important insights into S. epidermidis biofilm formation that will potentially provide novel therapeutic treatments.


Introduction

Staphylococcus epidermidis is a critical component of the normal human flora that helps control the colonization and invasion of potentially dangerous microbial pathogens. However, S. epidermidis has emerged as a leading opportunistic pathogen due to its high prevalence on epithelial surfaces and ability to colonize prosthetic medical devices[1]. S. epidermidis specifically is the leading cause of nosocomial infections and device-related infections[2, 3], and is, along with its other coagulase-negative relatives, the leading cause of bacteremia[4, 5]. While S. epidermidis infections are typically non-aggressive, they are extremely resistant to antibiotic therapy[6].

Therefore, staphylococcal infections often require invasive treatment methods and frequently lead to chronic morbidity, mortality, and high healthcare costs[5, 7, 8]. S. epidermidis pathogenesis and chronic persistence are primarily associated with its ability to form a biofilm[9, 10], a multi-layered bacterial aggregation surrounded by an extracellular matrix[11]. Biofilm formation occurs in three general stages: primary adherence to a surface by individual staphylococcal cells; accumulation into multicellular colonies through intercellular adhesion events; and formation of a mature biofilm through cycles of remodeling that create characteristic cellular towers separated by channels that allow access to nutrients. The biofilm is typically surrounded by a secreted extracellular matrix composed of macromolecular components that serve to anchor the staphylococcal cells together, including the polysaccharide poly-N-acetylglucosamine (PNAG) as well as proteins, extracellular DNA, or teichoic acid, depending on the strain and growth conditions[12-14]. Biofilm formation can occur through both polysaccharide-dependent and -independent pathways, the latter mediated by protein-protein interactions[15]. The protein Aap (accumulation-associated protein) is primarily responsible for protein-dependent intercellular accumulation during S. epidermidis biofilm formation[16], and is also required for the polysaccharide-dependent mechanism[17]. Rohde et al and Corrigan et al described a protein-based mechanism for staphylococcal biofilm formation that is independent of PNAG; in several strains of S. epidermidis or S. aureus, the protein Aap or its ortholog SasG can mediate biofilm formation in the absence of polysaccharide secretion[18, 19]. Indeed, 39% of biofilm-positive clinical isolates are PNAG-negative while 90% are positive for Aap[15, 20, 21].

Recent work by Schaeffer et al[22] demonstrated the critical role of Aap in S. epidermidis infection in vivo. Under fluid shear, Aap-deficient strains formed significantly less biofilm than those expressing normal levels of Aap. A rat catheter model was then used to evaluate the biological implication of these results, which revealed that Aap, but not PNAG, was required for infection[22].

Aap is a multi-domain protein consisting of an N-terminal export signal followed by the A-repeat region (eleven partially-conserved 16-residue repeats), a lectin domain, the B-repeat region containing five to seventeen conserved repeats of a 128-amino acid sequence, a proline/glycine-rich (P/G-rich) region that forms a highly extended stalk that is resistant to compaction[23], and an LPXTG cell wall anchor motif, as shown in Fig 1A.

The N-terminal portion of Aap containing the A-repeat region and lectin domain (collectively called the A-domain) can be proteolytically cleaved to expose the B-repeat region, which can then initiate bacterial accumulation into microcolonies[18]. The staphylococcal metalloprotease SepA is responsible for this proteolytic cleavage event, switching the role of Aap from A-domain-mediated surface adhesion to its namesake role in biofilm accumulation[24]. Each B-repeat (Brpt) contains a 78-amino acid G5 domain and a 50-amino acid spacer domain (also called an E domain); the B-repeat sequences in Aap are highly conserved, with 83-100% sequence identity[25]. The final repeat in the B-repeat region consists of a single G5 domain without the spacer motif (Fig 1A); this C-terminal half-repeat “cap” plays a role in stabilizing the protein[25].

We have previously demonstrated that a single B-repeat domain with the half-repeat cap (Brpt1.5) will self-associate in the presence of Zn2+ to form an anti-parallel dimer, leading to a model for Zn2+-mediated protein-dependent intercellular accumulation between staphylococci in a nascent biofilm[25, 26]. Subsequent work demonstrated that Brpt1.5 and longer B-repeat constructs can self-assemble in the presence of Zn2+ or Cu2+, while other metals (Mn2+, Co2+, and Ni2+) can bind to B-repeats but do not induce assembly[27]. Recent work has shown similar Zn2+-dependent self-association behavior for the B-repeat region of SasG, the S. aureus ortholog of Aap[28, 29].

Full-length Aap contains 5-17 of these nearly identical B-repeats; Corrigan et al have demonstrated that at least five tandem B-repeats were required for S. aureus biofilm formation, suggesting that the biological function of SasG and, presumably, Aap relies on longer stretches of at least 5 consecutive B-repeats [19]. Therefore, the focus of this study is to characterize a biologically relevant construct of Aap consisting of five consecutive B-repeats and the C-terminal cap (called Brpt5.5; see Fig 1A), and to determine the role of tandem B-repeats in S. epidermidis biofilm formation. In the presence of Zn2+, tandem B-repeats assembled into a range of oligomeric states, including highly elongated fibers. We show that Zn2+-induced B-repeat fibers are functional amyloid fibers that assemble in a temperature- and time-dependent fashion.

Importantly, fibers formed in vitro by recombinant, tandem B-repeats show at least partial resistance to Zn2+ chelation and acidification, suggesting that these highly stable amyloid fibers may play a critical part in rendering the biofilm resistant to physical or chemical stresses. We apply a newly developed analytical approach to deconvolute the early-stage assembly of fibril intermediates using analytical ultracentrifugation. Finally, we show that amyloid fibers form during early and late stages of S. epidermidis biofilm growth, and we demonstrate that fibrils isolated from S. epidermidis biofilms are primarily composed of Aap. This is the first report demonstrating that Aap, an essential protein for S. epidermidis infectivity, forms amyloid fibrils; these findings could potentially lead to new therapeutic approaches to relieve the extensive morbidity and mortality caused by S. epidermidis infections.

Results

Solution characterization of tandem B-repeats from Aap

To determine the functional relationship between tandem copies of the Aap B-repeat region and S. epidermidis biofilm formation, we generated a construct containing the C-terminal five intact B-repeats, along with the C-terminal half-repeat (called Brpt5.5) (Fig 1A). The design of this construct was based on the minimum number of B-repeats previously shown to support biofilm formation by SasG, an Aap ortholog from S. aureus[19]. In addition, we utilized a Brpt3.5 construct, both in isolation and as an N-terminally His6-tagged maltose binding protein (MBP) fusion protein (called HMBP-Brpt3.5). The HMBP-Brpt3.5 construct showed more efficient expression and was more stable in solution, so it was predominantly used for initial experiments.

Far-UV circular dichroism (CD) was used to verify proper folding of the B-repeats in these constructs (Fig 1B). Based on our previous studies of related Brpt1.5 constructs[25, 26] and the high sequence identity between B-repeats (Fig S1)[25, 30], it was expected that the Brpt3.5 and Brpt5.5 constructs would have similar secondary structure content. Indeed, these constructs all contain high β-strand and coil content.

The increased negativity around 200 nm observed for the Brpt3.5 and Brpt5.5 constructs may be due to greater random coil contribution from the increased proportion of B-repeat spacer motifs compared to G5 domains in Brpt3.5 and Brpt5.5 (i.e. Brpt5.5 contains 5 spacer motifs vs 6 G5 domains (~45% spacer), whereas Brpt1.5 contains 1 spacer motif vs 2 G5 domains (~33% spacer) (see Fig 1A)). The CD spectrum for uncleaved HMBP-Brpt3.5 (Fig S2) revealed a combination of α-helical, β-strand, and coil secondary structure, consistent with the α-helical content of MBP[31].

Fig 1. Characterization of tandem B-repeats from S. epidermidis Aap. (A) Full length Aap domain organization including: the A-repeat region and putative lectin domain that are proteolytically cleaved (dashed line), the B-repeat region (5 – 17 B-repeats) ending with the conserved half-repeat cap, the proline/glycine-rich region (P/G-rich), and the cell wall anchor motif (LPXTG). The domain boundaries of the Brpt5.5, Brpt3.5, and Brpt1.5 constructs are shown underneath the cartoon. Each full B-repeat contains a G5 domain and a spacer region, while the half-repeat cap contains only the G5 domain. (B) Far-UV circular dichroism spectra demonstrating similar secondary structure characteristics of the previously characterized Brpt1.5 construct (green circle) and cleaved Brpt3.5 (blue triangle) and Brpt5.5 (red square). The far-UV CD spectrum of Brpt1.5 is adapted from[25], Copyright 2008 National Academy of Sciences, U.S.A. (C) Sedimentation coefficient distribution plot c(s) of HMBP-Brpt3.5 (dashed blue line), cleaved Brpt3.5 (solid blue line), and Brpt5.5 (solid red line). All constructs sedimented as monomers in the absence of Zn2+.

Sedimentation velocity analytical ultracentrifugation (AUC) was used to characterize each construct in solution under native conditions (Fig 1C). Brpt3.5 and Brpt5.5 sedimented as monomers with very high frictional ratios (2.58 and 3.55, respectively). Although these constructs have similar sedimentation coefficients, the difference in frictional ratios is an important indication of the increased mass of Brpt5.5 compared to Brpt3.5. Such high frictional ratios are indicative of highly elongated global conformations, such as what might be expected based on our prior characterization of Brpt1.5[25, 26]. Furthermore, such an extended conformation for tandem B-repeats, in conjunction with the extended Pro/Gly-rich stalk[23], makes Aap well-suited to project itself out away from the S. epidermidis surface in order to more easily interact with adjacent cells and surfaces and avoid steric hindrance from other cell wall-anchored proteins. The HMBP-Brpt3.5 construct also sedimented as a monomer, but had a significantly increased sedimentation coefficient due to the additional 42 kDa mass contributed by the more globular HMBP fusion tag (Fig 1C).
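
The similar sedimentation coefficients of Brpt3.5 and Brpt5.5, despite their different masses, can be rationalized with the Svedberg relation, under which s scales as M^(2/3)/(f/f0) when the partial specific volume and solvent are fixed. The short Python sketch below illustrates this scaling only; it is not the analysis used in this work. The molar masses (roughly 50 and 75 kDa) are assumptions inferred from the mg/ml and µM values quoted in the Fig 4 legend, the two constructs are assumed to share the same partial specific volume, and the function name is purely illustrative.

```python
def relative_s(mass_kda, frictional_ratio):
    """Sedimentation coefficient up to a constant factor.

    Svedberg relation: s = M(1 - vbar*rho) / (N_A * f), with
    f = (f/f0) * 6*pi*eta*R0 and R0 proportional to M**(1/3).
    With vbar, rho and eta held fixed, s is proportional to
    M**(2/3) / (f/f0).
    """
    return mass_kda ** (2.0 / 3.0) / frictional_ratio


# Assumed molar masses (kDa), inferred from concentrations in the Fig 4 legend.
MASS_BRPT35 = 50.0
MASS_BRPT55 = 75.0

# Frictional ratios reported in the text for the two monomers.
s_brpt35 = relative_s(MASS_BRPT35, 2.58)
s_brpt55 = relative_s(MASS_BRPT55, 3.55)

ratio = s_brpt55 / s_brpt35
print(f"s(Brpt5.5) / s(Brpt3.5) = {ratio:.2f}")
```

Under these assumptions the ratio is close to 1: the larger mass of Brpt5.5 is almost exactly offset by its higher frictional ratio, which is why the frictional ratio, rather than the sedimentation coefficient, distinguishes the two constructs.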

Tandem B-repeats assemble into multiple higher-order species in the presence of Zn2+

We previously reported that shorter Aap B-repeat constructs, Brpt1.5 and Brpt2.5, specifically self-associate to form dimers in the presence of Zn2+[25-27, 30]. These data indicated that tandem B-repeats self-associate in a modular fashion, with each B-repeat capable of forming an adhesive contact with another B-repeat in the presence of approximately two Zn2+ ions, and that longer B-repeat constructs can dimerize at lower free Zn2+ concentrations[25, 32]. This phenomenon is known as the chelate effect, in which the binding of Zn2+ and resulting self-association at the first site in each of two multi-site protomers reduces the entropic penalty for neighboring sites to self-assemble in the presence of Zn2+. Thus, the effective free Zn2+ concentration required for assembly of the entire multi-repeat protein becomes progressively lower as the number of repeats increases. Prior biophysical work on B-repeat constructs has focused on shorter constructs that were suitable for detailed thermodynamic analyses; these short constructs were limited to monomer-dimer equilibria[25-27, 30]. For the longer Brpt3.5 and Brpt5.5 constructs described here, self-association in the presence of Zn2+ could be significantly more complicated than a simple monomer-dimer equilibrium, due to the potential for multiple adhesive interactions within a stretch of three to five intact B-repeats. Indeed, we observed a dramatic change in the sedimentation coefficient distribution for HMBP-Brpt3.5 in the presence of Zn2+ compared to monomeric apo-HMBP-Brpt3.5 (Figs. 2A, 2B), consistent with the formation of a wide range of very large oligomeric states (Fig. 2C). In addition to the reaction boundaries visible in the 0-40 s* range, the increasing trend at 40 s* indicates the presence of even larger aggregated species. This aggregation behavior is quite distinct from previously observed oligomerization of B-repeat constructs in the presence of Zn2+. Although the HMBP fusion tag alone is able to dimerize in the presence of Zn2+ (data not shown), the formation of such enormous aggregates of HMBP-Brpt3.5 in the presence of Zn2+ is unexpected and suggests that a distinct mode of assembly or aggregation is occurring.

Fig 2. Sedimentation behavior of tandem B-repeats in the presence of Zn2+. (A) Raw sedimentation velocity data for 5 µM HMBP-Brpt3.5 in the absence of Zn2+ and (B) in the presence of 3 mM ZnCl2, both at 36,000 rpm and 20 °C. In both panels, scans 1-100 (5 hours elapsed time) were loaded, with every 7th scan plotted. Note the dramatic increase in spacing between scans in panel B compared to panel A; the much faster-moving sedimentation boundaries of HMBP-Brpt3.5 in the presence of Zn2+ compared with HMBP-Brpt3.5 alone indicate the sedimentation of very large species. (C) Sedimentation coefficient distribution of HMBP-Brpt3.5 alone (grey line) and HMBP-Brpt3.5 in the presence of 3 mM ZnCl2 (blue line). The broad peak distribution in the presence of Zn2+ indicates that HMBP-Brpt3.5 sediments as a mixture of assembled oligomers; however, resolution of individual species is difficult due to overlapping sedimentation profiles. The inset of (C) shows the same data analyzed by Wide Distribution Analysis (WDA) in SEDANAL; note that the x-axis is on the natural log scale, while the y-scale has been normalized by area. Here, the sample containing 3 mM ZnCl2 shows some material around the 5-10 s* range, with the majority of material sedimenting between 10-20 s*, in good agreement with the c(s) distribution calculated by SEDFIT.

2D size-and-shape sedimentation analysis indicates formation of fiber-like species

A recently developed AUC analysis method was used to better resolve the multiple sedimentation boundaries observed for HMBP-Brpt3.5 in the presence of Zn2+ and to provide additional information on the size and shape of the sedimenting species. We utilized the 2D size-and-shape c(s,ff0) analysis in SEDFIT that characterizes each sedimenting species in terms of both sedimentation coefficient and frictional ratio[33]. This approach is particularly useful when analyzing co-sedimenting species that differ greatly in shape and therefore experience very different degrees of drag[34]. For these experiments, we used samples of 5 µM HMBP-Brpt3.5 alone or in the presence of 500 µM or 1 mM ZnCl2. The samples were analyzed at both 25 °C and 37 °C. In order to capture the earliest assembly events, the samples were incubated after addition of Zn2+ only for the time required to pull vacuum (approximately 30 minutes). As expected, the HMBP-Brpt3.5 alone sample yielded a single predominant species corresponding to a highly elongated monomer, as seen in Fig 3A. Both samples incubated with Zn2+ formed similar oligomeric states, but the higher oligomeric species were more heavily populated for the sample incubated with 1 mM ZnCl2, allowing for easier resolution of the species involved (25 °C data not shown). The c(s,ff0) analysis yields a 3-dimensional plot (Fig 3A, B) that separates the species based on sedimentation coefficient on the x-axis, frictional ratio (f/f0; the frictional coefficient of the sedimenting species compared to that of an ideal sphere of identical volume) on the y-axis, and peak amplitude on the z-axis.

Fig 3. AUC c(s,ff0) analysis of early-stage HMBP-Brpt3.5 amyloidogenic intermediates. Sedimentation velocity data (36,000 rpm at 37 °C) were analyzed using the c(s,ff0) analysis model in SEDFIT. (A) 3-D shape and size distribution plot for HMBP-Brpt3.5. Sedimenting species are distinguished based on sedimentation coefficient (plotted uncorrected for buffer conditions) along the x-axis and frictional ratio (f/f0) along the y-axis. Increasing values of f/f0 correspond to more highly elongated or non-globular species. The heat map indicates species concentration, from lowest population density (blue) to highest (red). HMBP-Brpt3.5 alone sediments as a single dominant species of 4.53 S with an elongated frictional ratio (f/f0 = 2.3). (B) 3-D shape and size distribution plot for HMBP-Brpt3.5 in the presence of 1 mM Zn2+. In the presence of Zn2+, there is a broad distribution of species that vary both in sedimentation coefficient values as well as frictional ratios. Note in particular the series of extremely elongated species (f/f0 values of approximately 4 or higher) highlighted by the magenta oval. (C) To illustrate the putative species present, the three-dimensional plot from (B) has been simplified to a two-dimensional distribution of the sedimentation coefficient (x-axis) and frictional ratio (y-axis), labeled with the putative species present as implied by the given pairs of s- and f/f0-values. Elongated species with f/f0 values of approximately 3 or greater are highlighted by ovals and putative species labels in magenta. Compact species with f/f0 values between 1 and 2.5 are delineated by dark red lines along with putative species labels under the distribution plot. The solid black line depicts a 2-dimensional representation of c(s,*), showing the relative total amount of material at any sedimentation coefficient.

We also show a top-down view of the 3D distribution plot (Fig 3C) superimposed on the standard sedimentation coefficient distribution that has been labeled with the approximate oligomeric states, estimated by SEDFIT based on the sedimentation coefficient and apparent frictional ratio values. The data show that HMBP-Brpt3.5 incubated with 1 mM ZnCl2 at 37 °C forms multiple species including: monomeric HMBP-Brpt3.5, several mostly compact oligomeric species (dimer, trimer, and tetramer or pentamer), followed by large oligomers of ever-increasing degrees of elongation. In the presence of Zn2+, peaks for both elongated and compact monomer species are observed (f/f0 values of 3.1 and 1.2, respectively), in contrast to the HMBP-Brpt3.5 alone data that shows only elongated monomer (Fig 3A). The dimer and trimer species are mostly compact, with frictional ratios comparable to those of globular proteins (between 1.2 and 1.4). The putative pentamer species is moderately elongated, with a frictional ratio of 1.7, but all higher oligomers are highly elongated. The putative 9-mer and 14-mer species show frictional ratios of 2.2 or 2.5, respectively, while the 15-mer through 65-mer peaks (circled in magenta ovals in Fig 3C) gave f/f0 values between 3.9 and 4.7. (As a control, the analysis of the same sedimentation data with an upper limit of 2.5 for f/f0 values resulted in a worse fit to the measured sedimentation boundaries.) These f/f0 values of 3.9 or greater indicate extremely elongated fiber-like morphologies, with approximate axial ratios that range from 72-108 (assuming prolate ellipsoids), suggesting that these species may be nascent amyloid fibers. The HMBP-Brpt3.5 sample incubated with 1 mM ZnCl2 analyzed at 25 °C showed a very similar distribution of oligomeric species, although the slower sedimentation rates (due to lower temperature) allowed resolution of a few higher-order oligomeric species, such as putative 75-mer and 115-mer fibers (data not shown).
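
The link between a frictional ratio and an axial ratio for a prolate ellipsoid can be sketched with the Perrin shape factor. The Python example below is a minimal illustration and is not the calculation performed by SEDFIT. It assumes that the measured f/f0 is the product of a hydration term and the Perrin factor, and it assumes typical protein values for hydration (0.3 g water per g protein), partial specific volume (0.73 ml/g), and a water specific volume of 1 ml/g; none of these values is stated in the text. All function names are illustrative.

```python
import math


def perrin_prolate(axial_ratio):
    """Perrin shape factor for a prolate ellipsoid with axial ratio p = a/b >= 1."""
    p = float(axial_ratio)
    if p < 1.0:
        raise ValueError("axial ratio must be >= 1 for a prolate ellipsoid")
    if p == 1.0:
        return 1.0  # a sphere has no shape contribution
    root = math.sqrt(p * p - 1.0)
    return root / (p ** (1.0 / 3.0) * math.log(p + root))


def hydration_factor(delta=0.3, vbar=0.73):
    """Contribution of bound water to f/f0 (delta in g/g, vbar in ml/g; assumed values)."""
    return (1.0 + delta / vbar) ** (1.0 / 3.0)


def axial_ratio_from_ff0(ff0, delta=0.3, vbar=0.73):
    """Solve ff0 = hydration_factor * perrin_prolate(p) for p by bisection."""
    target = ff0 / hydration_factor(delta, vbar)
    if target < 1.0:
        raise ValueError("f/f0 is too small for the assumed hydration")
    lo, hi = 1.0, 1.0e4  # perrin_prolate increases monotonically with p
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if perrin_prolate(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Frictional ratios bracketing the elongated oligomers described in the text.
for ff0 in (3.9, 4.7):
    print(f"f/f0 = {ff0}: axial ratio ~ {axial_ratio_from_ff0(ff0):.0f}")
```

With these assumed hydration parameters, the sketch gives axial ratios of roughly the same magnitude as the 72-108 range quoted above; different hydration or partial specific volume choices would shift the values.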

Tandem B-repeats form amyloid fibers in the presence of Zn2+

To characterize the nature of these extremely large, fiber-like B-repeat species, samples of HMBP-Brpt3.5 incubated with Zn2+ for 2, 4, or 7 days were visualized by transmission electron microscopy (TEM) (Fig 4A). Negative-stained TEM images revealed large assemblies of protein fibers. One possibility raised by the TEM images is that HMBP-Brpt3.5 can form amyloid fibers in the presence of Zn2+. Amyloid fibers are highly stable protein fibers that have a characteristic β-strand-based fibril architecture[35]. Formation of amyloid fibers was initially linked to protein misfolding implicated in disease states; however, a number of “functional” amyloid proteins required for specific cell processes have recently been characterized in organisms ranging from archaea[36] and bacteria to humans[37]. Functional amyloid proteins from bacterial species, such as Escherichia coli, Bacillus subtilis, Enterobacteriaceae, and Pseudomonas spp., have recently been implicated in biofilm formation; specifically, these functional amyloid fibers are integral to the overall stability of the biofilm[38-41].

To confirm that these HMBP-Brpt3.5 protein fibers were amyloid, samples were incubated with Thioflavin T (ThT) and characterized by fluorescence spectroscopy.

Thioflavin T is a small-molecule fluorophore that can intercalate within amyloid fibers, which restricts rotation about an internal bond and results in a dramatic increase in quantum yield and characteristic fluorescence emission at 482 nm[42, 43].

Fig 4. Amyloid properties of tandem B-repeat constructs in the presence of Zn2+. (A) Negative-stained TEM image of HMBP-Brpt3.5 protein fibers generated 7 days post-incubation with 500 µM ZnCl2. (B) Fluorescence emission spectra of 10 µM Thioflavin T (ThT) alone (black) or ThT in the presence of: HMBP-Brpt3.5 with 500 µM ZnCl2 (red), HMBP-Brpt3.5 alone (dark blue), and HMBP-Brpt3.5 with 10% formic acid (FA, cyan) after excitation at 432 nm. HMBP-Brpt3.5 without ThT (green) did not produce fluorescence. The HMBP-Brpt3.5 alone and HMBP-Brpt3.5 + Zn2+ samples were pre-treated with 10% FA to remove aggregates, followed by dialysis back into working buffer and addition of ZnCl2. (C) ThT fluorescence of cleaved Brpt3.5 tagged with Cys added to the C-terminus (Brpt3.5-Cys). At 1 mM ZnCl2, there is negligible fluorescence of incubated samples of 0.5 mg/ml (10 µM) Brpt3.5-Cys under reducing (M, monomer) and oxidizing (D, dimer) conditions. However, at 5 mg/ml (100 µM) and 10 mg/ml (200 µM) Brpt3.5-Cys, strong ThT fluorescence indicates a high propensity for amyloid fibers in the presence of Zn2+. This clearly demonstrates the dependence of amyloid formation on local B-repeat concentration. (D) TEM image of native, cleaved Brpt5.5 (20 µM, 1.5 mg/ml) incubated at 37 °C with 5 mM ZnCl2 showing similar fiber morphology as HMBP-Brpt3.5 in panel A. (E) Absorbance and fluorescence of Brpt5.5 (20 µM, 1.5 mg/ml) incubated with different ZnCl2 concentrations. At 5 and 10 mM ZnCl2, there is an increased fluorescence of ThT and Proteostat dyes, indicating amyloid-like aggregates in the native, untagged Brpt5.5 construct. (F) Far-UV CD spectrum of Brpt5.5 (6.5 µM, 0.5 mg/ml) with or without Zn2+, at 20 °C or 40 °C. Only in the presence of Zn2+ and at higher temperatures does the CD spectrum indicate the rich β-sheet structure observed for amyloid-like aggregates. (G) Brpt5.5 samples were tested for recognition by the anti-amyloid antibody OC using a dot blot assay. Antibody binding was observed for samples incubated at temperatures and Zn2+ concentrations conducive to the CD spectral change to a minimum near 225 nm. Natively folded Brpt5.5 (without Zn2+) also showed binding, but this was lost when Brpt5.5 was unfolded.
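
The paired mass and molar concentrations quoted in the Fig 4 legend imply an approximate molar mass for each construct, which is a convenient consistency check when comparing conditions. The Python sketch below performs only this unit conversion; the resulting molar masses are back-calculated from the rounded legend values and are therefore approximations rather than reported masses. The function name and dictionary labels are illustrative.

```python
def implied_molar_mass_kda(mg_per_ml, micromolar):
    """Molar mass (kDa) implied by a paired mass and molar concentration."""
    grams_per_litre = mg_per_ml  # 1 mg/ml is numerically equal to 1 g/L
    mol_per_litre = micromolar * 1e-6
    return grams_per_litre / mol_per_litre / 1000.0


# (mg/ml, µM) pairs as quoted in the Fig 4 legend.
legend_pairs = {
    "Brpt3.5-Cys, panel C (low)": (0.5, 10.0),
    "Brpt3.5-Cys, panel C (mid)": (5.0, 100.0),
    "Brpt3.5-Cys, panel C (high)": (10.0, 200.0),
    "Brpt5.5, panels D and E": (1.5, 20.0),
    "Brpt5.5, panel F": (0.5, 6.5),
}

for label, (mg_ml, um) in legend_pairs.items():
    print(f"{label}: ~{implied_molar_mass_kda(mg_ml, um):.0f} kDa")
```

The Brpt3.5-Cys pairs all imply about 50 kDa and the Brpt5.5 pairs imply about 75 to 77 kDa, so the concentrations in the legend are mutually consistent within rounding.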

To ensure that the ThT fluorescence produced was truly from Zn2+-dependent amyloid fibers, HMBP-Brpt3.5 was first pretreated with 10% formic acid (FA) to depolymerize any fibers that may have spontaneously formed[44]. As seen with other amyloid proteins, HMBP-Brpt3.5/Zn2+ produced minimal ThT fluorescence in the presence of FA (Fig 4B). This effect was not due to FA-induced structural changes in the HMBP-Brpt3.5 monomer, since the secondary structure of the protein was unchanged upon addition of FA (Fig S2). Upon removal of FA and incubation for 24 hours, HMBP-Brpt3.5/Zn2+ protein fibers demonstrated strong ThT fluorescence (Fig 4B). However, after removal and incubation, HMBP-Brpt3.5 alone also produced a low level of ThT fluorescence, suggesting that some fibers could potentially form in the absence of Zn2+. Based on quantification of monomeric HMBP-Brpt3.5 in the Zn2+-free AUC data, the proportion of protein that forms fibers in the absence of Zn2+ is 3% or less (Figs 1 and 2). A separate construct was additionally tested to better replicate the natural setting of Aap tethered in dense tufts to the cell wall of S. epidermidis. This construct was Brpt3.5 (cleaved from HMBP) with a C-terminal Cys residue added (Brpt3.5-Cys). Under non-reducing conditions, a disulfide bond would link two Brpt3.5 molecules in a parallel fashion, similar to their orientation on the cell surface, while raising the local B-repeat concentration – a well-known factor in amyloidogenesis. In Fig 4C, there is not only an overall protein concentration threshold required for amyloidogenesis, but there is also a clear dependence on local B-repeat concentration, with disulfide-linked samples (“D” for dimer) yielding higher ThT fluorescence than reduced samples (“M” for monomer).

To ensure that amyloid formation was not due to the presence of the His6-MBP tag (or the C-terminal Cys tag), native Brpt5.5 (tag-free) was incubated with Zn2+, then visualized by TEM. Brpt5.5 formed fibers (Fig 4D) primarily of the “branched” morphology seen with HMBP-Brpt3.5 (Fig 4A). This morphology resembles one often observed with light chain amyloid[45], but that is also observable with Aβ peptide[46]. Analyzing these Brpt5.5 fibers spectroscopically (Fig 4E) revealed that incubation with amyloid-binding dyes led to increases in both ThT and Proteostat fluorescence[47]. These samples also showed an increase in Congo Red absorbance at 540 nm[48] (data not shown). Because amyloid fibers have characteristic structures rich in β-strand, we examined the far-UV CD spectrum under conditions which promote these amyloid-like aggregates. It should be noted that based on X-ray crystallography and NMR studies[26, 30, 49], along with CD data presented here and elsewhere[25, 30], natively folded B-repeats contain β-sheet and random coil content. Importantly, however, Fig 4F demonstrates that when Brpt5.5 is incubated with 5 mM Zn2+ at near-physiological temperature (40 °C), there is a significant change in the CD spectrum, resulting in a strong, broad minimum near 225 nm. This is similar to CD spectra observed with insulin fibrils[50] and glucagon fibrils[51]. Together, these spectroscopic results and TEM observations are consistent with amyloid-like fibril formation by native, tandem B-repeats.

A number of antibodies have been raised against amyloid-forming peptides or proteins. Kayed et al[52] reported an antibody (OC antibody) which could specifically recognize Aβ42 fibers, but not monomers or oligomers. Further, it appeared to recognize generic amyloid fiber conformation(s), irrespective of protein sequence, as it was also able to detect amyloid fibers from IAPP and α-synuclein[52]. Because OC antibody specifically recognizes fibers and appears to be independent of protein sequence, we chose this antibody to test against Brpt5.5 amyloid fibers (Fig 4G). By dot blot assays, OC antibody recognized Brpt5.5 that had been incubated with Zn2+ at temperatures which resulted in the significant change in the CD signal in Fig 4F. Interestingly, OC antibody recognized native, monomeric Brpt5.5 as well. This suggests that conformations present in the amyloid fiber are also present in the natively folded protein. Incubating Brpt5.5 in the absence of Zn2+ and at a temperature high enough to unfold the protein resulted in loss of OC antibody binding. Collectively, Brpt5.5 and HMBP-Brpt3.5 both share features characteristic of amyloid fibers.

B-repeat fiber assembly is time- and temperature-dependent

Metal ions, namely Zn2+ and Cu2+, play complex roles in the aggregation and amyloidogenesis of amyloid-β peptide important in Alzheimer’s disease[53-56]. However, functional amyloid-forming proteins from bacteria assemble into amyloid fibers without a known requirement for zinc ions or other triggering molecules. Rather, these amyloid proteins form as a result of the coordinated action of accessory proteins expressed within operons, including nucleator proteins and chaperones to prevent aberrant polymerization in the cytoplasm[38, 41, 57, 58], or amyloidogenesis can require proteolytic activity to release the amyloidogenic regions of the protein, such as with Bap[59]. Thus, the mechanism of Zn2+-dependent amyloid formation by the B-repeat region of Aap is of significant interest for comparison to other systems. We have therefore used a combination of HPLC, TEM, confocal microscopy, and AUC to analyze the assembly of Aap B-repeat amyloid formation as a function of time, temperature, and solution conditions.

Fig 5. HPLC and turbidity assays to monitor time and temperature dependence of amyloidogenesis. (A) C4 reverse-phase HPLC elution profiles of 10 µM HMBP-Brpt3.5 alone (black) or incubated with 500 µM ZnCl2 for 1 (red), 4 (blue), or 30 (green) days at 37 °C. Samples were centrifuged at 13,000 rpm for 1 hour prior to HPLC separation to remove insoluble aggregates. (B) Samples of 10 µM HMBP-Brpt3.5 with 500 µM ZnCl2 were incubated for 4 days at 37 °C prior to addition of Tris-buffered saline (red trace); sufficient dilute HCl to lower the pH to 5 (blue trace); or 2 mM DTPA (green trace). Separation by HPLC revealed that oligomer/soluble fiber species were maintained in the presence of both acid (blue) and DTPA (green) when compared to the buffer-treated control samples (red) at both temperatures. TEM images of these samples are in Supporting Information (S3 Fig). (C) Samples of 5 µM Brpt5.5 incubated at 37 °C and followed by absorbance at 400 nm, showing turbidity in the presence of ZnCl2 with maximal turbidity reached by 5 days. Data points are shown as symbols connected by straight lines for clarity. The lines do not represent fits.

To assess the effect of temperature on the rate and morphology of amyloid fiber formation, we used an HPLC quantification assay previously shown to effectively differentiate between monomer and oligomer or soluble fiber species[60]. A relative time course for HMBP-Brpt3.5 fiber formation was established by incubating 10 µM HMBP-Brpt3.5 with 500 µM ZnCl2 for 0, 2, and 6 hours and 1, 4 and 30 days at both 20 °C (Fig S3) and 37 °C (Fig 5A). Samples were first centrifuged for one hour at 13,000 rpm to remove any insoluble aggregates prior to separation by a C4 HPLC column; therefore, the elution profile only reports on monomer, oligomer, and/or soluble amyloid fiber species. The HPLC elution profile of HMBP-Brpt3.5 alone (Fig 5A, black trace) gives a peak eluting at 28.5 ml, which corresponds to a single monomeric species, as shown by its sedimentation coefficient distribution (Fig 1C). Upon addition of Zn2+ to HMBP-Brpt3.5, samples gradually showed a shift in the distribution of peaks toward higher elution volumes that represent higher-order oligomer or fiber species. The elution profiles at 2 and 6 hours were unchanged from time point zero (data not shown), indicating slow assembly kinetics. As time progressed, the monomer peak decreased and shifted to the right as the oligomer peak increased. The transition from monomer to putative fiber species was more pronounced after 1, 4, or 30 days for samples incubated at 37 °C (Fig 5A) compared to the samples incubated at 20 °C (Fig S3), indicating accelerated HMBP-Brpt3.5 self-assembly at higher temperature.

We verified that our observations were not an artifact of the His6-MBP tag, using an orthogonal approach. Because the HPLC assay is limited to only oligomers and soluble fiber species, we used a turbidity assay which follows light scattering by large assemblies and aggregates, including amyloid fibers[61]. With the Brpt5.5 construct, we observed significant increases in the turbidity of the samples containing Zn2+ when incubated at 37 °C (Fig 5C), but no visible change in turbidity when incubated at 20 °C under the same Zn2+ concentrations (data not shown). These data demonstrate the Zn2+-, time-, and temperature-dependence of B-repeat fiber assembly using native, untagged B-repeats.

B-repeat fibers are resistant to acid and chelator treatment

Our previous work on Brpt1.5 revealed that this shorter construct formed a reversible Zn2+-dependent dimer that was sensitive to removal of Zn2+ (via the chelator DTPA) or to even a modest decrease in pH (from 7.4 to 6.0)[25]. To assess the stability of mature amyloid fibers, we allowed HMBP-Brpt3.5 fibers to form in the presence of Zn2+ over a period of 24 hours or 4 days at 20 °C or 37 °C. We then treated the samples by addition of a sufficient volume of dilute HCl to lower the pH to 5, addition of 2 mM DTPA, or addition of buffer as a control. The samples were incubated for 2 hours before loading onto a C4 reverse-phase HPLC column, without spinning out the fibrous aggregates. The elution profile for the 20 °C sample at 4 days resembled the 24-hour samples (Fig S3) in terms of the monomer vs oligomer distribution and showed evidence of limited remodeling upon incubation with DTPA or HCl. In contrast, the elution profile for the 37 °C sample (Fig 5B) showed that the oligomer/fiber peak predominated, and that it was more resistant to the action of DTPA or HCl. TEM was performed on each sample to confirm that fiber structure was maintained (Fig S4). Furthermore, although some conformational rearrangement of the fibers may occur at lower temperatures and early time points[62], mature fibers at 37 °C are highly resistant to these environmental conditions.

The initial Zn2+-dependent assembly of untagged Brpt5.5 was reversible with addition of DTPA or HCl, as shown by sedimentation velocity AUC data in Fig S5. These results coincide with the reversible assembly of Brpt1.5[25], suggesting tandem B-repeats still undergo an initial phase of reversible assembly. While Brpt5.5 aggregates from Fig 5C showed sensitivity to DTPA, turbidity was not completely abolished after the addition of DTPA (data not shown). This result suggests that while some higher-order oligomers or non-amyloid aggregates may have been reversed or solubilized by DTPA, there is a fraction of material that remained in an aggregate form, as indicated by the remaining turbidity. It is possible the DTPA-sensitive material had not yet developed into mature amyloid fibers, or that a higher local concentration of the tandem B-repeats (e.g., as seen with HMBP-Brpt3.5 or with native cell wall-anchored Aap) is required for complete DTPA resistance. DTPA-resistance is an important feature of Aap functional amyloid, in that we previously reported the addition of Zn2+ chelator could prevent biofilm formation, but not disrupt mature biofilms[25], a phenomenon likely caused by these newly observed amyloid fibers.

Amyloid fibers are structural components in S. epidermidis biofilms

While it has been well established that Aap is critical for S. epidermidis biofilm formation[15, 16, 22, 25], the mechanism by which Aap promotes stable, long-term intercellular adhesion is not well understood. Others have shown that bundles of Aap fibrils extend outward from the cell wall on planktonic S. epidermidis cells[63, 64], but these were presumably free-standing proteins attached to the cell wall rather than amyloid fibrils. We sought to explore the significance of our findings in the context of S. epidermidis biofilms. S. epidermidis strain RP62A biofilms were grown in tryptic soy broth (TSB) without additional Zn2+ (Fig 6A) or with additional Zn2+ (Fig 6B) and visualized by TEM. Large networks of extracellular fibrils were observed around clusters of cells in the biofilm. This morphology closely resembles the fiber morphology observed for native Brpt5.5 when examined at the same magnification (see Fig 6G for Brpt5.5 incubated at 37 °C with 5 mM ZnCl2, compared to Fig 6H for RP62A biofilm + 20 µM ZnCl2). These extracellular fibers were observed in biofilms grown in TSB with or without addition of Zn2+, although the presence of Zn2+ increased the prevalence of the fibers. TSB contains Zn2+, which allows for Aap-based, Zn2+-dependent accumulation and biofilm formation[25, 26, 65]. Addition of the Zn2+ chelator, DTPA, at the start of growth inhibits biofilm formation[25]; by TEM, these S. epidermidis cells showed no evidence of the extracellular fibrous networks seen in the mature biofilms (Fig 6C). The addition of DTPA at the start of growth would prevent the initial reversible self-assembly of Aap B-repeat regions (Fig S5), presumably preventing subsequent nucleation of amyloid. However, once a mature biofilm is formed, DTPA can be added with no resulting effect[25] (Fig 6D), similar to the observation that adding DTPA to mature amyloid fibers formed in vitro does not result in depolymerization (Fig 5B). The highly resistant nature of these amyloid fibers would contribute to the strength and stability of the mature biofilm, as shown previously for other systems, including TasA from B. subtilis[40]. In Fig 4B, we demonstrated that B-repeat fibers formed in vitro are sensitive to formic acid (FA).


Fig 6. Amyloid fibers composed of Aap are important structural components in S. epidermidis biofilms. S. epidermidis biofilms were examined by TEM when grown on dialysis membrane on (A) TSA, (B) + 20 µM ZnCl2, (C) 100 µM DTPA added at the start of growth (t = 0 hr) or (D) 100 µM DTPA at the end of growth and incubated for 1 hour (t = 24 hr), (E) 50% FA (t = 24 hr), (F) pH lowered to pH 5 (t = 24 hr). This series of TEM images displays the resistance of fibers to DTPA and acidification in the setting of a mature biofilm. (G) and (H) compare the fibers observed by Brpt5.5 incubated at 37 °C with 5 mM ZnCl2 (from Fig 3) (G) and RP62A biofilms + 20 µM ZnCl2 (H) at similar magnification demonstrating similar morphologies. (I) examines the ability of S. epidermidis to form biofilms under various conditions. Biofilm formation can be inhibited by addition of DTPA (t = 0 hr), but not after biofilm formation has already occurred (t = 24 hr), while addition of FA can significantly disrupt mature biofilms. Acidification to pH 5, like adding DTPA at t = 24 hr, had no significant effect compared to RP62A without treatment. The * symbol denotes statistical difference (P < 0.05) to RP62A biofilm formation without treatment, as determined by a two-tailed Student’s t test. (J) Bacteria from biofilms shown in (B) were digested with lysostaphin to remove cell wall-anchored proteins. The supernatant from this mixture was then examined by SDS-PAGE, which revealed a large amount of material unable to migrate beyond the stacking gel. This material (red rectangle) was identified primarily as Aap by nanoLC-MS/MS.


Based on this result, Aap fibers in biofilms should also show sensitivity to FA. Indeed, addition of FA to mature biofilms demonstrated the ability to disrupt the majority of the extracellular fibers in the biofilm (Fig 6E). Furthermore, we examined the effect of pH on biofilm stability. As with DTPA, we previously observed that mildly acidic pH disrupts the reversible assembly of B-repeat proteins[25, 32], but low pH has no effect on pre-formed B-repeat amyloid fibrils (Fig 5B). Consistent with these in vitro results, we observed that lowering the pH of an established biofilm to 5.0 had no effect on the extracellular amyloid fiber network (Fig 6F).

A biofilm formation assay (Fig 6I) was performed to complement the TEM observations and evaluate the role of Aap functional amyloid fibers in biofilm formation in a more quantitative way. We once again observed the ability of DTPA to inhibit amyloid formation, but only when present before biofilm formation; when DTPA or pH 5.0 buffer was added after biofilm maturation (t = 24 hr), neither was able to disrupt the biofilm, likely due to the resistance of the amyloid fibers (Fig 6D, F, I). Furthermore, addition of FA to mature biofilm was able to significantly disrupt the biofilm, as expected due to its ability to depolymerize the functional amyloid (Fig 6E). The correlation between depolymerization of amyloid fibrils and weakening of the biofilm structure is a key observation which supports the idea of functional amyloid fibrils contributing strength and stability to the biofilms.

S. epidermidis amyloid fibers are composed of processed Aap

As a final characterization of the extracellular amyloid fibers we observed in S. epidermidis biofilms, we determined the composition of these fibers by a combination of SDS-PAGE and mass spectrometry (MS). Biofilms were collected as they were for TEM characterization (Fig 6B). The biofilm mixtures were then centrifuged to separate the bacteria from extracellular material and media. After removing the supernatant, we used lysostaphin to digest the polyglycine crosslinks of the cell wall peptidoglycan. The soluble proteins released from the cell wall were examined by SDS-PAGE. We observed SDS-insoluble aggregate trapped in the well and stacking gel (Fig 6J). This aggregated material contained primarily Aap when examined by nanoLC-MS/MS after in-gel tryptic digestion (Fig S6, Table S1). In addition to Aap, peptides from several cytoplasmic proteins were identified, but with sparse coverage, suggesting these proteins are present in small quantities and are likely irrelevant to the amyloid fibers.

Peptides were observed from the B-repeats and the lectin region of Aap, with one peptide including Leu601, which is one of two SepA cleavage sites (the other being Leu335)[24], suggesting the lectin domain is still attached to Aap. The Aap A-repeats also contain tryptic sites, so the fact that no A-repeat peptides were detected by MS/MS is highly suggestive that SepA cleavage occurred at Leu335, downstream of the A-repeat region. This observation is in agreement with Rohde et al., who demonstrated that proteolytic processing of the N-terminus of Aap is required for biofilm formation[18], and with Paharik et al., who observed SepA-processed Aap molecules that retained the lectin domain, but not the A-repeat region, in the context of biofilms[24]. This suggests the lectin need not be removed for biofilm formation (accumulation and amyloidogenesis specifically) to occur, but that it is the A-repeat region which inhibits biofilm formation.


Amyloid fibers form early in biofilm formation and correlate with DTPA resistance of biofilms

To further explore the importance of Aap amyloidogenesis in the context of developing biofilms, we followed biofilm formation by S. epidermidis strain RP62A at distinct time points throughout the first 24 hours of biofilm formation. The addition of ThT to the media during the initial inoculation of the bacteria allowed us to follow amyloid formation by CFM (Fig 7A). In parallel, biofilms were stained with LIVE/DEAD fluorescent dye to ensure ThT was not affecting cell growth or biofilm formation (Fig 7B). As early as 2 hours post-inoculation, punctate ThT fluorescence was visible on planktonic cells. This suggests that fibers start to form to a limited degree prior to cellular accumulation or biofilm assembly (Fig 7B). At 6 hours, the formation of microcolonies, and therefore intercellular accumulation, begins to occur. At this stage, ThT fluorescence is located at the boundaries between associating cells, which is consistent with the role of Aap as the critical factor for intercellular adhesion. ThT fluorescence increases throughout the biofilm over time, with a predominance of fluorescence in the core of the biofilm with the highest cellular densities.

To understand the implications of the observed ThT fluorescence in the developing biofilm, we tested the ability of DTPA to inhibit biofilm formation when added at distinct time points along a similar time frame. Addition of DTPA during the first hour of incubation was able to prevent biofilm formation from occurring, whereas by 2 hours, DTPA was significantly less effective. The DTPA resistance of the biofilm correlates with the time frame of amyloid formation in growing biofilms (Fig 7A). To ensure that the lack of biofilm formation at early time points was not due to mixing of the cultures upon adding DTPA, a duplicate experiment (Fig 7D) was performed where DTPA was not added, but wells were mixed in the same way as in Fig 7C. Each of the conditions was still able to grow very strong biofilms, similar to the untreated control (“RP62A”), indicating that the mixing does not significantly affect biofilm formation. Therefore, these results support our hypothesis that Aap is important both for intercellular accumulation, by the formation of amyloid fibers between bacteria, and for stabilizing mature biofilms due to the remarkable resistance of amyloid fibers to physical and chemical insults.


Fig 7. The formation of amyloid fibers is well correlated to DTPA resistance in S. epidermidis biofilms. (A) Biofilm formation by S. epidermidis was characterized 2, 6, 12, 18, and 24 hours post-inoculation by confocal microscopy, staining with 10 µM ThT (cyan) over brightfield. In the 2-hour panel, a zoomed inset is shown with increased contrast and brightness to highlight punctate ThT fluorescence visible around planktonic S. epidermidis cells. ThT fluorescence is seen primarily at cell-cell junctions at the 6-hour time point. Mature biofilms (24 hours) show ThT throughout the biofilm as a major structural component. (B) S. epidermidis biofilms at 2, 6, 12, 18, and 24 hours post-inoculation were stained in parallel with LIVE/DEAD stain (green/red) as a control (scale bar = 10 µm). Images were generated in Zen 2009 Light Edition. (C) Biofilm formation assays show DTPA resistance develops within 2 – 6 hours, with full DTPA resistance reached within 24 hrs. (D) demonstrates these observations are in fact due to addition of DTPA, not mixing. The * symbol denotes statistical difference (P < 0.05) to DTPA added at t = 0 hr, while the # indicates no significant difference to RP62A without treatment, as determined by a two-tailed Student’s t test.


Discussion

Our previous work established that the B-repeat region of Aap can undergo Zn2+-mediated self-association to form protein-based ‘ropes’ between staphylococcal cells in the biofilm and that Zn2+ chelation was able to inhibit biofilm formation by both S. epidermidis and S. aureus[25-27, 30]. This initial work was carried out with the short B-repeat constructs Brpt1.5 and Brpt2.5, which showed reversible self-association that could be inhibited upon addition of chelator or moderate reduction in pH. Given that staphylococcal biofilms typically undergo acidification over time, it was unclear how this Zn2+-mediated self-assembly mechanism could be maintained within the biofilm in vivo.

The results presented here using a more biologically relevant construct (Brpt5.5) and a Brpt3.5 fusion protein (HMBP-Brpt3.5) demonstrate that the B-repeat region of Aap is capable of two distinct Zn2+-dependent assembly processes, forming both reversible oligomers and functional amyloid fibers within biofilms. While the formation of amyloid fibers in biofilms has been established in several bacterial species, the mechanism of Aap amyloid fiber assembly displays some unique features among this group of bacterial biofilm proteins. Unlike the amyloidogenic biofilm proteins curli (E. coli and Salmonella spp.)[57, 66], TasA (B. subtilis)[40, 58], or FapC (Pseudomonas spp.)[41], which require additional chaperones or initiator proteins for amyloid fiber assembly, our data suggest that Aap utilizes Zn2+ as a catalyst to drive amyloid fiber formation. This mechanism for metal-dependent amyloid nucleation is instead reminiscent of several mammalian amyloid-forming proteins including amyloid-β[54, 67, 68], prions[69, 70], and β2-microglobulin[71, 72]. Our proposed mechanism of Aap amyloid assembly has some similarities to that reported for S. aureus Bap[59]. The N-terminal region of this cell wall-anchored protein is cleaved and released into the local environment, where it can self-assemble into amyloid-like structures in the presence of low Ca2+ concentrations or at acidic pH. Bap, like Aap, can therefore act as both a sensor and a scaffold protein[59]. Aap requires cleavage of the N-terminal A-repeat region by staphylococcal proteases or human proteases in order to support biofilm formation[18, 24]. With the N-terminal region of Aap removed, the B-repeat region is unmasked and allowed to facilitate intercellular contact and amyloidogenesis in the presence of Zn2+.

As previously mentioned, the S. aureus cell wall-associated protein, Bap (Biofilm associated protein) is capable of forming amyloid fibers in biofilms and seems to be critical for infection in a mouse catheter model[59]. Bap has a similar domain arrangement and function to SasG and Aap. However, the bap gene has not been found in S. aureus or S. epidermidis human clinical isolates[59]. A Bap homologue, Bhp, is present in human isolates of S. epidermidis[73], but no experimental evidence exists regarding its role in biofilm formation or ability to form amyloid fibers, with the exception of a study by Lembre et al, who showed a six-residue peptide from Bhp was able to form amyloid fibers in vitro[74]. The bhp gene was found in less than half of S. epidermidis strains isolated from prosthetic knee joint and hip infections, while aap was found in 89% of isolates[15].

A recently identified S. epidermidis protein, Sbp, was shown to form amyloid fibers both in vitro and when expressed in intracellular inclusions in E. coli[75]. Wang, et al.[75] also showed that an sbp knockout of S. epidermidis strain 1457 did not show Thioflavin S (ThS) fluorescence by confocal microscopy, nor did it form a biofilm, an observation consistent with the initial study by Decker, et al.[76]. The wild-type strain 1457 did show biofilm formation, as expected, along with ThS fluorescence indicative of amyloid-like fibers. However, the authors did not identify Sbp as a component of purified amyloid fibers derived from biofilms, so there is uncertainty as to whether Sbp directly forms amyloid fibers in the biofilm, or if Sbp plays an indirect role in the nucleation of Aap amyloid fibers. If the latter case were true, then genetic knockout of either aap or sbp would abrogate biofilms as well as eliminate ThT fluorescence. Until there are well-defined mutations identified in either Aap or Sbp that specifically eliminate interactions, it will be difficult to establish the exact role played by Sbp. For this reason, we utilized mass spectrometry to confirm that the fibers we observed were indeed composed of Aap.

The molecular characteristics of diverse types of amyloid fiber architecture have been well described in multiple studies using a combination of structural and biophysical techniques[77-85]. AUC approaches have also been used to characterize mature amyloid fibers in solution[86, 87]. However, there have been few studies about the assembly of early oligomers and proto-fibril intermediates that produce mature amyloid fibers[88-90], particularly in terms of resolving discrete intermediate species. For example, pre-amyloid oligomerization of transthyretin has been characterized by AUC but without resolution of individual species[91]. We have applied the c(s,ff0) analysis to sedimentation velocity AUC data to resolve the broad array of species present at early stages of Aap amyloidogenesis, including the oligomeric complexes and nascent fibers. This is one of the first uses of this 2D size-and-shape analysis approach to an amyloid-forming protein, and its ability to separate many species with differing sizes and shapes in solution holds promise for resolving assembly intermediates in other amyloid systems as well. A similar approach, based on the van Holde-Weischet method[92] and implemented in Ultrascan[93] (http://www.ultrascan.uthscsa.edu), which is designed to resolve size and shape information from sedimentation velocity data, has been applied to mature amyloid fibers formed by Abeta peptides[94, 95], but here we examined amyloidogenesis of a much larger protein. The c(s,ff0) analysis model is capable of distinguishing species of different size and shape that are sedimenting at the same rate[34], which can occur in amyloid assembly systems due to the complicated nature of the assembly process and the number of species to be resolved. This analysis works particularly well in systems such as this one with slow kinetics, which allows resolution of discrete assembly intermediates rather than simply showing broad reaction boundaries representing multiple species in rapid exchange. The c(s,ff0) analysis of Aap presented here clearly discriminates between compact oligomers and fibers that show similar sedimentation coefficients due to the increased drag and resulting slower sedimentation of the highly elongated fibers. This analytical approach could prove useful in providing mechanistic details for how amyloid nucleation occurs by this or other proteins.

Biofilm formation is the primary characteristic responsible for pathogenicity and it contributes to antibiotic resistance in chronic infections caused by S. epidermidis. One of the biggest challenges with recurrent infections caused by biofilms is their highly adhesive and cohesive nature and their resistance to chemical and physical insults. Our data indicate that intercellular amyloid fibers appear early during the accumulation phase of nascent biofilms, and they continue to increase until they are ubiquitous throughout the mature biofilm. Using highly purified tandem B-repeats in solution, we show that mature amyloid fibers are highly resistant to chemical stresses such as Zn2+ chelation or acidic pH, suggesting that the amyloid fibers forming the intercellular network are one reason S. epidermidis biofilms are so resistant to harsh environmental conditions. We have previously shown[25] that addition of Zn2+ chelators to staphylococcal cultures prevents biofilm formation. However, addition of these same chelators to pre-formed mature staphylococcal biofilms does not appreciably disperse the biofilm, which is now explained by the presence of resistant amyloid fibers between cells. Furthermore, we predict that homologous surface proteins containing tandem B-repeats in S. aureus (i.e., SasG and Pls) and other gram-positive bacteria will form similar Zn2+-dependent amyloid fibers between cells in biofilms. The data presented here will provide new insights for both potential prevention and treatment of chronic staphylococcal infections, particularly with regard to methods that could depolymerize amyloid fibers and thus destabilize the biofilm.


Methods

Bacterial Strains and Media

S. epidermidis strain RP62A (ATCC 35984) was purchased directly from ATCC as a glycerol stock and was cultured in tryptic soy broth (TSB).

Expression Construct Generation

The Brpt3.5 construct (amino acids 1761-2223) of Aap (NCBI AAW53239.1) was PCR amplified from RP62A genomic DNA and inserted into the expression vector pHisMBP-DEST (kindly provided by Dr. Artem Evdokimov) using Gateway technology, which adds an N-terminal hexahistidine-maltose binding protein (His-MBP) fusion to the Brpt3.5 construct, with an intervening tobacco etch virus protease site. The Brpt5.5 construct (amino acids 1505-2223) of Aap (NCBI AAW53239.1) was synthesized by LifeTechnologies GeneArt®, also containing an intervening tobacco etch virus protease site for removal of the His-MBP tag. The plasmids were then transformed into the E. coli expression cell line BLR(DE3) (Novagen). The Brpt3.5-Cys mutant was synthesized by LifeTechnologies GeneArt® as well, containing amino acids 1761-2223 of Aap (NCBI AAW53239.1), followed by a single cysteine residue inserted before the stop codon.

Protein Expression and Purification

One-liter cultures were inoculated with 10 ml of His-MBP-Brpt3.5/BLR(DE3) culture at an OD600 of 0.6-0.8, and then allowed to incubate overnight with shaking at 37 °C. Protein expression was then induced using 250 µM IPTG for 6 hours at 25 °C. The cells were then harvested, re-suspended, frozen and thawed prior to lysis by French press. The cell lysate was centrifuged, and the soluble fraction was decanted onto a nickel-NTA gravity column. The Brpt3.5 protein was eluted by imidazole step gradient. Fractions confirmed by SDS gel to contain Brpt3.5 were then further purified by anion exchange using a 5 ml anionQ fast-flow column (GE Healthcare) followed by size-exclusion chromatography using a Superdex 200 column (GE Healthcare). The presence of the His-MBP fusion tag improved solubility and behavior in solution and was therefore left attached to enhance protein stability, except for initial CD and AUC experiments (Fig 1).

The Brpt5.5 construct was induced overnight at 20 °C with 200 µM IPTG, lysed by sonication, then run over a homemade 50 ml NiNTA column containing 50 ml of IMAC Sepharose Fast Flow resin (GE Healthcare) charged with Ni2+ and housed inside an XK 16/40 column. After elution by an imidazole linear gradient, the fractions containing Brpt5.5 were dialyzed into 20 mM Tris pH 7.4, 300 mM NaCl and cleaved by TEV for at least 6 hours under reducing conditions. Another 50 ml NiNTA column was run, this time collecting the flow-through (cleaved Brpt5.5). Finally, a Superdex 200 (GE Healthcare) column was run to separate remaining contaminants and truncations of Brpt5.5.

Circular Dichroism

Brpt3.5 and HMBP-Brpt3.5 samples were dialyzed into 10 mM Tris pH 7.4 and 150 mM NaF. Far-UV CD spectra were obtained using an Aviv 215 spectrometer. The concentration of cleaved Brpt3.5 (fusion tag removed) was determined using the molar extinction coefficient of 10,430 M-1 cm-1, as calculated using the online server ProtParam (web.expasy.org/protparam), and the concentration of HMBP-Brpt3.5 was determined using the molar extinction coefficient of 78,270 M-1 cm-1 calculated by ProtParam. An additional sample of HMBP-Brpt3.5 was treated with 10% formic acid (FA) to depolymerize amyloid fibers, followed by dialysis of the sample into 10 mM Tris pH 7.4 and 150 mM NaF. The concentration of Brpt5.5 was based on the molar extinction coefficient of 16,390 M-1 cm-1, and the spectrum was collected in 10 mM Tris pH 7.4 and 50 mM NaF. Data were analyzed using the CDSSTR program on the online Dichroweb server with reference set 4 (http://dichroweb.cryst.bbk.ac.uk)[96]. Data are plotted as mean residue ellipticity, [θ], which is in units of degrees cm2 dmol-1 residue-1.
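The two conversions described above can be sketched as follows. This is a minimal illustration, not the procedure used to process the data in this chapter: the function names, the absorbance and ellipticity readings, the 0.1 cm cuvette path length, and the residue count of 463 (taken from amino acids 1761-2223, ignoring any residues left after tag cleavage) are assumptions chosen for the example. Only the extinction coefficients come from the text.

```python
def concentration_molar(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/L."""
    return a280 / (epsilon * path_cm)


def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert observed ellipticity (millidegrees) to mean residue
    ellipticity in deg cm^2 dmol^-1 residue^-1."""
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# Molar extinction coefficients at 280 nm quoted in the text (M^-1 cm^-1).
EPSILON = {"Brpt3.5": 10430, "HMBP-Brpt3.5": 78270, "Brpt5.5": 16390}

# Hypothetical readings, for illustration only.
conc = concentration_molar(a280=0.1043, epsilon=EPSILON["Brpt3.5"])
mre = mean_residue_ellipticity(
    theta_mdeg=-10.0,   # assumed raw signal
    conc_molar=conc,
    path_cm=0.1,        # assumed cuvette path length
    n_residues=463,     # assumed residue count (1761-2223 inclusive)
)
print(f"concentration = {conc * 1e6:.1f} µM, [θ] = {mre:.0f}")
```

Normalizing per residue is what allows spectra of constructs with different lengths, such as Brpt3.5 and Brpt5.5, to be compared on the same axis.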

Analytical Ultracentrifugation

Experiments were performed with a Beckman XL-I analytical ultracentrifuge using absorbance optics at 280 nm. Sedimentation velocity experiments were performed at 36,000 rpm at 20 °C with and without 3 mM ZnCl2 (Fig 2) or at 25 °C and 37 °C with or without 500 µM or 1 mM ZnCl2 as indicated (Fig 3). Data were analyzed using SEDFIT software[97] and the c(s) (Figs 1 and 2) or c(s,ff0) models (Fig 3). The c(s,ff0) model describes the sedimentation behavior of the species in solution as a two-dimensional distribution based on both sedimentation coefficient and frictional ratio, which allows resolution of multiple species of widely differing size and shape sedimenting at the same sedimentation coefficients[33, 34]. Parameters of buffer density and viscosity and the partial specific volume of constructs were calculated using SEDNTERP[98] at all relevant experimental temperatures. The data described in Fig 3B and Fig 3C were analyzed using several different models in SEDFIT; the c(s,ff0) model yielded fits with the lowest value for the summed square of the residuals (SSR). The standard c(s) model fit gave an SSR value of 0.5140; the best-fit worm-like chain model gave an SSR value of 0.5102; the c(s,ff0) model gave an SSR value of 0.5029 when setting the frictional ratio limits from 1 to 5.
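To illustrate why a second dimension (frictional ratio) is needed, the sketch below applies the Svedberg equation with a Stokes sphere as the reference shape and shows that a compact oligomer and a much larger, highly elongated species can share one sedimentation coefficient. It is not the SEDFIT implementation. The 50 kDa subunit mass, the partial specific volume, the water-like solvent density and viscosity, and the two frictional ratios are assumed values chosen for illustration.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1
SVEDBERG = 1e-13          # seconds per Svedberg unit


def sedimentation_coefficient(mass, ff0, vbar=0.73, rho=1.0, eta=0.01002):
    """Svedberg equation, s = M (1 - vbar*rho) / (N_A * f), in Svedberg units.

    mass: molar mass (g/mol); ff0: frictional ratio f/f0;
    vbar: partial specific volume (cm^3/g, assumed);
    rho: solvent density (g/cm^3, assumed); eta: viscosity (poise, assumed).
    """
    # Radius of the anhydrous sphere with the same mass and volume (cm).
    r0 = (3.0 * mass * vbar / (4.0 * math.pi * AVOGADRO)) ** (1.0 / 3.0)
    # Stokes friction of that sphere, scaled by the frictional ratio.
    friction = 6.0 * math.pi * eta * r0 * ff0
    return mass * (1.0 - vbar * rho) / (AVOGADRO * friction) / SVEDBERG


SUBUNIT = 50_000  # g/mol, hypothetical subunit mass

# s scales as M^(2/3) / (f/f0), so an 8-fold heavier species with a 4-fold
# larger frictional ratio sediments at the same rate.
compact = sedimentation_coefficient(mass=4 * SUBUNIT, ff0=1.25)
elongated = sedimentation_coefficient(mass=32 * SUBUNIT, ff0=5.0)
print(f"compact: {compact:.2f} S, elongated: {elongated:.2f} S")
```

A one-dimensional c(s) fit would report these two species as a single peak, whereas a distribution over both s and f/f0 can place them at different frictional ratios.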

In order to show reversibility of Brpt5.5 initial assembly (Fig S5), 6 µM Brpt5.5 was dialyzed overnight into 50 mM MOPS pH 7.2, 50 mM NaCl, 3.5 mM ZnCl2. A sample was analyzed by sedimentation velocity AUC with the addition of buffer (Brpt5.5 + Zn), 7 mM DTPA (+ Zn + DTPA), or enough HCl to reach pH near 6 (+ Zn + HCl). As a control, Brpt5.5 was also dialyzed into 20 mM MES pH 5.0, 50 mM NaCl, 3.5 mM ZnCl2. This sample sedimented as a monomer near 2 s* (data not shown). The absorbance at 280 nm was used for the c(s) distribution analysis.

Transmission Electron Microscopy

5 µl samples of 10 µM HMBP-Brpt3.5 incubated with 500 µM ZnCl2 were applied to 200 mesh formvar carbon/copper grids for 2 minutes, washed with diH2O, stained by 1% uranyl acetate drop-wise for 30 seconds, and washed a second time with diH2O. Samples were then dried for 1 hour prior to viewing on a Hitachi 7600 transmission electron microscope at an accelerating voltage of 80 kV. Images were captured using an AMT 2k CCD camera. Brpt5.5 (20 µM) was incubated with 5 mM Zn2+ at 37 °C for 3 weeks. Samples were stained using the same protocol as HMBP-Brpt3.5, but with 2% uranyl acetate (Electron Microscopy Sciences). To stain biofilms, they were washed off of the dialysis tubing as described in “Harvesting Biofilms.” The washes were treated with DTPA, FA, or buffer and incubated for 1 hour at room temperature while shaking. A 3 µl sample was added to grids for negative staining as described above. For extracellular biofilm material, washes were centrifuged at 17 k x g for 5 minutes before collecting the supernatant.

Harvesting Biofilms

This protocol was based on an approach used by Sun, et al.[17] to isolate extracellular and cell wall-associated proteins from S. epidermidis biofilms. RP62A was grown overnight on tryptic soy agar (TSA) with Sheep Blood (Thermo Scientific™ R01202). Colonies were scraped from the agar plate and suspended in TSB to an OD of 0.1. A piece of 3,500 MWCO dialysis tubing was cut down the edge and opened to be arranged over a TSA agar plate. A 2 ml aliquot (with the addition of ZnCl2, DTPA, FA, or 100 mM MES pH 5.0) was added to the dialysis tubing on top of the TSA agar. After 16 hours of growth at 37 °C, the biofilm was washed off of the dialysis tubing using 1 ml of H2O. For lysostaphin digestion of the cell wall, mixtures were centrifuged for 10 minutes at 17 k x g and the pellet resuspended in 50 mM Tris pH 7.4, 150 mM NaCl, 30% raffinose. Samples were then incubated at 37 °C for 1 hr with 1 mg/ml of lysostaphin (Sigma L3876).

Thioflavin T Protein Fluorescence Assay

10 µM HMBP-Brpt3.5 and 20 µM Brpt5.5 samples were treated with 10% formic acid (FA) to depolymerize amyloid fibers[44], followed by dialysis into standard buffer. The FA-treated protein was then incubated with or without 500 µM ZnCl2 for 24 hours at 20 °C prior to adding thioflavin T (Sigma) to a final concentration of 10 µM. In parallel, one sample of HMBP-Brpt3.5 was treated with 10% FA and was tested for ThT fluorescence without removing the FA (labeled “FA + HMBP-Brpt3.5 + ThT” in Fig 4B). Fluorescence was measured using a Perkin Elmer LS50B Luminescence Spectrophotometer or a Biotek Synergy at an excitation of 434 nm or 440 nm and collecting the complete emission spectrum between 450 and 600 nm. Congo Red absorbance was measured from 400 to 600 nm, and the absorbance at 540 nm plotted in Fig 4E. Proteostat fluorescence was measured using an excitation wavelength of 500 nm and collecting the emission spectrum from 500 to 700 nm. The emission at 600 nm was plotted in Fig 4F, along with the emission of ThT at 482 nm (Fig 4C, F).

HPLC Assays

For the HPLC fiber/oligomer quantification assay, 10 µM HMBP-Brpt3.5 samples were incubated with or without 500 µM ZnCl2 for 1, 4, or 30 days at 20 °C or 37 °C. Samples were prepared based on a protocol previously described by O’Nuallain et al[60]. Briefly, samples were centrifuged for 1 hour at 13,500 rpm and then 250 µl of 5% acetonitrile. Samples were loaded on a C4 reverse-phase column (Phenomenex) and run on an Äkta purifier with a linear gradient of 0-95% acetonitrile over 10 column volumes. Peaks were integrated using the Unicorn software, normalizing the peak area to a standardized elution volume for monomer and oligomer peaks. To determine the stability of the Zn2+-induced amyloid fibers, samples of 10 µM HMBP-Brpt3.5 were incubated with 500 µM ZnCl2 for 1 or 4 days at 20 °C or 37 °C, followed by the addition of an equal volume of: 1) buffer (20 mM sodium citrate pH 7.4, 150 mM NaCl); 2) dilute hydrochloric acid sufficient to lower the pH to 5.0; or 3) 2 mM Na5-diethylenetriaminepentaacetic acid (DTPA). Samples were further incubated for 2 hours, mixed with an equal volume of 5% acetonitrile and then separated on a C4 column.

Turbidity Assays

5 µM Brpt5.5 in 2 ml 20 mM MOPS pH 7.2, 50 mM NaCl was incubated with 3, 5, or 8 mM ZnCl2 at 37 °C in a shaking incubator. The sample was transferred to a cuvette and the absorbance at 280, 400, and 700 nm was measured in a BioMate 3S (ThermoFisher). The reported turbidity value is the absorbance measured at 400 nm (Fig 5C). To evaluate the resistance of Brpt5.5 fibers to DTPA and HCl, samples were taken from the turbidity assay in Fig 5C, then DTPA or HCl was titrated in, with turbidity measurements taken after each addition.

Confocal Microscopy

For analysis of Brpt5.5 protein fibers, 10 µM samples were incubated with 500 µM ZnCl2 for 2, 6, 12, 18, or 24 hours prior to addition of thioflavin T (10 µM final concentration). Samples were then added to 8-well borosilicate glass slides (Nunc) and viewed using a Zeiss LSM 710 inverted microscope using an Apochromat 63x/1.40 oil DIC M27 objective for brightfield and a laser set to 458 nm with emission filters at 469-580 nm to measure thioflavin T fluorescence. For analysis of amyloid formations in biofilms, overnight cultures of S. epidermidis RP62A were diluted 1:200 and 400 µl aliquots were added to 8-well borosilicate glass slides. The slides were incubated 18 hours at 37 °C without shaking, followed by addition of thioflavin T to 10 µM as indicated. The media was aspirated and each well washed 3x with diH2O. The biofilms were viewed using a Zeiss LSM 510 inverted microscope as described above. Control biofilm samples were alternatively labeled with the LIVE/DEAD stain kit (Invitrogen), using 3 µl stain diluted in 1 ml of sterile saline.

Biofilm Formation Assay

To quantify the ability of RP62A to form biofilms under varying conditions, a crystal violet-based biofilm formation assay was utilized[25]. An overnight culture grown in TSB was diluted 1:200 into TSB. To wells of a 96 well plate (Corning 351172), 200 µl of culture were added, along with initial treatments of ZnCl2 or DTPA (t = 0 hr). DTPA, FA, or 100 mM MES pH 5.0 were added at the specified time points and incubated for an additional 1 hour at 37 °C. The liquid was then removed from the wells, followed by two washes of 200 µl H2O before being allowed to dry. Next, 100 µl of 0.1 % crystal violet was added and allowed to stain the biofilms for 10 minutes at room temperature. The crystal violet solution was then removed, followed by two more washes with 200 µl H2O. The plates were allowed to dry before being scanned. Finally, 200 µl 33 % acetic acid was added and plates incubated at 4 °C for 30 min to remove crystal violet stain from the adherent biofilms. The solution in the wells was then transferred to new wells, and then diluted 1:1 (100 µl solution to 100 µl H2O) before scanning at 520 nm.

Statistical analysis was performed in Microsoft Excel using a two-tailed, two-sample, unequal-variance Student’s t test. Differences were considered significant when P < 0.05.
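The two-tailed, two-sample, unequal-variance test named above is commonly known as Welch's t test. The following is a minimal standard-library sketch of that calculation, not the procedure actually used in this work (which was carried out in Microsoft Excel). The function name `welch_t_test`, the numerical integration scheme, and the absorbance readings are all illustrative assumptions.

```python
import math
from statistics import mean, variance


def welch_t_test(a, b, steps=2000):
    """Two-tailed Welch's t test (two-sample, unequal variance).

    Returns (t, df, p). `steps` must be even (Simpson's rule).
    """
    na, nb = len(a), len(b)
    # Squared standard errors of the two sample means
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))

    # Log of the normalizing constant of Student's t density
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    # Integrate the density over [0, |t|] with Simpson's rule;
    # the two-tailed p value is 1 minus twice that area.
    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    p = 1.0 - 2.0 * (total * h / 3.0)
    return t, df, max(p, 0.0)


# Hypothetical crystal violet absorbance readings (not data from this study)
untreated = [0.82, 0.79, 0.91, 0.85]
treated = [0.41, 0.52, 0.38, 0.47]
t_stat, dof, p_value = welch_t_test(untreated, treated)
print(f"t = {t_stat:.3f}, df = {dof:.2f}, P = {p_value:.4f}")
print("significant" if p_value < 0.05 else "not significant")
```

The degrees of freedom are generally non-integer under this approximation, which is why the density is evaluated with `math.lgamma` rather than a factorial-based formula.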

Mass Spectrometry

Biofilms were prepared as described in “Harvesting Biofilms,” with the exception that no H2O was added to the dialysis membrane during collection. This kept the material in the biofilm at a reasonable concentration to be visualized by SDS-PAGE.

The gel section outlined in Fig 6 was reduced with DTT, alkylated with iodoacetamide, and digested with trypsin. The resultant peptides were recovered and dried in a SpeedVac before analysis by nanoLC-MS/MS.
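Trypsin cleaves on the C-terminal side of lysine (K) and arginine (R), and cleavage is generally suppressed when the following residue is proline. The sketch below applies that simple rule in silico to show how peptides such as those listed in S1 Table arise. It is an illustration only: the function `trypsin_digest` and its `missed_cleavages` parameter are hypothetical names, and this is not the search software used for the nanoLC-MS/MS analysis.

```python
def trypsin_digest(sequence, missed_cleavages=0):
    """Return tryptic peptides of `sequence` (one-letter amino acid codes).

    Cleaves after K or R unless the next residue is P. Peptides spanning
    up to `missed_cleavages` uncut sites are also reported.
    """
    # Positions immediately after each cleavable K/R (the final residue
    # is skipped because there is nothing to cleave after it)
    sites = [i + 1 for i, aa in enumerate(sequence[:-1])
             if aa in "KR" and sequence[i + 1] != "P"]
    bounds = [0] + sites + [len(sequence)]

    peptides = []
    for start in range(len(bounds) - 1):
        last = min(start + 2 + missed_cleavages, len(bounds))
        for end in range(start + 1, last):
            peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides


# Two Aap peptides taken from S1 Table, used here as toy inputs
print(trypsin_digest("EFDPNLAPGTEKVVQK"))
print(trypsin_digest("EFDPNLAPGTEKVVQK", missed_cleavages=1))
print(trypsin_digest("GSQEDVPGKPGVK"))
```

Under this rule, the internal lysine of GSQEDVPGKPGVK is left uncut because it precedes a proline, whereas observing both EFDPNLAPGTEK and EFDPNLAPGTEKVVQK corresponds to one missed cleavage.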

Acknowledgements

The authors thank Georgianne Ciralo and Cincinnati Children’s Department of Pathology for assistance and use of the electron microscopy facility, Dr. Dan Hassett and lab members and Dr. Birgit Ehmer for assistance with the confocal microscope, Dr. Andy Deng and Dr. Tom Thompson’s lab members for help with the HPLC experiments and analysis, Dr. Nicolas Nassar for use of their fluorescence spectrophotometer, and the University of Cincinnati Cancer Biology Proteomics Core Facility for mass spectrometry services. The authors would also like to thank Dr. Rhett Kovall and Dr. Tom Thompson for comments on the manuscript and Dr. Catherine Chaton for helpful discussions. This project was supported by NIH grants R01-GM094363 and U19-AI070235, and by funds from the Cincinnati Children’s Hospital Research Foundation (to ABH).

Supporting Information

S1 Fig. Sequence identity comparison of the tandem Brpt domains of Aap. S. epidermidis strain RP62A contains 12 tandem B-repeats. Each B-repeat amino acid sequence was aligned, using the most N-terminal repeat as the reference sequence with the online ISREC-Server program LALIGN, version 2.1.30 (http://www.ch.embnet.org/software/LALIGN_form.html). Sequence alignment of each B-repeat demonstrates a highly conserved sequence identity for each repeat with minor amino acid differences. In addition, several internal repeats are up to 98-100% identical to one another (such as repeats 4 and 5 or 9, 10, and 11) despite sequence differences with the N-terminal reference B-repeat.

S2 Fig. Secondary structure analysis of Brpt3.5. Circular dichroism spectra of uncleaved Brpt3.5 (black) and uncleaved Brpt3.5 treated with formic acid (FA, grey) demonstrate that the secondary structure of Brpt3.5 was not affected by treatment with FA. The secondary structure profile of Brpt3.5 (13% α-helix, 33% β-sheet, 22% turn, 31% coil) is almost identical to that of Brpt3.5 treated with formic acid (13% α-helix, 34% β-sheet, 23% turn, 30% coil). Secondary structure content was determined using CDSSTR (reference set 4) on Dichroweb (http://dichroweb.cryst.bbk.ac.uk/html/home.shtml).

S3 Fig. B-repeat fibers are resistant to acid and metal chelator treatment. A) C4 reverse-phase HPLC elution profiles of 10 µM HMBP-Brpt3.5 alone (black) or incubated with 500 µM ZnCl2 for 1 (red), 4 (blue), or 30 (green) days at 20 °C. Samples were centrifuged at 13,000 rpm for 1 hour prior to HPLC separation to remove insoluble aggregates. B) Samples of 10 µM HMBP-Brpt3.5 with 500 µM ZnCl2 were incubated for 4 days at 20 °C prior to addition of Tris-buffered saline (red trace), sufficient dilute HCl to lower the pH to 5 (blue trace), or 2 mM DTPA (green trace). Separation by HPLC revealed that oligomer/soluble fiber species were maintained in the presence of both acid (blue) and DTPA (green) when compared to the buffer-treated control samples (red) at both temperatures. TEM images of these samples are in Supporting Information (S4 Fig).

S4 Fig. Stability of HMBP-Brpt3.5/Zn2+ fibers after incubation with HCl or the metal chelator DTPA. Representative TEM images of the 4 day-Zn2+ incubated HPLC samples (see Fig 4) at 20 °C (A) and 37 °C (B) treated with TBS buffer, acid, and DTPA, respectively. The fiber morphology is maintained upon each treatment, with minor fiber assembly rearrangements visible for the 20 °C acid-treated fibers (shown in image), although other rod-like fibers can still be found in this sample. Scale bars = 1 µm.

S5 Fig. Initial assembly of Brpt5.5 is sensitive to acidification and the metal chelator DTPA. 6 µM Brpt5.5 was incubated with 3.5 mM ZnCl2 overnight, then analyzed by sedimentation velocity AUC with the addition of buffer (black), 7 mM DTPA (blue) or HCl to pH 5 (red). This experiment indicates that tandem B-repeats undergo an initial, reversible assembly.

S6 Fig. Mass spectrometry results of SDS-resistant aggregate present in S. epidermidis RP62A biofilms. Biofilms were collected from dialysis membrane on TSA grown in the presence of 20 µM ZnCl2. Collected material was centrifuged and the supernatant removed. The bacterial pellet was incubated with lysostaphin, resulting in the digestion of the cell wall and release of cell wall-anchored proteins. After another centrifugation step, the supernatant was run on SDS-PAGE, revealing aggregate trapped in the well and stacking gel when stained with Coomassie (Fig 6J). Results from nanoLC-MS/MS performed using the material within the red rectangle in Fig 6J are displayed above. Aap is the primary protein identified. The three other proteins, each of which is cytoplasmic, have very little coverage, suggesting that they are present in very small amounts and may therefore be insignificant to the amyloid fibers. S1 Table displays all peptides observed.

Protein                                   Sequence Observed
Accumulation-associated protein           ADLDGATLTYTPK
Accumulation-associated protein           DNYDFYGR
Accumulation-associated protein           EFDPNLAPGTEK
Accumulation-associated protein           EFDPNLAPGTEKVVQK
Accumulation-associated protein           EFNPDLKPGEER
Accumulation-associated protein           GSQEDVPGKPGVK
Accumulation-associated protein           GSQEDVPGKPGVKNPDTGEVVTPPVDDVTK
Accumulation-associated protein           GTPSAAGFR
Accumulation-associated protein           GYGTFVKNDSQGNTSK
Accumulation-associated protein           IDTGYYNNDPLDK
Accumulation-associated protein           NPDTGEVVTPPVDDVTK
Accumulation-associated protein           NPLTGEKVGEGEPTEK
Accumulation-associated protein           NTIDIPPTTVK
Accumulation-associated protein           QPVDEIVEYGPTK
Accumulation-associated protein           TITTPTTKNPLTGEK
Accumulation-associated protein           VGEGEPTEKITK
Accumulation-associated protein           YNASNQTFTATYAGK
Accumulation-associated protein           YNYGQPPGTTTAGAVQFK
Accumulation-associated protein           PITSTEEIPFDK
Accumulation-associated protein           TGEVVTPPVDDVTK
Arginine deiminase                        DVLAIGISER
Arginine deiminase                        VVAIEIPTSR
L-lactate dehydrogenase                   DAAYDIIQAK
L-lactate dehydrogenase                   VIGSGTVLDSAR
Probable malate:quinone oxidoreductase    IDEGTDVNYGALTR
Probable malate:quinone oxidoreductase    YSIDQMIK

S1 Table. Mass spectrometry results of SDS-resistant aggregate present in S. epidermidis RP62A biofilms. Listed in this table are the peptides observed by mass spectrometry.

References

1. Uckay I, Pittet D, Vaudaux P, Sax H, Lew D, Waldvogel F. Foreign body infections due to Staphylococcus epidermidis. Annals of medicine. 2009;41(2):109-19. Epub 2008/08/23. doi: 10.1080/07853890802337045. PubMed PMID: 18720093.
2. NNIS. National Nosocomial Infections Surveillance (NNIS) System Report, data summary from January 1992 through June 2004, issued October 2004. American journal of infection control. 2004;32(8):470-85. Epub 2004/12/02. doi: 10.1016/s0196655304005425. PubMed PMID: 15573054.
3. Fey PD, Olson ME. Current concepts in biofilm formation of Staphylococcus epidermidis. Future microbiology. 2010;5(6):917-33. Epub 2010/06/05. doi: 10.2217/fmb.10.56. PubMed PMID: 20521936; PubMed Central PMCID: PMCPMC2903046.
4. Wisplinghoff H, Bischoff T, Tallent SM, Seifert H, Wenzel RP, Edmond MB. Nosocomial bloodstream infections in US hospitals: analysis of 24,179 cases from a prospective nationwide surveillance study. Clinical infectious diseases : an official publication of the Infectious Diseases Society of America. 2004;39(3):309-17. Epub 2004/08/13. doi: 10.1086/421946. PubMed PMID: 15306996.
5. Rogers KL, Fey PD, Rupp ME. Coagulase-negative staphylococcal infections. Infectious disease clinics of North America. 2009;23(1):73-98. Epub 2009/01/13. doi: 10.1016/j.idc.2008.10.001. PubMed PMID: 19135917.
6. Fey PD. Modality of bacterial growth presents unique targets: how do we treat biofilm-mediated infections? Current opinion in microbiology. 2010;13(5):610-5. Epub 2010/10/05. doi: 10.1016/j.mib.2010.09.007. PubMed PMID: 20884280; PubMed Central PMCID: PMCPMC2966470.
7. von Eiff C, Peters G, Heilmann C. Pathogenesis of infections due to coagulase-negative staphylococci. The Lancet Infectious diseases. 2002;2(11):677-85. Epub 2002/11/01. PubMed PMID: 12409048.
8. Otto M. Staphylococcus epidermidis--the 'accidental' pathogen. Nature reviews Microbiology. 2009;7(8):555-67. doi: 10.1038/nrmicro2182. PubMed PMID: 19609257; PubMed Central PMCID: PMC2807625.
9. Kristian SA, Birkenstock TA, Sauder U, Mack D, Gotz F, Landmann R. Biofilm formation induces C3a release and protects Staphylococcus epidermidis from IgG and complement deposition and from neutrophil-dependent killing. The Journal of infectious diseases. 2008;197(7):1028-35. Epub 2008/04/19. doi: 10.1086/528992. PubMed PMID: 18419540.
10. Gill SR, Fouts DE, Archer GL, Mongodin EF, Deboy RT, Ravel J, et al. Insights on evolution of virulence and resistance from the complete genome analysis of an early methicillin-resistant Staphylococcus aureus strain and a biofilm-producing methicillin-resistant Staphylococcus epidermidis strain. Journal of bacteriology. 2005;187(7):2426-38. Epub 2005/03/19. doi: 10.1128/jb.187.7.2426-2438.2005. PubMed PMID: 15774886; PubMed Central PMCID: PMCPMC1065214.
11. Costerton JW, Stewart PS, Greenberg EP. Bacterial biofilms: a common cause of persistent infections. Science (New York, NY). 1999;284(5418):1318-22. Epub 1999/05/21. PubMed PMID: 10334980.
12. Sadovskaya I, Vinogradov E, Flahaut S, Kogan G, Jabbouri S. Extracellular carbohydrate-containing polymers of a model biofilm-producing strain, Staphylococcus epidermidis RP62A. Infection and immunity. 2005;73(5):3007-17. Epub 2005/04/23. doi: 10.1128/iai.73.5.3007-3017.2005. PubMed PMID: 15845508; PubMed Central PMCID: PMCPMC1087347.
13. Izano EA, Amarante MA, Kher WB, Kaplan JB. Differential roles of poly-N-acetylglucosamine surface polysaccharide and extracellular DNA in Staphylococcus aureus and Staphylococcus epidermidis biofilms. Applied and environmental microbiology. 2008;74(2):470-6. Epub 2007/11/28. doi: 10.1128/aem.02073-07. PubMed PMID: 18039822; PubMed Central PMCID: PMCPMC2223269.
14. Mack D, Haeder M, Siemssen N, Laufs R. Association of biofilm production of coagulase-negative staphylococci with expression of a specific polysaccharide intercellular adhesin. The Journal of infectious diseases. 1996;174(4):881-4. Epub 1996/10/01. PubMed PMID: 8843236.
15. Rohde H, Burandt EC, Siemssen N, Frommelt L, Burdelski C, Wurster S, et al. Polysaccharide intercellular adhesin or protein factors in biofilm accumulation of Staphylococcus epidermidis and Staphylococcus aureus isolated from prosthetic hip and knee joint infections. Biomaterials. 2007;28(9):1711-20. doi: 10.1016/j.biomaterials.2006.11.046. PubMed PMID: 17187854.
16. Hussain M, Herrmann M, von Eiff C, Perdreau-Remington F, Peters G. A 140-kilodalton extracellular protein is essential for the accumulation of Staphylococcus epidermidis strains on surfaces. Infection and immunity. 1997;65(2):519-24. Epub 1997/02/01. PubMed PMID: 9009307; PubMed Central PMCID: PMCPmc176090.
17. Sun D, Accavitti MA, Bryers JD. Inhibition of biofilm formation by monoclonal antibodies against Staphylococcus epidermidis RP62A accumulation-associated protein. Clinical and diagnostic laboratory immunology. 2005;12(1):93-100. Epub 2005/01/12. doi: 10.1128/cdli.12.1.93-100.2005. PubMed PMID: 15642991; PubMed Central PMCID: PMCPMC540198.
18. Rohde H, Burdelski C, Bartscht K, Hussain M, Buck F, Horstkotte MA, et al. Induction of Staphylococcus epidermidis biofilm formation via proteolytic processing of the accumulation-associated protein by staphylococcal and host proteases. Molecular microbiology. 2005;55(6):1883-95. doi: 10.1111/j.1365-2958.2005.04515.x. PubMed PMID: 15752207.
19. Corrigan RM, Rigby D, Handley P, Foster TJ. The role of Staphylococcus aureus surface protein SasG in adherence and biofilm formation. Microbiology (Reading, England). 2007;153(Pt 8):2435-46. Epub 2007/07/31. doi: 10.1099/mic.0.2007/006676-0. PubMed PMID: 17660408.
20. Kogan G, Sadovskaya I, Chaignon P, Chokr A, Jabbouri S. Biofilms of clinical strains of Staphylococcus that do not contain polysaccharide intercellular adhesin. FEMS microbiology letters. 2006;255(1):11-6. Epub 2006/01/27. doi: 10.1111/j.1574-6968.2005.00043.x. PubMed PMID: 16436056.
21. Jabbouri S, Sadovskaya I. Characteristics of the biofilm matrix and its role as a possible target for the detection and eradication of Staphylococcus epidermidis associated with medical implant infections. FEMS immunology and medical microbiology. 2010;59(3):280-91. Epub 2010/06/10. doi: 10.1111/j.1574-695X.2010.00695.x. PubMed PMID: 20528930.
22. Schaeffer CR, Woods KM, Longo GM, Kiedrowski MR, Paharik AE, Buttner H, et al. Accumulation-associated protein enhances Staphylococcus epidermidis biofilm formation under dynamic conditions and is required for infection in a rat catheter model. Infection and immunity. 2015;83(1):214-26. Epub 2014/10/22. doi: 10.1128/iai.02177-14. PubMed PMID: 25332125; PubMed Central PMCID: PMCPmc4288872.
23. Yarawsky AE, English LR, Whitten ST, Herr AB. The Proline/Glycine-Rich Region of the Biofilm Adhesion Protein Aap Forms an Extended Stalk that Resists Compaction. Journal of molecular biology. 2017;429(2):261-79. Epub 2016/11/29. doi: 10.1016/j.jmb.2016.11.017. PubMed PMID: 27890783.
24. Paharik AE, Kotasinska M, Both A, Hoang TN, Buttner H, Roy P, et al. The metalloprotease SepA governs processing of accumulation-associated protein and shapes intercellular adhesive surface properties in Staphylococcus epidermidis. Molecular microbiology. 2017;103(5):860-74. Epub 2016/12/21. doi: 10.1111/mmi.13594. PubMed PMID: 27997732; PubMed Central PMCID: PMCPMC5480372.
25. Conrady DG, Brescia CC, Horii K, Weiss AA, Hassett DJ, Herr AB. A zinc-dependent adhesion module is responsible for intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2008;105(49):19456-61. doi: 10.1073/pnas.0807717105. PubMed PMID: 19047636; PubMed Central PMCID: PMC2592360.
26. Conrady DG, Wilson JJ, Herr AB. Structural basis for Zn2+-dependent intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2013;110(3):E202-11. Epub 2013/01/02. doi: 10.1073/pnas.1208134110. PubMed PMID: 23277549; PubMed Central PMCID: PMCPmc3549106.
27. Chaton CT, Herr AB. Defining the metal specificity of a multifunctional biofilm adhesion protein. Protein science : a publication of the Protein Society. 2017;26(10):1964-73. Epub 2017/07/15. doi: 10.1002/pro.3232. PubMed PMID: 28707417; PubMed Central PMCID: PMCPMC5606542.
28. Geoghegan JA, Corrigan RM, Gruszka DT, Speziale P, O'Gara JP, Potts JR, et al. Role of surface protein SasG in biofilm formation by Staphylococcus aureus. Journal of bacteriology. 2010;192(21):5663-73. Epub 2010/09/08. doi: 10.1128/jb.00628-10. PubMed PMID: 20817770; PubMed Central PMCID: PMCPMC2953683.
29. Formosa-Dague C, Speziale P, Foster TJ, Geoghegan JA, Dufrene YF. Zinc-dependent mechanical properties of Staphylococcus aureus biofilm-forming surface protein SasG. Proceedings of the National Academy of Sciences of the United States of America. 2016;113(2):410-5. Epub 2015/12/31. doi: 10.1073/pnas.1519265113. PubMed PMID: 26715750; PubMed Central PMCID: PMCPmc4720321.
30. Shelton CL, Conrady DG, Herr AB. Functional consequences of B-repeat sequence variation in the staphylococcal biofilm protein Aap: deciphering the assembly code. The Biochemical journal. 2017;474(3):427-43. Epub 2016/11/23. doi: 10.1042/bcj20160675. PubMed PMID: 27872164.
31. Spurlino JC, Lu GY, Quiocho FA. The 2.3-A resolution structure of the maltose- or maltodextrin-binding protein, a primary receptor of bacterial active transport and chemotaxis. The Journal of biological chemistry. 1991;266(8):5202-19. Epub 1991/03/15. PubMed PMID: 2002054.

32. Herr AB, Conrady DG. Thermodynamic analysis of metal ion-induced protein assembly. Methods in enzymology. 2011;488:101-21. Epub 2011/01/05. doi: 10.1016/b978-0-12-381268-1.00005-7. PubMed PMID: 21195226.
33. Brown PH, Schuck P. Macromolecular size-and-shape distributions by sedimentation velocity analytical ultracentrifugation. Biophysical journal. 2006;90(12):4651-61. Epub 2006/03/28. doi: 10.1529/biophysj.106.081372. PubMed PMID: 16565040; PubMed Central PMCID: PMCPMC1471869.
34. Chaton CT, Herr AB. Elucidating Complicated Assembling Systems in Biology Using Size-and-Shape Analysis of Sedimentation Velocity Data. Methods in enzymology. 2015;562:187-204. Epub 2015/09/29. doi: 10.1016/bs.mie.2015.04.004. PubMed PMID: 26412652.
35. Sunde M, Blake C. The structure of amyloid fibrils by electron microscopy and X-ray diffraction. Advances in protein chemistry. 1997;50:123-59. Epub 1997/01/01. PubMed PMID: 9338080.
36. Dueholm MS, Larsen P, Finster K, Stenvang MR, Christiansen G, Vad BS, et al. The Tubular Sheaths Encasing Methanosaeta thermophila Filaments Are Functional Amyloids. The Journal of biological chemistry. 2015;290(33):20590-600. Epub 2015/06/26. doi: 10.1074/jbc.M115.654780. PubMed PMID: 26109065; PubMed Central PMCID: PMCPMC4536462.
37. Fowler DM, Koulov AV, Balch WE, Kelly JW. Functional amyloid--from bacteria to humans. Trends in biochemical sciences. 2007;32(5):217-24. Epub 2007/04/07. doi: 10.1016/j.tibs.2007.03.003. PubMed PMID: 17412596.
38. Chapman MR, Robinson LS, Pinkner JS, Roth R, Heuser J, Hammar M, et al. Role of Escherichia coli curli operons in directing amyloid fiber formation. Science (New York, NY). 2002;295(5556):851-5. Epub 2002/02/02. doi: 10.1126/science.1067484. PubMed PMID: 11823641; PubMed Central PMCID: PMCPMC2838482.
39. Larsen P, Nielsen JL, Dueholm MS, Wetzel R, Otzen D, Nielsen PH. Amyloid adhesins are abundant in natural biofilms. Environmental microbiology. 2007;9(12):3077-90. Epub 2007/11/10. doi: 10.1111/j.1462-2920.2007.01418.x. PubMed PMID: 17991035.
40. Romero D, Aguilar C, Losick R, Kolter R. Amyloid fibers provide structural integrity to Bacillus subtilis biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2010;107(5):2230-4. Epub 2010/01/19. doi: 10.1073/pnas.0910560107. PubMed PMID: 20080671; PubMed Central PMCID: PMCPMC2836674.
41. Dueholm MS, Petersen SV, Sonderkaer M, Larsen P, Christiansen G, Hein KL, et al. Functional amyloid in Pseudomonas. Molecular microbiology. 2010;77(4):1009-20. Epub 2010/06/25. doi: 10.1111/j.1365-2958.2010.07269.x. PubMed PMID: 20572935.
42. LeVine H, 3rd. Thioflavine T interaction with synthetic Alzheimer's disease beta-amyloid peptides: detection of amyloid aggregation in solution. Protein science : a publication of the Protein Society. 1993;2(3):404-10. Epub 1993/03/01. doi: 10.1002/pro.5560020312. PubMed PMID: 8453378; PubMed Central PMCID: PMCPMC2142377.
43. Wolfe LS, Calabrese MF, Nath A, Blaho DV, Miranker AD, Xiong Y. Protein-induced photophysical changes to the amyloid indicator dye thioflavin T. Proceedings of the National Academy of Sciences of the United States of America. 2010;107(39):16863-8. Epub 2010/09/10. doi: 10.1073/pnas.1002867107. PubMed PMID: 20826442; PubMed Central PMCID: PMCPMC2947910.
44. Collinson SK, Emody L, Muller KH, Trust TJ, Kay WW. Purification and characterization of thin, aggregative fimbriae from Salmonella enteritidis. Journal of bacteriology. 1991;173(15):4773-81. Epub 1991/08/01. PubMed PMID: 1677357; PubMed Central PMCID: PMCPMC208156.
45. Blancas-Mejia LM, Hammernik J, Marin-Argany M, Ramirez-Alvarado M. Differential effects on light chain amyloid formation depend on mutations and type of glycosaminoglycans. The Journal of biological chemistry. 2015;290(8):4953-65. Epub 2014/12/30. doi: 10.1074/jbc.M114.615401. PubMed PMID: 25538238; PubMed Central PMCID: PMCPMC4335233.
46. Kodali R, Williams AD, Chemuru S, Wetzel R. Abeta(1-40) forms five distinct amyloid structures whose beta-sheet contents and fibril stabilities are correlated. Journal of molecular biology. 2010;401(3):503-17. Epub 2010/07/06. doi: 10.1016/j.jmb.2010.06.023. PubMed PMID: 20600131; PubMed Central PMCID: PMCPMC2919579.
47. Navarro S, Carija A, Munoz-Torrero D, Ventura S. A fast and specific method to screen for intracellular amyloid inhibitors using bacterial model systems. European journal of medicinal chemistry. 2016;121:785-92. Epub 2015/11/27. doi: 10.1016/j.ejmech.2015.10.044. PubMed PMID: 26608003.
48. Howie AJ, Brewer DB. Optical properties of amyloid stained by Congo red: history and mechanisms. Micron (Oxford, England : 1993). 2009;40(3):285-301. Epub 2008/11/21. doi: 10.1016/j.micron.2008.10.002. PubMed PMID: 19019688.
49. Gruszka DT, Wojdyla JA, Bingham RJ, Turkenburg JP, Manfield IW, Steward A, et al. Staphylococcal biofilm-forming protein has a contiguous rod-like structure. Proceedings of the National Academy of Sciences of the United States of America. 2012;109(17):E1011-8. Epub 2012/04/12. doi: 10.1073/pnas.1119456109. PubMed PMID: 22493247; PubMed Central PMCID: PMCPmc3340054.
50. Pedersen JS, Dikov D, Flink JL, Hjuler HA, Christiansen G, Otzen DE. The changing face of glucagon fibrillation: structural polymorphism and conformational imprinting. Journal of molecular biology. 2006;355(3):501-23. Epub 2005/12/03. doi: 10.1016/j.jmb.2005.09.100. PubMed PMID: 16321400.
51. Macchi F, Eisenkolb M, Kiefer H, Otzen DE. The effect of osmolytes on protein fibrillation. International journal of molecular sciences. 2012;13(3):3801-19. Epub 2012/04/11. doi: 10.3390/ijms13033801. PubMed PMID: 22489184; PubMed Central PMCID: PMCPMC3317744.
52. Kayed R, Head E, Sarsoza F, Saing T, Cotman CW, Necula M, et al. Fibril specific, conformation dependent antibodies recognize a generic epitope common to amyloid fibrils and fibrillar oligomers that is absent in prefibrillar oligomers. Molecular Neurodegeneration. 2007;2(1):18. doi: 10.1186/1750-1326-2-18.
53. Tougu V, Karafin A, Zovo K, Chung RS, Howells C, West AK, et al. Zn(II)- and Cu(II)-induced non-fibrillar aggregates of amyloid-beta (1-42) peptide are transformed to amyloid fibrils, both spontaneously and under the influence of metal chelators. Journal of neurochemistry. 2009;110(6):1784-95. Epub 2009/07/22. doi: 10.1111/j.1471-4159.2009.06269.x. PubMed PMID: 19619132.

54. Bush AI, Pettingell WH, Multhaup G, d Paradis M, Vonsattel JP, Gusella JF, et al. Rapid induction of Alzheimer A beta amyloid formation by zinc. Science (New York, NY). 1994;265(5177):1464-7. Epub 1994/09/02. PubMed PMID: 8073293.
55. Adlard PA, Bush AI. Metals and Alzheimer's disease. Journal of Alzheimer's disease : JAD. 2006;10(2-3):145-63. Epub 2006/11/23. PubMed PMID: 17119284.
56. Faller P. Copper and zinc binding to amyloid-beta: coordination, dynamics, aggregation, reactivity and metal-ion transfer. Chembiochem : a European journal of chemical biology. 2009;10(18):2837-45. Epub 2009/10/31. doi: 10.1002/cbic.200900321. PubMed PMID: 19877000.
57. Hammer ND, Schmidt JC, Chapman MR. The curli nucleator protein, CsgB, contains an amyloidogenic domain that directs CsgA polymerization. Proceedings of the National Academy of Sciences of the United States of America. 2007;104(30):12494-9. Epub 2007/07/20. doi: 10.1073/pnas.0703310104. PubMed PMID: 17636121; PubMed Central PMCID: PMCPMC1941497.
58. Romero D, Vlamakis H, Losick R, Kolter R. An accessory protein required for anchoring and assembly of amyloid fibres in B. subtilis biofilms. Molecular microbiology. 2011;80(5):1155-68. Epub 2011/04/12. doi: 10.1111/j.1365-2958.2011.07653.x. PubMed PMID: 21477127; PubMed Central PMCID: PMCPMC3103627.
59. Taglialegna A, Navarro S, Ventura S, Garnett JA, Matthews S, Penades JR, et al. Staphylococcal Bap Proteins Build Amyloid Scaffold Biofilm Matrices in Response to Environmental Signals. PLoS pathogens. 2016;12(6):e1005711. Epub 2016/06/22. doi: 10.1371/journal.ppat.1005711. PubMed PMID: 27327765; PubMed Central PMCID: PMCPMC4915627.
60. O'Nuallain B, Thakur AK, Williams AD, Bhattacharyya AM, Chen S, Thiagarajan G, et al. Kinetics and thermodynamics of amyloid assembly using a high-performance liquid chromatography-based sedimentation assay. Methods in enzymology. 2006;413:34-74. Epub 2006/10/19. doi: 10.1016/s0076-6879(06)13003-7. PubMed PMID: 17046390.
61. Zhao R, So M, Maat H, Ray NJ, Arisaka F, Goto Y, et al. Measurement of amyloid formation by turbidity assay-seeing through the cloud. Biophysical reviews. 2016;8(4):445-71. Epub 2016/12/23. doi: 10.1007/s12551-016-0233-7. PubMed PMID: 28003859; PubMed Central PMCID: PMCPMC5135725.
62. Plakoutsi G, Bemporad F, Calamai M, Taddei N, Dobson CM, Chiti F. Evidence for a mechanism of amyloid formation involving molecular reorganisation within native-like precursor aggregates. Journal of molecular biology. 2005;351(4):910-22. Epub 2005/07/19. doi: 10.1016/j.jmb.2005.06.043. PubMed PMID: 16024042.
63. Banner MA, Cunniffe JG, Macintosh RL, Foster TJ, Rohde H, Mack D, et al. Localized tufts of fibrils on Staphylococcus epidermidis NCTC 11047 are comprised of the accumulation-associated protein. Journal of bacteriology. 2007;189(7):2793-804. Epub 2007/02/06. doi: 10.1128/jb.00952-06. PubMed PMID: 17277069; PubMed Central PMCID: PMCPMC1855787.
64. Macintosh RL, Brittan JL, Bhattacharya R, Jenkinson HF, Derrick J, Upton M, et al. The terminal A domain of the fibrillar accumulation-associated protein (Aap) of Staphylococcus epidermidis mediates adhesion to human corneocytes. Journal of bacteriology. 2009;191(22):7007-16. Epub 2009/09/15. doi: 10.1128/jb.00764-09. PubMed PMID: 19749046; PubMed Central PMCID: PMCPMC2772481.

65. Damo SM, Kehl-Fie TE, Sugitani N, Holt ME, Rathi S, Murphy WJ, et al. Molecular basis for manganese sequestration by calprotectin and roles in the innate immune response to invading bacterial pathogens. Proceedings of the National Academy of Sciences of the United States of America. 2013;110(10):3841-6. Epub 2013/02/23. doi: 10.1073/pnas.1220341110. PubMed PMID: 23431180; PubMed Central PMCID: PMCPMC3593839.
66. Nenninger AA, Robinson LS, Hultgren SJ. Localized and efficient curli nucleation requires the chaperone-like amyloid assembly protein CsgF. Proceedings of the National Academy of Sciences of the United States of America. 2009;106(3):900-5. Epub 2009/01/10. doi: 10.1073/pnas.0812143106. PubMed PMID: 19131513; PubMed Central PMCID: PMCPMC2630086.
67. Dong J, Shokes JE, Scott RA, Lynn DG. Modulating amyloid self-assembly and fibril morphology with Zn(II). Journal of the American Chemical Society. 2006;128(11):3540-2. Epub 2006/03/16. doi: 10.1021/ja055973j. PubMed PMID: 16536526; PubMed Central PMCID: PMCPMC3555692.
68. Talmard C, Leuma Yona R, Faller P. Mechanism of zinc(II)-promoted amyloid formation: zinc(II) binding facilitates the transition from the partially alpha-helical conformer to aggregates of amyloid beta protein(1-28). Journal of biological inorganic chemistry : JBIC : a publication of the Society of Biological Inorganic Chemistry. 2009;14(3):449-55. Epub 2008/12/17. doi: 10.1007/s00775-008-0461-9. PubMed PMID: 19083027.
69. Edgeworth JA, Gros N, Alden J, Joiner S, Wadsworth JD, Linehan J, et al. Spontaneous generation of mammalian prions. Proceedings of the National Academy of Sciences of the United States of America. 2010;107(32):14402-6. Epub 2010/07/28. doi: 10.1073/pnas.1004036107. PubMed PMID: 20660771; PubMed Central PMCID: PMCPMC2922516.
70. Jobling MF, Huang X, Stewart LR, Barnham KJ, Curtain C, Volitakis I, et al. Copper and zinc binding modulates the aggregation and neurotoxic properties of the prion peptide PrP106-126. Biochemistry. 2001;40(27):8073-84. Epub 2001/07/04. PubMed PMID: 11434776.
71. Calabrese MF, Eakin CM, Wang JM, Miranker AD. A regulatable switch mediates self-association in an immunoglobulin fold. Nature structural & molecular biology. 2008;15(9):965-71. Epub 2009/01/28. PubMed PMID: 19172750; PubMed Central PMCID: PMCPMC2680708.
72. Eakin CM, Knight JD, Morgan CJ, Gelfand MA, Miranker AD. Formation of a copper specific binding site in non-native states of beta-2-microglobulin. Biochemistry. 2002;41(34):10646-56. Epub 2002/08/21. PubMed PMID: 12186550.
73. Tormo MA, Knecht E, Gotz F, Lasa I, Penades JR. Bap-dependent biofilm formation by pathogenic species of Staphylococcus: evidence of horizontal gene transfer? Microbiology (Reading, England). 2005;151(Pt 7):2465-75. Epub 2005/07/08. doi: 10.1099/mic.0.27865-0. PubMed PMID: 16000737.
74. Lembre P, Vendrely C, Martino PD. Identification of an amyloidogenic peptide from the Bap protein of Staphylococcus epidermidis. Protein and peptide letters. 2014;21(1):75-9. Epub 2013/12/21. PubMed PMID: 24354773.
75. Wang Y, Jiang J, Gao Y, Sun Y, Dai J, Wu Y, et al. Staphylococcus epidermidis small basic protein (Sbp) forms amyloid fibrils, consistent with its function as a scaffolding protein in biofilms. The Journal of biological chemistry. 2018;293(37):14296-311. Epub 2018/07/28. doi: 10.1074/jbc.RA118.002448. PubMed PMID: 30049797; PubMed Central PMCID: PMCPMC6139570.
76. Decker R, Burdelski C, Zobiak M, Buttner H, Franke G, Christner M, et al. An 18 kDa scaffold protein is critical for Staphylococcus epidermidis biofilm formation. PLoS pathogens. 2015;11(3):e1004735. Epub 2015/03/24. doi: 10.1371/journal.ppat.1004735. PubMed PMID: 25799153; PubMed Central PMCID: PMCPmc4370877.
77. Pedersen JS, Otzen DE. Amyloid-a state in many guises: survival of the fittest fibril fold. Protein science : a publication of the Protein Society. 2008;17(1):2-10. Epub 2007/11/29. doi: 10.1110/ps.073127808. PubMed PMID: 18042680; PubMed Central PMCID: PMCPMC2144592.
78. Nelson R, Sawaya MR, Balbirnie M, Madsen AO, Riekel C, Grothe R, et al. Structure of the cross-beta spine of amyloid-like fibrils. Nature. 2005;435(7043):773-8. Epub 2005/06/10. doi: 10.1038/nature03680. PubMed PMID: 15944695; PubMed Central PMCID: PMCPMC1479801.
79. Shewmaker F, McGlinchey RP, Thurber KR, McPhie P, Dyda F, Tycko R, et al. The functional curli amyloid is not based on in-register parallel beta-sheet structure. The Journal of biological chemistry. 2009;284(37):25065-76. Epub 2009/07/04. doi: 10.1074/jbc.M109.007054. PubMed PMID: 19574225; PubMed Central PMCID: PMCPMC2757210.
80. Andersen CB, Hicks MR, Vetri V, Vandahl B, Rahbek-Nielsen H, Thogersen H, et al. Glucagon fibril polymorphism reflects differences in protofilament backbone structure. Journal of molecular biology. 2010;397(4):932-46. Epub 2010/02/17. doi: 10.1016/j.jmb.2010.02.012. PubMed PMID: 20156459.
81. Morris K, Serpell L. From natural to designer self-assembling biopolymers, the structural characterisation of fibrous proteins & peptides using fibre diffraction. Chemical Society reviews. 2010;39(9):3445-53. Epub 2010/07/30. doi: 10.1039/b919453n. PubMed PMID: 20668734.
82. Zhang R, Hu X, Khant H, Ludtke SJ, Chiu W, Schmid MF, et al. Interprotofilament interactions between Alzheimer's Abeta1-42 peptides in amyloid fibrils revealed by cryoEM. Proceedings of the National Academy of Sciences of the United States of America. 2009;106(12):4653-8. Epub 2009/03/07. doi: 10.1073/pnas.0901085106. PubMed PMID: 19264960; PubMed Central PMCID: PMCPMC2660777.
83. Sikorski P, Atkins ED, Serpell LC. Structure and texture of fibrous crystals formed by Alzheimer's abeta(11-25) peptide fragment. Structure (London, England : 1993). 2003;11(8):915-26. Epub 2003/08/09. PubMed PMID: 12906823.
84. Sawaya MR, Sambashivan S, Nelson R, Ivanova MI, Sievers SA, Apostol MI, et al. Atomic structures of amyloid cross-beta spines reveal varied steric zippers. Nature. 2007;447(7143):453-7. Epub 2007/05/01. doi: 10.1038/nature05695. PubMed PMID: 17468747.
85. Liu C, Sawaya MR, Cheng PN, Zheng J, Nowick JS, Eisenberg D. Characteristics of amyloid-related oligomers revealed by crystal structures of macrocyclic beta-sheet mimics. Journal of the American Chemical Society. 2011;133(17):6736-44. Epub 2011/04/09. doi: 10.1021/ja200222n. PubMed PMID: 21473620; PubMed Central PMCID: PMCPMC3124511.

117

86. MacRaild CA, Hatters DM, Lawrence LJ, Howlett GJ. Sedimentation velocity analysis of flexible macromolecules: self-association and tangling of amyloid fibrils. Biophysical journal. 2003;84(4):2562-9. Epub 2003/04/02. doi: 10.1016/s0006- 3495(03)75061-9. PubMed PMID: 12668464; PubMed Central PMCID: PMCPMC1302822. 87. Pham Cle L, Mok YF, Howlett GJ. Sedimentation velocity analysis of amyloid fibrils. Methods in molecular biology (Clifton, NJ). 2011;752:179-96. Epub 2011/06/30. doi: 10.1007/978-1-60327-223-0_12. PubMed PMID: 21713638. 88. Smith DP, Radford SE, Ashcroft AE. Elongated oligomers in beta2-microglobulin amyloid assembly revealed by ion mobility spectrometry-mass spectrometry. Proceedings of the National Academy of Sciences of the United States of America. 2010;107(15):6794-8. Epub 2010/03/31. doi: 10.1073/pnas.0913046107. PubMed PMID: 20351246; PubMed Central PMCID: PMCPMC2872402. 89. Binger KJ, Pham CL, Wilson LM, Bailey MF, Lawrence LJ, Schuck P, et al. Apolipoprotein C-II amyloid fibrils assemble via a reversible pathway that includes fibril breaking and rejoining. Journal of molecular biology. 2008;376(4):1116-29. Epub 2008/01/22. doi: 10.1016/j.jmb.2007.12.055. PubMed PMID: 18206908. 90. Ryan TM, Teoh CL, Griffin MD, Bailey MF, Schuck P, Howlett GJ. Phospholipids enhance nucleation but not elongation of apolipoprotein C-II amyloid fibrils. Journal of molecular biology. 2010;399(5):731-40. Epub 2010/05/04. doi: 10.1016/j.jmb.2010.04.042. PubMed PMID: 20433849; PubMed Central PMCID: PMCPMC2887044. 91. Lashuel HA, Lai Z, Kelly JW. Characterization of the transthyretin acid denaturation pathways by analytical ultracentrifugation: implications for wild-type, V30M, and L55P amyloid fibril formation. Biochemistry. 1998;37(51):17851-64. Epub 1999/01/28. PubMed PMID: 9922152. 92. Van Holde KE, Weischet WO. Boundary analysis of sedimentation-velocity experiments with monodisperse and paucidisperse solutes. Biopolymers. 1978;17(6):1387-403. 
doi: 10.1002/bip.1978.360170602. 93. Demeler B, van Holde KE. Sedimentation velocity analysis of highly heterogeneous systems. Analytical biochemistry. 2004;335(2):279-88. Epub 2004/11/24. doi: 10.1016/j.ab.2004.08.039. PubMed PMID: 15556567. 94. Demeler B, Brookes E, Nagel-Steger L. Analysis of heterogeneity in molecular weight and shape by analytical ultracentrifugation using parallel distributed computing. Methods in enzymology. 2009;454:87-113. Epub 2009/02/17. doi: 10.1016/s0076- 6879(08)03804-4. PubMed PMID: 19216924. 95. Nagel-Steger L, Demeler B, Meyer-Zaika W, Hochdörffer K, Schrader T, Willbold D. Modulation of aggregate size- and shape-distributions of the amyloid-β peptide by a designed β-sheet breaker. European Biophysics Journal. 2010;39(3):415-22. doi: 10.1007/s00249-009-0416-2. 96. Whitmore L, Wallace BA. DICHROWEB, an online server for protein secondary structure analyses from circular dichroism spectroscopic data. Nucleic acids research. 2004;32(Web Server issue):W668-73. Epub 2004/06/25. doi: 10.1093/nar/gkh371. PubMed PMID: 15215473; PubMed Central PMCID: PMCPMC441509. 97. Schuck P, Perugini MA, Gonzales NR, Howlett GJ, Schubert D. Size-distribution analysis of proteins by analytical ultracentrifugation: strategies and application to model

118 systems. Biophysical journal. 2002;82(2):1096-111. Epub 2002/01/25. doi: 10.1016/s0006-3495(02)75469-6. PubMed PMID: 11806949; PubMed Central PMCID: PMCPMC1301916. 98. Laue TM SB, Ridgeway SL. Computer aided interpretation of analytical sedimentation data for proteins. SE H, editor1992.


Chapter III. Tandem B-repeats from Aap show reversible zinc-dependent assembly beyond dimer

Authors: Alexander E. Yarawsky1,2 and Andrew B. Herr2,3

Affiliations: 1 - Graduate Program in Molecular Genetics, Biochemistry & Microbiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA

2 - Division of Immunobiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA

3 - Division of Infectious Diseases, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA

Author Contributions: A.E.Y. collected data.

A.E.Y. and A.B.H. analyzed data, conceived experiments and directed the project.

A.E.Y. wrote this draft.

Funding: Work was performed using funding from R01-GM094363 and U19-AI070235 awarded to A.B.H. and the University of Cincinnati Graduate School Dean's Fellowship awarded to A.E.Y. (2018-2019 AY).


Abstract

The accumulation-associated protein (Aap) from Staphylococcus epidermidis is a critical factor for infection. The B-repeat superdomain of Aap, composed of 5 to 17 B-repeats, each containing a Zn2+-binding G5 domain and a spacer region, is responsible for Zn2+-dependent assembly leading to accumulation of bacteria during biofilm formation. The importance of this assembly has been demonstrated in biofilm formation assays, where chelation of Zn2+ prevents biofilm formation. Recently, we found functional amyloid-like structures composed of Aap within biofilms. In vitro, we have used a construct containing the first five and a half B-repeats (Brpt5.5) from Aap to examine amyloid fibril formation. While previous studies using minimal B-repeat constructs have described the formation of an antiparallel dimer in the presence of Zn2+, we sought to understand the initial assembly events leading up to amyloid formation using the more biologically relevant Brpt5.5 construct. To characterize these assembly events, we used analytical ultracentrifugation (AUC) to determine hydrodynamic parameters of each species and to perform linked equilibrium studies. Interestingly, Brpt5.5 assembles beyond the expected dimer, forming a novel tetramer. Linkage studies indicate 1-2 Zn2+ ions are bound during the tetramerization event. In characterizing the tetramer, we took advantage of this knowledge and searched for potential Zn2+-binding sites outside of the known sites involved in dimerization, and we probed surface regions of the dimer by chemical modification of tyrosine and arginine residues. Based on these results, we developed a Brpt5.5 mutant that was unable to form the tetramer species and was correspondingly unable to form the Zn2+-induced amyloid fibrils that wild-type Brpt5.5 forms. We complemented AUC data with circular dichroism and dynamic light scattering to propose models of the B-repeat assembly states prior to nucleation of amyloid fibrils. An improved understanding of the mechanistic details of tandem B-repeat assembly will pave the way for new therapeutic approaches to combat problematic staphylococcal infections.


Introduction

The human skin commensal, Staphylococcus epidermidis, has been referred to as the 'accidental pathogen' [1]. Its primary virulence factor is its ability to form biofilms on a variety of surfaces [2]. It is this feature that has allowed S. epidermidis to become a leading cause of hospital-acquired infections [3]. Biofilms are well-organized communities that confer mechanical and chemical resistance upon their populations [4]. Biofilm formation begins with attachment of bacteria to a biotic surface (e.g., corneocytes or a collagen-coated implant) or an abiotic surface (e.g., a catheter or artificial joint). Attachment is followed by accumulation of bacteria, mediated by protein-protein interactions and/or secretion of an extracellular polysaccharide. As the biofilm matures, its characteristic three-dimensional structure takes shape. Eventually, shedding or dispersal of bacteria from the biofilm occurs, allowing biofilm formation (and infection) to occur elsewhere [2].

One of the key determinants of biofilm formation, specifically in the context of infection, is the accumulation-associated protein (Aap) [5-8]. This protein is anchored to the peptidoglycan layer of the bacterial cell wall at its C-terminus. Starting at the N-terminus, furthest from the cell wall, Aap contains a series of short A-repeats, followed by a putative lectin domain flanked by a proteolytic cleavage site on either side, the B-repeat superdomain containing up to 17 B-repeats composed of Zn2+-binding G5 domains and spacer regions, and, lastly, a highly extended proline/glycine-rich stalk region [9]. Studies have shown that while the A-repeats and/or lectin domain are required for Aap's role in attachment to a surface, removal of these regions via cleavage by SepA or other proteases is required for the accumulation of bacteria in the biofilm via the B-repeat region [10, 11]. Biophysical studies and X-ray crystallography have been performed on multiple minimal B-repeat constructs containing one and a half B-repeats (Brpt1.5). These studies have shown that B-repeats are highly extended, rich in β-sheet and random coil secondary structure, and monomeric in the absence of Zn2+ [8, 12, 13]. When Zn2+ (or, to some extent, Cu2+) is present, Brpt1.5 dimerizes in a mostly overlapping, anti-parallel fashion with no observable change in secondary structure. In the crystal structure, one Zn2+ ion is bound to the G5 domain and interacts with both protomers [14, 15]. The residues involved in Zn2+ binding have been identified by crystallography and mutagenesis. Although the B-repeats are 89-100% identical, there exist two variations of B-repeats. These two variations differ in a set of eight residues in the G5 domain, which are located near the Zn2+-binding site, dimer interface, and hydrophobic "stack" in the Brpt1.5 dimer structure. Interestingly, the B-repeats with the less common variation (termed the variant repeats, as opposed to the consensus repeats) show weaker Zn2+-dependent dimerization, but higher thermal stability, in Brpt1.5 constructs [12].

While the Zn2+-dependent assembly of Brpt1.5 constructs has been well explored, Aap is believed to require at least 5 B-repeats to support biofilm formation, given results observed in its S. aureus ortholog, SasG [16]. Our lab has previously sought to characterize the Zn2+-dependent assembly of longer, more biologically relevant constructs (namely a Brpt5.5 construct), and we identified the formation of larger species leading up to amyloid fibril formation. The presence of amyloid fibrils in S. epidermidis biofilms was demonstrated, and we also showed that the fibrils are composed primarily of proteolytically processed Aap. These fibrils confer on the biofilm resistance to DTPA, a Zn2+ chelator that can prevent biofilm formation when introduced prior to accumulation but has no effect on mature biofilms. In the course of that work, we found that Brpt5.5 assembled beyond dimer before forming large, irreversible amyloid-like aggregates.

In this report, we focus on characterizing the initial, reversible assemblies formed by Brpt5.5 in the presence of Zn2+. First, we perform detailed analyses of analytical ultracentrifugation data to demonstrate the formation of the expected dimer and a novel tetramer, the latter of which is not observed with Brpt1.5 constructs. By analysis of the linked equilibria between Zn2+-binding and Zn2+-mediated B-repeat assembly, we report the number of Zn2+ ions bound upon dimerization, which is consistent with the 1-2 Zn2+ ions per G5 domain we have reported for Brpt1.5 and Brpt2.5 dimerization. This strongly suggests that the mechanism of dimerization is consistent between the larger Brpt5.5 construct and the Brpt1.5 construct, for which crystal structures have been solved. Interestingly, the tetramer requires the addition of 1-2 Zn2+ ions per dimer. This suggested that there were Zn2+-binding sites beyond those we had observed in crystallography studies. We performed chemical modification of tyrosine and arginine residues to narrow our search for the tetramer interface and additional Zn2+-binding sites. Mutagenesis of a single histidine in the spacer region of each B-repeat completely abolished tetramer formation. Using data from dynamic light scattering and circular dichroism, we characterized a temperature-dependent conformational change in the presence of Zn2+ that correlated with rapid aggregation. The tetramer-negative mutant did not undergo the conformational change or the subsequent aggregation, indicating that this higher-order assembly could be a critical step in amyloidogenesis. Finally, we propose models of the dimer and tetramer assembly states formed by Brpt5.5. An understanding of the mechanism of assembly of longer, more biologically relevant B-repeat constructs will provide important details for inhibiting amyloidogenesis in these biofilms.


Results

Brpt5.5 exhibits monomer-dimer-tetramer equilibrium

To begin investigating the assembly of Brpt5.5, we performed sedimentation velocity analytical ultracentrifugation (AUC) experiments at increasing ZnCl2 concentrations and a constant Brpt5.5 concentration (Figure 1A). As expected based on the dimerization of shorter B-repeat constructs [8, 12, 14, 15, 17], we observed a shift in the sedimentation coefficient as the ZnCl2 concentration was increased. Figure 1B shows the relationship between the weight-averaged sedimentation coefficient (sw) and the Zn2+ concentration. There appear to be two separate sigmoidal transitions, with the first midpoint near 3 mM ZnCl2 and the second near 5 mM ZnCl2. This would indicate that three species participate in this equilibrium, not just a monomer and dimer. Beyond 8 mM ZnCl2, there is significant loss of protein due to aggregation, but no further shift in sw.

In order to better define these three species, a sedimentation equilibrium AUC experiment was performed at 3 mM ZnCl2, where all species should be populated to an observable degree, based on the sedimentation velocity data. A global fit was performed on samples at three different protein concentrations, all in 3 mM ZnCl2, and the data were best fit by a monomer-dimer-tetramer (1-2-4) equilibrium. Figure 1C shows the raw data and species fits for the middle-concentration sample, 1.8 µM (0.15 mg/ml), with residuals shown in the upper plot. These are the first data to confirm a novel tetramer state. Using the fitted association constants, a species plot (Figure 1D) was produced. It is evident from the overlapping populations that it will not be possible to selectively populate the dimer species without contamination from monomer or tetramer. Species plots at other Zn2+ concentrations show a similar trend (data not shown). Therefore, biophysical characterization of the dimer species might require special considerations.
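The species plot described above follows from mass balance once the association constants are known. The following is a minimal sketch, assuming overall association constants defined as K12 = [D]/[M]^2 and K14 = [T]/[M]^4 in molar units. The function name and the numerical values of K12 and K14 are illustrative placeholders, not the fitted constants from this study.

```python
def species_fractions(c_total, k12, k14, tol=1e-12):
    """Weight fractions (monomer, dimer, tetramer) for a 1-2-4 equilibrium.

    c_total: total protein concentration in monomer units (M)
    k12: overall monomer-dimer association constant, [D]/[M]^2 (M^-1)
    k14: overall monomer-tetramer association constant, [T]/[M]^4 (M^-3)
    """
    def mass_balance(m):
        # Total monomer-equivalent concentration minus the target total
        return m + 2.0 * k12 * m**2 + 4.0 * k14 * m**4 - c_total

    # Bisection: mass_balance increases monotonically with m on [0, c_total]
    lo, hi = 0.0, c_total
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mass_balance(mid) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo <= tol * c_total:
            break
    m = 0.5 * (lo + hi)
    return (m / c_total,
            2.0 * k12 * m**2 / c_total,
            4.0 * k14 * m**4 / c_total)


# Hypothetical constants chosen only so that all three species are populated
K12_EXAMPLE = 1.0e6   # M^-1 (assumed)
K14_EXAMPLE = 1.0e18  # M^-3 (assumed)

for c in (1.0e-7, 1.0e-6, 1.0e-5):
    mono, dimer, tet = species_fractions(c, K12_EXAMPLE, K14_EXAMPLE)
    print(f"{c:.1e} M  monomer={mono:.3f}  dimer={dimer:.3f}  tetramer={tet:.3f}")
```

With constants of this kind, the dimer fraction passes through a maximum while monomer and tetramer remain present, which is the overlap that prevents isolating the dimer.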


Figure 1. Brpt5.5 exhibits a monomer-dimer-tetramer equilibrium in the presence of Zn2+. (A) Wide Distribution Analysis (WDA) of 0.50 mg/ml Brpt5.5 in the presence of increasing ZnCl2 concentrations. (B) The weight-averaged sedimentation coefficient (sw) at increasing ZnCl2 concentrations. Data plotted were analyzed by separate programs, as designated in the key. Panel (C) shows a representative sedimentation equilibrium AUC dataset at 3 mM Zn2+, 0.15 mg/ml Brpt5.5, at 13000 rpm. This dataset is part of a global fit of 6 or more curves (at least 3 protein concentrations and at least 3 speeds). Empty circles are raw absorbance data at 236 nm, and the solid grey line represents the best fit, with residuals shown in the upper plot. Individual species are represented by lines in black (monomer), green (dimer) and red (tetramer). Panel (D) shows the distribution of each species based on Brpt5.5 concentration (13 µM = 0.50 mg/ml) and is calculated from the determined association constants at 3 mM Zn2+. The x-axis extends until saturation of monomer or tetramer.


Analysis of linked equilibria indicates a similar mechanism of Brpt5.5 dimerization to shorter constructs

We have previously analyzed the linked equilibria between Zn2+-binding and Zn2+-mediated Brpt1.5 and Brpt2.5 dimerization, finding good agreement with the X-ray crystal structure of the Brpt1.5 dimer (no structural data are available for Brpt2.5) [8, 14, 17]. Sedimentation equilibrium AUC experiments were performed with Brpt5.5 at fifteen ZnCl2 concentrations, and the dimerization and tetramer assembly constants were calculated. For the dimerization assembly constant, K12, the slope of the Wyman Plot (also referred to as a log-log plot) indicated that 9.3 (±1.3) Zn2+ ions are bound upon dimerization (Figure 2A). Given that there are 6 G5 domains in the Brpt5.5 construct (the half-repeat is a G5 domain), we compared the number of Zn2+ ions per G5 domain to that of Brpt1.5 and Brpt2.5, which were previously published [8, 17]. For Brpt5.5, this corresponds to 1.6 Zn2+ ions bound per G5 domain, consistent with the 1-2 Zn2+ ions per G5 domain previously reported (Figure 2D).
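The linkage analysis rests on the relation that the slope of log K versus log [Zn2+] gives the net number of Zn2+ ions taken up in the assembly step. The sketch below shows that calculation as an ordinary least-squares fit. The data points are synthetic and noise-free, generated with a slope of 9.3 purely for illustration; the intercept, the concentration grid, and the function names are assumptions rather than values or methods taken from this study.

```python
import math


def wyman_slope(zn_molar, log_k):
    """Least-squares slope and intercept of log10(K) versus log10([Zn2+])."""
    x = [math.log10(c) for c in zn_molar]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(log_k) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, log_k))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def zn_per_g5(delta_zn, n_g5_domains):
    """Normalize the number of linked Zn2+ ions by the number of G5 domains."""
    return delta_zn / n_g5_domains


# Synthetic example: log K12 rising with a slope of 9.3 (hypothetical intercept)
zn = [2.0e-3, 2.5e-3, 3.0e-3, 3.5e-3, 4.0e-3]
log_k12 = [9.3 * math.log10(c) + 33.0 for c in zn]

slope, intercept = wyman_slope(zn, log_k12)
print(f"slope (delta Zn) = {slope:.2f}")
print(f"Zn2+ per G5 domain = {zn_per_g5(9.3, 6):.2f}")
```

Dividing the reported 9.3 ions by the 6 G5 domains of Brpt5.5 gives 1.55, which rounds to the 1.6 ions per G5 domain quoted above.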

Formation of the tetramer requires additional Zn2+ ions

We then produced a Wyman Plot for the overall tetramerization constant, K14 (Figure 2B), determined in the linked equilibria analysis, along with the more easily interpretable dimer-tetramer assembly constant, K24 (Figure 2C), which shows larger error bars due to the conversion from K14 to K24 resulting in the smaller absolute Y-axis values. The slope of the K24 Wyman Plot indicates 1-2 Zn2+ ions bound upon formation of the tetramer from two dimers.
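The conversion between the overall and stepwise constants can be written out explicitly. The relation below assumes that K12 and K14 are overall association constants (monomer to dimer and monomer to tetramer) expressed on the same concentration scale; this definition is an assumption made for illustration and is not stated in the text.

```latex
K_{12} = \frac{[D]}{[M]^{2}}, \qquad
K_{14} = \frac{[T]}{[M]^{4}}, \qquad
K_{24} = \frac{[T]}{[D]^{2}} = \frac{K_{14}}{K_{12}^{2}}
\;\;\Rightarrow\;\;
\log K_{24} = \log K_{14} - 2\log K_{12}

\frac{\partial \log K_{24}}{\partial \log [\mathrm{Zn^{2+}}]}
  = \Delta\mathrm{Zn}_{1\to 4} - 2\,\Delta\mathrm{Zn}_{1\to 2}
```

Under this assumption, the values reported in Table 1 at 3.50 mM ZnCl2 would give log K24 ≈ 30.46 - 2(10.41) = 9.64. Because K24 is obtained as a difference of two fitted quantities, their uncertainties combine, which is in line with the larger error bars noted for the K24 plot.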


While we have previously reported that there is no change in the secondary structure upon Brpt1.5 dimerization [8], we tested for the presence of any changes that might occur in the secondary structure of Brpt5.5 upon assembly. In the presence of 5 mM ZnCl2, where Brpt5.5 should exist as a mix of dimer and tetramer, there was little to no change in the secondary structure by CD (Supplemental Figure 1).


Figure 2. Analysis of linked equilibria reveals the number of Zn2+ ions bound during each assembly event. Panel (A) shows the Wyman Plot of linked equilibria for the dimer. The slope of the linear regression indicates the number of Zn2+ ions bound during dimerization (ΔZn = 9.3 ± 1.3). The Wyman Plot for the tetramer is shown based on the overall monomer-tetramer association constant (K14) in panel (B) and the dimer-tetramer association constant (K24) in panel (C). (D) A comparison of the ΔZn values per G5 domain for Brpt5.5 (determined in this study), Brpt2.5 and Brpt1.5 (previously published by Conrady, et al. 2008 [8]).


Chemical modification and sequence mutation to define tetramer assembly

Further characterization of this novel tetramer could provide useful insights into the amyloidogenesis pathway of Aap. To probe for the dimer surfaces important in tetramer formation, we chemically modified residues which we expected to be outside of the dimer interface, based on the Brpt1.5 dimer structure solved by X-ray crystallography (Figure 3) [14]. Because our analysis of linked equilibria suggested a similar dimerization mechanism across these B-repeat constructs, we expect this to be a useful representation of the repeating units of the Brpt5.5 dimer. We performed chemical modifications of tyrosines and arginines. Both residues are completely conserved across all B-repeats of Brpt5.5 (and all 12.5 B-repeats of Aap in RP62A).

Chemical modification of all tyrosine residues (Figure 3, orange residues) resulted in a significant decrease in the sedimentation coefficient in the presence of Zn2+, compared to that of unmodified Brpt5.5 (Figure 4A). Modification of arginine residues (Figure 3, purple), which flank the tyrosine residues on the face opposite the Zn2+-binding site involved in dimerization, resulted in a slightly weaker shift toward lower sedimentation coefficients. Modification of both types of residues resulted in a sedimentation profile very similar to that of tyrosine modification alone. Nonetheless, modification of tyrosines and arginines significantly inhibits tetramer formation. Based on the location of these residues, it seems likely that the tetramer is formed via a side-by-side mechanism as opposed to an end-to-end mechanism.

[Figure 3 image; residues labeled in the figure: H85, D87, E100, D122, Y126]

Figure 3. Chemical modification targets and potential Zn2+-binding residues are highlighted on a structure of Brpt1.5 (PDB: 4FUN). Tyrosine residues are colored orange, arginine residues are colored magenta, and hypothesized Zn2+-binding residues important in tetramer formation are colored red. The bottom left inset shows higher detail in the region within the black square.


After determining that additional Zn2+ ions are required for tetramer formation, we began searching for another Zn2+-binding site. Figure 3 (inset) shows our hypothesized Zn2+-binding site, based on similar residues and orientations observed in the known dimer Zn2+-binding site [14]. The location of these residues near Y126, whose chemical modification inhibited tetramer formation (Figure 4A), was a promising piece of evidence. In addition, residues H85, D87, and D122 are completely conserved across all spacer regions in Brpt5.5 and in the 12.5 B-repeats of Aap from S. epidermidis RP62A. We chose the histidine in position 85 (H85) of each B-repeat spacer region for further investigation.

To test our hypothesis that H85 is involved in the Zn2+-binding site of tetramerization, we produced a mutant containing an H85A mutation in all spacer regions of Brpt5.5 (i.e., H85A, H213A, H341A, H469A, and H597A), which we will refer to as Brpt5.5 H85A for simplicity. Note that there are only 5 spacer regions, while there are 6 G5 domains in Brpt5.5, because the half-repeat is composed of only a G5 domain. The H85A mutant appeared similar in secondary structure to the native Brpt5.5 construct by CD (Supplemental Figure 2A) and had a slight shift in the monomer sedimentation coefficient, from 2.20 s* for wild-type (Figure 4B, solid line) to 2.26 s* for H85A (Figure 4B, dashed line), while retaining the high frictional ratio characteristic of folded B-repeats (wild-type = 3.46 and H85A = 3.46). Further, sedimentation velocity experiments in the presence of Zn2+ indicated limited assembly of Brpt5.5 H85A, and there was no indication of aggregation around 8 mM ZnCl2 like that observed with Brpt5.5 wild-type. The distribution of H85A + 8.00 mM ZnCl2 closely resembles the distribution of the tyrosine and arginine + tyrosine chemical modification samples, suggesting that perhaps in both cases, the tetramer is unable to form.

To further investigate this hypothesis, equilibrium AUC experiments were performed with Brpt5.5 wild-type and H85A in the presence of Zn2+. Raw data and species fits, displayed in Figure 4C, indicated there was very little change in the dimerization constants, as evidenced by the overlap in the data and fits. Figure 4D shows a condition capable of producing mostly tetramer in the wild-type construct, but there was no tetramer detectable for the H85A mutant. Global fitting produced logK12 = 10.41 (9.98 - 10.85) and logK14 = 30.46 (29.88 - 31.07) for wild-type and logK12 = 10.22 (9.92 - 10.54) for H85A, with no significant improvement in the fit when incorporating a tetramer species. We are therefore inclined to believe that H85 in the spacer region is absolutely critical for tetramer formation, while not playing a role in dimerization.

Figure 4. Chemical modifications and H85A mutations inhibit tetramer formation. Panels (A) and (B) show sedimentation velocity AUC data analyzed by WDA. (A) Chemical modification of Tyr, Arg, or both residues reduces assembly. (B) Mutating all H85 positions to alanine results in decreased assembly. (C) and (D) By sedimentation equilibrium AUC, Brpt5.5 H85A has dimerization characteristics similar to wild-type, but no tetramer was observed for H85A. Empty circles are raw absorbance data (WT = black, H85A = grey), which were normalized to 1.0 across (C) and (D). Solid lines represent total and species fits for Brpt5.5 WT (total = black), while dashed lines represent fits for H85A (total = grey). Monomer is shown in black, labeled with a 1. Dimer is shown in green, while tetramer is shown in red.

Sample                          logK12                  logK14
Brpt5.5 WT + 3.50 mM ZnCl2      10.41 (9.98 - 10.85)    30.46 (29.88 - 31.07)
Brpt5.5 H85A + 3.50 mM ZnCl2    10.22 (9.92 - 10.54)    -

Table 1. Measured equilibrium constants from sedimentation equilibrium experiments shown in Figure 4C and 4D.

Tetramer assembly is required for Zn2+-dependent amyloidogenesis

Because we did not observe aggregation at high Zn2+ concentrations during initial characterization of the Brpt5.5 H85A construct, we were interested in testing the ability of Brpt5.5 H85A to form Zn2+-induced amyloid fibrils. With wild-type Brpt5.5, we observe a major change in the circular dichroism (CD) spectrum at ~225 nm as temperature is increased (Figure 5A). The temperature at which this change occurs depends heavily on the amount of Zn2+ present. Interestingly, Brpt5.5 H85A under the same conditions appears to simply unfold as the temperature is increased (Figure 5B). The strong minimum observed near 40°C for wild-type is likely representative of a major rearrangement or twisting of β-sheets [18] into a nucleating species on the pathway to amyloidogenesis, as the signal is quickly lost due to aggregates settling in the cuvette or scattering light. We have previously shown that Brpt5.5 under these conditions forms fibers that can be detected by the anti-amyloid OC antibody [See Dissertation Chapter II]. We evaluated the ability of Brpt5.5 H85A to form Zn2+-induced aggregates via two other methods as well. We monitored light scattering as Zn2+ was titrated into a cuvette of Brpt5.5 wild-type or Brpt5.5 H85A (Figure 5C), and we observed a sigmoidal transition with a midpoint near 16 mM ZnCl2 for wild-type. For Brpt5.5 H85A, the apparent midpoint is around 43 mM ZnCl2; however, no sigmoidal transition is observed in this case, and we do not have a well-defined "maximum" signal. The solubility of ZnCl2 is severely limited at high concentrations, but this does not cause significant turbidity at 400 nm (data not shown). Based on these data alone, we cannot say with certainty whether the observed turbidity is related to a very weak propensity for amyloid-like aggregation or reflects non-ordered or non-specific aggregation.

[Figure 5 panel titles: WT + 3.50 mM Zn; H85A + 3.50 mM Zn]

Figure 5. Inhibiting tetramer formation results in weaker aggregation propensity. Panel (A) shows a significant change in the Brpt5.5 CD signal at ~40°C in the presence of Zn2+, whereas the H85A mutant does not show this behavior. Panel (B) examines the turbidity of a Brpt5.5 WT or H85A sample upon Zn2+ additions. The black, filled circles show raw data for WT, while the solid line is the fit using a 4-parameter logistic curve. The horizontal dashed line is the turbidity at the WT EC50 (Turbidity = 0.218). Vertical dashed lines drop from the intersections with the horizontal dashed line to show the EC50 along the x-axis, which is plotted on a log scale. Panels (C) and (D) show the Rh measured by DLS (black, filled circles) overlaid with CD data collected at a single wavelength (red, filled circles; wavelength specified on right y-axis).
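The turbidity titration is described by a four-parameter logistic curve, and the apparent midpoint for H85A is read off at the turbidity corresponding to the wild-type EC50. The sketch below shows one common parameterization of that curve and its inverse. The function names, the bottom and top plateaus, and the Hill slope are hypothetical; only the 16 mM midpoint and the half-maximal turbidity of 0.218 are taken from the text and the Figure 5 caption.

```python
def four_pl(x, bottom, top, ec50, hill):
    """Four-parameter logistic response at concentration x (x > 0)."""
    return bottom + (top - bottom) / (1.0 + (ec50 / x) ** hill)


def x_at_response(y, bottom, top, ec50, hill):
    """Invert the curve: concentration at which the response equals y."""
    if not (min(bottom, top) < y < max(bottom, top)):
        raise ValueError("y must lie strictly between bottom and top")
    return ec50 * ((y - bottom) / (top - y)) ** (1.0 / hill)


# Hypothetical wild-type parameters: plateaus chosen so that the half-maximal
# turbidity equals 0.218 at an EC50 of 16 mM; the Hill slope is assumed.
WT = dict(bottom=0.0, top=0.436, ec50=16.0, hill=4.0)

half_max = four_pl(WT["ec50"], **WT)
print(f"turbidity at EC50 = {half_max:.3f}")
print(f"[ZnCl2] at that turbidity = {x_at_response(half_max, **WT):.1f} mM")
```

For a dataset that never reaches a plateau, as described for H85A, the top parameter is poorly constrained, so a concentration read off at a fixed turbidity is best treated as an apparent midpoint rather than a fitted EC50.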

We used a third, complementary technique to further clarify our observations. Dynamic light scattering (DLS) is a hydrodynamic technique useful for measuring particle size, and it can be used to monitor aggregation as the temperature is increased. This approach, in parallel with CD measurements, provides two separate perspectives: secondary structure changes and aggregation. In good agreement with our CD observations (Figure 5A), we observe aggregation of Brpt5.5 wild-type near 37°C, consistent with the appearance of the ~225 nm minimum by CD (Figure 5C). In the case of Brpt5.5 H85A, there is instead a significant decrease in the hydrodynamic radius (Rh), which is mirrored by the unfolding to random coil observed by CD near 50°C (Figure 5D). Due to the highly elongated nature of the B-repeats, unfolding of Brpt5.5 H85A to a random coil is expected to produce a decrease in Rh, contrary to the unfolding of a compact, globular protein. We therefore conclude that formation of the tetramer is critically linked to Zn2+-induced B-repeat amyloidogenesis.


Discussion

Based on the results presented in this study, we can propose models for the Brpt5.5 dimer and tetramer assembly. Linked equilibria studies suggested a consistent mechanism of dimerization among minimal Brpt1.5 constructs, a Brpt2.5 construct, and the Brpt5.5 construct presented here. The number of Zn2+ ions bound upon dimerization was found to be a consistent 1-2 Zn2+ ions per G5 domain. X-ray crystal structures of Brpt1.5 [14] show an "overlapping," anti-parallel dimer, in which there are sufficient Zn2+-dependent contacts to accommodate the appropriate number of Zn2+ ions. We propose that Brpt5.5 assembles into a similar "overlapping" dimer (Figure 6 middle, center) as opposed to a more offset dimer (Figure 6 middle, left). In the case of Brpt5.5 from Aap of S. epidermidis RP62A, this is also reasonable given the identity of the B-repeats. For example, all consensus (high Zn2+-affinity) B-repeats can make contact, while the variant (low Zn2+-affinity) B-repeat overlaps with the half-repeat cap.

Due to the number of B-repeats, one could imagine a variety of orientations or configurations for the tetramer (a dimer of dimers). In Figure 6, we evaluate the plausibility of different configurations of each assembly state. Based on our hydrodynamic data from sedimentation velocity AUC experiments, the frictional ratio decreases from monomer to tetramer for wild-type Brpt5.5. Because we cannot isolate the dimer using wild-type Brpt5.5, we cannot accurately estimate the frictional ratio of this species. However, using the H85A mutant, which dimerizes similarly, we can saturate the dimer population, giving a frictional ratio smaller than that of the monomer. The "overlapping" dimer we proposed based on linkage studies would indeed exhibit a smaller frictional ratio than the monomer, as it is essentially twice as thick in the z-direction but similar along the other two axes. In contrast, if the protomers were not overlapping but more offset from each other, a higher frictional ratio would be expected, since the length in the x-direction would increase significantly. More importantly, there would not be enough Zn2+-binding sites in contact to satisfy the 1-2 Zn2+ ions per G5 domain.
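The comparison of frictional ratios between assembly states depends on how f/f0 is derived from the sedimentation coefficient. The following is a minimal sketch, assuming the Svedberg equation and an anhydrous equivalent sphere for f0. The partial specific volume, the solvent density and viscosity (approximating water at 20°C), and the 80 kDa mass in the example are illustrative assumptions, not values from this study, and analysis software may apply hydration or other corrections that this sketch omits.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def frictional_ratio(mass_g_per_mol, s_svedberg,
                     vbar=0.73, rho=0.9982, eta=0.01002):
    """Estimate f/f0 from molar mass and sedimentation coefficient (CGS units).

    vbar: partial specific volume (mL/g), assumed typical protein value
    rho:  solvent density (g/mL), assumed water at 20 C
    eta:  solvent viscosity (poise), assumed water at 20 C
    """
    s = s_svedberg * 1e-13  # Svedberg units to seconds
    # Svedberg equation: s = M (1 - vbar * rho) / (N_A * f)
    f = mass_g_per_mol * (1.0 - vbar * rho) / (AVOGADRO * s)
    # Radius of an anhydrous sphere with the same mass and vbar
    r0 = (3.0 * mass_g_per_mol * vbar / (4.0 * math.pi * AVOGADRO)) ** (1.0 / 3.0)
    f0 = 6.0 * math.pi * eta * r0  # Stokes' law for the equivalent sphere
    return f / f0


# Hypothetical monomer mass; the sedimentation coefficients are illustrative
MONOMER_MASS = 80000.0  # g/mol (assumed)
print(frictional_ratio(MONOMER_MASS, 2.2))
print(frictional_ratio(2 * MONOMER_MASS, 4.0))
```

In this treatment f/f0 scales as M^(2/3)/s, so a dimer shows a lower frictional ratio than its monomer only if its sedimentation coefficient exceeds 2^(2/3), or about 1.59 times, that of the monomer.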

From the overlapping dimer to the wild-type tetramer, there is another decrease in the frictional ratio. If dimers attached end-to-end to form the tetramer, there would be a significant extension along the x-axis, while the other axes would be unchanged. This would result in a much higher frictional ratio for the tetramer compared to the dimer. Similarly, a "top-to-bottom" dimer of dimers would also result in a higher frictional ratio, again due to the extension along one axis, the y-axis in this case. A third option for the tetramer is a side-by-side dimer of dimers. This configuration would, in fact, yield a lower frictional ratio like that observed, due to the extension along the z-direction, which offsets the highly extended x-coordinate. The bottom rightmost option, a tilted side-by-side dimer of dimers, could also be plausible, as we only observed the addition of 1-2 Zn2+ ions upon formation of the tetramer. These latter two configurations are also logical in a biological sense, as adjacent cells with cell wall-anchored Aap extending outward from the cell surface might interact with neighboring Aap molecules.


Figure 6. Models of tandem B-repeat reversible assembly. The B-repeat identity is described as "Consensus" or "Variant" according to Shelton, et al. 2017 [12]. The biophysical data from this study allow us to eliminate several models. However, we will require additional data to distinguish between the two bottom right tetramer configurations.


This study presents the first data demonstrating that tandem B-repeats from Aap exhibit a monomer-dimer-tetramer equilibrium. Our previous demonstration of the ability of the B-repeats of Aap to form a functional amyloid in the presence of Zn2+ was an early indication that tandem B-repeats (beyond one and a half B-repeats) behave differently in some ways than shorter B-repeat constructs. Importantly, we have now begun to understand the pathway of Zn2+-induced amyloidogenesis using biophysical analyses of Brpt5.5, a construct which represents the expected minimum number of B-repeats required for biofilm formation. Indeed, by inhibiting the formation of the tetramer via a set of point mutations in a predicted Zn2+-binding site, Zn2+-induced aggregation was inhibited.


Materials and methods

Protein expression and purification

Brpt5.5 cloning, expression, and purification were described previously [See Dissertation Chapter II]. The Brpt5.5 H85A mutant was produced via the Agilent QuikChange II Site-Directed Mutagenesis Kit. Mutated residues include H85, H213, H341, H469, and H597, all of which were mutated to alanine. Brpt5.5 H85A was purified using the same procedures as wild-type.

Analytical ultracentrifugation

A Beckman Coulter XL-I analytical ultracentrifuge was used for AUC experiments. For sedimentation velocity experiments, two-sector epon-charcoal 1.2 cm centerpieces were used with sapphire windows. Data were collected via absorbance optics (interference optics in the case of chemical modification experiments) at 48 k rpm at 20°C in an An-60 Ti. Experiments were run overnight, usually around 20 hours. Data were analyzed using SEDFIT's continuous c(s) distribution model [19], SEDANAL's wide distribution analysis (WDA) [20], or DCDT+ version 2.4.3 by John Philo [21, 22].

Sedimentation equilibrium experiments were performed using protein dialyzed into the specified ZnCl2 concentration in 50 mM MOPS pH 7.2, 50 mM NaCl. After dialysis, protein concentrations were adjusted to approximately 0.50, 0.15, and 0.05 mg/ml and samples were loaded into a six-channel 1.2 cm centerpiece. Samples were centrifuged at 10 k, 13 k, 17 k, 24 k, and 37 k rpm for 24 hours each, which provided ample time for equilibration of the monomer species to occur at each speed. Raw data were trimmed using WinReedit V0.999 and then fit using WinNonlin V1.080. Data from at least three speeds and three loading concentrations were used for analysis of each Zn2+ concentration. Partial specific volumes, buffer densities, and buffer viscosities were estimated using SEDNTERP [23].

Analysis of linked equilibria

Experiments were designed and analyzed based on analysis of linked equilibria for Brpt1.5 and Brpt2.5, discussed elsewhere [8, 17]. Datasets were collected at ZnCl2 concentrations of 1.50, 2.00, 2.25, 2.50, 2.75, 3.00, 3.25, 3.50, 3.75, 4.00, 4.25, 4.50, 5.00, 5.50 and 6.00 mM, but 1.50 and 6.00 mM ZnCl2 were excluded due to the lack of sufficient dimer or tetramer species, which prevented accurate measurements of both logK12 and logK14 within WinNonlin. In other words, only datasets which produced both logK12 and logK14 measurements were used.

To determine the number of ligand molecules bound or released upon a ligand-dependent equilibrium event, the following equation can be used [24]:

$$\frac{\partial \log K}{\partial \log [Y]} = \Delta Y$$

Here, the number of ligand molecules bound or released is represented by ΔY. This is dependent on the association constant, K, for a given equilibrium event occurring in the presence of a given ligand concentration, [Y] in molar units. Thus, this equation was applied to either the dimerization event (using logK12 from WinNonlin) or the tetramerization event (using logK14 from WinNonlin). In both cases, the slope of a logK vs log[Zn2+] plot was determined by linear regression, yielding the number of Zn2+ ions bound during the assembly event.
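The regression step described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the analysis pipeline used in this work. The function name `linkage_slope` and the logK values are hypothetical; the values are synthetic placeholders generated with a slope of exactly 2 so that the result can be checked by eye.

```python
import math

def linkage_slope(zn_molar, log_k):
    """Least-squares slope and intercept of logK versus log10[Zn2+].

    By the linkage relation, the slope estimates the number of Zn2+ ions
    taken up (positive) or released (negative) in the assembly step.
    """
    x = [math.log10(c) for c in zn_molar]
    n = len(x)
    if n < 2 or n != len(log_k):
        raise ValueError("need matching inputs with at least two points")
    x_mean = sum(x) / n
    y_mean = sum(log_k) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, log_k))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

# Synthetic placeholder data (NOT measured values): logK built with slope 2.
zn_mM = [2.00, 2.50, 3.00, 3.50, 4.00, 4.50, 5.00]
zn_M = [c / 1000.0 for c in zn_mM]  # the relation uses molar units
synthetic_log_k = [10.0 + 2.0 * math.log10(c) for c in zn_M]

slope, intercept = linkage_slope(zn_M, synthetic_log_k)
print(f"delta Y (Zn2+ ions linked to assembly) = {slope:.2f}")
```

The same function would be applied once to the dimerization constants and once to the tetramerization constants.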


To convert the logK14 values to logK24 values, the following equation was used:

$$\log K_{14} - 2\log K_{12} = \log K_{24}$$

Where logK14 is the overall association constant for formation of the tetramer from the monomer, logK12 is the (stepwise) association constant for the formation of the dimer from the monomer, and logK24 is the (stepwise) association constant for the formation of the tetramer from the dimer.
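The relationship between the three constants can be traced back to their mass-action definitions. The sketch below assumes all constants are expressed on the same molar concentration scale; the symbols M, D and T for monomer, dimer and tetramer are labels introduced here for illustration.

```latex
K_{12} = \frac{[D]}{[M]^2}, \qquad
K_{14} = \frac{[T]}{[M]^4}, \qquad
K_{24} = \frac{[T]}{[D]^2}
       = \frac{[T]}{[M]^4}\cdot\frac{[M]^4}{[D]^2}
       = \frac{K_{14}}{K_{12}^{2}}
\quad\Longrightarrow\quad
\log K_{24} = \log K_{14} - 2\log K_{12}
```

Under these definitions, two dimerization steps are consumed for every tetramer formed, which is why the dimerization constant enters squared.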

Chemical modification

Arginine chemical modification was performed using p-hydroxyphenylglyoxal, HPG (G-Biosciences). HPG was dissolved in water at 100 mM. Brpt5.5 at 1 mg/ml was dialyzed into 50 mM MOPS pH 7.2, 50 mM NaCl overnight, then diluted to 0.50 mg/ml in a solution containing 20 mM HPG (final concentration). The reaction was incubated at room temperature for 3 hours; the AUC cells were then loaded and the samples were allowed to equilibrate at 20°C for 1 hour before starting the sedimentation velocity experiment. The Zn2+-containing samples had 5 mM Zn2+ added before loading the AUC cells. The unmodified sample was treated identically, but with addition of water instead of HPG.

Tyrosine modification was performed using a final concentration of 2 mM 1-N-acetylimidazole (Sigma-Aldrich). Brpt5.5 was prepared in the same way as with arginine modification. 1-N-acetylimidazole was dissolved in water at 100 mM. The reaction was incubated at room temperature and protected from light for 2 hours before adding 5 mM ZnCl2 and performing the sedimentation velocity analysis. For the double modification, both HPG and 1-N-acetylimidazole were added at 20 mM and 2 mM final concentrations, respectively. The reaction was protected from light and incubated at room temperature for 2 hours before performing AUC.

Circular dichroism

Circular dichroism experiments were performed on an Aviv 215 CD spectrophotometer equipped with an Aviv Peltier junction temperature control system and using a 0.5 mm quartz cuvette (Hellma Analytics). For temperature wavelength scans, a single wavelength scan was recorded at 10°C intervals from 20°C to 90°C and back to 20°C. The cuvette was not removed in between scans, and a macro was used to perform the scans. Protein samples were at 0.50 mg/ml (6 µM) and were dialyzed into 50 mM MOPS pH 7.2, 50 mM NaCl, then ZnCl2 was added to a final concentration of 5.00 mM before loading the cuvette. To convert the machine units, θ, to mean residue ellipticity, [θ], Eq. (1) was used. Mean residue weight, MRW, was calculated for Brpt5.5 and Brpt5.5 H85A separately, l was 0.05 cm, and concentration, c, was in mg/ml.

$$[\theta] = \frac{\theta \times \mathrm{MRW}}{10 \times l \times c} \qquad (1)$$
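As a worked illustration of this conversion, the sketch below assumes the instrument reports θ in millidegrees and that MRW is the molecular weight divided by the number of peptide bonds (N - 1), which is the usual convention. The function names, the molecular weight, the residue count and the raw reading are all hypothetical placeholders, not values from this work.

```python
def mean_residue_weight(molecular_weight_da: float, n_residues: int) -> float:
    """MRW = molecular weight divided by the number of peptide bonds (N - 1)."""
    return molecular_weight_da / (n_residues - 1)

def mean_residue_ellipticity(theta_mdeg: float, mrw: float,
                             path_cm: float, conc_mg_ml: float) -> float:
    """Mean residue ellipticity in deg cm^2 dmol^-1.

    theta_mdeg: raw signal, assumed to be in millidegrees
    path_cm:    cuvette path length in cm
    conc_mg_ml: protein concentration in mg/ml
    """
    return (theta_mdeg * mrw) / (10.0 * path_cm * conc_mg_ml)

# Hypothetical protein size and raw reading, for illustration only.
mrw = mean_residue_weight(80_000.0, 721)
value = mean_residue_ellipticity(-5.0, mrw, path_cm=0.05, conc_mg_ml=0.50)
print(f"MRW = {mrw:.1f} Da, [theta] = {value:.0f} deg cm^2 dmol^-1")
```

The path length and concentration used in the call match the 0.5 mm cuvette and 0.50 mg/ml samples described above.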

Where thermal denaturation experiments were performed, the protein concentration was 1.00 mg/ml and 3.50 mM ZnCl2 was dialyzed into the solution. A temperature equilibration time of 2 minutes, averaging time of 3 seconds, and 0.5°C interval were used, which resulted in a similar time duration compared to the DLS experiments.


Turbidity assay

A BioMate 3S UV-Vis Spectrophotometer was used to record the absorbance (or light scattering) at 280, 400, and 700 nm. Protein concentrations were confirmed using the 280 nm absorbance reading before Zn2+ additions. A 200 µl sample of 0.50 mg/ml (6 µM) Brpt5.5 or Brpt5.5 H85A, which had been dialyzed in 50 mM MOPS pH 7.2, 50 mM NaCl, was added to a quartz microcuvette. 500 mM ZnCl2 was titrated in 1 µl additions, with gentle shaking of the cuvette between each addition and reading. The EC50 was determined by fitting the WT data to a 4 parameter logistic curve in SigmaPlot (systatsoftware.com), while the EC50 of the H85A data was estimated as the Zn2+ concentration where turbidity was equal to the turbidity at the WT EC50.
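The titration arithmetic and the matched-turbidity estimate can be sketched as below. This is only an illustration under stated assumptions: the Hill-type form of the 4 parameter logistic is one common parameterization and may differ from the one SigmaPlot uses, linear interpolation between readings is an assumed way of locating the matched turbidity, and all curve parameters and the "H85A" readings are synthetic placeholders.

```python
def zn_after_additions(n_additions, stock_mM=500.0, sample_ul=200.0, step_ul=1.0):
    """Total ZnCl2 concentration (mM) after n additions, accounting for dilution."""
    added = n_additions * step_ul
    return stock_mM * added / (sample_ul + added)

def four_param_logistic(x, bottom, top, ec50, hill):
    """Hill-type 4 parameter logistic; equals (bottom + top) / 2 at x = ec50."""
    if x <= 0.0:
        return bottom  # limit as x approaches 0 for hill > 0
    return bottom + (top - bottom) / (1.0 + (ec50 / x) ** hill)

def concentration_at_turbidity(conc, turbidity, target):
    """Linearly interpolate the first concentration where turbidity reaches target."""
    points = list(zip(conc, turbidity))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 > y0:
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    return None  # target never reached within the titration

# Zn2+ concentration after each of 30 additions (0 = before any addition).
zn = [zn_after_additions(n) for n in range(31)]

# Hypothetical WT fit parameters (bottom, top, EC50 in mM, Hill slope).
wt_fit = (0.0, 1.0, 5.0, 4.0)
turbidity_at_wt_ec50 = four_param_logistic(wt_fit[2], *wt_fit)

# Synthetic "H85A" readings generated from a right-shifted curve.
h85a = [four_param_logistic(c, 0.0, 1.0, 9.0, 4.0) for c in zn]
matched = concentration_at_turbidity(zn, h85a, turbidity_at_wt_ec50)
print(f"first addition gives {zn[1]:.2f} mM; matched turbidity near {matched:.1f} mM")
```

Because each 1 µl addition raises the Zn2+ concentration by roughly 2.5 mM, the interpolated value is only as precise as the spacing of the titration points.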

Dynamic light scattering

To follow temperature-induced aggregation, 200 µl of protein dialyzed into 50 mM MOPS pH 7.2, 50 mM NaCl, 3.50 mM ZnCl2 was filtered and added to a quartz microcuvette. Temperature experiments were performed on a Malvern Zen 3600 Zetasizer Nano, using 1°C intervals with 120 seconds of equilibration time, and 2 measurements at each temperature with an automatic measurement duration.


Supplementary Figures

Figure S1. Circular dichroism of Brpt5.5 shows little change in secondary structure along the monomer-dimer-tetramer equilibrium.


Figure S2. (A) There is little change in secondary structure between Brpt5.5 WT and H85A in the absence of Zn2+. Additionally, in the presence of Zn2+, H85A also does not display significant changes in the CD scans. Error bars are not shown, but they do not suggest significance between any two datasets. In panel (B), H85A exhibits decreased thermodynamic stability compared to WT under the same conditions.


References

1. Otto M. Staphylococcus epidermidis--the 'accidental' pathogen. Nature reviews Microbiology. 2009;7(8):555-67. doi: 10.1038/nrmicro2182. PubMed PMID: 19609257; PubMed Central PMCID: PMC2807625.
2. Otto M. Staphylococcal biofilms. Current topics in microbiology and immunology. 2008;322:207-28. Epub 2008/05/06. PubMed PMID: 18453278; PubMed Central PMCID: PMC2777538.
3. CDC. National Nosocomial Infections Surveillance (NNIS) system report.
4. Costerton JW, Stewart PS, Greenberg EP. Bacterial biofilms: a common cause of persistent infections. Science (New York, NY). 1999;284(5418):1318-22. Epub 1999/05/21. PubMed PMID: 10334980.
5. Hussain M, Herrmann M, von Eiff C, Perdreau-Remington F, Peters G. A 140-kilodalton extracellular protein is essential for the accumulation of Staphylococcus epidermidis strains on surfaces. Infection and immunity. 1997;65(2):519-24. Epub 1997/02/01. PubMed PMID: 9009307; PubMed Central PMCID: PMC176090.
6. Rohde H, Burandt EC, Siemssen N, Frommelt L, Burdelski C, Wurster S, et al. Polysaccharide intercellular adhesin or protein factors in biofilm accumulation of Staphylococcus epidermidis and Staphylococcus aureus isolated from prosthetic hip and knee joint infections. Biomaterials. 2007;28(9):1711-20. doi: 10.1016/j.biomaterials.2006.11.046. PubMed PMID: 17187854.
7. Schaeffer CR, Woods KM, Longo GM, Kiedrowski MR, Paharik AE, Buttner H, et al. Accumulation-associated protein enhances Staphylococcus epidermidis biofilm formation under dynamic conditions and is required for infection in a rat catheter model. Infection and immunity. 2015;83(1):214-26. Epub 2014/10/22. doi: 10.1128/iai.02177-14. PubMed PMID: 25332125; PubMed Central PMCID: PMC4288872.
8. Conrady DG, Brescia CC, Horii K, Weiss AA, Hassett DJ, Herr AB. A zinc-dependent adhesion module is responsible for intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2008;105(49):19456-61. doi: 10.1073/pnas.0807717105. PubMed PMID: 19047636; PubMed Central PMCID: PMC2592360.
9. Yarawsky AE, English LR, Whitten ST, Herr AB. The Proline/Glycine-Rich Region of the Biofilm Adhesion Protein Aap Forms an Extended Stalk that Resists Compaction. Journal of molecular biology. 2017;429(2):261-79. Epub 2016/11/29. doi: 10.1016/j.jmb.2016.11.017. PubMed PMID: 27890783.
10. Paharik AE, Kotasinska M, Both A, Hoang TN, Buttner H, Roy P, et al. The metalloprotease SepA governs processing of accumulation-associated protein and shapes intercellular adhesive surface properties in Staphylococcus epidermidis. Molecular microbiology. 2016. Epub 2016/12/21. doi: 10.1111/mmi.13594. PubMed PMID: 27997732.
11. Rohde H, Burdelski C, Bartscht K, Hussain M, Buck F, Horstkotte MA, et al. Induction of Staphylococcus epidermidis biofilm formation via proteolytic processing of the accumulation-associated protein by staphylococcal and host proteases. Molecular microbiology. 2005;55(6):1883-95. doi: 10.1111/j.1365-2958.2005.04515.x. PubMed PMID: 15752207.
12. Shelton CL, Conrady DG, Herr AB. Functional consequences of B-repeat sequence variation in the staphylococcal biofilm protein Aap: deciphering the assembly code. Biochemical Journal. 2017;474(3):427-43. doi: 10.1042/bcj20160675.
13. Gruszka DT, Wojdyla JA, Bingham RJ, Turkenburg JP, Manfield IW, Steward A, et al. Staphylococcal biofilm-forming protein has a contiguous rod-like structure. Proceedings of the National Academy of Sciences of the United States of America. 2012;109(17):E1011-8. Epub 2012/04/12. doi: 10.1073/pnas.1119456109. PubMed PMID: 22493247; PubMed Central PMCID: PMC3340054.
14. Conrady DG, Wilson JJ, Herr AB. Structural basis for Zn2+-dependent intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2013;110(3):E202-11. Epub 2013/01/02. doi: 10.1073/pnas.1208134110. PubMed PMID: 23277549; PubMed Central PMCID: PMC3549106.
15. Chaton CT, Herr AB. Defining the metal specificity of a multifunctional biofilm adhesion protein. Protein science : a publication of the Protein Society. 2017. Epub 2017/07/15. doi: 10.1002/pro.3232. PubMed PMID: 28707417.
16. Corrigan RM, Rigby D, Handley P, Foster TJ. The role of Staphylococcus aureus surface protein SasG in adherence and biofilm formation. Microbiology (Reading, England). 2007;153(Pt 8):2435-46. Epub 2007/07/31. doi: 10.1099/mic.0.2007/006676-0. PubMed PMID: 17660408.
17. Herr AB, Conrady DG. Thermodynamic Analysis of Metal Ion-Induced Protein Assembly. Methods in Enzymology. 2011;488:101-21. doi: http://dx.doi.org/10.1016/B978-0-12-381268-1.00005-7.
18. Sreerama N, Woody RW. Structural composition of βI- and βII-proteins. Protein Science. 2003;12(2):384-8. doi: 10.1110/ps.0235003.
19. Schuck P. Size-distribution analysis of macromolecules by sedimentation velocity ultracentrifugation and Lamm equation modeling. Biophysical journal. 2000;78(3):1606-19. Epub 2000/02/29. doi: 10.1016/s0006-3495(00)76713-0. PubMed PMID: 10692345; PubMed Central PMCID: PMC1300758.
20. Stafford WF, Braswell EH. Sedimentation velocity, multi-speed method for analyzing polydisperse solutions. Biophysical chemistry. 2004;108(1-3):273-9. Epub 2004/03/27. doi: 10.1016/j.bpc.2003.10.027. PubMed PMID: 15043935.
21. Philo JS. Improved methods for fitting sedimentation coefficient distributions derived by time-derivative techniques. Analytical biochemistry. 2006;354(2):238-46. Epub 2006/05/30. doi: 10.1016/j.ab.2006.04.053. PubMed PMID: 16730633.
22. Stafford WF, 3rd. Boundary analysis in sedimentation transport experiments: a procedure for obtaining sedimentation coefficient distributions using the time derivative of the concentration profile. Analytical biochemistry. 1992;203(2):295-301. Epub 1992/06/01. PubMed PMID: 1416025.
23. Laue TM, Shah BD, Ridgeway TM, Pelletier SL. Computer-aided interpretation of sedimentation data for proteins. In: Harding SE, Rowe AJ, Horton JC, editors. Analytical Ultracentrifugation in Biochemistry and Polymer Science: Royal Society of Chemistry, London; 1992. p. 90-125.
24. Wyman J, Gill SJ. Binding and linkage: functional chemistry of biological macromolecules. Mill Valley, Calif: University Science Books; 1990.


Chapter IV. Defining the basis of the interaction between Sbp and the B-repeats of Aap in Staphylococcus epidermidis biofilms

Authors: Alexander E. Yarawsky1,2, Andrea L. Ori2,3 and Andrew B. Herr2,4

Affiliations: 1 - Graduate Program in Molecular Genetics, Biochemistry & Microbiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA

2 - Division of Immunobiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA

3 - Medical Sciences Baccalaureate Program, University of Cincinnati, Cincinnati, OH 45267, USA

4 - Division of Infectious Diseases, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA

Author Contributions: A.E.Y. and A.L.O. collected and analyzed Sbp, Brpt1.5 and Brpt5.5 CD and AUC data, and performed biofilm formation assays.

A.E.Y. collected and analyzed Brpt5.5 H85A and retroSbp CD, AUC, turbidity-related data and TEM images.

A.E.Y. collected Sbp DLS data.

A.E.Y. and A.B.H. conceived experiments and directed the project.

A.E.Y. wrote this draft.

Funding: Work was performed using funding from R01GM094363 and U19 AI070235 awarded to A.B.H. and the University of Cincinnati Graduate School Dean's Fellowship awarded to A.E.Y. (2018-2019 AY).


Abstract

Small basic protein, Sbp, from Staphylococcus epidermidis was identified through affinity chromatography by loading crude biofilm mixture onto a column of Aap B-repeat-linked sepharose beads. Sbp was eluted from the column, indicating there is some interaction between the two proteins. However, other reports have been unable to observe any interaction between Sbp and minimal B-repeat constructs containing one and a half B-repeats (Brpt1.5) via isothermal titration calorimetry (ITC), size exclusion chromatography (SEC), and native mass spectrometry (native-MS). We have recently demonstrated that a construct containing five and a half B-repeats (Brpt5.5) assembles in a Zn-dependent manner to form a novel tetramer assembly state followed by formation of a functional amyloid-like aggregate. Using both minimal and longer tandem B-repeat constructs, we investigated the possibility of Sbp-B-repeat interactions for ourselves. Interestingly, we observed aggregation between Sbp and our Brpt5.5 construct in the presence of Zn under buffer conditions where Sbp is only partially folded. Furthermore, Sbp caused Brpt5.5 to undergo Zn-induced aggregation at nearly 10-fold lower Zn concentrations, and both proteins were present in the aggregate, which resembled functional amyloid fibrils by transmission electron microscopy (TEM). While some details of the mechanism of interaction are still elusive, we utilize mutant constructs of Brpt5.5 and Sbp to show Sbp has no effect on the Brpt5.5 monomer or dimer, and the interaction is specific in nature, not a nonspecific charge-based effect.

We demonstrated the biological importance of Sbp in Zn-dependent biofilm formation assays, where exogenously added Sbp reduced the Zn concentration required to support biofilm formation.


Introduction

Staphylococcus epidermidis is a leading cause of hospital-acquired infections in the US [1]. This human commensal is protective and can suppress colonization by pathogenic microbes on our skin. The primary virulence factor of this species is its remarkable ability to form biofilms, well-organized bacterial communities, on biotic or abiotic surfaces [2, 3]. Because of the prevalence of S. epidermidis on human skin, medical devices and implants can become contaminated upon insertion, resulting in infections that are very difficult to treat [3].

Biofilm formation is initiated by attachment of bacteria to a surface. The following phase of biofilm formation is the accumulation of bacteria through extracellular polysaccharide secretion and/or protein-dependent interactions [4]. The accumulation-associated protein (Aap) has been shown to be a critical factor for S. epidermidis infection in a rat catheter model, whereas the icaADBC operon responsible for extracellular polysaccharide production was dispensable [5]. We have recently demonstrated that Aap forms functional amyloid fibrils in S. epidermidis biofilms [See Dissertation Chapter II].

Aap is a large, multifunctional protein that is anchored to the cell wall of S. epidermidis via its LPXTG Sortase A anchoring motif. A proline/glycine-rich region upstream of the LPXTG motif is highly extended and resistant to compaction [6]. This region likely contributes to the overall extension of Aap's functional regions into the extracellular space, including the B-repeat superdomain, located upstream of the proline/glycine-rich region, which forms anti-parallel, Zn-dependent assemblies that result in attachment of individual cells during accumulation [7, 8]. When exposed to high Zn concentrations, the B-repeats of Aap can form amyloid fibrils [See Dissertation Chapter II]. Here, we demonstrate that in the presence of Sbp, the formation of these amyloid fibrils is much more favorable, requiring nearly 10-fold less Zn in vitro.

Upstream of the B-repeat superdomain is one of two SepA protease cleavage sites, a putative lectin domain, the other SepA cleavage site, and a series of short repeats known as A-repeats. Cleavage by SepA or other proteases is required for biofilm formation to occur, presumably to remove the A-repeats and/or lectin domain, which may sterically inhibit B-repeat assembly [9, 10].

Sbp was first identified in 2015 by Decker, et al. [11], when crude biofilm mixture was passed over B-repeat-coupled sepharose beads. Sbp localized mostly to the cell wall fraction and was shown to play a role in both attachment and accumulation through extracellular polysaccharide and Aap pathways. Wang, et al. [12] more recently demonstrated that Sbp is partially folded and can form amyloid fibrils. However, they did not observe any direct interactions between Sbp and a Brpt1.5 construct via ITC or size exclusion chromatography, nor were any reported in a separate dissertation [13].

In this report, we make use of a more biologically relevant Brpt5.5 construct to evaluate interactions with Sbp. We confirm results by Wang, et al. [12] that Sbp is partially folded in solution and monomeric; however, we expand on these findings by showing a strong salt-dependence of folding, where higher NaCl concentrations allow Sbp to fully fold and compact. Interestingly, we see that Brpt5.5 and Sbp aggregate in the presence of Zn and form amyloid fibrils in solutions containing only 50 mM NaCl, where Sbp is partially folded and relatively expanded. This suggests Sbp's partially folded nature is important for its function as a nucleator protein for Brpt5.5 amyloidogenesis. We utilized a Brpt5.5 mutant which is unable to form a tetramer and subsequent amyloid fibrils, in order to better probe this interaction. We find that Sbp enhances Brpt5.5 wild-type assembly in terms of the amount of Zn that is required to populate the aggregated states. However, there is no effect on the Brpt5.5 H85A mutant, which is limited to a monomer-dimer equilibrium. We hypothesize that Sbp is crucial to the ability of Aap to form amyloid fibrils under physiologically relevant Zn concentrations. While the exact mechanism of Sbp-Aap B-repeat interaction is still not clear, our data suggest Sbp may interact with the Brpt5.5 tetramer or a nucleating species important for initiating B-repeat amyloidogenesis.


Results

The secondary structure of Sbp is strongly dependent on electrostatic interactions

Secondary structure predictions of Sbp suggest 53% random coil, 31% β-sheet, and 16% α-helix (data not shown). Interestingly, we observed very different circular dichroism (CD) signals when performing wavelength scans at different NaCl concentrations (Figure 1A). For example, Sbp showed 15% α-helix at 50 mM NaCl, but 30% α-helix at 1 M NaCl. Conversely, there was a ~10% loss of β-sheet and ~5% loss of random coil going from the lower to higher NaCl concentrations (Figure 1B). We pursued this interesting result by performing thermal denaturation experiments at each NaCl concentration. The temperature at which half of the protein molecules are unfolded (melting temperature, Tm) increased from 20°C to 37°C with increasing NaCl (Figure 1C, Table 1). Together, these data indicate that Sbp becomes more folded and has higher stability when more NaCl is present. This would seem to suggest that there are repulsive electrostatic interactions which prevent folding of certain regions, and that Sbp will not fully fold until those interactions are screened. In order to verify the relevance of the presumed fully folded state observed at >500 mM NaCl, we performed CD in the presence of 3 M TMAO (Figure 1D). TMAO is a stabilizing osmolyte which destabilizes the denatured/unfolded state, thereby allowing the native, folded state to be occupied [14, 15]. The wavelength scans in 3 M TMAO resemble the high NaCl wavelength scan (Figure 1A) at 20°C and lower NaCl scans at 4°C, suggesting the more well-folded state we observe is not being artificially induced by NaCl. Not surprisingly, TMAO was able to maintain the folded state even at 37°C.


Figure 1. Sbp is partially folded under standard conditions. Panel (A) shows CD wavelength scans of Sbp in the presence of 50 mM NaCl (black), 500 mM NaCl (grey), or 1 M NaCl (blue). (B) shows the secondary structure content predicted by the CDSSTR algorithm [16] of DichroWeb [17]. Low [NaCl] represents the 50 mM NaCl wavelength scan in (A), while High [NaCl] represents the 1 M NaCl scan in (A). Panel (C) shows the temperature dependence of folding of Sbp in the same conditions as (A). Thermodynamic parameters from these data are shown in Table 1. Wavelength scans performed in the presence of 3 M TMAO at 4, 20 and 37°C are shown in panel (D).

Sample          Tm (°C)       ΔH (kcal/mol)
50 mM NaCl      20.1 ± 1.0    -38.5 ± 8.1
500 mM NaCl     30.9 ± 0.5    -46.5 ± 5.4
1 M NaCl        36.6 ± 0.4    -41.4 ± 3.5

Table 1. Thermodynamic parameters calculated from thermal denaturation experiments shown in Figure 1C.


Sbp undergoes compaction upon electrostatic screening

Following the CD results showing increased secondary structure with increasing NaCl, we were interested in determining whether or not this would affect the tertiary structure of Sbp. Analytical ultracentrifugation (AUC) and dynamic light scattering (DLS) were used to measure changes in the frictional ratio and hydrodynamic radius with NaCl concentration. By AUC, we observed a shift in the sedimentation coefficient, despite the fact that Sbp remained monomeric. The frictional ratio (f/f0), however, shifted from ~1.6 at 50 mM NaCl to ~1.3 at 1 M NaCl (Figure 2A, Table 2). A higher f/f0 indicates a more elongated or asymmetrical shape, whereas a lower f/f0 indicates a more globular shape.

The DLS data agree well with the AUC data, where significant compaction indicated by the drop in hydrodynamic radius occurs between 50 mM and 500 mM NaCl (Figure 2B).

Based on the CD, AUC, and DLS data, Sbp is becoming more compact as it gains secondary structure. While this is not entirely surprising, it is unexpected to observe this behavior over such a wide range of NaCl concentrations. The cause for this behavior is possibly the high positive net charge of Sbp, which could result in electrostatic repulsion along the protein chain at lower NaCl concentrations.

Probing for Sbp:B-repeat interactions at low and high NaCl concentrations

Others have attempted to observe direct interactions between Sbp and Brpt1.5 constructs, but have not found any evidence of interactions using purified, recombinant proteins [12, 13]. We tested for interactions by AUC using Sbp and two different Brpt1.5 constructs. Brpt1.511,13 contains B-repeat 11 and 13 (the C-terminal half-repeat) from Aap from S. epidermidis RP62A. These B-repeats, along with the majority of other B-repeats (termed the 'consensus repeats'), contain a set of residues which confer capacity for Zn-dependent assembly. Brpt1.58,13* contains B-repeat 8, which has a different set of residues (observed in the 'variant repeats') exhibiting weaker assembly but greater thermal stability. Brpt1.58,13* has a version of the C-terminal half-repeat cap modified to contain the set of residues consistent with the variant repeats [18]. In the case of Sbp + Brpt1.58,13* + Zn, there was not a significant shift in the weight-averaged sedimentation coefficient (Figure S1A). We therefore conclude there is no formation of a Sbp:Brpt1.58,13* complex. In the case of Brpt1.511,13, the presence of Sbp and Zn caused a slight shift in the weight-averaged sedimentation coefficient, but no material exceeded the sedimentation coefficient expected for the Brpt1.511,13 dimer (Figure S1B). Therefore, we cannot exclude the presence of an interaction, though these data alone are insufficient to conclude formation of a complex.

We have previously observed important differences in the assembly of Brpt1.5 constructs and larger constructs, such as Brpt3.5 and Brpt5.5, which can assemble into a tetramer and amyloid fibrils [See Dissertation Chapter II and Chapter III]. This prompted us to further investigate potential interactions between Sbp and Brpt5.5. By AUC, we observed no interactions between Brpt5.5 and Sbp at 50 mM NaCl (Figure 3A), where Sbp is less well-folded, or at 1 M NaCl (Figure 3B), where Sbp is fully folded, in the absence of Zn. We performed similar experiments by CD, but observed no significant changes in secondary structure upon mixing the two proteins. When Zn is present, Brpt5.5 self-assembles, but the Zn affinity is weak and the Zn-binding site is very sensitive to being screened by NaCl. We therefore expected to see Zn-dependent Brpt5.5 assembly at 50 mM NaCl and did not test at 1 M NaCl. Interestingly, when Sbp is also present, we observed rapid, visible aggregation (timescale of seconds) that resulted in essentially all material being sedimented before the first scan (Figure 3E). Also, we can infer that Sbp does not need to be fully folded and compacted in order to interact with Brpt5.5. Unfortunately, this aggregation prevented useful CD studies from being performed under these conditions.


Figure 2. Sbp requires high NaCl concentrations to become fully compacted. Sbp was examined by sedimentation velocity AUC in the presence of various NaCl concentrations, and the c(s) distributions are shown in (A). Hydrodynamic parameters are shown in Table 2. Panel (B) shows the measured hydrodynamic radius (Rh) of Sbp as NaCl is titrated into the protein solution.

[NaCl] (mM)   s20,w   f/f0
50            1.48    1.57
150           1.61    1.44
500           1.72    1.32
1000          1.70    1.29

Table 2. Hydrodynamic parameters from AUC experiments in Figure 2A.


Figure 3. Interactions between Sbp and Brpt5.5 require the presence of Zn. Panels (A) and (C) show AUC data for Sbp and/or Brpt5.5 in the absence of Zn, at either 50 mM NaCl (A) or 1 M NaCl (C). Similar samples were examined by CD (B and D) to evaluate any changes in secondary structure which might not be associated with assembly, but no difference was observed between the measured spectrum (Mixobs) and the expected spectrum (Mixpred). The expected spectrum is the sum of the individual "Sbp" and "Brpt5.5" scans. Panel (E) shows AUC data for Sbp, Brpt5.5, and a 4:1 molar ratio of the two proteins, all in the presence of 50 mM NaCl and 5 mM ZnCl2.


Sbp enhances Zn-dependent Brpt5.5 assembly

In order to gain more useful information regarding this aggregation event, we utilized simple Zn titrations in a tabletop UV/Vis spectrophotometer to measure light scattering. In the cuvette, we started with different molar ratios of Brpt5.5 and Sbp. We observed turbidity at over 10-fold lower Zn concentrations when Sbp was present at 1:3

(Brpt5.5:Sbp) or higher stoichiometry (Figure 4A). The increase in the maximum turbidity at increasing Sbp concentrations combined with SDS-PAGE of the aggregate indicated the presence of both proteins (Figure 4B). Plotting the change in the Zn concentration required to reach half-maximal turbidity against the molar ratio illustrates that the maximal effect occurs near a 1:3 stoichiometric molar ratio (Figure 4C). In systems that form a thermodynamically reversible complex, this would indicate the stoichiometry of the complex formed, however, given that this reaction likely is not thermodynamically reversible, we cannot confidently make assumptions regarding a stoichiometry based on these data. Nonetheless, this offers a sensible molar ratio to focus on future experiments.

In an attempt to measure the early oligomers formed, we monitored the turbidity of a 1:3 Brpt5.5:Sbp sample as Zn was titrated. At ~1 mM Zn, the sample reached approximately half-maximal turbidity (Figure 4A, black squares with cyan edge). We then performed a sedimentation velocity AUC experiment using a gravitational sweep method, in which the rotor speed is increased in steps to allow observation of a very wide range of sedimenting material (Figure 4D). Surprisingly, we were unable to observe the turbid material, and we did not observe significant sedimentation of the remaining material until 48,000 rpm was reached. While we cannot distinguish between


Figure 4. Sbp induces Brpt5.5 assembly and aggregation at lower Zn concentrations. In panel (A), the turbidity of Brpt5.5 or Brpt5.5 + Sbp at the specified molar ratios was monitored as Zn was titrated into the cuvette. Final samples were centrifuged to pellet the aggregated material, then the pellet was resuspended in SDS-PAGE loading dye. The material was then analyzed by SDS-PAGE (B); when present in the sample, Brpt5.5 and Sbp are both observed in the pellet. In (C), the change in EC50 values observed in (A) was plotted against the molar ratio of Sbp and Brpt5.5. The vertical dashed line indicates the molar ratio where the maximum shift is observed. A [1:3] sample was monitored during Zn titration to ~1 mM, where turbidity was approximately half-maximal (A, black square with cyan edge). This sample was then analyzed by gravitational sweep AUC using interference optics to increase temporal resolution (D, Brpt5.5 + Sbp + 1 mM Zn). The Brpt5.5 - Zn and Brpt5.5 + 1 mM Zn distributions are from separate experiments, but closely replicate protein and buffer conditions. Note that the Brpt5.5 + Sbp + 1 mM Zn data are normalized. Unfortunately, one cannot distinguish between Brpt5.5 and Sbp using these data. Nevertheless, there is an increase in the size of the species present, whereas there is no significant Brpt5.5 assembly without Sbp at 1 mM Zn (red line vs purple line).

signal corresponding to Sbp and Brpt5.5 individually, it is clear that some faster-sedimenting (larger) species are present. As shown, 1 mM ZnCl2 is insufficient to populate a detectable amount of the larger species without Sbp present.

Due to the strongly basic nature of Sbp (pI = 9.94) and the acidic nature of Brpt5.5 (pI = 4.37), we were curious whether this phenomenon was based strictly on nonspecific electrostatic interactions between the two proteins. Such interactions might increase the local concentration of Brpt5.5, reducing the apparent Zn concentration required for assembly and aggregation. We tested an unrelated basic protein, lysozyme (pI = 11.35), which had no effect on the turbidity behavior (Figure S2A). We also tested the effect of a crowding agent, PEG 8000, on the turbidity behavior of Brpt5.5. Interestingly, we observed a trend similar to that seen with Sbp (Figure S2B), but there was no increase in overall maximum turbidity like that observed with Brpt5.5 and Sbp, again supporting the conclusion that Sbp aggregates with Brpt5.5. In an additional attempt to understand this phenomenon, we expressed Sbp with a reversed sequence (retroSbp), which maintains parameters such as pI and charge patterning but alters some or all secondary structure elements. Interestingly, retroSbp was still able to induce turbidity at Zn concentrations similar to those for Sbp (Figure S3), although the protein itself was less stable and showed "background" levels of aggregation, as indicated by the baseline turbidity readings. Based on the predicted secondary structure of Sbp and retroSbp, the two sequences have three β-sheets and two α-helices in common, which may be of future interest in determining the exact mechanism of interaction.
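The rationale for the reversed-sequence control can be made concrete with a small Python sketch: reversing a sequence leaves the amino acid composition, and therefore any composition-based charge estimate, unchanged, while the order of residues along the chain is mirrored. The peptide below is hypothetical (it is not the Sbp sequence), and the charge estimate is a deliberately crude assumption that counts only Lys/Arg and Asp/Glu.

```python
from collections import Counter


def retro_sequence(seq):
    """Return the sequence read from C-terminus to N-terminus."""
    return seq[::-1]


def net_charge_estimate(seq):
    """Crude net charge near neutral pH: +1 per K/R, -1 per D/E (His ignored)."""
    counts = Counter(seq)
    return counts["K"] + counts["R"] - counts["D"] - counts["E"]


peptide = "MKKRDAEGKPL"  # hypothetical peptide, not the Sbp sequence
retro = retro_sequence(peptide)

# Composition and the composition-based charge estimate are identical
same_composition = Counter(peptide) == Counter(retro)
same_charge = net_charge_estimate(peptide) == net_charge_estimate(retro)
```

Because the backbone direction is reversed, secondary structure is not guaranteed to be preserved, which is why the retro construct separates charge-based effects from structure-based ones.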


Sbp cannot interact with the Brpt5.5 monomer or dimer

To better understand the mechanism of interaction, we utilized a previously characterized Brpt5.5 mutant, Brpt5.5 H85A, which cannot assemble beyond a dimer [See Brpt Assembly Article]. As a result, this construct does not form Zn-induced amyloid fibrils. Figure 5A shows its decreased aggregation propensity compared to wild-type Brpt5.5. Even when Sbp is present, H85A shows a clearly diminished aggregation propensity compared to wild-type Brpt5.5 in the presence of Sbp.

We then performed a series of sedimentation velocity AUC experiments at multiple ZnCl2 concentrations to determine whether Sbp affected the reversible assembly of Brpt5.5 H85A. The relationship between sw and Zn concentration was not altered by the presence of Sbp, suggesting that no interaction occurs (Figure 5B). Therefore, it seems likely that Sbp interacts with the tetramer or an early oligomeric species, such as a B-repeat nucleating species. This could also explain why there is no observed interaction between Sbp and Brpt1.5, which can only assemble to a dimer.


Figure 5. Sbp shows a weaker effect toward Brpt5.5 H85A aggregation and does not affect Brpt5.5 H85A assembly. (A) Turbidity of Brpt5.5 H85A + Sbp (green circles) shows different behavior than Brpt5.5 WT + Sbp (red circles). Sbp has a much weaker effect on Brpt5.5 H85A Zn-induced aggregation. Panel (B) shows the weight-average sedimentation coefficient from AUC experiments of Brpt5.5 H85A with and without Sbp across a range of Zn concentrations. SEDFIT c(s) distributions were loaded into GUSSI, and the part of the distribution that appeared in the range expected for Brpt5.5 H85A was integrated to obtain the sw value plotted in (B). The region of the distribution that resembled Sbp did not show major changes in sw that might have indicated Sbp assembly or Brpt5.5:Sbp complex formation.
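The integration described in the caption can be sketched in a few lines of Python. This is a minimal illustration of a signal-weighted average over a chosen range of a c(s) distribution using the trapezoid rule; it is not the GUSSI implementation, and the function name, grid, and values are hypothetical.

```python
def weight_average_s(s_values, c_values, s_min, s_max):
    """Signal-weighted average sedimentation coefficient over [s_min, s_max].

    Computes integral(s * c(s) ds) / integral(c(s) ds) by the trapezoid rule,
    using only the grid points that fall inside the chosen range.
    """
    pts = [(s, c) for s, c in zip(s_values, c_values) if s_min <= s <= s_max]
    if len(pts) < 2:
        raise ValueError("need at least two grid points in the range")
    num = den = 0.0
    for (s0, c0), (s1, c1) in zip(pts, pts[1:]):
        ds = s1 - s0
        num += 0.5 * (s0 * c0 + s1 * c1) * ds
        den += 0.5 * (c0 + c1) * ds
    if den == 0.0:
        raise ValueError("no signal in the integration range")
    return num / den


# Hypothetical distribution: a symmetric peak centred at 3.0 S
s_grid = [2.0, 2.5, 3.0, 3.5, 4.0]
c_s = [0.0, 1.0, 2.0, 1.0, 0.0]
sw = weight_average_s(s_grid, c_s, 2.0, 4.0)
```

Restricting the limits to the range expected for one component, as done for Brpt5.5 H85A here, reports how that component's average shifts with Zn without contributions from the other species.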


The ability for Sbp to reduce the Zn requirement for B-repeat assembly is biologically relevant

While we have demonstrated the effect of Sbp on B-repeat assembly and aggregation in vitro, it is important to evaluate this phenomenon in a biologically functional assay. We have previously utilized a biofilm formation assay to show that DTPA, a Zn chelator, can prevent S. epidermidis biofilm formation. Based on our in vitro observations, we hypothesized that when Sbp is present, the amount of Zn required for biofilm formation (via B-repeat assembly) would be decreased. We used constant additions of DTPA to create a baseline of no free Zn, an important step because trace amounts of Zn in the medium are sufficient to allow biofilm formation. Next, we added recombinant, purified Sbp (in addition to the endogenously expressed Sbp) and additional ZnCl2 to the wells.

At 0.5 µg/ml added Sbp, there was no effect on biofilm formation (Figure 6). Interestingly, in conditions containing 5 µg/ml added Sbp, biofilm formation occurred at lower Zn concentrations than when no additional Sbp was added. When a higher amount of Sbp was added (50 µg/ml), the effect was no longer observed. This observation at higher Sbp additions could be evidence that there is a direct interaction between Sbp and B-repeats, and that high Sbp concentrations can oversaturate binding sites such that the function is inhibited. A similar phenomenon was observed with the addition of soluble MBP-Brpt1.5 to biofilms, which could inhibit biofilm formation at higher concentrations by saturating cell wall-linked B-repeats with soluble MBP-Brpt1.5, preventing intercellular B-repeat interactions [7]. Another possible explanation is that Sbp is involved in the regulation of biofilm formation, and high expression (or addition) of Sbp signals unfavorable biofilm formation conditions. However, this seems unlikely based on data from Decker, et al. [11]. Nevertheless, our hypothesis, based on in vitro observations indicating that Sbp can reduce the concentration of Zn required for B-repeat assembly, appears to have functional relevance.


Figure 6. Sbp lowers the Zn required for biofilm formation. S. epidermidis RP62A biofilms were grown in 96-well plates, with the addition of Sbp, ZnCl2 and/or DTPA. The upper image shows wells after crystal violet staining, while the bar graph shows quantification of crystal violet absorbance at 520 nm.


Discussion

Sbp is a relatively newly discovered component of S. epidermidis biofilms. Since the initial identification of Sbp via its binding to a B-repeat-linked sepharose column [11], others have attempted to characterize its interaction with Brpt1.5, a minimal construct containing one and a half B-repeats [12, 13]. Here, we made use of a more biologically relevant Brpt5.5 construct, which we previously showed to have additional reversible assembly states (monomer-dimer-tetramer instead of monomer-dimer only) and functional amyloid formation [See Dissertation Chapter II and Chapter III]. We started by expanding on the unique NaCl-dependent folding behavior of Sbp, where electrostatic repulsion along the protein chain seems to prevent complete folding and compaction of Sbp at low NaCl concentrations. At higher NaCl concentrations, Sbp gains additional secondary structure and compacts into a more globular conformation. We propose that this partial degree of folding may actually be important for its ability to aggregate with Brpt5.5, given that we observe aggregation at low NaCl concentrations and at 37°C, where Sbp is partially unfolded. Also, disordered proteins and regions often undergo folding upon binding to a target protein [19].

We made use of a previously characterized Brpt5.5 mutant, Brpt5.5 H85A, which cannot assemble beyond a dimer. We performed a thorough analysis via sedimentation velocity AUC to show that the reversible assembly of Brpt5.5 H85A was unaffected by Sbp, and based on turbidity assays, Sbp had a limited effect on the Zn-induced aggregation propensity of Brpt5.5 H85A. This evidence strongly suggests that Sbp recognizes the tetramer or oligomeric states of Brpt5.5, possibly a nucleating species initiating aggregation and amyloidogenesis. We also produced a sequence-reversed version of Sbp, which was predicted to have some secondary structural overlap with native Sbp, as well as similar linear charge-patterning characteristics. retroSbp was indeed still able to induce Brpt5.5 turbidity similar to Sbp. This provides evidence that either the pattern of charge or similar secondary structure elements may be responsible for Sbp's ability to cause Brpt5.5 aggregation. Because the similarly charged lysozyme was unable to cause Brpt5.5 aggregation, the turbidity phenomenon appears to be somewhat specific, as opposed to acting in a completely nonspecific, charge-based manner.

To test the biological implications of our in vitro observations, we performed biofilm formation assays at a wide range of ZnCl2 concentrations. As expected based on Sbp's ability to cause Zn-induced aggregation of Brpt5.5 at lower Zn concentrations, we observed biofilm formation at significantly lower Zn concentrations when exogenous Sbp was added. When higher concentrations of Sbp were added, however, we observed a reduced effect. This observation at higher Sbp additions could be evidence that there is, in fact, a direct interaction between Sbp and B-repeats. Specifically, excess Sbp could be preventing the ability of Sbp to bridge B-repeat tetramers, perhaps by coating the tetramers to the point where there can no longer be tetramer-tetramer interactions. These observations imply that Sbp is an important co-factor in Zn-induced biofilm formation. Sbp could be a modulator for biofilm formation, possibly allowing biofilm formation in the absence of an immune response, which causes increases in local Zn concentrations previously proposed to be important in Aap-dependent biofilm formation [7, 20]. Sbp could possibly serve its primary role in S. epidermidis commensal biofilm formation, where there is no immune response (and thus, less Zn available), as opposed to the pathogenic biofilm formation occurring in hospital-acquired infections of medical devices and implants, where an immune response would release Zn into the local area. Additional support for this hypothesis comes from data showing that Sbp has little effect on S. epidermidis virulence in a rat catheter model [11]. Further studies using Δsbp and Δaap strains of S. epidermidis will be valuable in confirming the role of Sbp in Zn-induced Aap amyloidogenesis and biofilm formation.


Materials and Methods

Protein expression and cloning

The sbp gene was synthesized by IDT (Integrated DNA Technologies), containing an added TEV (Tobacco etch virus) protease cleavage site at the N-terminus of the protein sequence, which started at A27 and ran through the remainder of the protein sequence (UniProt accession no. Q5HRC3 - aa27-169). The first 26 residues are predicted to be the signal peptide and were therefore excluded. After TEV cleavage, an N-terminal glycine remains, which was taken into consideration in subsequent calculations. The Gateway cloning system (Invitrogen) was utilized for inserting the IDT gBlock into an entry vector and then into a pHisMBP-DEST vector which was provided by Dr. Artem Evdokimov. This destination vector contains an N-terminal MBP and 6xHis tag, both of which are removed by TEV protease cleavage. Protein was expressed in BLR(DE3) competent E. coli cells similar to previous methods [6].

Protein purification

Sbp was purified via a 5 mL Ni2+ HiTrap cartridge column (GE Healthcare). The binding and wash buffer contained 20 mM Tris pH 7.4, 500 mM NaCl, 5 mM imidazole, and the protein was eluted using a linear imidazole gradient to 1 M imidazole. Eluted protein was dialyzed into 20 mM Tris pH 7.4, 300 mM NaCl before adding TEV protease in the presence of 5 mM 2-mercaptoethanol until removal of His-MBP was complete.

The mixture was then run over a 1 mL SP XL ion exchange column (GE Healthcare), and an NaCl gradient was used to elute Sbp, while His-MBP and TEV did not bind the column. Sbp was stored at -80 °C.


Circular dichroism

Proteins were dialyzed thoroughly into the desired buffer conditions overnight, then the concentrations were measured based on predicted extinction coefficients at 280 nm. CD experiments were performed on an Aviv 215 CD spectrophotometer with an Aviv peltier junction temperature control system. A 0.5 mm quartz cuvette (Hellma Analytics) was used with Sbp concentrations of 0.33 mg/ml for NaCl-dependence datasets or 10 - 20 µM Sbp and 10 µM Brpt5.5 concentrations for datasets testing for interactions. Where data are plotted as mean residue ellipticity, [θ], Eq. (1) was used to convert from machine units, θ.

$$[\theta] = \frac{\theta \times \mathrm{MRW}}{10 \times l \times c} \qquad (1)$$

In Eq. (1), MRW is the mean residue weight, l is the pathlength, and c is the concentration in mg/ml units. Secondary structure analysis was performed using the CDSSTR algorithm [16] within the DichroWeb [17] server.
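For readers who want to apply the conversion numerically, a minimal Python sketch of the mean residue ellipticity calculation follows. It assumes the observed ellipticity is in millidegrees and the pathlength is in centimeters (neither unit is stated explicitly above), which yields [θ] in deg cm^2 dmol^-1; the function name and input values are hypothetical.

```python
def mean_residue_ellipticity(theta_mdeg, mrw, path_cm, conc_mg_ml):
    """Convert observed ellipticity to mean residue ellipticity.

    Assumes theta in millidegrees, pathlength in cm, and concentration in
    mg/ml, giving [theta] in deg cm^2 dmol^-1.
    """
    return (theta_mdeg * mrw) / (10.0 * path_cm * conc_mg_ml)


# Hypothetical reading: -10 mdeg, MRW of 110 g/mol per residue,
# 0.05 cm (0.5 mm) cuvette, 0.33 mg/ml protein
mre = mean_residue_ellipticity(-10.0, 110.0, 0.05, 0.33)
```

With these illustrative inputs the result is roughly -6.7 x 10^3 deg cm^2 dmol^-1.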

Analytical Ultracentrifugation

Experiments were performed on an XL-I analytical ultracentrifuge (Beckman Coulter) with absorbance and interference optics at 48,000 rpm at 20 °C in an An-60 Ti rotor. Two-sector epon-charcoal centerpieces and sapphire windows were used. Sedimentation velocity experiments were run overnight or until material was fully sedimented (except where back-diffusion became significant). SEDFIT's continuous c(s) distribution model was used for analysis of data and the frictional ratio was fitted [21].


Dynamic Light Scattering

Measurements of the hydrodynamic radius were performed on a Malvern Zen 3600 Zetasizer Nano with Sbp at 0.50 mg/ml dialyzed into 20 mM KPO4 pH 7.4, 50 mM NaCl. Sbp dialyzed into 20 mM KPO4 pH 7.4, 2 M NaCl was adjusted to 0.50 mg/ml protein concentration. Both solutions were filtered using a 0.2 µm syringe filter. Additions of the Sbp in 2 M NaCl were made to the cuvette containing Sbp in 50 mM NaCl. After each addition, triplicate readings were taken at 20 °C. The change in viscosity at each NaCl concentration was taken into consideration in the calculation of the final hydrodynamic radius, Rh, via Eq. 2:

$$R_h = \frac{kT}{6\pi\eta D} \qquad (2)$$

where k is the Boltzmann constant, T is temperature in Kelvin, η is the solvent viscosity, and D is the diffusion coefficient. The solvent viscosity was calculated by Sednterp [22].
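The Stokes-Einstein relation in Eq. 2 can be evaluated directly, as in the Python sketch below. The function name and the input values are hypothetical (they are not measurements from this work); SI units are assumed throughout, and the viscosity would in practice be the Sednterp-derived value for each NaCl concentration.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K


def hydrodynamic_radius(diff_coeff_m2_s, temp_k, viscosity_pa_s):
    """Stokes-Einstein relation: Rh = kT / (6 * pi * eta * D), in meters."""
    return BOLTZMANN * temp_k / (6.0 * math.pi * viscosity_pa_s * diff_coeff_m2_s)


# Hypothetical inputs: D = 8e-11 m^2/s at 20 °C (293.15 K) in a solvent
# with a water-like viscosity of 1.002e-3 Pa*s
rh_m = hydrodynamic_radius(8.0e-11, 293.15, 1.002e-3)
rh_nm = rh_m * 1e9
```

Because Rh scales inversely with η at a fixed D, an uncorrected viscosity increase at high NaCl would be misread as a change in protein size, which is why the correction matters for the titration described above.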

Turbidity assays

For each protein or protein mixture, 200 µl of dialyzed sample was loaded into a quartz microcuvette. Brpt5.5 was added at 0.50 mg/ml or 6 µM, and Sbp was added at the specified molar ratio. Data were recorded using a Thermo Scientific BioMate 3S UV-Visible Spectrophotometer at 280 nm to ensure accurate starting protein concentrations, as well as at 400 nm and 700 nm. The data presented were collected at 700 nm, where readings remained within the linear range and avoided interference from the absorbance of protein or buffer components. ZnCl2 was added in 0.5 µl aliquots, and the cuvette was gently shaken after each addition to ensure complete mixing. The final samples were centrifuged for 5 min at 17,000 x g to pellet all aggregated material. The solution was removed, then the pellet was resuspended in an equal volume of SDS-PAGE running buffer.
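Because each ZnCl2 addition slightly increases the sample volume, the total Zn concentration after a given number of aliquots can be tracked as in the Python sketch below. The 200 µl starting volume and 0.5 µl aliquot size are taken from the protocol above; the stock concentration is a hypothetical placeholder, since it is not stated in the text, and the function names are illustrative.

```python
def zn_after_additions(n_additions, stock_mM, aliquot_ul=0.5, start_volume_ul=200.0):
    """Total Zn concentration (mM) after n aliquots, accounting for dilution."""
    added_ul = n_additions * aliquot_ul
    return stock_mM * added_ul / (start_volume_ul + added_ul)


def dilution_factor(n_additions, aliquot_ul=0.5, start_volume_ul=200.0):
    """Fraction of the starting protein concentration left after n aliquots."""
    return start_volume_ul / (start_volume_ul + n_additions * aliquot_ul)


# Hypothetical 100 mM ZnCl2 stock (the actual stock concentration is not given)
conc_mM = zn_after_additions(4, 100.0)
protein_fraction = dilution_factor(4)
```

With small aliquots the protein dilution stays minor (about 1% after four additions in this example), so turbidity changes can be attributed mainly to the added Zn.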

Transmission electron microscopy

A Hitachi 7600 transmission electron microscope was used to image fibrils. The accelerating voltage was set to 80 kV. Images were captured via an AMT 2k CCD camera. A 5 µl aliquot of each turbidity sample (prior to centrifugation and SDS-PAGE) was placed on a 200 mesh formvar carbon/copper grid for 2 minutes. The grids were then washed with water, stained with 2% uranyl acetate for 30 seconds, then washed with water and dried on filter paper. The grids were allowed to dry for 24 hours before imaging.

Biofilm formation assays

S. epidermidis RP62A was cultured overnight at 37 °C on a TSA with Sheep Blood agar plate (Thermo Scientific R01202). A single colony was grown in TSB overnight, then diluted to 0.2 OD. To the wells of a 96-well plate (Corning 351172), 100 µL of the 0.2 OD culture was added. Then, 20 mM Tris pH 7.4, 150 mM NaCl buffer, Sbp, ZnCl2, and/or DTPA were added to the appropriate wells. Biofilms were allowed to grow overnight at 37 °C. The washing and crystal violet staining steps were performed as previously described [See Dissertation Chapter II].


Supplementary Figures


Figure S1. AUC data show no indication of Sbp:Brpt1.5 assembly. Experiments in panel (A) utilize the Brpt1.5_{8,13*} construct, which contains two variable G5 domains. This Brpt1.5 construct is unable to assemble in the presence of Zn. Panel (B) contains the Brpt1.5_{11,13} construct, composed of two consensus G5 domains. The Brpt1.5_{11,13} construct assembles in the presence of Zn, regardless of the presence of Sbp. We do not observe material with a sedimentation coefficient beyond what is expected for the Brpt1.5_{11,13} dimer (2.68 S) [18]. However, the weight-averaged sedimentation coefficient is slightly higher than expected based on the Sbp + Zn and Brpt1.5_{11,13} + Zn distributions (2.17 S observed compared to 1.92 S expected). This could indicate enhanced Brpt1.5 Zn-dependent dimerization in the presence of Sbp, but additional data would be required to make that conclusion.


Figure S2. Investigating the turbidity behavior of Brpt5.5. The increased propensity for Brpt5.5 to aggregate when Sbp is present cannot be replicated by the presence of lysozyme - another net-positively charged protein (A). In panel (B), addition of 20% PEG 8000 reduces the Zn requirement for aggregation by nearly 100-fold.

Figure S3. An Sbp construct with a reversed sequence (retroSbp) is still able to decrease the amount of Zn required for Brpt5.5 aggregation. There is significant aggregation of retroSbp present with the addition of Zn or Brpt5.5, as indicated by the increased baseline turbidity in the retroSbp data.


References

1. CDC. National Nosocomial Infections Surveillance (NNIS) system report.
2. Otto M. Staphylococcus epidermidis--the 'accidental' pathogen. Nature reviews Microbiology. 2009;7(8):555-67. doi: 10.1038/nrmicro2182. PubMed PMID: 19609257; PubMed Central PMCID: PMC2807625.
3. Costerton JW, Stewart PS, Greenberg EP. Bacterial biofilms: a common cause of persistent infections. Science (New York, NY). 1999;284(5418):1318-22. Epub 1999/05/21. PubMed PMID: 10334980.
4. Otto M. Staphylococcal biofilms. Current topics in microbiology and immunology. 2008;322:207-28. Epub 2008/05/06. PubMed PMID: 18453278; PubMed Central PMCID: PMC2777538.
5. Schaeffer CR, Woods KM, Longo GM, Kiedrowski MR, Paharik AE, Buttner H, et al. Accumulation-associated protein enhances Staphylococcus epidermidis biofilm formation under dynamic conditions and is required for infection in a rat catheter model. Infection and immunity. 2015;83(1):214-26. Epub 2014/10/22. doi: 10.1128/iai.02177-14. PubMed PMID: 25332125; PubMed Central PMCID: PMC4288872.
6. Yarawsky AE, English LR, Whitten ST, Herr AB. The Proline/Glycine-Rich Region of the Biofilm Adhesion Protein Aap Forms an Extended Stalk that Resists Compaction. Journal of molecular biology. 2017;429(2):261-79. Epub 2016/11/29. doi: 10.1016/j.jmb.2016.11.017. PubMed PMID: 27890783.
7. Conrady DG, Brescia CC, Horii K, Weiss AA, Hassett DJ, Herr AB. A zinc-dependent adhesion module is responsible for intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2008;105(49):19456-61. doi: 10.1073/pnas.0807717105. PubMed PMID: 19047636; PubMed Central PMCID: PMC2592360.
8. Conrady DG, Wilson JJ, Herr AB. Structural basis for Zn2+-dependent intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2013;110(3):E202-11. Epub 2013/01/02. doi: 10.1073/pnas.1208134110. PubMed PMID: 23277549; PubMed Central PMCID: PMC3549106.
9. Paharik AE, Kotasinska M, Both A, Hoang TN, Buttner H, Roy P, et al. The metalloprotease SepA governs processing of accumulation-associated protein and shapes intercellular adhesive surface properties in Staphylococcus epidermidis. Molecular microbiology. 2016. Epub 2016/12/21. doi: 10.1111/mmi.13594. PubMed PMID: 27997732.
10. Rohde H, Burdelski C, Bartscht K, Hussain M, Buck F, Horstkotte MA, et al. Induction of Staphylococcus epidermidis biofilm formation via proteolytic processing of the accumulation-associated protein by staphylococcal and host proteases. Molecular microbiology. 2005;55(6):1883-95. doi: 10.1111/j.1365-2958.2005.04515.x. PubMed PMID: 15752207.
11. Decker R, Burdelski C, Zobiak M, Buttner H, Franke G, Christner M, et al. An 18 kDa scaffold protein is critical for Staphylococcus epidermidis biofilm formation. PLoS pathogens. 2015;11(3):e1004735. Epub 2015/03/24. doi: 10.1371/journal.ppat.1004735. PubMed PMID: 25799153; PubMed Central PMCID: PMC4370877.
12. Wang Y, Jiang J, Gao Y, Sun Y, Dai J, Wu Y, et al. Staphylococcus epidermidis small basic protein (Sbp) forms amyloid fibrils, consistent with its function as a scaffolding protein in biofilms. The Journal of biological chemistry. 2018;293(37):14296-311. Epub 2018/07/28. doi: 10.1074/jbc.RA118.002448. PubMed PMID: 30049797; PubMed Central PMCID: PMC6139570.
13. Fayyaz M. Structural Characterization of Small basic protein (Sbp) and Accumulation associated protein (Aap) – two Proteins involved in Biofilm Formation in Staphylococcus epidermidis [Dissertation / PhD Thesis]: University of Hamburg; 2017.
14. Baskakov I, Bolen DW. Forcing thermodynamically unfolded proteins to fold. The Journal of biological chemistry. 1998;273(9):4831-4. Epub 1998/03/28. PubMed PMID: 9478922.
15. Baskakov IV, Kumar R, Srinivasan G, Ji YS, Bolen DW, Thompson EB. Trimethylamine N-oxide-induced cooperative folding of an intrinsically unfolded transcription-activating fragment of human glucocorticoid receptor. The Journal of biological chemistry. 1999;274(16):10693-6. Epub 1999/04/10. PubMed PMID: 10196139.
16. Johnson WC. Analyzing protein circular dichroism spectra for accurate secondary structures. Proteins. 1999;35(3):307-12. Epub 1999/05/18. PubMed PMID: 10328265.
17. Whitmore L, Wallace BA. DICHROWEB, an online server for protein secondary structure analyses from circular dichroism spectroscopic data. Nucleic acids research. 2004;32(Web Server issue):W668-73. Epub 2004/06/25. doi: 10.1093/nar/gkh371. PubMed PMID: 15215473; PubMed Central PMCID: PMC441509.
18. Shelton CL, Conrady DG, Herr AB. Functional consequences of B-repeat sequence variation in the staphylococcal biofilm protein Aap: deciphering the assembly code. Biochemical Journal. 2017;474(3):427-43. doi: 10.1042/bcj20160675.
19. Dyson HJ, Wright PE. Intrinsically unstructured proteins and their functions. Nature Reviews Molecular Cell Biology. 2005;6:197. doi: 10.1038/nrm1589.
20. Chaton CT, Herr AB. Defining the metal specificity of a multifunctional biofilm adhesion protein. Protein science : a publication of the Protein Society. 2017. Epub 2017/07/15. doi: 10.1002/pro.3232. PubMed PMID: 28707417.
21. Schuck P. Size-distribution analysis of macromolecules by sedimentation velocity ultracentrifugation and lamm equation modeling. Biophysical journal. 2000;78(3):1606-19. Epub 2000/02/29. doi: 10.1016/s0006-3495(00)76713-0. PubMed PMID: 10692345; PubMed Central PMCID: PMC1300758.
22. Laue TM, Shah BD, Ridgeway TM, Pelletier SL. Computer-aided interpretation of sedimentation data for proteins. In: Harding SE, Rowe AJ, Horton JC, editors. Analytical Ultracentrifugation in Biochemistry and Polymer Science: Royal Society of Chemistry, London; 1992. p. 90-125.


Chapter V. The proline/glycine-rich region of the biofilm adhesion protein Aap

forms an extended stalk that resists compaction*

Authors: Alexander E. Yarawsky1,2, Lance R. English3, Steven T. Whitten3 and Andrew B. Herr2,4

Affiliations: 1 - Graduate Program in Molecular Genetics, Biochemistry & Microbiology, University of Cincinnati College of Medicine, Cincinnati, Ohio 45267, USA

2 - Division of Immunobiology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio 45229, USA

3 - Department of Chemistry and Biochemistry, Texas State University, San Marcos, Texas 78666, USA

4 - Division of Infectious Diseases, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio 45229, USA

Author Contributions: A.E.Y. collected and analyzed AUC and CD data.

L.R.E. and S.T.W. collected and analyzed SEC and DLS data.

A.E.Y., L.R.E., S.T.W. and A.B.H. conceived experiments and directed the project.

A.E.Y., S.T.W. and A.B.H. wrote the manuscript.

Funding: Work was performed using funding from R01GM094363 to A.B.H., R15GM115603 to S.T.W. and DMR1205670 to S.T.W.

*Notice of Previous Publication: This work has been published in the Journal of Molecular Biology. The final published version is available at: https://www.sciencedirect.com/science/article/pii/S0022283616305113


Abstract

Staphylococcus epidermidis is one of the primary bacterial species responsible for healthcare-associated infections. The most significant virulence factor for S. epidermidis is its ability to form a biofilm, which renders the bacteria highly resistant to host immune responses and antibiotic action. Intercellular adhesion within the biofilm is mediated by the accumulation-associated protein (Aap), a cell wall-anchored protein that self-assembles in a zinc-dependent manner. The C-terminal portion of Aap contains a proline/glycine-rich, 135 amino acid-long region that has not yet been characterized. The region contains a set of 18 nearly identical AEPGKP repeats. Analysis of the proline/glycine-rich region (PGR) using biophysical techniques demonstrated the region is a highly extended, intrinsically disordered polypeptide (IDP) with unusually high polyproline type II helix (PPII) propensity. In contrast to many IDPs, there was a minimal temperature dependence of the global conformational state of PGR in solution as measured by analytical ultracentrifugation and dynamic light scattering. Furthermore, PGR was resistant to conformational collapse or α-helix formation upon addition of the osmolyte TMAO or the cosolvent TFE. Collectively, these results suggest PGR functions as a resilient, extended stalk that projects the rest of Aap outward from the bacterial cell wall, promoting intercellular adhesion between cells in the biofilm. This work sheds light on regions of low complexity often found near the attachment point of bacterial cell wall-anchored proteins.


Introduction

Staphylococcus epidermidis is a human commensal that is also responsible for a significant number of hospital-acquired infections, particularly device-related infections resulting from the propensity of S. epidermidis to adhere to abiotic surfaces such as plastic and glass. Examples of nosocomial infections resulting from S. epidermidis include bacteremia or endocarditis after introduction of catheters, pacemakers, or prosthetics [1-4]. Although S. epidermidis lacks many of the virulence factors expressed by its more pathogenic relative S. aureus, S. epidermidis is highly capable of forming robust biofilms and causing infection [5]. Biofilms provide a substantial degree of protection from antibiotics, the host immune system, and environmental stresses [6]. Therefore, treatment of biofilm-associated infections often requires removal of the contaminated device and prolonged antibiotic usage [7]. S. epidermidis biofilms are typically encapsulated in an extracellular matrix composed of poly-N-acetylglucosamine (PNAG) [8] along with protein components, teichoic acids, and extracellular DNA [1].

One protein of particular importance for S. epidermidis biofilm formation and infection is the accumulation-associated protein (Aap). The first demonstration of the critical role of Aap came in the form of an Aap-negative mitomycin mutant of S. epidermidis strain RP62A with minimal ability to accumulate on polystyrene and glass. Antibodies raised against Aap inhibited wild-type RP62A accumulation on polystyrene [9]. Another piece of evidence indicating Aap's importance was a 2007 study investigating PNAG and protein factors in S. epidermidis clinical isolates from prosthetic infections. The study [10] found that 89% of these strains were aap+, while only 62% were positive for icaA, the gene encoding an N-acetylglucosaminyl transferase critical for the synthesis of PNAG [8]. Among the biofilm-positive strains, 27% did not contain the icaADBC biosynthetic operon, and biofilm formation could be attributed to Aap in the majority of isolates [10]. Furthermore, in a rat model, Aap, but not icaADBC, was required for colonization of an intravenous catheter. Bacteria levels in blood were also significantly lower in the Δaap strain and the ΔicaADBC Δaap double mutant strain compared to the wild-type and ΔicaADBC strains [11], again highlighting the importance of Aap in biofilm formation and infection.

The accumulation-associated protein is a large, multi-functional, cell wall-anchored protein found in dense tufts on the bacterial cell surface [12] that has been implicated in both the initial attachment phase [13] and the accumulation phase [14-16] of biofilm formation. At the N-terminus, the A-domain of Aap (Fig. 1a) contains 11 short 16-aa A-repeats and a predicted globular lectin domain. This domain can initiate biofilm formation in some strains [13] and mediate adhesion to human corneocytes [17] and abiotic surfaces [11]. Downstream from the lectin domain, the B-repeat superdomain of Aap contains 5-17 nearly identical 128-aa B-repeats. Processing of Aap by an extracellular protease results in cleavage of the A-domain and unmasking of the B-repeat region, which mediates accumulation of S. epidermidis cells at early stages of biofilm growth [14]. The B-repeats of Aap are responsible for intercellular adhesion via Zn2+-mediated self-assembly in an anti-parallel manner [15, 18]. The C-terminal portion of Aap is comprised of a proline/glycine-rich region (PGR) (Fig. 1b) followed by the LPXTG motif that is covalently linked to the peptidoglycan layer of the bacterial cell wall by the Sortase A enzyme [19, 20].


The PGR of Aap is classified as one of many low-complexity regions in cell wall-anchored proteins, although little has been published regarding their structure or function [21-26]. The focus of this work was to use biophysical techniques to elucidate the structural characteristics and functional implications of the proline/glycine-rich region of Aap. The PGR comprises 135 residues in a series of AEPGKP repeats (Fig. 1b). Although the repeat sequence has some similarity to typical collagen triplet repeats, the presence of a non-glycine in the third position of every other triplet would sterically preclude triple helix formation by the PGR [27, 28]. Consistent with this expectation, our data reveal that the Aap PGR is monomeric and intrinsically disordered. Intrinsically disordered regions of proteins (IDPs) have been of growing interest to the scientific community over the past decade [29-32]. IDPs are increasingly revealed to be quite common [33] and to play critical biological roles [34], including in human diseases [35].

Not surprisingly, with the increased interest in understanding the functional roles of IDPs, there is also interest in surveying the structural conformations available to these proteins. This information is essential in order to understand the mechanism of how these disordered polypeptides function. In some cases, IDPs undergo a conformational transition from disordered to ordered, such as induced folding upon binding to a target protein [36]. However, disordered proteins do not necessarily require a transition from disorder to order (or order to disorder) to effect their biological function.

One category of intrinsically disordered regions, known as “entropic chains,” includes IDPs that remain unstructured for their normal function [37, 38]. In other words, the biological function of these proteins relies on the dynamic nature of the protein backbone. Examples of these entropic chains include flexible linkers between two domains, entropic springs such as the disordered PEVK domain of titin, which exhibits variable conformations contributing to the maintenance of muscle cell length [39, 40], and entropic bristles such as neurofilament H and M, which contain disordered C-terminal domains known as sidearms. These sidearms occupy significant 3-dimensional space due to their ability to sample a large number of conformations, maintaining appropriate spacing between neighboring filaments [41].

Using a combination of circular dichroism and analytical ultracentrifugation, we demonstrate that the Aap PGR is a highly extended polypeptide rich in polyproline type II (PPII) helix. These data are consistent with size-exclusion chromatography and dynamic light scattering measurements that reveal an averaged hydrodynamic radius of 37.7 Å, compared to an expected value of ~20 Å for a typical globular protein of equivalent length. These data are accurately predicted by a power-law scaling relationship [42] described for IDPs and based upon intrinsic PPII propensities, which provides additional evidence that the Aap PGR has an unusually high PPII content compared to other characterized IDPs. Interestingly, we find that PGR resists the compaction typically induced in IDPs by elevated temperature or cosolvents such as TFE and TMAO. Taken together, our data indicate that the PGR forms a highly elongated stalk region that extends the functional B-repeat region of Aap away from the peptidoglycan layer of the staphylococcal cell wall, enabling the important intercellular adhesion events necessary for biofilm formation.
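To give a feel for how a power-law relationship between chain length and hydrodynamic radius separates compact from expanded chains, the following minimal sketch evaluates Rh = R0 · N^ν for a 135-residue chain. It is not the PPII-based relationship of ref. [42]; the coefficients are assumptions taken from commonly quoted empirical fits for folded and chemically denatured proteins, and the function name is hypothetical.

```python
# Assumed empirical coefficients (R0 in angstroms, exponent nu); these are
# illustrative literature-style values, not parameters fitted in this work.
FOLDED = (4.75, 0.29)
DENATURED = (2.21, 0.57)


def rh_power_law(n_residues, r0, nu):
    """Return the hydrodynamic radius (angstroms) from Rh = r0 * N**nu."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return r0 * n_residues ** nu


if __name__ == "__main__":
    n = 135  # PGR length as stated in the text
    print(f"folded-like estimate:   {rh_power_law(n, *FOLDED):.1f} A")
    print(f"expanded-like estimate: {rh_power_law(n, *DENATURED):.1f} A")
```

Under these assumed coefficients the folded-like estimate falls near 20 Å, while the expanded-chain estimate is much closer to the radius measured for PGR.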


Results

The proline/glycine-rich region shows aberrant mobility

To understand the role of the proline/glycine-rich region in Aap and how it might contribute to S. epidermidis biofilm formation, we sought to express this region as a recombinant protein and analyze it using biophysical techniques. The proline/glycine-rich region of Aap (termed PGR) was expressed in Escherichia coli as a fusion protein with a hexahistidine-maltose binding protein (His-MBP) tag. By SDS-PAGE (Fig. 1c), the fusion protein (His-MBP-PGR) runs near the expected MW of 60 kDa. After cleavage of the His-MBP tag with Tobacco Etch Virus (TEV) protease, a prominent band appears near 150 kDa in addition to the 42.5 kDa MBP and 27 kDa TEV. After further purification by affinity and anion exchange chromatography, a single band of high purity is observed by SDS-PAGE running near 150 kDa. This putative PGR band exhibited highly aberrant migration by SDS-PAGE, with an apparent molecular weight more than 10-fold higher than predicted (13.2 kDa). Purified PGR was examined by electrospray ionization mass spectrometry, which confirmed the presence of a single species with a molecular mass of 13,194 Da. Aberrant mobility during SDS-PAGE was reported as early as 1969 by Dunker and Rueckert [43], and more recently several IDPs have shown aberrant mobility, although to a lesser extent than PGR [44-46].

Under the denaturing conditions used for SDS-PAGE gels, the mobility of the migrating protein species under the applied electric field is dependent on the number of SDS molecules bound to the protein through interactions of the acyl tail of SDS and hydrophobic regions of the protein. PGR includes very few hydrophobic residues (its GRAVY, or grand average of hydropathy, index score is -1.41 [47]), which

Figure 1. PGR shows aberrant mobility and exists as an elongated monomer in solution. (a) Schematic of Aap anchored to the peptidoglycan layer of the S. epidermidis cell wall by the LPXTG Sortase A motif, with approximate domain boundaries shown for relative scale. (b) Amino acid sequence of PGR flanked by the end of the B-repeat superdomain (EYGPT) and the LPXTG motif (underlined) – note the repeating AEPGKP hextads. The amino acids are numbered as they appear in Aap from S. epidermidis RP62A. Positively charged residues in the PGR are colored blue, while negatively charged residues are colored red. Panel (c) shows the aberrant mobility of PGR by silver-stained SDS-PAGE. Fusion refers to the His-MBP-PGR fusion protein. +TEV is 6hr after the addition of TEV protease to cleave the His-MBP tag from PGR. PGR is post-ANX purification. While the His-MBP-PGR fusion protein migrates as expected, cleaved PGR (13.2 kDa) migrates more than 10 times more slowly than expected. (d) Sedimentation velocity AUC absorbance data (markers) and best-fit model (lines) in the upper panel. Residual error in the bottom panel, showing good fits to the data. Panel (e) shows the c(s) distribution analysis of PGR at multiple concentrations (75 µM = 1 mg/ml) indicating PGR does not self-assemble. The frictional ratio ranged from 2.1 to 2.3, indicating a highly elongated species. The calculated molecular weight was near that of monomeric PGR, suggesting PGR exists as a highly elongated monomer under native conditions across the concentration range tested. Standardized sedimentation coefficients (s20,w) are displayed representing s at 20° C and in water. A linear regression was performed to determine the standardized

sedimentation coefficient at infinite dilution (s⁰20,w) by extrapolation of s20,w to zero concentration ((e), inset), yielding a value of 1.06 S. The values used in the extrapolation are shown in white circles, while the 300 µM (4 mg/ml) value shown as a filled circle was omitted. Detailed results can be found in Table S1.

should result in very low SDS binding capacity and very low mobility under the electric field relative to proteins of similar size with more typical amino acid distributions. This aberrant mobility has been observed for other proteins with low GRAVY scores [48] and specifically for collagen peptides, which exhibit high imino acid and glycine content but infrequent hydrophobic residues [49], similar to the PGR. This aberrant migration due to diminished SDS binding may be generally applicable to other IDPs, as low hydrophobicity is a common attribute of IDPs [50].
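The GRAVY score referred to above is simply the mean Kyte-Doolittle hydropathy of all residues in a sequence. The minimal sketch below computes it for the idealized AEPGKP consensus repeat only; the full PGR sequence is not reproduced here, so the result is not expected to match the reported value of -1.41 exactly. The function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the grand average of hydropathy for a one-letter sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    # Idealized consensus repeat only (an assumption for illustration).
    print(f"GRAVY of AEPGKP: {gravy('AEPGKP'):.2f}")
```

A strongly negative score for the consensus repeat is in line with the low hydrophobicity invoked above to explain reduced SDS binding.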

PGR sediments as an elongated monomer

To determine whether PGR exists in solution as an extended monomer or a higher-order assembly, we performed sedimentation velocity analytical ultracentrifugation (AUC) (Fig. 1d). This technique tracks the rate of protein sedimentation under an applied centrifugal field, allowing for the characterization of the size and shape of the species in solution under native conditions. AUC experiments can provide additional hydrodynamic information about macromolecules in solution compared to SDS-PAGE and size exclusion chromatography, because the experimental analysis provides insight into both the size of the sedimenting species (via the sedimentation coefficient) and its shape (via the frictional coefficient, which depends on broadening of the sedimentation boundary due to diffusion). At 20° C and in biologically relevant buffer, PGR was monomeric at concentrations ranging from 25 µM to 300 µM (0.33 mg/ml to 4 mg/ml). However, under all conditions, PGR sedimented slowly, with a sedimentation coefficient of 1.0 S and an unusually high frictional ratio (greater than 2), indicating a highly elongated or non-globular conformation that results in increased drag (Fig. 1e, Tables 1 and S1). These results support the hypothesis that PGR has aberrant mobility due primarily to a preference for a set of extended conformations.

Determination of the hydrodynamic radius of PGR in solution

In parallel to the sedimentation velocity analysis, we determined the radius of hydration (Rh) of PGR by size exclusion chromatography (SEC) [51]. A series of well-characterized globular proteins were run over a G-100 Sephadex column to determine the linear relationship between the thermodynamic retention factor (KD) and Rh for control proteins with known crystal structures [51, 52] (Fig. 2a). Based on the linear Rh vs KD plot, the Rh for PGR was determined to be 37.06 ± 1.1 Å. This value is surprisingly high for a protein of 13.2 kDa, as can be seen in a plot of logMW vs KD (Fig. 2b); PGR deviates significantly from the linear relationship seen for the folded, globular control proteins bovine serum albumin, chicken albumin, bovine carbonic anhydrase, and horse myoglobin (BSA, Alb, CA, and Myo respectively). If PGR were a globular protein, its Rh would correspond to an apparent MW of 53.6 kDa, nearly four times larger than its actual MW. Furthermore, a plot of logRh vs logN, where N is the number of residues, shows distinct linear trends for IDPs [42] (Table S4) and globular proteins [51, 53] (Table S5); PGR clearly falls in the IDP region in this plot (Fig. 2c). The abnormal migration by SDS-PAGE, highly extended conformation in solution by AUC, and aberrant Rh compared to its MW all indicate that PGR is likely to be an IDP.
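The calibration described above amounts to fitting a straight line through (KD, Rh) pairs for the folded standards and reading off Rh at the KD measured for the unknown. The following minimal sketch illustrates the arithmetic, assuming the usual definition KD = (Ve - V0)/(Vt - V0); all function names and all numbers in the usage section are hypothetical placeholders, not the measured values behind Fig. 2.

```python
def partition_coefficient(v_elution, v_void, v_total):
    """Return KD = (Ve - V0) / (Vt - V0) for a size exclusion column."""
    return (v_elution - v_void) / (v_total - v_void)


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rh_from_kd(kd, slope, intercept):
    """Interpolate Rh (angstroms) for a sample from the calibration line."""
    return slope * kd + intercept


if __name__ == "__main__":
    # Placeholder calibration points for four folded standards (made up).
    kd_standards = [0.15, 0.30, 0.45, 0.60]
    rh_standards = [34.0, 28.0, 22.0, 16.0]
    slope, intercept = fit_line(kd_standards, rh_standards)
    kd_sample = partition_coefficient(v_elution=11.0, v_void=10.0, v_total=20.0)
    print(f"interpolated Rh: {rh_from_kd(kd_sample, slope, intercept):.1f} A")
```

The same fitting routine can be reused for the logMW vs KD relationship by substituting log10 of the molecular weights for the Rh values.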


Figure 2. Size exclusion chromatography (SEC) confirms an extended conformation in solution. Panel (a) is a representative dataset from SEC experiments using G-100 resin. Folded protein standards are displayed in empty circles with their linear regression plotted as a dotted line. The Rh values of the folded proteins were estimated as the half-maximal Cα-Cα distance obtained from X-ray crystallography structures of bovine serum albumin (BSA – PDB ID: 4F5S) [54], chicken albumin (Alb – PDB ID: 1OVA) [55], carbonic anhydrase (CA – PDB ID: 1V9I) [56], and horse myoglobin (Myo – PDB ID: 2O58) [57]. The SEC-measured KD of PGR (blue circle) was plotted along the linear regression of the protein standards to yield an Rh of 37.21 Å (indicated by the dashed drop lines), or an average of 37.06 Å across 4 independent experiments (2 using G-75 resin, 2 using G-100 resin), each with 3-9 replicates of the training proteins and 2-3 replicates of PGR or PGR-Tyr. The addition of a Tyr to the C-terminus of PGR did not appear to significantly affect the Rh, but it allowed measurements to be performed at 1 mg/ml rather than at the 3-4 mg/ml required for PGR. Panel (b) examines the linear relationship between KD and logMW for the folded protein standards. Clearly, PGR deviates from this trend. PGR has a KD which would be expected for a folded protein of 50.9 kDa (logMW = 4.71), as indicated by the dashed drop lines starting from PGR’s KD and ending at the logMW. Panel (c) shows the relationship between logRh in Å and logN where N is the number of residues. Folded proteins [51, 53] (empty circles, dashed linear regression) trend differently than IDPs [42] (filled circles, solid linear regression). PGR trends with IDPs and not folded proteins. Also shown for reference is the IDP p53(1-93).


Predicted disorder based on PGR primary sequence

To investigate the inherent propensity of PGR to form an intrinsically disordered polypeptide, we submitted its amino acid sequence to several prediction servers. Figure S1a shows the location of PGR on the Uversky plot, which segregates disordered proteins from ordered proteins based on absolute mean net charge and mean scaled hydropathy [50]. Although PGR has a relatively low absolute mean net charge per residue of 0.052 due to the alternating Glu and Lys residues, PGR still lies on the “disordered” side of the Uversky plot due to the general lack of hydrophobic residues. IDPs are typically composed primarily of “disorder-promoting” amino acids, which include Ala, Glu, Pro, Gly, and Lys (which comprise the primary AEPGKP repeat in PGR), as well as Arg, Gln, and Ser. Similarly, IDPs tend to lack bulky hydrophobic and aromatic residues that would promote the formation of a hydrophobic core in a globular fold [58, 59]. Analysis of the PGR sequence using the Database of Disordered Protein Predictions (D2P2) [60] showed unanimous agreement among all predictions that this region is disordered (data not shown); the results from several disorder-predicting servers [61-65] are shown in Figure S1b. Interestingly, such predictions indicated high probability of disorder in the A-repeats and B-repeats as well as the PGR of Aap. While no data have been published regarding the structure of the A-repeats, B-repeat constructs from Aap and its ortholog SasG have been crystallized and form an elongated, non-globular fold with long 3-stranded β-sheets interspersed with regions of coil or PPII helices [15, 18, 66, 67].

Additional analysis of the PGR sequence was carried out using the CIDER (Classification of Intrinsically Disordered Ensemble Regions) server developed by the Pappu Lab [68]. CIDER calculates a variety of parameters, such as fraction of charged residues (FCR), net charge per residue (NCPR), and kappa [69], a parameter describing charge segregation in the linear sequence. PGR can be classified as a weak polyampholyte (FCR < 0.3) with a low NCPR (-0.052). Based on the kappa value of 0.058, PGR lies in the transition region between structures taking on a “self-avoiding random walk” set of conformations and those taking on a molten globule-like set of conformations [69]. The relationship between the fractions of positively and negatively charged residues defines PGR as a “Janus sequence,” which may be collapsed or expanded depending on the context (Fig. S2c). In the case of PGR, the frequency and distribution of proline residues may bias its conformation toward an extended state rather than a molten globule-like state.
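Two of the sequence parameters named above, FCR and NCPR, are simple composition counts and can be sketched in a few lines. This is not the CIDER implementation: the function name is hypothetical, and the sketch assumes Lys and Arg count as positive, Asp and Glu as negative, and all other residues (including His) as neutral. It is applied only to the idealized consensus repeat, which need not match the values reported for the full PGR sequence with its non-consensus residues.

```python
POSITIVE = frozenset("KR")  # assumed positively charged residues
NEGATIVE = frozenset("DE")  # assumed negatively charged residues


def charge_fractions(sequence):
    """Return f+, f-, FCR (f+ + f-) and NCPR (f+ - f-) for a sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    f_plus = sum(aa in POSITIVE for aa in seq) / len(seq)
    f_minus = sum(aa in NEGATIVE for aa in seq) / len(seq)
    return {
        "f_plus": f_plus,
        "f_minus": f_minus,
        "FCR": f_plus + f_minus,
        "NCPR": f_plus - f_minus,
    }


if __name__ == "__main__":
    # Idealized consensus repeat only (an assumption for illustration).
    for name, value in charge_fractions("AEPGKP" * 4).items():
        print(f"{name}: {value:.3f}")
```

Because the consensus repeat carries one Glu and one Lys, its net charge per residue is zero even though a third of its residues are charged, which illustrates why a polyampholyte can have a sizeable FCR and a near-zero NCPR at the same time.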

PGR contains polyproline type II helix content

We used circular dichroism (CD) spectroscopy to assess the presence of secondary structure in PGR. The far-UV spectrum at 25° C showed a strong negative minimum near 200 nm and a weak local maximum near 220 nm (Fig. 3a; see royal blue line). A minimum in the range of 195-205 nm is typical for collagen [70], denatured proteins [71, 72], and IDPs [44-46], and the positive peak near 220-228 nm is considered to be diagnostic for the polyproline type II (PPII) conformation [70], a highly extended left-handed helix with three-fold rotational symmetry [73]. Both PPII and random coil secondary structures show a minimum near 200 nm in their far-UV CD spectra, but this minimum is notably less pronounced for random coil than for PPII. The far-UV spectrum of PGR changed as a function of temperature (ranging from 5° C in

Figure 3. PGR contains polyproline type II helix. Panel (a) shows the CD spectra of PGR from 5° C in magenta to 95° C in red, collected in 10° C increments. These spectra resemble those from PPII-containing peptides, showing PPII transitioning to random coil at increasing temperatures. The mean residue ellipticity, [θ] (degrees cm2 dmol-1 residue-1), in thousands is plotted against wavelength. The inset shows the temperature dependence of the minimum at 200 nm. In panel (b), a difference plot of the spectrum at each temperature subtracted from the 95° C spectrum is displayed. The inset shows the temperature dependence of the local maximum at 222 nm, associated with PPII content.

magenta to 95° C in red in Figure 3a). As temperature was increased, the minimum near 200 nm became less negative and the band at 222 nm became more negative. This suggests an increase in random coil as temperature increases, while lower temperatures resulted in a more negative minimum corresponding to the stabilization of PPII, as previously described [74]. A difference plot of the spectra between 210-240 nm as a function of temperature more clearly illustrates the decrease in PPII as temperature increases, based on the positive difference peak centered at 222 nm (Fig. 3b). These far-UV CD data are consistent with a non-cooperative transition between conformational states containing PPII at low temperatures and random coil (‘denatured PPII’) at high temperatures [44]. This is supported by the linearity of the temperature dependence of the CD signal at 200 nm (Fig. 3a-inset) or 222 nm (Fig. 3b-inset), demonstrating non-cooperative thermal denaturation.

Hydrodynamic behavior as a function of temperature

While CD provided information regarding temperature effects on the local backbone structure of PGR, AUC was performed at multiple temperatures to examine thermal effects on the global conformation of PGR. IDPs often show a decrease in hydrodynamic radius with increasing temperature, which has been attributed to temperature-dependent solvation effects on structure [75], a collapse of the extended PPII conformation to more compact random coil configurations [51, 52, 76], and a stronger hydrophobic effect at elevated temperatures [59]. We conducted AUC sedimentation velocity experiments at 4°, 20°, and 37° C (the temperature extremes allowed for by the instrumentation). After converting the data to the standardized sedimentation coefficient (s20,w) to account for differences in viscosity and density at these temperatures, there was little difference between the sedimentation coefficients or the frictional ratios (Fig. 4a, Table S2). To verify the monodispersity of PGR in these sedimentation velocity experiments, we used the c(s,f/f0) analysis method to determine 2-dimensional size-and-shape distributions of PGR in terms of sedimentation coefficient and frictional ratio (Fig. 4b) [77]. This is useful in distinguishing two separate species with similar sedimentation coefficients [78]. These 2D distributions confirmed that PGR sedimented as a single monomeric species with no evidence of higher-order oligomers. Taken together, these AUC data show that the overall hydrodynamic behavior of PGR does not appreciably change with temperature, indicating that PGR does not undergo a significant collapse with increased formation of random coil.
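The conversion to s20,w mentioned above follows the standard relation s20,w = s_obs · (η_T,b / η_20,w) · (1 - v̄ρ_20,w) / (1 - v̄ρ_T,b), where η and ρ are the viscosity and density of the experimental buffer (T,b) or of water at 20° C (20,w). The minimal sketch below implements that relation; the function name is hypothetical, the partial specific volume is treated as temperature independent for simplicity, and the buffer values in the usage section are placeholders rather than the conditions used in this work.

```python
ETA_20W = 1.002e-2  # viscosity of water at 20 C, in poise
RHO_20W = 0.99823   # density of water at 20 C, in g/mL


def s20w(s_obs, eta_buffer, rho_buffer, vbar):
    """Standardize an observed sedimentation coefficient to water at 20 C.

    s_obs      observed sedimentation coefficient (any unit, e.g. S)
    eta_buffer buffer viscosity at the experimental temperature (poise)
    rho_buffer buffer density at the experimental temperature (g/mL)
    vbar       partial specific volume (mL/g), assumed constant here
    """
    viscosity_term = eta_buffer / ETA_20W
    buoyancy_term = (1.0 - vbar * RHO_20W) / (1.0 - vbar * rho_buffer)
    return s_obs * viscosity_term * buoyancy_term


if __name__ == "__main__":
    # Placeholder inputs: a cold, more viscous buffer slows sedimentation,
    # so the standardized value is larger than the observed one.
    print(f"s20,w = {s20w(0.65, 1.60e-2, 1.005, 0.73):.2f} S")
```

After this correction, any remaining temperature dependence in s20,w reflects a change in the molecule itself rather than in the solvent, which is the basis for the comparison across 4°, 20°, and 37° C.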

Dynamic light scattering (DLS) provides information on particle size and diffusion coefficients by measuring time-dependent scattering fluctuations due to the particles’ Brownian motion [79]. An advantage of using DLS techniques, when compared to SEC-based methods for measuring particle size, is that data can be measured over a broad temperature range. We measured the Rh of PGR using DLS from 5° C to 45° C and observed very little change in the Rh, similar to the AUC results (Fig. 4c, Table S2). The values range from 38.9 Å to 37.8 Å and are in good agreement with the SEC-measured Rh of 37.06 Å (Fig. 4c; see green square). Although we saw a temperature-dependent decrease in PPII content (based on mean residue ellipticity at 222 nm) for PGR, the slope is shallow and the CD data points are within the error bars of the DLS data. In the case of p53(1-93), a well-known IDP, Langridge and colleagues co-plotted Rh values by DLS with PPII signal at 221 nm via CD (at the same scale as shown for PGR), revealing

Figure 4. PGR shows weak temperature dependence of Rh. Panel (a) shows the calculated c(s) distributions of PGR, with little change in the sedimentation coefficient at 25 µM between 4° C (purple line), 20° C (green), and 37° C (red). Panel (b) shows the c(s,f/f0) 2D size-and-shape distribution of PGR at 4° C (left), 20° C (middle), and 37° C (right) at 25 µM, indicative of a single species in each case. Note that the sedimentation coefficients in this panel are apparent (s*) values, which converge to 1.0-1.1 upon correcting for temperature effects on buffer viscosity and density. Panel (c) displays the DLS-measured Rh for PGR in blue circles. Measurements were taken at 10° C increments between 5° C and 45° C. Error bars show ± standard deviation for DLS measurements. The open square marks the SEC-measured Rh (37.06 Å at 22° C). The red triangle along the y-axis (Rh) shows the predicted Rh of 38.50 Å from the power-law scaling relationship. The black triangles show mean residue ellipticity at 222 nm at each temperature, corresponding to PPII content. The CD data suggest a small temperature-dependent decrease of PPII content, but this does not translate to the Rh as much as would be expected based on previous observations in the literature [51, 52]. As an example, the temperature dependence of the Rh of p53(1-93) is shown in gray circles, which also mirrored its own mean residue ellipticity at 221 nm, the local PPII maximum observed for p53 (re-plotted from [51]). The linear fits of the DLS data for PGR and p53(1-93) are shown as solid lines. The slope (m) of the line for the PGR fit (m = -0.015) is less steep than that of the fit for p53(1-93) (m = -0.076), indicating a weaker temperature dependence of Rh compared to p53(1-93).

a strongly correlated linear decrease in both Rh and PPII content as a function of temperature between 5° and 75° C [51]. Interestingly, the change in Rh observed for p53(1-93) between 5-45° C was 3.66 Å, compared to only 0.84 Å change in PGR (Fig. 4c, gray circles). The difference in CD signal at 221 nm over the same temperature range for p53(1-93) was ~2,000 MRE, compared to ~1,000 MRE for PGR. These data suggest the Rh of PGR decreases with elevated temperatures but to a lesser degree than observed for p53(1-93). These results suggest the temperature-dependent local transitions from PPII to random coil observed by CD do not correspond to large changes in the global conformations of PGR as observed by AUC or DLS.
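DLS reports a translational diffusion coefficient, which is converted to Rh through the Stokes-Einstein relation Rh = kB·T / (6πηD). The minimal sketch below shows that conversion and its inverse in SI units; the function names are hypothetical, and the viscosity in the usage section is an approximate value for water at 25° C rather than the buffer viscosity used in this work.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def hydrodynamic_radius(diffusion_m2_s, temp_k, viscosity_pa_s):
    """Return Rh in metres from a diffusion coefficient (Stokes-Einstein)."""
    return K_B * temp_k / (6.0 * math.pi * viscosity_pa_s * diffusion_m2_s)


def diffusion_coefficient(rh_m, temp_k, viscosity_pa_s):
    """Return D in m^2/s for a sphere of hydrodynamic radius rh_m."""
    return K_B * temp_k / (6.0 * math.pi * viscosity_pa_s * rh_m)


if __name__ == "__main__":
    temp_k = 298.15      # 25 C
    viscosity = 8.9e-4   # Pa*s, approximate value for water at 25 C
    d = diffusion_coefficient(37.8e-10, temp_k, viscosity)
    print(f"D for Rh = 37.8 A: {d:.2e} m^2/s")
    print(f"round trip Rh: {hydrodynamic_radius(d, temp_k, viscosity) * 1e10:.1f} A")
```

Because both T and η enter the relation, the temperature and the temperature-dependent solvent viscosity must be supplied for each measurement before Rh values from different temperatures are compared.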

Effects of cosolvents on PGR conformation

A number of osmolytes and other cosolvents can provide insight into the configurational space of polypeptides. By a combination of excluded volume effects and preferential interactions of cosolvent with backbone or sidechain groups, these molecules can chemically denature folded proteins or stabilize particular types of secondary structure. Likewise, cosolvents can favor formation of PPII conformations or even induce folding of thermodynamically denatured proteins and IDPs. Thus, the response of PGR to urea, guanidine hydrochloride (GdnHCl), trimethylamine N-oxide (TMAO), and 2,2,2-trifluoroethanol (TFE) was measured by far-UV CD spectroscopy at 4°, 25°, and 37° C (Fig. 5).

Urea is a denaturing osmolyte; it favors unfolded and extended protein conformations through a combination of preferential interactions with the protein backbone and side chains [80-85]. Urea may also exert an indirect effect on protein


Figure 5. The local conformation of PGR shows resilience against chemical perturbants. (a) CD spectra showing PGR PPII induction at 0 M, 2 M, 4 M, 6 M urea (dark fill to light fill; 0 M and 2 M spectra overlay at 37° C). (b) CD spectra of PGR in 0 M, 2 M, 4 M, 6 M GdnHCl (dark fill to light fill; 0 M and 2 M spectra overlay, while 4 M and 6 M overlay at 4° C) showing a weaker induction of PPII than with urea. (c) CD spectra of PGR in 0 M (dark fill) or 3 M (light fill) TMAO, with no changes, indicating high resilience toward this osmolyte. (d) CD spectra of PGR in 0%, 15%, 45%, or 75% TFE (dark fill to light fill; 0% and 15% TFE spectra overlay) at 4° C (purple circles), 20° C (green squares), or 37° C (red triangles). Unexpectedly, no significant changes in the spectra from 0-15% TFE were observed. Urea, GdnHCl, and TMAO data are shown only from 210-240 nm due to high noise levels below 208 nm.

stability via alterations in the water network surrounding the protein [86]. Guanidine hydrochloride (GdnHCl) is another chemical denaturant that favors extended, unfolded protein conformations, by preferentially interacting with the protein side chains and backbone [83, 87-90]. In general, GdnHCl is 2- to 4-fold more efficient at denaturation than urea, although this is dependent on the target protein’s sequence. Comparison of their ability to block hydrogen exchange revealed that unlike urea, GdnHCl does not form H-bonds with the polypeptide but instead stacks against planar groups in the backbone and side chains such as Asn or aromatic residues [90]. Both urea and GdnHCl favor formation of the highly extended PPII conformation in IDPs and denatured globular proteins, monitored by the far-UV CD signal near 220 nm [44, 72, 91, 92]. PGR shows an increase in PPII content at high concentrations of urea (Fig. 5a) and GdnHCl (Fig. 5b), particularly at low temperatures where PPII is more stable. At higher temperatures, urea and GdnHCl are less effective at inducing PPII structure, similar to previous observations with p53(1-93) [44]. Interestingly, we observe that urea is more efficient than GdnHCl at increasing PPII content of PGR, which is opposite to the typical trends for denaturation of globular proteins. This may be due to the lack of any Asn or aromatic sidechains in the PGR sequence, limiting the number of planar groups onto which GdnHCl can stack [90].

TMAO is an osmolyte naturally found in many marine organisms that helps protect proteins from denaturing environmental stressors often experienced by the cells of these organisms [93, 94]. It has been shown that TMAO has an opposing effect to that of urea on IDPs [44], and induces a native-like, compact conformation [94] in globular proteins. Similar to urea, the effect of TMAO is the sum of contributions from backbone-solute and sidechain-solute interactions. While the unfavorable backbone contributions drive this osmolyte’s ability to fold proteins, sidechain contributions are in opposition, having favorable free energy transfers into TMAO from water [82]. In general, high concentrations of TMAO destabilize the unfolded state relative to the native state [95]. In IDPs, TMAO tends to reduce PPII content and can even induce tertiary structure in thermodynamically denatured globular proteins [94, 96, 97]. Surprisingly, addition of TMAO up to 3 M final concentration had no effect on the far-UV spectrum of PGR near 220 nm (Fig. 5c). Although this is not a typical response, other IDPs have been reported that show essentially no change in their far-UV spectra with increasing TMAO, including myelin basic protein (MyBP) [98] and starmaker-like protein (Stm-1) involved in biomineralization of otoliths [99]. Like PGR, each of these proteins is highly charged (28-47% of the residues are charged), although MyBP and Stm-1 have only 3-6% proline content compared to 29% for PGR. Interestingly, favorable sidechain-TMAO interactions that counterbalance the backbone-solute effects are most significant among charged residues, compared to polar or apolar side chains [82], offering an explanation for why PGR, MyBP, and Stm-1 may be relatively unaffected by TMAO.

TFE is an organic cosolvent that is preferentially excluded from the polypeptide backbone but interacts weakly with nonpolar sidechains, which strongly favors formation of α-helix, a form of secondary structure with maximal intramolecular backbone H-bonding and exposed sidechains. In fact, TFE can induce formation of α-helix even in proteins that are unlikely to form such structures in vivo [100-103]. For many IDPs, addition of 15% TFE is sufficient to induce significant spectral changes, with some IDPs already showing complete transition to α-helix from a PPII or random coil native state [91, 104-108]. In contrast, 15% TFE had no effect on PGR (Fig. 5d), while higher concentrations of 45 or 75% TFE led to significant weakening of the minimum near 200 nm and slight loss of PPII signal near 220 nm. However, none of the TFE conditions induced detectable formation of α-helix with the characteristic double minima at 208 and 222 nm. Rather, addition of high concentrations of TFE shifts PGR toward random coil, as seen for increased temperature. The conformation of PGR at 75% TFE appears comparable to the protein incubated at 75° C (Fig. 3a). Thus, PGR is quite resilient to the addition of either TMAO or TFE, and it has no propensity to form α-helix, even at very high TFE concentrations. Because the consensus PGR sequence repeat (AEPGKP) contains a proline in every third position, it is not surprising that α-helical conformations are inaccessible, given the well-known role of proline as a ‘helix-breaker’.

Electrostatic interactions do not affect local or global PGR conformations

PGR contains a high percentage of charged residues, with alternating charge at every third position in the consensus AEPGKP repeats. It is plausible that these residues could form salt bridges to stabilize the PPII helix. Indeed, Whittington and Creamer [109] showed using Monte Carlo computer simulations that the orientation and spacing of the i → i + 3 lysine and glutamate sidechains are such that a salt bridge could form. However, based on their CD measurements of a PPPKPAEPPPGY peptide, there was no significant difference between the spectra at pH 7 and those collected at either pH 2 or 12, where glutamate or lysine respectively are uncharged. This led to the conclusion that salt bridges do not stabilize PPII helices [109]. Tomasso et al. [42] have modeled the impact of coulombic interactions on the radius of hydration (Rh). They reported modest effects on Rh due to charge-charge interactions, with the PPII propensity being the major factor on Rh. When coulombic interactions did produce notable deviations in Rh, it was in cases of similarly-charged residues or charged clusters with little separation.

We observed no difference between the CD spectra of PGR in the presence of 30 mM or 1 M NaF, where electrostatic interactions should be completely screened [110], at any temperature tested (Fig. 6a). Similarly, the PGR spectra were superimposable at pH values of 7.4 and 12 at all temperatures tested, and were nearly identical at pH 2 (Fig. 6b). These data indicate that salt bridges between the i → i + 3 charged residues in PGR do not stabilize the PPII conformation. The effect of electrostatics on the global conformation of PGR in solution was also analyzed using AUC. Sedimentation velocity experiments were performed on PGR with varying NaCl concentrations. Between 30 mM and 300 mM NaCl, the s20,w remained near 1.0, with a frictional ratio of 2.0-2.1 (Fig. 6c, Table S3). At 1 M NaCl, there was only a slight decrease in the s20,w value and a very slight increase in the frictional ratio at 20° C and 37° C, indicating that fully screening electrostatic interactions caused very little change in the global conformation of PGR in solution. (See Fig. 6.)

We also considered whether PGR could interact with divalent metal ions. There have been numerous reports of disordered proteins undergoing a conformational transition upon binding of a metal ligand [37, 59, 111, 112]. Given the numerous charged residues in the sequence of PGR, and the well-defined role of Zn2+ in the function of the Aap B-repeat superdomain [15, 18], we were interested in testing biologically relevant, divalent metal ions as potential ligands. Furthermore, Zn2+ has been shown to increase the rigidity of the S. aureus cell wall and plays a role in the extension of the Aap ortholog SasG away from the cell wall [113]. We performed sedimentation velocity AUC experiments in the presence of CaCl2, MgCl2, and ZnCl2. When 2 mM of each divalent cation was added to PGR in 20 mM HEPES pH 7.4, 150 mM NaCl, there was no significant change in the sedimentation coefficient or frictional ratio of PGR (Fig. 6d).

Figure 6. Coulombic effects do not play a role in local or global conformations. The CD spectra in (a) show no differences in 30 mM, 100 mM, 300 mM, or 1 M NaF (dark fill to light fill) at 4° C (purple circles), 20° C (green squares), or 37° C (red triangles). In (b), the CD spectrum of PGR at pH 2 (light fill) has a slightly less negative minimum near 200 nm at all temperatures, but spectra at pH 7.4 (medium fill) or pH 12 (dark fill) show no differences. Together, (a) and (b) indicate that charge interactions do not affect the local PPII content. Panel (c) shows c(s) distributions from sedimentation velocity AUC experiments at 30 mM, 100 mM, 300 mM, or 1 M (solid, short dashed, long dashed, and dotted lines) NaCl at 4° C (purple lines), 20° C (green), or 37° C (red), revealing only a slight change in the s20,w at 20° C and 37° C at 1 M NaCl. These data demonstrate that coulombic effects have little influence on the global conformations available to PGR. Panel (d) shows c(s) distributions from sedimentation velocity AUC experiments on apo-PGR (black line), or PGR in the presence of 2 mM CaCl2 (purple), 2 mM MgCl2 (green), or 2 mM ZnCl2 (red) at 20° C, showing no change in global conformation due to interactions with divalent cations.

Predicting PPII propensity and Rh from PGR primary sequence

Our data make it clear that PGR is a highly extended IDP in solution that is rather resistant to conformational transitions. The primary sequence of PGR is composed of a high percentage of proline residues (29.6%), glycine residues (14.8%), and charged residues (26.5%), with a net charge of -7. The inclusion of a proline at every 3rd position throughout the first 19 hextad repeats clearly contributes a strong propensity toward PPII conformations and places significant constraints on the conformational states available to PGR. In addition to the high proline content, PGR is rich in disorder-promoting residues such as Glu and Lys; these residues are highly represented in experimentally proven intrinsically disordered proteins [114]. Previously, it was shown that a simple power-law scaling relationship relates the Rh of IDPs to N, the number of residues, and fPPII, the fractional number of residues in the PPII conformation as estimated from intrinsic PPII propensities [42, 51] (Eq. 7). Tomasso et al. compared experimental PPII propensity scales and found that the Hilser scale [115] or a composite scale combining Hilser, Kallenbach [116], and Creamer [117] PPII propensity scales could accurately predict experimental Rh values for a large dataset of IDPs ranging from 73 to 260 residues in length and varying in sequence composition and net charge (from 1 to 43) [42]. Another variation on the power-law scaling relationship was proposed by Marsh and Forman-Kay, which takes into account the fraction of proline residues and the absolute net charge [53]. The Marsh and Forman-Kay relationship (Eq. 9) predicts an Rh value of 38.78 Å, while applying the Tomasso et al. relationship (Eq. 7) to the PGR sequence predicts an Rh value of 38.50 Å. Both predictions are remarkably close to the experimentally determined values from SEC, 37.06 ± 1.1 Å (at 22° C), and DLS, 38.39 ± 0.9 Å (at 25° C). The hydrodynamic size of PGR measured experimentally thus provides additional support for the simulation-derived relationship between Rh, N and fPPII. An interesting aspect of the Tomasso et al. approach to Rh prediction for PGR is that fPPII, the fractional number of residues in the PPII conformation based on the Hilser propensity scale, was 0.5350. This was a higher value of fPPII than observed for any of the IDPs in the test set from the previous study [42]. Indeed, PGR was the only IDP among all proteins in the test set with a PPII fraction greater than 0.5 (Table S4).


Discussion

Many bacterial cell wall-anchored (CWA) proteins feature regions defined as ‘low complexity’, although they vary greatly in size and amino acid content [21-26]. In addition to Aap and its ortholog SasG, examples include the proline-rich region of the streptococcal β protein [21], M6 protein of Group A Streptococcus [22], β antigen of the Ibc protein complex [23], as well as the serine-aspartate repeat proteins found in many staphylococci [24] and fibronectin-binding protein of S. aureus [25]. In many cases, these regions are found just upstream of the LPXTG Sortase A anchoring motif. The proline/glycine-rich region (PGR) of Aap is 135 residues in length and composed primarily of AEPGKP repeats. The S. aureus ortholog SasG has a proline-rich region quite different from the Aap PGR, spanning roughly half the number of residues and lacking the hextad repeats of Aap-PGR. Nonetheless, the proline-rich region of SasG is predicted to have a high fraction of PPII content (0.4711), surpassed only by Aap-PGR (0.5350) and p53(1-93) (0.4890) in the Tomasso et al. database [42] based on the Hilser propensity scale [115]. Thus, the proline-rich region of SasG is anticipated to function as a stalk similar to the Aap PGR. However, the SasG proline-rich region has a relatively higher fraction of charged residues (0.362 compared to 0.259 for Aap-PGR), categorizing it as a strong polyampholyte that could potentially sample coil-like or hairpin-like conformations according to the Das-Pappu phase plot [69]. The serine-aspartate repeat (Sdr) proteins found in S. epidermidis and S. aureus can contain as many as 558 SD repeats leading up to their LPXTG motif [24]. Proline-rich regions fewer than 30 residues in length are commonly found among other Gram-positive bacteria adjacent to the sortase A motif [26], though their role is not yet clear. In many cases, these proline-rich regions appear to function as stalks that act to project adhesive protein domains away from the cell wall. These can include rigid, highly structured stalks as observed for the streptococcal adhesion antigen I/II, which features an elongated α-helix intertwined with a long PPII helix [118]. In many other CWA proteins, proline-rich regions appear to form semi-flexible stalks [119] that may act as entropic chains. For example, Linke et al. solved a crystal structure of FctB, a minor pilin from Group A streptococci, and observed a short PPII helix formed by the C-terminal PXPPXXPXXPXXPXXP tail that extends outward from the Ig-like pilin domain [26]. Aap PGR, with its series of AEPGKP repeats, appears to fall in the latter category. The PGR provides an interesting system both in terms of its role in mediating biofilm-related infections and as a protein sequence with an unusually high propensity for PPII structure.

We have demonstrated using a variety of biophysical techniques that the PGR of Aap is an extended, intrinsically disordered polypeptide. Sedimentation velocity AUC revealed a highly elongated monomeric species that was shown to have substantial PPII helix content by far-UV CD. In close agreement with the power-law scaling equation which relates the amino acid sequence of an IDP to its Rh based on PPII propensity [42, 115], the Rh of PGR was measured to be 37.06 Å by SEC. The power-law relationship shows that the large Rh is primarily the result of an fPPII value greater than 0.5, which was a higher fractional PPII content than for any IDP in the test dataset. Another unusual feature of PGR was that both AUC and DLS data indicated only small changes in the s20,w or Rh values, respectively, of PGR with increasing temperature. Typically, the Rh of an IDP decreases as temperature increases, due to destabilization of PPII and conformational collapse [51, 74]. Likewise, neither the osmolyte TMAO (up to 3 M) nor the cosolvent TFE (up to 15%) had any effect on the PPII content as judged by far-UV CD. Very high concentrations of TFE (45 to 75%) reduced PPII content and increased random coil in a manner similar to high temperature, but did not induce α-helix formation. Thus, our data show that PGR is unusually resilient to conformational fluctuations or compaction and maintains a generally elongated configuration even at elevated temperatures.

We propose that the basis for these phenomena is implicit in the repeat sequence of PGR and depends on the local propensity of individual PGR residues to sample the PPII conformation and the cis/trans isomerization state of the prolines in this region. The far-UV CD spectra of PGR as a function of temperature (Fig. 3a) recapitulate the well-known behavior seen for the transition of polymers of proline, hydroxyproline, and other proline derivatives from PPII to PPI conformations (e.g., shifting from trans → cis proline isomers) [120]. Furthermore, Cammers-Goodwin et al. demonstrated that TFE increased the rate of cis/trans proline interconversion [121], which could explain why CD spectra for PGR at high (45-75%) TFE concentrations (Fig. 5d) mimic the spectra for PGR at elevated temperature (Fig. 3a). Finally, highly acidic conditions are also known to shift proline residues from trans to cis isomers [122], and our pH 2.0 data for PGR showed a similar loss of PPII content, albeit to a much smaller degree (Fig. 6b).

Within the consensus hextad repeat (AEPGKP), all residues except glycine have high PPII propensity on the Hilser scale [115] used for the power-law scaling relationship [42]. Two residues per hextad are prolines, with the highest PPII propensity (1.00), followed by lysine (0.56) and glutamate (0.42), which have the 2nd and 4th-highest PPII propensity values. Even alanine has a greater value (0.37) than average (0.35 over all 20 amino acids). In contrast, glycine has the lowest PPII propensity of all (0.13). Thus, a repeating pattern occurs with five PPII-prone residues separated by glycine (e.g., KPAEPG-KPAEPG). Taken together, our data suggest a model for PGR in which the majority of residues can transition independently between local random coil and PPII conformations, but with a bias toward PPII. We propose that the primary determinant of compaction is the extent of trans to cis isomerization of the proline residues. Overall, PGR contains 40 prolines, and on average 5-10% of prolines in IDPs are found in the cis isomer [114]. Thus, in the resting state (e.g., standard buffer at 25° C), we can anticipate that two to four of the prolines in PGR would be found in the cis isomer at any given time. Increasing temperature, high TFE concentrations, or very low pH will increase the net number of cis-Pro isomers, resulting in minimal compaction of PGR under these conditions. An interesting aspect of PGR is that even under conditions of reduced PPII content (e.g., at high temperature), the chain remains relatively elongated in solution. Similar sequences may prove to be useful tools for engineering biomaterials.
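The arithmetic behind this model can be illustrated with a minimal Python sketch. It uses only the five Hilser-scale propensities quoted in the text and the 5-10% cis-proline range; the function and variable names are illustrative and are not part of the original analysis. Because the sketch treats PGR as an idealized consensus hextad, the mean propensity it returns is somewhat higher than the fPPII of 0.5350 reported for the actual sequence.

```python
# PPII propensities quoted in the text (Hilser scale [115]).
# Only the five residues of the consensus hextad are included here.
HILSER_PPII = {"P": 1.00, "K": 0.56, "E": 0.42, "A": 0.37, "G": 0.13}


def mean_ppii_propensity(sequence):
    """Average per-residue PPII propensity of a sequence."""
    return sum(HILSER_PPII[residue] for residue in sequence) / len(sequence)


def expected_cis_prolines(n_prolines, cis_fraction):
    """Expected number of prolines in the cis isomer at a given cis fraction."""
    return n_prolines * cis_fraction


hextad = "AEPGKP"
hextad_mean = mean_ppii_propensity(hextad)

# 40 prolines in PGR, with 5-10% expected in the cis isomer.
cis_low = expected_cis_prolines(40, 0.05)
cis_high = expected_cis_prolines(40, 0.10)

print(f"Mean PPII propensity of {hextad}: {hextad_mean:.2f}")
print(f"Expected cis-prolines in PGR: {cis_low:.0f} to {cis_high:.0f}")
```

For the consensus hextad the mean propensity is about 0.58, and the expected number of cis-prolines is two to four, matching the estimate given in the text.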

These data support the hypothesis of PGR functioning as an extended stalk, pushing Aap out and away from the bacterial cell wall (see Fig. 7). It is likely that the PGR accomplishes several important functions by acting as an entropic chain. First, by extending away from the cell wall, PGR prevents steric hindrance during self-assembly of the B-repeat region. We have previously shown that the C-terminal G5 domain ‘half-repeat’ in the B-repeat region takes part in formation of an anti-parallel B-repeat dimer in the presence of Zn2+ [15, 18]. The engagement of an Aap B-repeat region from an opposing cell with this C-terminal G5 domain (which terminates immediately before the beginning of the PGR: see the EYGPT sequence in Figure 1) would be occluded by other CWA proteins if not for the PGR elevating the C-terminal G5 above the cell surface. In future work it will be of interest to genetically engineer S. epidermidis to express an Aap variant lacking the PGR to validate its biological function, and to further investigate similar low-complexity regions in other bacterial CWA proteins. Given the significant health burden caused by recurrent, hard-to-treat biofilm-related infections, a clearer understanding of the structural and functional characteristics of such intrinsically disordered regions in Aap, SasG, and other key CWA proteins will be important.


Figure 7. Model of Aap on the surface of S. epidermidis. The Gram-positive cell wall of S. epidermidis contains a peptidoglycan layer. Sortase A covalently links the LPXTG motif of Aap to Lipid II toward the outer side of the peptidoglycan layer. The P/G-rich region forms an extended stalk with high polyproline type II helix propensity and high resistance to compaction. This region pushes the B-repeat superdomain out and away from the cell surface where it can better interact with the B-repeat superdomain of adjacent S. epidermidis cells in the presence of Zn2+ ions and contribute to biofilm formation.

Table 1. Summary of hydrodynamic parameters determined in this study

Technique   s⁰20,w (Svedberg)   D⁰20,w (cm2/s)       f/f0 (a)   Rh (Å)
AUC         1.06                6.87 × 10-7          2.14       -
SEC         -                   -                    -          37.06 ± 1.1 (22° C)
DLS         -                   7.09 ± 0.4 × 10-7    -          38.39 ± 0.9 (25° C)

(a) Frictional ratio from 25 µM PGR at 20° C in 20 mM KPO4 pH 7.4, 150 mM NaCl


Materials and methods

Cloning

A gene encoding PGR, comprising amino acids 2225-2359 of Aap from S. epidermidis RP62A (UniProt accession no. Q9L470), was synthesized by LifeTechnologies GeneArt® and subcloned into the pENTR221 vector for use in the Gateway cloning system (Invitrogen). The gene was then transferred into a destination vector by LR clonase (Invitrogen) reaction. The destination vector pHisMBP-DEST was kindly provided by Dr. Artem Evdokimov and contained an N-terminal maltose binding protein (MBP) and hexahistidine tag which are cleavable by Tobacco Etch Virus (TEV) protease.

Protein Expression

BLR(DE3) Escherichia coli were transformed with pDEST-His-MBP-PGR plasmid. Cultures were grown to an OD600 near 1.0 before cooling to 10° C in an ice bath. Ethanol was added to 2% (v/v) and isopropyl β-D-1-thiogalactopyranoside (IPTG) to 200 µM, then cultures were incubated at 20° C for 14-16 hours at 200 rpm [123]. The cultures were centrifuged to pellet the bacteria, which were then resuspended in 20 mM Tris pH 7.4, 300 mM NaCl and frozen at -20° C.

Protein Purification and Storage

Frozen cells were thawed and lysed by sonication. His-tagged protein was isolated using a 5 mL Ni2+-charged HiTrap HP cartridge column (GE Healthcare) attached to an ÄktaPure M chromatography system. The column was washed with 20 mM Tris pH 7.4, 500 mM NaCl, 5 mM β-mercaptoethanol and eluted with a linear gradient from 0 to 1 M imidazole. The eluted protein was pooled and dialyzed into 20 mM Tris pH 7.4 and 150 mM NaCl before being cleaved for 6-16 hours by His-tagged TEV protease. The mixture was run over a Ni2+-charged HiTrap column to trap the cleaved His-tagged MBP and His-tagged TEV protease. The flow-through contained nearly pure PGR, which was dialyzed into 20 mM potassium phosphate pH 7.4 and 50 mM NaCl and purified by anion exchange (GE Healthcare) to remove any remaining contaminants. Purity was evaluated by size exclusion chromatography (Superdex 75 prep grade – GE Healthcare) and silver-stained SDS-PAGE gels. Purified PGR was filtered using a 0.22 µm syringe filter unit with PES membrane (EMD Millipore), which helped prevent degradation. After purification, PGR was stored at -80° C to prevent degradation. CD and AUC experiments before and after freeze/thaw cycles showed there were no adverse effects on protein structure or stability (data not shown).

SDS-PAGE and Silver Staining

SDS-PAGE was conducted using 4-20% Mini-PROTEAN® TGX™ Precast Protein Gels (BIO-RAD) at 175 V for 40 min at 25° C with running buffer containing 25 mM Tris, 250 mM glycine, and 0.1% SDS (pH 10). The samples were run under non-reducing conditions in Laemmli sample buffer containing 65.8 mM Tris-HCl, pH 6.8, 26.3% (w/v) glycerol, 2.1% SDS, and 0.01% bromophenol blue (BIO-RAD). Samples containing sample buffer were heated at 95° C for 5 min before loading. Staining was performed using the Pierce™ Silver Stain Kit (Thermo Scientific™).


Mass Spectrometry

Electrospray ionization mass spectrometry (ESI-MS) was performed on a sample of purified PGR in a 50:50 solution of H2O:acetonitrile with 0.1% formic acid. Prior to mass analysis, PGR was dialyzed against water and then diluted into the H2O:acetonitrile solution to a final concentration of 10 µM. For molecular weight measurement, mass spectra were collected using a Waters Synapt G2 ESI-Q-TOF Mass Spectrometer and analyzed with MassLynx. The ESI-MS experiments were performed with the following conditions: ESI capillary voltage, 3.5 kV; sample cone voltage, 35 V; extraction cone voltage, 3.5 V; source temperature, 150° C; desolvation temperature, 180° C; cone gas flow, 10 L/h; desolvation gas flow, 700 L/h (N2).

Analytical Ultracentrifugation

Sedimentation velocity experiments were performed using an XL-I analytical ultracentrifuge (Beckman Coulter) with absorbance optics at speeds of 45,000 or 48,000 rpm at 4, 20, or 37° C in An-60 Ti or An-50 Ti rotors. Two-sector epon-charcoal centerpieces were used with sapphire or quartz windows. Cells were scanned at a wavelength between 230 and 250 nm, depending on the sample concentration. PGR concentration was 25 µM (0.33 mg/ml) for experiments where concentration is not explicitly stated. Experiments were run for about 20 hours or until sedimentation progress slowed due to back-diffusion. Data were analyzed using the c(s) model in Sedfit [124] (sedfitsedphat.nibib.nih.gov), with 100-150 total scans loaded. Buffer densities and viscosities at 20° C were calculated using Sednterp [125] (sednterp.unh.edu). The viscosity of buffers at temperatures other than 20° C was calculated using Eq. 1, where η is the viscosity of the buffer b, or water w, at 20° C or temperature T. Eq. 2 shows the equivalent formula for density, ρ. The viscosity and density of water at 4, 20, and 37° C were obtained from the NIST Chemistry WebBook [126].

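$$\eta_{T,b} = \eta_{20,b}\,\frac{\eta_{T,w}}{\eta_{20,w}} \tag{1}$$

$$\rho_{T,b} = \rho_{20,b}\,\frac{\rho_{T,w}}{\rho_{20,w}} \tag{2}$$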
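As a minimal sketch of how Eqs. 1 and 2 can be applied, the Python snippet below scales a buffer property from 20° C to another temperature by the corresponding ratio for pure water. The water values are approximate handbook figures and the buffer values at 20° C are hypothetical placeholders; neither set is taken from this study, where the inputs came from Sednterp and the NIST Chemistry WebBook.

```python
def scale_from_20C(buffer_value_20, water_value_T, water_value_20):
    """Eq. 1 (viscosity) or Eq. 2 (density): scale a buffer property at 20 C
    by the ratio of the same property for water at T and at 20 C."""
    return buffer_value_20 * water_value_T / water_value_20


# Approximate values for pure water, assumed here for illustration only.
WATER_VISCOSITY_CP = {4: 1.567, 20: 1.002, 37: 0.692}
WATER_DENSITY_G_ML = {4: 1.0000, 20: 0.9982, 37: 0.9933}

# Hypothetical buffer properties at 20 C (placeholders, not study values).
BUFFER_VISCOSITY_20_CP = 1.020
BUFFER_DENSITY_20_G_ML = 1.0050


def buffer_properties(temp_c):
    """Return (viscosity in cP, density in g/ml) of the buffer at temp_c."""
    eta = scale_from_20C(BUFFER_VISCOSITY_20_CP,
                         WATER_VISCOSITY_CP[temp_c], WATER_VISCOSITY_CP[20])
    rho = scale_from_20C(BUFFER_DENSITY_20_G_ML,
                         WATER_DENSITY_G_ML[temp_c], WATER_DENSITY_G_ML[20])
    return eta, rho


for temp_c in (4, 20, 37):
    eta, rho = buffer_properties(temp_c)
    print(f"{temp_c:>2} C: viscosity = {eta:.3f} cP, density = {rho:.4f} g/ml")
```

At 20° C the function returns the input buffer values unchanged, which is a quick sanity check on the direction of the ratio.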

The partial specific volume of PGR was calculated to be 0.71703 ml/g based on amino acid sequence using Sednterp. GUSSI [127] was used to produce a sedimentation velocity data-fit-residual plot (Fig. 1d), as well as the c(s) distribution in Figure 1e. All other AUC data displayed were plotted in Sigmaplot (systatsoftware.com). Apparent sedimentation coefficients (s*) were converted to s20,w (the sedimentation coefficient at standard conditions of 20° C in water) in Sedfit based on Eq. 3 [128],

$$s_{20,w} = s_{T,b}\left(\frac{(1-\bar{v}\rho)_{20,w}\,\eta_{T,b}}{(1-\bar{v}\rho)_{T,b}\,\eta_{20,w}}\right) \tag{3}$$

where v̄ is the partial specific volume, the subscripts denote water (w) or buffer (b) at 20° C or the experimental temperature, T, and ρ and η are the solution density and viscosity, respectively. The s⁰20,w value was calculated from linear extrapolation of s20,w values from a concentration series to infinite dilution.
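The extrapolation to infinite dilution can be sketched in Python as an ordinary least-squares line through the concentration series. The example below uses the s20,w values listed in Table S1; whether the original analysis used exactly these points, or an unweighted fit, is an assumption, and the helper name `linear_fit` is illustrative.

```python
def linear_fit(x, y):
    """Unweighted least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Concentration series from Table S1 (PGR concentration in uM, s20,w in S).
concentration_uM = [25, 75, 150, 225, 300]
s20w = [1.05, 1.05, 1.03, 1.02, 0.99]

slope, s0_20w = linear_fit(concentration_uM, s20w)

# The intercept at zero concentration is the estimate of s0(20,w).
print(f"slope = {slope:.2e} S/uM, s0(20,w) = {s0_20w:.2f} S")
```

With these inputs the intercept is approximately 1.06 S, in line with the s⁰20,w reported in Table 1, and the negative slope reflects the slight decrease in s20,w at higher concentration.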


The c(s,f/f0) size-and-shape distributions were also determined using Sedfit [77].

A total of 100-150 scans were loaded. The sedimentation coefficient values spanned a range of 1.25 S along the uncorrected sedimentation coefficient (s*) axis with a resolution of 20 or 25, while the f/f0 axis ranged from 1-4 with a resolution of 10. The radial resolution was set to 0.003 to decrease processing time. The distributions were plotted in MATLAB (MathWorks.com).

Predictions of Disorder

The PONDR (Predictor of Natural Disordered Regions) server (www.pondr.com) was utilized for determining the absolute mean net charge and mean scaled hydropathy values for PGR in the Uversky Plot, as well as performing the VLXT prediction. The RONN v3.2 predictor was accessed at www.strubi.ox.ac.uk/RONN. IUPred (iupred.enzim.hu) was used with the long disorder prediction type selected. FoldIndex was accessed at http://bip.weizmann.ac.il/fldbin/findex and the sequence was submitted with a window size of 10 and step size of 1. Additional disorder prediction programs were accessed through the Database of Disordered Protein Predictions (www.d2p2.pro) [60]. CIDER (pappulab.wustl.edu/CIDER/) was used to calculate parameters such as kappa (a description of charged residue mixing [69]), fraction of charged residues (FCR), net charge per residue (NCPR), hydropathy, and the fraction of disorder-promoting residues. It should be noted that CIDER does not predict if a sequence will be disordered; the service was utilized in this work to provide parameters for classification of PGR and for investigating conformations expected to be sampled by PGR.
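To make the simplest of these sequence parameters concrete, the following Python sketch computes the fractions of positively and negatively charged residues, FCR, and NCPR directly from a sequence string. It is not the CIDER implementation: histidine is treated as neutral by assumption, kappa is not attempted, and the idealized repeat used as input is only a stand-in for the actual PGR sequence.

```python
POSITIVE = set("KR")   # assumption: His is treated as uncharged
NEGATIVE = set("DE")


def charge_fractions(sequence):
    """Return f+, f-, FCR (f+ + f-) and NCPR (f+ - f-) for a sequence."""
    n = len(sequence)
    f_plus = sum(residue in POSITIVE for residue in sequence) / n
    f_minus = sum(residue in NEGATIVE for residue in sequence) / n
    return {
        "f_plus": f_plus,
        "f_minus": f_minus,
        "FCR": f_plus + f_minus,
        "NCPR": f_plus - f_minus,
    }


# Idealized stretch of consensus hextads (illustrative, not the real PGR).
idealized = "AEPGKP" * 3
for name, value in charge_fractions(idealized).items():
    print(f"{name}: {value:.3f}")
```

The idealized repeat is charge-balanced (NCPR of zero, FCR of one third), whereas the actual PGR has a lower FCR (0.259) and a net charge of -7. The Das-Pappu coordinates given in Fig. S1c (0.10370, 0.15556) would correspond to 14 and 21 charged residues out of 135, which is consistent with both of those values.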


Circular Dichroism Spectroscopy

All CD experiments utilized an Aviv 215 circular dichroism spectrophotometer with an Aviv peltier junction temperature control system to measure far-UV spectra at temperatures between 5° C and 95° C. Samples were loaded into a 0.5 mm quartz cuvette (Hellma Analytics). Wavelength scans were taken in 0.5 nm steps with a 10 s averaging time and 0.333 s settling time. For the 5° C - 95° C temperature experiments, nine wavelength scans at each temperature were averaged together from three separate protein samples, each with three scans. All other CD spectra were the average of three wavelength scans from one protein sample. PGR in 20 mM potassium phosphate pH 7.4 and 50 mM NaF was measured at 50 µM (0.66 mg/ml). The urea used was 8 M ultra-pure grade solution (Amresco), and guanidine HCl was 8.0 M high purity solution (Pierce). Machine units in millidegrees, θ, were converted to the mean residue ellipticity, [θ], using Eq. 4:

$$[\theta] = \frac{\theta \times MRW}{10 \times l \times c} \tag{4}$$

where MRW is the molecular weight of PGR divided by the number of residues, l is the path length in cm, and c is the concentration in mg/ml. The MRW value for PGR is 97.03 g mol-1 residue-1. The mean residue ellipticity, [θ], has the units of degrees cm2 dmol-1 residue-1.
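A short Python sketch of this unit conversion is given below, assuming the standard form of Eq. 4 in which the signal in millidegrees is multiplied by MRW and divided by ten times the path length (cm) and the concentration (mg/ml). The MRW, path length, and concentration are those stated for PGR in this section; the raw signal of -10 mdeg is a hypothetical reading used only to exercise the function.

```python
def mean_residue_ellipticity(theta_mdeg, mrw, path_cm, conc_mg_ml):
    """Convert a CD signal in millidegrees to mean residue ellipticity
    (deg cm^2 dmol^-1 residue^-1)."""
    return theta_mdeg * mrw / (10.0 * path_cm * conc_mg_ml)


MRW_PGR = 97.03        # g mol^-1 residue^-1, from the text
PATH_LENGTH_CM = 0.05  # 0.5 mm quartz cuvette
CONC_MG_ML = 0.66      # 50 uM PGR

# Hypothetical raw reading, for illustration only.
raw_signal_mdeg = -10.0
mre = mean_residue_ellipticity(raw_signal_mdeg, MRW_PGR,
                               PATH_LENGTH_CM, CONC_MG_ML)
print(f"[theta] = {mre:.0f} deg cm^2 dmol^-1 residue^-1")
```

Under these conditions each millidegree of raw signal corresponds to roughly 294 units of mean residue ellipticity.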


Dynamic Light Scattering

A Malvern Zen 3600 Zetasizer Nano was used to measure the size of 1-1.5 mg/ml PGR in 20 mM potassium phosphate pH 7.4, 150 mM NaCl at different temperatures. Samples were filtered using a 0.2 µm-pore size PVDF syringe-driven filter (EMD Millipore) immediately prior to use. The diffusion coefficient was measured using 600 µL samples in teflon-capped quartz micro-cuvettes that were allowed to equilibrate at the set temperature for 15 minutes. The number of runs per measurement was set to automatic. Measurements were made at 5° C first; the sample temperature was then increased in 10° C steps to 45° C, followed by cooling back to 5° C and repeating the cycle of measurements at different temperatures. The reported values represent the average and standard deviation from 5 measurements at each temperature.

The Stokes-Einstein equation (Eq. 5) was used to determine the apparent hydrodynamic radius (Rh) based on the measured diffusion coefficient, D,

$$R_h = \frac{kT}{6\pi\eta D} \tag{5}$$

where k is the Boltzmann constant, T is the temperature in Kelvin, and η is the viscosity of the solvent. Solvent viscosity was estimated using the solvent builder software from Malvern based on Sednterp [125].
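The Stokes-Einstein conversion in Eq. 5 can be sketched in Python as follows. The diffusion coefficient and viscosity in the example are illustrative inputs (a round-number D and an approximate viscosity for water at 25° C), not measurements from this study, and the function name is hypothetical.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23


def hydrodynamic_radius_angstrom(d_cm2_per_s, temp_kelvin, viscosity_pa_s):
    """Stokes-Einstein: Rh = kT / (6 * pi * eta * D), returned in angstroms."""
    d_m2_per_s = d_cm2_per_s * 1e-4            # cm^2/s -> m^2/s
    rh_m = BOLTZMANN_J_PER_K * temp_kelvin / (
        6.0 * math.pi * viscosity_pa_s * d_m2_per_s)
    return rh_m * 1e10                          # m -> angstrom


# Illustrative inputs only: D = 6.0e-7 cm^2/s at 25 C in a water-like solvent.
example_rh = hydrodynamic_radius_angstrom(6.0e-7, 298.15, 0.89e-3)
print(f"Rh = {example_rh:.1f} angstrom")
```

Because Rh is inversely proportional to D, halving the diffusion coefficient at fixed temperature and viscosity doubles the apparent radius.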


Size Exclusion Chromatography

Sephadex G-100 (GE Healthcare) was equilibrated in 10 mM sodium phosphate pH 7.0, 100 mM NaCl. A Bio-Rad BioLogic LP System was used to monitor UV absorbance at 280 nm to determine elution volumes (Ve). Samples of PGR (250 µL) at concentrations of 3-4 mg/ml were required to see an elution peak at 280 nm, due to the lack of aromatic residues in PGR. To verify that this peak corresponded to PGR, a single tyrosine residue was added to the C-terminus (PGR-Tyr), and this construct was measured by SEC at 1 mg/ml. To determine the void (V0) and total column volume (Vt), 10 µL of 3 mg/ml blue dextran and 0.03 mg/ml 2,4-dinitrophenyl-L-aspartate were run through the column separately from the protein. The thermodynamic retention factor (KD) was calculated using Eq. 6:

$$K_D = \frac{V_e - V_0}{V_t - V_0} \tag{6}$$

To determine the hydrodynamic radius of PGR, Rh values based on crystal structures of globular protein standards were plotted against the experimentally determined KD values. The Rh values of the crystal structures were estimated as one-half the maximal Cα-Cα distance. A linear regression was performed on these protein standards. The KD of the protein sample was inserted into the linear equation of best fit to determine the Rh. The same protocol was used to measure KD for repeat experiments using Sephadex G-75.
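The calibration procedure described above can be sketched in Python: compute KD from elution volumes, fit a line of Rh against KD for the standards, and read off Rh for the sample. All volumes and standard radii below are hypothetical placeholders chosen to lie on a straight line; they are not the standards or elution volumes used in this study.

```python
def retention_factor(ve, v0, vt):
    """Thermodynamic retention factor, KD = (Ve - V0) / (Vt - V0)."""
    return (ve - v0) / (vt - v0)


def fit_line(x, y):
    """Unweighted least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical column volumes (ml) and calibration standards.
V0, VT = 40.0, 120.0
standard_ve_ml = [56.0, 72.0, 88.0]
standard_rh_angstrom = [30.0, 25.0, 20.0]

standard_kd = [retention_factor(ve, V0, VT) for ve in standard_ve_ml]
slope, intercept = fit_line(standard_kd, standard_rh_angstrom)

# Hypothetical sample eluting at 48 ml.
sample_kd = retention_factor(48.0, V0, VT)
sample_rh = slope * sample_kd + intercept
print(f"KD = {sample_kd:.2f}, Rh = {sample_rh:.1f} angstrom")
```

A species eluting at the void volume has KD of 0 and one eluting at the total column volume has KD of 1, so larger Rh values correspond to smaller KD and the fitted slope is negative.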


Rh Prediction

The previously described algorithm utilizes information regarding the PPII propensity to predict Rh of an IDP based on the amino acid sequence [42]. The input sequence was amino acids 2225-2359 of S. epidermidis Aap (UniProt accession no. Q9L470). To predict the hydrodynamic radius (Rh) of PGR based on intrinsic PPII propensities, a power-law scaling relationship (Eq. 7) was used which is based on the number of residues (N) and the fraction of polyproline type II helix in the peptide chain (fPPII,chain):

$$R_h = (2.16\ \text{Å})\, N^{\,0.503 - 0.11\,\ln(1 - f_{PPII,chain})} \tag{7}$$

The chain propensity for PPII structure, fPPII,chain, is based on the experimental scale from Hilser [115] that utilized a peptide host-guest system in which the C. elegans Sem-5 SH3 domain binds a peptide in the PPII conformation. A non-interacting residue of the peptide was substituted with each amino acid before binding was measured by isothermal titration calorimetry. The value for fPPII,chain in Eq. 7 was determined by Eq. 8, where N is the number of residues and PPIIprop is the PPII propensity from the Hilser scale for each amino acid in the sequence from 1 to N:

$$f_{PPII,chain} = \sum_{i=1}^{N} PPII_{prop,i} \,/\, N \tag{8}$$

Predicting Rh from Eq. 7 for PGR yields 38.50 Å, based on fPPII,chain found to be 0.5350 for the PGR sequence.
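The power-law relationship can be checked numerically with a minimal Python sketch. The prefactor of 2.16 Å appears in the text; the exponent constants 0.503 and 0.11 are taken here from the published form of the Tomasso et al. relationship [42] and should be read as an assumption of this sketch, tested only against the predicted values reported in this chapter (38.50 Å for PGR and the entries in Table S4). The function name is illustrative.

```python
import math


def predict_rh_angstrom(n_residues, f_ppii_chain):
    """Power-law estimate of Rh (angstrom) from chain length and fPPII,chain.

    Rh = 2.16 * N ** (0.503 - 0.11 * ln(1 - fPPII,chain))
    """
    exponent = 0.503 - 0.11 * math.log(1.0 - f_ppii_chain)
    return 2.16 * n_residues ** exponent


# (name, N, fPPII, predicted Rh reported in the text or in Table S4)
reported = [
    ("Aap-PGR", 135, 0.5350, 38.50),
    ("p53(1-93)", 93, 0.4890, 29.51),
    ("Securin", 202, 0.4130, 42.57),
]

for name, n, f_ppii, rh_reported in reported:
    rh = predict_rh_angstrom(n, f_ppii)
    print(f"{name}: calculated {rh:.2f}, reported {rh_reported:.2f}")
```

The calculated values agree with the reported predictions to within rounding. Because fPPII,chain enters through the exponent, a higher PPII fraction increases the predicted Rh for a given chain length.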


Marsh and Forman-Kay have also published an equation for predicting hydrodynamic radius [53], based upon the same power-law scaling relationship originally employed by Wilkins et al. to describe folded and chemically denatured proteins [129]. The Marsh and Forman-Kay equation is shown below:

$$R_h = (A\,P_{pro} + B)\,(C\,|Q| + D)\,S_{his^*}\,R_0\,N^{\nu} \tag{9}$$

This variation of the power-law scaling relationship includes the fraction of proline residues (Ppro) and absolute net charge (|Q|), as well as constants A-D. The Shis* term is 1 in the case of PGR, because no histidine tag is present. The constants R0 and ν are 2.49 and 0.509, respectively.

Multiple Sequence Alignment

The Clustal Omega [130] web server (v1.2.1) was accessed at The European Bioinformatics Institute (www.ebi.ac.uk). Each sequence was obtained from the UniProt database [131] as a complete sequence for the accumulation-associated protein in Staphylococcus epidermidis. The UniProt accession number precedes each strain name in the alignment.


Supplementary Figures

Figure S1. PGR is predicted to be intrinsically disordered. Panel (a) is the Uversky plot, showing PGR (green triangle – (0.3454, 0.0522)) lies on the portion of the plot where disordered proteins (black circles) tend to fall. Ordered proteins are shown in white circles. In panel (b), the results from several disorder predictions are plotted. The y-axis units should be considered arbitrary, as these algorithms have different ranges for their predictions; however, these results strongly support the hypothesis that PGR is an IDP. Panel (c) shows PGR (green triangle – (0.10370, 0.15556)) in the boundary region (2) of the Das-Pappu phase plot separating weak polyampholytes/polyelectrolytes (1) and strong polyampholytes (3). Panel (d) shows secondary structure predictions by SABLE, SCRATCH, and GOR servers, each predicting essentially complete random coil.


Figure S2. The sequence of PGR is highly conserved among S. epidermidis strains. Identical residues are marked with an asterisk (*), highly conserved residues with a colon (:), and weakly conserved residues with a period (.). The UniProt accession number and the name of the strain identify each sequence in the alignment. The NCTC 11047 strain (UniProt accession no. E0ACJ2) is significantly longer than the strains shown above, having an additional 7 AEPGKP repeats compared to RP62A (UniProt accession no. Q9L470), and thus was omitted for clarity. The Aap from strain PM221 (GenBank accession no. CDM15051) contains a region between the last half B-repeat and the LPXTG motif which does not resemble the PGR of any of the above strains, either in number of residues or in amino acid content.


Supplementary Tables

Table S1. Concentration dependence of sedimentation velocity AUC data

PGR concentration   s20,w (a)   f/f0 (b)   Mcalc (c)
25 µM               1.05        2.14       14.4 kDa
75 µM               1.05        2.11       14.0 kDa
150 µM              1.03        2.15       14.1 kDa
225 µM              1.02        2.18       14.1 kDa
300 µM              0.99        2.32       14.8 kDa

(a) The sedimentation coefficient standardized to 20° C and pure water.
(b) Frictional ratio – the experimental frictional coefficient divided by the frictional coefficient of an ideal, non-hydrated sphere.
(c) The molecular weight calculated from the sedimentation coefficient and frictional ratio.

Table S2. Temperature dependence of sedimentation velocity AUC data

Temperature (° C)   s20,w (a)   f/f0 (b)   Mcalc (c)
4° C                1.06        2.14       15.3 kDa
20° C               1.05        2.14       14.4 kDa
37° C               1.03        2.10       13.0 kDa

(a) The sedimentation coefficient standardized to 20° C and pure water.
(b) Frictional ratio – the experimental frictional coefficient divided by the frictional coefficient of an ideal, non-hydrated sphere.
(c) The molecular weight calculated from the sedimentation coefficient and frictional ratio.


Table S3. Salt dependence of sedimentation velocity AUC data

Temperature (° C)   NaCl concentration   s20,w (a)   f/f0 (b)   Mcalc (c)
4° C                30 mM                1.04        2.05       13.8 kDa
                    100 mM               1.06        2.02       13.9 kDa
                    300 mM               1.06        2.00       13.8 kDa
                    1 M                  1.03        2.00       13.2 kDa
20° C               30 mM                1.04        2.09       13.7 kDa
                    100 mM               1.05        2.08       13.8 kDa
                    300 mM               1.04        2.08       13.7 kDa
                    1 M                  1.00        2.19       13.8 kDa
37° C               30 mM                1.04        2.04       12.6 kDa
                    100 mM               1.05        2.05       12.9 kDa
                    300 mM               1.03        2.09       13.0 kDa
                    1 M                  0.98        2.12       12.2 kDa

(a) The sedimentation coefficient standardized to 20° C and pure water.
(b) Frictional ratio – the experimental frictional coefficient divided by the frictional coefficient of an ideal, non-hydrated sphere.
(c) The molecular weight calculated from the sedimentation coefficient and frictional ratio.


Table S4. Comparison of hydrodynamic properties for PGR to a dataset of studied IDPs

Sequence          N (a)   Net charge   Rh predicted (b)   fPPII (c)   Rh observed (d)   Reference
Aap-PGR           135     7            38.50              0.5350      37.06 (e)         This work
p53(1-93)         93      15           29.51              0.4890      32.4              [52]
p53(1-93) ALA-    93      15           28.66              0.4581      30.4              [52]
p53 TAD           73      14           24.79              0.4500      23.8              [135]
Securin           202     1            42.57              0.4130      39.7              [136]
PDE-γ             87      4            26.51              0.4122      24.8              [137]
Cad136            136     9            33.77              0.4025      28.1              [138]
HIF1-α-403        202     29           42.13              0.4024      44.3              [139]
Tau-K45           198     19           41.52              0.3988      45.0              [140]
HIF1-α-530        170     10           37.81              0.3899      38.3              [139]
Fos-AD            168     16           37.17              0.3783      35.0              [141]
ShB-C             146     4            34.32              0.3764      32.9              [142]
α-synuclein       140     9            33.47              0.3744      28.2              [143]
Mlph(147-403)     260     28           47.00              0.3703      49.0              [144]
CFTR-R-region     189     5            39.18              0.3644      32.0              [145]
p57-ID            73      6            23.14              0.3636      24.0              [146]
prothymosin-α     110     43           29.02              0.3633      33.7              [147]
LJIDP1            94      4            26.46              0.3565      24.5              [148]
Mlph(147-240)     97      15           26.85              0.3528      28.0              [144]
SNAP25            206     14           40.60              0.3513      39.7              [149]
Hdm2-ABD          97      29           26.47              0.3345      25.7              [150]
Vmw65             89      19           25.13              0.3278      28.0              [151]
p53(1-93) PRO-    93      15           24.93              0.2832      27.4              [52]

(a) The number of amino acids in the sequence.
(b) The predicted Rh from sequence, according to the power-law scaling relationship (Eq. 7).
(c) The fractional number of PPII residues from sequence and according to intrinsic PPII propensities [42].
(d) The Rh of the IDP as measured experimentally in the reference listed in the final column.
(e) As measured in this study, listed is the average of SEC and DLS measurements.
IDP dataset adapted from Tomasso et al. [42] and sorted by fPPII.

Table S5. Folded proteins and hydrodynamic measurements from literature

Sequence                                N (a)    Rh (b)   Reference
staphylococcal nuclease                 151‡     22.5     [51]
human recombinant lysozyme              132‡     21.8     [51]
bovine erythrocyte carbonic anhydrase   267‡     26.8     [51]
bovine pancreatic trypsin inhibitor     58       15.8     [53]
SH3 domain of PI3 kinase                90       18.6     [53]
horse heart cytochrome c                104      17.8     [53]
hen lysozyme                            129      20.5     [53]
horse myoglobin                         153      21.2     [53]
bovine alpha-lactalbumin                123      18.8     [53]
bovine pancreatic ribonuclease A        124      19.0     [53]
sperm whale apomyoglobin                153      20.9     [53]
ubiquitin                               76       16.5     [53]
(apo)cytochrome C                       104      18.5     [53]
α-lactalbumin                           123      18.5     [53]
tumor suppressor, p16                   156      20.0     [53]
(apo)myoglobin                          154      20.9     [53]
β-lactoglobulin                         162      22.0     [53]
sarcoplasmic calcium binding            174      21.5     [53]
adenylate kinase                        194      21.9     [53]
tryptophan synthase                     268      24.2     [53]
β-lactamase                             257      23.7     [53]
carbonic anhydrase B                    260      23.3     [53]
RTEM β-lactamase                        263      24.5     [53]

(a) The number of amino acids in the sequence, taken from [53] unless otherwise noted.
(b) Hydrodynamic radius, in Å, reported by the reference listed in the final column.
‡ The number of residues was estimated from the MW using the average of 111.6 Da/residue.
Folded protein data set adapted from [53]; see [132, 152, 153] for additional details and individual references.
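The ‡ footnote of Table S5 describes estimating a residue count by dividing the molecular weight by an average residue mass of 111.6 Da. A minimal sketch of that arithmetic follows; the function name and the example molecular weight are hypothetical and are not values taken from the cited references.

```python
AVERAGE_RESIDUE_MASS_DA = 111.6  # average residue mass from the Table S5 footnote

def estimate_residue_count(molecular_weight_da):
    """Estimate the number of residues from a molecular weight in Da."""
    return round(molecular_weight_da / AVERAGE_RESIDUE_MASS_DA)

# Hypothetical input: a protein with a molecular weight of 16.85 kDa
print(estimate_residue_count(16850.0))
```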

References

1. Otto M. Staphylococcal biofilms. Current topics in microbiology and immunology. 2008;322:207-28. Epub 2008/05/06. PubMed PMID: 18453278; PubMed Central PMCID: PMC2777538.
2. Arber N, Pras E, Copperman Y, Schapiro JM, Meiner V, Lossos IS, et al. Pacemaker endocarditis. Report of 44 cases and review of the literature. Medicine (Baltim). 1994;73(6):299-305. Epub 1994/11/01. PubMed PMID: 7984081.
3. McCann MT, Gilmore BF, Gorman SP. Staphylococcus epidermidis device-related infections: pathogenesis and clinical management. J Pharm Pharmacol. 2008;60(12):1551-71. Epub 2008/11/13. doi: 10.1211/jpp/60.12.0001. PubMed PMID: 19000360.
4. Otto M. Staphylococcus epidermidis--the 'accidental' pathogen. Nature reviews Microbiology. 2009;7(8):555-67. doi: 10.1038/nrmicro2182. PubMed PMID: 19609257; PubMed Central PMCID: PMC2807625.
5. Fey PD, Olson ME. Current concepts in biofilm formation of Staphylococcus epidermidis. Future Microbiol. 2010;5(6):917-33. doi: 10.2217/fmb.10.56. PubMed PMID: 20521936; PubMed Central PMCID: PMC2903046.
6. Costerton JW, Stewart PS, Greenberg EP. Bacterial biofilms: a common cause of persistent infections. Science (New York, NY). 1999;284(5418):1318-22. Epub 1999/05/21. PubMed PMID: 10334980.
7. Darouiche RO. Treatment of infections associated with surgical implants. N Engl J Med. 2004;350(14):1422-9. Epub 2004/04/09. doi: 10.1056/NEJMra035415. PubMed PMID: 15070792.
8. Gerke C, Kraft A, Sussmuth R, Schweitzer O, Gotz F. Characterization of the N-acetylglucosaminyltransferase activity involved in the biosynthesis of the Staphylococcus epidermidis polysaccharide intercellular adhesin. J Biol Chem. 1998;273(29):18586-93. Epub 1998/07/11. PubMed PMID: 9660830.
9. Hussain M, Herrmann M, von Eiff C, Perdreau-Remington F, Peters G. A 140-kilodalton extracellular protein is essential for the accumulation of Staphylococcus epidermidis strains on surfaces. Infection and immunity. 1997;65(2):519-24. Epub 1997/02/01. PubMed PMID: 9009307; PubMed Central PMCID: PMC176090.
10. Rohde H, Burandt EC, Siemssen N, Frommelt L, Burdelski C, Wurster S, et al. Polysaccharide intercellular adhesin or protein factors in biofilm accumulation of Staphylococcus epidermidis and Staphylococcus aureus isolated from prosthetic hip and knee joint infections. Biomaterials. 2007;28(9):1711-20. doi: 10.1016/j.biomaterials.2006.11.046. PubMed PMID: 17187854.
11. Schaeffer CR, Woods KM, Longo GM, Kiedrowski MR, Paharik AE, Buttner H, et al. Accumulation-associated protein enhances Staphylococcus epidermidis biofilm formation under dynamic conditions and is required for infection in a rat catheter model. Infection and immunity. 2015;83(1):214-26. Epub 2014/10/22. doi: 10.1128/iai.02177-14. PubMed PMID: 25332125; PubMed Central PMCID: PMC4288872.
12. Banner MA, Cunniffe JG, Macintosh RL, Foster TJ, Rohde H, Mack D, et al. Localized tufts of fibrils on Staphylococcus epidermidis NCTC 11047 are comprised of the accumulation-associated protein. J Bacteriol. 2007;189(7):2793-804. doi: 10.1128/JB.00952-06. PubMed PMID: 17277069; PubMed Central PMCID: PMC1855787.
13. Conlon BP, Geoghegan JA, Waters EM, McCarthy H, Rowe SE, Davies JR, et al. Role for the A domain of unprocessed accumulation-associated protein (Aap) in the attachment phase of the Staphylococcus epidermidis biofilm phenotype. J Bacteriol. 2014;196(24):4268-75. Epub 2014/10/01. doi: 10.1128/jb.01946-14. PubMed PMID: 25266380; PubMed Central PMCID: PMC4248850.
14. Rohde H, Burdelski C, Bartscht K, Hussain M, Buck F, Horstkotte MA, et al. Induction of Staphylococcus epidermidis biofilm formation via proteolytic processing of the accumulation-associated protein by staphylococcal and host proteases. Molecular microbiology. 2005;55(6):1883-95. doi: 10.1111/j.1365-2958.2005.04515.x. PubMed PMID: 15752207.
15. Conrady DG, Brescia CC, Horii K, Weiss AA, Hassett DJ, Herr AB. A zinc-dependent adhesion module is responsible for intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2008;105(49):19456-61. doi: 10.1073/pnas.0807717105. PubMed PMID: 19047636; PubMed Central PMCID: PMC2592360.
16. Corrigan RM, Rigby D, Handley P, Foster TJ. The role of Staphylococcus aureus surface protein SasG in adherence and biofilm formation. Microbiology (Reading, England). 2007;153(Pt 8):2435-46. Epub 2007/07/31. doi: 10.1099/mic.0.2007/006676-0. PubMed PMID: 17660408.
17. Macintosh RL, Brittan JL, Bhattacharya R, Jenkinson HF, Derrick J, Upton M, et al. The terminal A domain of the fibrillar accumulation-associated protein (Aap) of Staphylococcus epidermidis mediates adhesion to human corneocytes. J Bacteriol. 2009;191(22):7007-16. Epub 2009/09/15. doi: 10.1128/jb.00764-09. PubMed PMID: 19749046; PubMed Central PMCID: PMC2772481.
18. Conrady DG, Wilson JJ, Herr AB. Structural basis for Zn2+-dependent intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2013;110(3):E202-11. Epub 2013/01/02. doi: 10.1073/pnas.1208134110. PubMed PMID: 23277549; PubMed Central PMCID: PMC3549106.
19. Marraffini LA, Dedent AC, Schneewind O. Sortases and the art of anchoring proteins to the envelopes of gram-positive bacteria. Microbiol Mol Biol Rev. 2006;70(1):192-221. Epub 2006/03/10. doi: 10.1128/mmbr.70.1.192-221.2006. PubMed PMID: 16524923; PubMed Central PMCID: PMC1393253.
20. Mazmanian SK, Skaar EP, Gaspar AH, Humayun M, Gornicki P, Jelenska J, et al. Passage of heme-iron across the envelope of Staphylococcus aureus. Science. 2003;299(5608):906-9. Epub 2003/02/08. doi: 10.1126/science.1081147. PubMed PMID: 12574635.
21. Areschoug T, Linse S, Stalhammar-Carlemalm M, Heden LO, Lindahl G. A proline-rich region with a highly periodic sequence in Streptococcal beta protein adopts the polyproline II structure and is exposed on the bacterial surface. J Bacteriol. 2002;184(22):6376-83. Epub 2002/10/26. PubMed PMID: 12399508; PubMed Central PMCID: PMC151936.
22. Hollingshead SK, Fischetti VA, Scott JR. Complete nucleotide sequence of type 6 M protein of the group A Streptococcus. Repetitive structure and membrane anchor. J Biol Chem. 1986;261(4):1677-86. Epub 1986/02/05. PubMed PMID: 3511046.
23. Jerlstrom PG, Chhatwal GS, Timmis KN. The IgA-binding beta antigen of the c protein complex of Group B streptococci: sequence determination of its gene and detection of two binding regions. Mol Microbiol. 1991;5(4):843-9. Epub 1991/04/01. PubMed PMID: 1857207.
24. McCrea KW, Hartford O, Davis S, Eidhin DN, Lina G, Speziale P, et al. The serine-aspartate repeat (Sdr) protein family in Staphylococcus epidermidis. Microbiology. 2000;146(Pt 7):1535-46. Epub 2000/07/06. doi: 10.1099/00221287-146-7-1535. PubMed PMID: 10878118.
25. Penkett CJ, Redfield C, Dodd I, Hubbard J, McBay DL, Mossakowska DE, et al. NMR analysis of main-chain conformational preferences in an unfolded fibronectin-binding protein. J Mol Biol. 1997;274(2):152-9. Epub 1998/02/12. doi: 10.1006/jmbi.1997.1369. PubMed PMID: 9398523.
26. Linke C, Young PG, Kang HJ, Bunker RD, Middleditch MJ, Caradoc-Davies TT, et al. Crystal structure of the minor pilin FctB reveals determinants of Group A streptococcal pilus anchoring. J Biol Chem. 2010;285(26):20381-9. Epub 2010/04/30. doi: 10.1074/jbc.M109.089680. PubMed PMID: 20427291; PubMed Central PMCID: PMC2888449.
27. Bhate M, Wang X, Baum J, Brodsky B. Folding and conformational consequences of glycine to alanine replacements at different positions in a collagen model peptide. Biochemistry. 2002;41(20):6539-47. Epub 2002/05/16. PubMed PMID: 12009919.
28. Bryan MA, Cheng H, Brodsky B. Sequence environment of mutation affects stability and folding in collagen model peptides of osteogenesis imperfecta. Biopolymers. 2011;96(1):4-13. Epub 2010/03/18. doi: 10.1002/bip.21432. PubMed PMID: 20235194; PubMed Central PMCID: PMC2980582.
29. Wright PE, Dyson HJ. Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm. J Mol Biol. 1999;293(2):321-31. Epub 1999/11/05. doi: 10.1006/jmbi.1999.3110. PubMed PMID: 10550212.
30. Forman-Kay JD, Mittag T. From sequence and forces to structure, function, and evolution of intrinsically disordered proteins. Structure. 2013;21(9):1492-9. Epub 2013/09/10. doi: 10.1016/j.str.2013.08.001. PubMed PMID: 24010708; PubMed Central PMCID: PMC4704097.
31. Habchi J, Tompa P, Longhi S, Uversky VN. Introducing protein intrinsic disorder. Chem Rev. 2014;114(13):6561-88. Epub 2014/04/18. doi: 10.1021/cr400514h. PubMed PMID: 24739139.
32. Das RK, Ruff KM, Pappu RV. Relating sequence encoded information to form and function of intrinsically disordered proteins. Curr Opin Struct Biol. 2015;32:102-12. Epub 2015/04/13. doi: 10.1016/j.sbi.2015.03.008. PubMed PMID: 25863585; PubMed Central PMCID: PMC4512920.
33. Romero P, Obradovic Z, Kissinger CR, Villafranca JE, Garner E, Guilliot S, et al. Thousands of proteins likely to have long disordered regions. Pac Symp Biocomput. 1998:437-48. Epub 1998/08/11. PubMed PMID: 9697202.
34. Uversky VN. Intrinsically disordered proteins from A to Z. Int J Biochem Cell Biol. 2011;43(8):1090-103. Epub 2011/04/20. doi: 10.1016/j.biocel.2011.04.001. PubMed PMID: 21501695.
35. Uversky VN, Oldfield CJ, Dunker AK. Intrinsically disordered proteins in human diseases: introducing the D2 concept. Annu Rev Biophys. 2008;37:215-46. Epub 2008/06/25. doi: 10.1146/annurev.biophys.37.032807.125924. PubMed PMID: 18573080.
36. Uversky VN, Dunker AK. Understanding protein non-folding. Biochim Biophys Acta. 2010;1804(6):1231-64. Epub 2010/02/02. doi: 10.1016/j.bbapap.2010.01.017. PubMed PMID: 20117254; PubMed Central PMCID: PMC2882790.
37. Dunker AK, Brown CJ, Obradovic Z. Identification and functions of usefully disordered proteins. Adv Protein Chem. 2002;62:25-49. Epub 2002/11/07. PubMed PMID: 12418100.
38. van der Lee R, Buljan M, Lang B, Weatheritt RJ, Daughdrill GW, Dunker AK, et al. Classification of intrinsically disordered regions and proteins. Chem Rev. 2014;114(13):6589-631. Epub 2014/04/30. doi: 10.1021/cr400525m. PubMed PMID: 24773235; PubMed Central PMCID: PMC4095912.
39. Trombitas K, Greaser M, Labeit S, Jin JP, Kellermayer M, Helmes M, et al. Titin extensibility in situ: entropic elasticity of permanently folded and permanently unfolded molecular segments. J Cell Biol. 1998;140(4):853-9. Epub 1998/03/21. PubMed PMID: 9472037; PubMed Central PMCID: PMC2141751.
40. Linke WA, Kulke M, Li H, Fujita-Becker S, Neagoe C, Manstein DJ, et al. PEVK domain of titin: an entropic spring with actin-binding properties. J Struct Biol. 2002;137(1-2):194-205. Epub 2002/06/18. doi: 10.1006/jsbi.2002.4468. PubMed PMID: 12064946.
41. Brown HG, Hoh JH. Entropic exclusion by neurofilament sidearms: a mechanism for maintaining interfilament spacing. Biochemistry. 1997;36(49):15035-40. Epub 1998/01/10. doi: 10.1021/bi9721748. PubMed PMID: 9424114.
42. Tomasso ME, Tarver MJ, Devarajan D, Whitten ST. Hydrodynamic Radii of Intrinsically Disordered Proteins Determined from Experimental Polyproline II Propensities. PLoS computational biology. 2016;12(1):e1004686. Epub 2016/01/05. doi: 10.1371/journal.pcbi.1004686. PubMed PMID: 26727467.
43. Dunker AK, Rueckert RR. Observations on molecular weight determinations on polyacrylamide gel. J Biol Chem. 1969;244(18):5074-80. Epub 1969/09/25. PubMed PMID: 5824577.
44. Schaub LJ, Campbell JC, Whitten ST. Thermal unfolding of the N-terminal region of p53 monitored by circular dichroism spectroscopy. Protein Sci. 2012;21(11):1682-8. Epub 2012/08/24. doi: 10.1002/pro.2146. PubMed PMID: 22915551; PubMed Central PMCID: PMC3527704.
45. Hotta K, Ranganathan S, Liu R, Wu F, Machiyama H, Gao R, et al. Biophysical properties of intrinsically disordered p130Cas substrate domain--implication in mechanosensing. PLoS Comput Biol. 2014;10(4):e1003532. Epub 2014/04/12. doi: 10.1371/journal.pcbi.1003532. PubMed PMID: 24722239; PubMed Central PMCID: PMC3983058.
46. Ishijima J, N, Maeshima M, Miyano M. RVCaB, a calcium-binding protein in radish vacuoles, is predominantly an unstructured protein with a polyproline type II helix. J Biochem. 2007;142(2):201-11. Epub 2007/06/19. doi: 10.1093/jb/mvm130. PubMed PMID: 17575286.
47. Kyte J, Doolittle RF. A simple method for displaying the hydropathic character of a protein. J Mol Biol. 1982;157(1):105-32. Epub 1982/05/05. PubMed PMID: 7108955.
48. Shirai A, Matsuyama A, Yashiroda Y, Hashimoto A, Kawamura Y, Arai R, et al. Global analysis of gel mobility of proteins and its use in target identification. J Biol Chem. 2008;283(16):10745-52. Epub 2008/02/23. doi: 10.1074/jbc.M709211200. PubMed PMID: 18292091.
49. Hayashi T, Nagai Y. The anomalous behavior of collagen peptides on sodium dodecyl sulfate-polyacrylamide gel electrophoresis is due to the low content of hydrophobic amino acid residues. J Biochem. 1980;87(3):803-8. Epub 1980/03/01. PubMed PMID: 7390962.
50. Uversky VN, Gillespie JR, Fink AL. Why are "natively unfolded" proteins unstructured under physiologic conditions? Proteins. 2000;41(3):415-27. Epub 2000/10/12. PubMed PMID: 11025552.
51. Langridge TD, Tarver MJ, Whitten ST. Temperature effects on the hydrodynamic radius of the intrinsically disordered N-terminal region of the p53 protein. Proteins. 2014;82(4):668-78. Epub 2013/10/24. doi: 10.1002/prot.24449. PubMed PMID: 24150971.
52. Perez RB, Tischer A, Auton M, Whitten ST. Alanine and proline content modulate global sensitivity to discrete perturbations in disordered proteins. Proteins. 2014;82(12):3373-84. Epub 2014/09/23. doi: 10.1002/prot.24692. PubMed PMID: 25244701; PubMed Central PMCID: PMC4237723.
53. Marsh JA, Forman-Kay JD. Sequence determinants of compaction in intrinsically disordered proteins. Biophysical journal. 2010;98(10):2383-90. Epub 2010/05/21. doi: 10.1016/j.bpj.2010.02.006. PubMed PMID: 20483348; PubMed Central PMCID: PMC2872267.
54. Bujacz A. Structures of bovine, equine and leporine serum albumin. Acta Crystallogr D Biol Crystallogr. 2012;68(Pt 10):1278-89. Epub 2012/09/21. doi: 10.1107/s0907444912027047. PubMed PMID: 22993082.
55. Stein PE, Leslie AG, Finch JT, Turnell WG, McLaughlin PJ, Carrell RW. Crystal structure of ovalbumin as a model for the reactive centre of serpins. Nature. 1990;347(6288):99-102. Epub 1990/09/06. doi: 10.1038/347099a0. PubMed PMID: 2395463.
56. Saito R, Sato T, Ikai A, Tanaka N. Structure of bovine carbonic anhydrase II at 1.95 A resolution. Acta Crystallogr D Biol Crystallogr. 2004;60(Pt 4):792-5. Epub 2004/03/25. doi: 10.1107/s0907444904003166. PubMed PMID: 15039588.
57. Zahran ZN, Chooback L, Copeland DM, West AH, Richter-Addo GB. Crystal structures of manganese- and cobalt-substituted myoglobin in complex with NO and nitrite reveal unusual ligand conformations. J Inorg Biochem. 2008;102(2):216-33. Epub 2007/10/02. doi: 10.1016/j.jinorgbio.2007.08.002. PubMed PMID: 17905436; PubMed Central PMCID: PMC2771112.
58. Dunker AK, Lawson JD, Brown CJ, Williams RM, Romero P, Oh JS, et al. Intrinsically disordered protein. J Mol Graph Model. 2001;19(1):26-59. Epub 2001/05/31. PubMed PMID: 11381529.
59. Uversky VN. Intrinsically disordered proteins and their environment: effects of strong denaturants, temperature, pH, counter ions, membranes, binding partners, osmolytes, and macromolecular crowding. Protein J. 2009;28(7-8):305-25. Epub 2009/09/22. doi: 10.1007/s10930-009-9201-4. PubMed PMID: 19768526.
60. Oates ME, Romero P, Ishida T, Ghalwash M, Mizianty MJ, Xue B, et al. D(2)P(2): database of disordered protein predictions. Nucleic Acids Res. 2013;41(Database issue):D508-16. Epub 2012/12/04. doi: 10.1093/nar/gks1226. PubMed PMID: 23203878; PubMed Central PMCID: PMC3531159.
61. Romero P, Obradovic Z, Li X, Garner EC, Brown CJ, Dunker AK. Sequence complexity of disordered protein. Proteins. 2001;42(1):38-48. Epub 2000/11/28. PubMed PMID: 11093259.
62. Yang ZR, Thomson R, McNeil P, Esnouf RM. RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins. Bioinformatics. 2005;21(16):3369-76. Epub 2005/06/11. doi: 10.1093/bioinformatics/bti534. PubMed PMID: 15947016.
63. Dosztanyi Z, Csizmok V, Tompa P, Simon I. The pairwise energy content estimated from amino acid composition discriminates between folded and intrinsically unstructured proteins. J Mol Biol. 2005;347(4):827-39. Epub 2005/03/17. doi: 10.1016/j.jmb.2005.01.071. PubMed PMID: 15769473.
64. Dosztanyi Z, Csizmok V, Tompa P, Simon I. IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics. 2005;21(16):3433-4. Epub 2005/06/16. doi: 10.1093/bioinformatics/bti541. PubMed PMID: 15955779.
65. Prilusky J, Felder CE, Zeev-Ben-Mordehai T, Rydberg EH, Man O, Beckmann JS, et al. FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded. Bioinformatics. 2005;21(16):3435-8. Epub 2005/06/16. doi: 10.1093/bioinformatics/bti537. PubMed PMID: 15955783.
66. Gruszka DT, Wojdyla JA, Bingham RJ, Turkenburg JP, Manfield IW, Steward A, et al. Staphylococcal biofilm-forming protein has a contiguous rod-like structure. Proceedings of the National Academy of Sciences of the United States of America. 2012;109(17):E1011-8. Epub 2012/04/12. doi: 10.1073/pnas.1119456109. PubMed PMID: 22493247; PubMed Central PMCID: PMC3340054.
67. Gruszka DT, Whelan F, Farrance OE, Fung HK, Paci E, Jeffries CM, et al. Cooperative folding of intrinsically disordered domains drives assembly of a strong elongated protein. Nat Commun. 2015;6:7271. doi: 10.1038/ncomms8271. PubMed PMID: 26027519.
68. Holehouse AS, Ahad J, Das RK, Pappu RV. CIDER: Classification of Intrinsically Disordered Ensemble Regions. Biophys J. 2015;108:228a.
69. Das RK, Pappu RV. Conformations of intrinsically disordered proteins are influenced by linear sequence distributions of oppositely charged residues. Proc Natl Acad Sci U S A. 2013;110(33):13392-7. Epub 2013/08/01. doi: 10.1073/pnas.1304749110. PubMed PMID: 23901099; PubMed Central PMCID: PMC3746876.
70. Lopes JL, Miles AJ, Whitmore L, Wallace BA. Distinct circular dichroism spectroscopic signatures of polyproline II and unordered secondary structures: applications in secondary structure analyses. Protein Sci. 2014;23(12):1765-72. Epub 2014/09/30. doi: 10.1002/pro.2558. PubMed PMID: 25262612; PubMed Central PMCID: PMC4253816.
71. Tiffany ML, Krimm S. New chain conformations of poly(glutamic acid) and polylysine. Biopolymers. 1968;6(9):1379-82. Epub 1968/01/01. doi: 10.1002/bip.1968.360060911. PubMed PMID: 5669472.
72. Whittington SJ, Chellgren BW, Hermann VM, Creamer TP. Urea promotes polyproline II helix formation: implications for protein denatured states. Biochemistry. 2005;44(16):6269-75. Epub 2005/04/20. doi: 10.1021/bi050124u. PubMed PMID: 15835915.
73. Adzhubei AA, Sternberg MJ, Makarov AA. Polyproline-II helix in proteins: structure and function. J Mol Biol. 2013;425(12):2100-32. Epub 2013/03/20. doi: 10.1016/j.jmb.2013.03.018. PubMed PMID: 23507311.
74. Drake AF, Siligardi G, Gibbons WA. Reassessment of the electronic circular dichroism criteria for random coil conformations of poly(L-lysine) and the implications for protein folding and denaturation studies. Biophys Chem. 1988;31(1-2):143-6. Epub 1988/08/01. PubMed PMID: 3233285.
75. Wuttke R, Hofmann H, Nettels D, Borgia MB, Mittal J, Best RB, et al. Temperature-dependent solvation modulates the dimensions of disordered proteins. Proc Natl Acad Sci U S A. 2014;111(14):5213-8. Epub 2014/04/08. doi: 10.1073/pnas.1313006111. PubMed PMID: 24706910; PubMed Central PMCID: PMC3986154.
76. Kjaergaard M, Norholm AB, Hendus-Altenburger R, Pedersen SF, Poulsen FM, Kragelund BB. Temperature-dependent structural changes in intrinsically disordered proteins: formation of alpha-helices or loss of polyproline II? Protein Sci. 2010;19(8):1555-64. Epub 2010/06/18. doi: 10.1002/pro.435. PubMed PMID: 20556825; PubMed Central PMCID: PMC2923508.
77. Brown PH, Schuck P. Macromolecular size-and-shape distributions by sedimentation velocity analytical ultracentrifugation. Biophys J. 2006;90(12):4651-61. Epub 2006/03/28. doi: 10.1529/biophysj.106.081372. PubMed PMID: 16565040; PubMed Central PMCID: PMC1471869.
78. Chaton CT, Herr AB. Elucidating Complicated Assembling Systems in Biology Using Size-and-Shape Analysis of Sedimentation Velocity Data. Methods Enzymol. 2015;562:187-204. Epub 2015/09/29. doi: 10.1016/bs.mie.2015.04.004. PubMed PMID: 26412652.
79. Gast K, Fiedler C. Dynamic and static light scattering of intrinsically disordered proteins. Methods Mol Biol. 2012;896:137-61. doi: 10.1007/978-1-4614-3704-8_9. PubMed PMID: 22821522.
80. Tanford C. Isothermal Unfolding of Globular Proteins in Aqueous Urea Solutions. J Am Chem Soc. 1964;86(10):2050-9.
81. Scholtz JM, Grimsley GR, Pace CN. Solvent denaturation of proteins and interpretations of the m value. Methods Enzymol. 2009;466:549-65. Epub 2009/01/01. doi: 10.1016/s0076-6879(09)66023-7. PubMed PMID: 21609876.
82. Auton M, Rosgen J, Sinev M, Holthauzen LM, Bolen DW. Osmolyte effects on protein stability and solubility: a balancing act between backbone and side-chains. Biophys Chem. 2011;159(1):90-9. Epub 2011/06/21. doi: 10.1016/j.bpc.2011.05.012. PubMed PMID: 21683504; PubMed Central PMCID: PMC3166983.
83. Holehouse AS, Garai K, Lyle N, Vitalis A, Pappu RV. Quantitative assessments of the distinct contributions of polypeptide backbone amides versus side chain groups to chain expansion via chemical denaturation. J Am Chem Soc. 2015;137(8):2984-95. Epub 2015/02/11. doi: 10.1021/ja512062h. PubMed PMID: 25664638; PubMed Central PMCID: PMC4418562.
84. Moeser B, Horinek D. Unified description of urea denaturation: backbone and side chains contribute equally in the transfer model. J Phys Chem B. 2014;118(1):107-14. Epub 2013/12/18. doi: 10.1021/jp409934q. PubMed PMID: 24328141.
85. Canchi DR, Garcia AE. Backbone and side-chain contributions in protein denaturation by urea. Biophys J. 2011;100(6):1526-33. Epub 2011/03/16. doi: 10.1016/j.bpj.2011.01.028. PubMed PMID: 21402035; PubMed Central PMCID: PMC3059734.
86. Das A, Mukhopadhyay C. Urea-mediated protein denaturation: a consensus view. J Phys Chem B. 2009;113(38):12816-24. Epub 2009/08/28. doi: 10.1021/jp906350s. PubMed PMID: 19708649.
87. Mason PE, Brady JW, Neilson GW, Dempsey CE. The interaction of guanidinium ions with a model peptide. Biophys J. 2007;93(1):L04-6. Epub 2007/04/24. doi: 10.1529/biophysj.107.108290. PubMed PMID: 17449674; PubMed Central PMCID: PMC1914420.
88. Mason PE, Dempsey CE, Neilson GW, Kline SR, Brady JW. Preferential interactions of guanidinum ions with aromatic groups over aliphatic groups. J Am Chem Soc. 2009;131(46):16689-96. Epub 2009/10/31. doi: 10.1021/ja903478s. PubMed PMID: 19874022; PubMed Central PMCID: PMC2784182.
89. O'Brien EP, Dima RI, Brooks B, Thirumalai D. Interactions between hydrophobic and ionic solutes in aqueous guanidinium chloride and urea solutions: lessons for protein denaturation mechanism. J Am Chem Soc. 2007;129(23):7346-53. Epub 2007/05/17. doi: 10.1021/ja069232+. PubMed PMID: 17503819.
90. Lim WK, Rosgen J, Englander SW. Urea, but not guanidinium, destabilizes proteins by forming hydrogen bonds to the peptide group. Proc Natl Acad Sci U S A. 2009;106(8):2595-600. Epub 2009/02/07. doi: 10.1073/pnas.0812588106. PubMed PMID: 19196963; PubMed Central PMCID: PMC2650309.
91. Wetzler DE, Gallo M, Melis R, Eliseo T, Nadra AD, Ferreiro DU, et al. A strained DNA binding helix is conserved for site recognition, folding nucleation, and conformational modulation. Biopolymers. 2009;91(6):432-43. Epub 2009/01/22. doi: 10.1002/bip.21146. PubMed PMID: 19156829.
92. Chemes LB, Alonso LG, Noval MG, de Prat-Gay G. Circular dichroism techniques for the analysis of intrinsically disordered proteins and domains. Methods Mol Biol. 2012;895:387-404. doi: 10.1007/978-1-61779-927-3_22. PubMed PMID: 22760329.
93. Yancey PH, Clark ME, Hand SC, Bowlus RD, Somero GN. Living with water stress: evolution of osmolyte systems. Science. 1982;217(4566):1214-22. Epub 1982/09/24. PubMed PMID: 7112124.
94. Baskakov I, Bolen DW. Forcing thermodynamically unfolded proteins to fold. J Biol Chem. 1998;273(9):4831-4. Epub 1998/03/28. PubMed PMID: 9478922.
95. Bolen DW, Baskakov IV. The osmophobic effect: natural selection of a thermodynamic force in protein folding. J Mol Biol. 2001;310(5):955-63. Epub 2001/08/15. doi: 10.1006/jmbi.2001.4819. PubMed PMID: 11502004.
96. Qu Y, Bolen DW. Efficacy of macromolecular crowding in forcing proteins to fold. Biophys Chem. 2002;101-102:155-65. Epub 2002/12/19. PubMed PMID: 12487997.
97. McPhie P, Ni YS, Minton AP. Macromolecular crowding stabilizes the molten globule form of apomyoglobin with respect to both cold and heat unfolding. J Mol Biol. 2006;361(1):7-10. Epub 2006/07/11. doi: 10.1016/j.jmb.2006.05.075. PubMed PMID: 16824541.
98. Hill CM, Bates IR, White GF, Hallett FR, Harauz G. Effects of the osmolyte trimethylamine-N-oxide on conformation, self-association, and two-dimensional crystallization of myelin basic protein. J Struct Biol. 2002;139(1):13-26. Epub 2002/10/10. PubMed PMID: 12372316.
99. Rozycka M, Wojtas M, Jakob M, Stigloher C, Grzeszkowiak M, Mazur M, et al. Intrinsically disordered and pliable Starmaker-like protein from medaka (Oryzias latipes) controls the formation of calcium carbonate crystals. PLoS One. 2014;9(12):e114308. Epub 2014/12/10. doi: 10.1371/journal.pone.0114308. PubMed PMID: 25490041; PubMed Central PMCID: PMC4260845.
100. Baskakov IV, Kumar R, Srinivasan G, Ji YS, Bolen DW, Thompson EB. Trimethylamine N-oxide-induced cooperative folding of an intrinsically unfolded transcription-activating fragment of human glucocorticoid receptor. J Biol Chem. 1999;274(16):10693-6. Epub 1999/04/10. PubMed PMID: 10196139.
101. Buck M, Schwalbe H, Dobson CM. Characterization of conformational preferences in a partly folded protein by heteronuclear NMR spectroscopy: assignment and secondary structure analysis of hen egg-white lysozyme in trifluoroethanol. Biochemistry. 1995;34(40):13219-32. Epub 1995/10/10. PubMed PMID: 7548086.
102. Fan P, Bracken C, Baum J. Structural characterization of monellin in the alcohol-denatured state by NMR: evidence for beta-sheet to alpha-helix conversion. Biochemistry. 1993;32(6):1573-82. Epub 1993/02/16. PubMed PMID: 8381663.
103. Sonnichsen FD, Van Eyk JE, Hodges RS, Sykes BD. Effect of trifluoroethanol on protein secondary structure: an NMR and CD study using a synthetic actin peptide. Biochemistry. 1992;31(37):8790-8. Epub 1992/09/22. PubMed PMID: 1390666.
104. Brocca S, Šamalíková M, Uversky VN, Lotti M, Vanoni M, Alberghina L, et al. Order propensity of an intrinsically disordered protein, the cyclin-dependent-kinase inhibitor Sic1. Proteins: Structure, Function, and Bioinformatics. 2009;76(3):731-46. doi: 10.1002/prot.22385.
105. Chemes LB, Sánchez IE, Smal C, de Prat-Gay G. Targeting mechanism of the retinoblastoma tumor suppressor by a prototypical viral oncoprotein. FEBS J. 2010;277(4):973-88. doi: 10.1111/j.1742-4658.2009.07540.x.
106. Garcia-Alai MM, Alonso LG, de Prat-Gay G. The N-terminal module of HPV16 E7 is an intrinsically disordered domain that confers conformational and recognition plasticity to the oncoprotein. Biochemistry. 2007;46(37):10405-12. Epub 2007/08/25. doi: 10.1021/bi7007917. PubMed PMID: 17715947.
107. Morin B, Bourhis JM, Belle V, Woudstra M, Carriere F, Guigliarelli B, et al. Assessing induced folding of an intrinsically disordered protein by site-directed spin-labeling electron paramagnetic resonance spectroscopy. J Phys Chem B. 2006;110(41):20596-608. Epub 2006/10/13. doi: 10.1021/jp063708u. PubMed PMID: 17034249.
108. Garcia-Alai MM, Gallo M, Salame M, Wetzler DE, McBride AA, Paci M, et al. Molecular basis for phosphorylation-dependent, PEST-mediated protein turnover. Structure. 2006;14(2):309-19. Epub 2006/02/14. doi: 10.1016/j.str.2005.11.012. PubMed PMID: 16472750.
109. Whittington SJ, Creamer TP. Salt bridges do not stabilize polyproline II helices. Biochemistry. 2003;42(49):14690-5. Epub 2003/12/10. doi: 10.1021/bi035565x. PubMed PMID: 14661982.
110. Record MT, Jr., Lohman ML, De Haseth P. Ion effects on ligand-nucleic acid interactions. J Mol Biol. 1976;107(2):145-58. Epub 1976/11/04. PubMed PMID: 1003464.
111. Uversky VN, Gillespie JR, Millett IS, Khodyakova AV, Vasilenko RN, Vasiliev AM, et al. Zn(2+)-mediated structure formation and compaction of the "natively unfolded" human prothymosin alpha. Biochem Biophys Res Commun. 2000;267(2):663-8. Epub 2000/01/13. doi: 10.1006/bbrc.1999.2013. PubMed PMID: 10631119.
112. Kaplon TM, Michnik A, Drzazga Z, Richter K, Kochman M, Ozyhar A. The rod-shaped conformation of Starmaker. Biochim Biophys Acta. 2009;1794(11):1616-24. Epub 2009/07/29. doi: 10.1016/j.bbapap.2009.07.010. PubMed PMID: 19635593.
113. Formosa-Dague C, Speziale P, Foster TJ, Geoghegan JA, Dufrene YF. Zinc-dependent mechanical properties of Staphylococcus aureus biofilm-forming surface protein SasG. Proc Natl Acad Sci U S A. 2016;113(2):410-5. Epub 2015/12/31. doi: 10.1073/pnas.1519265113. PubMed PMID: 26715750; PubMed Central PMCID: PMC4720321.
114. Theillet F-X, Kalmar L, Tompa P, Han K-H, Selenko P, Dunker AK, et al. The alphabet of intrinsic disorder. Intrinsically Disordered Proteins. 2013;1(1):e24360. doi: 10.4161/idp.24360.
115. Elam WA, Schrank TP, Campagnolo AJ, Hilser VJ. Evolutionary conservation of the polyproline II conformation surrounding intrinsically disordered phosphorylation sites. Protein Sci. 2013;22(4):405-17. Epub 2013/01/24. doi: 10.1002/pro.2217. PubMed PMID: 23341186; PubMed Central PMCID: PMC3610046.
116. Shi Z, Chen K, Liu Z, Ng A, Bracken WC, Kallenbach NR. Polyproline II propensities from GGXGG peptides reveal an anticorrelation with beta-sheet scales. Proc Natl Acad Sci U S A. 2005;102(50):17964-8. Epub 2005/12/07. doi: 10.1073/pnas.0507124102. PubMed PMID: 16330763; PubMed Central PMCID: PMC1312395.
117. Rucker AL, Pager CT, Campbell MN, Qualls JE, Creamer TP. Host-guest scale of left-handed polyproline II helix formation. Proteins. 2003;53(1):68-75. Epub 2003/08/29. doi: 10.1002/prot.10477. PubMed PMID: 12945050.
118. Larson MR, Rajashankar KR, Patel MH, Robinette RA, Crowley PJ, Michalek S, et al. Elongated fibrillar structure of a streptococcal adhesin assembled by the high-affinity association of alpha- and PPII-helices. Proc Natl Acad Sci U S A. 2010;107(13):5983-8. Epub 2010/03/17. doi: 10.1073/pnas.0912293107. PubMed PMID: 20231452; PubMed Central PMCID: PMC2851892.
119. Berisio R, Vitagliano L. Polyproline and triple helix motifs in host-pathogen recognition. Curr Protein Pept Sci. 2012;13(8):855-65. Epub 2013/01/12. PubMed PMID: 23305370; PubMed Central PMCID: PMCPmc3707005. 120. Horng JC, Raines RT. Stereoelectronic effects on polyproline conformation. Protein Sci. 2006;15(1):74-83. Epub 2005/12/24. doi: 10.1110/ps.051779806. PubMed PMID: 16373476; PubMed Central PMCID: PMCPmc2242370. 121. Cammers-Goodwin A, Allen TJ, Oslick SL, McClure KF, Lee JH, Kemp DS. Mechanism of Stabilization of Helical Conformations of Polypeptides by Water Containing Trifluoroethanol. J Am Chem Soc. 1996;118(13):3082-90. doi: 10.1021/ja952900z. 122. Steinberg IZ, Harrington WF, Berger A, Sela M, Katchalski E. The Configurational Changes of Poly-L-proline in Solution. J Am Chem Soc. 1960;82(20):5263-79. doi: 10.1021/ja01505a001. 123. Lima CD, Wang LK, Shuman S. Structure and mechanism of yeast RNA triphosphatase: an essential component of the mRNA capping apparatus. Cell. 1999;99(5):533-43. Epub 1999/12/10. PubMed PMID: 10589681. 124. Schuck P. Size-distribution analysis of macromolecules by sedimentation velocity ultracentrifugation and lamm equation modeling. Biophysical journal. 2000;78(3):1606- 19. Epub 2000/02/29. doi: 10.1016/s0006-3495(00)76713-0. PubMed PMID: 10692345; PubMed Central PMCID: PMCPmc1300758. 125. Laue TM, Shah BD, Ridgeway TM, Pelletier SL. Computer-aided interpretation of sedimentation data for proteins. In: Harding SE, Rowe AJ, Horton JC, editors. Analytical Ultracentrifugation in Biochemistry and Polymer Science: Royal Society of Chemistry, London; 1992. p. 90-125. 126. Lemmon EW, McLinden MO, Friend DG. Thermophysical Properties of Fluid Systems. In: Linstrom PJ, Mallard WG, editors. NIST Chemistry WebBook, NIST Standard Reference Database Number 69. http://webboook.nist.gov. 127. Brautigam CA. Calculations and Publication-Quality Illustrations for Analytical Ultracentrifugation Data. Methods Enzymol. 2015;562:109-33. 
Epub 2015/09/29. doi: 10.1016/bs.mie.2015.05.001. PubMed PMID: 26412649. 128. Cantor CR, Schimmel PR. Biophysical Chemistry Part III: The Behaviour of Biological Macromolecules: W. H. Freeman and Co, New York; 1980. 129. Wilkins DK, Grimshaw SB, Receveur V, Dobson CM, Jones JA, Smith LJ. Hydrodynamic Radii of Native and Denatured Proteins Measured by Pulse Field Gradient NMR Techniques. Biochemistry. 1999;38(50):16424-31. doi: 10.1021/bi991765q. 130. Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol. 2011;7:539. Epub 2011/10/13. doi: 10.1038/msb.2011.75. PubMed PMID: 21988835; PubMed Central PMCID: PMCPmc3261699. 131. Consortium TU. UniProt: a hub for protein information. Nucleic Acids Res. 2015;43(Database issue):D204-12. Epub 2014/10/29. doi: 10.1093/nar/gku989. PubMed PMID: 25348405; PubMed Central PMCID: PMCPmc4384041. 135. Lowry DF, Stancik A, Shrestha RM, Daughdrill GW. Modeling the accessible conformations of the intrinsically unstructured transactivation domain of p53. Proteins.

245

2008;71(2):587-98. Epub 2007/11/01. doi: 10.1002/prot.21721. PubMed PMID: 17972286. 136. Sanchez-Puig N, Veprintsev DB, Fersht AR. Human full-length Securin is a natively unfolded protein. Protein science : a publication of the Protein Society. 2005;14(6):1410-8. Epub 2005/06/03. doi: 10.1110/ps.051368005. PubMed PMID: 15929994; PubMed Central PMCID: PMCPmc2253381. 137. Uversky VN, Permyakov SE, Zagranichny VE, Rodionov IL, Fink AL, Cherskaya AM, et al. Effect of zinc and temperature on the conformation of the gamma subunit of retinal phosphodiesterase: a natively unfolded protein. Journal of proteome research. 2002;1(2):149-59. Epub 2003/03/20. PubMed PMID: 12643535. 138. Permyakov SE, Millett IS, Doniach S, Permyakov EA, Uversky VN. Natively unfolded C-terminal domain of caldesmon remains substantially unstructured after the effective binding to calmodulin. Proteins. 2003;53(4):855-62. Epub 2003/11/25. doi: 10.1002/prot.10481. PubMed PMID: 14635127. 139. Sanchez-Puig N, Veprintsev DB, Fersht AR. Binding of natively unfolded HIF- 1alpha ODD domain to p53. Molecular cell. 2005;17(1):11-21. Epub 2005/01/05. doi: 10.1016/j.molcel.2004.11.019. PubMed PMID: 15629713. 140. Soragni A, Zambelli B, Mukrasch MD, Biernat J, Jeganathan S, Griesinger C, et al. Structural characterization of binding of Cu(II) to tau protein. Biochemistry. 2008;47(41):10841-51. Epub 2008/09/23. doi: 10.1021/bi8008856. PubMed PMID: 18803399. 141. Campbell KM, Terrell AR, Laybourn PJ, Lumb KJ. Intrinsic structural disorder of the C-terminal activation domain from the bZIP transcription factor Fos. Biochemistry. 2000;39(10):2708-13. Epub 2000/03/08. PubMed PMID: 10704222. 142. Magidovich E, Orr I, Fass D, Abdu U, Yifrach O. Intrinsic disorder in the C- terminal domain of the Shaker voltage-activated K+ channel modulates its interaction with scaffold proteins. Proceedings of the National Academy of Sciences. 2007;104(32):13022-7. doi: 10.1073/pnas.0704059104. 143. 
Paleologou KE, Schmid AW, Rospigliosi CC, Kim HY, Lamberto GR, Fredenburg RA, et al. Phosphorylation at Ser-129 but not the phosphomimics S129E/D inhibits the fibrillation of alpha-synuclein. The Journal of biological chemistry. 2008;283(24):16895- 905. Epub 2008/03/18. doi: 10.1074/jbc.M800747200. PubMed PMID: 18343814; PubMed Central PMCID: PMCPmc2423264. 144. Geething NC, Spudich JA. Identification of a minimal myosin Va binding site within an intrinsically unstructured domain of melanophilin. The Journal of biological chemistry. 2007;282(29):21518-28. Epub 2007/05/22. doi: 10.1074/jbc.M701932200. PubMed PMID: 17513864. 145. Baker JMR. Structural characterization and interactions of the CFTR regulatory region. 2009. 146. Adkins JN, Lumb KJ. Intrinsic structural disorder and sequence features of the cell cycle inhibitor p57Kip2. Proteins. 2002;46(1):1-7. Epub 2001/12/18. PubMed PMID: 11746698. 147. Yi S, Boys BL, Brickenden A, Konermann L, Choy WY. Effects of zinc binding on the structure and dynamics of the intrinsically disordered protein prothymosin alpha: evidence for metalation as an entropic switch. Biochemistry. 2007;46(45):13120-30. Epub 2007/10/13. doi: 10.1021/bi7014822. PubMed PMID: 17929838.

246

148. Haaning S, Radutoiu S, Hoffmann SV, Dittmer J, Giehm L, Otzen DE, et al. An unusual intrinsically disordered protein from the model legume Lotus japonicus stabilizes proteins in vitro. The Journal of biological chemistry. 2008;283(45):31142-52. Epub 2008/09/10. doi: 10.1074/jbc.M805024200. PubMed PMID: 18779323; PubMed Central PMCID: PMCPmc2662180. 149. Choi UB, McCann JJ, Weninger KR, Bowen ME. Beyond the random coil: stochastic conformational switching in intrinsically disordered proteins. Structure (London, England : 1993). 2011;19(4):566-76. Epub 2011/04/13. doi: 10.1016/j.str.2011.01.011. PubMed PMID: 21481779; PubMed Central PMCID: PMCPmc3075556. 150. Sivakolundu SG, Nourse A, Moshiach S, Bothner B, Ashley C, Satumba J, et al. Intrinsically unstructured domains of Arf and Hdm2 form bimolecular oligomeric structures in vitro and in vivo. Journal of molecular biology. 2008;384(1):240-54. Epub 2008/09/24. doi: 10.1016/j.jmb.2008.09.019. PubMed PMID: 18809412; PubMed Central PMCID: PMCPmc2612038. 151. Donaldson L, Capone JP. Purification and characterization of the carboxyl- terminal transactivation domain of Vmw65 from herpes simplex virus type 1. The Journal of biological chemistry. 1992;267(3):1411-4. Epub 1992/01/25. PubMed PMID: 1309782. 152. Wilkins DK, Grimshaw SB, Receveur V, Dobson CM, Jones JA, Smith LJ. Hydrodynamic Radii of Native and Denatured Proteins Measured by Pulse Field Gradient NMR Techniques. Biochemistry. 1999;38(50):16424-31. doi: 10.1021/bi991765q. 153. Tcherkasskaya O, Uversky VN. Denatured collapsed states in protein folding: Example of apomyoglobin. Proteins: Structure, Function, and Bioinformatics. 2001;44(3):244-54. doi: 10.1002/prot.1089.

247

Chapter VI. Comparing intrinsically disordered regions of Staphylococcus surface proteins

Authors: Alexander E. Yarawsky1,2, Andrea L. Ori2,3 and Andrew B. Herr2,4

Affiliations: 1 - Graduate Program in Molecular Genetics, Biochemistry & Microbiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA

2 - Division of Immunobiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA

3 - Medical Sciences Baccalaureate Program, University of Cincinnati, Cincinnati, OH 45267, USA

4 - Division of Infectious Diseases, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA

Author Contributions: A.E.Y. and A.L.O. collected data.

A.E.Y., A.L.O. and A.B.H. analyzed data.

A.E.Y. and A.B.H. conceived experiments and directed the project.

A.E.Y. wrote this draft.

Funding: Work was performed using funding from R01 GM094363 and U19 AI070235 awarded to A.B.H. and the University of Cincinnati Graduate School Dean's Fellowship awarded to A.E.Y. (2018-2019 AY).

Most of the Aap-PGR data presented in this work has been previously published as part of a separate work in the Journal of Molecular Biology (see Chapter V of this dissertation). The final published version of the previous work is available at: https://www.sciencedirect.com/science/article/pii/S0022283616305113


Abstract

Staphylococcus epidermidis and S. aureus are highly problematic bacteria in hospital settings. This stems, at least in part, from their strong ability to form biofilms on abiotic or biotic surfaces. Biofilms are well-organized multicellular aggregates of bacteria, which, when formed on indwelling medical devices, lead to infections that are difficult to treat. Cell wall-anchored proteins are known to be important players in biofilm formation and infection. Many of these proteins have putative stalk-like regions or regions of low complexity near the cell wall-anchoring motif. We recently showed a remarkable propensity for the stalk region of the S. epidermidis accumulation-associated protein (Aap) to remain highly extended under conditions that would otherwise be expected to cause significant conformational changes, a characteristic that seems well suited to the overall function of Aap in biofilm formation. In this study, we evaluate the possibility that the ability to resist compaction is a common theme among stalks from various cell wall-anchored proteins. We used biophysical approaches, including circular dichroism to examine secondary structure changes as a function of temperature and cosolvents, and sedimentation velocity analytical ultracentrifugation-based size-and-shape analysis of the constructs of interest. All regions tested are intrinsically disordered, lacking secondary structure beyond random coil and polyproline type II helix, and they all sample highly extended conformations. We did observe notable differences in their responses to temperature and cosolvents, which appear connected to polyproline type II helix propensity or other sequence-based parameters.


Introduction

The Gram-positive bacteria Staphylococcus aureus and S. epidermidis are of major concern to the healthcare system. Hospital-acquired infections can lead to bacteremia, endocarditis and prosthetic joint infection. Biofilm formation is a critical virulence factor, which makes these infections difficult to treat [1-4]. It is now well understood that cell wall-anchored proteins, rather than just polysaccharide intercellular adhesin, are extremely important in various stages of biofilm formation and infection [1, 4-6].

Much work has focused on the role of the ordered, functional regions of various cell wall-anchored proteins in biofilm formation [5]; however, the regions of low complexity or putative stalk-like regions of these proteins are often neglected. We recently investigated the proline/glycine-rich (stalk-like) region (PGR) of the accumulation-associated protein, Aap, from S. epidermidis, and we found that this region is intrinsically disordered but has an unusual ability to remain extended under harsh conditions. Given the placement of this region adjacent to the cell wall, we hypothesize that the ability to maintain an extended conformation could help facilitate bacterial surface attachment and intercellular accumulation via the functional regions of Aap [7].

In this study, we sought to understand if such a property was common among similar regions of low complexity from various cell wall-anchored proteins (Figure 1).

Both Aap-PGR and Arpts are predicted disordered regions of the critical biofilm-related protein, the accumulation-associated protein (Aap) from Staphylococcus epidermidis [6, 8, 9]. Aap-PGR is the stalk-like region nearest the bacterial cell wall, while Arpts refers to the A-repeat region, containing 11 repeats of 16 amino acids at the N-terminus of the protein. The N-terminus of Aap, including the A-repeat region and the downstream putative lectin region, is important for surface adhesion, although binding can be attributed to the lectin region [10]. The N-terminus must be removed via SepA or other proteases for Aap-dependent biofilm accumulation to occur via Zn2+-dependent assembly of the B-repeat superdomain [11, 12]. While the role of the A-repeat region is still unclear, it might be important for sterically inhibiting adhesion in situations where adhesion is undesirable, or it might fold back and interact with the lectin or B-repeat region to modulate adhesion. Based on the latter hypothesis, we might expect the A-repeats to have a disordered state, as well as a more ordered state that is occupied upon interaction with the lectin or B-repeat region. Disorder-to-order transitions are often observed in intrinsically disordered proteins or regions (IDPs) upon binding to a ligand [13-15].

Aap-PGR: AEPGKPAEPGKPAEPGKPAEPGTPAEPGKPAEPGTPAEPGKPAEPGKPAEPGKPAEPGKPAEPG TPAEPGTPAEPGKPAEPGTPAEPGKPAEPGTPAEPGKPAESGKPVEPGTPAQSGAPEQPNRSMH STDNKNQ

SasG-PGR: PKDPKGPENPEKPSRPTHPSGPVNPNNPGLSKDRAKPNGPVHSMDKNDKVKKSKIAKESVANQE KKRAE

Arpts: NNEAPQMSSTLQAEEGSNAEAPQSEPTKAEEGGNAEAAQSEPTKAEEGGNAEAPQSEPTKAEEG GNAEAAQSEPTKTEEGSNVKAAQSEPTKAEEGSNAEAPQSEPTKTEEGSNAKAAQSEPTKAEEG GNAEAAQSEPTKTEEGSNAEAPQSEPTKAEEGGNAEAPQSEPTKTEEGGNAEAPNVPTIKA

SdrC-SD: SDSDSDSDSDSDSDSDSDSDSDSDSDSDNDSDSDSDSDSDAGKHTPAKPMSTVKDQHKTAKA

SD-30mer: SDSDSDSDSDSDSDSDSDSDSDSDSDSDSD

Figure 1. Sequences of constructs. Prolines are bolded, negatively charged residues are colored red, and positively charged residues are colored blue.

Another region of interest in this study is the region equivalent to the PGR of Aap, but in the S. aureus ortholog, SasG [16]. We hypothesize that SasG-PGR will share very similar properties with Aap-PGR, as both Aap and SasG PGRs contain a proline in every third position (Figure 1) and have similar roles in biofilm formation. In both proteins, a highly extended stalk would likely facilitate the biological function of intercellular adhesion.

SdrC belongs to the serine-aspartate family of proteins expressed by S. epidermidis and S. aureus [5, 17]. These proteins are cell wall-anchored and play important roles in biofilm formation, including primary attachment via ligand-binding and accumulation via homophilic interaction [5, 17]. The common scheme of serine-aspartate (Sdr) proteins is a large A-region which contains multiple IgG-like repeats involved in homophilic interaction, followed by several B-repeats of ~110 residues that are elongated in structure and are capable of binding ligands such as collagen [18, 19]. Downstream of the B-repeats is the serine-aspartate repeat region, which contains a very wide range of SD dipeptide repeats, from 56 residues in SdrG to 558 in SdrF [17]. The C-terminus contains an LPXTG cell wall-anchoring motif and a hydrophobic membrane-spanning region [17]. We hypothesize that the SD repeats of SdrC and other Sdr family members will show similar features to the PGR of Aap and SasG, in that they may be highly extended in order to allow the A and B regions to reach outward and make homophilic interactions or find ligands. Interestingly, the SD repeats lack the high proline content we hypothesized is important for Aap-PGR, but they may instead rely on charge-based contributions that are lacking in Aap-PGR.

We first utilized several sequence-based predictions that indicated each construct is likely to be disordered, primarily due to the lack of hydrophobic (order-promoting) residues, but in the case of the SD repeats, there is also a strong contribution from high net charge. Predictions based on charge patterning and polyproline type II helix (PPII) propensity suggest highly extended conformations may be preferred by these sequences. Indeed, sedimentation velocity analytical ultracentrifugation (AUC) indicated high frictional ratios representative of highly extended or elongated species. A battery of experiments using circular dichroism (CD) spectroscopy revealed trends in the resistance to compaction which were proportional to predicted PPII propensity, where sequences with higher PPII propensity showed stronger resistance to perturbation by temperature or cosolvents. Data remain to be collected on the SD repeats, which will be critical for validating our interpretations thus far, as these sequences have relatively low predicted PPII propensities.

Results

All constructs are predicted to be disordered

Our first approach to comparing these constructs was to examine the likelihood that they are folded (ordered) or unfolded (disordered). Uversky, et al. found that the mean net charge and mean scaled hydropathy of a protein are powerful predictors of whether the protein will be ordered or disordered [20]. In Figure 2A, all constructs of interest are located on the disordered side of the Uversky Plot. Interestingly, while the mean scaled hydropathy values all fall within the range of 0.25 - 0.35, the absolute mean net charge varies much more, ranging from 0.05 to 0.50. In light of the variability in mean net charge, we sought to examine additional parameters related to charge.
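The charge-hydropathy classification described above can be illustrated with a minimal sketch. It applies the commonly cited form of the Uversky boundary, ⟨R⟩ = 2.785⟨H⟩ - 1.151, to the net charge per residue and hydropathy values listed in Table 1. The boundary coefficients, the division of the 0 - 9 hydropathy value by 9 to obtain a mean scaled hydropathy, and the function name are assumptions made for illustration; this is not necessarily the procedure used to generate Figure 2A.

```python
# Values copied from Table 1: (net charge per residue, hydropathy on the 0-9 scale).
CONSTRUCTS = {
    "Aap-PGR": (-0.05185, 3.09259),
    "SasG-PGR": (0.10145, 2.72899),
    "Arpts": (-0.15344, 3.08466),
    "SdrC-SD": (-0.25806, 2.64839),
    "SD-30mer": (-0.50000, 2.35),
}


def is_predicted_disordered(ncpr, hydropathy_0_to_9):
    """Return True if the point lies on the disordered side of the assumed boundary.

    Assumed boundary: |<R>| = 2.785 * <H> - 1.151, with <H> on a 0-1 scale.
    """
    mean_scaled_h = hydropathy_0_to_9 / 9.0  # assumed rescaling from 0-9 to 0-1
    boundary_charge = 2.785 * mean_scaled_h - 1.151
    return abs(ncpr) > boundary_charge


results = {name: is_predicted_disordered(*vals) for name, vals in CONSTRUCTS.items()}
for name, disordered in results.items():
    print(name, "disordered" if disordered else "ordered")
```

Under these assumptions, every construct falls on the disordered side because the low hydropathy values place the boundary charge below zero.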

Several studies have examined the potential effects of charge on IDP conformation. Das & Pappu investigated the impact of charge distribution, or charge mixing, on IDP conformational preferences [21]. They proposed classifying proteins based on the fraction of positively (f+) and negatively (f-) charged residues, resulting in the Das-Pappu Plot shown in Figure 2B. Each region of this plot is discussed in detail elsewhere [21]. Region 2 contains Aap-PGR and Arpts. This region is considered a "boundary" region, where the conformational ensemble may sample either globule or tadpole conformations (Region 1) or coils, hairpins, or chimeras (Region 3). SdrC-SD and SasG-PGR both creep slightly into Region 3, while SD-30mer is located in Region 4, a swollen coil (Figure 2B). Das & Pappu also investigated the importance of charge mixing or segregation [21]. A value named kappa (κ) represents charge patterning. A high κ value represents highly segregated sequences (e.g., EEEEEKKKKK), which show a preference for hairpins or more compact conformations, while sequences with low κ values (e.g., EKEKEKEKEK) experience self-avoidance due to electrostatic repulsion and tend toward extended chains or Flory random coils [21]. The κ value for each construct is shown in Table 1, along with additional charge-related parameters provided by the CIDER webserver [24]. An important consideration for κ is that it is dependent on the fraction of charged residues (FCR). When the FCR is below 0.3, the charge-dependent effects encompassed by κ become less reliable [21]. As seen in Table 1, Aap-PGR and Arpts are slightly below an FCR of 0.3, while SD-30mer has the highest FCR at 0.5. Considering this, κ may not be a reasonable predictor of conformational tendencies for some constructs, but it may be useful in the context of SD-30mer, especially given the very low κ value suggestive of a "self-avoiding random walk" conformational ensemble. The κ values of the other constructs lie in a region that is less reliable in differentiating between self-avoiding random walk and conformations approaching Flory random coil [21].

Figure 2. Predictions of disorder and classification of constructs. The Uversky Plot (A) is a powerful predictor of disorder based on net charge and hydropathy [20]. A reference line (dashed) roughly separates disordered proteins (black, filled circles) and ordered proteins (grey, filled circles). All constructs of interest in this study fall on the "Disordered" side of the reference line. Panel (B) shows the IDPs plotted on the Das-Pappu phase plot [21]. Symbols are consistent with (A). Numbers refer to the phase plot region (see Table 1).

Parameter          Aap-PGR   SasG-PGR  Arpts     SdrC-SD   SD-30mer
N                  135       69        189       62        30
f-                 0.15556   0.13043   0.22222   0.33871   0.50000
f+                 0.1037    0.23188   0.06878   0.08065   0
FCR                0.25926   0.36232   0.29101   0.41935   0.50000
NCPR               -0.05185  0.10145   -0.15344  -0.25806  -0.50000
Kappa              0.05825   0.09562   0.08655   0.30207   0.02324
FPR                0.28889   0.17391   0.09524   0.03226   0
Omega              0.03234   0.07106   0.05146   0.00657   0.00096
Hydropathy         3.09259   2.72899   3.08466   2.64839   2.35
Phase Plot Region  2         3         2         3         4

N: Number of residues
f-: Fraction of negative residues
f+: Fraction of positive residues
FCR: Fraction of charged residues
NCPR: Net charge per residue
Kappa: κ is a charge patterning parameter [21]. Highly mixed charged sequences approach κ = 0, while highly segregated charged sequences approach κ = 1. For sequences that are not highly charged, κ is not very meaningful.
FPR: Fraction of proline residues (not a parameter provided by CIDER, but included here for relevance)
Omega: Ω is a charge/proline patterning parameter [22]. This parameter is similar to κ, but also incorporates proline residues. If prolines and charged residues are well mixed along a sequence (with respect to other amino acids), there will be a low Ω value. If proline/charged residues are highly segregated, Ω will approach 1.
Hydropathy: Based on the Kyte-Doolittle scale [23], normalized from 0 (least hydrophobic) to 9 (most hydrophobic).
Phase Plot Region: Region of the Das-Pappu phase plot in which the sequence falls
Phase Plot Annotation:
1: Weak polyampholytes and polyelectrolytes (Globules & Tadpoles)
2: Boundary region (Janus sequences)
3: Strong polyampholytes
4: Strong negatively charged polyelectrolytes
5: Strong positively charged polyelectrolytes

Table 1. Parameters calculated by CIDER. The CIDER server [24] predicts various parameters, including those required for the Das-Pappu Plot [21].
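The phase plot assignments listed in Table 1 can be reproduced with a short sketch. It derives FCR and NCPR from f+ and f- and applies cut-offs of 0.25 and 0.35, which are assumed here to match the region boundaries of the Das-Pappu plot. The function names, the decision to count only D/E as negative and K/R as positive (histidine is ignored), and the exact threshold handling are illustrative assumptions rather than the CIDER implementation.

```python
def charge_fractions(seq):
    """Return (f_minus, f_plus), counting D/E as negative and K/R as positive."""
    n = len(seq)
    f_minus = sum(seq.count(aa) for aa in "DE") / n
    f_plus = sum(seq.count(aa) for aa in "KR") / n
    return f_minus, f_plus


def phase_plot_region(f_minus, f_plus):
    """Assign a Das-Pappu region (1-5) using assumed cut-offs of 0.25 and 0.35."""
    fcr = f_plus + f_minus   # fraction of charged residues
    ncpr = f_plus - f_minus  # net charge per residue
    if fcr < 0.25:
        return 1             # weak polyampholytes and polyelectrolytes
    if fcr <= 0.35:
        return 2             # boundary region
    if abs(ncpr) <= 0.35:
        return 3             # strong polyampholytes
    return 4 if ncpr < 0 else 5  # strong negative / positive polyelectrolytes


# (f-, f+) values copied from Table 1.
TABLE_1 = {
    "Aap-PGR": (0.15556, 0.1037),
    "SasG-PGR": (0.13043, 0.23188),
    "Arpts": (0.22222, 0.06878),
    "SdrC-SD": (0.33871, 0.08065),
    "SD-30mer": (0.50000, 0.0),
}
regions = {name: phase_plot_region(*f) for name, f in TABLE_1.items()}
print(regions)

# The same classification starting from a sequence (SD-30mer, Figure 1).
sd_30mer = "SD" * 15
print(phase_plot_region(*charge_fractions(sd_30mer)))
```

Note that this sketch covers only the region assignment; κ and Ω depend on the order of residues along the chain and are not computed here.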

A more relevant parameter in the case of Aap-PGR, SasG-PGR and Arpts might be omega, Ω. This value considers the patterning of proline and charged residues against all other residues [22]. While the implications of κ are dependent upon FCR, Ω is dependent on FCR and the number of proline residues. Because Ω does not consider the identity of the proline or charged residues, its value as a predictor of conformation is contingent upon the identity of the charged residues. When used appropriately, such as for a sequence in which most charged residues carry the same sign and the prolines are well dispersed, sequences with low Ω are more expanded or extended than those with high Ω [22]. One should use caution when the sequence of interest contains poorly mixed charged/proline residues, as this could enable long-range electrostatic attractions leading to hairpins, or there could be steric restrictions from groups of prolines. Table 1 also lists the Ω value for each construct. For Aap-PGR, SasG-PGR and Arpts, Ω is very low, suggesting the patterning of charged/proline residues could bias the conformational ensemble toward a larger radius. Again, SdrC-SD and SD-30mer contain few or no proline residues, so κ is the more appropriate parameter to consider for those constructs.

Uversky, et al. [20] and work from the Pappu Lab [21, 22, 25] have shown that charged residues can play an important role in the conformational preferences of IDPs, particularly when there is a high number of charged residues. Not surprisingly, there are additional factors that confer conformational bias. Work from the Whitten Lab has examined the effect of polyproline type-II (PPII) helix propensity [26-28] and α-helix propensity [29] on the hydrodynamic radius (Rh) of an IDP. PPII is a left-handed helix with three-fold rotational symmetry and is highly extended compared to an α-helix [30].

Tomasso, et al. [26] found the Rh values of IDPs could be predicted very well based on the PPII propensity and number of residues. Interestingly, the effect of charged residues was very weak compared to PPII propensity, but could be significant where there was poor mixing of oppositely charged residues (i.e., sequences with a high κ value).

Parameters relevant to Rh prediction from PPII propensity are listed in Table 2. In the case of Aap-PGR, SasG-PGR and Arpts, there are a high number of prolines, which could place a strong PPII-based bias on their conformations. They are also rich in other residues that have a high propensity for PPII [26, 31-33]. The predicted fraction of residues in the PPII conformation, fPPII, is the averaged PPII propensity [26, 31] based on sequence composition. Aap-PGR has a higher fPPII value than all other IDPs in the database used by Tomasso, et al. [26] (Table S1). SasG-PGR has the third highest fPPII value, after the disordered tail of p53 (referred to as p53(1-93), fPPII = 0.4890 [26], Table S1) and Aap-PGR. Arpts lies at number six in the dataset of 27, while SdrC-SD and SD-30mer fall at or near the bottom of fPPII values (Table S1). Also listed in Table 2 are the predicted Rh assuming total random coil conformation (Rh (coil)), predicted Rh based on random coil and fPPII bias (Rh (PPII)), and predicted Rh based on random coil, fPPII bias and charge effects (Rh (PPII charge)). These values allow for a useful interpretation of which factors might influence the predicted Rh more strongly. For the Rh (PPII charge) value, the percent contribution of random coil bias, PPII propensity bias (based on fPPII) and charge bias are listed in Table 2 (Coil bias, PPII bias and Charge bias). Not surprisingly, the random coil bias is the largest contribution in all cases. Whether or not a protein is folded is perhaps the strongest influence on whether the conformation(s) is expected to be compact/globular or expanded/extended [26-28, 34, 35]. As expected, the contribution from PPII propensity is proportional to fPPII, and in the case of the IDPs with high FCR but low proline content, the contribution from charge outweighs that of the PPII propensity (Table 2).

IDP       N    Net     Rh      Rh      Rh (PPII  Coil      PPII      Charge    fPPII
               charge  (coil)  (PPII)  charge)   bias (%)  bias (%)  bias (%)
Aap-PGR   135  -7      25.64   38.50   37.84     67.75     29.05     3.20      0.5350
SasG-PGR  69   +7      18.27   24.56   24.43     74.79     20.25     4.96      0.4761
Arpts     189  -29     30.38   41.26   44.06     68.95     9.66      11.39     0.4190
SdrC-SD   62   -16     17.31   20.64   22.15     78.18     9.33      12.50     0.3294
SD-30mer  30   -15     12.01   13.45   15.16     79.18     3.71      17.11     0.2700

Table 2. Calculated and predicted parameters of IDP constructs. The number of residues is listed in the N column. Rh is the predicted hydrodynamic radius (in Å) assuming complete random coil (Rh (coil)), considering polyproline type-II helix (Rh (PPII)) or PPII propensity and charge (Rh (PPII charge)). The contribution of random coil, PPII, and charge effects to the predicted (Rh (PPII charge)) are also listed. The predicted fraction of PPII (fPPII) refers to the number of residues predicted to be in the PPII conformation divided by the total number of residues. All parameters were calculated using a program provided by Steven Whitten, based on Tomasso, et al. [26].
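To make the dependence of the predicted Rh on chain length and fPPII concrete, the sketch below evaluates a power-law relation of the form Rh = 2.16 Å · N^(0.503 - 0.11·ln(1 - fPPII)). This form and its coefficients are an assumption made here; with the N and fPPII values from Table 2 it gives values close to the Rh (PPII) column, but it is not the program provided by Steven Whitten. The charge-corrected Rh (PPII charge) and the Rh (coil) values are not reproduced.

```python
import math


def rh_ppii(n_residues, f_ppii):
    """Predicted hydrodynamic radius (angstroms) from length and PPII fraction.

    Assumed form: Rh = 2.16 * N ** (0.503 - 0.11 * ln(1 - fPPII)).
    """
    exponent = 0.503 - 0.11 * math.log(1.0 - f_ppii)
    return 2.16 * n_residues ** exponent


# (N, fPPII, Rh (PPII) as listed in Table 2)
TABLE_2 = {
    "Aap-PGR": (135, 0.5350, 38.50),
    "SasG-PGR": (69, 0.4761, 24.56),
    "Arpts": (189, 0.4190, 41.26),
    "SdrC-SD": (62, 0.3294, 20.64),
    "SD-30mer": (30, 0.2700, 13.45),
}

for name, (n, f, listed) in TABLE_2.items():
    print(f"{name}: predicted {rh_ppii(n, f):.2f} A, listed {listed:.2f} A")
```

Because the exponent grows as fPPII approaches 1, a long sequence with a high fPPII (such as Aap-PGR) is predicted to be far more expanded than a random coil of the same length.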


AUC indicates highly elongated monomers

Before examining the proteins by circular dichroism (CD), we examined their global conformations by sedimentation velocity analytical ultracentrifugation (AUC). Sedimentation velocity AUC measures the rate of sedimentation (sedimentation coefficient) as a species moves through solution [36, 37]. The calculated sedimentation coefficient (s) is dependent on the size (buoyant molecular weight) and shape of the species. A protein that is highly expanded or elongated will sediment more slowly, due to increased frictional resistance, than a compact, globular protein of the same molecular weight. The frictional coefficient is reported as the frictional ratio, where the observed frictional coefficient is divided by the frictional coefficient of a sphere of the same volume [38]. We performed sedimentation velocity AUC on each of the IDPs of interest (no data have yet been collected on SD-30mer). The c(s) distributions are plotted in Figure 3, and it is evident that each IDP is monomeric (see Table 3, MWexp). SdrC-SD showed a weak, broad peak near 3 S, which was later identified as His-MBP by mass spectrometry and was difficult to purify away. For this reason, CD experiments were not performed on SdrC-SD, and instead an SD-30mer peptide was synthesized. In addition to determining the molecular weight, the frictional ratio (f/f0) was calculated and is listed in Table 3. The analyses were performed using SEDFIT's continuous c(s) distribution model [38]. In all cases, a very high f/f0 is observed, indicative of highly elongated conformations.
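The relationship between sedimentation coefficient, molecular weight and frictional ratio can be sketched from the Svedberg relation, s = M(1 - v̄ρ)/(N_A·6πη·Rs), together with the radius of an anhydrous sphere of the same mass. The partial specific volume (0.73 mL/g) and the solvent density and viscosity (water at 20°C) used below are assumed values, not those used in the SEDFIT analyses, so the output is only expected to approximate the entries in Table 3.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol
ETA_WATER_20C = 0.01002   # poise (g cm^-1 s^-1), assumed solvent viscosity
RHO_WATER_20C = 0.99823   # g/mL, assumed solvent density


def stokes_radius_cm(s_svedberg, mw, vbar):
    """Stokes radius (cm) from s = M(1 - vbar*rho) / (N_A * 6*pi*eta*Rs)."""
    s = s_svedberg * 1e-13  # convert Svedberg units to seconds
    buoyant_mass = mw * (1.0 - vbar * RHO_WATER_20C)
    return buoyant_mass / (AVOGADRO * 6.0 * math.pi * ETA_WATER_20C * s)


def sphere_radius_cm(mw, vbar):
    """Radius (cm) of an anhydrous sphere with the same mass and specific volume."""
    return (3.0 * mw * vbar / (4.0 * math.pi * AVOGADRO)) ** (1.0 / 3.0)


def frictional_ratio(s_svedberg, mw, vbar):
    """f/f0: observed friction relative to the equivalent compact sphere."""
    return stokes_radius_cm(s_svedberg, mw, vbar) / sphere_radius_cm(mw, vbar)


# Aap-PGR values from Table 3; vbar = 0.73 mL/g is an assumed, typical value.
rs_angstrom = stokes_radius_cm(1.06, 14371, 0.73) * 1e8
ff0 = frictional_ratio(1.06, 14371, 0.73)
print(f"Rs = {rs_angstrom:.1f} A, f/f0 = {ff0:.2f}")
```

A compact globular protein typically gives f/f0 near 1.2 - 1.3, so a ratio near 2 corresponds to a markedly extended or elongated species.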


Figure 3. AUC indicates each construct is highly elongated and monomeric. Sedimentation velocity AUC data suggest each construct is monomeric and highly elongated, as indicated by the fitted parameters listed in Table 3.

IDP       s20,w  f/f0  MWexp (Da)  Rs (Å)  Rh (PPII charge)
Aap-PGR   1.06   2.12  14,371      33.9    37.84
SasG-PGR  0.80   2.03  9,254       28.1    24.43
Arpts     1.30   2.52  22,234      46.2    44.06
SdrC-SD*  0.97   1.93  8,055       24.7    22.15
SD-30mer  N/A    N/A   N/A         N/A     15.16

Table 3. Sedimentation velocity AUC parameters. Data were analyzed by SEDFIT's continuous c(s) distribution model [38] and are presented in Figure 3. The Stokes-Einstein radius, Rs, is from SEDFIT, while the hydrodynamic radius, Rh (PPII charge), is from Table 2, shown here for comparison. No AUC data have been collected on SD-30mer. *Note that SdrC-SD was less than 95% pure, due to contamination from His-MBP, which appears as a very weak peak near 3 S.


CD confirms random coil/PPII secondary structure content

After confirming the constructs of interest are monomeric and highly extended, we utilized circular dichroism (CD) to measure the secondary structure of the IDPs. Random coil exhibits a minimum near 200 nm, while PPII exhibits an even stronger minimum near 200 nm, along with a weak local maximum near 220 nm [39]. The temperature dependence of the CD spectrum is very useful for following the transition to random coil at high temperatures. PPII is stabilized at low temperatures, whereas high temperatures disrupt PPII, resulting in random coil [7, 27, 39, 40]. The resulting temperature dependence is usually very linear, which indicates little or no cooperativity in PPII formation [7, 40, 41]. However, if there is significant α-helix or β-sheet content, one would expect to observe sigmoidality in the temperature dependence [42].

Figure 4 shows CD wavelength scans of Aap-PGR, SasG-PGR and Arpts as a function of temperature. The purity of SdrC-SD was insufficient for accurate CD, and SD-30mer has not yet been analyzed. In each case, it is apparent that the primary features of the scans are the 200 nm minimum and the shift in intensity of the 220 nm signal. This strongly indicates that each IDP is composed of a mix of random coil and PPII. The insets of Figure 4A, C and E show the linear trend of the CD signal at the 200 nm minimum. The slope (m) of the linear regression of these data trends inversely with fPPII: higher PPII propensity yielded a less steep slope. The interpretation of this phenomenon is unclear. One possible interpretation is that a protein with higher PPII propensity would require more energy to destabilize the PPII structure(s), yielding a less steep slope. Conversely, a protein with a lower PPII propensity would be more easily unfolded to random coil, thereby showing a steeper slope. Without data saturating the complete random coil and complete PPII conformations of each protein, which would require specialized equipment and different buffer conditions to make measurements approaching -100°C and over 100°C [40], it would be difficult to interpret these data in a more quantitative manner.

Figure 4. Circular dichroism wavelength scans show constructs have primarily random coil and PPII helix content. (A) shows the wavelength scans for Aap-PGR from 5°C - 95°C (cool to warm colors), while (B) shows the difference of each scan minus 95°C. This highlights the shift in the local maximum around 220 nm. The insets highlight the linearity of the CD signal at 200 nm (A) or 220 nm (B). Data are shown for SasG-PGR (C, D) and Arpts (E, F). The slope determined by linear regression of the CD signal (200 nm) vs temperature is listed as "m" in (A, C and E).
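The slope m reported for the CD signal at 200 nm comes from an ordinary linear regression of signal against temperature. A minimal sketch of that calculation is given here; the function name `cd_slope` and the example numbers are hypothetical and are not the measured data, and the software used for the original fits is not stated in the text.

```python
def cd_slope(temps_c, signal):
    """Ordinary least-squares fit of CD signal vs temperature.

    Returns (slope m, intercept b, R^2). An R^2 near 1 is consistent
    with the linear, non-cooperative behavior described in the text.
    """
    n = len(temps_c)
    if n != len(signal) or n < 2:
        raise ValueError("need two or more paired points")
    mean_t = sum(temps_c) / n
    mean_y = sum(signal) / n
    sxx = sum((t - mean_t) ** 2 for t in temps_c)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(temps_c, signal))
    m = sxy / sxx
    b = mean_y - m * mean_t
    # Coefficient of determination as a simple check on linearity.
    ss_res = sum((y - (m * t + b)) ** 2 for t, y in zip(temps_c, signal))
    ss_tot = sum((y - mean_y) ** 2 for y in signal)
    r2 = 1.0 - ss_res / ss_tot
    return m, b, r2


# Hypothetical, perfectly linear data: 5 to 95 degrees C in 10-degree steps.
temps = list(range(5, 100, 10))
signal_200nm = [-20000.0 + 60.0 * t for t in temps]
m, b, r2 = cd_slope(temps, signal_200nm)
print(f"m = {m:.1f} per degree C, R^2 = {r2:.3f}")
```

A sigmoidal (cooperative) transition would show up as a clearly lower R^2 and systematic residuals, which is why the linear fit doubles as a rough check for α-helix or β-sheet content.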


Cosolvents perturb secondary structure to varied degrees

In Figure 5, we examined the effect of denaturants on the CD signal. Urea and guanidinium chloride (GdnHCl) destabilize the folded state of globular proteins via favorable interactions with the peptide backbone and hydrophobic residues [43]. Generally, GdnHCl is at least twice as effective as urea at denaturing folded proteins, but differences can be expected depending on amino acid composition. Interestingly, once a protein is unfolded, these denaturants will induce PPII [41, 44-46]. Figure 5 focuses on the region around the local maximum at 220 nm. The strong absorbance of urea and GdnHCl precludes reliable measurements below this range. Aap-PGR showed only a slight increase in PPII, indicated by the lack of significant change in the CD signal at 220 nm in the presence of urea or GdnHCl. SasG-PGR and Arpts showed much larger increases around 220 nm, especially at 4°C where PPII is more stable, suggesting a larger increase in the PPII content upon urea and GdnHCl addition. Qualitatively, the magnitude of the changes trends inversely with fPPII. A possible interpretation of these data is that Aap-PGR does not show much of an increase in PPII content because it already has a high PPII content (high fPPII), whereas Arpts has a lower PPII content to begin with, so more PPII can be induced by denaturant.

Figure 5. Constructs respond to denaturants to different extents. Concentrations of urea (A, C and E) were 0, 2, 4, 6 M, colored from darkest to lightest fill. (B, D and F) show 0, 2, 4, 6 M GdnHCl.

Figure 6 shows similar CD measurements in the presence of a stabilizing osmolyte, TMAO, or an alcohol, TFE. TMAO is a cosolvent that is capable of inducing compact conformations, often the functional native state, primarily by having strongly unfavorable interactions with the peptide backbone. TMAO is particularly useful because it does not force a specific structure; it simply destabilizes the unfolded or expanded state, which shifts the equilibrium from unfolded states to the folded, native state [43, 47-49]. This is in stark contrast to TFE, which induces α-helix formation, even when this is non-native [45, 50-53].

The addition of TMAO to Aap-PGR has essentially no effect (Figure 6A). We previously hypothesized this was due to the high frequency of charged residues, which form favorable interactions with TMAO and therefore seemingly counter the backbone-TMAO effects [7, 47]. If this is a valid explanation for the lack of effect, then we should also see a similar result for SasG-PGR and Arpts, which have a greater fraction of charged residues (Table 1). Indeed, we see little to no effect of TMAO on the CD spectrum of these two IDPs, although some difference is observed at low temperatures in both cases (Figure 6). A lack of response to TMAO has been observed in other IDPs, including myelin basic protein [54] and Starmaker-like protein [55], which also have high FCR values of 24% and 47%, respectively. These two proteins have very low proline content (3% and 6%, respectively), so while proline also has a favorable interaction with TMAO, the charged residues are likely the dominant factor in these cases [47]. However, the decreased proline content in SasG-PGR and Arpts (Table 1) may allow for the slight effect of TMAO (Figure 6). The lack of a response to TMAO could also suggest that the native state is already populated, and there is not a separate, compact state which might be induced upon binding of a ligand, as is the case with some IDPs [14, 15].


Figure 6. Comparing the response to TMAO and TFE. Concentrations of TMAO (A, C and E) were 0 (dark fill) and 3 M (light fill). The ability of TFE to perturb the secondary structure is shown in (B, D and F). TFE concentrations were 0, 15%, 45% and 75% (from dark fill to light fill).


The response to TFE was much greater than the response to TMAO (Figure 6). As discussed elsewhere, Aap-PGR has an apparent shift from PPII to random coil upon TFE addition, but no indication of α-helix formation [7]. This is not surprising due to the high frequency of proline, which sterically prohibits α-helix formation. However, we expect that with the lower frequency of proline in SasG-PGR and Arpts, TFE might have a stronger effect. Indeed, we do see a much more significant effect of TFE on SasG-PGR and Arpts. CD spectra of α-helices show minima at 222 nm and 208 nm [42]. Both of these proteins show clear development of the local minimum around 222 nm and a shift of the ~200 nm minimum toward the 208 nm minimum, but there are potentially interesting differences. For SasG-PGR, there is little or no weakening of the ~200 - 208 nm minimum, while Arpts (and Aap-PGR) show a strong weakening of this signal.


Discussion

After our investigation of Aap-PGR, the proline/glycine-rich region from Aap, we became curious whether other low-complexity, stalk-like regions would show a similar resistance toward compaction [7]. Therefore, we chose several additional regions to investigate and compare to Aap-PGR. These regions include the similar stalk-like region of SasG, the Staphylococcus aureus ortholog of the S. epidermidis protein Aap, and the serine-aspartate region of SdrC, which spans the space between the functional domain(s) and the cell wall-anchoring motif of this protein [17].

We began by showing sequence-based predictions suggesting these proteins will be disordered in solution. The amino acid compositions alone, which lacked many hydrophobic residues (order-promoting residues) and were enriched with polar and charged residues (disorder-promoting residues), are strong indicators of disorder [13, 14]. Furthermore, our sequences of interest have a wide range of net charge (Figure 2, Table 1). This allowed for evaluation of potential charge effects on conformation. We also predicted the hydrodynamic radius (Rh) of each sequence based on PPII propensity and charge effects, which suggested PPII was a strong contributor to Rh in sequences with medium or high proline content, but charge effects might be much more dominant in the sequences lacking prolines. In cases where the latter is true, high NaCl concentrations might be expected to cause decreases in Rh, whereas NaCl will not affect PPII content (as was shown with Aap-PGR [7]). Dynamic light scattering or sedimentation velocity AUC experiments to look at Rh or conformational preferences in the presence of increasing NaCl concentrations are planned, but have not yet been performed. Sedimentation velocity AUC showed that each IDP is monomeric and exists in highly elongated conformations (Figure 3 and Table 3). The calculated Rs (Stokes-Einstein radius) trends well with the predicted Rh based on PPII and charge.

Using circular dichroism (CD), we verified that Aap-PGR, SasG-PGR and Arpts exist in a PPII-random coil equilibrium, which can be modulated by temperature (Figure 4). The dependence of the CD spectra on temperature correlated with the predicted fPPII. The ability of chemical denaturants to increase the PPII signal showed an inverse trend with fPPII; specifically, there was a strong increase in PPII with denaturant concentration in the sequences with lower PPII propensity.

With all IDPs tested, there was very little change in the CD spectra upon TMAO addition (Figure 6). This is likely due to the high fraction of charged residues, with some additional contribution from proline content. It may also suggest that these IDPs do not form additional compact states, but actually remain disordered in their native states. This is in line with our hypothesis that SasG-PGR, like Aap-PGR, might be an extended stalk that is resistant to compaction. The role of the A-repeat region of Aap is unclear, but it seems unlikely from the lack of TMAO response that the region binds a ligand (lectin and/or B-repeat region) and adopts a new conformation. This result, however, does not preclude the possibility that the A-repeat region interacts with said regions in a manner which doesn't require major conformational changes. More appropriate approaches to evaluating the ability of the A-repeat region to bind the lectin and/or B-repeat region include testing for binding by isothermal titration calorimetry, assembly by AUC and structural characterization of an A-repeat-lectin-B-repeat construct.


The addition of α-helix-inducing TFE had more significant effects than TMAO on all IDPs, but development of the characteristic 208 and 222 nm α-helix minima was clear only in SasG-PGR and Arpts. In Aap-PGR, prolines occur frequently at every third position, making it less likely that helices could form in these regions. SasG-PGR also contains a proline in every third position throughout the first half of the region; however, the second half becomes more variable and lacks proline residues. This likely allows significant α-helix formation to occur in the presence of TFE. While Arpts does contain a fair number of prolines, they are much more spaced out (~11-15 residues apart), which probably allows for short, interspersed α-helices to form (the average length of an α-helix is 10-15 residues [56]).

So far, it does appear that the regions of low complexity investigated in this study are disordered and have a conformational bias toward extended states. While PPII propensity appears to contribute to the preference for extended conformations of Aap-PGR and SasG-PGR, it will be important to examine whether the IDPs with lower proline content and higher net charge are susceptible to compaction in the presence of high salt concentrations. To better understand the impact of temperature- and cosolvent-dependent secondary structure changes on conformation, dynamic light scattering (DLS) [7, 28] and/or small-angle X-ray scattering (SAXS) [57] should be performed on these IDPs. This could allow for a better interpretation of the CD observations, and may also provide useful insights into why Aap-PGR (and maybe other IDPs) is resistant to compaction.

The results from this work support the hypothesis that a highly extended stalk is potentially quite common, and likely functionally beneficial, among cell wall-anchored proteins involved in biofilm formation. Considering we have only examined a handful of cell wall-anchored proteins, future work may benefit from bioinformatics approaches to examine trends in PPII propensity and charge density among a greater variety of cell wall-anchored proteins. It is intriguing that two separate mechanisms have apparently evolved for achieving extended conformations: 1) high PPII propensity via high proline frequency and 2) high charge density. The reason for the multiple mechanisms could be that the specific environment of a particular bacterium is better suited to one mechanism or the other. For example, if a bacterium is required to form biofilms in environments exposed to high temperatures, a proline-rich region might be advantageous, as we observed little temperature dependence of the secondary structure in Aap-PGR and SasG-PGR. On the other hand, differences in the biofilm matrix may be a determining factor. A stalk with a very long stretch of high charge density (e.g. SD repeats in the Sdr family of proteins) might be able to interact with polysaccharide intercellular adhesins (positively-charged) or teichoic acids (zwitterionic) in the biofilm matrix. Approaches such as swapping stalk regions between cell wall-anchored proteins among different bacteria or reducing the PPII propensity of specific stalk regions and testing the capacity for biofilm formation will be indispensable in determining the functional importance of this work.


Materials and methods

Cloning and protein expression

SasG-PGR, Arpts and SdrC-SD genes were ordered as IDT gBlocks with 5' CACC sequences for cloning into the Gateway Cloning System via pENTR/D-TOPO reaction, followed by TEV cleavage sites (ENLYFQ/G) and the sequences in Figure 1, followed by a stop codon. Each gene was transferred from the pENTR vector to a destination vector containing an N-terminal His6 tag and maltose binding protein (MBP), which was kindly provided by Dr. Artem Evdokimov. Aap-PGR DNA was synthesized by LifeTechnologies GeneArt® as previously described and moved into the same His6-MBP destination vector [7]. SD-30mer was ordered as a peptide from Peptide 2.0 at ≥95% purity.

BLR(DE3) cells were transformed with the destination vector containing the gene of interest. Overnight cultures were grown at 37 °C, then 25 ml was used to inoculate 1 l of LB containing ampicillin and tetracycline for antibiotic selection. The cultures were grown to an OD600 of 0.8 - 1.0 at 37 °C, shaking at 200 - 250 rpm. The cultures were then placed in an ice bath until cooled to 10 °C. At this point, ethanol was added to 2% final concentration (volume/volume) and IPTG to 200 µM. The cultures were placed back into a shaker to incubate overnight at 20 °C. The following morning, cultures were centrifuged at 4,500 rpm for 1 hr, the supernatants were discarded, and the pellets were resuspended in 20 mM Tris (pH 7.4) and 300 - 500 mM NaCl. Resuspended pellets were stored at -20 °C.


Protein purification

Frozen pellets were thawed and sonicated to lyse the bacteria, the lysate was centrifuged for 45 min at 14,000 rpm, the supernatant was filtered through a 0.22 µm filter, and the protein was purified by Ni2+-affinity chromatography using an Akta FPLC Chromatography System (GE Healthcare). The fusion proteins were eluted with a gradient of 1 M imidazole and dialyzed into 20 mM Tris (pH 7.4) and 300 mM NaCl. The protein was then incubated with TEV protease for 6 - 8 hours at room temperature. A subtractive Ni2+-affinity step was performed to capture uncleaved His6-tagged fusion protein, His6-tagged MBP, and His6-tagged TEV protease, while cleaved protein of interest did not interact with the Ni2+-affinity column. The cleaved protein of interest was then purified by size exclusion chromatography using a Superdex 75 prep grade column (GE Healthcare). Where necessary, only the fractions containing the highest purity of full-length protein were collected from the Superdex 75 elution, or ion exchange chromatography was used to remove contaminants or degraded species.

Analytical ultracentrifugation

Sedimentation velocity experiments were performed on a Beckman Coulter XL-I AUC, using 1.2 cm two-sector epon-charcoal centerpieces. Experiments were performed at 48,000 rpm at 20 °C in an An-60 Ti rotor and were continued until sedimentation was complete or back-diffusion became obvious in the raw data (~20-24 hours). Data analysis was completed in SEDFIT using the continuous c(s) distribution model [38]. SEDNTERP was used to estimate partial specific volume and buffer density and viscosity [58].
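For orientation, a Stokes radius can be related to a sedimentation coefficient through the Svedberg equation combined with Stokes' law, Rs = M(1 - v̄ρ) / (N_A · 6πη · s). The sketch below only illustrates that textbook relation; it is not SEDFIT's implementation, and the function name and the example inputs (roughly serum-albumin-like values in water at 20 °C) are assumptions rather than values from this study.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def stokes_radius_nm(molar_mass_g_mol, s_svedberg, vbar_ml_g,
                     density_g_ml, viscosity_pa_s):
    """Stokes radius (nm) from the Svedberg equation and Stokes' law."""
    mass_kg_mol = molar_mass_g_mol * 1e-3
    s_seconds = s_svedberg * 1e-13              # 1 S = 1e-13 s
    buoyancy = 1.0 - vbar_ml_g * density_g_ml   # (ml/g)*(g/ml) is unitless
    # Frictional coefficient f = M(1 - vbar*rho) / (N_A * s), in kg/s.
    friction = mass_kg_mol * buoyancy / (AVOGADRO * s_seconds)
    # Stokes' law: f = 6*pi*eta*Rs.
    radius_m = friction / (6.0 * math.pi * viscosity_pa_s)
    return radius_m * 1e9


# Hypothetical inputs: 66.4 kDa, 4.3 S, vbar 0.733 ml/g, water at 20 C.
rs = stokes_radius_nm(66400.0, 4.3, 0.733, 0.99823, 1.002e-3)
print(f"Rs = {rs:.2f} nm")
```

For a fixed molar mass, a smaller sedimentation coefficient gives a larger Rs, which is the sense in which slowly sedimenting monomers are read as highly elongated.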

Circular dichroism

An Aviv 215 spectrophotometer was used to perform CD measurements. The instrument is equipped with an Aviv Peltier junction temperature control system. A 0.05 cm quartz cuvette (Hellma Analytics) was used for all measurements. Temperature-dependence experiments were performed using protein dialyzed into 20 mM potassium phosphate (pH 7.4) and 50 mM NaF. For cosolvent experiments, concentrated protein samples were mixed with appropriate amounts of cosolvent or water to match protein concentrations within experiments. The urea used was an 8 M ultra-pure grade solution (Amresco), GdnHCl was an 8 M high purity solution (Pierce), TFE was >99.0% purity (Sigma-Aldrich), and TMAO was a 95% solid (Sigma-Aldrich), which was dissolved into water before addition to samples. Buffer blanks containing equal amounts of cosolvent were subtracted from all protein-cosolvent spectra at each temperature.

To convert data to mean residue ellipticity, [θ], Equation 1 was used. In this equation, θ is raw machine units output by the instrument, MRW is the mean residue weight, l is the path length of 0.05 cm, and c is the concentration in mg/ml units.

$$[\theta] = \frac{\theta \times \mathrm{MRW}}{10 \times l \times c} \qquad \text{(Equation 1)}$$
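The conversion to mean residue ellipticity can be sketched as a short function. This is an illustration only: it assumes the raw signal θ is in millidegrees and uses the conventional factor of 10, which gives [θ] in deg·cm²·dmol⁻¹. The function names are hypothetical, and the MRW helper uses the common convention MRW = M / (N - 1), which the text does not state explicitly.

```python
def mean_residue_weight(molar_mass_da, n_residues):
    """Mean residue weight, assuming the M / (N - 1) convention."""
    return molar_mass_da / (n_residues - 1)


def mean_residue_ellipticity(theta_mdeg, mrw, path_cm, conc_mg_ml):
    """[theta] in deg cm^2 dmol^-1 from raw ellipticity in millidegrees.

    theta_mdeg : raw signal (assumed to be millidegrees)
    mrw        : mean residue weight (Da per residue)
    path_cm    : cuvette path length in cm (0.05 cm in this work)
    conc_mg_ml : protein concentration in mg/ml
    """
    return theta_mdeg * mrw / (10.0 * path_cm * conc_mg_ml)


# Hypothetical reading: -10 mdeg, MRW 110 Da, 0.05 cm cell, 0.2 mg/ml.
print(mean_residue_ellipticity(-10.0, 110.0, 0.05, 0.2))
```

Normalizing by MRW, path length and concentration is what allows spectra from constructs of different lengths and concentrations to be compared on one axis.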


Supplementary Tables

IDP | N | Net charge | Rh (coil) | Rh (PPII) | Rh (PPII charge) | Coil bias | PPII bias | Charge bias | fPPII
Aap-PGR | 135 | -7 | 25.64 | 38.50 | 37.84 | 67.75 | 29.05 | 3.20 | 0.5350
p53(1-93) | 93 | -15 | 21.24 | 29.51 | 30.56 | 69.52 | 21.99 | 8.49 | 0.4890
SasG-PGR | 69 | +7 | 18.27 | 24.56 | 24.43 | 74.79 | 20.25 | 4.96 | 0.4761
p53(1-93) ALA- | 93 | -15 | 21.24 | 28.66 | 29.70 | 71.52 | 19.74 | 8.74 | 0.4581
p53 TAD | 73 | -14 | 18.80 | 24.79 | 25.84 | 72.77 | 17.86 | 9.37 | 0.4500
Arpts | 189 | -29 | 30.38 | 41.26 | 44.06 | 68.95 | 19.66 | 11.39 | 0.4190
Securin | 202 | -1 | 31.41 | 42.57 | 40.45 | 77.65 | 21.92 | 0.43 | 0.4130
PDE-γ | 87 | +4 | 20.54 | 26.51 | 25.70 | 79.92 | 17.39 | 2.69 | 0.4122
Cad136 | 136 | +9 | 25.73 | 33.77 | 33.45 | 76.93 | 18.41 | 4.66 | 0.4025
HIF1-α-403 | 202 | -29 | 31.41 | 42.13 | 44.86 | 70.03 | 18.78 | 11.18 | 0.4024
Tau-K45 | 198 | +19 | 31.10 | 41.52 | 42.53 | 73.11 | 19.16 | 7.73 | 0.3988
HIF1-α-530 | 170 | -10 | 28.80 | 37.81 | 37.44 | 76.91 | 18.47 | 4.62 | 0.3899
Fos-AD | 168 | -16 | 28.62 | 37.17 | 37.84 | 75.64 | 17.05 | 7.31 | 0.3783
ShB-C | 146 | -4 | 26.67 | 34.32 | 33.06 | 80.65 | 17.25 | 2.09 | 0.3764
α-synuclein | 140 | -9 | 26.11 | 33.47 | 33.12 | 78.83 | 16.47 | 4.70 | 0.3744
Mlph(147-403) | 260 | -28 | 35.68 | 47.00 | 49.24 | 72.46 | 17.70 | 9.84 | 0.3703
CFTR-R-region | 189 | -5 | 30.38 | 39.18 | 37.82 | 80.31 | 17.40 | 2.29 | 0.3644
p57-ID | 73 | -6 | 18.80 | 23.14 | 22.80 | 82.45 | 13.00 | 4.55 | 0.3636
prothymosin-α | 110 | -43 | 23.12 | 29.02 | 34.77 | 66.50 | 12.10 | 21.40 | 0.3633
LJIDP1 | 94 | +4 | 21.36 | 26.46 | 25.59 | 83.45 | 13.85 | 2.70 | 0.3565
Mlph(147-240) | 97 | -15 | 21.70 | 26.85 | 27.86 | 77.89 | 12.79 | 9.32 | 0.3528
SNAP25 | 206 | -14 | 31.73 | 40.60 | 40.70 | 77.94 | 16.11 | 5.95 | 0.3513
Hdm2-ABD | 97 | -29 | 21.70 | 26.47 | 29.91 | 72.56 | 10.67 | 16.78 | 0.3345
SdrC-SD | 62 | -16 | 17.31 | 20.64 | 22.15 | 78.18 | 9.33 | 12.50 | 0.3294
Vmw65 | 89 | -19 | 20.78 | 25.13 | 26.90 | 77.24 | 10.54 | 12.22 | 0.3278
p53(1-93) PRO- | 93 | -15 | 21.24 | 24.93 | 25.97 | 81.79 | 8.22 | 9.99 | 0.2832
SD-30mer | 30 | -15 | 12.01 | 13.45 | 15.16 | 79.18 | 3.71 | 17.11 | 0.2700

Table S1. Sequence-based parameters of IDP dataset. The dataset is reproduced from Tomasso, et al. [26]. Parameters listed here were calculated using a program provided by Steven Whitten, based on Tomasso, et al. [26]. Shaded IDPs are from the current study. IDPs are sorted by descending fPPII.


IDP | Sequence
p53(1-93) | MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGPDEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPL
p53(1-93) ALA- | MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQGMDDLMLSPDDIEQWFTEDPGPDEGPRMPEGGPPVGPGPGGPTPGGPGPGPSWPL
p53(1-93) PRO- | MEEGQSDGSVEGGLSQETFSDLWKLLGENNVLSGLGSQAMDDLMLSGDDIEQWFTEDGGGDEAGRMGEAAGGVAGAGAAGTGAAGAGAGSWGL
p53 TAD | MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGPDEAPRMPEAAPRV
Vmw65 | GSAGHTRRLSTAPPTDVSLGDELHLDGEDVAMAHADALDDFDLDMLGDGDSPGPGFTPHDSAPYGALDMADFEFEQMFTDALGIDEYGG
Hdm2-ABD | ERSSSSESTGTPSNPDLDAGVSEHSGDWLDQDSVSDQFSVEFEVESLDSEDYSLSEEGQELSDEDDEVYQVTVYQAGESDTDSFEEDPEISLADYWK
prothymosin-α | MSDAAVDTSSEITTKDLKEKKEVVEEAENGRDAPANGNANEENGEQEADNEVDEEEEEGGEEEEEEEEGDGEEEDGDEDEEAESATGKRAAEDDEDDDVDTKKQKTDEDD
HIF1-α-403 | PAAGDTIISLDFGSNDTETDDQQLEEVPLYNDVMLPSPNEKLQNINLAMSPLPTAETPKPLRSSADPALNQEVALKLEPNPESLELSFTMPQIQDQTPSPSDGSTRQSSPEPNSPSEYCFYVDSDMVNEFKLELVEKLFAEDTEAKNPFSTQDTDLDLEMLAPYIPMDDDFQLRSFDQLSPLESSSASPESASPQSTVTVFQ
Fos-AD | GSHMSVASLDLTGGLPEVATPESEEAFTLPLLNDPEPKPSVEPVKSISSMELKTEPFDDFLFPASSRPSGSETARSVPDMDLSGSFYAADWEPLHSGSLGMGPMATELEPLCTPVVTCTPSCTAYTSSFVFTYPEADSFPSCAAAHRKGSSSNEPSSDSLSSPTLLAL
Mlph(147-240) | RLQGGGGSEPSLEEGNGDSEQTDEDGDLDTEARDQPLNSKKKKRLLSFRDVDFEEDSDHLVQPCSQTLGLSSVPESAHSLQSLSGEPYSEDTTSLEP
Tau-K45 | MSSPGSPGTPGSRSRTPSLPTPPTREPKKVAVVRTPPKSPSSAKSRLQTAPVPMPDLKNVKSKIGSTENLKHQPGGGKVQIINKKLDLSNVQSKCGSKDNIKHVPGGGSVQIVYKPVDLSKVTSKCGSLGNIHHKPGGGQVEVKSEKLDFKDRVQSKIGSLDNITHVPGGGNKKIETHKLTFRENAKAKTDHGAEIVY
Mlph(147-403) | RLQGGGGSEPSLEEGNGDSEQTDEDGDLDTEARDQPLNSKKKKRLLSFRDVDFEEDSDHLVQPCSQTLGLSSVPESAHSLQSLSGEPYSEDTTSLEPEGLEETGARALGCRPSPEVQPCSPLPSGEDAHAELDSPAASCKSAFGTTAMPGTDDVRGKHLPSQYLADVDTSDEDSIQGPRAASQHSKRRARTVPETQILELNKRMSAVEHLLVHLENTVLPPSAQEPTVETHPSADTEEETLRRRLEELTSNISGSSTSSE
p57-ID | VRTSACRSLFGPVDHEELSRELQARLAELNAEDQNRWDYDFQQDMPLRGPGRLQWTEVDSDSVPAFYRETVQV
PDE-γ | MNLEPPKAEIRSATRVMGGPVTPRKGPPKFKQRQTRQFKSKPPKKGVQGFGDDIPGMEGLGTDITVICPWEAFNHLELHELAQYGII
LJIDP1 | MARSFTNIKAISALVAEEFSNSLARRGYAATAQSAGRVGASMSGKMGSTKSGEEKAAAREKVSWVPDPVTGYYKPENIKEIDVAELRSAVLGKN
Cad136 | RLEQYTSAVVGNKAAKPAKPAASDLPVPAEGVRNIKSMWEKGNVFSSPGGTGTPNKETAGLKVGVSSRINEWLTKTPEGNKSPAPKPSDLRPGDVSGKRNLWEKQSVEKPAASSSKVTATGKKSETNGLRQFEKEP
α-synuclein | MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEA
CFTR-R-region | GAMESAERRNSILTETLHRFSLEGDAPVSWTETKKQSFKQTGEFGEKRKNSILNPINSIRKFSIVQKTPLQMNGIEEDSDEPLERRLSLVPDSEQGEAILPRISVISTGPTLQARRRQSVLNLMTHSVNQGQNIHRKTTASTRKVSLAPQANLTELDIYSRRLSQETGLEISEEINEEDLKECLFDDME
SNAP25 | MAEDADMRNELEEMQRRADQLADESLESTRRMLQLVEESKDAGIRTLVMLDEQGEQLERIEEGMDQINKDMKEAEKNLTDLGKFCGLCVCPCNKLKSSDAYKKAWGNNQDGVVASQPARVVDEREQMAISGGFIRRVTNDARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSNKTRIDEANQRATKMLGSG
ShB-C | MTLGQHMKKSSLSESSSDMMDLDDGVESTPGLTETHPGRSAVAPFLGAQQQQQQPVASSLSMSIDKQLQHPLQQLTQTQLYQQQQQQQQQQQNGFKQQQQQTQQQLQQQQSHTINASAAAATSGSGSSGLTMRHNNALAVSIETDV
HIF1-α-530 | NEFKLELVEKLFAEDTEAKNPFSTQDTDLDLEMLAPYIPMDDDFQLRSFDQLSPLESSSASPESASPQSTVTVFQQTQIQEPTANATTTTATTDELKTVTKDRMEDIKILIASPSPTHIHKETTSATSSPYRDTQSRTASPNRAGKGVIEQTEKSHPRSPNVLSVALSQR
Securin | MATLIYVDKENGEPGTRVVAKDGLKLGSGPSIKALDGRSQVSTPRFGKTFDAPPALPKATRKALGTVNRATEKSVKTKGPLKQKQPSFSAKKMTEKTVKAKSSVPASDDAYPEIEKFFPFNPLDFESFDLPEEHQIAHLPLSGVPLMILDEERELEKLFQLGPPSPVKMPSPPWESNLLQSPSSILSTLDVELPPVCCDIDI
Aap-PGR | AEPGKPAEPGKPAEPGKPAEPGTPAEPGKPAEPGTPAEPGKPAEPGKPAEPGKPAEPGKPAEPGTPAEPGTPAEPGKPAEPGTPAEPGKPAEPGTPAEPGKPAESGKPVEPGTPAQSGAPEQPNRSMHSTDNKNQ
SasG-PGR | PKDPKGPENPEKPSRPTHPSGPVNPNNPGLSKDRAKPNGPVHSMDKNDKVKKSKIAKESVANQEKKRAE
Arpts | NNEAPQMSSTLQAEEGSNAEAPQSEPTKAEEGGNAEAAQSEPTKAEEGGNAEAPQSEPTKAEEGGNAEAAQSEPTKTEEGSNVKAAQSEPTKAEEGSNAEAPQSEPTKTEEGSNAKAAQSEPTKAEEGGNAEAAQSEPTKTEEGSNAEAPQSEPTKAEEGGNAEAPQSEPTKTEEGGNAEAPNVPTIKA
SdrC-SD | SDSDSDSDSDSDSDSDSDSDSDSDSDSDNDSDSDSDSDSDAGKHTPAKPMSTVKDQHKTAKA
SD-30mer | SDSDSDSDSDSDSDSDSDSDSDSDSDSDSD

Table S2. The sequence of IDPs used in PPII and Rh predictions. IDP sequences (other than those from the current study - shaded) are from Tomasso, et al. supplementary material [26].
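As a quick consistency check between the sequences in Table S2 and the N and net charge columns of Table S1, the residues can simply be counted. The sketch below is an illustration and not the program used to generate Table S1: the function name is hypothetical, and it assumes the simple convention that Asp and Glu count as -1, Lys and Arg as +1, and His as neutral.

```python
def sequence_summary(seq):
    """Length, net charge and composition fractions of a protein sequence.

    Assumed charge convention: D/E = -1, K/R = +1, all others neutral.
    Whitespace (e.g. from wrapped table cells) is ignored.
    """
    seq = "".join(seq.split()).upper()
    n = len(seq)
    positive = sum(seq.count(res) for res in "KR")
    negative = sum(seq.count(res) for res in "DE")
    return {
        "N": n,
        "net_charge": positive - negative,
        "fraction_charged": (positive + negative) / n,
        "fraction_proline": seq.count("P") / n,
    }


# Two sequences from Table S2, written compactly with repeat counts.
SD_30MER = "SD" * 15
SDRC_SD = "SD" * 14 + "ND" + "SD" * 5 + "A" + "GKHTPAKPMSTVKDQHKTAKA"

for name, seq in (("SD-30mer", SD_30MER), ("SdrC-SD", SDRC_SD)):
    print(name, sequence_summary(seq))
```

Under this convention the counts for these two sequences agree with the N and net charge values listed for them in Table S1.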


References

1. Otto M. Staphylococcus epidermidis--the 'accidental' pathogen. Nature reviews Microbiology. 2009;7(8):555-67. doi: 10.1038/nrmicro2182. PubMed PMID: 19609257; PubMed Central PMCID: PMC2807625.
2. Tong SY, Davis JS, Eichenberger E, Holland TL, Fowler VG, Jr. Staphylococcus aureus infections: epidemiology, pathophysiology, clinical manifestations, and management. Clinical microbiology reviews. 2015;28(3):603-61. Epub 2015/05/29. doi: 10.1128/cmr.00134-14. PubMed PMID: 26016486; PubMed Central PMCID: PMC4451395.
3. CDC. National Nosocomial Infections Surveillance (NNIS) system report.
4. Rohde H, Burandt EC, Siemssen N, Frommelt L, Burdelski C, Wurster S, et al. Polysaccharide intercellular adhesin or protein factors in biofilm accumulation of Staphylococcus epidermidis and Staphylococcus aureus isolated from prosthetic hip and knee joint infections. Biomaterials. 2007;28(9):1711-20. doi: 10.1016/j.biomaterials.2006.11.046. PubMed PMID: 17187854.
5. Speziale P, Pietrocola G, Foster TJ, Geoghegan JA. Protein-based biofilm matrices in Staphylococci. Frontiers in cellular and infection microbiology. 2014;4:171. Epub 2014/12/30. doi: 10.3389/fcimb.2014.00171. PubMed PMID: 25540773; PubMed Central PMCID: PMC4261907.
6. Schaeffer CR, Woods KM, Longo GM, Kiedrowski MR, Paharik AE, Buttner H, et al. Accumulation-associated protein enhances Staphylococcus epidermidis biofilm formation under dynamic conditions and is required for infection in a rat catheter model. Infection and immunity. 2015;83(1):214-26. Epub 2014/10/22. doi: 10.1128/iai.02177-14. PubMed PMID: 25332125; PubMed Central PMCID: PMC4288872.
7. Yarawsky AE, English LR, Whitten ST, Herr AB. The Proline/Glycine-Rich Region of the Biofilm Adhesion Protein Aap Forms an Extended Stalk that Resists Compaction. Journal of molecular biology. 2017;429(2):261-79. Epub 2016/11/29. doi: 10.1016/j.jmb.2016.11.017. PubMed PMID: 27890783.
8. Conrady DG, Brescia CC, Horii K, Weiss AA, Hassett DJ, Herr AB. A zinc-dependent adhesion module is responsible for intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2008;105(49):19456-61. doi: 10.1073/pnas.0807717105. PubMed PMID: 19047636; PubMed Central PMCID: PMC2592360.
9. Hussain M, Herrmann M, von Eiff C, Perdreau-Remington F, Peters G. A 140-kilodalton extracellular protein is essential for the accumulation of Staphylococcus epidermidis strains on surfaces. Infection and immunity. 1997;65(2):519-24. Epub 1997/02/01. PubMed PMID: 9009307; PubMed Central PMCID: PMC176090.
10. Macintosh RL, Brittan JL, Bhattacharya R, Jenkinson HF, Derrick J, Upton M, et al. The terminal A domain of the fibrillar accumulation-associated protein (Aap) of Staphylococcus epidermidis mediates adhesion to human corneocytes. Journal of bacteriology. 2009;191(22):7007-16. Epub 2009/09/15. doi: 10.1128/jb.00764-09. PubMed PMID: 19749046; PubMed Central PMCID: PMC2772481.
11. Rohde H, Burdelski C, Bartscht K, Hussain M, Buck F, Horstkotte MA, et al. Induction of Staphylococcus epidermidis biofilm formation via proteolytic processing of the accumulation-associated protein by staphylococcal and host proteases. Molecular microbiology. 2005;55(6):1883-95. doi: 10.1111/j.1365-2958.2005.04515.x. PubMed PMID: 15752207.
12. Paharik AE, Kotasinska M, Both A, Hoang TN, Buttner H, Roy P, et al. The metalloprotease SepA governs processing of accumulation-associated protein and shapes intercellular adhesive surface properties in Staphylococcus epidermidis. Molecular microbiology. 2016. Epub 2016/12/21. doi: 10.1111/mmi.13594. PubMed PMID: 27997732.
13. Dunker AK, Lawson JD, Brown CJ, Williams RM, Romero P, Oh JS, et al. Intrinsically disordered protein. Journal of molecular graphics & modelling. 2001;19(1):26-59. Epub 2001/05/31. PubMed PMID: 11381529.
14. Uversky VN. What does it mean to be natively unfolded? European journal of biochemistry / FEBS. 2002;269(1):2-12. Epub 2002/01/11. PubMed PMID: 11784292.
15. van der Lee R, Buljan M, Lang B, Weatheritt RJ, Daughdrill GW, Dunker AK, et al. Classification of intrinsically disordered regions and proteins. Chemical reviews. 2014;114(13):6589-631. Epub 2014/04/30. doi: 10.1021/cr400525m. PubMed PMID: 24773235; PubMed Central PMCID: PMC4095912.
16. Corrigan RM, Rigby D, Handley P, Foster TJ. The role of Staphylococcus aureus surface protein SasG in adherence and biofilm formation. Microbiology (Reading, England). 2007;153(Pt 8):2435-46. Epub 2007/07/31. doi: 10.1099/mic.0.2007/006676-0. PubMed PMID: 17660408.
17. McCrea KW, Hartford O, Davis S, Eidhin DN, Lina G, Speziale P, et al. The serine-aspartate repeat (Sdr) protein family in Staphylococcus epidermidis. Microbiology (Reading, England). 2000;146(Pt 7):1535-46. Epub 2000/07/06. doi: 10.1099/00221287-146-7-1535. PubMed PMID: 10878118.
18. Josefsson E, O'Connell D, Foster TJ, Durussel I, Cox JA. The binding of calcium to the B-repeat segment of SdrD, a cell surface protein of Staphylococcus aureus. The Journal of biological chemistry. 1998;273(47):31145-52. Epub 1998/11/13. PubMed PMID: 9813018.
19. Arrecubieta C, Lee MH, Macey A, Foster TJ, Lowy FD. SdrF, a Staphylococcus epidermidis surface protein, binds type I collagen. The Journal of biological chemistry. 2007;282(26):18767-76. Epub 2007/05/03. doi: 10.1074/jbc.M610940200. PubMed PMID: 17472965.
20. Uversky VN, Gillespie JR, Fink AL. Why are "natively unfolded" proteins unstructured under physiologic conditions? Proteins. 2000;41(3):415-27. Epub 2000/10/12. PubMed PMID: 11025552.
21. Das RK, Pappu RV. Conformations of intrinsically disordered proteins are influenced by linear sequence distributions of oppositely charged residues. Proceedings of the National Academy of Sciences of the United States of America. 2013;110(33):13392-7. Epub 2013/08/01. doi: 10.1073/pnas.1304749110. PubMed PMID: 23901099; PubMed Central PMCID: PMC3746876.
22. Martin EW, Holehouse AS, Grace CR, Hughes A, Pappu RV, Mittag T. Sequence Determinants of the Conformational Properties of an Intrinsically Disordered Protein Prior to and upon Multisite Phosphorylation. Journal of the American Chemical Society. 2016;138(47):15323-35. doi: 10.1021/jacs.6b10272.


23. Kyte J, Doolittle RF. A simple method for displaying the hydropathic character of a protein. Journal of molecular biology. 1982;157(1):105-32. Epub 1982/05/05. PubMed PMID: 7108955.
24. Holehouse AS, Das RK, Ahad JN, Richardson MO, Pappu RV. CIDER: Resources to Analyze Sequence-Ensemble Relationships of Intrinsically Disordered Proteins. Biophysical journal. 2017;112(1):16-21. Epub 2017/01/12. doi: 10.1016/j.bpj.2016.11.3200. PubMed PMID: 28076807; PubMed Central PMCID: PMC5232785.
25. Mao AH, Crick SL, Vitalis A, Chicoine CL, Pappu RV. Net charge per residue modulates conformational ensembles of intrinsically disordered proteins. Proceedings of the National Academy of Sciences of the United States of America. 2010;107(18):8183-8. Epub 2010/04/21. doi: 10.1073/pnas.0911107107. PubMed PMID: 20404210; PubMed Central PMCID: PMC2889596.
26. Tomasso ME, Tarver MJ, Devarajan D, Whitten ST. Hydrodynamic Radii of Intrinsically Disordered Proteins Determined from Experimental Polyproline II Propensities. PLoS computational biology. 2016;12(1):e1004686. Epub 2016/01/05. doi: 10.1371/journal.pcbi.1004686. PubMed PMID: 26727467.
27. Perez RB, Tischer A, Auton M, Whitten ST. Alanine and proline content modulate global sensitivity to discrete perturbations in disordered proteins. Proteins. 2014;82(12):3373-84. Epub 2014/09/23. doi: 10.1002/prot.24692. PubMed PMID: 25244701; PubMed Central PMCID: PMC4237723.
28. Langridge TD, Tarver MJ, Whitten ST. Temperature effects on the hydrodynamic radius of the intrinsically disordered N-terminal region of the p53 protein. Proteins. 2014;82(4):668-78. Epub 2013/10/24. doi: 10.1002/prot.24449. PubMed PMID: 24150971.
29. English LR, Tilton EC, Ricard BJ, Whitten ST. Intrinsic alpha helix propensities compact hydrodynamic radii in intrinsically disordered proteins. Proteins. 2017;85(2):296-311. Epub 2016/12/10. doi: 10.1002/prot.25222. PubMed PMID: 27936491; PubMed Central PMCID: PMC5258847.
30. Adzhubei AA, Sternberg MJ, Makarov AA. Polyproline-II helix in proteins: structure and function. Journal of molecular biology. 2013;425(12):2100-32. Epub 2013/03/20. doi: 10.1016/j.jmb.2013.03.018. PubMed PMID: 23507311.
31. Elam WA, Schrank TP, Campagnolo AJ, Hilser VJ. Evolutionary conservation of the polyproline II conformation surrounding intrinsically disordered phosphorylation sites. Protein science : a publication of the Protein Society. 2013;22(4):405-17. Epub 2013/01/24. doi: 10.1002/pro.2217. PubMed PMID: 23341186; PubMed Central PMCID: PMC3610046.
32. Rucker AL, Pager CT, Campbell MN, Qualls JE, Creamer TP. Host-guest scale of left-handed polyproline II helix formation. Proteins. 2003;53(1):68-75. Epub 2003/08/29. doi: 10.1002/prot.10477. PubMed PMID: 12945050.
33. Shi Z, Chen K, Liu Z, Ng A, Bracken WC, Kallenbach NR. Polyproline II propensities from GGXGG peptides reveal an anticorrelation with beta-sheet scales. Proceedings of the National Academy of Sciences of the United States of America. 2005;102(50):17964-8. Epub 2005/12/07. doi: 10.1073/pnas.0507124102. PubMed PMID: 16330763; PubMed Central PMCID: PMC1312395.


34. Uversky VN. Natively unfolded proteins: a point where biology waits for physics. Protein science : a publication of the Protein Society. 2002;11(4):739-56. Epub 2002/03/23. doi: 10.1110/ps.4210102. PubMed PMID: 11910019; PubMed Central PMCID: PMCPMC2373528. 35. Marsh JA, Forman-Kay JD. Sequence determinants of compaction in intrinsically disordered proteins. Biophysical journal. 2010;98(10):2383-90. Epub 2010/05/21. doi: 10.1016/j.bpj.2010.02.006. PubMed PMID: 20483348; PubMed Central PMCID: PMCPmc2872267. 36. Cantor CR, Schimmel PR. Biophysical Chemistry Part III: The Behaviour of Biological Macromolecules: W. H. Freeman and Co, New York; 1980. 37. Laue TM, Stafford WF, 3rd. Modern applications of analytical ultracentrifugation. Annual review of biophysics and biomolecular structure. 1999;28:75-100. Epub 1999/07/20. doi: 10.1146/annurev.biophys.28.1.75. PubMed PMID: 10410796. 38. Brown PH, Schuck P. Macromolecular size-and-shape distributions by sedimentation velocity analytical ultracentrifugation. Biophysical journal. 2006;90(12):4651-61. Epub 2006/03/28. doi: 10.1529/biophysj.106.081372. PubMed PMID: 16565040; PubMed Central PMCID: PMCPmc1471869. 39. Lopes JL, Miles AJ, Whitmore L, Wallace BA. Distinct circular dichroism spectroscopic signatures of polyproline II and unordered secondary structures: applications in secondary structure analyses. Protein science : a publication of the Protein Society. 2014;23(12):1765-72. Epub 2014/09/30. doi: 10.1002/pro.2558. PubMed PMID: 25262612; PubMed Central PMCID: PMCPmc4253816. 40. Drake AF, Siligardi G, Gibbons WA. Reassessment of the electronic circular dichroism criteria for random coil conformations of poly(L-lysine) and the implications for protein folding and denaturation studies. Biophysical chemistry. 1988;31(1-2):143-6. Epub 1988/08/01. PubMed PMID: 3233285. 41. Schaub LJ, Campbell JC, Whitten ST. Thermal unfolding of the N-terminal region of p53 monitored by circular dichroism spectroscopy. 
Protein science : a publication of the Protein Society. 2012;21(11):1682-8. Epub 2012/08/24. doi: 10.1002/pro.2146. PubMed PMID: 22915551; PubMed Central PMCID: PMCPmc3527704. 42. Greenfield NJ. Analysis of circular dichroism data. Methods in enzymology. 2004;383:282-317. Epub 2004/04/06. doi: 10.1016/s0076-6879(04)83012-x. PubMed PMID: 15063655. 43. Pace CN, Trevino S, Prabhakaran E, Scholtz JM. Protein structure, stability and solubility in water and other solvents. Philosophical transactions of the Royal Society of London Series B, Biological sciences. 2004;359(1448):1225-34; discussion 34-5. Epub 2004/08/13. doi: 10.1098/rstb.2004.1500. PubMed PMID: 15306378; PubMed Central PMCID: PMCPMC1693406. 44. Whittington SJ, Chellgren BW, Hermann VM, Creamer TP. Urea promotes polyproline II helix formation: implications for protein denatured states. Biochemistry. 2005;44(16):6269-75. Epub 2005/04/20. doi: 10.1021/bi050124u. PubMed PMID: 15835915. 45. Chemes LB, Alonso LG, Noval MG, de Prat-Gay G. Circular dichroism techniques for the analysis of intrinsically disordered proteins and domains. Methods in molecular biology. 2012;895:387-404. doi: 10.1007/978-1-61779-927-3_22. PubMed PMID: 22760329.


46. Tiffany ML, Krimm S. Extended Conformations of Polypeptides and Proteins in Urea and Guanidine Hydrochloride. Biopolymers. 1973;12(12):575-87. 47. Auton M, Rosgen J, Sinev M, Holthauzen LM, Bolen DW. Osmolyte effects on protein stability and solubility: a balancing act between backbone and side-chains. Biophysical chemistry. 2011;159(1):90-9. Epub 2011/06/21. doi: 10.1016/j.bpc.2011.05.012. PubMed PMID: 21683504; PubMed Central PMCID: PMCPmc3166983. 48. Baskakov I, Bolen DW. Forcing thermodynamically unfolded proteins to fold. The Journal of biological chemistry. 1998;273(9):4831-4. Epub 1998/03/28. PubMed PMID: 9478922. 49. Bolen DW, Baskakov IV. The osmophobic effect: natural selection of a thermodynamic force in protein folding. Journal of molecular biology. 2001;310(5):955- 63. Epub 2001/08/15. doi: 10.1006/jmbi.2001.4819. PubMed PMID: 11502004. 50. Baskakov IV, Kumar R, Srinivasan G, Ji YS, Bolen DW, Thompson EB. Trimethylamine N-oxide-induced cooperative folding of an intrinsically unfolded transcription-activating fragment of human glucocorticoid receptor. The Journal of biological chemistry. 1999;274(16):10693-6. Epub 1999/04/10. PubMed PMID: 10196139. 51. Buck M, Schwalbe H, Dobson CM. Characterization of conformational preferences in a partly folded protein by heteronuclear NMR spectroscopy: assignment and secondary structure analysis of hen egg-white lysozyme in trifluoroethanol. Biochemistry. 1995;34(40):13219-32. Epub 1995/10/10. PubMed PMID: 7548086. 52. Fan P, Bracken C, Baum J. Structural characterization of monellin in the alcohol- denatured state by NMR: evidence for beta-sheet to alpha-helix conversion. Biochemistry. 1993;32(6):1573-82. Epub 1993/02/16. PubMed PMID: 8381663. 53. Sonnichsen FD, Van Eyk JE, Hodges RS, Sykes BD. Effect of trifluoroethanol on protein secondary structure: an NMR and CD study using a synthetic actin peptide. Biochemistry. 1992;31(37):8790-8. Epub 1992/09/22. PubMed PMID: 1390666. 54. 
Hill CM, Bates IR, White GF, Hallett FR, Harauz G. Effects of the osmolyte trimethylamine-N-oxide on conformation, self-association, and two-dimensional crystallization of myelin basic protein. Journal of structural biology. 2002;139(1):13-26. Epub 2002/10/10. PubMed PMID: 12372316. 55. Rozycka M, Wojtas M, Jakob M, Stigloher C, Grzeszkowiak M, Mazur M, et al. Intrinsically disordered and pliable Starmaker-like protein from medaka (Oryzias latipes) controls the formation of calcium carbonate crystals. PloS one. 2014;9(12):e114308. Epub 2014/12/10. doi: 10.1371/journal.pone.0114308. PubMed PMID: 25490041; PubMed Central PMCID: PMCPmc4260845. 56. Creighton TE. Proteins: structures and molecular properties. 2nd ed. New York: W.H. Freeman; 1993. 57. Kikhney AG, Svergun DI. A practical guide to small angle X-ray scattering (SAXS) of flexible and intrinsically disordered proteins. FEBS Letters. 2015;589(19, Part A):2570-7. doi: http://dx.doi.org/10.1016/j.febslet.2015.08.027. 58. Laue TM, Shah BD, Ridgeway TM, Pelletier SL. Computer-aided interpretation of sedimentation data for proteins. In: Harding SE, Rowe AJ, Horton JC, editors. Analytical Ultracentrifugation in Biochemistry and Polymer Science: Royal Society of Chemistry, London; 1992. p. 90-125.


Chapter VII. Future Directions

The importance of Aap higher-order assembly and amyloidogenesis in Staphylococcus epidermidis biofilm formation and virulence

We have performed extensive biophysical analyses of the Zn2+-dependent assembly of the B-repeats from Aap. To briefly summarize, Conrady, et al. [1] showed that Brpt1.5 can dimerize in the presence of Zn2+. The biological implications of this event were demonstrated by the ability of DTPA, a Zn2+-chelator, to affect biofilm formation. When DTPA was present at the start of the experiment, biofilm formation was inhibited; however, addition of DTPA to a preformed biofilm was ineffective at disrupting the biofilm. The study proposed a mechanism of bacterial accumulation dependent on intercellular B-repeat dimerization in the presence of Zn2+ [1]. Following this initial study, Conrady, et al. [2] solved X-ray crystal structures of the Brpt1.5 dimer. They found that the structure of the Brpt1.5 protomer is highly extended and that the Zn2+-binding site showed pleomorphism, where different crystal forms showed different combinations of Zn2+-coordinating residues. Mutational studies of Brpt1.5 demonstrated that an E203A mutant was unable to form significant amounts of the dimer species, whereas mutation of other Zn2+-coordinating residues reduced, but did not abolish, dimerization [2]. Aside from the assembly of Brpt1.5, the secondary structure was consistent between the monomer and dimer [1]. While there is high sequence similarity across all B-repeats of Aap, there is a stretch of several residues that adopts one of two sequences, which we refer to as "cassettes." B-repeats containing the more prevalent consensus cassettes are more assembly competent, whereas B-repeats containing the variant cassettes are more thermodynamically stable, but lack some assembly capability [3].


In this dissertation work, we have demonstrated the presence of a monomer-dimer-tetramer equilibrium in the more biologically relevant Brpt5.5 construct. Beyond this initial reversible assembly, Brpt5.5 can further assemble into functional amyloid fibrils, which we propose are responsible for providing additional strength and DTPA resistance to the biofilm. While our biophysical approaches using recombinant protein point towards this hypothesis, we are limited in our ability to test it in genetically modified S. epidermidis strains, which would constitute a standard biological or functional assay. We have recently begun collaborating with Alex Horswill's lab, which is one of only a few labs well-versed in the genetic manipulation of Staphylococcus epidermidis [4]. The Horswill lab has published work using Δaap, Δica, and ΔaapΔica strains. The Δica strains do not synthesize the extracellular polysaccharide, PNAG; therefore, working with strains in this background allows one to focus directly on effects in protein-dependent biofilms.
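To make the idea of a monomer-dimer-tetramer equilibrium concrete, the following is a minimal mass-action sketch, not the analysis used in this work. It assumes a simple scheme in which two monomers form a dimer and two dimers form a tetramer, described by apparent association constants at a fixed Zn2+ concentration. The function name, the constants k12 and k24, and any numerical values are hypothetical and are not fitted parameters from our data.

```python
def species_fractions(c_total, k12, k24):
    """Mass fractions of monomer, dimer, and tetramer at equilibrium.

    Hypothetical model (an assumption, not the fitted model from this work):
        2 M <-> D   with [D] = k12 * [M]**2
        2 D <-> T   with [T] = k24 * [D]**2
    c_total is the total protein concentration in monomer units (M).
    """
    if c_total <= 0:
        raise ValueError("c_total must be positive")

    def total_in_monomer_units(m):
        dimer = k12 * m ** 2
        tetramer = k24 * dimer ** 2
        return m + 2.0 * dimer + 4.0 * tetramer

    # The total is monotonically increasing in the free monomer
    # concentration, so a simple bisection on [0, c_total] is sufficient.
    lo, hi = 0.0, c_total
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total_in_monomer_units(mid) > c_total:
            hi = mid
        else:
            lo = mid

    m = 0.5 * (lo + hi)
    dimer = k12 * m ** 2
    tetramer = k24 * dimer ** 2
    return {
        "monomer": m / c_total,
        "dimer": 2.0 * dimer / c_total,
        "tetramer": 4.0 * tetramer / c_total,
    }
```

In this sketch, setting k24 to zero reduces the model to a monomer-dimer equilibrium, which is the behavior expected for a construct limited to dimerization. The Zn2+ dependence itself is not modeled; it would enter through the apparent constants.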

We have made plans with Alex Horswill to produce S. epidermidis strains expressing Aap with mutated B-repeats. This will allow us to more accurately detail the biological importance of the assembly states of the B-repeats. Specifically, Conrady, et al. [2] showed a Brpt1.5 E203A mutant was unable to dimerize, even at 10 mM ZnCl2. By making the corresponding E203A mutation in each B-repeat of Aap, we could directly probe the effect of the dimerization assembly step. We have already verified that E203A mutations prevent dimerization in the recombinant B-repeat superdomain (Brpt6.5) from Aap (S. epidermidis strain 1457). Presumably, the E203A mutations should have the same effect as chelating Zn2+ with DTPA: robust biofilm formation should be prevented by preventing intercellular accumulation. However, Aap should still be able to act in the attachment phase of biofilm formation through the N-terminal lectin region, forming a thin monolayer of cells attached to the surface. In our standard, static biofilm formation assay, we would expect to see very few, if any, bacteria remaining in the wells after the wash steps. However, if confocal laser-scanning microscopy (CLSM) were used, there is much less perturbation of the biofilm growth, so a thin monolayer of cells attached to the substrate might be expected, similar to what biofilms look like when Aap is not processed to expose the B-repeat superdomain [5].

By expressing Aap with H85A mutations in all B-repeats, we could allow the initial dimerization assembly in the presence of Zn2+, but prevent higher-order assembly and Zn2+-dependent amyloidogenesis. In this case, we would expect to observe attachment and intercellular accumulation. However, based on our hypothesis that amyloid fibrils offer strength and stability to the biofilm structure, this biofilm should be more easily dispersed or broken apart by perturbants like shear stress, washing with a pipette, and DTPA. Dynamic flow cell chambers have been used to more closely model the environment experienced by the biofilm during infection [6]. Interestingly, while a Δaap knockout showed no decrease in biofilm formation compared to the aap-positive strain according to a standard (static) biofilm formation assay, the same mutant in a flow cell chamber showed significantly less biofilm formation than the aap-positive strain [6].

Biofilms grown in vitro can be examined by transmission electron microscopy (TEM) for the presence of the amyloid-like fibrils we observed with S. epidermidis RP62A (native), as well as tested by CLSM for the presence of thioflavin T (ThT) fluorescent species indicative of amyloid fibrils. Beyond in vitro analyses, a rat catheter model has been utilized [6]. Here, Aap was demonstrated to be essential in S. epidermidis virulence [6]. Once the Aap B-repeat mutant strains are made, the flow cell chamber and rat catheter model would be excellent ways to evaluate the importance of the assembly states and the role of the functional amyloid. Recall the incredible tensile strength of amyloid fibrils [7]. We expect to be able to better differentiate between the H85A and native strains in such settings, where fluid shear stress is present. Addition of proteases could also be useful in differentiating increased biofilm strength/resilience when amyloid fibrils are present, due to their decreased proteolytic susceptibility [8].

The Horswill group, and most others, use the 1457 strain of S. epidermidis for genetic manipulation [4, 6, 9]. The number of B-repeats in Aap varies between strains. Strain RP62A contains 12.5 B-repeats, while 1457 contains just 6.5 B-repeats. Moreover, there are differences in the pattern of consensus and variant B-repeats [3]. Aap from strain 1457 has 6 consensus repeats and a variant half-repeat at the C-terminus. Our Brpt5.5 construct contains the 5.5 most C-terminal repeats, which are all consensus except for one variant B-repeat at the N-terminus. Based on these differences, we produced Brpt6.5 constructs corresponding to Aap from strain 1457 and have begun to verify the assembly (reversible equilibrium and amyloidogenesis) of the native and mutant constructs. Preliminary data for Brpt6.5 are listed in Table VII-1, along with Brpt1.5 and Brpt5.5 assembly data.


Table VII-1. S. epidermidis Aap B-repeat mutations and the predicted effects on biofilm formation

| Mutation(s) | Brpt1.5 (RP62A) Assembly States Observed [1, 2] | Brpt5.5 (RP62A) Assembly States Observed | Brpt6.5 (1457) Assembly States Observed | Biofilm Characteristics | Anticipated Virulence |
| --- | --- | --- | --- | --- | --- |
| E203A | 1 | No Data | 1 | Attached monolayer, no accumulation | - |
| H85A | 1 - 2 | 1 - 2 | 1 - 2 | Accumulation capable, but weak biofilm resistance, DTPA sensitive | + |
| Native | 1 - 2 | 1-2-4-amyloid | 1 - >2* | Robust mature biofilms, strong resistance | +++ |

Brpt1.5 (RP62A) and Brpt5.5 (RP62A) refer to constructs containing the C-terminal 1.5 and 5.5 B-repeats from Aap from strain RP62A.

Brpt6.5 (1457) refers to a construct containing all 6.5 B-repeats from Aap from strain 1457.

Brpt1.5 data is from [1, 2]. Brpt1.5 H85A data is from a Brpt1.5 H75A/H85A double mutant, which showed slightly reduced dimerization compared to wild-type [1].

No data has been collected on a Brpt5.5 (RP62A) E203A construct.

*Brpt6.5 (Native) forms very large species in the presence of Zn2+. Based on molecular weight estimates from sedimentation velocity AUC experiments, the largest soluble species may have a stoichiometry between 26 and 32. Sedimentation equilibrium experiments have not yet been performed to characterize intermediate species. Limited data has been collected on the amyloid-forming capability of Brpt6.5.


Biological significance of the role of Sbp in Aap amyloidogenesis in Staphylococcus epidermidis biofilms

Chapter IV of this dissertation investigated the interaction between Aap B-repeats and Sbp. Decker, et al. [9] first identified Sbp by running crude biofilm matrix over sepharose beads coupled to recombinantly expressed "B-domain" from S. epidermidis 1457. This B-domain includes all 6.5 B-repeats from 1457 [5]. Sbp was eluted from the beads at low pH and identified by mass spectrometry [9]. Binding in vitro was confirmed by monitoring recombinant Sbp binding to polystyrene-immobilized B-domain, which could be inhibited by allowing Sbp to interact with B-domain prior to the experiment. Similar experiments in the absence of ZnCl2 showed binding, and the initial isolation of Sbp from the crude biofilm mixture was done in the absence of added Zn2+ (trace amounts of Zn2+ are present in TSB media used for the initial biofilm growth). Binding in vitro was increased about 3-fold in the presence of 10-30 µM ZnCl2, but not in the presence of another divalent metal cation, Mg2+. Further evidence for the interaction came in the form of biofilm formation assays in a biofilm-negative Δica Δaap background. Biofilm formation was recovered after complementation with B-domain when Sbp was either expressed endogenously or added exogenously. CLSM of B-domain-complemented biofilms with exogenously added Sbp showed co-localization of B-domain and Sbp [9].

Two separate investigations of Sbp:B-repeat interactions have been published since this initial article. The first is a Ph.D. dissertation [10] from Madiha Fayyaz, a student associated with Holger Rohde, a key figure in the initial Sbp study [9]. The dissertation work used size exclusion chromatography (SEC), native mass spectrometry (MS), and microscale thermophoresis (MST) to probe for interactions between Sbp and several B-repeat constructs. The B-repeat constructs used were Brpt1.0, Brpt1.5, and Brpt2.5 constructs based on Aap from S. epidermidis RP62A. Importantly, the sequences used started at the N-terminus of the B-repeat superdomain [10], as opposed to our convention of starting at the C-terminal half-repeat (nearest the bacterial cell wall). Whereas the C-terminal-most 4.5 B-repeats all contain consensus cassettes (assembly competent), the N-terminal-most B-repeat is consensus (but especially "divergent" in sequence compared to all other B-repeats) followed by two variant cassettes (assembly impaired) [3]. This means the B-repeat constructs used by Fayyaz are likely unable to completely assemble. In fact, by SEC and native MS, no B-repeat assembly was observed. Another potential problem is that the Zn2+ concentrations used were only 0.1 mM Zn(CH3CO2)2 or 1 mM ZnCl2. Based on our numerous biophysical studies, such Zn2+ concentrations are insufficient to observe significant dimerization of these constructs (even if they were the assembly-competent B-repeats). Regardless, there were no interactions observed between Sbp and these B-repeat constructs in the absence or presence of Zn2+ [10]. As a side note, the Sbp construct used by Fayyaz [10] and Decker, et al. [9] contained an extra 16 residues at the N-terminus, which we excluded because they were identified as a signal peptide by SignalP (http://www.cbs.dtu.dk/services/SignalP/).

The other investigation of Sbp was published in 2018 by Wang, et al. [11]. In that study, SEC and isothermal titration calorimetry (ITC) were both used to test for Sbp:B-repeat interactions. In both cases, with or without 1 mM added ZnCl2, there was no indication of any interaction. The B-repeat construct in this case was a Brpt1.5 construct, which seems to be the most C-terminal one and a half B-repeats, although it is not clearly described which strain of S. epidermidis this construct is based on. Furthermore, it was not apparent from the presented data whether or not Brpt1.5 dimerization was even observed. Nonetheless, this study also did not find any interactions between Sbp and Brpt1.5 [11].

To summarize the published studies investigating Sbp:B-repeat interactions, the only positive data presented were the ability of Sbp to bind B-domain (Brpt6.5 from S. epidermidis 1457) in vitro and the biofilm co-localization data [9]. Therefore, one could assume that Sbp requires more than 2.5 B-repeats for binding, requires specific B-repeat cassettes (consensus vs variant), or binds to assembled B-repeats specifically, whether a B-repeat dimer, tetramer, or higher-order species leading up to amyloid fibrils. In Chapter IV, we demonstrated that Sbp did not bind to artificial Brpt1.5 constructs composed of two consensus or two variant B-repeats (Brpt1.511,13 and Brpt1.58,13* originally described in [3]) in the presence or absence of Zn2+, and we also did not observe binding to the native Brpt1.5 construct (essentially consensus B-repeat and consensus half repeat [1, 2]) in either condition. This seems to eliminate the possibility of Sbp binding to specific B-repeats based on cassette type, and it rules out a Brpt1.5 dimer as the specific target. Using our Brpt5.5 construct (much more similar to the "B-domain" used by Decker, et al. [9]), we showed that Sbp does not interact with Brpt5.5 in the absence of Zn2+, nor in low or high salt conditions (without Zn2+) where Sbp is partially or fully folded, respectively. Instead, we did see major aggregation when Sbp and Brpt5.5 were mixed in the presence of Zn2+. While the rapid aggregation of the mixture prohibited us from observing a traditional Sbp:Brpt5.5 complex, we used a simple turbidity assay to show that Sbp significantly reduced the amount of Zn2+ required to aggregate Brpt5.5, a feature we showed is indicative of amyloidogenesis capability. Also, when Sbp was mixed with Brpt5.5 H85A (construct limited to monomer-dimer assembly) in the presence of Zn2+, Sbp had no effect on assembly and no complex was observed. Therefore, we narrowed down the interaction requirements to the tetramer or higher-order species and amyloid fibrils. The Brpt5.5 tetramer shows little change in secondary structure, suggesting that if Sbp does bind the tetramer, it is probably recognizing a specific binding site spanning more than one protomer (or else Sbp could bind the monomer or dimer as well). It seems more reasonable that Sbp might be recognizing the nucleating species observed when Zn2+ is present at elevated temperatures, because there is significant change in the secondary structure by CD, which could create novel binding sites for Sbp to recognize.

While our data provide great improvements to our understanding of Sbp:B-repeat interactions, there are still important experiments to perform, especially more biologically relevant ones. Again, in collaboration with Alex Horswill, we plan to explicitly investigate the role of Sbp in Aap amyloidogenesis in S. epidermidis mutants. Initially, we will use S. epidermidis 1457 Δaap, 1457 Δsbp, and 1457 Δaap Δsbp strains in the 1457 Δica background to focus on effects on protein-dependent biofilms, without interference from PNAG production. Decker, et al. [9] have already demonstrated that Sbp is an important co-factor for Aap-dependent biofilm formation. However, we shall expand on this hypothesis by using TEM to investigate the presence or absence of amyloid fibrils and, importantly, use TEM immunogold labelling to specifically pinpoint the location of B-repeats and Sbp in the biofilm and amyloid fibrils. This approach has been taken previously to demonstrate the role of TapA as an accessory protein in TasA amyloidogenesis (Bacillus subtilis) [12]. In our case, biofilm samples could be immunogold labelled with either anti-B-repeat antibody or anti-Sbp antibody, or, to more elegantly show that both proteins are present in the same fibrils, the same sample could be labelled with both antibodies as long as different size gold particles and different secondary antibodies can be used. Alternatively, fluorescently tagged recombinant B-repeats and/or Sbp could be added exogenously and CLSM used to visualize each protein in biofilms and amyloid fibrils.

Another interesting question which could be answered by the Δaap and Δsbp strains is whether or not a phenomenon known as interbacterial complementation can occur. In the well-characterized curli amyloid system found in Escherichia coli, CsgB nucleates amyloid formation of the major curli subunit, CsgA. It has been shown that CsgA that is secreted from a ΔcsgB "donor cell" can interact with CsgB on the surface of a ΔcsgA "acceptor cell" in a process known as interbacterial complementation [13]. Aap is covalently linked to the peptidoglycan layer of S. epidermidis, and it also appears to be present in the extracellular space in smaller amounts, although it is unclear if there are any inherent differences between cell wall-anchored Aap and "extracellular" Aap. We know from our previous study that Aap amyloid fibrils are present both in the extracellular space and along the cell wall. Sbp is mostly localized to the cell wall, although it lacks any cell wall-anchoring motif. We are therefore interested to see if we can observe interbacterial complementation yielding biofilm formation and/or amyloidogenesis between Δaap and Δsbp strains. This type of experiment could provide biological insight into the ability of Aap amyloidogenesis to occur in interspecies biofilms, such as between S. epidermidis expressing Aap (but not Sbp) and S. aureus expressing Sbp (some available genomes show the presence of Sbp).


Investigating the mechanism of B-repeat amyloidogenesis

There is much more to be learned about how Aap B-repeat amyloidogenesis occurs. Specifically, from a structural standpoint, it is unclear what molecular changes are occurring between the reversible Zn2+-induced assembly and the nucleating and fibril species. Observations from this dissertation work (Chapter II and Chapter III) show there is no major restructuring across the monomer-dimer-tetramer assemblies, with the exception of some changes in the random coil and β-sheet proportions. However, upon formation of our hypothesized nucleating species at elevated temperatures and in the presence of Zn2+, there is a very clear change in the CD spectrum. The resulting CD spectrum, which is quite transient due to the aggregation that follows, indicates a loss of random coil and gain of β-sheet and β-turn content.

Nelson and Eisenberg have proposed three classes of structural mechanisms of fibril formation [14]. The first class, refolding, involves partial or complete unfolding of the native state into a fold reminiscent of the fibrillar state. The second class, gain-of-interaction, involves a change in only part of the natively folded protein to a fibril-like conformation. This region would then mediate fibril stacking, while the remainder of the protein (which is still natively folded) may be excluded from the fibril. The third class, natively disordered, simply involves the folding of an intrinsically disordered region or protein into a fibril state.

Considering these three mechanistic classes for B-repeat amyloidogenesis, the CD data suggest there is a change in secondary structure upon fibrillation. We have also tested an anti-amyloid antibody (OC antibody) which recognizes a broad range of amyloid-forming proteins, specifically in the fibril state and not prefibrillar oligomers [15].


We have performed dot blots on Brpt5.5 samples using OC antibody (Figure VII-1) and observed OC antibody binding to natively folded Brpt5.5 (top left) as well as Brpt5.5 incubated under amyloid-forming conditions (Zn2+ + heat + time; bottom middle, bottom right). Brpt5.5 was incubated with 3 mM ZnCl2 (middle column) or 5 mM ZnCl2 (right column) at room temperature (top row) or at the temperature required for formation of our hypothesized nucleating species (50°C - bottom middle, 40°C - bottom right). Unfolded Brpt5.5 (60°C - bottom left) was not recognized by the antibody. This experiment points toward the gain-of-interaction mechanism, because the epitope recognized by OC antibody is present in the native state as well as in the fibril state. Yet, it is also possible that the many regions of disorder in B-repeats could induce fibril formation through a gain of structure, as in the natively disordered class.

Figure VII-1. Dot blot assay performed on Brpt5.5 samples using the amyloid-detecting antibody OC (Millipore Sigma). Top left, Brpt5.5 under native conditions. Top middle, Brpt5.5 + 3 mM ZnCl2 incubated at room temperature. Top right, Brpt5.5 + 5 mM ZnCl2 incubated at room temperature. Bottom left, Brpt5.5 without Zn2+, incubated at 60°C, where Brpt5.5 is unfolded. Bottom middle and bottom right, Brpt5.5 + 3 mM ZnCl2 or 5 mM ZnCl2 incubated at temperatures able to cause formation of aggregate (50°C and 40°C, respectively).


Based on computational predictions of amyloid "hot spots," we can potentially narrow down the mechanism of amyloidogenesis and gain structural information about the mature fibril state. The sequence of Brpt5.5 was submitted to the AmylPred2 server (http://aias.biol.uoa.gr/AMYLPRED2/), which utilizes up to 11 separate methods shown to be useful in predicting amyloidogenic propensities [16]. Additional predictions were made using FoldAmyloid (http://bioinfo.protres.ru/fold-amyloid/) and Aggrescan3D (http://biocomp.chem.uw.edu.pl/A3D/), the latter of which can utilize structural models. In our case, we submitted the Brpt1.5 crystal structure (PDB ID: 4FUN). Figure VII-2 shows this crystal structure with predicted amyloidogenic regions in green (weak predictions from AmylPred2), bronze (Aggrescan3D predictions), and red (strong predictions from AmylPred2 and overlapping predictions from any methods). Interestingly, the most strongly predicted amyloidogenic region (IVHY) is only present in the consensus cassettes, whereas the variant cassettes contain ITEY (predicted non-amyloidogenic). The histidine in IVHY is also a Zn2+-coordinating residue, providing an opportunity for Zn2+ to modulate amyloidogenesis. Conversely, another weakly predicted amyloid region is found in the consensus cassette but is "swapped out" in the variant cassette. Because we do not observe amyloidogenesis in Brpt1.5 constructs, we would need to produce "all-consensus" or "all-variant" Brpt5.5 constructs to explicitly test the effect of cassette type on amyloidogenesis. This would require gene synthesis services, which are relatively expensive. Instead, we could first begin with testing short polypeptides that are predicted to be amyloidogenic. This would help narrow down our search for the regions which are most responsible for the amyloidogenicity of Brpt5.5 and Aap without the large monetary investment.

Figure VII-2. Predictions of amyloidogenic regions were performed using several methods described in the text. Results are summarized here using a Brpt1.5 dimer model (PDB: 4FUN). Blue regions have no predicted amyloid propensity. Green regions have weak propensities predicted from AmylPred2. Bronze regions are predicted hot spots from Aggrescan3D. Red regions are predicted to have strong amyloid propensity based on AmylPred2 or any regions predicted by multiple methods to have some propensity for amyloid.
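The coloring scheme of Figure VII-2 can be written out as a small per-residue rule. The sketch below is only an illustration of that rule, not the procedure used to produce the figure. The function name and the input sets of residue numbers are hypothetical, and leaving a residue flagged only by FoldAmyloid as blue is an assumption, since the caption does not assign a color to that case.

```python
def color_residues(n_residues, amylpred2_weak, amylpred2_strong,
                   aggrescan3d, foldamyloid):
    """Assign a Figure VII-2 style color to residues 1..n_residues.

    Each prediction argument is a set of residue numbers flagged by that
    method (hypothetical inputs; real server output would need parsing).
    """
    colors = {}
    for i in range(1, n_residues + 1):
        # Count how many independent methods flag this residue.
        n_methods = sum([
            i in amylpred2_weak or i in amylpred2_strong,
            i in aggrescan3d,
            i in foldamyloid,
        ])
        if i in amylpred2_strong or n_methods >= 2:
            colors[i] = "red"      # strong AmylPred2 or overlapping methods
        elif i in aggrescan3d:
            colors[i] = "bronze"   # Aggrescan3D hot spot only
        elif i in amylpred2_weak:
            colors[i] = "green"    # weak AmylPred2 prediction only
        else:
            colors[i] = "blue"     # no prediction (assumption: also
                                   # FoldAmyloid-only residues)
    return colors
```

The resulting dictionary could then be used to color a structural model in a molecular viewer of choice.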

A different perspective could also be taken to identify regions important in amyloidogenesis or in the mature fibril. Perhaps the simplest approach is to form the amyloid fibrils in vitro, then expose the fibrils to proteolysis. The mature fibril "core" should show high resistance to protease degradation due to decreased accessibility, while any regions which are excluded from the core will be degraded. The fibrils can then be depolymerized with formic acid and analyzed by SDS-PAGE followed by mass spectrometry of any observed bands. This approach has been successfully utilized to identify the core structure of α-synuclein fibrils [8]. A more detailed solution is to perform hydrogen-deuterium exchange (HDX) experiments [17, 18]. The basic characteristic being probed by HDX experiments is solvent accessibility. Amide hydrogens that are exposed to solvent will rapidly exchange with deuterium in a deuterated solvent. However, amide hydrogens which are involved in hydrogen bonds or protected from the solvent will exchange more slowly (or not at all) with deuterium in the solvent. Therefore, when fibrils are exposed to deuterated solvent, depolymerized, and then analyzed by mass spectrometry (MS) or NMR, regions of the monomer sequence which were incorporated into the core structure of the fibril will still have hydrogen atoms, whereas more exposed amides will contain deuterium (1 Da heavier than hydrogen). This approach has also proven useful in analyzing intermediate, pre-amyloid species [17]. In the case of Brpt5.5, there may be additional complexity due to the repetitive nature of the protein. This could lead to a sequence being exposed on the surface in one protomer, while the same sequence is buried in the amyloid core in another protomer.
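The arithmetic behind an HDX-MS readout can be sketched as follows. This is a simplified illustration that assumes only backbone amide hydrogens are counted, that proline contributes no amide hydrogen, and that back-exchange and side-chain exchange are ignored. The function names, the peptide, and any masses passed in are hypothetical.

```python
# Mass difference between deuterium and hydrogen, in Da.
DEUTERIUM_SHIFT_DA = 1.00628


def backbone_amides(peptide):
    """Count exchangeable backbone amide hydrogens in a peptide.

    The N-terminal residue has a free amine rather than an amide, and
    proline has no backbone N-H, so neither is counted.
    """
    return sum(1 for residue in peptide[1:] if residue != "P")


def protected_fraction(peptide, mass_undeuterated, mass_after_exchange):
    """Fraction of backbone amides that did NOT take up deuterium.

    Simplifying assumptions: no back-exchange, no side-chain exchange.
    """
    n_amides = backbone_amides(peptide)
    if n_amides == 0:
        raise ValueError("peptide has no exchangeable backbone amides")
    n_deuterated = (mass_after_exchange - mass_undeuterated) / DEUTERIUM_SHIFT_DA
    return 1.0 - n_deuterated / n_amides
```

Under these assumptions, a peptide derived from the fibril core would be expected to give a protected fraction near 1, while a peptide from a solvent-exposed region would give a value near 0.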


Biophysical and structural insights from endogenously expressed full-length Aap

Expressing full-length Aap as a recombinant protein would be extremely useful for many reasons. We have expressed and characterized the more "biologically relevant" Brpt5.5 construct, which exhibits a monomer-dimer-tetramer reversible assembly in the presence of Zn2+, followed by amyloid fibril formation. Because at least five B-repeats are required for biofilm formation in the S. aureus ortholog, SasG, it is not unreasonable to think we have captured the full spectrum of B-repeat assembly. However, given that Aap can contain up to 17 B-repeats, there must be some advantage to having more than five B-repeats. Based on sedimentation equilibrium and velocity AUC experiments of Brpt1.5, Brpt2.5, and Brpt5.5, we know that the amount of Zn2+ required to induce dimerization decreases as the number of B-repeats increases. We believe this phenomenon is due to the chelate effect [19, 20]. Based on biofilm assays in the presence of Zn2+ and a Zn2+ chelator, it appears that full-length Aap, with 17 B-repeats and at a high density on the cell surface, requires micromolar Zn2+ concentrations [1]. It would be useful to understand whether or not we can observe assembly at lower Zn2+ concentrations in vitro using a construct containing all 17 B-repeats. Given that we have extreme difficulty purifying away truncated versions of the protein even with Brpt5.5, a Brpt17.5 construct would be extremely difficult to express and purify. Adding a C-terminal Strep-tag to improve purification resulted in significantly altered Zn2+ binding and assembly, due to undesired binding of Zn2+ to the Strep-tag (data not shown). Other tags may affect amyloidogenesis as well.


While expressing and purifying full-length Aap in E. coli using standard procedures does not seem feasible, S. epidermidis clearly is capable of the task. Therefore, we have made efforts to isolate endogenous Aap from S. epidermidis. Sun, et al. have published SDS-PAGE data and Aap purification data which allowed them to produce antibodies against Aap [21]. We have used a similar protocol to isolate and purify Aap from the cell wall and from the extracellular space at quantities visible by SDS-PAGE with Coomassie staining. In fact, we have been able to perform sedimentation velocity AUC experiments with purified Aap, which appeared >99% pure (Figure VII-3). We have also used MS to confirm the identity of the protein, and have begun to probe whether this Aap is a processed form (cleaved by SepA after the A-repeats or lectin) or the full native protein. Preliminary AUC data did not show Zn2+-dependent assembly of Aap, suggesting the A-repeats and/or lectin have not been cleaved. This result was consistent with mass spectrometry results in which peptide fragments from the lectin region were observed. To process the isolated Aap, it could be treated with SepA [5] or other proteases [22] before analyzing assembly and amyloidogenesis.

Small angle X-ray scattering (SAXS) could prove a very valuable tool for analyzing full-length Aap. We could gain information on the dynamics of the protein, and potentially determine whether or not the A-repeats fold back onto the lectin and/or N-terminal B-repeats to inhibit Zn2+-dependent assembly. Additional SAXS data on smaller, recombinantly expressed constructs including the A-repeats, or the A-repeats, lectin and first one and a half B-repeats, could be very valuable in properly fitting SAXS data for full-length Aap.


Figure VII-3. Analysis of Aap by AUC. Aap was isolated from S. epidermidis biofilms and purified. Sedimentation velocity AUC was performed in the absence and presence of Zn2+, shown here. Integration of the peak at ~3 S (Aap) indicates >99% purity. No assembly was observed in the presence of 500 µM ZnCl2, indicating the N-terminal region of Aap may not yet have been cleaved away.


Defining spatial and temporal parameters of Staphylococcus epidermidis biofilm formation

Biofilm formation is a highly complex process and is heavily regulated. Understanding the kinetics of biofilm formation could be extremely useful in characterizing the interactions and regulatory elements that participate in biofilm formation. Our current hypothesis regarding the role of Sbp in the accumulation phase of biofilm formation is that Sbp nucleates Aap amyloidogenesis. Essentially, Sbp may be the signal to "turn on" amyloid formation. Decker, et al. [9] showed Sbp was strongly expressed within 4 hours of the growth phase, but did not probe earlier time points. If our hypothesis is correct, and Sbp nucleates amyloidogenesis, this expression pattern would be consistent with our data showing the presence of amyloid fibrils (and DTPA resistance) within 2 hours of growth on a surface. The study also showed that Sbp expression was strongly delayed, by 4-8 hours, in a ΔsarA mutant. This indicated that the global gene regulator, sarA, is an important positive regulator of Sbp [9]. Examining biofilm formation of the ΔsarA mutant, specifically looking at ThT fluorescence and DTPA resistance, could provide evidence that Sbp is indeed nucleating amyloidogenesis. We would expect to see delayed DTPA resistance and ThT fluorescence due to the delayed production of Sbp.

Before amyloidogenesis can take place, Aap must be cleaved by SepA [5]. SepA is negatively regulated by SarA, with the ΔsarA mutant showing higher Aap-dependent biofilm formation [5]. Aap amyloidogenesis, therefore, could also be regulated by SepA. The time-dependence of SepA expression has not been explicitly investigated. Presumably, SepA must be expressed after attachment (via the Aap N-terminal lectin region) but before intercellular accumulation (via the B-repeat region).

Extracellular DNA (eDNA) is a component of biofilms which can be involved in essentially every step of biofilm formation. In S. epidermidis biofilms, lysis of bacteria via AtlE-dependent autolysis causes the buildup of eDNA [23]. Due to its negative charge, eDNA has the potential to act as an important structural component. For example, positively charged proteins, such as Sbp, could form electrostatic interactions with eDNA. In S. aureus, eDNA has been shown to induce amyloidogenesis of phenol-soluble modulins (PSMs) [24]. In addition to eDNA, host polyanions/polycations such as heparin, heparan sulfate, and dextran could affect amyloidogenesis [25]. PNAG, the exopolysaccharide present in many S. epidermidis biofilms, is also worth investigating for its effect on amyloidogenesis, although given that PNAG (or the ica operon) is not required for virulence, it seems unlikely that this dependence would exist. eDNA can be quantified at distinct time points during biofilm formation [24, 26], or could be monitored by CLSM using a fluorescent dye which stains DNA but is impermeable to the cell membrane, such as ethidium homodimer.

By examining the time-dependence of each of these components, we can better understand the regulation of biofilm formation and even gain insights into the regulation of Aap amyloidogenesis in biofilms. CLSM can also offer powerful insights into the spatial distribution of these elements, including co-localization of elements indicative of interactions.


Differentiating infectious and commensal S. epidermidis colonization

One of the most elusive questions regarding S. epidermidis biofilm formation is how biofilms formed on indwelling medical devices (infectious biofilms) differ from S. epidermidis on the skin (commensal growth). While this difference has not been explicitly addressed, multiple studies provide some insight into what factors may be important in the different growth modes.

An obvious potential difference is in bacterial protein expression. Multiple studies have demonstrated that Aap is an important factor or is abundantly expressed in device-related infections [27] or in a rat catheter model [6]. Interestingly, Sbp did not play a significant role in the catheter infection model in which Aap was critical [9]. Because Sbp is a relatively recent discovery, it was not included in the original studies of isolates from device-related infections. Based on these limited data, Aap appears critical for infection, while Sbp is not. But what about commensal growth on the skin? The data applicable to this scenario are more limited [9, 28]. Sbp expression contributed significantly to keratinocyte binding [9]. Macintosh, et al. examined the role of Aap in skin adhesion quite extensively [28]. They showed that isolates from skin and nose utilized Aap for attachment of S. epidermidis to corneocytes. Aap was also sufficient to allow corneocyte binding in a surrogate host, Lactococcus lactis. Therefore, Aap seems very important in S. epidermidis skin colonization [28]. Given that Aap expression is important in both infectious and commensal roles, it seems likely that another factor is responsible for the difference. Sbp expression could be an important deciding factor, since it seems important in skin colonization, but not in virulence.


One difference that we expect is critical to the formation of infectious biofilms is the presence of Zn2+. The immune response to bacterial infection involves the release of Zn2+ by neutrophils and mast cells [29-31]. It seems possible that Aap's Zn2+-dependent accumulation is an evolved response to the host immune system. This is supported by the fact that Aap is critical for infection in a rat catheter model [6], and that S. epidermidis isolates from contaminated medical devices are usually aap+ [27]. However, there has been no direct verification of the presence of Zn2+ in biofilms from hospital-acquired infections or animal models.

To demonstrate the presence of Zn2+ in infectious biofilms, one could utilize atomic absorption spectrometry (AAS). AAS is a common technique used in analytical chemistry to quantify metals in various sample types. Samples are atomized by a flame, and the light absorbed by the atoms is used to determine the concentration of the particular element [32, 33]. Such an approach has been used to demonstrate the ability of biofilms to adsorb Zn2+ and Mn2+ [34]. Samples of biofilm would need to be taken directly from an infected medical device, with care taken not to dilute out any Zn2+ present. Appropriate controls should be used to evaluate leaching of Zn2+ or other trace metals from the medical device (depending on material composition), which may contribute to the detected Zn2+ in the biofilm sample. However, there would be no concern over keeping the biofilm alive or intact, since this would not change the atomic composition of the sample. This means shipment and storage of samples should not adversely affect the results. In addition to examining samples from infected medical devices, samples could also be taken from skin. This would provide information regarding the amount of Zn2+ present in the environment where S. epidermidis is in the "commensal" form. Samples taken from different layers of the skin can be analyzed in parallel with bacterial genotyping to ensure the analysis is relevant to the locations or layers of the skin where S. epidermidis is actually present. An accurate comparison of Zn2+ concentrations could then be made, providing evidence as to whether or not Zn2+ is a factor contributing to the different growth "modes" of S. epidermidis.
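In practice, the conversion from absorbance to concentration in AAS is typically made through a calibration curve built from standards of known concentration. The sketch below illustrates that step under simple assumptions: a linear response over the working range, and entirely made-up absorbance readings, standard concentrations, and dilution factor. The function names are introduced here for illustration only.

```python
# Sketch: converting AAS absorbance readings to Zn concentration with a
# linear calibration curve. All numbers are made-up illustrative values.


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_absorbance(absorbance, slope, intercept, dilution=1.0):
    """Invert the calibration line and correct for sample dilution."""
    return (absorbance - intercept) / slope * dilution


# Hypothetical Zn standards (micromolar) and their absorbance readings.
standards_uM = [0.0, 5.0, 10.0, 20.0]
standards_abs = [0.002, 0.052, 0.102, 0.202]
slope, intercept = fit_line(standards_uM, standards_abs)

# Hypothetical biofilm digest that was diluted 10-fold before measurement.
sample_uM = concentration_from_absorbance(0.082, slope, intercept, dilution=10.0)
print(f"estimated Zn in undiluted sample: {sample_uM:.1f} uM")
```

The same calculation applied to a blank prepared from uncolonized device material would give the leaching control mentioned above, which could then be subtracted from the biofilm value.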


References

1. Conrady DG, Brescia CC, Horii K, Weiss AA, Hassett DJ, Herr AB. A zinc-dependent adhesion module is responsible for intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2008;105(49):19456-61. doi: 10.1073/pnas.0807717105. PubMed PMID: 19047636; PubMed Central PMCID: PMC2592360.
2. Conrady DG, Wilson JJ, Herr AB. Structural basis for Zn2+-dependent intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2013;110(3):E202-11. Epub 2013/01/02. doi: 10.1073/pnas.1208134110. PubMed PMID: 23277549; PubMed Central PMCID: PMC3549106.
3. Shelton CL, Conrady DG, Herr AB. Functional consequences of B-repeat sequence variation in the staphylococcal biofilm protein Aap: deciphering the assembly code. Biochemical Journal. 2017;474(3):427-43. doi: 10.1042/bcj20160675.
4. Olson ME, Horswill AR. Bacteriophage Transduction in Staphylococcus epidermidis. Methods in molecular biology (Clifton, NJ). 2014;1106:167-72. Epub 2013/11/14. doi: 10.1007/978-1-62703-736-5_15. PubMed PMID: 24222465; PubMed Central PMCID: PMC4711990.
5. Paharik AE, Kotasinska M, Both A, Hoang TN, Buttner H, Roy P, et al. The metalloprotease SepA governs processing of accumulation-associated protein and shapes intercellular adhesive surface properties in Staphylococcus epidermidis. Molecular microbiology. 2016. Epub 2016/12/21. doi: 10.1111/mmi.13594. PubMed PMID: 27997732.
6. Schaeffer CR, Woods KM, Longo GM, Kiedrowski MR, Paharik AE, Buttner H, et al. Accumulation-associated protein enhances Staphylococcus epidermidis biofilm formation under dynamic conditions and is required for infection in a rat catheter model. Infection and immunity. 2015;83(1):214-26. Epub 2014/10/22. doi: 10.1128/iai.02177-14. PubMed PMID: 25332125; PubMed Central PMCID: PMC4288872.
7. Knowles TP, Fitzpatrick AW, Meehan S, Mott HR, Vendruscolo M, Dobson CM, et al. Role of intermolecular forces in defining material properties of protein nanofibrils. Science (New York, NY). 2007;318(5858):1900-3. Epub 2007/12/22. doi: 10.1126/science.1150057. PubMed PMID: 18096801.
8. Miake H, Mizusawa H, Iwatsubo T, Hasegawa M. Biochemical characterization of the core structure of alpha-synuclein filaments. The Journal of biological chemistry. 2002;277(21):19213-9. Epub 2002/03/15. doi: 10.1074/jbc.M110551200. PubMed PMID: 11893734.
9. Decker R, Burdelski C, Zobiak M, Buttner H, Franke G, Christner M, et al. An 18 kDa scaffold protein is critical for Staphylococcus epidermidis biofilm formation. PLoS pathogens. 2015;11(3):e1004735. Epub 2015/03/24. doi: 10.1371/journal.ppat.1004735. PubMed PMID: 25799153; PubMed Central PMCID: PMC4370877.
10. Fayyaz M. Structural Characterization of Small basic protein (Sbp) and Accumulation associated protein (Aap) – two Proteins involved in Biofilm Formation in Staphylococcus epidermidis [Dissertation / PhD Thesis]: University of Hamburg; 2017.
11. Wang Y, Jiang J, Gao Y, Sun Y, Dai J, Wu Y, et al. Staphylococcus epidermidis small basic protein (Sbp) forms amyloid fibrils, consistent with its function as a scaffolding protein in biofilms. The Journal of biological chemistry. 2018;293(37):14296-311. Epub 2018/07/28. doi: 10.1074/jbc.RA118.002448. PubMed PMID: 30049797; PubMed Central PMCID: PMC6139570.
12. Romero D, Vlamakis H, Losick R, Kolter R. An accessory protein required for anchoring and assembly of amyloid fibres in B. subtilis biofilms. Molecular microbiology. 2011;80(5):1155-68. Epub 2011/04/12. doi: 10.1111/j.1365-2958.2011.07653.x. PubMed PMID: 21477127; PubMed Central PMCID: PMC3103627.
13. Robinson LS, Ashman EM, Hultgren SJ, Chapman MR. Secretion of curli fibre subunits is mediated by the outer membrane-localized CsgG protein. Molecular microbiology. 2006;59(3):870-81. Epub 2006/01/20. doi: 10.1111/j.1365-2958.2005.04997.x. PubMed PMID: 16420357; PubMed Central PMCID: PMC2838483.
14. Nelson R, Eisenberg D. Structural Models of Amyloid-Like Fibrils. Advances in Protein Chemistry. 73: Academic Press; 2006. p. 235-82.
15. Kayed R, Head E, Sarsoza F, Saing T, Cotman CW, Necula M, et al. Fibril specific, conformation dependent antibodies recognize a generic epitope common to amyloid fibrils and fibrillar oligomers that is absent in prefibrillar oligomers. Molecular Neurodegeneration. 2007;2(1):18. doi: 10.1186/1750-1326-2-18.
16. Tsolis AC, Papandreou NC, Iconomidou VA, Hamodrakas SJ. A consensus method for the prediction of 'aggregation-prone' peptides in globular proteins. PloS one. 2013;8(1):e54175. Epub 2013/01/18. doi: 10.1371/journal.pone.0054175. PubMed PMID: 23326595; PubMed Central PMCID: PMC3542318.
17. Carulla N, Zhou M, Giralt E, Robinson CV, Dobson CM. Structure and intermolecular dynamics of aggregates populated during amyloid fibril formation studied by hydrogen/deuterium exchange. Accounts of chemical research. 2010;43(8):1072-9. Epub 2010/06/19. doi: 10.1021/ar9002784. PubMed PMID: 20557067.
18. Kheterpal I, Wetzel R. Hydrogen/deuterium exchange mass spectrometry--a window into amyloid structure. Accounts of chemical research. 2006;39(9):584-93. Epub 2006/09/20. doi: 10.1021/ar050057w. PubMed PMID: 16981674.
19. Page MI, Jencks WP. Entropic contributions to rate accelerations in enzymic and intramolecular reactions and the chelate effect. Proceedings of the National Academy of Sciences of the United States of America. 1971;68(8):1678-83. Epub 1971/08/01. PubMed PMID: 5288752; PubMed Central PMCID: PMC389269.
20. Chaton CT, Herr AB. Defining the metal specificity of a multifunctional biofilm adhesion protein. Protein science : a publication of the Protein Society. 2017;26(10):1964-73. Epub 2017/07/15. doi: 10.1002/pro.3232. PubMed PMID: 28707417; PubMed Central PMCID: PMC5606542.
21. Sun D, Accavitti MA, Bryers JD. Inhibition of biofilm formation by monoclonal antibodies against Staphylococcus epidermidis RP62A accumulation-associated protein. Clinical and diagnostic laboratory immunology. 2005;12(1):93-100. Epub 2005/01/12. doi: 10.1128/cdli.12.1.93-100.2005. PubMed PMID: 15642991; PubMed Central PMCID: PMC540198.
22. Rohde H, Burdelski C, Bartscht K, Hussain M, Buck F, Horstkotte MA, et al. Induction of Staphylococcus epidermidis biofilm formation via proteolytic processing of the accumulation-associated protein by staphylococcal and host proteases. Molecular microbiology. 2005;55(6):1883-95. doi: 10.1111/j.1365-2958.2005.04515.x. PubMed PMID: 15752207.
23. Flemming HC, Wingender J. The biofilm matrix. Nature reviews Microbiology. 2010;8(9):623-33. Epub 2010/08/03. doi: 10.1038/nrmicro2415. PubMed PMID: 20676145.
24. Schwartz K, Ganesan M, Payne DE, Solomon MJ, Boles BR. Extracellular DNA facilitates the formation of functional amyloids in Staphylococcus aureus biofilms. Molecular microbiology. 2016;99(1):123-34. Epub 2015/09/15. doi: 10.1111/mmi.13219. PubMed PMID: 26365835; PubMed Central PMCID: PMC4715698.
25. Blancas-Mejia LM, Hammernik J, Marin-Argany M, Ramirez-Alvarado M. Differential effects on light chain amyloid formation depend on mutations and type of glycosaminoglycans. The Journal of biological chemistry. 2015;290(8):4953-65. Epub 2014/12/30. doi: 10.1074/jbc.M114.615401. PubMed PMID: 25538238; PubMed Central PMCID: PMC4335233.
26. Jones EA, McGillivary G, Bakaletz LO. Extracellular DNA within a nontypeable Haemophilus influenzae-induced biofilm binds human beta defensin-3 and reduces its antimicrobial activity. Journal of innate immunity. 2013;5(1):24-38. Epub 2012/08/28. doi: 10.1159/000339961. PubMed PMID: 22922323; PubMed Central PMCID: PMC3640559.
27. Rohde H, Burandt EC, Siemssen N, Frommelt L, Burdelski C, Wurster S, et al. Polysaccharide intercellular adhesin or protein factors in biofilm accumulation of Staphylococcus epidermidis and Staphylococcus aureus isolated from prosthetic hip and knee joint infections. Biomaterials. 2007;28(9):1711-20. doi: 10.1016/j.biomaterials.2006.11.046. PubMed PMID: 17187854.
28. Macintosh RL, Brittan JL, Bhattacharya R, Jenkinson HF, Derrick J, Upton M, et al. The terminal A domain of the fibrillar accumulation-associated protein (Aap) of Staphylococcus epidermidis mediates adhesion to human corneocytes. Journal of bacteriology. 2009;191(22):7007-16. Epub 2009/09/15. doi: 10.1128/jb.00764-09. PubMed PMID: 19749046; PubMed Central PMCID: PMC2772481.
29. Sobotka AK, Malveaux FJ, Marone G, Thomas LL, Lichtenstein LM. IgE-mediated basophil phenomena: quantitation, control, inflammatory interactions. Immunological reviews. 1978;41:171-85. Epub 1978/01/01. PubMed PMID: 81547.
30. Metz M, Maurer M. Mast cells – key effector cells in immune responses. Trends in Immunology. 2007;28(5):234-41. doi: 10.1016/j.it.2007.03.003.
31. Hasan R, Rink L, Haase H. Zinc signals in neutrophil granulocytes are required for the formation of neutrophil extracellular traps. Innate Immunity. 2013;19(3):253-64. doi: 10.1177/1753425912458815. PubMed PMID: 23008348.
32. Hill SJ, Fisher AS. Atomic Absorption, Methods and Instrumentation. In: Lindon JC, Tranter GE, Koppenaal DW, editors. Encyclopedia of Spectroscopy and Spectrometry (Third Edition). Oxford: Academic Press; 2017. p. 37-43.
33. Dasgupta A, Wahed A. Chapter 1 - Instrumentation and Analytical Methods. In: Dasgupta A, Wahed A, editors. Clinical Chemistry, Immunology and Laboratory Quality Control. San Diego: Elsevier; 2014. p. 1-18.
34. Pani T, Das A, Osborne JW. Bioremoval of zinc and manganese by bacterial biofilm: A bioreactor-based approach. Journal of Photochemistry and Photobiology B: Biology. 2017;175:211-8. doi: 10.1016/j.jphotobiol.2017.08.039.


Appendix I. Biophysical insights into the mechanism of Bap-dependent biofilm formation in Acinetobacter baumannii

Authors: Alexander E. Yarawsky1,2, Joseph J. Maciag2, P. Ethan Adkins2, Alexander R. Horswill3 and Andrew B. Herr2,4

Affiliations: 1 - Graduate Program in Molecular Genetics, Biochemistry & Microbiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA

2 - Division of Immunobiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA

3 - Department of Immunology and Microbiology, Anschutz Medical Campus, University of Colorado, Aurora, CO 80045

4 - Division of Infectious Diseases, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA

Author Contributions: A.E.Y. collected and analyzed all data associated with Bapice2, and assisted J.J.M. in collecting and analyzing Bapice1 CD and AUC data.

J.J.M. collected and analyzed Bapice1-associated data.

P.E.A. assisted A.E.Y. in purification of Bapice2 and associated data collection.

A.R.H. provided Bapice1 and Bapice2 protein-expressing BLR(DE3) cells.

A.B.H. and A.E.Y. conceived experiments and directed the project.

A.E.Y. wrote this draft.

Funding: Work was performed using funding from R01GM094363 awarded to A.B.H. and the University of Cincinnati Graduate School Dean's Fellowship awarded to A.E.Y. (2018-2019 AY).


Abstract

The emerging pathogen, Acinetobacter baumannii, is a gram-negative bacterium capable of causing pneumonia, bacteremia, device-related infections, and more. It has also become a particular problem in victims of natural disasters and war. The biofilm-associated protein, Bap, has been associated with many clinical strains, and is a target of interest for therapeutic solutions to A. baumannii infection. We present here the first biophysical analysis of two regions of Bap, composed of A, B and C repeating domains. These regions are highly elongated and composed of beta-sheet and random coil. Interestingly, Zn2+-induced dimerization was observed, suggesting a mechanism of intercellular adhesion dependent on this important divalent cation. Further investigation will be required to decipher the relevance of the Zn2+-induced dimerization event.


Introduction

Acinetobacter baumannii is an emerging pathogen in hospitals worldwide, and was recognized as one of six especially problematic pathogens by the Antimicrobial Availability Task Force of the Infectious Diseases Society of America in 2006 [1]. This gram-negative bacterium is responsible for hospital-acquired pneumonia, as well as bloodstream infections, bacteremia, skin infections, and infections associated with catheters [1, 2]. A. baumannii is particularly dangerous for the elderly, alcoholics, and smokers [1, 2]. More recently, this pathogen has presented itself as problematic for war and natural disaster victims [1, 3, 4]. In a report by Davis, et al., 16 of 20 war-related injuries involving A. baumannii infection contained multidrug-resistant isolates [4].

In addition to its multidrug resistance, the virulence of A. baumannii is also connected to its ability to form biofilms on biotic and abiotic surfaces [5]. The biofilm-associated protein, Bap, has been identified as an important player in biofilm formation and pathogenesis [6-8]. An antibody raised against the surface-attached Bap reacted to 41% of A. baumannii strains isolated from military personnel [6]. A transposon mutant lacking Bap expression was unable to form well-developed biofilms, suggesting a role for Bap in intercellular adhesion [6]. Furthermore, biofilm formation could be inhibited by addition of Bap antibody, suggesting Bap may be involved in surface adhesion to polystyrene, as well [7]. Brossard and Campagnari [8] investigated the ability of Bap to mediate adhesion to human bronchial epithelial cells and human neonatal keratinocytes. In both cases, a Bap-expressing strain showed significantly better adherence than a Bap-deficient mutant. The authors demonstrated that Bap greatly increases the bacterial cell surface hydrophobicity, which is a likely mechanism for Bap's ability to influence surface adhesion [8].

Bap is a very large, cell wall-associated protein, capable of reaching a total size of 854 kDa, with multiple repeating modules. The Bapice1 and Bapice2 regions at the N-terminus of the protein (Figure 1) share homology with immunoglobulin-like proteins, while the D repeats share homology with cadherins [6, 9]. The secondary structure is predicted to be predominantly beta-sheet along the immunoglobulin-like regions of Bapice1 and Bapice2, while the D repeats are predicted to contain alpha-helical content [9]. Ligand-binding predictions identified Ca-, Cu- and Ni-binding sites in the Bapice regions, along with binding sites for a variety of carbohydrates, while the D repeats may bind Mg and Ca [9]. Downstream of the D repeats are E, F, and G repeats, which are not similar to the Bapice regions or D repeats [6, 9]. A region from Bapice2 was evaluated as a vaccine candidate, and mice immunized with the recombinant protein were capable of developing antibodies against Bap, which resulted in significant protection from A. baumannii infection [10].

Figure 1. Bap domain arrangement. Bapice1 and Bapice2 are the two regions (colored blue) of interest in this study, and they are composed of A, B and C repeats. The D repeat region (orange) is downstream of the Bapice regions. The green-colored regions are each composed of various arrangements of E, F and G repeats.


Tiwari, et al. [11] performed in silico screening for small molecule inhibitors of biofilm formation via Bap-binding. Using a 396 amino acid region of Bapice1, the authors produced a structural model with a highly elongated shape, composed almost entirely of beta-sheet and random coil secondary structure. An active site was predicted, and two compounds from the ZINC database survived through all rounds of screening. The top compound also docked to a predicted dimer. Both compounds inhibited A. baumannii biofilm formation when added at 0 hr or 48 hr time points [11].

Herein, we provide the first experimental investigation of the biophysical parameters of two regions of Bap (Figure 1 - Bapice1 and Bapice2). Using circular dichroism, we observed strong beta-sheet and random coil secondary structure signals, with little to no alpha-helix. Analytical ultracentrifugation (AUC) analyses show these regions are highly extended in solution. The strong beta-sheet and random coil secondary structure, along with a highly elongated tertiary structure, align well with predictions based on sequence homology to immunoglobulin-like proteins [11].

Interestingly, we identify a novel dimer, which is induced by addition of Zn2+ ions. The biological implications of the Zn2+-induced dimerization event will need to be tested in the context of biofilm formation assays and models of pathogenesis. Such work may be a critical foundation upon which new therapeutic approaches to preventing or treating A. baumannii infections will be built.



Figure 2. Secondary structure analysis reveals beta-sheet and random coil content. Panel (A) shows the circular dichroism wavelength data for Bapice1 at 20°C (black) and 80°C (red). Panel (B) shows similar data for Bapice2 at 20°C (blue) and 80°C (red). The deconvolution of the 20°C wavelength data using the CDSSTR algorithm [12] of DichroWeb [13] with two different reference sets is shown in Table 1. The 80°C data show a large increase in the minimum at 200 nm, resembling an unfolded protein.

Sample    Ref Set  NRMSD  Helix  Strand   Turns   Unordered  Total
Bapice1   4        0.101  6 ± 4  31 ± 9   26 ± 6  37 ± 8     100 ± 2
Bapice1   7        0.101  4 ± 1  39 ± 17  21 ± 9  35 ± 11    99 ± 2
Bapice2   4        0.012  8 ± 2  64 ± 15  6 ± 6   22 ± 9     99 ± 2
Bapice2   7        0.012  1      86       3       12         102

Table 1. Wavelength scans of Bapice1 and Bapice2 at 20°C were analyzed by DichroWeb [13]. The CDSSTR algorithm [12] was used with two separate reference sets. In all cases, Bapice1 and Bapice2 were found to be enriched in strand, turns, and unordered elements, with very little helix. The top six valid solutions were averaged, and the standard deviation was calculated. Note that the analysis of Bapice2, reference set 7, only provided one solution.


Results

Bap is rich in beta-sheet secondary structure content and highly elongated

Recombinant Bapice1 and Bapice2 were expressed and their secondary structure characterized in solution. Circular dichroism (CD) showed mostly beta-sheet and random coil content, with virtually no helix content (Figure 2 and Table 1). These results are in very good agreement with sequence-based secondary structure predictions (data not shown). We then utilized sedimentation velocity analytical ultracentrifugation (AUC) to examine the size and shape of the protein in solution.

Bapice1 and Bapice2 each sedimented as a single monomeric species, with frictional ratios near 1.8 and 2.2, respectively (Figure 3, Table 2). A frictional ratio of 1.0 represents a perfect sphere, while 1.3-1.4 is expected for common globular proteins, such as bovine serum albumin [14]. The more elongated a protein, the higher its frictional ratio. The elongated nature of a biofilm-related surface protein such as this is neither unexpected nor uncommon [15-18]. The ability to "reach" out toward another cell is most certainly useful for producing intercellular contacts.
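The relationship between molecular weight, frictional ratio, and sedimentation coefficient can be made concrete with the Svedberg equation combined with Stokes' law. The following is a minimal sketch, not the SEDFIT analysis used here: the partial specific volume (0.73 ml/g), solvent density (1.0 g/ml), and viscosity (0.01002 poise, water at 20°C) are generic assumed values rather than measured parameters for these constructs, and the function name is introduced for illustration.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def sedimentation_coefficient(molar_mass_g_mol, frictional_ratio,
                              vbar_ml_g=0.73, density_g_ml=1.0,
                              viscosity_poise=0.01002):
    """Return s in Svedberg units (1 S = 1e-13 s), computed in CGS units.

    vbar, density and viscosity defaults are assumed, generic values.
    """
    # Radius (cm) of the anhydrous sphere with the same mass and volume.
    volume_cm3 = molar_mass_g_mol * vbar_ml_g / AVOGADRO
    r0_cm = (3.0 * volume_cm3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    # Stokes friction of that sphere (g/s), scaled by the frictional ratio.
    friction = frictional_ratio * 6.0 * math.pi * viscosity_poise * r0_cm
    # Svedberg equation: s = M (1 - vbar * rho) / (N_A * f).
    buoyant_mass = molar_mass_g_mol * (1.0 - vbar_ml_g * density_g_ml)
    return buoyant_mass / (AVOGADRO * friction) / 1e-13


# Molecular weights and frictional ratios as listed in Table 2.
s_bapice1 = sedimentation_coefficient(42000.0, 1.83)
s_bapice2 = sedimentation_coefficient(69000.0, 2.18)
# The same Bapice2 mass packed into a perfect sphere, for comparison.
s_sphere = sedimentation_coefficient(69000.0, 1.0)
print(round(s_bapice1, 2), round(s_bapice2, 2), round(s_sphere, 2))
```

Under these assumptions the calculated values fall close to the uncorrected s* values in Table 2, while a sphere of the same mass as Bapice2 would sediment roughly twice as fast. This illustrates how strongly the elongated shape slows sedimentation.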

Bap dimerization occurs in the presence of Zn2+

Due to the apparent similarities between the structure and functional roles of the B-repeats of Aap from Staphylococcus epidermidis [16, 17] and the Bapice regions of A. baumannii, we hypothesized that Bap may be able to assemble in the presence of divalent metal cations to form intercellular contacts in the biofilm. Once again utilizing sedimentation velocity AUC, we observed a significant shift in the sedimentation coefficient (s*) of Bapice2 in the presence of 10 mM ZnCl2, but not when other tested divalent cations were present (Figure 4A). The frictional ratio decreased to ~1.7 (Table 3), indicating a less elongated structure than the monomer. This is consistent with our observations with the B-repeat constructs of Aap, and may indicate dimerization of Bapice2, with each protomer overlapping another in such a way that the overall complex is now twice as thick, but similar in length. However, it is also possible the protein is simply undergoing changes in structure associated with Zn2+-binding, but is still monomeric. Similar experiments were performed with Bapice1 (Figure 4B); however, there was no considerable shift in the sedimentation coefficient upon Zn2+ addition. This should not necessarily be interpreted as Bapice1 being unable to assemble, but rather as an indication that its capacity for binding is lower. This is likely related to the fact that Bapice2 has an additional two bacterial Ig-like repeats (Figure 1). In the context of our previous work with Aap B-repeats, the more repeats that are present, the lower the Zn2+ concentration required for assembly [16, 19] (see Dissertation Chapter III).

Figure 3. Bapice2 is highly extended in solution. Bapice2 (16.7 µM = 1 mg/ml) was analyzed by sedimentation velocity AUC in 20 mM Tris (pH 7.4), 150 mM NaCl, at 20°C. Hydrodynamic parameters calculated from this experiment are listed in Table 2.

Sample    s*    f/f0  MW
Bapice1   2.37  1.83  42 kDa
Bapice2   2.82  2.18  69 kDa

Table 2. The hydrodynamic parameters determined by sedimentation velocity AUC are listed. The s* value has not been corrected for buffer density, viscosity, or partial specific volume of the protein. Values were calculated by SEDFIT's continuous c(s) distribution analysis.

To investigate whether or not Bapice2 is assembling in the presence of Zn2+, we performed sedimentation equilibrium AUC experiments. This approach uses first principles to measure the molecular weight of proteins upon reaching thermodynamic equilibrium. As shown in Figure 4C and Figure 4D, Bapice2 dimerizes in the presence of Zn2+, but not in the absence of Zn2+. Interestingly, even at high Zn2+ concentrations, the monomer species was more prevalent (Figure 4E). This suggests a weak dimerization constant. Also, because we are unable to fully saturate the dimer, it is not clear from these data whether or not the dimer is the terminal assembly state. Nonetheless, the shift in the sedimentation coefficient observed in Figure 4A is at least in part due to Bapice2 dimerization.
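The first-principles basis of sedimentation equilibrium can be illustrated for the simplest case, a single ideal species, whose concentration gradient at equilibrium is c(r) = c(r0) exp[M(1 - v̄ρ)ω²(r² - r0²) / 2RT]. The sketch below simulates such a gradient and recovers the molar mass from the slope of ln c versus r². It is an illustration only: the function names are hypothetical, the partial specific volume and buffer density are placeholder assumptions, and the analysis in this chapter relied on fits in WinNonlin rather than this linearization.

```python
import math

R_GAS = 8.314462618e7  # gas constant in erg mol^-1 K^-1 (CGS)

def equilibrium_profile(c0, r0, radii, molar_mass, vbar, rho, rpm, temp_k):
    """Concentration versus radius (cm) for one ideal species at equilibrium."""
    omega = rpm * 2.0 * math.pi / 60.0  # angular velocity, rad/s
    sigma = molar_mass * (1.0 - vbar * rho) * omega ** 2 / (2.0 * R_GAS * temp_k)
    return [c0 * math.exp(sigma * (r ** 2 - r0 ** 2)) for r in radii]

def molar_mass_from_profile(radii, conc, vbar, rho, rpm, temp_k):
    """Recover molar mass from the least-squares slope of ln(c) versus r^2."""
    x = [r * r for r in radii]
    y = [math.log(c) for c in conc]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    omega = rpm * 2.0 * math.pi / 60.0
    return 2.0 * R_GAS * temp_k * slope / ((1.0 - vbar * rho) * omega ** 2)

# Illustrative inputs: 69 kDa (Table 2 estimate), assumed vbar and rho
radii = [6.90 + 0.01 * i for i in range(21)]
profile = equilibrium_profile(0.3, 6.90, radii, 69000.0, 0.73, 1.0, 15000, 293.15)
print(round(molar_mass_from_profile(radii, profile, 0.73, 1.0, 15000, 293.15)))
```

For a monomer-dimer mixture ln c versus r² is no longer linear, and a single-species fit of this kind returns a weight-averaged value between the monomer and dimer masses, which is the quantity reported in Figure 4E.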



Figure 4. Zn2+ induces dimerization of Bapice2. Sedimentation velocity experiments were performed at 1 mg/ml Bapice1 or Bapice2 in 20 mM Tris (pH 7.4), 150 mM NaCl, at 20°C, with the addition of 10 mM of various divalent cations. For Bapice2, only in the presence of ZnCl2 was a major shift in the sedimentation coefficient observed (A). Bapice1 did not show a major shift in the sedimentation coefficient under similar conditions (B). Sedimentation equilibrium AUC experiments were then performed on Bapice2 in the presence or absence of Zn2+. Examples of the raw data (empty circles) and species fits (lines) for Bapice2 without ZnCl2 (C) and with 8.00 mM ZnCl2 (D) are shown. Panel (E) shows the weight-averaged molecular weight approximated by fitting the data to a single species. Panel (F) shows circular dichroism data examining the secondary structure of Bapice2 with and without ZnCl2.

Sample         s*     f/f0
Bapice1        2.37   1.83
  + ZnCl2      2.48   1.76
Bapice2        2.82   2.18
  + ZnCl2      3.63   1.69
  + NiCl2      2.91   2.03
  + CoCl2      2.90   1.90
  + CaCl2      2.81   1.94
  + MnCl2      2.89   2.03
  + SrCl2      2.90   2.07
  + MgCl2      2.81   2.03

Table 3. Hydrodynamic parameters from sedimentation velocity AUC experiments with Bapice1 and Bapice2 in the absence or presence of various divalent cations (Figure 4).


Zn2+-induced dimerization of Bapice2 results in no significant secondary structural changes

To examine the effect of Zn2+-binding and dimerization on Bapice2 structure and stability, we performed CD in the presence of 5 mM ZnCl2. Data shown in Figure 4F demonstrate there is no significant change in secondary structure in the presence of 5 mM ZnCl2. Under these conditions, it seems likely that there is some degree of dimerization, but it would be useful to test higher concentrations of Zn2+ with higher protein concentrations to better address this question. Interestingly, in the B-repeats of Aap we also do not observe structural changes by CD upon dimerization (or tetramerization) [16] [see Dissertation Chapter III].

Testing for heteroassociation between the Bap ice-repeat regions

In the context of Bap-dependent intercellular adhesion, one might expect the Bapice1 region of one protein to overlap the Bapice2 region of the other protein. We have already shown via AUC approaches that Bapice2 can self-associate in the presence of Zn2+. To test for the possibility of heteroassociation (i.e., a Bapice1+Bapice2 dimer), we performed a series of sedimentation velocity AUC experiments in the absence and presence of Zn2+ (Figure 5). In the absence of Zn2+, we did not observe the formation of any faster-sedimenting species. In fact, the weight-averaged sedimentation coefficient (sw) observed for the mixture was very similar to the sw calculated from the sw and concentration of each protein individually (Table 4). The same approach was used to search for heteroassociation in the presence of Zn2+. Similarly, the sw of the mixture matched that calculated from the individual Bapice1+Zn2+ and Bapice2+Zn2+ data. Because a heteroassembly is unlikely to have the same assembly constant as the homoassembly, the agreement between the observed and calculated sw values suggests that heteroassembly is not occurring. One might anticipate that any Bapice1:Bapice2 heteroassembly would be preferred over the Bapice1 homoassembly, because that would allow complete overlap across both Bap ice-repeat regions. Alternatively, Bapice2 regions may overlap, while the Bapice1 and D repeats overlap (see orange repeats in Figure 1).


Figure 5. No evidence for heteroassociation of Bap ice-repeat regions is observed. Bapice1, Bapice2, or an equimolar mix of the two (Bapice1+2) were analyzed by sedimentation velocity AUC in the absence (A) or presence (B) of Zn2+. Table 4 contains the analysis of the weight-averaged sedimentation coefficients. Based on these distributions, no larger species are forming, beyond the expected dimer.

Sample                     Area     sw
Bapice1 - ZnCl2            0.454    2.328
Bapice2 - ZnCl2            0.639    2.708
Bapice1+2 - ZnCl2          1.099    2.475 (2.536)
Bapice1 + 8 mM ZnCl2       0.477    2.458
Bapice2 + 8 mM ZnCl2       0.704    3.736
Bapice1+2 + 8 mM ZnCl2     1.201    3.169 (3.166)

Table 4. The area of each distribution and the corresponding weight-averaged sedimentation coefficient (sw) were calculated in GUSSI [20]. In parentheses are the expected sw values for the Bapice1+2 experiments, assuming only homodimers are forming.


Discussion

Despite the apparent importance of Bap in A. baumannii infection and biofilm formation [6-8], there are no published experimental data regarding the structure of the protein. One obvious challenge that may be limiting the field is the enormous size of Bap, reaching 854 kDa (8620 amino acids). Bap contains repeating units denoted A-G. The N-terminal Bapice1 and Bapice2 regions studied here are composed of A, B, and C repeats [9], which have been shown to be important in Bap ligand-binding [9], infection [10], and intercellular association [11].

We measured the secondary structure content of Bapice1 and Bapice2, and observed primarily beta-sheet and random coil content, consistent with secondary structure predictions. Both proteins also showed high frictional ratios (elongated shape) by sedimentation velocity AUC, consistent with predicted homology to immunoglobulin and cadherin domains. In the presence of Zn2+, Bapice2 showed a significant shift in the sedimentation coefficient, which was shown to be a dimerization event based on sedimentation equilibrium AUC experiments. By CD, there was no significant change in the secondary structure under similar conditions. Future experiments should be performed at higher protein concentrations to examine the capability of Bapice2 to assemble beyond dimer, as the conditions used here resulted in only a moderate amount of overall dimerization. While we did not observe heteroassociation between Bapice1 and Bapice2, a more sensitive experiment for testing heteroassociation would be to fluorescently label Bapice1 and mix it with Bapice2 and Zn2+, then follow the sedimentation of that label in parallel to the protein. If labeled Bapice1 sediments with a sedimentation coefficient higher than that of Bapice1+Zn2+ alone, this would indicate that the labeled Bapice1 may be in complex with Bapice2.

Tiwari, et al. [11] reported Congo Red binding by extracellular matrix proteins in A. baumannii, which could indicate Bap amyloid formation, similar to Staphylococcus aureus Bap (poor sequence similarity, but similar predicted structural features) [11, 15]. Amyloidogenesis of Bapice1 and Bapice2 was not evaluated in this study, but given the weak assembly in the presence of Zn2+, a larger construct may be required for this type of investigation. Amyloidogenesis is highly dependent on local protein concentrations, which are much easier to achieve when subunits are linked together in the same protein, such as a construct containing both Bapice1 and Bapice2 regions. While Tiwari, et al. [11] have demonstrated the ability of Bap-targeted small molecules to inhibit and disrupt biofilm formation, we have proposed that Zn2+ is an important ligand in the intercellular association of Bap. Future studies should evaluate the ability of Zn2+ chelators to inhibit or disrupt biofilm formation. Zn2+ is an important divalent cation released by mast cells and neutrophils in response to bacterial infection [21-23]; therefore, Bap may act as a sensor to induce A. baumannii biofilm formation as a defensive tactic to escape the host immune response.


Materials and methods

Protein expression and purification

Bapice1 and Bapice2 were provided by Alexander Horswill in pTEV5 vectors in BLR(DE3) protein expression cells. A single colony was grown overnight at 37°C, shaking, in LB containing ampicillin (50 µg/l) and tetracycline (12.5 µg/l). A 25 ml aliquot from the overnight culture was added to 1 l of LB containing ampicillin and tetracycline. The culture was grown at 37°C, shaking, until it reached an OD600 of 0.8-1.0. The culture(s) was chilled to 10°C in an ice bath, then 2% ethanol (vol/vol) and 20 µM IPTG (Isopropyl β-D-1-thiogalactopyranoside) were added. The culture(s) was then incubated in a shaker at 20°C overnight. The following morning, the culture(s) was centrifuged at 4,500 rpm for 1 hour, the media removed, and the pellet resuspended in 20 mM Tris (pH 7.4) and 300 mM NaCl, collected and frozen at -20°C.

Frozen cells were lysed by a sonicator, centrifuged for 45 minutes at 14,000 rpm, then filtered through a 0.22 µm membrane. The filtered material was run over a Ni2+-charged HiTrap HP cartridge column (GE Healthcare). The binding buffer was 20 mM Tris (pH 7.4), 500 mM NaCl, and 10 mM imidazole. A gradient elution of 20 mM Tris (pH 7.4), 500 mM NaCl, and 1 M imidazole was performed to elute the 6xHis-tagged Bapice proteins from the affinity column. The imidazole was removed by dialysis into 20 mM Tris (pH 7.4), 300 mM NaCl before cleavage by His-tagged TEV protease to remove the 6xHis tag. The material was run over the HiTrap column again, this time collecting the flow-through containing Bapice1 or Bapice2. Proteins were then purified by a Superdex S200 (Bapice2) or Superdex S75 (Bapice1) (GE Healthcare). Proteins were then stored at 4°C or flash-frozen and stored at -80°C.


Analytical ultracentrifugation

AUC experiments were performed on a Beckman Coulter XL-I, using absorbance and/or interference optics. An An-60 Ti rotor and 1.2 cm two-sector epon-charcoal centerpieces with sapphire windows were used for sedimentation velocity experiments at 48,000 rpm, 20°C. SEDFIT's continuous c(s) distribution analysis was used for sedimentation velocity data [24]. SEDFIT c(s) distributions were loaded into GUSSI for calculation of area and sw by integration [20]. For sedimentation equilibrium experiments, six-sector 1.2 cm centerpieces were used with quartz windows. Bapice2 was dialyzed into 50 mM MOPS (pH 7.2) and 50 mM NaCl without ZnCl2, with 5 mM ZnCl2, or with 8 mM ZnCl2, and was loaded at 1.00, 0.33, and 0.11 mg/ml, and centrifuged at 15,000, 20,000, 24,000, 32,000, and 48,000 rpm for 24 hours, at which point equilibrium had already been reached. Data were trimmed in WinReedit V0.999 (http://www.rasmb.org/software/) and fitted in WinNonlin V1.080 (http://www.rasmb.org/software/), using the partial specific volume, buffer density, and buffer viscosity estimated by SEDNTERP [25].

Circular dichroism

An Aviv 215 circular dichroism spectrophotometer equipped with an Aviv peltier junction temperature control sample holder was used to analyze samples in a 0.5 mm quartz cuvette (Hellma Analytics) at 20°C. Proteins were examined at 0.33 mg/ml, from 300 nm to 178 nm, with 1 nm steps. The averaging time was 3 seconds, and scans were taken in triplicate. For Bapice2, the mean residue weight (MRW) used for calculating the mean residue ellipticity ([θ]) was 98.36 g mol-1 residue-1. The mean residue ellipticity was calculated as described elsewhere [18].
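For orientation, the commonly used conversion from observed ellipticity to mean residue ellipticity is [θ] = θ_obs × MRW / (10 × l × c), with θ_obs in millidegrees, l the path length in cm, and c the protein concentration in mg/ml, giving units of deg cm2 dmol-1. The sketch below applies this standard relation with the MRW, path length, and concentration stated above. The function name and the -5 mdeg reading are hypothetical, and the procedure in reference [18] may differ in detail.

```python
def mean_residue_ellipticity(theta_mdeg, mrw, path_cm, conc_mg_ml):
    """Convert observed ellipticity (mdeg) to [theta] in deg cm^2 dmol^-1.

    mrw: mean residue weight (g mol^-1 residue^-1)
    path_cm: cuvette path length in cm
    conc_mg_ml: protein concentration in mg/ml
    """
    return theta_mdeg * mrw / (10.0 * path_cm * conc_mg_ml)

# Conditions from this section: MRW 98.36, 0.5 mm (0.05 cm) cuvette, 0.33 mg/ml.
# The -5 mdeg signal is a made-up reading used only to show the arithmetic.
mre = mean_residue_ellipticity(-5.0, 98.36, 0.05, 0.33)
print(round(mre, 1))
```

Because the result scales inversely with both path length and concentration, errors in the measured protein concentration propagate directly into [θ].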


References

1. Talbot GH, Bradley J, Edwards JE, Jr., Gilbert D, Scheld M, Bartlett JG. Bad bugs need drugs: an update on the development pipeline from the Antimicrobial Availability Task Force of the Infectious Diseases Society of America. Clinical infectious diseases : an official publication of the Infectious Diseases Society of America. 2006;42(5):657-68. Epub 2006/02/01. doi: 10.1086/499819. PubMed PMID: 16447111.

2. Antunes LC, Visca P, Towner KJ. Acinetobacter baumannii: evolution of a global pathogen. Pathogens and disease. 2014;71(3):292-301. Epub 2014/01/01. doi: 10.1111/2049-632x.12125. PubMed PMID: 24376225.

3. Joly-Guillou ML. Clinical impact and pathogenicity of Acinetobacter. Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases. 2005;11(11):868-73. Epub 2005/10/12. doi: 10.1111/j.1469-0691.2005.01227.x. PubMed PMID: 16216100.

4. Davis KA, Moran KA, McAllister CK, Gray PJ. Multidrug-resistant Acinetobacter extremity infections in soldiers. Emerging infectious diseases. 2005;11(8):1218-24. Epub 2005/08/17. doi: 10.3201/eid1108.050103. PubMed PMID: 16102310; PubMed Central PMCID: PMCPMC3320488.

5. Longo F, Vuotto C, Donelli G. Biofilm formation in Acinetobacter baumannii. The new microbiologica. 2014;37(2):119-27. Epub 2014/05/27. PubMed PMID: 24858639.

6. Loehfelm TW, Luke NR, Campagnari AA. Identification and characterization of an Acinetobacter baumannii biofilm-associated protein. Journal of bacteriology. 2008;190(3):1036-44. Epub 2007/11/21. doi: 10.1128/jb.01416-07. PubMed PMID: 18024522; PubMed Central PMCID: PMCPMC2223572.

7. Goh HM, Beatson SA, Totsika M, Moriel DG, Phan MD, Szubert J, et al. Molecular analysis of the Acinetobacter baumannii biofilm-associated protein. Applied and environmental microbiology. 2013;79(21):6535-43. Epub 2013/08/21. doi: 10.1128/aem.01402-13. PubMed PMID: 23956398; PubMed Central PMCID: PMCPMC3811493.

8. Brossard KA, Campagnari AA. The Acinetobacter baumannii biofilm-associated protein plays a role in adherence to human epithelial cells. Infection and immunity. 2012;80(1):228-33. Epub 2011/11/16. doi: 10.1128/iai.05913-11. PubMed PMID: 22083703; PubMed Central PMCID: PMCPMC3255684.

9. Rahbar MR, Rasooli I, Mousavi Gargari SL, Amani J, Fattahian Y. In silico analysis of antibody triggering biofilm associated protein in Acinetobacter baumannii. Journal of theoretical biology. 2010;266(2):275-90. Epub 2010/07/06. doi: 10.1016/j.jtbi.2010.06.014. PubMed PMID: 20600143.

10. Fattahian Y, Rasooli I, Mousavi Gargari SL, Rahbar MR, Darvish Alipour Astaneh S, Amani J. Protection against Acinetobacter baumannii infection via its functional deprivation of biofilm associated protein (Bap). Microbial pathogenesis. 2011;51(6):402-6. Epub 2011/09/29. doi: 10.1016/j.micpath.2011.09.004. PubMed PMID: 21946278.

11. Tiwari V, Patel V, Tiwari M. In-silico screening and experimental validation reveal L-Adrenaline as anti-biofilm molecule against biofilm-associated protein (Bap) producing Acinetobacter baumannii. International journal of biological macromolecules. 2018;107(Pt A):1242-52. Epub 2017/10/02. doi: 10.1016/j.ijbiomac.2017.09.105. PubMed PMID: 28964839.


12. Johnson WC. Analyzing protein circular dichroism spectra for accurate secondary structures. Proteins. 1999;35(3):307-12. Epub 1999/05/18. PubMed PMID: 10328265.

13. Whitmore L, Wallace BA. DICHROWEB, an online server for protein secondary structure analyses from circular dichroism spectroscopic data. Nucleic acids research. 2004;32(Web Server issue):W668-73. Epub 2004/06/25. doi: 10.1093/nar/gkh371. PubMed PMID: 15215473; PubMed Central PMCID: PMCPMC441509.

14. Zhao H, Ghirlando R, Alfonso C, Arisaka F, Attali I, Bain DL, et al. A multilaboratory comparison of calibration accuracy and the performance of external references in analytical ultracentrifugation. PloS one. 2015;10(5):e0126420. Epub 2015/05/23. doi: 10.1371/journal.pone.0126420. PubMed PMID: 25997164; PubMed Central PMCID: PMCPMC4440767.

15. Taglialegna A, Navarro S, Ventura S, Garnett JA, Matthews S, Penades JR, et al. Staphylococcal Bap Proteins Build Amyloid Scaffold Biofilm Matrices in Response to Environmental Signals. PLoS pathogens. 2016;12(6):e1005711. Epub 2016/06/22. doi: 10.1371/journal.ppat.1005711. PubMed PMID: 27327765; PubMed Central PMCID: PMCPMC4915627.

16. Conrady DG, Brescia CC, Horii K, Weiss AA, Hassett DJ, Herr AB. A zinc-dependent adhesion module is responsible for intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2008;105(49):19456-61. doi: 10.1073/pnas.0807717105. PubMed PMID: 19047636; PubMed Central PMCID: PMC2592360.

17. Conrady DG, Wilson JJ, Herr AB. Structural basis for Zn2+-dependent intercellular adhesion in staphylococcal biofilms. Proceedings of the National Academy of Sciences of the United States of America. 2013;110(3):E202-11. Epub 2013/01/02. doi: 10.1073/pnas.1208134110. PubMed PMID: 23277549; PubMed Central PMCID: PMCPMC3549106.

18. Yarawsky AE, English LR, Whitten ST, Herr AB. The Proline/Glycine-Rich Region of the Biofilm Adhesion Protein Aap Forms an Extended Stalk that Resists Compaction. Journal of molecular biology. 2017;429(2):261-79. Epub 2016/11/29. doi: 10.1016/j.jmb.2016.11.017. PubMed PMID: 27890783.

19. Chaton CT, Herr AB. Defining the metal specificity of a multifunctional biofilm adhesion protein. Protein science : a publication of the Protein Society. 2017. Epub 2017/07/15. doi: 10.1002/pro.3232. PubMed PMID: 28707417.

20. Brautigam CA. Calculations and Publication-Quality Illustrations for Analytical Ultracentrifugation Data. Methods in enzymology. 2015;562:109-33. Epub 2015/09/29. doi: 10.1016/bs.mie.2015.05.001. PubMed PMID: 26412649.

21. Metz M, Maurer M. Mast cells – key effector cells in immune responses. Trends in Immunology. 2007;28(5):234-41. doi: https://doi.org/10.1016/j.it.2007.03.003.

22. Sobotka AK, Malveaux FJ, Marone G, Thomas LL, Lichtenstein LM. IgE-mediated basophil phenomena: quantitation, control, inflammatory interactions. Immunological reviews. 1978;41:171-85. Epub 1978/01/01. PubMed PMID: 81547.

23. Hasan R, Rink L, Haase H. Zinc signals in neutrophil granulocytes are required for the formation of neutrophil extracellular traps. Innate Immunity. 2013;19(3):253-64. doi: 10.1177/1753425912458815. PubMed PMID: 23008348.


24. Schuck P. Size-distribution analysis of macromolecules by sedimentation velocity ultracentrifugation and lamm equation modeling. Biophysical journal. 2000;78(3):1606-19. Epub 2000/02/29. doi: 10.1016/s0006-3495(00)76713-0. PubMed PMID: 10692345; PubMed Central PMCID: PMCPMC1300758.

25. Laue TM, Shah BD, Ridgeway TM, Pelletier SL. Computer-aided interpretation of sedimentation data for proteins. In: Harding SE, Rowe AJ, Horton JC, editors. Analytical Ultracentrifugation in Biochemistry and Polymer Science: Royal Society of Chemistry, London; 1992. p. 90-125.
