A Canonical Biophysical Model of the Csra Global Regulator Suggests
Total Page:16
File Type:pdf, Size:1020Kb
www.nature.com/scientificreports OPEN A Canonical Biophysical Model of the CsrA Global Regulator Suggests Flexible Regulator-Target Received: 15 January 2018 Accepted: 4 June 2018 Interactions Published: xx xx xxxx A. N. Leistra1, G. Gelderman 1, S. W. Sowa2, A. Moon-Walker3, H. M. Salis4 & L. M. Contreras1 Bacterial global post-transcriptional regulators execute hundreds of interactions with targets that display varying molecular features while retaining specifcity. Herein, we develop, validate, and apply a biophysical, statistical thermodynamic model of canonical target mRNA interactions with the CsrA global post-transcriptional regulator to understand the molecular features that contribute to target regulation. Altogether, we model interactions of CsrA with a pool of 236 mRNA: 107 are experimentally regulated by CsrA and 129 are suspected interaction partners. Guided by current understanding of CsrA- mRNA interactions, we incorporate (i) mRNA nucleotide sequence, (ii) cooperativity of CsrA-mRNA binding, and (iii) minimization of mRNA structural changes to identify an ensemble of likely binding sites and their free energies. The regulatory impact of bound CsrA on mRNA translation is determined with the RBS calculator. Predicted regulation of 66 experimentally regulated mRNAs adheres to the principles of canonical CsrA-mRNA interactions; the remainder implies that other, diverse mechanisms may underlie CsrA-mRNA interaction and regulation. Importantly, results suggest that this global regulator may bind targets in multiple conformations, via fexible stretches of overlapping predicted binding sites. This novel observation expands the notion that CsrA always binds to its targets at specifc consensus sequences. Large-scale omics techniques have been applied with increasing frequency to the study of bacterial post-transcriptional global regulators (e.g., E. coli Hfq, ProQ, and CsrA), aiming to elucidate the scope of their targets and regulatory efects1–6. Results have established that these global regulators can act upon over hundreds of targets. For example, the Hfq global regulator has been implicated as a chaperone for nearly all characterized small RNA (sRNA)-messenger RNA (mRNA) interactions in E. coli7,8. Hfq-RNA interactions are characterized by Hfq binding U-rich sequences in the 3′ portions of sRNAs3,9–11 and A-rich sequences in the 5′ untranslated regions (UTRs) of mRNAs3,12,13. Tese motifs are considered specifc to the proximal and distal binding faces, respectively, of the Hfq hexamer12,14. However, interaction-specifc variation is observed, in which sRNAs contact both faces, like ChiX and McaS15,16. Te impact of such variation in binding on target control and network regu- lation is still unfolding. For the case of CsrA, approximately 800 mRNA have been identifed across multiple environmental condi- tions as potentially interacting with CsrA2,5: this total approaches 20% of the E. coli genome. Generally, the CsrA homodimer binds a target mRNA at two copies17 of a consensus sequence (ANGGA)18, preferentially located in the loop of a hairpin structure18,19. However substantial variation in these traits is observed. For example, among the 31 classical and well-characterized targets of CsrA (defned in Table 1), mRNAs like pgaA may hold to the general pattern20, but clpB, dps, patA, and purM do not present the consensus ANGGA binding motif5. Similarly, hfq and ycdT present only a single copy of the consensus sequence20–24 and cstA21 presents footprinted binding sites outside of the typical stem loop structures that have been shown favorable to CsrA binding. 1McKetta Department of Chemical Engineering, University of Texas at Austin, 200 E. Dean Keeton St. Stop C0400, Austin, TX, 78712, USA. 2Microbiology Graduate Program, University of Texas at Austin, 100 E. 24th St. Stop A6500, Austin, TX, 78712, USA. 3Biological Sciences Program College of Natural Sciences, University of Texas at Austin, 120 Inner Campus Drive Stop G2500, Austin, TX, 78712, USA. 4Department of Chemical Engineering, Pennsylvania State University, 210 Agricultural Engineering Building, University Park, PA, 16802, USA. Correspondence and requests for materials should be addressed to L.M.C. (email: [email protected]) SCIENTIFIC REPORTS | (2018) 8:9892 | DOI:10.1038/s41598-018-27474-2 1 www.nature.com/scientificreports/ Number Number of mRNA with Foot- Direct Regulation of Literature of mRNA association of mRNA Correctly-Predicted mRNA Target Type printed? Binding? translation? with CsrA modeled Regulation (rep or act) Classical Yes Yes Yes Multiple Studies20,21,23,25,39,42–47 10 8 Well-characterized No Yes Yes Multiple Studies5,6,24 21 13 Sequence-Search model40,41 30 23 Multi-Omics study5 28 17 Functional No No Yes 2 5 Co-IP , HITS-CLIP , or mRNA 18 5 stability4 study Total 107 66 Table 1. mRNAs experimentally regulated by CsrA. Figure 1. Biophysical considerations for modeling the CsrA post-transcriptional regulator. (A) Te CsrA protein regulator binds mRNA, preferentially with a stoichiometry of one CsrA homodimer to two binding sites within an mRNA. (B) Zooming in on the CsrA-mRNA interaction highlights the biophysical factors that afect it. CsrA preferentially binds (i) ANGGA nucleotide sequences (ii) spaced approximately 10–55 nucleotides apart to support cooperative dual-site binding. Te (iii) structure of the mRNA determines how accessible these sequences may be and the energetic cost of CsrA-mRNA binding. Lastly, CsrA bound to the mRNA may (iv) interfere with ribosome binding to regulate mRNA translation. Te question thus arises of how post-transcriptional global regulators execute hundreds of interactions with targets of variable sequence and structural features while retaining specifcity. Although mutation followed by biochemical footprinting, gel shif, and reporter assays are typically used to assess how such features may con- tribute to regulator-target interactions25–27, these are low throughput. For this reason, omics techniques and, in particular, co-immunoprecipitation studies have proven helpful for establishing pools of a post-transcriptional regulator’s potential targets3,5,28. Tese studies, however, cannot assess how specifc molecular features of a target mRNA may infuence its recognition and control by the regulator. Termodynamic modeling approaches ofer higher-throughput opportunities to test the impact of specifc molecular features hypothesized to be important on regulator-target interactions. For instance, thermody- namic models have been used to characterize the energetics of single post-transcriptional regulator-target, e.g., sRNA-mRNA, interactions and predict energetically likely mRNA targets from bacterial genomes29. Tese models have also begun to incorporate estimation of a regulator’s efect on RNA target expression30–32 and to predict local RNA accessibility patterns that indicate potential areas for RNA interactions33. Recent advances in thermody- namic modeling of translation initiation34–37 have further expanded the ability to calculate a regulator’s impact on the expression of its mRNA targets by bridging molecular interactions with cellular translation. In this work, we develop, validate, and apply a biophysical, statistical thermodynamic model of CsrA-mRNA targets to investigate how the molecular features of a target mRNA may impact its binding and regulation by CsrA. We employ the E. coli CsrA protein (Fig. 1A) as a model post-transcriptional regulator given that its binding and regulation of several mRNA targets has been well-characterized. Moreover, we investigate this system because, as described above, its well-characterized mRNA targets demonstrate variance in the sequence and structural features typically considered as hallmarks of CsrA regulatory interactions. Tis experimentally-demonstrated fexibility in mRNA recognition17,19,38 raises questions as to how molecular diversity of target interactions could enable diverse control schemes to support the large proposed scope and known complexity of the CsrA target network2,5,39. Previous models of CsrA-mRNA interactions have focused on genome-wide identification of potential mRNAs bound by CsrA with variations of the ANGGA consensus binding site sequence18,40,41. Specifcally, work by McKee et al. identifed E. coli genes with a [A/C/U]A[A/G/U]GGA[A/G/U][A/C/U] version of the CsrA bind- ing motif within a 22 nucleotide window upstream of their translation start site40. Similarly, Kulkarni et al. pub- lished an algorithm that identifed genes containing an A(N)GGA sequence (where N is any or a gap nucleotide) in a window 30 nucleotides upstream to 5 nucleotides downstream of their translation start sites. Additionally, the SCIENTIFIC REPORTS | (2018) 8:9892 | DOI:10.1038/s41598-018-27474-2 2 www.nature.com/scientificreports/ number and spacing of total available A(N)GGA-like binding sites were considered. It was required that within a window from its transcription start site to the 30th nucleotide of coding sequence, the gene must include at least 3 total A(N)GGA sequences within 10–60 nucleotides of each other or 2A(N)GGA sequences, each with a degenerate GGA-like sequence within a 10 nucleotide distance41. Given the detailed, mechanistic characterizations of several CsrA-mRNA interactions, termed “classical” tar- gets20,21,23,25,39,42–47 (defned in Table 1), and the recent expansion of the well-characterized CsrA mRNA target repertoire5,6, we expand the previous sequence-based approaches to build a model with greater molecular-level resolution. Specifcally,