Systematic approach for dissecting the molecular mechanisms of transcriptional regulation in bacteria

Nathan M. Belliveau^a, Stephanie L. Barnes^a, William T. Ireland^b, Daniel L. Jones^c, Michael J. Sweredoski^d, Annie Moradian^d, Sonja Hess^d,1, Justin B. Kinney^e, and Rob Phillips^a,b,f,2

^a Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125
^b Department of Physics, California Institute of Technology, Pasadena, CA 91125
^c Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, 751 24 Uppsala, Sweden
^d Proteome Exploration Laboratory, Beckman Institute, California Institute of Technology, Pasadena, CA 91125
^e Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724
^f Department of Applied Physics, California Institute of Technology, Pasadena, CA 91125

Edited by Curtis G. Callan Jr., Princeton University, Princeton, NJ, and approved April 6, 2018 (received for review December 19, 2017)

BIOPHYSICS AND COMPUTATIONAL BIOLOGY

Gene regulation is one of the most ubiquitous processes in biology. However, while the catalog of bacterial genomes continues to expand rapidly, we remain ignorant about how almost all of the genes in these genomes are regulated. At present, characterizing the molecular mechanisms by which individual regulatory sequences operate requires focused efforts using low-throughput methods. Here, we take a first step toward multipromoter dissection and show how a combination of massively parallel reporter assays, mass spectrometry, and information-theoretic modeling can be used to dissect multiple bacterial promoters in a systematic way. We show this approach on both well-studied and previously uncharacterized promoters in the enteric bacterium Escherichia coli. In all cases, we recover nucleotide-resolution models of promoter mechanism. For some promoters, including previously unannotated ones, the approach allowed us to further extract quantitative biophysical models describing input–output relationships. Given the generality of the approach presented here, it opens up the possibility of quantitatively dissecting the mechanisms of promoter function in E. coli and a wide range of other bacteria.

gene regulation | massively parallel reporter assay | quantitative models | DNA affinity chromatography | mass spectrometry

Significance

Organisms must constantly make regulatory decisions in response to a change in cellular state or environment. However, while the catalog of genomes expands rapidly, we remain ignorant about how the genes in these genomes are regulated. Here, we show how a massively parallel reporter assay, Sort-Seq, and information-theoretic modeling can be used to identify regulatory sequences. We then use chromatography and mass spectrometry to identify the regulatory proteins that bind these sequences. The approach results in quantitative base pair-resolution models of promoter mechanism and was shown in both well-characterized and unannotated promoters in Escherichia coli. Given the generality of the approach, it opens up the possibility of quantitatively dissecting the mechanisms of promoter function in a wide range of bacteria.

Author contributions: N.M.B., D.L.J., and R.P. designed research; N.M.B., S.L.B., D.L.J., M.J.S., A.M., S.H., and J.B.K. performed research; N.M.B., W.T.I., M.J.S., A.M., and J.B.K. analyzed data; N.M.B., S.L.B., and R.P. provided conceptualization; N.M.B., S.L.B., W.T.I., D.L.J., and J.B.K. provided methodology; N.M.B. provided investigation; N.M.B., W.T.I., D.L.J., M.J.S., and J.B.K. provided software; N.M.B., W.T.I., M.J.S., and J.B.K. provided validation; S.H. and R.P. acquired funding; N.M.B., S.L.B., J.B.K., and R.P. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).

Data deposition: Raw sequencing files have been deposited on the NCBI Sequence Read Archive, https://www.ncbi.nlm.nih.gov/sra (accession no. SRP121362) and Sort-Seq regulatory sequencing data from Escherichia coli has been deposited in NCBI BioSample, https://www.ncbi.nlm.nih.gov/biosample (accession no. SAMN07830099). Thermo RAW mass spectrometry files have been deposited in the jPOST Repository, https://repository.jpostdb.org (accession no. PXD007892). Files containing processed data and Python code associated with data analysis and plotting have been deposited on GitHub and Zenodo (available at https://www.github.com/RPGroup-PBoC/sortseq_belliveau and https://doi.org/10.5281/zenodo.1184169, respectively).

1 Present address: Antibody Discovery and Protein Engineering, MedImmune, Gaithersburg, MD 20878.

2 To whom correspondence should be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1722055115/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1722055115

The sequencing revolution has left in its wake an enormous challenge: the rapidly expanding catalog of sequenced genomes is far outpacing a sequence-level understanding of how the genes in these genomes are regulated. This ignorance extends from viruses to bacteria to archaea to eukaryotes. Even in Escherichia coli, the model organism in which transcriptional regulation is best understood, we still have no indication if or how more than one-half of the genes are regulated (SI Appendix, Fig. S1) [RegulonDB (1) or EcoCyc (2)]. In other model bacteria, such as Bacillus subtilis, Caulobacter crescentus, Vibrio harveyi, or Pseudomonas aeruginosa, far fewer genes have established regulatory mechanisms (3–5).

New approaches are needed for studying regulatory architecture in these bacteria and others. Chromatin immunoprecipitation (ChIP) and other high-throughput techniques are increasingly being used to study gene regulation in E. coli (6–11), but these methods are incapable of revealing either the nucleotide-resolution location of all functional transcription factor binding sites or the way in which interactions between DNA-bound transcription factors and RNA polymerase (RNAP) modulate transcription. Although an arsenal of now classic genetic and biochemical methods has been developed for dissecting promoter function at individual bacterial promoters [reviewed in the work by Minchin and Busby (12)], these methods are not readily parallelized and often require purification of promoter-specific regulatory proteins.

In recent years, a variety of massively parallel reporter assays have been developed for dissecting the functional architecture of transcriptional regulatory sequences in bacteria, yeast, and metazoans. These technologies have been used to infer biophysical models of well-studied loci, to characterize synthetic promoters constructed from known binding sites, and to search for new transcriptional regulatory sequences (13–19). CRISPR assays have also shown promise for identifying longer-range enhancer–promoter interactions in mammalian cells (20). However, no approach for using massively parallel reporter technologies to decipher the functional mechanisms of previously uncharacterized regulatory sequences has yet been established.

Here, we take a first step toward quantitative, multipromoter dissection and describe a systematic approach for identifying the functional architecture of previously uncharacterized bacterial promoters at nucleotide resolution using a combination of genetic, functional, and biochemical measurements. A massively parallel reporter assay [Sort-Seq (13)] is performed on a promoter in multiple growth conditions to identify functional transcription factor binding sites. DNA affinity chromatography and mass spectrometry (21, 22) are then used to identify the regulatory proteins that recognize these sites. In this way, one is able to identify both the functional transcription factor binding sites and cognate transcription factors in previously unstudied promoters. Subsequent massively parallel assays are then performed in gene deletion strains to provide additional validation of the identified regulators. The reporter data thus generated are also used to infer sequence-dependent quantitative models of transcriptional regulation. In what follows, we first illustrate the overarching logic of our approach through application to four previously annotated promoters: lacZYA, relBE, marRAB, and yebG. We then apply this strategy to the previously uncharacterized promoters of purT, xylE, and dgoRKADT, showing the ability to go from regulatory ignorance to explicit quantitative models of a promoter’s input–output behavior.

Fig. 1. Overview of the approach to characterize transcriptional regulatory DNA using Sort-Seq and mass spectrometry. (A) Schematic of Sort-Seq.

Results

To dissect how a promoter is regulated, we begin by performing Sort-Seq (13). As shown in Fig. 1A, Sort-Seq works by first generating a library of cells, each of which contains a mutated promoter that drives expression of green fluorescent protein (GFP) from a low copy plasmid [5–10 copies per cell (23)] and provides a readout of transcriptional state. We use fluorescence-activated cell sorting (FACS) to sort cells into multiple bins gated by their fluorescence level and then sequence the mutated plasmids from each bin. We found it sufficient