Determination and Dissection of DNA-Binding Specificity for The

International Journal of Molecular Sciences Article Determination and Dissection of DNA-Binding Specificity for the Thermus thermophilus HB8 Transcriptional Regulator TTHB099 Kristi Moncja and Michael W. Van Dyke * Department of Chemistry and Biochemistry, Kennesaw State University, Kennesaw, GA 30144, USA; [email protected] * Correspondence: [email protected]; Tel.: +1-470-578-2793 Received: 14 October 2020; Accepted: 24 October 2020; Published: 26 October 2020 Abstract: Transcription factors (TFs) have been extensively researched in certain well-studied organisms, but far less so in others. Following the whole-genome sequencing of a new organism, TFs are typically identified through their homology with related proteins in other organisms. However, recent findings demonstrate that structurally similar TFs from distantly related bacteria are not usually evolutionary orthologs. Here we explore TTHB099, a cAMP receptor protein (CRP)-family TF from the extremophile Thermus thermophilus HB8. Using the in vitro iterative selection method Restriction Endonuclease Protection, Selection and Amplification (REPSA), we identified the preferred DNA-binding motif for TTHB099, 50–TGT(A/g)NBSYRSVN(T/c)ACA–30, and mapped potential binding sites and regulated genes within the T. thermophilus HB8 genome. Comparisons with expression profile data in TTHB099-deficient and wild type strains suggested that, unlike E. coli CRP (CRPEc), TTHB099 does not have a simple regulatory mechanism. However, we hypothesize that TTHB099 can be a dual-regulator similar to CRPEc. Keywords: bioinformatics; biolayer interferometry (BLI); electrophoretic mobility shift assay (EMSA); extremophile; protein-DNA binding; type IIS restriction endonuclease 1. Introduction Transcription factors (TFs) are DNA-binding proteins that allow for modulation of transcription initiation in response to intracellular and extracellular changes. Over decades of research, there have been many advances in exploring the TFs regulatory mechanisms cells use to control their gene expression. However, technological innovations such as massively parallel sequencing and data sciences have expanded our interest in new model organisms and their adaptations. TFs are trans factors that bind to cis-regulatory elements, promoter or enhancer sequences known as TF binding sites (TFBSs). It has been reported that most of the bacterial TFBSs are found in the proximal region (about 100 to +20 bp from the transcription start site [TSS]) and distal regions (up to − 200 from TSS) [1–3]. Functionally, TFs are categorized into activators and suppressors, with a few of − them being dual-regulators [4]. Regarding the number of genes regulated, TFs are classified into local or global regulators [5]. Such characteristics make up the mechanism of transcription regulation and help identify novel TFs. Proteomic studies allow the grouping of TFs into families based on structural comparison studies. However, new findings have shown that structurally similar TFs from distantly related bacteria are not usually evolutionary orthologs [6]. A more comprehensive characterization of the TF regulatory network is achieved by identifying the TFBSs, the genes regulated, and the method of regulation. Advances in computational biology and data processing have given rise to inclusive databases that can Int. J. Mol. Sci. 2020, 21, 7929; doi:10.3390/ijms21217929 www.mdpi.com/journal/ijms Int.Int. J. Mol. Sci. 20202020, 2211, x 7929 FOR PEER REVIEW 22 ofof 1818 databases that can predict structure and function for TFs in new model organisms [7]. However, most predictof these structure databases and are function built from for experimental TFs in new model studies. organisms [7]. However, most of these databases are builtTo fromgain experimentalinsights into studies.transcriptional regulatory networks in extremophile organisms, our laboratoryTo gain has insights employed into transcriptional a novel biochemistry regulatory-based networks method, in extremophile Restriction organisms,Endonuclease, our laboratorySelection, hasProtection, employed and a novelAmplification biochemistry-based (REPSA), to method, characterize Restriction several Endonuclease, TFs in the extrem Selection,e therm Protection,ophilic andmodel Amplification organism Thermus (REPSA), thermophilus to characterize HB8. To severaldate, w TFse have in studied the extreme four tetracycline thermophilic repressor model organismprotein (TetRThermus) family thermophilus transcriptionalHB8. suppressors To date, we and have have studied successfully four tetracycline identified their repressor TFBSs protein [8–11]. (TetR)Commonly family, suppressors transcriptional bind DNA suppressors in the absence and have of small successfully-molecule identifiedmodulators/cofactors their TFBSs and [8 –with11]. Commonly,high-affinity. suppressors Contrary, numerous bind DNA transcriptional in the absence activators of small-molecule employ small modulators-molecule /modulatorscofactors and in withorder high-a to bindffi nity.DNA Contrary,, thus complicating numerous their transcriptional analysis in activators vitro. employ small-molecule modulators in orderIn this to bind study, DNA, we thus explore complicating the utility their of analysis REPSA in vitro.to identify and characterize a potential thermophilicIn this study,transcriptional we explore activator, the utility TTHB099. of REPSA Protein tosequence identify homology and characterize analysis indicates a potential that thermophilicTTHB099 is one transcriptional of the four cAMP activator, receptor TTHB099. protein Protein(CRP) family sequence members homology (TTHA1437, analysis TTHA1359, indicates thatTTHB099, TTHB099 and is TTHA1567) one of the four in T cAMP. thermophilus receptor HB8 protein and (CRP)should family bind memberspalindromic (TTHA1437, DNA sequences TTHA1359, as a TTHB099,homodimer and [12 TTHA1567)]. However, in despiteT. thermophilus having a HB8cAMP and binding should domain, bind palindromic it does not DNArequire sequences this cofactor as a homodimerto bind DNA. [12 Here,]. However, we identified despite havingthe preferred a cAMP DNA binding-binding domain, sequence it does for not TTHB099 require this as the cofactor 16-mer to bindmotif: DNA. 5′–TGT(A/g)n(t/c)c(t/c)(a/g)g(a/g)n(T/c)ACA Here, we identified the preferred DNA-binding–3′. Furthermore, sequence for TTHB099we used as binding the 16-mer kinetics motif: 5studies0–TGT(A and/g)n(t mRNA/c)c(t expression/c)(a/g)g(a/ g)n(Tdata to/c)ACA–3 validate0 .potential Furthermore, biological we used roles binding of TTHB099. kinetics studies and mRNA expression data to validate potential biological roles of TTHB099. 2. Results 2. Results 2.1. Preferred TTHB099-Binding Sequences Selected Via REPSA 2.1. Preferred TTHB099-Binding Sequences Selected Via REPSA REPSA was used to select for TTHB099-binding sites present in a large pool (~60 billion molecules)REPSA wasof synthesize used to selectd double for TTHB099-binding-stranded DNA sites. Our present selection in a large library pool (~60, ST2R24 billion, molecules)has been ofsuccessfully synthesized used double-stranded in four previous DNA. studies Our selection [8–11]. library, IRDye ST2R24,® 700 (IRD7) has been-labeled successfully library used DNA in fourwas ® previousincubated studies with purified [8–11]. IRDye TTHB099700 protein (IRD7)-labeled to permit library specific DNA binding, was incubated then challenged with purified by a TTHB099 type IIS proteinrestriction to permitendonuclease specific (IISRE) binding,. Sequence then challenged-specific binding by a type of IISTTHB099 restriction to a endonuclease subset of the (IISRE).library Sequence-specificprotected those oligonucleotides binding of TTHB099 from endonuclease to a subset of activity the library, thereby protected permitting those their oligonucleotides amplification fromby PCR. endonuclease Seven rounds activity, of binding, thereby permittingIISRE cleavage their, amplificationand PCR res byulted PCR. in Seventhe enrichment rounds of binding,of DNA IISREresistant cleavage, to IISRE and cleavage PCR resulted when in TTHB099 the enrichment was present of DNA (Figure resistant 1, Round to IISRE 7) cleavage. Note that when in TTHB099Round 4, wassubstantial present uncut (Figure DNA1, Round appeared 7). Noteon the that IISRE in Roundcontrol 4,lane substantial (–/F) as well uncut as DNAthe test appeared lane (+/F) on. This the IISREnonspecific control cleavage lane (–/ F)inhibiti as wellon ashas the been test observed lane (+/F). before This and nonspecific has been cleavage ascribed inhibition to the selection has been of observedFokI cleavage before-resistant and has sequences been ascribed [8,13]. to Thus, the selection subsequent of FokIrounds cleavage-resistant of REPSA were sequencesperformed [ 8with,13]. Thus,an alternative subsequent, albeit rounds less of efficient REPSA IISRE wereperformed (BpmI). with an alternative, albeit less efficient IISRE (BpmI). Round 1 Round 2 Round 3 Round 4 Round 5 Round 6 Round 7 –/– –/F +/F –/– –/F +/F –/– –/F +/F –/– –/F +/F –/– –/B +/B –/– –/B +/B –/– –/B +/B 099 / RE - T - X - D - P FigureFigure 1.1. Selection of TTHB099-bindingTTHB099-binding DNA sequences.sequences. Shown Shown are are IR fluorescencefluorescence images ofof restrictionrestriction endonucleaseendonuclease cleavage-protection cleavage-protection assays assays made made during during Rounds Rounds 1–7 of1–7 REPSA of REPSA selection selection with 50.6with nM 50.6 TTHB099 nM TTHB099 protein. protein The presence. The presence (+) or absence

Determination and Dissection of DNA-Binding Specificity for The

DNA Sequencing and Sorting: Identifying Genetic Variations

Escherichia Coli: Protein Families and Binding Sites M

Regulondb (Version 6.0): Gene Regulation Model Of

Description Data File Programa De Genómica Computacional / Centro De Ciencias Genómicas

A Tool for Detecting Base Mis-Calls in Multiple Sequence Alignments by Semi-Automatic Chromatogram Inspection

A Brief History of Sequence Logos

Bioinformatics Manual Sequence Data Analysis with CLC Main Workbench

A STAT Protein Domain That Determines DNA Sequence Recognition Suggests a Novel DNA-Binding Domain

Protein-DNA Interactions II

A Database of Regulatory Networks in Gamma-Proteobacterial Genomes Abel D

Bioinformatics I Sanger Sequence Analysis

Identification of Factors That Interact with the E IA-Inducible Adenovirus E3 Promoter