A Thesis
entitled
Naturally-Occurring Fusion Between the Regulatory and Catalytic Components of Type
IIP Restriction-Modification Systems
by
Jixiao Liang
Submitted to the Graduate Faculty as partial fulfillment of the requirements for the
Master of Science Degree in Biomedical Sciences
______Dr. Robert Blumenthal, Committee Chair
______Dr. Steve Patrick, Committee Member
______Dr. Jason Huntley, Committee Member
______Dr. Patricia R. Komuniecki, Dean College of Graduate Studies
The University of Toledo
December 2013
Copyright 2013, Jixiao Liang
This document is copyrighted material. Under copyright law, no parts of this document may be reproduced without the expressed permission of the author.
An Abstract of
Naturally-Occurring Fusion Between the Regulatory and Catalytic Components of Type IIP Restriction-Modification Systems
by
Jixiao Liang
Submitted to the Graduate Faculty as partial fulfillment of the requirements for the Master of Science Degree in Biomedical Sciences
The University of Toledo
December 2013
Restriction-modification (R-M) systems play key roles in controlling gene flow among bacteria and archaea, and their own genetic mobility depends critically on their regulation, but the regulation of these systems is poorly understood. The PvuII R-M system is a Type IIP R-M system in that the protective DNA methyltransferase (MTase) is a separate and independently-active protein from the potentially lethal restriction endonuclease (REase). PvuII is one of the best studied of the R-M systems that use a positive feedback regulatory loop, involving a transcriptional regulator called C protein, to delay expression of the REase relative to that of the MTase. This allows protective methylation of a new host cell’s DNA before the REase is produced. In searching for R-M systems related to PvuII, in order to study evolution and variation of its regulatory system, a putative system was found in the genome sequence of the bacterium Niabella soli strain DSM 19437, in which the regulatory C protein and the REase are translationally fused. The hypothesis is that N. soli truly produces a fused C-R protein, and that it is active both as a REase and as an autogenous regulator. The genes for the N. soli R-M system were synthesized, the fusion protein was produced and purified with affinity tags, and the production of full-length C-REase fusion protein was confirmed. The dual activity of the fusion protein was determined by in vitro restriction of known DNAs, and in vivo transcriptional activation of a lacZ fusion to the promoter on which the C protein acts.
This work is dedicated to my parents, Zhao-jun Liang and Gui-ying Xu for their love and support.
Acknowledgements
This thesis and the associated research would not have been possible without the ever-patient guidance of my mentor and major advisor, Dr. Robert Blumenthal. I would like to express my sincere gratitude to him for his continuous support of my graduate study and research, and for his patience and encouragement. He recognizes my strengths and weaknesses, which keeps me motivated. I am also grateful for all his advice about life, career and everything else.
I would additionally like to thank my committee members, Dr. Jason Huntley and Dr. Steve Patrick, for their valuable time, constructive suggestions, and criticisms during my study.
Further, for her constant support as an instructor in lab and a friend in life, I would like to sincerely thank my lab mate Dr. Kristen Williams. Also, my friends Dr. Guo-ping Ren and Dr. Gang Ren have offered me valuable advice and help on my experiments. Last but not least, I would like to thank all the students, faculty, and staff in the Medical Microbiology and Immunology Department. Thank you all!
Table of Contents
Abstract
Acknowledgements
Table of Contents
List of Figures
Chapter 1: Literature Review
Chapter 2: Materials and Methods
Chapter 3: Results
Chapter 4: Discussion and Conclusion
References
List of Figures
Figure 1 Complex formed by R.PvuII and its cognate DNA.
Figure 2 PvuII R-M system control region.
Figure 3 Structure of C.AhdI.
Figure 4 Sequence of synthesized NsoJS138I R-M system.
Figure 5 Vector map of constructed plasmids.
Figure 6 Alignment of CR fusion proteins orthologous to C.PvuII and R.PvuII.
Figure 7 Test of CR fusion protein production.
Figure 8 Test of CR fusion protein production.
Figure 9 Assessment of REase activity in CR.NsoJS138I.
Figure 10 Confirmation of specific digestion conditions.
Figure 11 Assessment of C activity in CR.NsoJS138I.
Figure 12 Possible interactions of C-REase fusion polypeptides.
Chapter 1
Literature Review
1. Restriction-modification (R-M) systems
The biological phenomena of restriction and modification were first recognized in the early 1950s, and the first R-M system was cloned in E. coli in the late 1970s [1]. R-M systems are present in the great majority of bacteria and archaea, with more than 3000 found to date (most by detecting MTase gene sequences) [2]. As the term indicates, a typical R-M system comprises two activities: a restriction endonuclease (REase) that cleaves DNA at a target sequence, and a methyltransferase (MTase) that modifies the same sequence to protect it from the cognate REase [2]. Four broad types of R-M systems have been reported so far, each with unique characteristics, and in some systems the two enzymes are combined into a single multi-subunit protein [3]. However, in Type IIP R-M systems, the REase and MTase separately execute their opposing intracellular enzymatic activities [3].
1.1 Restriction Endonuclease (REase)
The REase catalyzes the cleavage of double-stranded DNA, generally on both strands. REases recognize specific sequences on the target DNA, and the cleavage occurs via hydrolysis of one phosphate-deoxyribose bond in the backbone of each DNA strand [4]. Typically, such enzymatic activity takes place without energy input, but commonly requires Mg2+ or a similar divalent cation; some REases also require, or are stimulated by, ATP or S-adenosylmethionine (AdoMet) [5]. REases appear to have very diverse evolutionary origins, and are difficult to identify from their sequences alone [6-8].
1.2 Modification Methyltransferase (MTase)
REase cleavage of DNA could be lethal to cells producing R-M systems. To protect endogenous DNA from REase, the paired (cognate) MTase catalyzes addition of a methyl group to one nucleotide in each strand of the recognition sequence, with the identities and positions varying from MTase to MTase [9]. AdoMet always serves as the methyl donor and is thus an essential cofactor for methylation [10]. The sensitivity of the REase of R-M systems to methylation on the recognition sequences usually prevents cleavage of endogenous DNA. However, while cleavage can be prevented by the cognate methylation, noncognate methylation occurring elsewhere in the recognition sequence may or may not prevent the cleavage [11].
1.3 Types of restriction modification systems
R-M systems are classified based on enzyme composition and cofactor requirements, recognition sequence symmetry, and cleavage position [3, 12]. Because my research defines a new subtype of R-M system, in which the REase and regulatory C protein are fused, it is appropriate to describe the various known types of R-M systems.
1.3.1 Type I Systems
Type I systems are considered the most complex R-M systems, as they consist of three polypeptides: R (Restriction), M (Modification) and S (Specificity). These form a complex that can both cleave and methylate DNA in an energy (ATP) dependent manner. Based on screening of the present database of complete genomic sequences, about half of bacterial genomes contain closely linked genes that are predicted to code for these three polypeptides [13]. Furthermore, because cleavage in most cases occurs at a considerable distance from the recognition site, it is difficult to visualize discrete bands by gel electrophoresis [14]. These enzymes thus have substantial biological significance, but have not yet found major biotechnological uses.
1.3.2 Type II Systems
Type II systems are believed to be the simplest and most prevalent R-M systems. As opposed to Type I systems, the Type II REase and MTase act independently without the need for a specificity protein, and each has its own simple catalytic requirement: the REase requires Mg2+ (or a similar divalent cation) and the MTase requires AdoMet [14]. Type IIP REases are generally active as homodimers, while most Type II MTases act as monomers in catalyzing the addition of methyl groups to the cognate DNA [14, 15]. Early on it was recognized that while typical Type II enzymes recognized palindromic sequences and cleaved symmetrically within them, the Type IIS enzymes cut outside their normally asymmetric sequences and differed in other interesting ways [16]. There are many subdivisions of Type II enzymes, classified based on their recognition and cleavage differences [3]. Specifically, some of the criteria are based on the sequence cleaved and others on the structure of the enzymes themselves, so not all subdivisions are mutually exclusive [3]. Type IIP designates the enzymes that recognize symmetric sequences (palindromes) [3]. Some new subclasses of Type II R-M systems involve fusion of components, such as between the REase and MTase [17-20].
1.3.3 Type III and Type IV Systems
Type III MTase and REase form a complex that carries out both modification and cleavage [21]. Similar to Type II systems, Mg2+ and AdoMet are essential cofactors for Type III REase and MTase, respectively; in the presence of these cofactors, the complex formed from REase and MTase competes internally for modifying and restricting at the same DNA position [22]. As a consequence, incomplete digestions are typical [14]. The Type IV REases cleave only modified DNA, which contains methylated, hydroxymethylated or glucosyl-hydroxymethylated bases [3]. However, their recognition sequences have usually not been well defined except for EcoK-McrBC, and cleavage occurs ~30 bp away from one of the sites [3]. The Escherichia coli McrBC enzyme, the best studied of the Type IV REases and the only one that is commercially available, requires two purine methylcytosine/hydroxymethylcytosine sites separated by 40–3000 base pairs for cleavage [23].
2. Roles and Control of R-M systems
One major function of R-M systems is to protect bacterial cells from bacteriophage infection or invasion by foreign DNA [24]. In addition to being bacterial defense systems, R-M systems serve a diverse range of functions such as stabilization of genomic islands, maintenance of bacterial fitness and nutrition, immigration control, recombination and genome rearrangement, evolution of genomes, enforcing methylation on the genome and so forth [25].
Lethal DNA damage would occur if the two R-M enzymatic activities (MTase and REase) were unbalanced [26]. This is particularly true when R-M genes first enter a new host cell that has completely unmethylated DNA [27]. Therefore, a timing delay between expression of the MTase and REase is expected in Type IIP systems, and such a delay has been documented in PvuII [28]. Specifically, there is a ~10-min delay between the appearance of MTase and REase transcripts and activities [28]. This finding improves our understanding of the mobility of R-M systems.
3. PvuII R-M system and its regulatory characteristics
3.1 Overview of PvuII
PvuII was discovered [29] and then cloned into E. coli from its original host Proteus vulgaris about three decades ago [30]. Since then, it has been the subject of many regulatory studies [31-34]. This system was also the first R-M system to have had both the REase [35, 36] and MTase [37] structures crystallographically determined. Because this study reports a REase-C protein fusion, it is important to discuss the structures of those two components.
3.2 Structure and function of PvuII restriction endonuclease
Figure 1. Complex formed by R.PvuII and its cognate DNA. In this view, the enzyme is in ribbons representation in purple, with the DNA strands in green and cyan. The amino termini of the two REase subunits are at the right. The image is structure 1EYU of the Protein Data Bank (managed by the Research Collaboratory for Structural Bioinformatics). The image is in the public domain.
X-ray crystallography has shown that active PvuII endonuclease is a homodimeric protein, with the subunit interface region consisting of a pseudo three-helix bundle at the amino end [35]. Three regions have been identified in R.PvuII, namely the subunit interface region, catalytic region and DNA recognition region. The recognition sequence for R.PvuII cleavage is CAG↓CTG, and such cleavage is prevented by N4-methylcytosine (yielding CAGN4mCTG), generated by its cognate methyltransferase [27].
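To make the recognition and cleavage rule concrete, the following is a minimal Python sketch that locates CAG↓CTG sites in a linear DNA sequence and computes the resulting blunt-ended fragment lengths. The function names and the demonstration sequence are hypothetical and purely illustrative; the sketch ignores methylation and circular templates.

```python
def find_cut_positions(seq, site="CAGCTG", cut_offset=3):
    """Return 0-based cut positions for every occurrence of `site` in `seq`.

    `cut_offset` is the number of bases of the site left of the cut
    (3 for CAG|CTG, which yields blunt ends).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    return cuts


def fragment_lengths(seq, cuts):
    """Lengths of the fragments of a linear molecule cut at `cuts`."""
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Made-up 24-nt sequence containing two PvuII sites (illustration only).
demo = "AAAACAGCTGTTTTTTCAGCTGGG"
demo_cuts = find_cut_positions(demo)
demo_fragments = fragment_lengths(demo, demo_cuts)
print(demo_cuts, demo_fragments)
```

Applied to a real substrate sequence, the same two functions would give the expected fragment sizes for comparison with a gel, assuming complete digestion.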
3.3 C protein and its regulatory roles
3.3.1 Overview
In addition to the MTase and REase genes, a subset of Type II R-M systems contains regulatory genes. The regulatory C (controller) gene was first discovered in the PvuII [38, 39] and BamHI [40] R-M systems. A milestone in characterizing the PvuII system was the identification of regulatory elements called “C-boxes” between the pvuIIM and pvuIIR genes, which control the timing of REase and MTase expression [28, 39, 41]. C-boxes are where the C protein binds to exert its effects [31]. While the location of the MTase gene varies among R-M systems, in those that have C proteins the C gene is typically upstream of the REase gene [31].
Figure 2. PvuII R-M system control region. Two transcription starts for pvuIICR are identified by rightward bent arrows: from the C-independent weak promoter (left) and C-dependent strong promoter (right) [38]. The two pvuIIM promoters are also shown (leftward bent arrows). Gray wavy lines represent the resulting mRNAs.
3.3.2 C protein-dependent regulatory circuit in PvuII
C proteins (encoded by C genes), where tested, activate transcription of their own gene (‘autogenous’ activation). They are believed to be responsible for the delay in REase activity, since the REase gene typically does not have its own promoter [33] and is completely dependent on transcription from the upstream autogenously regulated C gene [42]. Thus when the R-M genes enter a new cell, and no C protein is present, MTase is expressed while C protein (and REase) are initially produced at very low levels. As C protein accumulates, the positive feedback loop results in a sharp increase in C and REase expression [33, 34]. The C protein acts as both an activator and a repressor, so it can prevent overexpression of the REase [43].
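The way autogenous activation can delay REase expression relative to MTase expression can be illustrated with a toy simulation. The Python sketch below is not the model used in the cited studies: the equations, the Hill-type activation term, and every parameter value are assumptions chosen only to show the qualitative behavior. MTase is made at a constant rate, while C protein (and the co-transcribed REase) is made from a weak basal promoter plus a self-activated promoter. The repressor role of C protein is not modeled.

```python
def simulate(n_steps=30000, dt=0.01, k_m=1.0, basal=0.02,
             vmax=1.0, K=2.0, n=2, deg=0.1):
    """Euler integration of a toy two-variable model (arbitrary units).

    dM/dt = k_m - deg*M                               (constitutive MTase)
    dC/dt = basal + vmax*C^n/(K^n + C^n) - deg*C      (self-activated C/REase)
    All parameter values are hypothetical.
    """
    m = c = 0.0
    times, m_vals, c_vals = [], [], []
    for i in range(n_steps):
        times.append(i * dt)
        m_vals.append(m)
        c_vals.append(c)
        dm = k_m - deg * m
        dc = basal + vmax * c**n / (K**n + c**n) - deg * c
        m += dm * dt
        c += dc * dt
    return times, m_vals, c_vals


def time_to_half_max(times, values):
    """First time at which `values` reaches half of its final value."""
    half = 0.5 * values[-1]
    for t, v in zip(times, values):
        if v >= half:
            return t
    return None


times, m_vals, c_vals = simulate()
t_m = time_to_half_max(times, m_vals)
t_c = time_to_half_max(times, c_vals)
print("MTase half-max time:", t_m)
print("C/REase half-max time:", t_c)
```

With these assumed parameters the C/REase curve stays low while the basal promoter slowly builds up C protein, then rises sharply once self-activation takes over, so its half-maximal time lags well behind that of the MTase.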
3.3.3 C protein structure
Figure 3. Structure of C.AhdI [44]. In this view, the dimeric protein is in ribbons representation. The image is structure 1Y7Y of the Protein Data Bank (managed by the Research Collaboratory for Structural Bioinformatics). The image is in the public domain.
Studies in Type II R-M systems have indicated that C proteins are only active when they form homodimers [44, 45]. The dimerization of C proteins is required for DNA binding and, considering the relatively low stability of the dimer itself, this appears to be an important component of the genetic switch that delays transcription of the C gene, and consequently that of the endonuclease (R) gene transcribed from the same promoter [46]. The regulatory C protein of another R-M system, AhdI, has been crystallized [44], and a high-resolution crystal structure of C.AhdI was described two years later by the same group [47]. The high-resolution structure of C.AhdI reveals a compact, single-domain homodimer that can be classified as an all-alpha protein: 65% of the residues are in a helical conformation with no beta-sheet present [44] (Figure 3).
4. CR fusion protein in Type II R-M systems
The PvuII R-M system is one of the best studied of the group that uses a positive feedback regulatory loop to delay restriction endonuclease (REase) expression with respect to DNA methyltransferase (MTase) expression [43], allowing protective methylation of a new host cell’s DNA before the REase is produced. To better understand the variation in and evolution of this regulatory system, I searched for other R-M systems closely related to PvuII. This work is described under Results; notably, a group of related systems had naturally-occurring fusions between the C and REase proteins. I provide here some background on the considerations underlying my studies on one of these fused systems. Gene fusion is a major contributor to the evolution of multi-domain bacterial proteins, and typically results in one long composite protein in one organism in place of two or more smaller split proteins in another organism [48, 49].
4.1 Identification of the CR fused Type II R-M systems
To search for R-M systems closely related to PvuII, the REase (R.PvuII) amino acid sequence was used as the search seed in TBLASTN [61]. This was done because the C proteins are fairly well conserved [31, 33, 50], and the MTase proteins have well-conserved motifs [24, 51], so using them as search seeds would likely give a higher background of unrelated R-M systems. However, the generally poor conservation of REases implies that two R-M systems would have similar REase sequences only if they are closely related. One fused polypeptide with portions similar to both C.PvuII and R.PvuII was found in the bacterium Niabella soli.
4.1.1 Overview of Niabella soli
The genus Niabella was proposed by Kim et al. (2007) [52] for a bacterium isolated from soil. This genus was characterized as Gram-negative, aerobic, non-flagellated, flexirubin-pigment-producing bacteria that form short rods. Shortly after that, a dark yellow-colored bacterium, JS13-8T, was isolated from a soil sample from Jeju Island, Republic of Korea [53]. The cells were aerobic, Gram-negative, non-motile, short rods. Growth occurred at 15–35 °C (optimally at 30 °C). On the basis of the phylogenetic, physiological and chemotaxonomic data, strain JS13-8T was deemed to represent a novel species of the genus Niabella, for which the name Niabella soli sp. nov. was proposed [53]. Subsequent to our discovery of a fused system in N. soli it was also detected by an automated sequence search by the curators of REBASE [2], which is a continuously updated R-M system database. We have adopted their nomenclature as NsoJS138I for this system, following their entry on April 10, 2013. They performed no biochemical characterization of the R-M system.
4.1.2 Translational frameshifting as a possible mechanism for production of free C protein in such fused systems
C.NsoJS138I and R.NsoJS138I are clearly fused at the sequence level, as described in Results. However, it is possible that a certain amount of free NsoJS138I C or REase protein is produced via translational frameshifting or post-translational processing. Post-translational processing could involve proteolytic cleavage that yields free C and free REase polypeptides. Alternatively, free C protein (but not free REase) could result from ribosomal frameshifting during translation, which can occur when a ribosome encounters certain sequence patterns in the mRNA [54]. Translational frameshifting represents an alternative process of protein translation [55], and occurs much more frequently than was originally expected [56]. For instance, a study of ribosomal frameshifting on the sequence GCAAAA has shown that this pattern is associated with efficient -1 ribosomal frameshifting in Escherichia coli [57].
4.1.3 Novel demonstration of CR fusions in Type II R-M systems
Natural and synthetic fusions of the REase and MTase polypeptides have been observed, and found to be active [17-20]. However, this thesis focuses on naturally-occurring fusions between the REase and the regulatory C protein. Such fusions have been predicted by automated annotation systems, such as REBASE, but have never been tested and shown to be active for either the REase or the C protein components.
Chapter 2
Materials and Methods
Gene synthesis
The sequence containing the complete R-M system of Niabella soli (1837 nt, from the NCBI database; GenBank accession # NZ_AGSA01000028) was obtained from Genscript Inc. (Piscataway, NJ). Some modifications were made to optimize the distribution of restriction sites, but without changing the specified amino acids (Figure 4). The inferred NsoJS138I C-Box and promoter region (161 nt) was also obtained from Genscript, and for cloning purposes the restriction sites XmaI (at C gene end) and BamHI (at R gene end) were appropriately placed.
Cloning strategy
The R-M system Mru1279I (~2.4 kbp) was cloned into the high-copy vector pUC19, using NruI (at CR gene end) and BamHI sites (at M gene end). Genscript synthesized the complete NsoJS138I system, but could only clone it into a low-copy number vector pCC1 (they normally use higher-copy pUC57). This presumably resulted from a frameshift error in the MTase gene that is due to an error in the requested sequence. To avoid the apparent toxicity, a truncated version was subcloned, consisting of only the fused CR gene of NsoJS138I and missing a portion of the COOH-end of the REase (so the MTase would not be required). The truncated NsoJS138ICR was cloned into the pACYCDuet-1 vector (Novagen®), with the N-terminus (C protein end) in-frame with the His-tag (using BamHI and SalI sites), and preceded by a T7 promoter. This plasmid, pJL100, is referred to for readability as “pNsoShort”. Full length NsoJS138ICR was also cloned into this vector, by transforming an E. coli strain containing the pre-expressed PvuII MTase [58], with the NsoJS138ICR COOH-terminus (REase end) in-frame with the His-tag (using the NcoI site); this plasmid was named pJL200 (“pNso”). The truncated product would be ~1.5 kDa less than the full length one.
The synthesized NsoJS138I “C-Box” region was digested with BamHI and XmaI and ligated into pBH403, which is a derivative of pKK232-8 and contains a promoterless lacZ gene between two bidirectional transcription terminators [59], yielding pJL300 (“pBoxLac”). These plasmids are illustrated in Figure 5. The oligonucleotide primers used for PCR amplification are shown below (all in the 5'→3' direction).
Primer set for cloning the complete Mru1279I R-M system:
ggtTCGCGActtccgggtctacacctcaa; ggtGGATCCagccctaaccagccgtaaat
Primer set for making the truncated NsoJS138ICR PCR product for pJL100:
aatGTCGACttatttgggattattaatatccttatcac; aatGGATCCgatgaacgaaccaaatgc
Primer set for making the full length NsoJS138ICR PCR product for pJL200:
cgtCCATGGacaaaagtcttatgccat; cgtCCATGGatgaacgaaccaaatgctta
Figure 4. Sequence of synthesized NsoJS138I R-M system. The initiators of the CR and M genes are in green. The red arrow near the top indicates the position at which the C-REase gene is interrupted in the truncated clone (pJL100, pNsoShort).
Figure 5. Vector map of constructed plasmids.
(A) pJL100 (“pNsoShort”), truncated version of the C-REase fusion with N-terminal His tag;
(B) pJL200 (“pNso”), full-length version of the C-REase fusion with COOH-terminal His tag;
(C) pJL300 (“pBoxLac”), promoter-C-box region fused to promoterless lacZ reporter gene.
Protein expression & purification
Twenty ng of both truncated and full-length versions of NsoJS138ICR DNA were transformed into a BL21 (DE3) E. coli strain (Invitrogen™) that has isopropylthio-β-D-galactoside (IPTG) inducible T7 RNA polymerase expression. Overnight cultures of cells in stationary phase were subcultured at a 1:20 dilution into 250 mL LB medium (as per the QIAexpress® protocol for His tagged protein purification) at 37 °C. IPTG was added to a final concentration of 0.5 mM when the subculture cells reached mid-log phase (OD600 ~0.46). Cells were grown for another 2.5 h before being harvested by centrifugation and frozen at -80 °C. The QIAexpress® Ni-NTA Fast Start Kit was used to purify 6xHis-tagged protein (under native conditions). PMSF protease inhibitor was added (final concentration of 0.5 mM) to the lysis buffer immediately before purification of full length NsoJS138ICR. Purified protein was added immediately to either Diluent B (NEB#B8002S, for protection of REase activity) or 2x SDS PAGE sample buffer (1:1 solution), and stored at -20 °C. Protein concentration was determined by the Pierce 660 nm Assay (Thermo Scientific).
Western blots
Purified proteins were separated by SDS-PAGE (Novex® 10–20% Tris-Glycine gradient gel), and either stained with standard Coomassie blue or blotted onto PVDF membranes at 30 V for 2 h using an Xcell apparatus (Invitrogen). For signal detection, membranes were blocked by incubation at 4 °C overnight in 1% BSA-0.1% Tween-20 in PBS, followed by incubation with a 1:1,000 dilution of mouse anti-His tag monoclonal antibody (Millipore) for 2 h at 4 °C, followed by three 10-min washes. The blots were then incubated with horseradish peroxidase (HRP)-conjugated goat anti-mouse IgG (1:15,000, Invitrogen) for 2 h at room temperature. After three 10-min washes, protein bands were visualized by ECL Plus enhanced chemiluminescence (GE Healthcare) and images were captured using an Alpha Innotech FluorChem HD Imaging System. Minor adjustments of brightness and contrast were carried out to better visualize data, but in all cases the same adjustments were applied to the complete image panel as a whole. The pre-stained MW markers used were SeeBluePlus (Invitrogen).
Restriction activity assay
To assess the enzymatic activity of NsoJS138I REase, bacteriophage lambda DNA (NEB#N3011S) was used as substrate. Restriction enzyme PvuII (NEB#R0151S) was used as a standard control, with the digestion pattern on lambda DNA already known. NsoJS138IR (2.36 µg) or 10 U PvuII were incubated with 1.5 µg of lambda DNA for 1 h at 27, 32, 37 or 42 °C, in four NEB standard buffers for each reaction, and the DNA was resolved on 0.8% agarose gels. Empty pUC19 vector DNA (0.8 µg) was also used as substrate.
Assays for C protein activity
pJL300 was transformed into an IPTG-inducible Tn7 E. coli DE3 strain (Lac-) carrying the NsoJS138IC RM system. The LacZ assay was based on hydrolysis of o-nitrophenyl-β-D-galactoside (ONPG), using Miller units as modified by others [60]. Briefly, β-galactosidase activity and culture density were measured at 20–30 min intervals during exponential growth. The units for this assay were calculated by dividing the measured A420nm (released nitrophenol) by the time allowed for the reaction and by the volume of permeabilized cells used for the reaction. For plots vs. time, culture density (OD600nm) was also in the denominator, yielding standard Miller units. For plots vs. culture density, this term was omitted from the denominator, yielding modified Miller units (1000 × ΔA420nm min⁻¹ ml⁻¹). Specific activity was obtained by determining the slope of a plot of LacZ activity versus the culture density via linear regression.
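The unit calculations described above can be written out as a short Python sketch. The function names and the example readings are hypothetical, and the factor of 1000 in the standard Miller unit is assumed from the usual Miller convention, since the text states it explicitly only for the modified units. The slope is an ordinary least-squares fit.

```python
def lacz_units(a420, minutes, volume_ml, od600=None):
    """LacZ activity from an ONPG assay.

    Without `od600`: modified Miller units, 1000 * A420 / (min * ml).
    With `od600`: standard Miller units, additionally divided by OD600.
    """
    units = 1000.0 * a420 / (minutes * volume_ml)
    if od600 is not None:
        units /= od600
    return units


def regression_slope(xs, ys):
    """Least-squares slope of ys versus xs (e.g. LacZ activity vs OD600)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Made-up readings taken at increasing culture density (illustration only).
od_values = [0.1, 0.2, 0.3]
a420_values = [0.05, 0.10, 0.15]
activities = [lacz_units(a, minutes=10, volume_ml=0.1) for a in a420_values]
specific_activity = regression_slope(od_values, activities)
print(activities, specific_activity)
```

Taking the slope of activity against culture density, rather than a ratio at a single time point, gives a specific activity that does not depend on activity present before the sampling window.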
List of abbreviations used
HRP: horseradish peroxidase
IPTG: isopropylthio-β-D-galactoside
LacZ: β-galactosidase (product of lacZ gene)
MTase: modification DNA methyltransferase
OD: optical density
ONPG: o-nitrophenyl-β-D-galactoside
PMSF: phenylmethylsulfonyl fluoride
REase: restriction endonuclease
RM: restriction-modification
SDS: sodium dodecylsulfate
Chapter 3
Results
1. Identification of C-REase-fused R-M systems
We are interested in how regulation of RM systems varies and evolves. To study this, our lab periodically searches GenBank for sequences related to the PvuII RM system, because many of our studies have focused on that system. As noted above (section 1.1), REases are the most poorly-conserved component of RM systems, so using the R.PvuII amino acid sequence as the search seed ensures that only systems relatively closely related to PvuII will be identified. Sequence searching used TBLASTN [61], with default parameters.
A putative system, Mru1279I, was identified in the genome sequence of the thermophilic bacterium Meiothermus ruber, in which the REase had a long amino-terminal extension relative to R.PvuII that, on further analysis, bore strong resemblance to a C protein (Figure 6). However, studying the catalytic and regulatory effects of the fusion in the Meiothermus system would likely be problematic using clones in the mesophile E. coli: not only the catalytic functions but also DNA binding and subunit associations are adapted to work at much higher temperatures than E. coli can survive [62]. By using the translated sequence of the Meiothermus C-R fusion as the search seed, another hit was found in an organism called Niabella soli (Figure 6). The fact that N. soli is a mesophile makes it easier to perform in vivo studies in E. coli [53], and this type of R-M system with a C-R fusion protein is likely to have unusual regulatory properties that could shed light on the regulation of Type II R-M systems in general.
Figure 6. Alignment of CR fusion proteins orthologous to C.PvuII and R.PvuII. The PvuII system (top line) is unfused, and shown for comparison. Species sources are: Pvu (Proteus vulgaris), Mru (Meiothermus ruber) and Nso (Niabella soli). Annotations refer to (in order) the transcriptional activation, DNA recognition and dimerization interface portions of C protein, and the dimerization, catalytic, and DNA methylation recognizing portions of REase, based on knowledge of the PvuII system. Identities are shaded.
2. Testing for translational frameshifting (free NsoJS138I C protein production).
Considering the fact that no structural or functional studies have been done on C-REase fused Type II R-M systems before, it is quite possible that C protein might be produced separately. Translational frameshifting [63] happens much more frequently than was expected [54]. In particular, translational frameshifting in NsoJS138I is suggested by two features of the DNA sequence in the junction region (Figure 7A): one is a short sequence that has been associated with -1 translational frameshifts [55], and the other is a nearby stop codon in the -1 reading frame.
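The two sequence features named above can be checked computationally. The Python sketch below scans an open reading frame for a slippery motif and then reports the first stop codon downstream of it in the -1 reading frame. It is a simplified illustration: the function name is hypothetical, the demonstration sequence is made up and is not the NsoJS138I junction, and the exact nucleotide at which the ribosome slips within the motif is not modeled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_in_minus1_frame(orf, motif="GCAAAA"):
    """Find `motif` in `orf` and the first -1 frame stop codon after it.

    `orf` must begin at the first base of the start codon, so zero-frame
    codons start at indices 0, 3, 6, ... and -1 frame codons start at
    indices congruent to 2 (mod 3).
    Returns (motif_index, stop_index), (motif_index, None) if no stop is
    found, or None if the motif is absent.
    """
    orf = orf.upper()
    motif_index = orf.find(motif)
    if motif_index == -1:
        return None
    # First -1 frame codon start at or after the motif.
    start = motif_index + ((2 - motif_index) % 3)
    for j in range(start, len(orf) - 2, 3):
        if orf[j:j + 3] in STOP_CODONS:
            return motif_index, j
    return motif_index, None


# Made-up ORF fragment with no zero-frame stop codon (illustration only).
demo_orf = "ATGGCAAAAGCTGACCC"
result = first_stop_in_minus1_frame(demo_orf)
print(result)
```

The distance between the two returned indices indicates how soon after a -1 slip translation would terminate, which in turn sets the expected size of any free C polypeptide.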
High-copy clones of the NsoJS138I R-M system were toxic (see Materials and Methods), presumably due to a frameshift error in the MTase gene. To address this, a truncated version of NsoJS138ICR was subcloned, removing part of the REase COOH end (34 aa; see Figure 4 from Materials and Methods). To test for translational frameshifting, we added a His6 tag to the amino or carboxyl end of the fusion protein, expressed the tagged proteins from a strong inducible promoter, partially purified cell extracts on affinity columns, and resolved the column eluates on SDS-polyacrylamide gels. Figure 7B shows the Coomassie-stained gels next to western blots probed with anti-His6 antiserum, while Figure 8 shows the amino-tagged protein isolated in the presence of protease inhibitor PMSF. Translational frameshifting would result in a ~9 kDa polypeptide in the extracts with amino-tagged fusion (the carboxyl-tagged fusion would only yield smaller protein in the case of proteolytic cleavage), and we see no evidence for that product. Nevertheless, we cannot rule out the possibility that frameshifting occurs in the native host (N. soli), or in E. coli under different growth conditions than those we used.
Figure 7. Test of CR fusion protein production. (A) The sequence spanning the C-REase junction has properties that might result in production of some free C protein (~9 kDa). GCAAAAA has been associated with -1 ribosomal frameshifts (see text for references), and this would result here in termination at a nearby TGA triplet. (B) Production of NsoJS138I C-REase fusion protein (~27.2 kDa), with an amino-terminal (upper) or carboxyl-terminal (lower) His6 tag, was induced using a T7 RNA polymerase-dependent promoter (see Materials and Methods). The upper panels show the results from clones having a small carboxyl-terminal deletion (done in case the REase activity was toxic; size ~25.6 kDa), while the lower panels show full-length clones. Centrifugally-clarified whole-cell extracts were passed over affinity columns to purify the His-tagged polypeptides, and resolved on duplicate 10-20% gradient acrylamide SDS gels. For the lower panels, the extracts were prepared in the presence of protease inhibitor. One gel of each pair was stained (left), the other was electroblotted and probed with anti-His-tag antiserum (see Materials and Methods). Loaded amounts of protein per lane were 2.0 µg (upper, in both lanes), and 3.4, 6.8 and 5.1 µg (lower, left to right).
Figure 8. Test of CR fusion protein production. Lanes, from left, included 4.5 µg protein, MW markers, a control (lysis buffer + PMSF), and 5.5 µg protein. Production of NsoJS138I C-REase fusion protein, with an amino-terminal His6 tag, was induced using a T7 RNA polymerase-dependent promoter (see Methods). The clone had a small carboxyl-terminal deletion (in case the REase activity proved to be toxic). Centrifugally-clarified whole-cell extracts, containing the protease inhibitor PMSF, were passed over affinity columns and resolved on a 10-20% gradient acrylamide SDS gel. The gel was blotted to PVDF, blocked, and probed with HRP-conjugated anti-His tag antibodies. The image on the left was detected under 365/302 nm dual-wavelength illumination to make the markers visible; the image on the right is from chemiluminescence alone.
3. Activity assay for assessing NsoJS138IR activity and specificity
The central question regarding these C-REase fusions is whether or not they are active. There are numerous examples of RM systems, identified through sequence comparisons, that do not produce catalytically active proteins [64, 65]. We focused on two of the fused RM systems, isolating the Meiothermus ruber Mru1279I genes by amplification from genomic DNA (not shown), and having the Niabella soli NsoJS138I genes synthesized. The full-length NsoJS138ICR fragment was ligated into pACYCDuet (His tag at the C terminus, REase end) and cloned into an E. coli strain carrying the pre-expressed PvuII MTase (Figure 5B & Figure 7B lower panel). We were unable to detect REase activity from the M. ruber clones (not shown), possibly due to poor expression in E. coli and/or improper folding of the protein at the lower E. coli growth temperature (37 °C), though cell extracts were tested at the optimum for M. ruber growth (60 °C) [66]. The M. ruber clones were not studied further.
In contrast, extracts from E. coli cultures carrying the N. soli genes gave detectable REase activity (Figures 9, 10) that indicated a specificity indistinguishable from that of PvuII REase. However, the NsoJS138I C-REase fusion exhibited much more stringent activity requirements than R.PvuII when they were tested at four temperatures in each of four buffers (Figures 9, 10). For these studies, I used 10 units of PvuII from a commercial supplier; this is equivalent to ~20 ng of PvuII REase protein [67]. In comparison, 2.4 µg of NsoJS138I CR protein was used (~120× as much by mass). R.PvuII was active in 15/16 tested conditions, while the fusion was active in 5/16 (Figure 9). NsoJS138I was inactive in all four buffers at 27 and 42 °C, and was active in three buffers at 32 °C and two buffers at 37 °C (Figures 9 and 10). Serial dilution indicated that, at 32 °C, NsoJS138I was most active in NEBuffer 3 (not shown). Differences from R.PvuII could be due to the presence of the fused C portion at the amino end of each subunit, to the sequence differences between the PvuII and NsoJS138I REase portions, or to a combination of the two factors. The C-terminal His6 tag might also play a role, though it has little effect on R.PvuII.
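The two pieces of arithmetic in this paragraph (the mass ratio of fusion protein to R.PvuII, and the count of active conditions for the fusion) can be reproduced directly from the stated values. This is a small sketch using only numbers given in the text; the function and variable names are invented for illustration, and the ratio is by mass, not by moles.

```python
def mass_ratio(fusion_ng, reference_ng):
    """Ratio of protein masses used in the two reactions (same units)."""
    return fusion_ng / reference_ng


def active_fraction(active_by_temp, buffers_per_temp=4):
    """Return (active conditions, tested conditions) for a buffer x temperature grid."""
    active = sum(active_by_temp.values())
    tested = buffers_per_temp * len(active_by_temp)
    return active, tested


PVUII_NG = 20      # ~20 ng of R.PvuII protein in 10 units, as stated in the text
FUSION_NG = 2400   # 2.4 µg of NsoJS138I CR protein, expressed in ng

# Number of buffers (out of four) in which the fusion was active at each
# temperature in °C, as stated in the text.
FUSION_ACTIVE_BUFFERS = {27: 0, 32: 3, 37: 2, 42: 0}

if __name__ == "__main__":
    print("mass ratio:", mass_ratio(FUSION_NG, PVUII_NG))
    print("active/tested:", active_fraction(FUSION_ACTIVE_BUFFERS))
```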
Figure 9. Assessment of REase activity in CR.NsoJS138I. CR.NsoJS138I and, for comparison, commercial R.PvuII, were incubated for 1 h with DNA from bacteriophage λ. Four different reaction buffers were used, and in each buffer four temperatures were used. Reactions were resolved on agarose-TBE gels containing ethidium bromide (see Materials and Methods for details). Inset: Left-to-right: markers, uncut pUC19, pUC19 cut with CR.NsoJS138I or PvuII at 37 °C, pUC19 cut with CR.NsoJS138I at 30 °C.
Figure 10. Confirmation of specific digestion conditions. In some buffers, R.PvuII or CR.NsoJS138I was active or inactive at a single temperature (as shown in Figure 9). To confirm this temperature specificity, those two buffer-enzyme combinations were re-tested using 2 µg (upper lanes) or 1.5 µg (lower lanes) of bacteriophage λ DNA. M = size markers; U = uncut DNA. The image is a UV-illuminated agarose gel containing ethidium bromide.
4. In vivo test of CR.NsoJS138I for C protein activity
Detection of C protein activity was via LacZ assays, using a new pBoxLac plasmid (see Figure 5 from Materials and Methods). This assumes that, as in PvuII, the C protein activates expression from its own promoter (in this case boosting β-galactosidase activity) [31, 68-71]. The C protein operators, called C boxes, have recognizable sequences with symmetrical elements upstream of the C ORFs [41-43, 72]. Based on this, I examined this region of the NsoJS138I sequence for putative bacterial promoters [73-75], with the best candidate shown in Figure 11B. This 161 bp sequence was cloned upstream of a promoterless lacZ gene, and transformed into an E. coli strain that also carried ∆lacZ and the nsoJS138ICR gene under control of T7 RNA polymerase (Figure 5C & Figure 11A). In this strain, IPTG induction leads to production of T7 RNA polymerase, which results in production of CR.NsoJS138I (Figure 7B). If CR.NsoJS138I activates the putative promoter region, β-galactosidase (LacZ) activity will be increased. To detect this, IPTG was first added to growing cultures with or without the promoter-lacZ fusion plasmid, and samples taken over time showed a clear induction when the fusion plasmid was present (Figure 11C). In addition, I grew cultures under conditions approximating steady state, where the IPTG (when present) was in the culture medium for at least 10 generations; under these conditions the slope of the activity vs. culture OD plot is a sensitive measure of expression. As shown in Figure 11D, I observed a 23-fold increase in LacZ activity in response to production of CR.NsoJS138I. These results indicate that the fusion is active as a C protein.
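The steady-state comparison described above reduces to fitting a line to LacZ activity versus culture OD for each condition and taking the ratio of the slopes. The following is a minimal sketch of that calculation using ordinary least squares. The data points are hypothetical, chosen only to make the arithmetic easy to follow; they are not the measurements behind Figure 11D and do not reproduce the 23-fold value. The function names are likewise invented for illustration.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def fold_induction(od, induced_activity, uninduced_activity):
    """Ratio of the induced slope to the uninduced slope (same OD samples)."""
    return slope(od, induced_activity) / slope(od, uninduced_activity)


# Hypothetical illustration values (NOT data from this study).
OD = [0.1, 0.2, 0.3, 0.4]            # culture density at each sample
UNINDUCED = [1.0, 2.0, 3.0, 4.0]     # activity without IPTG (slope ~10)
INDUCED = [5.0, 10.0, 15.0, 20.0]    # activity with IPTG (slope ~50)

if __name__ == "__main__":
    print("fold induction:", fold_induction(OD, INDUCED, UNINDUCED))
```

Using slopes rather than single time points means that each culture contributes several measurements, and a poor linear fit flags a culture that was not close to steady state.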
Figure 11. Assessment of C activity in CR.NsoJS138I. A. Schematic design of experiment. Top line indicates the IPTG-inducible gene for T7 RNA polymerase in the host strain’s chromosome, middle indicates a plasmid (called pNso for the figure) that carries the gene for 25CR.NsoJS138I linked to a T7 promoter, and the bottom indicates a plasmid (called pBox-Lac for the figure; this is actually PNsoJS138ICR-lacZ from Figure 5C) that carries the putative promoter and C box region from NsoJS138I linked to a promoterless gene for lacZ (β-galactosidase). B. Sequence of the putative promoter and C box region from NsoJS138I, showing the candidate C boxes (shaded) and promoter elements (-35 and -10 hexamers, or tTGaCA and tAtRaTg). This 161 nt sequence is what was included in pBoxLac. C. Time course of LacZ induction. Growing triplicate cultures of cells containing the indicated plasmids were treated at time = 0 with the inducer IPTG (which in these cells controls the gene for T7 RNA polymerase), and matched control cultures received no IPTG. LacZ activity was measured over time. The symbols indicate means of the triplicate cultures; standard errors are shown but mostly obscured by the symbols. D. Steady-state expression of lacZ. Triplicate cultures containing the plasmids indicated were grown for at least 10 generations in the presence or absence of IPTG, and LacZ activity was measured. In this case, activity is plotted vs. culture density, and so modified Miller units are used. Cultures approximating steady-state growth should give good linear fits, the slopes of which accurately measure relative expression levels. Symbols indicate means of the triplicate cultures; standard errors are shown but are in some cases obscured by the symbols.
Chapter 4
Discussion and Conclusion
While C-REase fusions have not previously been characterized, other types of
REase fusions have. One class, for example, involves natural and synthetic fusions of the
REase and MTase polypeptides [17-20]. This ability to form a variety of active fusions illustrates the remarkable flexibility and modularity of RM systems.
Formation of C-REase fusions
Ribosomes must maintain the translational reading frame in order to convert primary genetic information into polypeptides. However, certain signals embedded in mRNA segments may represent a higher order of information content: when ribosomes encounter them, the reading frame can shift, altering gene expression in a major way. Translational frameshifting happens at an unexpectedly high frequency, and might play vital roles in gene regulation, such as controlling transposition in bacteria (by affecting production of the transposase protein) [76]. However, it is not clear whether frameshifting occurs at a fixed frequency in response to the genetic signal. In other words, the signal sequence has not been shown to be the decisive factor, and external conditions (e.g., growth medium, temperature, pH and host cell) might significantly affect the rate at which it happens.
Despite the lack of evidence in the present study for translational frameshifting (see Figures 7 & 8), the question merits further investigation for several reasons. First, growth conditions can be altered to see whether any translational frameshifting occurs. Second, since the test was carried out with the COOH-truncated NsoJS138ICR, we cannot rule out the possibility, although it seems unlikely, that such frameshifting would occur with the full-length native form. Finally, since this study used a synthetic gene in E. coli as the host cell, it would be worthwhile to obtain the native host, N. soli, if possible, and repeat some of the studies in it.
Implications of C-REase fusions
The occurrence of active translational fusions between REases and regulatory proteins has not previously been reported. This is the first study aimed at understanding how the fusion of regulatory C proteins and restriction endonucleases, which are separately active in most Type II R-M systems, could result in regulatory variation.
Unlike the situation in MTase-REase fusions, both portions of a C-REase fusion (in the case of C and Type IIP REases) can become active only when they form dimers. Since this study has shown that the C-REase fusion is produced and that both parts are active, this arrangement might confer some advantage. There are three possible ways in which the C and REase portions of such a fusion could dimerize.
First, both subunit interfaces (C and REase portions) on one fusion polypeptide may dimerize simultaneously with the corresponding interfaces on a second fusion molecule (Figure 12B). However, this has to occur without violating the symmetry rules, and thus a linker region of sufficient length and flexibility is essential. Although there are no additional amino acids in the C-REase junctions of Mru (CR.Mru1279I) and Nso (CR.NsoJS138I) (relative to PvuII; see Figure 6), it is still possible that both C and REase portions are able to dimerize simultaneously, if the carboxyl-terminal region of the C protein portion and/or the amino-terminal region of the REase portion are sufficiently flexible.
Second, a concatemeric chain might be formed with alternating interfaces (C-R•R-C•C-R…) (Figure 12D); this assumes that the two interfaces may dimerize with two different polypeptides. Although it is not clear whether such chain formation would confer any structural or functional advantage on such fusion proteins, it could still occur at higher protein concentrations, or if the two interfaces have similar Kd values. However, the dimerization interface for R.PvuII is ~2300 Å² [35, 77], while that for a C protein (C.AhdI) is ~1400 Å² [44], suggesting that the REase portion may have a lower Kd than the C protein portion.
The third model, which might be the most interesting one, is that the two portions effectively act as internal (first-order) competitive inhibitors of one another’s dimerization (Figure 12C). This model assumes that, at any given time, a fusion polypeptide is dimerized through either its C portion or its REase portion, yielding either active C protein or active REase. If this is true, competitive dimerization not only means that the two portions cannot be active simultaneously, but also might have implications for the relative timing of MTase and REase appearance after the R-M system genes enter a new host cell. First, I consider effects on the appearance of REase activity. If the C interface is stronger than the REase interface, active REase would appear only at later times, after a higher concentration of fused polypeptide had accumulated. On the other hand, a stronger REase interface (compared to the C interface) would result in the early appearance of small amounts of REase activity, which might require DNA repair. Second, I consider the effects on timing of gene expression. A stronger C interface (relative to REase) would result in an earlier and sharper induction threshold in the positive feedback loop, while a stronger REase interface may lengthen the time needed for the positive feedback loop to cross the threshold for high expression of the fusion gene, giving more time for protective methylation to occur (even if some DNA damage results from the low level of early REase activity). Either way, of the three interaction modes, this competitive dimerization model seems to provide the most obvious (and testable) hypotheses regarding possible selective advantages of forming C-REase fusions.
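One way to make the competitive dimerization model concrete is a simple mass-action calculation. The sketch below is illustrative only and rests on assumptions not tested in this study: equilibrium is reached, each polypeptide pairs through only one interface at a time, and concatemers, DNA binding and kinetics are ignored. The Kd values and total concentrations are hypothetical, in arbitrary units. With monomer concentration m, the C-joined dimer is m²/Kd_C, the REase-joined dimer is m²/Kd_R, and the total polypeptide is m + 2(m²/Kd_C) + 2(m²/Kd_R), which is a quadratic in m.

```python
import math


def competitive_dimers(total, kd_c, kd_r):
    """Equilibrium concentrations (monomer, C-joined dimer, REase-joined dimer).

    Mass-action sketch under the stated assumptions:
        c_dimer = m**2 / kd_c
        r_dimer = m**2 / kd_r
        total   = m + 2*c_dimer + 2*r_dimer
    """
    a = 2.0 * (1.0 / kd_c + 1.0 / kd_r)
    # Positive root of a*m**2 + m - total = 0.
    monomer = (-1.0 + math.sqrt(1.0 + 4.0 * a * total)) / (2.0 * a)
    return monomer, monomer ** 2 / kd_c, monomer ** 2 / kd_r


if __name__ == "__main__":
    # Hypothetical case: the C interface is four-fold stronger (lower Kd).
    for total in (0.1, 1.0, 10.0, 100.0):
        m, c_dimer, r_dimer = competitive_dimers(total, kd_c=1.0, kd_r=4.0)
        print(f"total={total:7.1f}  monomer={m:.3f}  "
              f"C-dimer={c_dimer:.3f}  REase-dimer={r_dimer:.3f}")
```

Under these assumptions, the ratio of the two dimer species is fixed by the ratio of the Kd values, so the dimer formed through the weaker interface reaches any given absolute level only at a higher total polypeptide concentration. Testing the model would require measured Kd values for both interfaces.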
The significance of this work lies not only in the identification and characterization of this novel Type II R-M system bearing the C-R fusion, but also in the perspective it provides on how structural variation of certain proteins could have affected their functional regulation, thus contributing to the understanding of bacterial evolution.
Conclusions: R-M systems closely related to PvuII (as judged by similarity of the REase sequences) have diverse regulatory mechanisms. Most resemble PvuII in having a separate regulatory (C) protein, but some instead have the C protein fused to the REase; one of these fusion proteins, from the bacterium Niabella soli, is active both as a REase and as a C protein. Fusions between C proteins and REases have not previously been characterized. These results reinforce the evidence for modularity among RM system proteins, and raise important questions about the possible selective advantages of C-REase fusion, including implications of these fusions for RM system expression kinetics.
Figure 12. Possible interactions of C-REase fusion polypeptides. A. Unfused systems such as PvuII, where the C protein and REase form separate homodimers. B. Fused system in which the linker between C and REase regions of the polypeptide is long enough and flexible enough to allow simultaneous dimerization at both C and REase subunit interfaces. C. Fused system in which dimerization of the C portion is in competition with dimerization of the REase portion. D. Fused system in which concatemeric chains can form. See text for details.
References
1. Mann MB, Rao RN, Smith HO: Cloning of restriction and modification
genes in E. coli: the HhaII system from Haemophilus haemolyticus.
Gene 1978, 3:97-112.
2. Roberts RJ, Vincze T, Posfai J, Macelis D: REBASE--a database for DNA
restriction and modification: enzymes, genes and genomes. Nucleic Acids Res.
2010 Jan;38(Database issue):D234-6.
3. Roberts RJ, Belfort M, Bestor T, Bhagwat AS, Bickle TA, Bitinaite J, Blumenthal
RM, Degtyarev S, Dryden DT, Dybvig K et al: A nomenclature for restriction
enzymes, DNA methyltransferases, homing endonucleases and their genes.
Nucleic Acids Res 2003, 31(7):1805-1812.
4. Roberts RJ: Restriction endonucleases. CRC Crit Rev Biochem 1976, 4:123-64.
5. Srivani Sistla and Desirazu N. Rao: S-Adenosyl-L-methionine–Dependent
Restriction Enzymes. Critical Reviews in Biochemistry and Molecular Biology,
39:1–19, 2004
6. Kovall RA, Matthews BW: Type II restriction endonucleases: structural,
functional and evolutionary relationships. Curr Opin Chem Biol 1999,
3(5):578-583.
7. Pawlak SD, Radlinska M, Chmiel AA, Bujnicki JM, Skowronek KJ: Inference of
relationships in the 'twilight zone' of homology using a combination of
bioinformatics and site-directed mutagenesis: a case study of restriction
endonucleases Bsp6I and PvuII. Nucleic acids research 2005, 33(2):661-671.
8. Pingoud A, Fuxreiter M, Pingoud V, Wende W: Type II restriction
endonucleases: structure and mechanism. Cell Mol Life Sci 2005, 62(6):685-
707
9. Smith HO, Kelly SV: Methylases of the type II restriction modification
systems. In DNA Methylation: Biochemistry and Biological Significance, ed.
A. Razin, H. Cedar, A. D. Riggs, pp. 39-71. New York: Springer-Verlag 1984
10. Chiang PK, Gordon RK, Tal J, Zeng GC, Doctor BP, Pardhasaradhi K, McCann
PP: S-Adenosylmethionine and methylation. FASEB J. 1996 Mar;10(4):471-80.
11. McClelland, M., Nelson, M: The effect of site-specific methylation on
restriction endonucleases and DNA modification methyltransferases-a
review. Gene 1988, 74:291-304.
12. Yuan, R., Smith, HO: The restriction and modification DNA methylases: an
overview. In DNA Methylation: Biochemistry and Biological Significance, ed. A.
Razin, H. Cedar, A. D. Riggs, pp. 73-80. New York: Springer-Verlag 1984
13. Murray NE: Type I restriction systems: sophisticated molecular machines (a
legacy of Bertani and Weigle). Microbiol Mol Biol Rev. 2000 Jun;64(2):412-34.
14. Wilson, G: "Restriction and Modification Systems," Annual Review of
Genetics (1991), 25:585-627.
15. Bickle TA: The ATP-dependent restriction endonucleases. In Nucleases, ed.
S. M. Linn, R. J. Roberts, pp. 85-108. Cold Spring Harbor, NY: Cold Spring
Harbor Lab. Press 1982
16. Szybalski W, Kim SC, Hasan N, Podhajska AJ: Class-IIS restriction
enzymes - a review. Gene 1991
17. Zylicz-Stachula A, Bujnicki JM, Skowron PM: Cloning and analysis of a
bifunctional methyltransferase/restriction endonuclease TspGWI, the
prototype of a Thermus sp. enzyme family. BMC molecular biology 2009,
10:52.
18. Zylicz-Stachula A, Zolnierkiewicz O, Lubys A, Ramanauskaite D, Mitkaite G,
Bujnicki JM, Skowron PM: Related bifunctional restriction endonuclease-
methyltransferase triplets: TspDTI, Tth111II/TthHB27I and TsoI with
distinct specificities. BMC molecular biology 2012, 13:13.
19. Mokrishcheva ML, Solonin AS, Nikitin DV: Fused eco29kIR- and M genes
coding for a fully functional hybrid polypeptide as a model of molecular
evolution of restriction-modification systems. BMC Evol Biol 2011, 11:35.
20. Shen BW, Xu D, Chan SH, Zheng Y, Zhu Z, Xu SY, Stoddard BL:
Characterization and crystal structure of the type IIG restriction
endonuclease RM.BpuSI. Nucleic acids research 2011, 39(18):8223-8236.
21. Richard J. Roberts: Restriction enzymes and their isoschizomers. Nucleic
Acids Res. 1990 April 25; 18(Suppl): 2331–2365.
22. Bickle TA: DNA restriction and modification systems. In Escherichia coli
and Salmonella typhimurium: Cellular and Molecular Biology, ed. F. C.
Neidhardt, J. L. Ingraham, K. B. Low, B. Magasanik, M. Schaechter, H. E.
Umbarger, pp. 692-96. Washington, DC: Am. Soc. Microbiol. 1987
23. E A Raleigh, J Benner, F Bloom, H D Braymer, E DeCruz, K Dharmalingam, J
Heitman, M Noyer Weidner, A Piekarowicz, P L Kretz, et al: Nomenclature
relating to restriction of modified DNA in Escherichia coli. J Bacteriol. 1991
April; 173(8): 2707–2709.
24. Bujnicki JM, Radlinska M, Zaleski P, Piekarowicz A: Cloning of the
Haemophilus influenzae Dam methyltransferase and analysis of its
relationship to the Dam methyltransferase encoded by the HP1 phage. Acta
Biochim Pol. 2001;48(4):969-83.
25. Kommireddy Vasu and Valakunja Nagaraja: Diverse Functions of Restriction-
Modification Systems in Addition to Cellular Defense. Microbiol. Mol. Biol.
Rev. 2013, 77(1):53. DOI: 10.1128/MMBR.00044-12.
26. Handa N, Ichige A, Kusano K, Kobayashi I: Cellular responses to
postsegregational killing by restriction-modification genes. J. Bacteriol. 2000
182:2218–2229.
27. Rice MR, Blumenthal RM: Recognition of native DNA methylation by the
PvuII restriction endonuclease. Nucleic Acids Res 2000, 28(16):3143-3150.
28. Mruk I, Blumenthal RM: Real-time kinetics of restriction-modification gene
expression after entry into a new host cell. Nucleic Acids Res 2008, 36(8):2581-
2593.
29. Gingeras TR, Greenough L, Schildkraut I, Roberts RJ: Two new restriction
endonucleases from Proteus vulgaris. Nucleic Acids Res 1981, 9(18):4525-
4536.
30. Blumenthal RM, Gregory SA, Cooperider JS: Cloning of a restriction-
modification system from Proteus vulgaris and its use in analyzing a
methylase-sensitive phenotype in Escherichia coli. J Bacteriol 1985,
164(2):501-509.
31. Tao T, Bourne JC, Blumenthal RM: A family of regulatory genes associated
with type II restriction-modification systems. J Bacteriol 1991, 173(4):1367-
1375.
32. Adams,G.M. and Blumenthal, RM: The PvuII DNA (cytosine-N4)-
methyltransferase comprises two trypsin-defined domains, each of which
binds a molecule of S-adenosyl-L-methionine. Biochemistry, 36, 8284–8292.
1997
33. Mruk I, Rajesh P, Blumenthal RM: Regulatory circuit based on autogenous
activation-repression: roles of C-boxes and spacer sequences in control of
the PvuII restriction-modification system. Nucleic Acids Res 2007,
35(20):6935-6952.
34. Williams K, Savageau MA, Blumenthal RM: A bistable hysteretic switch in an
activator-repressor regulated restriction-modification system. Nucleic acids
research 2013.
35. Cheng X, Balendiran K, Schildkraut I, Anderson JE: Structure of PvuII
endonuclease with cognate DNA. The EMBO journal 1994, 13(17):3927-3935.
36. Horton JR, Nastri HG, Riggs PD, Cheng X: Asp34 of PvuII endonuclease is
directly involved in DNA minor groove recognition and indirectly involved in
catalysis. J Mol Biol. 1998 Dec 18;284(5):1491-504.
37. Gong W, O'Gara M, Blumenthal RM, Cheng X: Structure of pvu II DNA-
(cytosine N4) methyltransferase, an example of domain permutation and
protein fold assignment. Nucleic Acids Res 1997, 25(14):2702-2715.
38. Knowle D, Lintner RE, Touma YM, RM Blumenthal: Nature of the promoter
activated by C.PvuII, an unusual regulatory protein conserved among
restriction-modification systems. J Bacteriol. 2005 Jan;187(2):488-97.
39. Tao, T., and R. M. Blumenthal: Sequence and characterization of pvuIIR, the
PvuII endonuclease gene, and of pvuIIC, its regulatory gene. J. Bacteriol.
1992 174:3395–3398.
40. Sohail A, Ives CL, JE Brooks: Purification and characterization of C.BamHI,
a regulator of the BamHI restriction-modification system. Gene. 1995 May
19;157(1-2):227-8.
41. Bart A, Dankert J, van der Ende A: Operator sequences for the regulatory
proteins of restriction modification systems. Mol Microbiol 1999, 31(4):1277-
1278.
42. Vijesurier RM, Carlock L, Blumenthal RM, Dunbar JC: Role and mechanism of
action of C. PvuII, a regulatory protein conserved among restriction-
modification systems. J Bacteriol 2000, 182(2):477-487.
43. Mruk I, Blumenthal RM: Tuning the relative affinities for activating and
repressing operators of a temporally regulated restriction-modification
system. Nucleic Acids Res 2009, 37(3):983-998.
44. McGeehan JE, Streeter SD, Papapanagiotou I, Fox GC, Kneale GG: High-
resolution crystal structure of the restriction-modification controller protein
C.AhdI from Aeromonas hydrophila. J Mol Biol 2005, 346(3):689-701.
45. McGeehan JE, Streeter SD, Thresh SJ, Ball N, Ravelli RB, Kneale GG:
Structural analysis of the genetic switch that regulates the expression of
restriction-modification genes. Nucleic Acids Res 2008, 36(14):4778-4787.
46. S. D. Streeter, I. Papapanagiotou, J. E. McGeehan and G. G. Kneale: DNA
footprinting and biophysical characterization of the controller protein
C.AhdI suggests the basis of a genetic switch. Nucleic Acids Research, 2004,
Vol. 32, No. 21 6445–6453 doi:10.1093/nar/gkh975
47. McGeehan,J.E., Streeter,S., Cooper,J.B., Mohammed, F., Fox,G.C. and
Kneale,GG: Crystallization and preliminary X-ray analysis of the controller
protein C.AhdI from Aeromonas hydrophilia. Acta Crystallogr. D Biol.
Crystallogr. 2004 60, 323–325.
48. Sophie Pasek, Jean-Loup Risler and Pierre Brézellec: Gene fusion/fission is a
major contributor to evolution of multi-domain bacterial proteins.
Bioinformatics Vol. 22 no. 12 2006, pages 1418–1423
49. Sarah K. Kummerfeld and Sarah A. Teichmann: Relative rates of gene fusion
and fission in multi-domain proteins. TRENDS in Genetics Vol.21 No.1
January 2005
50. Sorokin V, Severinov K, Gelfand MS: Systematic prediction of control
proteins and their DNA binding sites. Nucleic Acids Res 2009, 37(2):441-451.
51. Malone T, Blumenthal RM, Cheng X: Structure-guided analysis reveals nine
sequence motifs conserved among DNA amino-methyltransferases, and
suggests a catalytic mechanism for these enzymes. J Mol Biol 1995,
253(4):618-632.
52. Kim YJ, Kim MK, Bui TP, Kim HB, Srinivasan S, Yang DC: Solibius
ginsengiterrae gen. nov., sp. nov., isolated from soil of a ginseng field, and
emended description of the genus Sediminibacterium and of
Sediminibacterium salmoneium. Int J Syst Evol Microbiol. 2010 Dec;60(12).
53. Weon HY, Kim BY, Joa JH, Kwon SW, Kim WG, Koo BS: Niabella soli sp.
nov., isolated from soil from Jeju Island, Korea. Int J Syst Evol Microbiol.
2008 Feb;58(Pt 2):467-9
54. Sharma V, Firth AE, Antonov I, Fayet O, Atkins JF, Borodovsky M, Baranov
PV: A pilot study of bacterial genes with disrupted ORFs reveals a surprising
profusion of protein sequence recoding mediated by ribosomal frameshifting
and transcriptional realignment. Mol Biol Evol 2011, 28(11):3195-3211.
55. Ivanov IP, Atkins JF: Ribosomal frameshifting in decoding antizyme mRNAs
from yeast and protists to humans: close to 300 cases reveal remarkable
diversity despite underlying conservation. Nucleic acids research 2007,
35(6):1842-1858.
56. Mazauric MH, Licznar P, Prère MF, Canal I, Fayet O: Apical loop-internal loop
RNA pseudoknots: a new type of stimulator of -1 translational frameshifting
in bacteria. J Biol Chem. 2008 Jul 18;283(29):20421-32.
57. Gurvich OL, Baranov PV, Zhou J, Hammer AW, Gesteland RF, Atkins JF:
Sequences that direct significant levels of frameshifting are frequent in
coding regions of Escherichia coli. EMBO J. 2003 22:5941–5950.
58. Nastri HG, Evans PD, Walker IH, Riggs PD: Catalytic and DNA binding
properties of PvuII restriction endonuclease mutants. J Biol Chem. 1997 Oct
10;272(41):25761-7.
59. Paul L, Blumenthal RM, Matthews RG: Activation from a distance: Roles of Lrp
and integration host factor in transcriptional activation of gltBDF. J
Bacteriol. 2001 Jul;183(13):3910-8.
60. Platko JV, Willins DA, Calvo JM: The ilvIH operon of Escherichia coli is
positively regulated. Journal of bacteriology 1990, 172(8):4563-4570.
61. Gertz EM, Yu YK, Agarwala R, Schaffer AA, Altschul SF: Composition-based
statistics and translated nucleotide searches: improving the TBLASTN
module of BLAST. BMC Biol 2006, 4:41.
62. Tenreiro S, Nobre MF, da Costa MS: Thermus silvanus sp. nov. and Thermus
chliarophilus sp. nov., two new species related to Thermus ruber but with
lower growth temperatures. Int J Syst Bacteriol. 1995 Oct;45(4):633-9.
63. Baranov PV, Hammer AW, Zhou J, Gesteland RF, Atkins JF: Transcriptional
slippage in bacteria: distribution in sequenced genomes and utilization in IS
element gene expression. Genome Biol. 2005 6:R25.
64. Naderer M, Brust JR, Knowle D, Blumenthal RM: Mobility of a restriction-
modification system revealed by its genetic contexts in three hosts. J Bacteriol
2002, 184(9):2411-2419.
65. Aras RA, Takata T, Ando T, van der Ende A, Blaser MJ: Regulation of the
HpyII restriction-modification system of Helicobacter pylori by gene
deletion and horizontal reconstitution. Mol Microbiol 2001, 42(2):369-382.
66. Tindall BJ, Sikorski J, Lucas S, Goltsman E, Copeland A, Glavina Del Rio T,
Nolan M, Tice H, Cheng JF, Han C et al: Complete genome sequence of
Meiothermus ruber type strain (21). Stand Genomic Sci 2010, 3(1):26-36.
67. Dominguez MA Jr, Thornton KC, Melendez MG, Dupureur CM: Differential
effects of isomeric incorporation of fluorophenylalanines into PvuII
endonuclease. Proteins. 2001 Oct 1;45(1):55-61.
68. Bogdanova E, Djordjevic M, Papapanagiotou I, Heyduk T, Kneale G, Severinov
K: Transcription regulation of the type II restriction-modification system
AhdI. Nucleic Acids Res 2008, 36(5):1429-1442.
69. Cesnaviciene E, Mitkaite G, Stankevicius K, Janulaitis A, Lubys A: Esp1396I
restriction-modification system: structural organization and mode of
regulation. Nucleic Acids Res 2003, 31(2):743-749.
70. Ives CL, Nathan PD, Brooks JE: Regulation of the BamHI restriction-
modification system by a small intergenic open reading frame, bamHIC, in
both Escherichia coli and Bacillus subtilis. Journal of bacteriology 1992,
174(22):7194-7201.
71. Semenova E, Minakhin L, Bogdanova E, Nagornykh M, Vasilov A, Heyduk T,
Solonin A, Zakharova M, Severinov K: Transcription regulation of the EcoRV
restriction-modification system. Nucleic Acids Res 2005, 33(21):6942-6951.
72. Sorokin V, Severinov K, Gelfand MS: Systematic prediction of control
proteins and their DNA binding sites. Nucleic Acids Res 2009, 37(2):441-451.
73. Davis SE, Mooney RA, Kanin EI, Grass J, Landick R, Ansari AZ: Mapping E.
coli RNA polymerase and associated transcription factors and identifying
promoters genome-wide. Methods Enzymol 2011, 498:449-471.
74. Mendoza-Vargas A, Olvera L, Olvera M, Grande R, Vega-Alvarado L, Taboada
B, Jimenez-Jacinto V, Salgado H, Juarez K, Contreras-Moreira B et al:
Genome-wide identification of transcription start sites, promoters and
transcription factor binding sites in E. coli. PLoS One 2009, 4(10):e7526.
75. Shultzaberger RK, Chen Z, Lewis KA, Schneider TD: Anatomy of Escherichia
coli sigma70 promoters. Nucleic acids research 2007, 35(3):771-788.
76. Polard P, Prere MF, Chandler M, Fayet O: Programmed translational
frameshifting and initiation at an AUU codon in gene expression of bacterial
insertion sequence IS911. J Mol Biol. 1991 222:465–477.
77. Athanasiadis A, Vlassi M, Kotsifaki D, Tucker PA, Wilson KS, Kokkinidis M:
Crystal structure of PvuII endonuclease reveals extensive structural
homologies to EcoRV. Nat Struct Biol 1994, 1(7):469-475.