Identification of the Chromosomal Origins of Replication (oriCRSI and oriCRSII) in R. sphaeroides 2.4.1

Tim Johnson, Randi Harbour, Kristina Hernandez, Lin Lin, and Madhusudan Choudhary

Department of Biological Sciences, Sam Houston State University, Huntsville, Texas 77341

INTRODUCTION

Rhodobacter sphaeroides belongs to the α-3 subdivision of the Proteobacteria. This organism is metabolically versatile and grows under a variety of conditions, including aerobic, semi-aerobic, and photosynthetic growth. R. sphaeroides possesses a complex genome comprising two chromosomes (CI and CII) and five endogenous plasmids (1). CI and CII are ~3.0 Mbp and ~0.9 Mbp in size, respectively (2). Analysis of the R. sphaeroides genome reveals that genes for a wide variety of essential functions are dispersed between the two chromosomes. Recently, it has also been demonstrated that CI and CII have been both essential and ancient partners within the R. sphaeroides genome since its separation from its lineage (3).

Unlike eukaryotes, prokaryotes lack a nucleus or a mitosis-like apparatus. The existence of multiple chromosomes in bacteria may therefore require well-coordinated chromosomal replication and segregation to distribute the chromosomes equally between the two daughter cells. To understand the process of DNA replication, the origin of chromosomal replication must first be identified. The origin of replication (referred to as oriC in E. coli) is the specific region of the chromosome where the DNA double helix begins to denature, allowing replication of the chromosome to initiate (4). This region varies from 40 to 80 base pairs in length among different bacterial species and is usually very AT rich (70 to 80%), because the bonds between adenine and thymine are more easily denatured than the bonds between guanine and cytosine. Cis-elements located within and around this region are recognized by a set of proteins, including DnaA, RepABC, and other proteins associated with chromosomal replication, which bind to specific DNA sequences in this region and facilitate the initiation of chromosomal replication.

To identify the putative origins on CI and CII in R. sphaeroides, an in silico approach was employed to search CI- and CII-specific genomic sequences with both variable sequence length and variable %GC composition. Two different computer programs, which search either overlapping or discrete segments of the DNA sequence, were used to search the entire chromosome-specific sequences. All sequences of 50 to 100 nucleotides in length with >65% AT content were selected for further analysis. These sequences were then analyzed for the presence of cis-elements using the conserved boxes found in Sinorhizobium meliloti (5), which is a species closely related to R. sphaeroides and which also belongs to the α-3 subgroup of Proteobacteria.

RESULTS AND DISCUSSION

The advantage of the program used in this study is that it offers a progressive search and allows the processing of an entire genome at once, whereas many currently available web-based programs accept only a small number of sequences, which is very time consuming and provides only limited information. An alternative approach using Artemis further validated the result, as shown in Figure 2. Furthermore, the efficacy and accuracy of this program were tested using the entire genomic sequences of Caulobacter crescentus and Sinorhizobium meliloti, and the program was able to identify all putative origins, including the one that is biologically functional.

The output of both search programs provided 3013 and 336 sequence files for Chromosome I and Chromosome II, respectively. After matching the overlapping regions, a total of 125 CI- and 16 CII-specific sequences remained. Following the database search, 37 CI- and 9 CII-specific sequences were chosen for further analysis. These regions of putative origins were then analyzed to determine the presence of known cis-elements to which DnaA and other replication proteins bind, as shown in Figure 3. Each of the 13 DNA regions (as shown in Table 1), along with the 300 nucleotides of upstream and downstream sequence, was then analyzed for 21 different conserved boxes for oriC, DnaA, RepABC1, and RepABC2 (5). Many of these sequences contain 2 to 5 of these conserved binding boxes, as shown in Figure 3. Based on the %AT content and the number of binding boxes, 13 possible origins of replication were identified in the R. sphaeroides chromosomes.

Comparison of the genome sequences of C. crescentus and Rickettsia prowazekii revealed that both species share a conserved cluster of genes in the hemE-hemH region that overlaps the established origin of replication in C. crescentus and the putative origin of replication in R. prowazekii (6). The origin of replication of the S. meliloti chromosome has also been predicted, and experimentally confirmed, to be approximately 400 kb from dnaA and adjacent to hemE (5). A putative origin of replication of CI in R. sphaeroides is located ~40 kb from hemE, but it remains uncertain until it is confirmed experimentally. Like R. sphaeroides, Vibrio cholerae possesses two chromosomes, and the origins of replication of its two chromosomes (oriCIvc and oriCIIvc) have been studied experimentally (7). Thus, the identification of chromosomal origins in R. sphaeroides may further facilitate the study of the mechanism of chromosomal replication in bacterial species that possess multiple chromosomes.
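The A-T rich window search described in this section (windows of 50 to 100 nucleotides with more than 65% AT content, scanned in an overlapping manner, with overlapping hits then combined) can be illustrated with a short Python sketch. This is not the program used in the study: the function names `find_at_rich_windows` and `merge_overlapping`, the 0-based half-open coordinates, and the one-nucleotide step are assumptions made for illustration only.

```python
from itertools import accumulate


def find_at_rich_windows(seq, min_len=50, max_len=100, min_at_percent=65, step=1):
    """Scan every window of min_len..max_len nucleotides (hypothetical helper).

    Returns a list of (start, end, at_percent) tuples for windows whose
    A+T content is strictly above min_at_percent. Coordinates are 0-based
    and half-open, which is an assumption of this sketch.
    """
    seq = seq.upper()
    # prefix[i] = number of A/T bases in seq[:i], so any window count is O(1)
    prefix = [0] + list(accumulate(1 if base in "AT" else 0 for base in seq))
    hits = []
    for length in range(min_len, max_len + 1):
        for start in range(0, len(seq) - length + 1, step):
            at_count = prefix[start + length] - prefix[start]
            # integer comparison avoids floating-point edge cases at the threshold
            if at_count * 100 > min_at_percent * length:
                hits.append((start, start + length, 100.0 * at_count / length))
    return hits


def merge_overlapping(hits):
    """Combine overlapping windows so each region is reported only once."""
    merged = []
    for start, end in sorted((s, e) for s, e, _ in hits):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    # Toy sequence: an 80-nt A-T block flanked by G-C rich sequence
    toy = "GC" * 50 + "AT" * 40 + "GC" * 50
    for start, end in merge_overlapping(find_at_rich_windows(toy)):
        print(f"candidate region: {start}-{end}")
```

A pure-Python scan of this kind is slow on a ~3.0 Mbp chromosome, so a larger `step` or a compiled implementation would likely be needed in practice.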

METHODS

In silico approach for the identification of the origin of replication: To identify the chromosomal origin of replication in Rhodobacter sphaeroides 2.4.1, a computer program was designed to search for A-T rich regions within the CI and CII sequences. The sequences were then analyzed for the presence of the consensus cis-elements that are necessary for the initiator proteins to start replication. The algorithm searches both variable nucleotide lengths (50 to 100 nucleotides) and varying %AT composition (65% to 80%) in an overlapping and progressive manner, as shown in Figure 1. The program was applied to each of the chromosomal sequences of R. sphaeroides in FASTA format, which were obtained directly from the NCBI server. For efficient use of memory and input-output loading, each sequence is analyzed sequentially in a buffer. The analysis calculates the %AT of each candidate sequence and then checks whether the nucleotide composition of the sequence is above a chosen threshold value. If a sequence is above the chosen threshold value, it is sent to the output data files. In addition, Artemis was used to calculate the %GC composition within each discrete 120-nucleotide-long sequence along each of the two chromosomal sequences, as shown in Figure 2.

Identification of the conserved DNA sequence boxes in the origin region: The sequences overlapped each other, as was the design of the program, and therefore had to be combined to avoid analyzing the same region twice. The assembled sequences were searched against the protein database of the R. sphaeroides genome to identify whether any of these sequences code for protein. Finally, the remaining sequences were analyzed using DNADynamo to determine whether they contain the consensus boxes previously identified in the chromosomal origin of S. meliloti (5). The program was downloaded from its publicly available website. The program performs searches in both the forward and reverse complement directions of the target sequence.

Figure 1. a) Program window; b) Input data; c) Output data.

Figure 2. The G+C content and possible sites for origin of replication in CI and CII in R. sphaeroides 2.4.1 (purple: below average; yellow: over average). a) G+C content and two possible sites for origin of replication in Chromosome I (~69% GC). b) G+C content and 9 possible sites for origin of replication in Chromosome II (~69% GC).

Figure 3. DnaA and RepABC box binding sites for the origin of replication. a) A G+C content graph of a ~6 kb region encompassing the possible region for origin of replication in R. sphaeroides. b) The sequence of possible regions for origin of replication. c) DnaA and RepABC binding sites that match the DnaA and RepABC box consensus sequences. d) The sequences of the putative DnaA and RepABC boxes. (* Binding sites for multiple box consensus sequences)

Table 1. The possible regions for origin of replication in Chromosome I and Chromosome II

Coordinates | Sequence (A-T rich region marked in red in the original) | A+T content for A-T rich region | Location
2380028-2380181 | TCGCATCGCCCCTCCCGCTTCGTTGAACATTTTGGCCGATTAAATTCATTTTTTTGCCGACCATCAACGTTTATTTTCTTTTTGATGAAGATTTCCAGATTTACTTTCAGTTTTTCCATGCTTATGCCTTGGAAACTGGCAGTTTCCCGTTGGC | 69.32% | CI
1700865-1701165 | GGAGTGACTGAATGAAAGGCAACGATGTATCAATCATGAGATCGGAACATGAGTCTGCTCTCGAATAGAGTGAGATCAGGATTTAAGACAAAGTAAACATTTTTGGTATTCTTAAGTGATTGATTTTATTGAATAAATCAAGGGTGTCATATGGATTTGTTTTTCTTAAGAAATCGTTTAATGATTGATTTATTGATTTATTAAGAAATGGATGAATCGAGATTTGATGTTCATGGTTCTTGAATGGGTATTCCATCAATGAACATGAACATGAGTGCATTTTGGCGTAAGTGAGCGAAGC | 72.58% | CI
1701171-1701360 | GAACGCCACCTTTAATCCACATAGAGGTTTTGAGATCAGGAAAGGAGTCTTCTTTCAGATAAAGGTTTGAGATCAGGAAAGGAGTCTTCTTTCACATAGAGGTTTTGAGATCGGATAAACCTTTAATCCACATAGAGGTTTTGAGATCGGATAAACTGCATCGAATAAGGGTCACCATAAGCAATCTGGC | 63.06% | CI
1701367-1701598 | CCGCGCGAAGCGCCAATGGAATCGTTTATCCAATAGAGATTTGGACTCATACAGATCGGATAAATGATCTATGCTCAGATAGAGATTTTGAGATATCAAATTTCATCAGATAAAGGTATTTTGGATCTTCAAACTTCCTTTCTCTAACTCAGATCTCATCTGGACCTTATAGTTAAGATTCTGATTATAGCTCTATTTCTATAGGGGGACGAAACCCCCATTTTCGTGGTGA | 68.88% | CI
199834-199941 | TCTTCCCCAGCTTATTGAAAGACAAACTGAAGAAAAAACGAGAAATTCTGACGGTTATAGAAAGTCAGACTTACAGAAGATCCGAGGGGGTGCTTTGAAACGCACATC | 62.65% | CII
205692-205820 | GTTCGGCGAGGCTCCACCTGTTCCCATTGACAGGCTAATCGAAAGCTAATCTAATAAAAACAAATAAAAGCTGACATGTGATGTAAGAAAATCTGACGAAAGAGAGGGGCGGATGTCGATCCGGATGCT | 66.67% | CII
365609-365736 | AGTATCAACTAAAGGTTGTAACCCGTCTATACTTTAGCGATAGAGTTTCATTAAGATACAATCAAGCGGGATTGTTCCTTCGAGACTGGAACACCGTCAAAAGTGTGGGATATGGTCATTTTGACACA | 63.75% | CII
478071-478177 | GGAGTCAAGCATTTTGTAAACTTGTTATATACCAATCGGTTTCACTTGCTGAGCGAGGCCCCGGATAATCTGTTTTCGCATTGTTTTGGAATGATAATCACTCTG | 61.82% | CII
583697-583830 | GTTACATTTTGTGCAAGACCATCACGATCTGTCAATCTCATTTTGCCAGATTTTCATGCTGCACCGCAGATAAACTCGGTGATTGACTTGTTCATATGTTTATTTGACAACTAATATGATCGTAGCCCAAGCGC | 60.57% | CII
634469-634658 | AAGAAAGTCAGCATAGAAATTGAGAATTAAGCACTCGTCTGGCAGAAAGGCCTTCCCGAAATTACATCGGGCAATTCAAAAGAACCACCGTATTTAAGTTGACTGACGAAATACACATGTAGTTAAAATGCAGCCAATCGGAGGGCAATATGGACGGTCAGAGAGTATCACAAGAAGAGTTTGAGGAACT | 62.50% | CII
738147-738270 | GCGAGTGGGATGTTCAGTAAGTTGATGAGTTTATCTGCTCGATAGTGCATGTATGCACCAATATTGGTTAAGTAAACGCTACCACTTTCGATTGAATCAAAAGCCGGACAAATCACCCATGGAT | 63.64% | CII
876323-876437 | AAGGACGAAAACACGTCATGACTCGCTTCATACTCAGCGACCTTTGCATCTGTTGTTATATTGGGGAAATAGTAGTGGTCTTCAAATGCCATTATTTTCTTCCAATCTTTGTCGG | 64.39% | CII
921529-921661 | AATGGCTGATCCTTGGGTAATTTGTCCGGCTTTTGATTCAATCGAAAGTGGTAGCGTTTACTTAACCAATATTGGTGCATACATGCACTATCGAGCAAATAAACTCATCAACTTACTGAACATCCCACTCGCC | 64.84% | CII

FUTURE WORKS

All thirteen putative chromosomal origins of R. sphaeroides 2.4.1 will be cloned into a suicide vector (pLO1 or pSUP202). Each resulting recombinant plasmid will be tested biologically to determine whether one of these origins allows the suicide plasmid to replicate autonomously in R. sphaeroides. This work is currently in progress.

REFERENCES

1. Suwanto, A., and S. Kaplan. 1989b. Physical and genetic mapping of the Rhodobacter sphaeroides 2.4.1 genome: presence of two unique circular chromosomes. J. Bacteriology 171: 5850-5859.
2. Mackenzie, C., et al. 2001. The home stretch, a first analysis of the nearly completed genome of Rhodobacter sphaeroides 2.4.1. Photosynthesis Research 70: 19-41.
3. Choudhary, M., Yun-Xin Fu, C. Mackenzie, and S. Kaplan. 2004. DNA sequence duplication in Rhodobacter sphaeroides genome: Evidence of an ancient partnership between chromosomes I and II. J. Bacteriology 187: 2019-2027.
4. Fuller, R. S., J. M. Kaguni, and A. Kornberg. 1981. Enzymatic replication of the origin of the Escherichia coli chromosome. Proc Natl Acad Sci USA 78: 7370-7374.
5. Sibley, C. D., S. R. MacLellan, and T. Finan. 2006. The Sinorhizobium meliloti chromosomal origin of replication. Microbiology 152: 443-455.
6. Brassinga, A. K. C., R. Siam, and G. T. Marczynski. 2000. Conserved gene cluster at replication origins of the α-Proteobacteria Caulobacter crescentus and Rickettsia prowazekii. Journal of Bacteriology 183(5): 1824-1829.
7. Egan, E. S., and M. K. Waldor. 2003. Distinct replication requirements for the two Vibrio cholerae chromosomes. Cell 114: 521-530.