Proc. Natl. Acad. Sci. USA
Vol. 84, pp. 8287-8291, December 1987
Biochemistry

Bacteriophage PRD1 DNA polymerase: Evolution of DNA polymerases
(lipid-containing phage/protein-primed DNA replication)

GUHUNG JUNG, MARK C. LEAVITT, JUI-CHENG HSIEH, AND JUNETSU ITO

Department of Microbiology and Immunology, The University of Arizona Health Sciences Center, Tucson, AZ 85724

Communicated by Carl S. Marvel, August 10, 1987

ABSTRACT

A small lipid-containing bacteriophage PRD1 specifies its own DNA polymerase that utilizes terminal protein as a primer for DNA synthesis. The PRD1 DNA polymerase gene has been sequenced, and its amino acid sequence has been deduced. This protein-primed DNA polymerase consists of 553 amino acid residues with a calculated molecular weight of 63,300. Thus, it appears to be the smallest DNA polymerase ever isolated from prokaryotic cells. Comparison of the PRD1 DNA polymerase sequence with other DNA polymerase sequences that have been published yielded segmental but significant homologies. These results strongly suggest that many prokaryotic and eukaryotic DNA polymerase genes, regardless of size, have evolved from a common ancestral gene. The results further indicate that those DNA polymerases that use either an RNA or protein primer are related. We propose to classify DNA polymerases on the basis of their evolutionary relatedness.

Since DNA contains the genetic information of an organism, accurate DNA replication is one of the most important events of the life cycle of an organism. DNA polymerases are the key enzymes catalyzing the accurate replication of DNA. However, none of the known DNA polymerases can initiate de novo synthesis of a DNA chain without a primer (1). Generally, small RNA molecules are used as primers. These provide a 3'-hydroxyl group for DNA chain elongation (1). However, a different type of DNA polymerase has been identified in various systems (2-7). These enzymes utilize proteins as primers for the initiation of DNA synthesis. Whereas these DNA polymerases are distinctive in the way they initiate DNA synthesis, their mechanism for achieving high fidelity of DNA replication appears to be similar to that of the RNA-primed DNA polymerases. Thus, all known DNA polymerases acting at protein primers also possess a 3'→5' exonuclease activity, generally known as the editing function (8-10).

We have been studying bacteriophage PRD1 as a model system for the initiation of linear DNA replication. PRD1 is a member of a group of lipid-containing phages that infect a wide variety of Gram-negative bacteria harboring plasmids of the P, N, or W incompatibility type (11, 12). The genome of PRD1 is 14.7 kilobases long with terminal proteins covalently linked at both 5' ends (13). The linkage of the terminal protein to the PRD1 DNA involves a phosphodiester bond between a tyrosine residue of the terminal protein and the 5'-terminal dGMP (6). The PRD1 genome encodes a DNA polymerase that catalyzes the formation of the terminal protein-dGMP complex as well as DNA chain elongation, analogous to the systems of adenovirus (14, 15) and φ29 (4, 5). The PRD1 DNA polymerase has been isolated and shown to possess a 3'→5' exonuclease activity (10). It has also been shown to be sensitive to the drug aphidicolin, which is a specific inhibitor of eukaryotic DNA polymerases α and δ (10, 16).

In this communication, we report the nucleotide sequence and the deduced amino acid sequence of PRD1 DNA polymerase.* We have compared the amino acid sequence of this polymerase with those of other DNA polymerases and found that PRD1 DNA polymerase, the smallest known DNA polymerase possessing a proofreading function, has partial homology with many other DNA polymerases including phage T4 DNA polymerase. These results together with those of others (17-22) suggest strongly that many prokaryotic and eukaryotic DNA polymerase genes, regardless of size, have evolved from a common ancestral gene. The results also suggest that DNA polymerases that use either an RNA or protein primer have arisen from a common ancestral progenitor.

MATERIALS AND METHODS

DNA Sequencing. A recombinant plasmid (clone 3) that contains PRD1 genes 1 and 8 was kindly provided by L. Mindich (Public Health Research Institute of the City of New York) (13). The plasmid DNA was first cleaved with Pst I, and the resulting DNA fragment containing genes 1 and 8 of PRD1 was recloned into the bacteriophage vector M13mp9 (23). We then generated a nested set of deletions according to the method of Dale et al. (24). Each set of deletions was sequenced by the dideoxy chain-termination method of Sanger et al. (25). For confirmation, some regions were sequenced by the method of Maxam and Gilbert (26).

Computation. Protein data base searches† were performed by a FASTP microcomputer program (27). DNA polymerase nucleotide sequences were obtained from the GenBank data base‡ and translated and screened against consensus sequences with the use of the IBI (International Biotechnologies, Inc.) sequence analysis system program.

RESULTS

Nucleotide Sequence and Predicted Amino Acid Sequence of PRD1 DNA Polymerase. In agreement with genetic studies by McGraw et al. (13), nucleotide sequence analysis revealed that there are only two major open reading frames at the left end of the PRD1 genome (Fig. 1). One open reading frame corresponds to gene 8, the terminal protein gene consisting of 260 codons. The other one is gene 1, the DNA polymerase gene that consists of 554 codons (Fig. 1). The nucleotide sequence of the PRD1 DNA polymerase gene along with the predicted amino acid sequence and the flanking sequences are shown in Fig. 2. The calculated molecular weight of PRD1 DNA polymerase is 63,300, which is in rough agreement with the reported size (28). This protein contains an approximately equal number of acidic (14.2%) and basic (15.7%) amino acid residues and a high percentage (41.2%) of hydrophobic residues. The initiation codon ATG for the PRD1 DNA polymerase gene is preceded by a region complementary to the 3' terminus of Escherichia coli 16S rRNA (29).

FIG. 1. Genetic and physical map of the region of PRD1 that contains the genes for DNA polymerase (gene 1) and terminal protein (gene 8). The genetic map was adapted from that of McGraw et al. (13). Arrows indicate the direction of translation. The open circles at both ends of the genome indicate terminal protein.

Comparison of Amino Acid Sequences Between Phage PRD1 and Phage φ29 DNA Polymerases. Our studies show that PRD1 and φ29 DNA polymerases are the smallest known enzymes that possess 3'→5' exonuclease activity. Moreover, these two DNA polymerases utilize terminal proteins as primers for the initiation of DNA synthesis (4, 6, 9, 10). Because φ29 infects Gram-positive bacteria and PRD1 grows on Gram-negative bacteria, the amino acid sequences of these two DNA polymerases were compared to discern the possible evolutionary relationship between these proteins. Although the overall sequences are not strikingly homologous, three regions are highly conserved between these two DNA polymerases (Fig. 3). To determine whether these localized homologies are specific for protein-primed DNA polymerases, we compared these amino acid sequences with that of phage SPO2 DNA polymerase that has a similar molecular weight. The Bacillus temperate phage SPO2 specifies its own DNA polymerase (31), and the gene encoding this polymerase has been cloned and sequenced by Raden and Rutberg (31). The SPO2 DNA polymerase has a molecular weight of 72,000 and does not initiate DNA synthesis by a protein-priming mechanism (31). No substantial amino acid homology was found when the sequence for SPO2 DNA polymerase was compared with the sequence for either φ29 or PRD1 DNA polymerases (data not shown).

*The sequence reported in this paper is being deposited in the EMBL/GenBank data base (Bolt, Beranek, and Newman Laboratories, Cambridge, MA, and Eur. Mol. Biol. Lab., Heidelberg) (accession no. J03018).
†Protein Identification Resource (1986) Protein Sequence Database (Natl. Biomed. Res. Found., Washington, DC), Release 9.0.
‡National Institutes of Health (1987) Genetic Sequence Databank: GenBank (Research Systems Div., Bolt, Beranek, and Newman, Cambridge, MA), Tape Release 48.0.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

[FIG. 2. Nucleotide sequence of the PRD1 DNA polymerase gene with the predicted amino acid sequence and the flanking sequences.]
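
The composition values quoted in the Results (14.2% acidic, 15.7% basic, and 41.2% hydrophobic residues, with a calculated molecular weight of 63,300) are the kind of figures that can be derived directly from a deduced amino acid sequence. The Python sketch below illustrates one way to do this. It is not the authors' program: the residue classes (`ACIDIC`, `BASIC`, `HYDROPHOBIC`) are assumptions, since the paper does not state which residues were counted in each class, and the function names and the six-residue test peptide are hypothetical.

```python
# Average residue masses in daltons (standard values for residues within a chain).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153  # one water per chain accounts for the free N and C termini

# ASSUMED residue classes; the paper does not say how it grouped residues.
ACIDIC = set("DE")
BASIC = set("KRH")
HYDROPHOBIC = set("AVLIMFWP")


def molecular_weight(seq: str) -> float:
    """Sum the average residue masses and add one water for the chain termini."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


def class_percentages(seq: str) -> dict:
    """Return the percentage of residues falling in each assumed class."""
    if not seq:
        raise ValueError("sequence must not be empty")

    def pct(group: set) -> float:
        return 100.0 * sum(aa in group for aa in seq) / len(seq)

    return {
        "acidic": pct(ACIDIC),
        "basic": pct(BASIC),
        "hydrophobic": pct(HYDROPHOBIC),
    }


if __name__ == "__main__":
    peptide = "DEKRAL"  # hypothetical test peptide, not taken from PRD1
    print(round(molecular_weight(peptide), 1))
    print(class_percentages(peptide))
```

Applied to the full 553-residue sequence, the same two functions would give the corresponding whole-protein values. How closely they match the published percentages depends on whether the assumed residue classes agree with the ones the authors used.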