Proc. Natl. Acad. Sci. USA Vol. 86, pp. 4465-4469, June 1989 Biochemistry

T5 DNA polymerase: Structural-functional relationships to other DNA polymerases
(DNA polymerase I///evolution)

MARK C. LEAVITT AND JUNETSU ITO
Department of Microbiology and Immunology, University of Arizona Health Sciences Center, Tucson, AZ 85724
Communicated by Lester O. Krampitz, April 10, 1989 (received for review February 1, 1989)

ABSTRACT T5 DNA polymerase, a highly processive single-polypeptide enzyme, has been analyzed for its primary structural features. The amino acid sequence of T5 DNA polymerase has a high degree of homology with that of DNA polymerase I from Escherichia coli and retains many of the amino acid residues that have been implicated in the 3' → 5' exonuclease and DNA polymerase activities of that enzyme. Alignment with sequences of polymerase I and T7 DNA polymerase was used to identify regions possibly involved in the high processivity of this enzyme. Further, amino acid sequence comparisons of T5 DNA polymerase with a large group of DNA polymerases previously shown to exhibit little similarity to polymerase I indicate that certain sequence segments are shared among distantly related DNA polymerases. These shared regions have been implicated in the 3' → 5' exonuclease function of polymerase I, which suggests that the proofreading domains of all these enzymes may be evolutionarily related.

Bacteriophage T5 produces its own DNA polymerase that is essential for phage DNA replication (1). This DNA polymerase is unusual in that it is highly processive; it extensively elongates a primer before disassociating from the primer template and is capable of strand displacement. T5 DNA polymerase is more processive than any other single-polypeptide DNA polymerase on comparable templates (2, 3). Similarly, its 3' → 5' exonuclease or proofreading activity will processively hydrolyze hundreds of nucleotides before disassociation using either double- or single-stranded DNA substrates (4). The structural characteristics that confer high processivity upon polymerases are currently not understood, perhaps because other well-studied DNA polymerases require additional proteins to become processive. In contrast, processivity is an intrinsic property of T5 DNA polymerase and therefore it is an appropriate subject for investigation in the area of DNA replication.

Furthermore, this polymerase is also rare in its ability to utilize nicked circular duplex DNA as a template and can unwind the parental DNA strand from its template as it synthesizes the new DNA strand from the 3'-OH end of the nick. The only other DNA polymerase capable of using a nicked template or of strand displacement in the absence of other protein factors is Escherichia coli DNA polymerase I (Pol I) or its large (Klenow) proteolysis product (4, 5). The dual properties of high processivity and strand displacement may make T5 DNA polymerase well suited for use in dideoxynucleotide DNA sequencing. The high processivity of modified T7 DNA polymerase-thioredoxin complex, known commercially as Sequenase, has made it very popular for use in DNA sequencing projects. Additionally, the strand-displacement ability of T5 DNA polymerase may enable it to proceed through double-stranded regions in template secondary structures or supercoiled plasmid templates.

We present here the DNA sequence of the T5 DNA polymerase gene* and the deduced amino acid sequence of its product. Comparisons of the primary structure of this enzyme with other DNA polymerases suggest differences that may account for the high processivity of this enzyme. We also demonstrate the conservation of residues thought to be intimately involved in 3' → 5' exonuclease and polymerase activities. Finally, two amino acid sequence segments, which may be involved in the 3' → 5' exonuclease function of these enzymes, appear to be highly conserved among a wide variety of DNA polymerases.

MATERIALS AND METHODS

DNA Sequencing. T5 was obtained from R. Fujimura (Oak Ridge National Laboratory, Oak Ridge, TN). Phage T5 DNA was isolated from lysates of wild-type phage. T5 Bal I fragments 11 and 12 were cloned into M13mp9 or -mp19 (6) and a nested set of deletions was created using the method of Dale (7). Both strands were sequenced using either Sequenase (United States Biochemical) or the method of Maxam and Gilbert (8). A phage DNA fragment that included the Bal I fragment 11-12 junction was sequenced to eliminate the possibility of a small intervening Bal I fragment at this location.

Amino Acid Sequence Comparisons. Similarity searches of the National Biomedical Research Foundation protein sequence library were performed (Release 16, March 1988) using FASTA (9). Amino acid sequence comparisons or alignments were accomplished using LFASTA (9) or BESTFIT or GAP (from the University of Wisconsin Genetics Computer Group).

RESULTS

DNA and Amino Acid Sequences of T5 DNA Polymerase. The physical location of the T5 DNA polymerase gene has been identified (10) by restriction fragment rescue of polymerase amber mutants, as being within the region of Bal I restriction fragments 11 and 12 at 58.3-61.3% of the distance from the left end of the genome. DNA sequence analysis of Bal I fragments 11 and 12 reveals an open reading frame of 2487 nucleotides and a suitable ribosome binding site that would code for a protein of 94.3 kDa (Fig. 1). This is in rough agreement with the predicted molecular mass of T5 DNA polymerase (96 kDa), as estimated by SDS/polyacrylamide gel electrophoresis (13), and is consistent with the direction of transcription of the T5 DNA polymerase gene, as determined by Schneider et al. (12).
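As a rough consistency check on the figures above, the length of the open reading frame can be converted to a codon count and an approximate protein mass. The sketch below is illustrative only: the average residue mass of 110 Da is a common rule-of-thumb assumption, not a value taken from this work, and the function names are hypothetical. Whether the 2487 nucleotides include the termination codon is not stated here, so the residue count may be one less than the codon count.

```python
# Back-of-the-envelope check of ORF length versus protein mass.
# ASSUMPTION: 110 Da per residue is a generic rule of thumb, not a value
# from the paper; the true mass depends on amino acid composition.
ORF_NUCLEOTIDES = 2487          # open reading frame length reported in the text
AVG_RESIDUE_MASS_DA = 110.0     # assumed average residue mass (hypothetical)


def codon_count(orf_nucleotides: int) -> int:
    """Return the number of codons in an ORF whose length is a multiple of 3."""
    if orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_nucleotides // 3


def approx_mass_kda(n_residues: int, avg_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Estimate protein mass in kDa from a residue count and an average residue mass."""
    return n_residues * avg_mass_da / 1000.0


codons = codon_count(ORF_NUCLEOTIDES)
estimate_kda = approx_mass_kda(codons)

print(f"codons: {codons}")
print(f"approximate mass: {estimate_kda:.1f} kDa")
```

Under these assumptions the estimate falls a few percent below the 94.3 kDa calculated from the deduced sequence; a difference of this size is unsurprising, because a single average residue mass ignores the actual amino acid composition.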

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Abbreviation: Pol I, DNA polymerase I.
*The sequence reported in this paper has been deposited in the GenBank data base (accession no. M24354).


[Figure 1 graphic: Bal I restriction map of the T5 genome (121 kb; fragments 1-13) with an expanded map of the polymerase gene region (0.5-kb scale bar; Bal I, Hpa I, EcoRI, and Sma I sites), followed by the nucleotide sequence (RBS, -35, and -10 elements marked) and the predicted amino acid sequence, numbered to residue 829. The sequence is not legible in this copy; see GenBank accession no. M24354.]

FIG. 1. Physical location of T5 DNA polymerase gene, its sequence, and predicted amino acid sequence. In the Bal I map of T5, the solid bar represents the open reading frame on fragments 11 and 12 (10, 11). Transcription is from left to right (12). The nucleotide sequence of T5 DNA polymerase gene displays predicted -35 and -10 transcription sequences, putative ribosome binding site, and predicted amino acid sequence. The single-letter amino acid code is used.

A search of the National Biomedical Research Foundation protein data base showed T5 DNA polymerase to be strongly homologous to E. coli DNA polymerase I (Pol I). Pol I is perhaps the most intensively studied DNA polymerase and its large proteolysis product (Klenow fragment) is the only such enzyme for which three-dimensional structural information is available (for review, see ref. 14). High-resolution structural studies, in conjunction with genetic and biochemical analysis, have determined the location of the 3' → 5' exonuclease and DNA polymerase active sites and identified several of the residues that are directly involved in these activities (14, 15).

Homologies to Pol I and Related DNA Polymerases. The amino acid sequence alignment of T5 DNA polymerase and Pol I (Fig. 2) shows that all seven amino acid residues involved in the 3' → 5' exonuclease activity of Pol I (Asp-355, Glu-357, Leu-361, Asp-424, Phe-473, Tyr-497, and Asp-501), as determined by metal and dNMP binding in crystals of Klenow fragment, are conserved in T5 DNA polymerase. Residues identified through labeling of Pol I by dNTP analogues as being implicated in dNTP binding (Lys-758, Tyr-766, and His-881) are also conserved (14, 15).

Further sequence alignments included the phage-encoded subunit of T7 DNA polymerase and the ε subunit of E. coli DNA polymerase III, which were reported (17, 18) to exhibit segmental amino acid sequence similarity to Pol I. Fig. 3 shows regions of homology shared by at least three of these related proteins. Six of the seven residues involved at the 3' → 5' exonuclease site and all three residues associated with the polymerase active site are located within conserved regions shared by these related proteins.

Conservation of Regions 1 and 2 Among Many DNA Polymerases. Comparisons of T5 DNA polymerase to the amino acid sequences of a variety of DNA polymerases that had
previously displayed little homology to Pol I or T7 polymerase show regions 1 and 2 to be conserved. The conservation of region 1 among Pol I and other DNA polymerases as well as its significance to the exonuclease activity of Pol I has been noted by Reha-Krantz (29).

[Figure 2 graphic: pairwise alignment of the T5 DNA polymerase and Pol I amino acid sequences; the alignment is not legible in this copy.]

FIG. 2. Alignment of amino acid sequences of T5 DNA polymerase and Pol I (16). The top sequence is T5 DNA polymerase; the bottom sequence is Pol I. Numbers indicate residues from the amino terminus. A dash represents a gap, a colon indicates identical amino acids, a dot indicates similar amino acids, and an asterisk represents a termination codon. Shaded amino acids are shared residues that are involved in the 3' → 5' exonuclease site (solid bar) or have been cross-linked to dNTP analogues in Pol I (arrowheads). Alignment was made using BESTFIT. The single-letter amino acid code is used.

[Figure 3 graphic: aligned sequence blocks for conserved regions 1-10, with the polymerases grouped as family A and family B; the alignment is not legible in this copy.]

FIG. 3. Regions of homology among various DNA polymerases: T5, E. coli Pol I (16), T7 (19), ε subunit of E. coli DNA polymerase III (ε) (20), T4 (21), PRD1 (22), φ29 (23), maize mitochondria plasmid S-1 (S-1) (24), vaccinia (Vacc) (25), varicella-zoster (VZV) (25), Epstein-Barr (EBV) (26), cytomegalovirus (CMV) (27), and Herpes simplex virus type 1 (HSV) (28). In regions 1 and 2, similar amino acids, which are in a plurality, are boxed. In other regions residues shared by two or more proteins are boxed. Shaded amino acids in regions 1, 2, and 3 are conserved residues that are involved in the 3' → 5' exonuclease active site of Pol I (14). Shaded amino acids of regions 7 and 10 are conserved residues that, in Pol I, have been cross-linked with dNTP analogues (14, 15). Underlined residues in T4 are locations of antimutator mutants. Alignments were generated by programs LFASTA, GAP, or BESTFIT. The single-letter amino acid code is used.

DISCUSSION

The characteristics of T5 DNA polymerase that confer high processivity are not clear. Despite the apparent similarity of Pol I and T7 DNA polymerase amino acid sequences to that of T5 DNA polymerase, neither of the former proteins is processive, at least without additional subunits (3, 30). The charges in the proposed DNA binding region, as defined by extensive analysis of Klenow fragment, are approximately the same in each DNA polymerase (data not shown). Additionally, the amino acid substitutions in Pol I missense mutants polA6 (Arg-690 to His) and polA5 (Gly-850 to Arg) that show decreased DNA binding and decreased processivity, respectively (31-33), would not appear to remarkably change the electrostatic potential of the enzyme. Therefore, it seems processivity may be more influenced by structural interactions between polymerase and template than by the electrostatic potential of the DNA-binding region. Primary structural dissimilarities between T5 DNA polymerase and related DNA polymerases may yield clues as to what makes a polymerase processive. Joyce and Steitz (14) proposed that a disordered subdomain (residues 569-626) in the Klenow fragment may fold across the top of the DNA-binding region, enveloping the duplex DNA. While this region is remarkably similar (48% homologous) to the corresponding region in T5 DNA polymerase and would not appear to account for the differences in processivity between the two polymerases, there is a 9-residue (residues 357-365) segment present in T5 DNA polymerase and absent in the Klenow fragment that immediately precedes the disordered domain. This area could possibly form a hinge-like structure to enclose the DNA or lengthen the disordered domain of T5 DNA polymerase.

Perhaps the most striking primary structural difference between T5 DNA polymerase and related polymerases is that alignment of the three proteins shows T5 DNA polymerase to extend 70-75 residues past the carboxyl terminus of T7 DNA polymerase and Pol I (Fig. 4). Since the carboxyl-terminal domain of Pol I has been implicated, through a variety of methods, as the major DNA-binding domain and the only significant difference among the family A polymerases is the carboxyl-terminal extension of T5 DNA polymerase, it is possible that this region is primarily responsible for the high processivity of T5 DNA polymerase. Examination of this area shows it to have no amino acid sequence similarity to E. coli thioredoxin, which is required by T7 DNA polymerase to become processive.

Comparisons of DNA polymerase amino acid sequences have suggested that there may be several distinct evolutionary groups of DNA polymerases. Based on their primary structural similarities, T5 DNA polymerase belongs to a group that includes E. coli DNA Pol I, T7 DNA polymerase (phage-encoded subunit), and the ε subunit of E. coli DNA polymerase III, which contains the 3' → 5' exonuclease activity of that enzyme. This group has been designated family A (22). T5 encodes its own 5' → 3' exonuclease separate from the DNA polymerase, which we found to show significant amino acid sequence similarity to the small proteolysis product of Pol I and therefore may also be included in this family (unpublished data). Fig. 4 shows these related proteins and approximate regions of primary structure similarities. Through crystallographic, genetic, and biochemical techniques, Pol I has been shown to consist of three structural and enzymatically separate domains (14). The small proteolysis product (residues 1-323) contains the 5' → 3' exonuclease activity. The remaining large fragment contains the 3' → 5' exonuclease (residues 324-517) and DNA polymerase domains (residues 521-928). Homology of a protein to a specific Pol I domain(s) directly corresponds to shared enzymatic activity(ies).

Family B DNA polymerases are also grouped based on their amino acid sequence similarities and are quite extensive in number and variety. They include bacteriophages (PRD1, φ29, PZA, and T4), eukaryotic viruses (herpes simplex virus, Epstein-Barr virus, cytomegalovirus, varicella-zoster virus, adenovirus, and vaccinia), Saccharomyces cerevisiae POL 1, human DNA polymerase α, and putative DNA polymerases from yeast killer plasmid pGKL1 and maize mitochondrial DNA S-1 (21, 22, 34-36). Although the primary structural details of these two groups seem very different, the three-dimensional features may be similar. However, no family B DNA polymerase has yet been crystallized and therefore tertiary comparisons cannot be made at this time. The existence of regions shared by members of both family A and family B (Fig. 3, regions 1 and 2) removes the clear separation of these two DNA polymerase families and provides insight concerning the 3' → 5' exonuclease active site and evolution of the DNA polymerases.

The conservation of a particular amino acid sequence throughout evolution implies that the segment may have some essential functional role. Evidence from two DNA polymerase groups, family A (Pol I) and family B (T4 and herpes), indicates regions 1 and 2 may be intimately involved in 3' → 5' exonuclease activity of DNA polymerases. Regions 1 and 2 contain four highly conserved residues that are involved in the 3' → 5' exonuclease site of the Klenow fragment. Moreover, Klenow fragment missense mutants in the aspartic (Asp-355) and glutamic acid (Glu-357) residues of region 1 lack exonuclease but retain polymerase activity (37). For T4 DNA polymerase, replacement of Trp-202 with Ser or Trp-213 with Tyr results in a weak antimutator phenotype. These residues are located in or near regions 1 and 2. Although these mutants have not been enzymatically analyzed, 3' → 5' exonuclease activity has been determined to be a major element of polymerase fidelity (3, 38). However, it should be noted that mutator and antimutator mutants have been located at a large number of sites in T4 DNA polymerase and that other determinants such as base selection are integral to polymerase fidelity (29, 39). The location of an exonuclease active site involving regions 1 and 2 is consistent with the proposed enzymatic configuration of the family B polymerases. Evidence from T4 and herpes supports an N-terminal 3' → 5' exonuclease and C-terminal DNA polymerase activity (29, 40). It may also be suggested that the presence or absence of this conserved sequence is indicative of the enzymatic capabilities of a DNA polymerase. For example, E. coli DNA polymerase III α subunit, which has no exonuclease activity (41), does not appear to contain homologies to regions 1 and 2. Exonuclease activity has been detected in most of the polymerases containing these regions (3, 42-45) and has not been eliminated from the remainder (yeast Pol I, S-1, Epstein-Barr virus, varicella-zoster virus, and cytomegalovirus).

The evolution of DNA polymerases is of particular interest because of their essential role in the transmission of genetic information from each generation to the next. Until recently it appeared that DNA polymerases were divided into several wholly dissimilar evolutionary groups. The conservation of an amino acid sequence segment, and its putative function
among these groups, suggests that they all share some common ancestral gene. However, the alternative hypothesis that dissimilar proteins evolved a common protein structure to accomplish a shared enzymatic activity cannot be eliminated. Since the remainder of these proteins is still quite different, it may be possible that the exonuclease domain of these enzymes was fused to the DNA polymerase domain during the evolution of the various enzymes. This may be supported by the structural and functional independence of these two domains in Klenow fragment as shown through crystallographic analysis and the ability of the polymerase domain to retain activity when cloned separately (46).

[Figure 4 graphic: schematic line maps of Pol I (small fragment and Klenow fragment), T5 exo, T5 DNA polymerase, T7 DNA polymerase (phage subunit), and the epsilon subunit of Pol III, with conserved regions 1-10 marked and a 100-residue scale bar.]

FIG. 4. Spatial relationships of conserved regions. Lengths of lines are proportional to protein sizes. N and C represent amino and carboxyl termini, respectively, and thick lines represent approximate locations of conserved regions. Numbers correspond to region designations from Fig. 3 and the arrowhead indicates the division of the small 3' → 5' exonuclease subdomain from the large polymerase subdomain of the Klenow fragment (14).

We thank Robert K. Fujimura for supplying bacteriophage T5. This research was supported by Grant CM-28013 from the National Institutes of Health to J.I.

1. DeWaard, A., Paul, A. V. & Lehman, I. R. (1965) Proc. Natl. Acad. Sci. USA 54, 1241-1247.
2. Das, S. K. & Fujimura, R. K. (1979) J. Biol. Chem. 254, 1227-1232.
3. Kornberg, A. (1980) DNA Replication (Freeman, San Francisco).
4. Das, S. K. & Fujimura, R. K. (1980) Nucleic Acids Res. 8, 657-671.
5. Fujimura, R. K. & Das, S. K. (1980) Prog. Nucleic Acid Res. Mol. Biol. 24, 87-107.
6. Messing, J. & Vieira, J. (1982) Gene 19, 269-276.
7. Dale, R. M. K., McClure, B. A. & Houchins, J. P. (1985) Plasmid 13, 31-40.
8. Maxam, A. & Gilbert, W. (1980) Methods Enzymol. 65, 499-560.
9. Lipman, D. J. & Pearson, W. R. (1988) Proc. Natl. Acad. Sci. USA 85, 2444-2448.
10. Fujimura, R. K., Tavtigian, S. V., Choy, T. L. & Roop, B. (1985) J. Virol. 53, 495-500.
11. Rhoades, M. (1982) J. Virol. 43, 566-573.
12. Schneider, S. S., Roop, B. C. & Fujimura, R. K. (1985) J. Virol. 56, 245-249.
13. Fujimura, R. K. & Roop, B. (1976) J. Biol. Chem. 251, 2168-2175.
14. Joyce, C. M. & Steitz, T. A. (1987) Trends Biochem. Sci. 12, 288-292.
15. Pandey, V. N., Stone, K. L., Williams, K. R. & Modak, M. J. (1987) Biochemistry 26, 7744-7748.
16. Joyce, C. M., Kelly, W. S. & Grindley, N. D. F. (1982) J. Biol. Chem. 257, 1958-1964.
17. Ollis, D. L., Kline, C. & Steitz, T. A. (1985) Nature (London) 313, 818-819.
18. Joyce, C. M., Ollis, D. L., Rush, J., Steitz, T. A., Konigsberg, W. H. & Grindley, N. D. F. (1985) in Protein Structure, Folding and Design, UCLA Symposia on Molecular and Cellular Biology, ed. Oxender, D. (Liss, New York), Vol. 32, pp. 197-205.
19. Dunn, J. J. & Studier, F. W. (1983) J. Mol. Biol. 166, 477-535.
20. Echols, H., Lu, C. & Burgers, P. M. J. (1983) Proc. Natl. Acad. Sci. USA 80, 2189-2192.
21. Spicer, E. K., Rush, J., Fung, C., Reha-Krantz, L. J., Karam, J. D. & Konigsberg, W. H. (1988) J. Biol. Chem. 263, 7478-7486.
22. Jung, G., Leavitt, M. C., Hsieh, J.-C. & Ito, J. (1987) Proc. Natl. Acad. Sci. USA 84, 8287-8291.
23. Yoshikawa, H. & Ito, J. (1982) Gene 17, 323-335.
24. Paillard, M., Sederoff, R. R. & Levings, C. (1985) EMBO J. 4, 1125-1128.
25. Davidson, A. J. & Scott, J. E. (1986) J. Gen. Virol. 67, 1759-1816.
26. Baer, R., Bankier, A. T., Biggin, M. D., Deininger, P. L., Farrell, P. L., Sequin, C., Gibson, T. J., Hatfull, G., Hudson, G. S., Satchwell, S. C., Tuffnell, P. S. & Barrell, B. G. (1984) Nature (London) 310, 207-211.
27. Kouzarides, A., Bankier, A. T., Satchwell, S. C., Weston, K., Tomlinson, P. & Barrell, B. G. (1987) J. Virol. 61, 125-133.
28. Quinn, J. P. & McGeoch, D. J. (1985) Nucleic Acids Res. 13, 8143-8164.
29. Reha-Krantz, L. J. (1988) J. Mol. Biol. 202, 711-724.
30. Tabor, S., Huber, H. E. & Richardson, C. C. (1987) J. Biol. Chem. 262, 16212-16223.
31. Kelly, W. S. & Grindley, N. D. F. (1976) Nucleic Acids Res. 3, 2971-2983.
32. Matson, S. W., Campaldo-Kimball, F. N. & Bambara, R. A. (1978) J. Biol. Chem. 253, 7851-7856.
33. Joyce, C. M., Fujii, D. M., Laks, H. S., Hughes, C. M. & Grindley, N. D. F. (1985) J. Mol. Biol. 186, 283-293.
34. Jung, G., Leavitt, M. C. & Ito, J. (1987) Nucleic Acids Res. 15, 9088.
35. Pizzagalli, A., Valsasnini, P., Plevani, P. & Lucchini, G. (1988) Proc. Natl. Acad. Sci. USA 85, 3772-3776.
36. Wong, S. W., Wahl, A. F., Yaun, P.-M., Arai, N., Pearson, B. E., Arai, K., Korn, D., Hunkapiller, M. & Wang, T. S. F. (1988) EMBO J. 7, 37-47.
37. Derbyshire, V., Freemont, P. S., Sanderson, L. B., Friedman, J. M., Joyce, C. M. & Steitz, T. A. (1988) Science 240, 199-201.
38. Muzyczka, N., Poland, R. L. & Bessman, M. J. (1972) J. Biol. Chem. 247, 7116-7122.
39. Gillin, F. D. & Nossal, N. G. (1976) J. Biol. Chem. 252, 5219-5232.
40. Gibbs, J. S., Chiou, H. C., Hall, J. B., Mount, D. W., Retondo, M. J., Weller, S. K. & Coen, D. M. (1985) Proc. Natl. Acad. Sci. USA 82, 7969-7973.
41. Maki, H. & Kornberg, A. (1985) J. Biol. Chem. 260, 12987-12992.
42. Yoo, S. & Ito, J. (1989) Virology 170, 422-429.
43. Watabe, K., Leusch, M. & Ito, J. (1984) Biochem. Biophys. Res. Commun. 123, 1019-1026.
44. Challenberg, M. D. & Englund, P. T. (1979) J. Biol. Chem. 254, 7820-7826.
45. Knopf, K.-W. (1979) Eur. J. Biochem. 98, 231-244.
46. Freemont, P. S., Ollis, D. L., Steitz, T. A. & Joyce, C. M. (1986) Proteins 1, 66-73.