
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

A NOVEL SECOND DOMAIN INVOLVED IN GEMINIVIRUS REP PROTEIN OLIGOMERIZATION

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy

in the Graduate School of The Ohio State University

By

Jared Quentin LeMaster

* * * *

The Ohio State University

2002

Dissertation Committee:

Dr. David M. Bisaro, Advisor

Dr. Mark Muller

Dr. Michael Ostrowski

Dr. Deborah Parris

Approved By

Advisor

Molecular Genetics Graduate Program

UMI Number: 3049067

UMI Microform 3049067 Copyright 2002 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code.

ProQuest Information and Learning Company, 300 North Zeeb Road, P.O. Box 1346, Ann Arbor, MI 48106-1346

ABSTRACT

The Geminiviridae is a growing family of small ssDNA viruses that cause significant disease in a variety of important commercial and staple crops throughout tropical and temperate regions of the world. Geminiviruses replicate their genomes by a rolling circle replication (RCR) mechanism similar to that of bacteriophage φX174. Their replication cycle depends heavily on host cellular replication machinery. Only one viral protein, Rep, is required for initiation and termination of RCR. Rep protein-protein complex formation, i.e. oligomerization, is a key event in geminivirus replication. Current evidence suggests that Rep forms several, most likely different, multi-protein complexes in infected cells. These complexes, which include Rep oligomers, Rep-REn (Viral Replication ENhancer protein) and Rep-pRBl (plant homologue of the mammalian retinoblastoma related tumor suppressor protein, Rb) complexes, are all formed via protein-protein interactions with the same target region of Rep. A mechanism must therefore exist that regulates which kind of complex Rep forms. We hypothesized that multiple protein regions (domains) are involved in geminivirus Rep protein oligomerization, and we tested this hypothesis using the yeast two-hybrid system. We constructed a variety of Rep deletions and examined their abilities to interact with one another. Our results indicate that the Rep protein of the geminivirus Tomato Golden Mosaic Virus (TGMV) has two domains important for homo-oligomerization. The first, previously described region is located in the middle of TGMV Rep between residues 116 and 181 and is referred to as domain I in this work. The second, novel domain, domain II, is located in the carboxy-terminal 52 amino acids of TGMV Rep and functions in a position-independent manner. Computer-aided analysis of the C-terminal interaction domain revealed the presence of a predicted α-helical region that is highly conserved throughout the Geminiviridae.

Additionally, we demonstrated that the C-terminal region is involved in homo-oligomerization of Rep proteins from three distinct geminiviruses. We also demonstrated that TGMV Rep is capable of forming oligomers with heterologous Rep proteins from two related geminiviruses, one of which belongs to a separate geminivirus genus. Lastly, geminivirus replication and the roles that Rep oligomerization may play in the replicative process are discussed in light of what is known about RCR and other relevant model replication systems.


THIS DISSERTATION IS DEDICATED TO

MY MOM AND DAD

THANKS FOR EVERYTHING!

ACKNOWLEDGMENTS

I would like to thank my advisor and committee member, Dr. David M. Bisaro, for his patience, keen critical reviews and advice throughout my graduate education. I would also like to thank Dr. Garry Sunter for advice on protocols and friendship. Janet Sunter and Dr. Kenn Buckley have also provided assistance with this work as well as many refreshing debates over the years. I would also like to thank my fellow graduate students for their help. To my good friends Linhui Hao and Hui Wang, thanks for all the times you looked after my experiments and provided technical assistance. I am also indebted to graduate school veterans Marcus Hartitz and Dr. Fredrick Meyer for pointing out so many of the pitfalls graduate students can fall into along the way.

I would also like to thank all the people who made everyday life bearable: the staff of the biotechnology center, Melinda Parker, Dave Long, Mike Zianni, and Billy; the MCDB secretary, Jan Zinich; and the Molecular Genetics office staff, especially Jessie, for all their hard work and patience. I also thank the Molecular Genetics graduate school committee chairman, Dr. Mike Ostrowski, for his continued patience with me while I finished my dissertation. I would also like to thank the additional members of my committee: Dr. Mark Muller, Dr. Deborah Parris, and Dr. Mike Ostrowski. Their advice has helped me to become a more disciplined and focused scientist.

Additionally, I owe many thanks to a great myriad of friends and family for all their help, thoughts and prayers. To my parents, James and Jean, who have helped in every way imaginable and to whom I cannot say thank you enough. To my in-laws, Pete and Sharon, first of all, for letting me marry their wonderful daughter, and secondly, for providing so much support throughout the years. I must also thank the many friends I’ve been blessed with who have allowed me to live in their homes over the past two and a half years: Ed D’amato, Scott and Danielle Smith, and Greg and Danielle Hartt. Although you never said it, I’m sure I was a nuisance; thank you very much for your hospitality. To my family, Dr. Elizabeth LeMaster (Betsy), Jacob Stanley (Jack), Peter James (Pete) and little Isabelle, I’m sorry for all the inconvenience and irritation I have caused over the years and I am eternally in your debt for your constant love and support.

Lastly, I would like to thank the Lord, Jesus Christ, whose constant presence has seen me through even the darkest times and to whom all thanks and credit is ultimately due.

VITA

Septembers, 1971 Born - Ravenna, Ohio USA

1994, B.S. Biology, Kent State University, Ohio USA

1994-2001 Researcher, The Ohio State University, Columbus, Ohio, Dept. of Molecular Genetics

1994-2002 Graduate Teaching Associate, The Ohio State University, Columbus, Ohio USA, Dept. of Molecular Genetics

POSTERS, SEMINARS AND PUBLICATIONS

Posters / Seminars: LeMaster, J.Q., Buckley, K., Sunter G., Davis, K.R., Bisaro, D.M. (Nov. 1996) Interactions Between Geminivirus Replication Proteins (REP and AL3). Plant Molecular Biology and Biotechnology 1996 Symposium- Plant Responses to the Environment. Museum of Biological Diversity, Columbus, Ohio.

LeMaster, J.Q., Bisaro, D.M. (Sept. 1998) Protein Interactions of Tomato Golden Mosaic Virus Replication Proteins REP and REn. Stone Laboratory Research Center Symposium, Put-in-Bay, Ohio.

LeMaster, J.Q., Bisaro, D.M. (Jan. 1999) In vitro Analysis of Recombinant TGMV Replication Proteins. Scott Falkenthal Memorial Graduate Student Colloquium, Columbus, Ohio.

LeMaster, J.Q., Bisaro, D.M. (Jun. 1999) Interactions Between the Geminivirus Replication Proteins AL1 (Rep) and AL3 (REn). Keystone Symposium- Molecular Mechanisms in DNA Replication and Recombination, Taos, New Mexico.

Research publications: LeMaster, J.Q., Bisaro, D.M. A Novel Second Domain Involved in Geminivirus Rep Protein Oligomerization. 2002. (manuscript in preparation)

FIELDS OF STUDY

Major Field: Molecular Genetics

TABLE OF CONTENTS

Page

Abstract ...... ii
Dedication ...... iii
Acknowledgments ...... iv
Vita ...... v
List of Figures ...... ix
List of Tables ...... x

Chapters:

1. Introduction ...... 1
1.1. Background ...... 1
1.1.1. Historical Background ...... 1
1.1.2. Geographic Distribution ...... 5
1.1.3. Agronomic Importance of Geminiviruses ...... 6
1.1.4. Topics of Geminivirus Research Around the World ...... 7
1.2. The Family Geminiviridae: ...... 10
1.2.1. General Family Characteristics ...... 10
1.2.2. Genome Organization and Nomenclature ...... 12
1.2.3. Functions of Viral Genes ...... 15
1.3. Rolling Circle Replication ...... 18
1.3.1. General Features ...... 18
1.4. Model Replication Systems ...... 19
1.4.1. Replication of φX174: The Classic Model of Rolling Circle Replication ...... 19
1.4.2. Replication of Adeno-Associated Virus (AAV): A Related Rolling Hairpin Model Replication System ...... 21
1.4.2.1. AAV Replication Initiator Proteins ...... 25
1.4.2.2. Factors Associated With AAV Rep 78/68 ...... 29
1.4.2.3. Interactions of AAV Initiator Proteins ...... 31
1.4.3. SV40 Large T antigen ...... 32
1.5. Geminivirus Replication ...... 35
1.5.1. General Characteristics of Geminivirus DNA Replication ...... 35
1.5.2. Review of Geminivirus Rep Protein Structure and Function ...... 38
1.5.3. Molecular Characterization of Rep ...... 41
1.5.4. Geminivirus Origin Interactions with Rep ...... 43
1.5.5. Rep Protein-Protein Interactions ...... 46
1.5.6. Review of Prior Geminivirus Oligomerization Domain Studies ...... 49
1.6. General Statement of Hypothesis and Related Research Goals ...... 54

2. Two Separate Regions of Rep are Important for Rep-Rep Oligomerization and These Domains are Conserved in at Least Two Geminivirus Genera ...... 55
2.1. Introduction ...... 55
2.2. Experimental Results ...... 56
2.2.1. Two Domains of Rep are Involved in Two-Hybrid Rep Oligomerization ...... 56
2.2.1.1. The Yeast Two-hybrid System ...... 56
2.2.1.1.1. Advantages and Disadvantages of the Two-Hybrid System ...... 60
2.2.1.2. Two Domains of Rep are Involved in Rep Oligomerization in the Two-Hybrid System ...... 66
2.2.2. Geminivirus Rep Forms both Hetero-Oligomers and Homo-Oligomers in the Two-Hybrid System ...... 71
2.2.3. A C-Terminal Moiety is Important for Rep Protein Oligomerization in Three Distinct Geminivirus Species ...... 74
2.2.4. A 52 Amino Acid C-Terminal Moiety Important for Rep Oligomerization Retains Function Even When Moved Out of its Natural Context: ...... 80
2.2.5. Rep116-181/300-352 and Rep116-181 Gal4 DBD Fusion Proteins are Expressed at Similar Steady-State Levels in Yeast ...... 83
2.2.6. Quantitative Liquid Assay Measurements Indicate the C-Terminus of Rep is Capable of Stimulating Two-Hybrid Rep Oligomerization as much as 400 Fold ...... 87
2.2.6.1. The Two-Hybrid Liquid Assay System ...... 87
2.2.6.2. C-Terminal Oligomerization Stimulation Exhibits the Same Pattern of Activity with Three Different Two-Hybrid Rep Bait Constructs ...... 89
2.3. Experimental Methods ...... 94
2.3.1. TGMV Rep Two-Hybrid Expression Plasmid Construction ...... 94
2.3.2. BCTV Rep Expression Plasmid Construction ...... 95
2.3.3. SqLCV Rep Expression Plasmid Construction ...... 95
2.3.4. Experimental Yeast Strains ...... 95
2.3.5. Yeast Transformation Procedure ...... 96
2.3.6. β-Galactosidase Filter Assay Procedure ...... 97
2.3.7. Plasmid Recovery from Y190 ...... 97
2.3.8. Selection of AD Plasmids ...... 98
2.3.9. β-Galactosidase Liquid Assay Procedure ...... 98
2.3.10. Immunoprecipitations of Gal4 DNA Binding Domain/Rep Fusion Proteins ...... 100
2.3.11. ECL Western Blot Technique ...... 101

3. Discussion ...... 103
3.1. The Existing Rep Oligomerization Model: ...... 103
3.2. Suggested Refinements to the Rep Oligomerization Model Based on Two-Hybrid Results: ...... 105
3.2.1. Overview ...... 105
3.2.2. The TGMV Rep N-terminus Reduced Two-Hybrid Interaction Efficiency: A Regulatory Domain? ...... 105
3.2.3. Involvement of Domain II (Rep300-352) in Rep Oligomerization: How & Why? ...... 107
3.2.4. Species Generality of Rep Oligomerization: A Common Theme? ...... 111
3.2.5. The Refined Model: ...... 113
3.2.6. Summary of Results and Further Speculation ...... 116
3.2.7. Rep Oligomerization Domain Transgenes for Engineering Disease Resistance? ...... 116

Bibliography ...... 117

LIST OF FIGURES

Number Page

1.1. Electron Micrograph of Geminivirus Particles ...... 11
1.2. Geminivirus Genome Diagrams ...... 14
1.3. Rolling Circle Diagram of φX174 Replication ...... 20
1.4. AAV Rolling Hairpin Replication ...... 24
1.5. Geminivirus Replication Scheme ...... 37
1.6. Schematic of TGMV Rep Functional Domains ...... 42
1.7. Diagram of TGMV Origin ...... 45
1.8. DNA and Protein Binding Domains of Rep ...... 53

2.1. Two-Hybrid System Diagram ...... 57
2.2. Two-Hybrid Rep Deletion Series ...... 65
2.3. Two-Hybrid Rep Deletion Series Example Filters ...... 68
2.4. Filter Assay of Rep116-181 Oligomerization ...... 70
2.5. Two-Hybrid Heterologous Rep Interactions ...... 72
2.6. Two-Hybrid Rep C-Terminal Truncations ...... 77
2.7. Two-Hybrid Domain I vs. I & II ...... 81
2.8. ECL Western Blot of Rep116-181 & Rep116-181/300-352 ...... 85
2.9. RLU vs. Rep116-352 Protein Concentration Optimization Curve ...... 87
2.10. Graph of Two-Hybrid Liquid Assay Measurements ...... 90

3.1. Current Rep Oligomerization Domain Model ...... 103
3.2. Rep Protein Alignments ...... 111
3.3. Refined Rep Oligomerization Model: ...... 114

LIST OF TABLES

Number Page

1.1. Current Members of the Geminiviridae ...... 3-4
1.2. Geminivirus Genes and Their Functions ...... 17
1.3. Cellular Factors Associated with Parvovirus Replication ...... 28
1.4. Geminivirus Rep Protein Functions ...... 40

2.1. The C-Terminus Participates in Rep Hetero-Oligomerization ...... 78
2.2. Percent Reduction / Increase in Liquid Assays ...... 91

3.1. Rep Protein Structural Predictions ...... 109

Chapter 1

INTRODUCTION

1.1. Background:

The subject matter of this document concerns geminiviruses, DNA replication, viral replication, protein interaction, the two-hybrid system and various other pertinent aspects of molecular genetics. Therefore, the first chapter is dedicated to reviewing relevant background information in these areas. If the reader is familiar with the pertinent background it may be beneficial to begin in section 1.6. with the statement of hypothesis and research goals and refer back to the introduction only if clarification is needed.

1.1.1. Historical Background:

Diseases caused by geminivirus infection were first described over a hundred years ago [1], but geminiviruses were not positively identified as the causative agents of these diseases until the 1970’s. By the late 1970’s several geminiviruses had been isolated and confirmed to cause disease [2]. Among the first definitively linked to disease was Cassava Latent Virus, CLV (later renamed African Cassava Mosaic Virus, ACMV), which was determined to be the causative agent of cassava mosaic disease (CMD). Maize streak disease, which was demonstrated to be caused by Maize Streak Virus in 1974, is a devastating disease of maize and sugarcane. Also in the early 1970’s, particles of Curly Top Virus, CTV (later renamed Beet Curly Top Virus, BCTV) were first purified and examined, justifying their inclusion in this growing group of geminiviruses [2]. The geminiviruses have since grown to include more than 76 reported, reasonably well-characterized members [3] afflicting numerous dicotyledonous (dicot) and monocotyledonous (monocot) host species. Hundreds of additional geminivirus sequence variants have also been described [4, 3] [6]. The majority of these variants are considered products of random sequence drift and do not represent novel viruses, although their existence may illustrate a strategy for generating emergent viral species. Some geminivirus variants display great enough sequence (>S%) and biological (host, symptom development, vector) diversity to warrant classification as a new geminivirus [7]. Such classifications are continually revised and updated as our knowledge of geminivirus biology and genetics increases. Table 1.1 lists viruses currently identified as geminiviruses.

No.  Virus name                                Abbreviation  # of strains / sequence variants

Mastrevirus
1    Bean yellow dwarf virus                   BeYDV
2    Chloris striate mosaic virus              CSMV
3    Digitaria streak virus                    DSV
4    Maize streak virus                        MSV
5    Millet streak virus                       MlSV
6    Miscanthus streak virus                   MiSV
7    Panicum streak virus                      PanSV
8    Setaria streak virus                      SetSV
9    Sugarcane streak Egypt virus              SSEV
10   Sugarcane streak Reunion virus            SSREV
11   Sugarcane streak virus                    SSV
12   Tobacco yellow dwarf virus                TYDV
13   Wheat dwarf virus                         WDV

Curtovirus
14   Beet mild curly top virus                 BMCTV
15   Beet severe curly top virus               BSCTV
16   Beet curly top virus                      BCTV
17   Horseradish curly top virus               HrCTV

Topocuvirus
18   Tomato pseudo-curly top virus             TPCTV

Begomovirus
19   Abutilon mosaic virus                     AbMV
20   African cassava mosaic virus              ACMV
21   Ageratum yellow vein virus                AYVV
22   Althea rosea enation virus                AREV
23   Bean calico mosaic virus                  BCaMV
24   Bean dwarf mosaic virus                   BDMV
25   Bean golden mosaic virus                  BGMV
26   Bean golden yellow mosaic virus           BGYMV         18
27   Cabbage leaf curl virus                   CaLCuV
28   Chayote mosaic virus                      ChaMV
29   Cotton leaf crumple virus                 CLCrV
30   Cotton leaf curl Alabad virus             CLCuAV        2
31   Cotton leaf curl Kokhran virus            CLCuKV        3
32   Cotton leaf curl Multan virus             CLCuMV        7
33   Cotton leaf curl virus                    CLCuV         35
34   Cotton yellow mosaic virus                CYMV
35   Cowpea golden mosaic virus                CPGMV         4
36   Cucurbit leaf crumple virus               CuLCrV
37   Cucurbit leaf curl virus                  CuLCuV
38   Dicliptera yellow mottle virus            DiYMoV
39   East African cassava mosaic virus         EACMV         15
40   Eupatorium yellow vein virus              EpYVV         2
41   Euphorbia mosaic virus                    EuMV
42   Honeysuckle yellow vein mosaic virus      HYVMV
43   Indian cassava mosaic virus               ICMV
44   Jatropha mosaic virus                     JMV
45   Leonurus mosaic virus                     LeMV
46   Macroptilium golden mosaic virus          MGMV          4
47   Mungbean yellow mosaic India virus        MYMIV
48   Mungbean yellow mosaic virus              MYMV
49   Okra enation virus                        OkEV
50   Okra mosaic Mexico virus                  OkMMV
51   Okra yellow vein mosaic virus             OYVMV         8
52   Papaya leaf curl virus                    PaLCuV
53   Pepper golden mosaic virus                PepGMV        3
54   Pepper huasteco yellow vein virus         PHYVV
55   Pepper leaf curl virus                    PepLCV
56   Potato yellow mosaic virus                PYMV          5
57   Rhynchosia golden mosaic virus            RhGMV
58   Sida golden mosaic Costa Rica virus       SiGMCRV
59   Sida golden mosaic Florida virus          SiGMFV
60   Sida golden mosaic Honduras virus         SiGMHNV
61   South African cassava mosaic virus        SACMV
62   Soybean crinkle leaf virus                SbCLV
63   Squash leaf curl China virus              SLCCNV
64   Squash leaf curl virus                    SLCV
65   Squash yellow mottle virus                SYMoV
67   Sweet potato leaf curl virus              SPLCV
68   Tobacco apical stunt virus                TbASV
69   Tobacco leaf curl China virus             TbLCCNV
70   Tobacco leaf curl India virus             TbLCIV
71   Tobacco leaf curl Japan virus             TbLCJV
72   Tobacco leaf curl virus                   TLCV
73   Tobacco leaf curl Yunnan virus            TbLCYV
74   Tomato curly stunt virus                  ToCSV
75   Tomato dwarf leaf curl virus              ToDLCV
76   Tomato golden mosaic virus                TGMV
77   Tomato golden mottle virus                TGMoV
78   Tomato leaf crumple virus                 TLCrV
79   Tomato leaf curl Bangalore virus          ToLCBV
80   Tomato leaf curl Bangladesh virus         ToLCBDV
81   Tomato leaf curl India virus              ToLCIV
82   Tomato leaf curl Indonesia virus          ToLCIDV
83   Tomato leaf curl Karnataka virus          ToLCKV
84   Tomato leaf curl Laos virus               ToLCLV
85   Tomato leaf curl New Delhi virus          ToLCNDV
86   Tomato leaf curl Nicaragua virus          ToLCNV
87   Tomato leaf curl Philippine virus         ToLCPV
88   Tomato leaf curl Senegal virus            ToLCSV
89   Tomato leaf curl Sinaloa virus            ToLCSinV      4
90   Tomato leaf curl Sri Lanka virus          ToLCSLV
91   Tomato leaf curl Taiwan virus             ToLCTWV
92   Tomato leaf curl Tanzania virus           ToLCTZV
93   Tomato leaf curl virus                    ToLCV         3
94   Tomato mild mottle virus                  ToMMoV
95   Tomato mosaic Barbados virus              ToMBV
96   Tomato mosaic Havana virus                ToMHV
97   Tomato mottle Taino virus                 ToMoTV
98   Tomato mottle virus                       ToMoV
99   Tomato rugose mosaic virus                ToRMV         2
100  Tomato severe leaf curl virus             ToSLCV        3
101  Tomato Uberlandia virus                   ToUV
102  Tomato yellow dwarf virus                 ToYDV
103  Tomato yellow leaf curl China virus       TYLCCNV       3
104  Tomato yellow leaf curl Kuwait virus      TYLCKWV
105  Tomato yellow leaf curl Sardinia virus    TYLCSV        9
106  Tomato yellow leaf curl Sudan virus       TYLCSDV
107  Tomato yellow leaf curl Thailand virus    TYLCTHV       4
108  Tomato yellow leaf curl virus             TYLCV         24
109  Tomato yellow mottle virus                TYMoV
110  Tomato yellow vein streak virus           ToYVSV
111  Watermelon chlorotic stunt virus          WmCSV         3
112  Wissadula golden mosaic virus             WGMV
     19 as yet unnamed viruses

Table 1.1. Current Members of the Geminiviridae: Viruses currently included in the geminivirus family. This list was compiled from Dr. Claude Fauquet's web site (hitpv/www.aruciwusLaiu/~«iihn)/counes/iiubjKmi) and GenBank entries. The list includes the common name and corresponding abbreviation of each geminivirus member and the number of currently known sequence variants or strains examined to date. There are 13 mastreviruses, 5 curtoviruses, 1 topocuvirus, and over 100 begomoviruses.

1.1.2. Geographic Distribution:

Most geminivirus diseases are limited to tropical and subtropical regions of the world because geminivirus insect vectors thrive in warmer than average climates. These insect vectors remained partially geographically isolated until the advent of global transportation, which circumvented many of the barriers to vector population movement [6, 8]. Geminivirus vectors, which include leafhoppers, treehoppers, and whiteflies, occasionally experience shifts in population that can have major impacts on viral epidemiology. Adaptation of the insect vector to new habitats has greatly facilitated geminivirus spread. For example, since the early 1990’s a new, less fastidious and more fecund whitefly (biotype B) has almost completely replaced the previously established biotype A in North America [9, 10]. The expanded feeding range of new whitefly vectors has contributed to an explosive radiation of many geminiviruses into new geographic areas and new species of host plants.

Traditional geminivirus vector habitats such as Southern Asia, India, Pakistan, Central America, Central and Northern South Africa, the Mediterranean Basin, and Middle Eastern countries remain among the most greatly affected areas. In these localities geminiviruses have been reported to cause nearly complete crop failure. For example, Tomato Yellow Leaf Curl Virus (TYLCV) is the predominant limiting factor affecting tomato cultivation in Central Africa and it is also of major concern in the Mediterranean Basin [11]. In Africa, ACMV routinely decimates staple cassava crops, causing socioeconomic hardship [5, 12]. Disease caused by geminivirus infection is less epidemic in more temperate regions of the world. However, geminiviruses and the diseases they cause have had a substantial negative agronomic impact in Australia, Southern Europe, South America, the Caribbean, Japan and the Southern United States [13]. In the United States, BCTV has periodically been responsible for the devastation of beet and pepper crops since the mid 1900’s. Spikes in annual leafhopper population seem to coincide with more severe BCTV outbreaks, underscoring the reliance of geminiviruses upon their vector populations [1].

Concern about spread of geminivirus infection into previously unthreatened regions continues to grow. The appearance of more aggressive viral vectors, particularly the new type B whitefly biotype, has raised the possibility that large numbers of crops that have not previously been endangered by geminiviruses may soon be targets for infection. In the early 1990’s serious geminivirus infection of crops in Florida and Texas was first reported. As of 1997, geminivirus infection has been observed as far north in the United States as South Carolina, Virginia and Tennessee [6].

In addition to the appearance of new viral vectors and changes in viral vector habitat, virus evolution is another important factor contributing to the emergence of new disease. Geminiviruses continue to evolve and create emergent species with new and oftentimes expanded host ranges [14]. Different geminiviruses can frequently infect the same plant host, including sub-optimal weed species. It is thought that these abundant weed species act as reservoirs for geminiviruses, harboring several different viruses at the same time. It is known that geminiviruses are capable of freely undergoing intermolecular recombination when present in co-infected tobacco, Nicotiana benthamiana [4, 15, 16]. Although most intermolecular recombination events lead to the production of non-infectious chimeras, those between closely related viral strains have been reported to be infectious, and in some cases these chimeras possess altered and/or increased virulence. Mixed infections by diverse geminiviruses, brought together as a consequence of expanding geographical and host ranges of the viral vector, especially the whitefly vector, serve as major potential sources for the generation of new virus strains [4, 17]. Rapid generation of novel geminivirus genomes and concurrent vector range expansion seem likely to ensure that incidences of disease caused by geminivirus infection will continue to increase.

1.1.3. Agronomic Importance of Geminiviruses:

Geminiviruses were originally thought to be a relatively small group of plant pathogens that infected a limited number of host species. During the 1990’s, geminiviruses gained more attention as a major crop threat throughout the world. Currently, they are known to cause significant crop disease in at least 39 nations. For several successive years in the last decade, geminiviruses destroyed up to 95% of the tomato harvest in the Dominican Republic. In the United States geminivirus infection was responsible for an estimated $140 million loss to the Florida tomato crop during the 1991-1992 growing season [6]. In the late 1990’s geminiviruses were described as having caused severe loss to the tomato crop in Brazil [18]. Texas Pepper Virus (TPV) has been observed in Texas, Arizona, throughout Mexico, and Costa Rica. In the year 2000, TPV was declared a major threat to capsicum production in Costa Rica, with observed infection rates of 25-75% [19]. Geminivirus disease accounts for an ever-increasing percentage of crop damage throughout the world. The steadily expanding range of geminivirus vectors (especially the whitefly) and the viruses' proclivity for adaptation to new hosts guarantee that geminiviruses will continue to be a formidable barrier to cultivation of economically important crops until effective means to control them are developed.

1.1.4. Topics of Geminivirus Research Around the World:

Much research has been directed toward elucidating the molecular mechanisms of virus transmission, replication, macromolecular transport, , assembly, pathogenesis and suppression of post-transcriptional gene silencing (PTGS). All of these research areas involve aspects of geminivirus biology crucial to development of resistance strategies and disease control. Because of their relatively simple genomes and extensive reliance on host sub-cellular machinery, the study of geminivirus biology also provides detailed information about the processes of host DNA replication, transcription, gene expression, molecular trafficking, cell-cycle control and defense response.

Even a cursory overview of the manifold projects being undertaken worldwide concerning geminiviruses and their vectors is too lengthy to include in this document. Rather, it is hoped that a brief description of the following select group of projects will serve to illustrate the current diversity in geminivirus research. In July of 2001, at the John Innes Center in the United Kingdom, the 3rd International Geminivirus Symposium was held and over 119 speakers discussed their research progress [20].

The research presented at the symposium included the first detailed model of a geminivirus particle with resolution on the Angstrom (Å) scale. This work by the McKenna lab described the molecular structure of both the virion particle and the coat protein of MSV. The coat protein was modeled as an eight-stranded, antiparallel β-barrel motif with an N-terminal α-helix. The geminivirus particle was estimated to contain 110 copies of the coat protein arranged as a pair of fused, incomplete T=1 icosahedra. The virion dimensions measured 220 × 380 Å, which is slightly larger than most estimates of virion size for other geminiviruses. Other work by the Frischmuth lab identified two coat protein residues at positions 124 and 149 as being critical for whitefly transmission of Abutilon Mosaic Virus (AbMV), as well as a third residue at position 174 that greatly affects the efficiency of transmission.
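The figure of 110 coat protein copies can be checked against the geometry described above. The sketch below assumes the standard result that a complete T=1 icosahedron contains 60 subunits arranged as 12 pentamers, and further assumes (this is not stated in the text) that each incomplete half of the twinned particle lacks exactly one pentamer where the two halves are fused. The function and constant names are illustrative only.

```python
# Illustrative arithmetic only; names and the "one missing pentamer per half"
# assumption are hypothetical and not taken from the dissertation.

SUBUNITS_PER_PENTAMER = 5
PENTAMERS_PER_T1_ICOSAHEDRON = 12  # one pentamer at each of the 12 vertices


def geminate_subunit_count(missing_pentamers_per_half: int = 1,
                           halves: int = 2) -> int:
    """Count coat protein copies in a particle built from incomplete T=1 shells."""
    pentamers_per_half = PENTAMERS_PER_T1_ICOSAHEDRON - missing_pentamers_per_half
    return halves * pentamers_per_half * SUBUNITS_PER_PENTAMER


if __name__ == "__main__":
    # A single complete T=1 shell, for comparison.
    print("complete T=1 shell:", geminate_subunit_count(0, 1))
    # Twinned particle with one pentamer absent from each half.
    print("geminate particle:", geminate_subunit_count())
```

Under these assumptions each half contributes 11 pentamers (55 subunits), so the pair gives the 110 copies quoted for the MSV particle.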

Other projects discussed at the symposium covered topics dealing with virus-host interactions. Work performed by Hanley-Bowdoin et al. described a plausible method of regulation of host PCNA (proliferating cell nuclear antigen) gene expression by TGMV Rep protein involving the established pRb binding activity of Rep. PCNA is an essential processivity factor required for efficient DNA synthesis by DNA polymerase δ. Although direct evidence is lacking, by analogy with mammalian systems it seems likely that pRb is capable of binding a plant E2F homologue and thereby reducing the activity of E2F responsive promoters. Indeed, Hanley-Bowdoin’s group provided evidence that E2F binding sites in the PCNA promoter from N. benthamiana are important for PCNA gene activation mediated by TGMV infection in mature leaf tissue. This suggests a mechanism whereby Rep binds pRb and releases E2F to activate PCNA expression in tissues, such as mature leaf tissue, which would otherwise not express PCNA to high levels.

Several additional researchers are examining other host proteins involved in the geminivirus infection process. Gutierrez et al. used the yeast two-hybrid system to find several novel plant proteins that bind specifically to Wheat Dwarf Virus (WDV) Rep protein. The new factors are still being characterized, but at least one of them is known to be related to a host protein that binds Rep and PCNA. Their data have led them to suggest a stepwise mechanism in which Rep binds to the origin of DNA replication, forms an oligomeric complex, initiates replication and facilitates the recruitment of cellular DNA replication proteins to the viral DNA replication fork. Other laboratories, the Zhu lab for example, are taking a more holistic approach and are using oligonucleotide-based microarray technologies to identify host genes that are induced or repressed during various stages of geminivirus infection. This approach promises to generate a greater understanding of the global effect of geminivirus infection on the host.

Another important aspect of host-virus interaction under study is the way in which geminiviruses circumvent and interact with host defense response pathways, including PTGS (post-transcriptional gene silencing). It appears geminiviruses are one of an increasing number of virus groups that are able to interfere with host gene silencing. The Baulcombe lab showed that the C2 gene of ACMV is a suppressor of PTGS which acts through a conserved zinc-finger motif. Furthermore, they demonstrated that the C2 protein is also involved in symptom formation. Additionally, the Bisaro lab has demonstrated that the TGMV and BCTV homologues of C2, AL2 and L2 respectively, play yet another role in plant-pathogen interactions. In vitro and in vivo studies have demonstrated that the AL2 / L2 proteins of these viruses interact with and inactivate both ADK (adenosine kinase) and SNF1 (sucrose non-fermenting factor 1). SNF1 and ADK are important components of the plant stress response pathway. That geminiviruses target these proteins for inactivation implies that plant stress response pathways are integral partners with plant defense response pathways, and that both are important for defense against pathogen attack. In support of this idea, Bisaro et al. demonstrated that manipulation of SNF1 expression levels in plants alters their susceptibility to pathogen attack. Over-expression of SNF1 leads to decreased susceptibility and, conversely, reduced SNF1 expression leads to enhanced susceptibility.

In addition to work on virion structure, identifying host proteins interacting with geminiviruses and characterizing pathogen-host interactions, many important discoveries have been made in the fields of geminiviral epidemiology, the study of emerging virus strains, virus management and control, and viral resistance. As geminiviruses continue to present themselves as significant socio-economic threats, research along all these fronts will prove invaluable for developing coping strategies and ultimately overcoming the viral barrier to cultivation of susceptible crops.

1.2. The Family Geminiviridae:

1.2.1. General Family Characteristics:

The Geminiviridae is a large family of plant viruses that infect a wide variety of plants [21]. The family derives its name from the twinned icosahedral capsid morphology common to all group members (shown in an electron micrograph in Figure 1.1). The capsid consists of 110 copies of the coat protein arranged as a pair of fused, incomplete T=1 icosahedra. The virion dimensions measure 22 x 38 nm. The virion particle encapsidates a single-stranded DNA molecule of 2.3 to 3.0 kb. The geminivirus genome consists of one or two separate DNA molecules, depending on the virus. Members of the Geminiviridae infect both dicots and monocots, primarily from the families Solanaceae and Gramineae [20,22,23].
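The subunit count and the two sets of dimensions quoted for the virion (220 x 380 Å earlier, 22 x 38 nm here) can be checked with a short calculation. The sketch below assumes the standard result that a complete T=1 icosahedron contains 60 subunits arranged as 12 pentamers, and that each incomplete icosahedron of the twinned particle lacks one pentamer where the halves are fused. Neither assumption is spelled out in the text, and the function and constant names are illustrative only.

```python
# Sanity check of the capsid figures quoted in the text.
# Assumption (not stated in the text): a complete T=1 icosahedron has
# 12 pentameric capsomers, and each half of the twinned particle is
# missing exactly one pentamer at the junction between the halves.

SUBUNITS_PER_PENTAMER = 5
PENTAMERS_IN_COMPLETE_T1 = 12
ANGSTROMS_PER_NM = 10


def twinned_capsid_subunits(missing_pentamers_per_half: int = 1) -> int:
    """Coat protein copies in a pair of fused, incomplete T=1 icosahedra."""
    pentamers_per_half = PENTAMERS_IN_COMPLETE_T1 - missing_pentamers_per_half
    return 2 * pentamers_per_half * SUBUNITS_PER_PENTAMER


def angstrom_to_nm(value_in_angstrom: float) -> float:
    """Convert a length from Angstroms to nanometres."""
    return value_in_angstrom / ANGSTROMS_PER_NM
```

Under these assumptions `twinned_capsid_subunits()` gives 110, matching the reported copy number, and converting 220 and 380 Å gives 22 and 38 nm, so the two sets of dimensions describe the same particle size.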

Geminiviruses are spread, sometimes over great distances, by their insect vectors. The vectors include several species of leafhoppers, treehoppers and whiteflies that spread the virus as they feed, moving from plant to plant. Virus transmission is semi-persistent, allowing a single viruliferous insect to spread disease to a number of plants [24-26]. The Geminiviridae is divided into four separate genera, categorized with respect to insect vector, host range, and genome organization.

Figure 1.1. Electron Micrograph of Geminivirus Particles: Geminiviruses derive their name from their dumbbell shaped capsid morphology. Each capsid consists of 110 copies of coat protein arranged in a pair of incomplete, T=1 icosahedra. A single virion capsid measures 22 x 38 nm. Each capsid encloses a single ssDNA molecule. Figure reprinted from Field's Virology [22].

Figure 1.1. Electron Micrograph of Geminivirus Particles

Viruses belonging to the Mastrevirus genus primarily infect monocotyledonous plants, are vectored by any of a variety of leafhopper species, and possess a single component DNA genome [27]. Curtovirus members also have single component genomes and are leafhopper transmitted, but they primarily infect dicots [1, 28, 29]. Most viruses of the Begomovirus genus have two component genomes, although there are some monopartite members. All begomoviruses are transmitted exclusively by the whitefly, Bemisia tabaci, and all infect dicots [30-33]. A recently added fourth group is the Topocuvirus genus. This genus presently contains only a single member, Pseudo-Curly Top Virus. Topocuviruses are monopartite and are transmitted by treehoppers [34,33]. Very little else is currently known about the topocuviruses and they will not be discussed further in this document.

1.2.2. Genome Organization and Nomenclature:

Geminiviruses possess covalently closed, circular ssDNA genomes. The three most common geminivirus genera and their typical genome organizations are depicted in Figure 1.2. The ssDNA molecule packaged in virion particles is infectious and is termed the viral or plus (+) strand [36]. The term virus-sense strand is also used to refer to the (+) strand in some literature. Shortly after infection, host enzymes convert the plus-strand into a double-stranded DNA intermediate (replicative form; RF) used as a transcription and replication template [37,38]. The nascent strand of RF DNA is referred to as the complementary-sense or minus (-) strand. To minimize confusion, (+) and (-) strand designations are used in this text wherever possible. The genomes of all geminiviruses contain several open reading frames that flank an approximately 200 bp sequence called the intergenic region (IR) [39-41]. The intergenic region is positionally and structurally conserved among all geminiviruses. Furthermore, in each distinct two-component geminivirus, a subset of the IR DNA is identical in the intergenic regions of the A and B components. This shared intergenic region sequence in the begomoviruses, which contains the origin of replication, is often referred to as the common region (CR). The begomovirus CR is important for synchronizing replication of the A and B genome components during infection. The sequence of the geminivirus IR DNA varies from species to species with the exception of certain conserved elements thought to be important for viral transcription and replication. The intergenic region of all geminiviruses contains the plus-strand origin of replication and probably contains the minus-strand origin of replication of begomoviruses and curtoviruses as well [41-44]. The Mastrevirus minus-strand origin is located on the opposite side of the DNA molecule in an area known as the small intergenic region (SIR), which is a unique feature of this genus [40].

Up to eight open reading frames are contained in geminivirus genomes. Some of these genes are encoded in the plus-strand, others in the minus-strand. The one DNA component of monopartite viruses of necessity contains all viral information required for replication, systemic spread and symptom development. The bipartite geminiviruses have split genomes with two DNA molecules of similar size designated by an A or B. The B component encodes genes required for systemic spread and symptom development while the A component contains all the genes required for viral replication and encapsidation [45].

Genes encoded by geminiviruses are believed to be transcribed from divergent promoters in the intergenic region. To understand how geminivirus genes are named it is useful to visualize their genomes as depicted in Figure 1.2. The intergenic region is typically shown at the top of a genome diagram, with genes arranged such that the plus-strand is read 5'→3' clockwise around the circle. With this convention in mind, genes transcribed to the right of the common region are designated rightward (R) and those transcribed to the left are referred to as leftward (L). In bipartite begomoviruses gene names are also preceded by an A or B, specifying which DNA molecule encodes them. In an alternative naming scheme, the designations rightward and leftward are replaced with the terms viral-sense (V) and complementary-sense (C), respectively. Only the first letter of the polarity description is typically used in naming the genes; for example, the first gene transcribed to the right of the common region is simply referred to as R1, or in the begomoviruses AR1 or BR1. Alternatively, these genes may also be referred to as V1, or AV1 and BV1. Most literature uses the V/C convention for mastreviruses and the R/L convention for curtoviruses and begomoviruses. This text will use the R/L convention except when referring to previously published work, in which case the naming scheme used in that work will be maintained.
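The correspondence between the two naming schemes is mechanical enough to express as a short translation routine. The following is a minimal sketch, assuming gene names take the form of an optional component letter (A or B), one polarity letter and a gene number; the function name and the regular expression are illustrative and are not drawn from any published tool. Composite designations such as C1:C2 are deliberately not handled.

```python
import re

# Polarity letters in the two conventions described in the text:
# rightward (R) <-> viral-sense (V), leftward (L) <-> complementary-sense (C).
RL_TO_VC = {"R": "V", "L": "C"}
VC_TO_RL = {vc: rl for rl, vc in RL_TO_VC.items()}

# Optional component prefix (A or B), one polarity letter, then the gene number.
_GENE_NAME = re.compile(r"^(?P<component>[AB]?)(?P<polarity>[RLVC])(?P<number>\d+)$")


def convert_gene_name(name: str) -> str:
    """Translate a gene name between the R/L and V/C conventions."""
    match = _GENE_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised gene name: {name!r}")
    polarity = match.group("polarity")
    # Look the letter up in whichever table contains it.
    swapped = RL_TO_VC[polarity] if polarity in RL_TO_VC else VC_TO_RL[polarity]
    return f"{match.group('component')}{swapped}{match.group('number')}"
```

For example, `convert_gene_name("AR1")` yields `"AV1"` and `convert_gene_name("L2")` yields `"C2"`. Applying the function twice returns the original name, which reflects the fact that the two schemes differ only in the polarity letter.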

[Figure 1.2 diagram labels: Begomovirus (TGMV, SqLCV); Mastrevirus (MSV, WDV); Curtovirus (BCTV); monopartite begomovirus (TYLCV); gene labels include Rep A, (R2) and (C1:C2).]

Figure 1.2. Geminivirus Genome Diagrams. The typical genome organizations of the three common genera of geminiviruses are depicted above. Typical genome components range in size from 2.5 to 2.9 kb. Begomoviruses such as TGMV and SqLCV (top left) possess two DNA components (designated A and B). Other begomoviruses (TYLCV), the mastreviruses and the curtoviruses possess single component genomes (bottom left and right). Terminology: (IR) = intergenic region; the IR contains the origin of plus-strand replication and divergent promoters for V-sense and C-sense transcription. The minus-strand replication origin is in the IR of the begomo- and curtoviruses and in the small intergenic region (SIR) of the mastreviruses. Protein names are listed with their corresponding gene names in parentheses. Rep = Replication protein, REn = Replication Enhancer protein, CP = Coat Protein, MP = Movement Protein, NSP = Nuclear Shuttle Protein, TrAP = Transcriptional Activator Protein, Rep A = Replication protein A, unspliced Rep variant in the mastreviruses.

Figure 1.2. Geminivirus Genome Diagrams

1.2.3. Functions of Viral Genes:

The number of genes contained within geminivirus genomes varies among the different genera. The mastreviruses are the most genetically parsimonious. They encode only five identifiable proteins, one of which, RepA, is translated from an unspliced variant of the longer, spliced Rep transcript [46]. The begomo- and curtoviruses possess from six to eight genes, depending on the species of virus. Much has been elucidated concerning viral gene function and is reviewed briefly in this section.

The following is a general overview of the pathways and processes in which certain geminivirus genes are known to function. These genes and their respective functions are also summarized in Table 1.2. There exists considerable debate about how to name geminivirus genes, and this is reflected in the confusing and often conflicting schemes seen in the literature. The naming scheme presented here is adapted from a recent review in Field's Virology and will be adhered to throughout this document [22]. The V1 and V2 genes of the mastreviruses and monopartite begomoviruses, the BC1 and BV1 genes of the bipartite begomoviruses, and the V1 and V3 genes of the curtoviruses mediate cell-to-cell and systemic spread of the virus within an infected host, and are therefore referred to as movement genes. The coat protein (CP) gene is designated V1 in all the geminiviruses under this scheme. CP has been shown to affect accumulation of plus-strand DNA forms, probably due to its role in sequestering newly synthesized ssDNA and preventing its conversion into dsDNA [47]. The curtovirus V2 gene is also important for ssDNA accumulation. Additionally, the movement gene BC1 is important for symptom development in the begomoviruses [48]. The curtovirus C4 gene is also involved in the process of symptom development but its biochemical functions are not well understood [49].

Transcriptional regulation in the geminiviruses is rather complex [43]. In the mastreviruses, the Rep gene acts as a transactivator of coat protein gene expression. While some evidence exists that there may be a domain of curtovirus Rep capable of transactivation in vitro, the importance of this domain in vivo has yet to be demonstrated. In the begomoviruses, transactivation of coat protein gene expression requires the AL2 gene, which encodes TrAP (Transcriptional Activator Protein). TrAP resembles classic acidic domain transcriptional activators such as VP16 and likely activates transcription in a similar manner [50]. The curtovirus homologue of TrAP, L2, does not appear to be involved in viral gene transactivation. Interestingly though, TrAP and L2 appear to share a common function in circumventing host stress responses [51]. In addition to activating genes, some viral proteins negatively regulate gene expression. Begomovirus Rep protein binds its own promoter between the TATA box and the transcriptional start site and shuts off its own expression [43, 52]. The AC4/C4 gene is also involved in down regulation of viral gene expression, although the exact mechanism by which this is done remains unclear [53-55].

Geminivirus replication is carried out mainly by host replication machinery. These viruses do not encode a polymerase of their own and are therefore dependent on cellular enzymes for all DNA synthesis. Only the Rep protein, the product of the C1:C2 gene in the mastreviruses and of the AC1 or C1 gene in the begomo- and curtoviruses respectively, is required to initiate replication in a suitable host from a circular DNA containing its cognate plus-strand origin. Geminivirus Rep is a multifunctional replication initiator protein and will be reviewed in more detail in section 1.5.2. In the begomo- and curtoviruses, a second viral protein enhances the accumulation of viral DNA forms through a largely unknown mechanism [56]. This protein is the product of the AC3/C3 gene and has been given the name REn, alluding to its replication enhancer phenotype. The mastreviruses do not encode a REn homologue and it is uncertain whether the truncated form of mastrevirus Rep, RepA, shares a common functional role with REn protein during viral replication. Rep is thus the only viral protein strictly necessary for replication of viral DNA.

[Table 1.2 layout: a Function column followed by Gene and Protein columns for Mastrevirus (monopartite; WDV, MSV), Curtovirus (monopartite; BCTV), monopartite Begomovirus (TYLCV) and bipartite Begomovirus. Function rows: Replication; Transcription (Activation of Late Genes; Repression of Rep); Systemic Movement; Encapsidation and Insect Transmission; Symptom Development; Host Activation; Regulation of Viral ssDNA Accumulation; Suppression of Host Defenses.]

Table 1.2. Geminivirus Genes and Their Functions: Geminiviruses are thought to encode between 5 and 7 genes in their relatively small genomes. Some proteins have been given a descriptive name reflecting their known functions, e.g. the C1 genes of all geminiviruses are involved in replication and are appropriately named Rep. Proteins with unknown or poorly characterized functions are simply referred to by their gene designation, e.g. begomovirus AC4 and curtovirus C2.

1.3. Rolling Circle Replication:

1.3.1. General Features:

Geminiviruses replicate their genomes through a rolling circle replication (RCR) mechanism similar to that employed by several prokaryotic plasmids like pUB110 or pTC181, and coliphages such as φX174 and M13 [57-59]. A necessary first step in RCR of ssDNA viruses is the conversion of the single-stranded viral genome into a double-stranded, replicative form (RF) intermediate by host replication machinery (ssDNA→dsDNA). Subsequent to second strand synthesis, RF may be used as template for either transcription or production of more RF (dsDNA→dsDNA). Later in the infection cycle, (+) strand copies of the genome are produced from RF (dsDNA→ssDNA). The strand produced during dsDNA→ssDNA is designated the plus (+) strand and the template strand is the (-) strand.

In a typical RCR system (Figure 1.3.), a viral protein possessing endonuclease activity produces a “nick” at a specific location within the (+) strand origin of replication in the RF DNA. The free 3' OH generated by the nick serves as a primer for a DNA-dependent DNA polymerase to extend the nicked strand around the genome and through the origin again. The newly synthesized strand displaces the original parental (+) strand. Once the parental strand has been copied, it may be cleaved at the replicated origin to generate linear monomers or be left uncleaved, producing several directly repeated multimers of the genome [60]. Newly synthesized genomes may remain linear or the linear ends of monomer DNA forms may be ligated to each other, generating circular, (+) strand, ssDNA genomes that may be subsequently converted to dsDNA or be encapsidated into virion particles. In addition to producing genomes for encapsidation and spread, RCR also serves as an efficient tool for amplifying DNA templates for transcription. Early in the infection cycle, circular monomers produced via RCR are primarily converted into dsDNA (RF), which is required for transcription of genes encoded in the (-) strand. Later in the infection cycle replicated monomers are sequestered for encapsidation into virions or for systemic spread through the host.
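The nick, extend, displace and cleave cycle described above can be illustrated with a toy string model. This is a minimal sketch under strong simplifying assumptions: the (+) strand is represented as a plain string, the origin is a single index, and only the sequence content of the released strands is tracked, not which strand is parental or newly synthesized. The function names and the example sequence are hypothetical.

```python
from typing import List


def rolling_circle(plus_strand: str, nick_index: int, rounds: int) -> List[str]:
    """Toy model of rolling circle replication of a circular (+) strand.

    plus_strand : circular genome written as a linear string
    nick_index  : position of the origin nick within that string
    rounds      : number of passes of the polymerase around the template
    Returns the unit-length linear monomers released by origin cleavage.
    """
    if not plus_strand:
        raise ValueError("genome must not be empty")
    if not 0 <= nick_index < len(plus_strand):
        raise ValueError("nick_index must lie within the genome")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")

    # Nicking linearises the circle: the base just 3' of the nick comes first.
    unit = plus_strand[nick_index:] + plus_strand[:nick_index]

    # Continuous extension from the 3' OH displaces a directly repeated multimer.
    concatemer = unit * rounds

    # Cleavage at each regenerated origin releases unit-length monomers.
    size = len(unit)
    return [concatemer[i:i + size] for i in range(0, len(concatemer), size)]


def is_same_circle(linear: str, circular: str) -> bool:
    """True if ligating the ends of `linear` would recreate `circular`."""
    return len(linear) == len(circular) and linear in circular + circular
```

Calling `rolling_circle("ACGTTGCA", nick_index=3, rounds=3)` returns three monomers, each a circular permutation of the input that `is_same_circle` recognises as the original genome once its ends are joined. Skipping the cleavage step would instead leave the multimeric `concatemer`, corresponding to the uncleaved, directly repeated product mentioned in the text.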

1.4. Model Replication Systems:

1.4.1. Replication of φX174: The Classic Model of Rolling Circle Replication:

The prototypical φX174 rolling circle replication (RCR) system is depicted in Figure 1.3 and reviewed in [61]. Like geminiviruses, the phage φX174 genome consists of a single-stranded circular DNA known as the plus (+) strand. The complementary (-) strand is synthesized by host enzymes, generating a dsDNA RF intermediate. RF is nicked by the phage-encoded gene A protein, designated protein A. Protein A introduces a site-specific nick in the phage (+) strand that defines the (+) strand replication origin. Protein A remains bound to the 5' side of the nick site via a phosphotyrosine linkage, while the 3'-OH is extended by DNA polymerase. The nascent DNA chain is extended around the (-) strand template until it passes through the origin. Protein A, still bound to the 5' end of the parental (+) strand, introduces a nick into the newly synthesized origin. A single molecule of protein A has two active site tyrosine residues capable of forming phosphotyrosine linkages with nicked φX174 (+) strand origin DNA; thus one molecule is hypothesized to be capable of binding two cleaved (+) strand origins simultaneously [62]. The nicking event is followed by ligation of the parental 5' and 3' ends, releasing circular (+) strand progeny genomes. The ligation reaction is also performed by protein A and occurs concurrently with the nicking reaction. φX174 protein A is a typical example of a class of RCR initiation proteins, mechanistically similar to topoisomerases, which are generally referred to as nicking-closing enzymes. Newly synthesized (+) strands may either be converted into dsDNA intermediates for use as replication and transcription templates or, alternatively, be packaged into new phage particles during phage morphogenesis.

[Figure 1.3 diagram labels: ssDNA → dsDNA; A protein dimer and Mg++; A protein recruits rep helicase, SSB and DNA polymerase; ATP, dNTPs; A protein, rep helicase; ssDNA.]

Figure 1.3. Rolling Circle Diagram of φX174 Replication: Diagrammatic representation of bacteriophage φX174 replication. The single (+) strand DNA enters the host cell and is converted to dsDNA (RFII) by host enzymes. A covalently closed supercoiled replicative intermediate (RFI) is important for the processes of replication initiation and transcription. φX Protein A nicks the (+) strand origin and recruits to the origin cellular replication machinery such as DNA dependent DNA Polymerase III, rep helicase, and SSB. RCR initiates from the 3' end of the nicked origin and proceeds around the minus strand template, passing through the origin again where the newly synthesized DNA is cleaved and ligated, generating a nascent (+) strand. The new (+) strand may either be converted into RFII or be packaged into a virion particle. Figure 1.3 was adapted from "DNA replication of single-stranded Escherichia coli DNA phages" by P.D. Baas [61].

Figure 1.3. Rolling Circle Diagram of φX174 Replication

Another aspect of DNA replication that φX174 and geminiviruses have in common is virus-specific recognition of their origins of replication by their respective replication initiator proteins. Protein A is said to be cis-acting in vivo [22,60,63], a property not conserved in a cell-free in vitro system. In vivo, Protein A only nicks origin DNA sequences of the same phage genome from which it was produced. The factors which make Protein A cis-acting in vivo have not been identified but likely involve an unknown host factor, since the phenomenon is not observed in similar in vitro origin nicking experiments. It has been hypothesized that the cis-acting DNA elements may lie in the DNA sequences immediately adjacent to the minimal Protein A nick-site, as determined in vitro. These adjacent DNA sequences appear to influence nicking efficiency in vivo. It remains to be discovered whether Protein A interacts with these sequences directly in vivo or, as mentioned above, another unknown protein is involved.

The great similarity between geminivirus replication and that of some prokaryotic phages and plasmids has led to the hypothesis that geminiviruses evolved from an ancestral prokaryotic plasmid [64]. This similarity between systems has produced much speculation about how geminivirus replication proteins function. The hypothesis that geminivirus Rep proteins function in a manner similar to their bacteriophage counterparts gained further support with the demonstration that Rep dependent replication of a geminivirus replicon can occur in bacterial cells. The fact that geminivirus replication origins, native to eukaryotic hosts, function in a prokaryotic host argues strongly that bacteriophages such as φX174 and geminiviruses are indeed close relatives [63].

1.4.2. Replication of Adeno Associated Virus (AAV); A Related Rolling Hairpin Replication Model System:

Further similarities to geminivirus replication are exhibited by the related mammalian parvovirus subgroup, the dependoviruses. Due to their relative simplicity and potential utility as gene therapy vectors, the dependoviruses have been extensively studied and a wealth of knowledge concerning proteins, both host and viral, involved in their replication has been generated. Geminiviruses are known to share many dependovirus replication characteristics and it is likely that this homology extends to areas of geminivirus replication that await investigation.

The family Parvoviridae is split into two subfamilies, the Densovirinae and the Parvovirinae. The densoviruses infect insects while members of the Parvovirinae subfamily infect vertebrates. The Parvovirinae is further divided into three genera, Parvovirus, Erythrovirus and Dependovirus. AAV belongs to the Dependovirus genus. Dependoviruses derive their name from the fact that they require co-infection with an unrelated helper virus, either adenovirus (Ad) or a herpesvirus (HSV), for productive infection in cell culture. Because they are frequently isolated from hosts co-infected by an adenovirus, dependoviruses are often referred to as adeno associated viruses, of which there are currently at least 11 members. For the following general discussion of parvovirus DNA replication, the term AAV refers to the dependovirus Adeno Associated Virus Type 2 (AAV2) except where explicitly stated otherwise. AAV2 has been chosen as a model parvovirus because it is one of the best studied members of the Parvoviridae and shares many biological features with the geminiviruses [22].

The parvoviruses are among the smallest known mammalian DNA viruses. Parvoviruses typically possess genomes around 5 kb in size that encode seven proteins: four nonstructural replication proteins (Rep78, 68, 52, and 40) and three structural proteins (VP1, 2 and 3). The parvovirus eukaryotic replication environment closely resembles the in planta situation faced by geminiviruses. Therefore, cellular factors and pathogen-host interactions important in the model parvovirus replication system are likely more applicable to geminivirus replication than are those of bacteriophages such as φX174.

AAV replicates via a mechanism similar to RCR, often termed a rolling hairpin replication (RHR) mechanism [66]. The general mechanism of AAV RHR and important cis-acting elements are illustrated in Figure 1.4. Although the infectious ssDNA molecule is linear as opposed to circular, AAV and geminivirus replication mechanisms are quite similar. The linear AAV genome is converted into a duplex DNA molecule by host replication machinery. A multimeric complex of AAV encoded replication initiation proteins estimated to contain from two to eight Rep 78 and/or 68 molecules, accompanied by host accessory factors, then binds to and introduces a site-specific nick in a single strand of a duplex region of the genome, defining the origin of replication. The nick-site, also called the terminal resolution site (TRS), lies inside a roughly 125 bp inverted terminal repeat (TR) capable of forming hairpin structures located at the termini of the AAV genome (Figure 1.4.b). The 3' OH at the nick-site serves as a primer for DNA polymerase to carry out displacement synthesis all the way through the end of the template molecule [67]. The TRS is located in the stem region of the TR hairpin. Once the ends of the genome have been replicated by a process often called terminal resolution, they can form two new hairpins. One hairpin ends with a free 3' OH and can serve as a primer for DNA polymerase to elongate that strand through the opposite end of the genome. The other strand is converted into duplex DNA by initiation of DNA polymerase at the other end of the genome from a hairpin annealed to the complementary strand with a free 3' OH. Thus the process can continue amplifying the AAV genome until the ssDNA molecules are sequestered for encapsidation during viral morphogenesis.

Similar to the geminiviruses, AAV is almost entirely dependent on host replication machinery to complete its replication cycle. A great deal is known about protein complexes and viral protein modifications necessary for AAV replication. Owing to the great similarities between dependoviruses and the geminiviruses, many biological processes are likely conserved between these two systems.

[Figure 1.4 diagram labels, panel A: Rep 68/78 nicking at the trs; DNA polymerase, RFC, PCNA, RPA; re-initiation from the TR hairpin; Rep 68/78 helicase; reinitiation without terminal resolution. Panel B: RBE', RBE and trs; tripartite Rep 68/78 binding and nicking site.]

Figure 1.4. AAV Rolling Hairpin Replication: A. Panel A is a flowchart depicting the AAV replication cycle, beginning at the top with an infectious ssDNA particle. Note the position of the terminal resolution site (trs), which is the site of Rep 68/78 nicking. B. Close-up of the AAV terminal repeat (TR). The functional origin in AAV has three required elements: the nick-site (trs), the Rep 68/78 binding element in the duplex DNA (RBE), and the auxiliary Rep binding element in the TR hairpin (RBE'). All three elements are required for proper Rep 68/78 function. Figure 1.4 was adapted from Field's Virology [22].

Figure 1.4. AAV Rolling Hairpin Replication

1.4.2.1. AAV Replication Initiator Proteins:

AAV encodes four multifunctional, overlapping Rep genes transcribed from two separate promoters. Rep52 and Rep40, transcribed from the p19 promoter, are involved in sequestering and accumulating ssDNA for encapsidation and viral morphogenesis. Two closely related replication initiator proteins called Rep 68 and Rep 78 (owing to their respective molecular weights) are produced from the p5 promoter of AAV, with the former being a spliced variant of the latter. The first 529 amino acids of Rep 68 and 78 are identical, after which a 7 amino acid tail in the spliced Rep 68 variant replaces the last 92 amino acids of Rep 78 [68]. The exact role of these slightly different proteins in AAV replication remains unclear, although it is known that both are able to support replication from AAV containing replicons in vitro. Rep 68 and 78 share extensive functional homologies both with each other and with geminivirus Rep proteins. Furthermore, it is interesting to note that the mastrevirus Rep transcript is also alternatively spliced, suggesting that a similar, partially conserved initiator protein RNA processing scheme may be employed by these related viruses. The Rep 78 / 68 proteins play important roles in the processes of AAV DNA replication, viral genome integration into host chromatin, trans-regulation of viral gene expression and trans-regulation of various heterologous genes, both host and viral [69].
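The relationship between the two initiator proteins can be made concrete by working through the residue counts given above. The short sketch below uses only the three numbers stated in the text (529 shared residues, a 92 residue Rep 78 tail and a 7 residue Rep 68 tail); the constant and function names are illustrative.

```python
# Residue counts taken directly from the text.
SHARED_N_TERMINAL_RESIDUES = 529
REP78_UNIQUE_TAIL_RESIDUES = 92
REP68_UNIQUE_TAIL_RESIDUES = 7


def rep_length(unique_tail_residues: int) -> int:
    """Total length of a Rep variant: shared region plus its own C-terminal tail."""
    return SHARED_N_TERMINAL_RESIDUES + unique_tail_residues


def shared_fraction(unique_tail_residues: int) -> float:
    """Fraction of a Rep variant's residues that lie in the shared region."""
    return SHARED_N_TERMINAL_RESIDUES / rep_length(unique_tail_residues)
```

These figures imply lengths of 621 residues for Rep 78 and 536 residues for Rep 68, so roughly 85% of Rep 78 and nearly 99% of Rep 68 lie within the shared region. This is consistent with the extensive functional overlap noted between the two proteins.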

Consistent with the multifunctional role of viral replication initiator proteins from other systems, AAV Rep proteins exhibit a diverse array of biochemical functions [69]. AAV Rep proteins are site-specific dsDNA binding proteins capable of simultaneously recognizing and binding to a tripartite element consisting of three separate sequences within the origin of replication (Figure 1.4.). One component of the tripartite AAV origin consists of a repeated four bp element in the AAV terminal repeat (TR) region, termed the Rep binding element or RBE. Binding to the RBE is stimulated by adenosine triphosphate (ATP) and is required for efficient nicking in vivo and in vitro. Another region of the AAV origin contacted by Rep is the TRS. The TRS consists of a seven bp element within which Rep introduces a strand-specific nick that serves as a primer for DNA polymerase to initiate displacement DNA synthesis. A third site at which Rep 68/78 complexes contact the origin is relatively distant from the TRS and consists of a conserved pentanucleotide sequence, termed RBE'. AAV Rep complexes depend on the correct secondary structure being present, partially due to the apparent need for the distant RBE' motif to be brought into close proximity to the TRS and RBE motifs. AAV Rep proteins fail to efficiently nick AAV TRS-containing DNA lacking the RBE and RBE' elements in in vitro assays, implying that binding to the RBE and RBE' elements is necessary for efficient endonuclease activity. In support of this, Rep 68/78 proteins nick AAV TRS-containing plasmids with high efficiency, but do so only when the RBE and RBE' elements are present in the correct orientation and position relative to the TRS. Thus, high-efficiency in vitro nicking depends on maintaining the correct spacing and orientation between the TRS and the RBE and RBE' elements. In summary, AAV Rep 68/78 are the origin binding proteins (OBPs) of AAV, and they recognize a tripartite origin in a highly coordinated manner [70].

Rep 68/78 proteins possess site- and strand-specific endonuclease activity, can bind and/or hydrolyze ATP, and possess an ATP-dependent DNA or RNA helicase activity. AAV Rep ATPase activity is stimulated by the presence of ssDNA, and nucleoside triphosphate has been demonstrated to stimulate Rep DNA origin binding. Therefore, AAV Rep 68/78 helicase function is likely to be critical for origin opening and terminal resolution. Indeed, a K→H substitution mutation within the ATPase motif of the helicase domain of AAV Rep produces a mutant that is dominant negative for helicase and endonuclease activities. The mutant Rep 68/78 protein is ATPase negative and no longer exhibits stimulation of origin binding by nucleoside triphosphates [71, 72]. It remains unclear whether AAV Rep 68/78 proteins serve as the main viral replicative helicases in vivo or whether another helicase is important during the more processive, chain elongation phases of AAV replication. However, it is known that Rep 68/78 protein is able to support AAV replication in vitro when supplemented with cell-free extracts that do not possess significant DNA helicase activity [71, 72]. Therefore, AAV Rep helicase activity may be sufficient for viral replication in vitro.

Rep 68/78 are also known to bind components of the cellular replication and transcription machinery, e.g. the Sp1 and PC4 transcription factors, cellular TBP (TATA binding protein) and RPA (replication protein A) (Table 1.3.) [73-76]. Additionally, Rep 78 has been shown to form homo-multimers in vitro, mediated by sequences in the carboxy half of the protein [69]. Unfortunately, very little information is presently available concerning the nature of AAV Rep complex quaternary structure. One of the most attractive features of the adeno-associated virus system as a model for geminivirus replication is the existence of an in vitro replication system. In vitro replication assays have allowed several aspects of AAV replication to be dissected at the molecular level.


Factor            Description                             Function
----------------  --------------------------------------  ----------------------------------------
Sp1               Transcription factor                    Origin opening / recruitment
PC4               Transcription factor                    Origin opening / recruitment
TBP               TATA Binding Protein                    Origin opening / recruitment
RPA               Replication Protein A                   ssDNA binding
RFC               Replication Factor C                    Processivity clamp loader
PCNA              Proliferating Cell Nuclear Antigen      Processivity clamp
Pol δ or ε        DNA-dependent DNA Polymerase δ or ε     DNA synthesis
PIF               Parvovirus Initiation Factor;           Interacts with Rep and binds origin DNA
                  transcription factor
HMG-1             High Mobility Group protein-1           Interacts with Rep and binds cruciform DNA
ssD-BP (FKBP52)   Single-Stranded D-Sequence Binding      Origin binding and regulation of second
                  Protein / 52 kD FK506 Binding Protein   strand DNA synthesis

Table 1.3. Cellular Factors Associated with Parvovirus Replication: Table 1.3. lists several host factors currently known to be associated with AAV replication either in vivo or in vitro. The proteins include basal transcription and replication factors such as TBP, RFC, PCNA, RPA and DNA polymerase. PIF, HMG-1, and TBP have all been demonstrated to interact directly with AAV Rep 68/78 proteins. Given the similarities between geminivirus and AAV replication, many of these factors or similar plant homologues may be involved in geminivirus replication.



1.4.2.2. Factors Associated With AAV Rep 78/68:

Careful mutation of key viral proteins and supplementation of in vitro assays with cellular extracts and purified host factors have refined our understanding of the nature of AAV replication. Much of this knowledge base is likely to be directly applicable to geminivirus replication. A list of the cellular factors known to be involved in AAV replication is presented in Table 1.3. In vitro replication assays have been utilized to identify cellular and helper virus factors required for AAV replicons to function. Several helper genes from adenovirus (Ad) and herpes simplex virus (HSV) have been identified as necessary for efficient in vivo AAV replication. These include the adenovirus E1A, E1B, E4, E2A, and VA genes. While all these genes participate in helper virus activity, they are not believed to play a direct role in AAV replication. Rather, helper viruses are thought to provide a favorable cellular environment for AAV replication by modifying host gene expression. However, some evidence does implicate Ad DNA binding protein (DBP), an RPA homologue, as being capable of stimulating AAV replication in vitro. The level of stimulation achieved with addition of Ad DBP is still well below that seen when an Ad-infected cellular extract is used to supplement the reaction; thus, Ad infection probably stimulates AAV replication through additional, unknown mechanisms. The observation that external stimuli such as heat shock, UV-irradiation, treatment with hydroxyurea and carcinogen treatment push host cells into a state that is semi-permissive for AAV replication supports the hypothesis that helper viruses contribute primarily by modifying the cellular environment [67].

Interactions between viral replication initiator proteins and cellular factors important for replication provide a means for targeting the necessary cellular factors to the viral origin. Cellular proteins that are known to be directly involved in AAV replication in vitro, in the absence of helper virus, include RPA, RFC, PCNA, and cellular DNA polymerases δ and ε (from crude HeLa cell extract) [67]. It is interesting to note that an in vitro system composed solely of the aforementioned purified proteins, added together with appropriate AAV DNA template and AAV Rep 68/78, does not support efficient AAV replication. Therefore, additional unidentified factors present in replication-competent crude HeLa cell extract are probably involved in AAV replication. Crude HeLa cell extract from uninfected cells (i.e. with no helper virus infection) grown under conditions known to make them semi-permissive for AAV replication, combined with the previously mentioned purified proteins, contains all the factors necessary to support efficient AAV replication in vitro.

The identity of the stimulatory host cofactor or cofactors present in the crude HeLa cell extracts remains to be determined [67]. Some evidence points to cellular DNA binding proteins such as TATA binding protein (TBP) as being important for AAV replication, while other work has indicated that the cellular single-stranded DNA binding protein RPA is necessary for processive AAV replication. Interestingly, while human RPA does support in vitro AAV replication, as mentioned earlier in this section, it appears that Ad DBP, when present, is preferred over cellular RPA [75]. Substitution of Ad DBP for human RPA in an in vitro experiment using purified proteins stimulated AAV replication, indicating that this may be at least part of the mechanism by which helper virus infection stimulates replication. Although Ad DBP was clearly demonstrated to stimulate AAV replication in vitro with purified proteins, the extent of the stimulation was still far below that seen when crude, semi-permissive HeLa cell extract was added to the reaction, indicating that other factors present in crude HeLa cell extract are involved in AAV replication [73].

Christensen et al. identified a small cellular factor, termed parvovirus initiation factor (PIF), required for efficient nicking and initiation of replication from the 3' origin of the related autonomous parvovirus minute virus of mice (MVM) [77]. PIF belongs to a growing family of KDWK motif-containing transcription factors, of which the first to be identified was the Drosophila DEAF-1 homeobox domain protein. PIF as it functions in vivo seems to be a heterodimer of subunits whose different molecular weights are reflected in their names, p96 and p79. PIF interacts with NS1, the MVM homologue of Rep 68/78, and is hypothesized to facilitate the Rep protein complex formation that is required for efficient origin DNA binding. PIF also binds MVM origin DNA but does so only in complex with NS1 (the Rep 78/68 homologue). PIF may be a functional model for how geminivirus AL3 (REn) protein stimulates geminivirus replication. Evidence consistent with this hypothesis is discussed in section 1.5.5.

As mentioned earlier, cells in culture can be stimulated by various methods to render them semi-permissive for AAV replication. One of these methods is transfection of the cultured cells with an oncogene, indicating that cell cycle manipulation is a critical event in the AAV replication cycle. Cell cycle regulation appears to be another function provided by helper viruses during AAV replication. Helper viruses, either HSV or Ad, stimulate terminally differentiated cells to re-enter S-phase upon infection and thus supply AAV with an environment that supports replication [22].

1.4.2.3. Interactions of AAV Replication Initiation Proteins:

Since the only viral protein required for AAV DNA replication is either Rep 68 or Rep 78, the origin recognition proteins, it stands to reason that these proteins may form complexes, in addition to those already discussed, with one or more cellular host factors known to be involved in parvovirus replication. Related replication initiation proteins of phage, bacterial, viral and eukaryotic origins typically nucleate multi-protein initiation complexes situated at their origins of replication. The nature of these origin recognition complexes varies between the systems mentioned, but some common themes exist. Nearly all the DNA viruses studied to date encode proteins that recognize their cognate origins of replication by binding to a repeated DNA element in a sequence-specific manner [22]. These origin recognition proteins are usually involved in transcriptional regulation as well as origin recognition, opening/unwinding of origin DNA, and recruitment of viral and host replication proteins to the origin. Perhaps the best-studied initiator protein belonging to this category is the polyomavirus Large T-antigen, which will be discussed later in section 1.4.3. Multifunctional replication initiator proteins such as Large T-antigen, Ad E1A, geminivirus Rep and AAV Rep all form multiple protein-protein associations, both as homo- and hetero-oligomers [22]. Presumably these various complexes are necessary for the Rep proteins to carry out their different activities. Rep 68/78 is a typical example of such versatile origin recognition proteins. As mentioned earlier, several proteins have been identified that are capable of interacting with AAV Rep proteins. Parvovirus Rep 68/78 interact with cellular TATA binding protein (TBP), a core member of many transcription factor complexes [75]. AAV Rep has also been demonstrated to bind to and act through the transcription factors Sp1 and PC4 [74].
Parvovirus Rep 68/78 proteins have also been demonstrated to interact with the high mobility group protein 1 (HMG-1), which is known to bind cruciform DNA. HMG-1 binds both Rep 68/78 and AAV hairpins. HMG-1 has stimulatory effects on Rep nicking of TR DNA, Rep ATPase function and down regulation of the p5 promoter. Both AAV Rep proteins are also known to interact with a factor (ssD-BP) that binds single-stranded sequences within the AAV TR called D-sequences. The ssD-BP is also known as the 52 kD FK506 binding protein, FKBP52. This protein binds AAV DNA in either of two phosphorylation states. When phosphorylated, it blocks AAV second-strand DNA synthesis. Conversely, the non-phosphorylated form allows DNA replication to proceed normally [22]. Furthermore, Hermonat et al. showed that Rep 78 forms complexes with multiple, as yet uncharacterized, cellular factors and that the amino half of Rep 78 is involved in most of these interactions [78]. Additionally, AAV Rep 68/78 proteins have been demonstrated to form homo-oligomers, the formation of which is mediated through sequences in the carboxy half of the protein [69]. As with geminivirus Rep, detailed biochemical analyses of sequences involved in multimerization are currently underway. For a more complete look at what kinds of complexes DNA virus replication initiation proteins may form, we can look to the well-characterized SV40 system.

1.4.3. SV40 Large T-Antigen:

SV40 (simian virus 40) is a well-studied mammalian dsDNA polyomavirus that, like geminiviruses, is entirely dependent on a single multifunctional viral replication protein, Large T antigen, for initiation of replication of its ~5.2 kb circular genome [22, 79]. Although SV40 replicates via a theta-mode, bi-directional mechanism, as opposed to RCR, Large T antigen performs many of the same biological functions as geminivirus Rep [80]. Both Large T antigen and Rep are site-specific dsDNA binding proteins, possess ATPase function, bind Rb family proteins, stimulate host gene expression, repress viral early gene expression, form homo- and hetero-oligomers, and influence the cell cycle [79-81].

SV40 Large T antigen forms oligomers whose nature in part determines the function of the protein at various stages of the replicative process [22]. Large T antigen dimers bind to a repeated dsDNA pentanucleotide, the GAGGC element, located in a part of the SV40 genomic control region commonly referred to as binding site I [82]. Large T antigen binding to site I, located within the early gene transcript promoter, serves primarily to abolish expression of Large T antigen itself [82]. However, there are some data that suggest that binding to site I facilitates T antigen complex formation at site II, probably through protein-protein contacts between T antigen molecules [22, 83].

A second strong Large T antigen binding site in the SV40 genomic control region has a composition distinctly different from that of binding site I. Binding site II contains four GAGGC elements, organized in pairs arranged in opposite directions, forming a perfect 27-base-pair palindrome. The primary role of the pentanucleotides in site II is apparently to bind T antigen in the proper location and orientation for its function in DNA replication [83, 84]. Initial low-affinity binding to site II is ATP-independent and may involve assembly of a transient tetrameric T antigen complex [83]. The tetrameric T antigen complex most likely represents an unstable intermediate that exists only briefly while T antigen assembles into a more stable hexameric form. In the presence of physiological levels of ATP, T antigen bound to site II undergoes a conformational shift, which allows it to make specific protein-protein contacts and assemble into a two-lobed form with each lobe consisting of a hexamer of T antigen molecules (double hexamer form) [83].
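The arrangement of GAGGC pentanucleotides in opposite orientations can be illustrated by scanning both strands of a sequence for the motif. The following is a minimal Python sketch under stated assumptions: the function names are illustrative, and the input is a hypothetical toy sequence constructed for this example, not the actual SV40 site II sequence.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


def find_motif_both_strands(seq: str, motif: str = "GAGGC"):
    """List (position, strand) for each motif occurrence on either strand.

    Positions are 0-based offsets on the top strand. A '-' marks a site
    whose motif reads 5'->3' on the bottom strand (opposite orientation).
    """
    rc_motif = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc_motif:
            hits.append((i, "-"))
    return hits


# HYPOTHETICAL toy sequence, NOT the real SV40 site II sequence.
toy_site = "GAGGCAGAGGCTTGCCTCAGCCTC"
hits = find_motif_both_strands(toy_site)
print(hits)
```

For this toy input the scan reports two GAGGC elements on the top strand followed by two on the bottom strand, that is, two pairs in opposite orientations. Applying it to a real origin would require the published sequence.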

Although T antigen molecules assemble as double hexamers early in replication, single hexamers are the catalytic helicase unit [86]. The exact mechanism of Large T antigen helicase function remains unclear. Individual hexamers within each double hexamer formed during origin opening might remain associated with each other throughout replication, or they may split apart and translocate in opposite directions along the replication template [86]. Whether individual hexamers remain associated with each other or separate, initial hexamer formation does not require ATP hydrolysis. In support of this, it has been demonstrated that many non-hydrolyzable ATP analogs can substitute for ATP during in vitro hexamer formation [87]. Conversely, it appears that ATP hydrolysis is required to drive translocation of T antigen hexamers along the DNA duplex, unwinding the DNA ahead of the advancing replication fork [87]. T antigen hexamers effectively unwind or open the SV40 origin and allow for the formation of a pre-initiation complex containing DNA polymerase α/primase. In vitro studies have demonstrated a strong 3'→5' helicase function for Large T antigen, which is dependent on ATP hydrolysis [87]. T antigen hexamers remain associated with replication forks throughout replication and probably serve as the main replicative helicase during SV40 DNA replication [87].

The process governing the T antigen conformational switches among dimeric, tetrameric, hexameric, and double hexameric forms appears to be controlled by ATP binding and protein phosphorylation [87-89]. ATP binding to the central region of T antigen has been postulated to induce conformational changes in T antigen that favor hexamer formation [83]. Consistent with this idea, T antigen's ATP binding site is located at the interface between T antigen hexamer protomers. Interestingly, T antigen function also appears to be regulated by phosphorylation. Three residues, two serines (S120, S123) and one threonine (T124), are particularly important for regulating in vitro replication function. Phosphorylation of serine residues 120 and 123 inhibits Large T antigen function in vitro, while phosphorylation of T124 is absolutely required to activate T antigen function during SV40 DNA synthesis both in vitro and in vivo [90-94]. Phosphorylation appears to regulate T antigen function at the level of protein-protein interaction. Phosphorylation of T124 stimulates association of single T antigen hexamers into the double hexamer required for processive bi-directional DNA unwinding. Conversely, phosphorylation of S120 and S123 destabilizes the hexamer-hexamer complex. CDK2-cyclin A kinase has been implicated in phosphorylation of T124, while protein phosphatase 2A (PP2A), or an isoform thereof, appears to be responsible for dephosphorylating S120 and S123. Casein kinase I or one of its isoforms is most likely responsible for phosphorylating the serine residues in vivo [22, 87]. The in vivo regulation of SV40 is more complex than the relatively straightforward in vitro scenario. There is contradictory evidence that phosphorylation of the inhibitory serine residues, characterized during in vitro studies, is actually required in vivo [92, 95]. One hypothesis to explain this apparent contradiction proposes that the serine residues are part of an unknown in vivo regulatory system [92, 95]. Such a model seems plausible given the observed rapid turnover of the serine phosphorylation on residues 120 and 123. Whatever the precise mechanism, there seems to be a consensus in the literature that phosphorylation of T antigen is likely to play a crucial role in tethering SV40 DNA replication to S-phase of the host cell cycle in vivo. Furthermore, such regulation of T antigen function involves

