
CHAPTER I

OVERALL ISSUES OF VIRUS AND HOST EVOLUTION

This book seeks to present the evolution of viruses from the perspective of the evolution of their host. Since viruses essentially infect all life forms, the book will broadly cover all life. Such an organization of the virus literature will thus differ considerably from the usual pattern of presenting viruses according to either the virus type or the type of host disease they are associated with. In so doing, it presents the broad patterns of the evolution of life and evaluates the role of viruses in host evolution as well as the role of host in virus evolution. This book also seeks to broadly consider and present the role of persistent viruses in evolution. Although we have come to realize that viral persistence is indeed a common relationship between virus and host, it is usually considered as a variation of a host [...] pattern and not the basis from which to organize our thinking on virus/host evolution.

Most students of [...] or [...] will be familiar with the virus families organized according to replication strategy or the disease they cause. Such classical textbook organization will generally include a section, often at the end, in which some issues or observations concerning the evolution of a particular virus are presented. However, this presentation pattern is inevitably narrow and fails to address broader issues or integrate our thinking about virus/host evolution. For the students of evolutionary biology, the importance of viruses to the evolution of life will be a new topic. As discussed below, evolutionary biology has generally failed to consider the contribution that viruses have made to the evolution of life. Some of the reasons are historic, but mostly this is due to the view that viruses do not represent living entities and thus cannot be significant components of or contributors to the tree of life. Yet viruses do have the characteristics of life: they can be killed, can become extinct and adhere to the rules of evolutionary [...] and Darwinian selection. In addition, viruses have enormous impact on the evolution of their host. Viruses are ancient life forms, their numbers are vast and their role in the fabric of life is fundamental and unending. They represent the leading edge of evolution of all living entities and they must no longer be left out of the tree of life.

Definitions. The concept of a virus has old origins, yet our modern understanding or definition of a virus is relatively recent and directly associated with our unraveling the [...] of [...] and nucleic acids in biological systems. As it will be important to avoid the perpetuation of some of the vague and sometimes inaccurate views of viruses, below we present some definitions that apply to modern virology.

Historical writings about viruses can be traced back several thousand years. However, all historical descriptions are in reference to specific [...] and/or recognizable diseases caused by viruses. The very name virus stems from the concept of poison or illness, which appears to move through the air. An early description of the plague of Athens was written by Thucydides. This plague was likely due to a viral infection. In it he carefully describes the epidemic that hit Athens in 430 BC. Although a clear progression of symptoms was presented, beginning with respiratory disease, rashes, gastrointestinal symptoms and central nervous system symptoms, we cannot now be certain about which virus might have been responsible, and to this day it remains a point of contention. My own assessment of the timing and pattern of symptoms indicates that it clearly resembles those seen with distemper (a paramyxovirus) in domestic dogs. However, distemper is not a currently known human disease. Other historical descriptions are sufficiently distinct for modern virologists to be more confident of the virus involved, such as [...] virus.
However, the main theme has remained that viruses are invisible agents of disease and enemies which cause harm to the host. Although viral transmissibility and [...] were described in these early writings, the first proposal that these agents might be invisibly small entities was written by Girolamo Fracastoro in 1546. Although the [...] potential was used to develop a vaccine against smallpox by Edward Jenner in 1798, not until the 1800s did the germ theory of disease finally prevail, following experimental evaluations by J. Henle, Louis Pasteur, Joseph Lister, Robert Koch and others. That viruses were so small that they could be filtered through ceramic filters which would not pass bacteria was determined at the end of the 1800s (1898, Loeffler & Frosch; 1899, Beijerinck). Thus viruses were very small agents of host disease, and this was the only view of them that was available to T. Dobzhansky and others who developed the new synthesis of evolutionary biology with genetic theory and the origin of species in the 1930s. Viruses were also first crystallized around this time (W. Stanley, 1935), suggesting a very chemical-like nature to them and reinforcing the view that they are acellular replicators of disease, belonging outside of the tree of life.

However, also around this time, other lines of research, by Max Delbrück and others, on viruses that destroy bacteria began to unravel the genetic nature of viruses. The modern definition of a virus, as a molecular genetic parasite, awaited the advent of methods of molecular biology in the 1950s. In the 1950s, S.E. Luria first provided a modern virus definition in an essay published in [...]. It was also around this time that it became clear, in molecular terms, that some viruses were also silent and could be genetically maintained for long periods by the host, such as temperate phage. Thus, both the viruses causing disease and the silent viruses colonizing the host genome were defined as molecular genetic parasites. It soon became clear that if a virus could also be a silent host-associated genetic element, then the evolutionary history of virus and host could be highly entwined. Evidence that some viruses could silently persist in their host, but still emerge from the host, had actually been reported early (1909, Rous sarcoma virus; 1915, temperate phage). But without understanding the molecular genetic nature of viruses, these observations had little influence on our understanding of virus or host evolution. It was not until many viral genomes were finally sequenced, beginning in the 1970s, and phylogenetic methods for the analysis of sequence similarity were developed (neighbor joining, parsimony) that inferences concerning the evolutionary history of viruses could be drawn. Finally, with the sequencing of many host genomes in the 1990s, it became clear that all host genomes, from bacteria to human, had been strongly affected by viral colonizing activity. Thus, the needed information has finally been assembled to allow us to evaluate virus and host evolution together and connect these two elements of the tree of life.

Prior to considering other general issues of virus evolution, it is important to define some terms to be used in this book.

Virus. A molecular genetic parasite that uses cellular systems for its own replication. Note that this definition has no reference to the molecular identification of the viral entity. Nor does it specify viral genes or their role in replication or the specific [...]. This is in part to allow the inclusion of both traditional viruses that transmit predominantly via extracellular means, and hence make virions of specific molecular structure, and viruses that transmit through the host genome or other inapparent means, including defective viruses. In this regard, a defective virus is a virus whose replication is conditional upon another virus.

Defective virus. A viral genome or [...] that lacks sufficient instructional elements to code for its own replication and depends on another virus for such functions.

Virus species. The traditional definition of a species for an organism is an interbreeding population that shares gene flow. As viruses have no sexual exchange process, a virus species must be defined by its lineage. A virus species is thus "a polythetic class of viruses that constitute a replicating lineage and occupy a particular ecological niche" (Van Regenmortel, 2000). Thus a virus species is mainly a related lineage. The characteristic of occupying a particular niche, however, is problematic for some viral species, which are known to jump species, adopt an alternative life style and occupy different niches.

Symbiosis. The state of two previously separate living entities living together in one organism. This definition is inclusive of a persistent virus that has colonized a host and does not distinguish between mutually beneficial and parasitic states. All co-habitation relationships, such as viral persistence defined below, are included as symbiotic.

Viral Persistence. The capacity of a virus to be maintained in an individual host organism in which the ability of the virus to be transmitted to other host organisms or offspring of the host is also maintained. Persistence can be maintained regardless of the host [...]. This definition is inclusive of both latent and chronic infections. A latent infection involves periods, sometimes extensive, in which no virus is made in the host. A chronic infection involves a steady level of progeny virus production. This definition is also inclusive of genomic or defective viruses, which can be efficiently transmitted to host offspring or other hosts in the presence of the appropriate helper virus. Persistence is sometimes used in other contexts in virology, for example, the ability of a virus to persist outside of a cell in the environment. Such uses are not part of this definition of viral persistence.

Acute viral infection. A type of virus infection associated with the replication and production of an amplified number of viral progeny, in which the capacity of the virus to continue to replicate is transient and is not maintained in an individual host. Ongoing virus replication is limited either by the death of the host or by the immune response of the host. In terms of [...] based study of acute viruses, the cell destruction wrought by these viruses has been the basis of our quantitative methods of virology: the plaque assay. A plaque is a continuous region of cell death caused by a single virus infection and has been the basis of measuring biologically active virus. This contrasts with persistent viruses, which often fail to lyse cells and can be much more difficult to measure.

Fitness. The characteristics that endow an organism or virus with a capacity or probability of continued life, or the capacity of its offspring to persist and continue life. Fitness is a conditional or relativistic concept. It will depend greatly on the competition that is present at the time of selection, thus the concept of 'fitness [...]' will also have a conditional or relativistic nature to it. As described below, fitness or fitness space can be very difficult to measure experimentally since measurements are usually based on relative rates of [...].

Lateral gene transfer. The movement of genetic information from one lineage of organism to another isolated lineage of organism. For example, the movement of genes from a bacterial genome to a eukaryotic genome.

Viral emergence. The sudden and previously unknown appearance of a viral epidemic in a particular host organism.

The types and classification of virus. Viruses are generally classified according to the type of nucleic acid in their genome (dsDNA, ssDNA, dsRNA, ssRNA), the replication strategy of the genome (minus strand RNA, plus strand RNA, circular genome, linear genome, segmented genome, retrotranscription) and the morphology of the virus particle (capsid size and type, virion assembly, type and number of membranes, nuclear or cytoplasmic assembly). These features are generally maintained during virus evolution. Historically, viruses were first classified by the disease they caused, but this led to much confusion, as a single virus can be responsible for an array of disease (or no disease) states in different hosts. For example, viruses that induce hepatitis (liver swelling and resulting jaundice) were called hepatitis viruses. Yet we know that there is no relationship between [...] virus (a +RNA virus) and hepatitis B (a pararetrovirus with a DNA genome). Morphological classification according to appearance using an electron microscope was more useful, but this too proved inadequate, as distinct viral species can be morphologically identical. In 1971, D. Baltimore proposed a viral classification scheme based on the genome type, polarity and organization. Current classification also includes sequence similarity and gene organization to assign and differentiate viral species.

Viruses vary substantially in genome size and content. Defective viruses can be as small as several hundred nucleotides and not code for any open reading frames. [...] viruses and dependo-viruses are a bit larger, but can code for as few as one gene. Most viruses code for between 10 and 20 genes and have genomes that range between 5 and 25 thousand nucleotides. The largest viruses are dsDNA viruses found in bacteria (B. megaterium phage, 670 kbp), micro algae (Pyramimonas, 560 kbp), and amoebae (mimivirus, 670 kbp). The mimivirus is the largest and most recently discovered (2003) virus. It has genes clearly related to the phycodnavirus and poxvirus, but it is so large that it will not pass 0.2 micrometer filters. It encodes over 900 open reading frames, 80% of which are unique to the virus, and has more ORFs than some free living cells.

The overall diversity of viruses is hard to estimate since so many have not been characterized. The current virus database has about 3,600 viral species. This relates to about 30,000 virus strains and subtypes. Analysis of this current collection suggests that ssRNA viruses are the most species diverse, followed by the dsDNA viruses, then the dsRNA viruses and the ssDNA viruses. However, these numbers are likely to be highly biased due to sampling limitations, as we have focused our studies on viruses of E. coli as well as viruses of humans and their domesticated animals and plants. Clearly, relatively unstudied habitats are known to exist which have enormous populations of certain virus types not included in this database. For example, about 20,000 species of polydnaviruses (genomic DNA viruses of parasitoid wasp species) are estimated to exist, and about 10^31 dsDNA viruses are found in the oceans, which are mostly unclassified. Thus our current tally of virus species might be an enormous underestimate. Viruses of humans are probably the best studied, and we can estimate on the scale of less than 1000 human specific exogenous viruses (about 100 [...], 100 papillomaviruses, 40 adenoviruses, smaller numbers of herpesviruses, polyomaviruses, parvovirus, various RNA viruses). The human genome also harbors a large number (thousands) of endogenous retroviruses, most of which appear to be inactive. It is difficult to know if these numbers of viruses are representative of other species or if humans host an unusually large number of viruses. One thing does seem clear: viral species are well in excess of host species numbers.

Virus habitat. Viruses not only have ecological habitats in the usual sense, such as oceans, [...] etc., but they also have host and [...] specific habitats that are also very distinct, such as bacterial, fungal, [...] host. Each of these habitats will tend to have its own specific viral [...]. For example, bacterial cells differ in many basic ways from a eukaryotic cell, such as the cell wall, lack of nucleus or mitochondria, mixed [...] and [...]. The most common and diverse of the bacterial viruses, including [...], are large tailed phage, containing dsDNA and resembling the phage lambda, and some of these viruses integrate into host DNA. Large tailed DNA viruses are essentially absent from metazoans, and large DNA viruses do not normally integrate into metazoan genomes. Unicellular eukaryotic green algae also show a particular viral ecology in that their viruses tend to be large non-tailed dsDNA viruses. As these organisms are the most abundant cells in the oceans, it is likely that they are the host for the large numbers of tailed phage-like viral particles found in the oceans (10^8 to 10^11 particles per liter).

There are additional host order associated differences in virus occurrence. For example, higher plants are observed to support a very large number of +ssRNA viruses, which are uncommon in many other host orders, such as bacteria. Conversely, mammals support infection with herpes and retroviruses, both of which are absent from higher plants. Most filamentous fungi are persistently infected with some form of dsRNA virus, whereas mycoplasma tend to support infection with ssDNA viruses. Fish and bats support the infection of many rhabdoviruses (-ssRNA), which are rare in avian species. Overall, we see broad but well maintained patterns of virus/host relationships. These patterns also apply to isolated host populations. For example, diverse virus types, from algae to fish to mammals, will often be distinguished by being adapted to either new or old world populations of their host species. It is assumed that the various hosts provide specific habitats that will favor or allow only certain types of virus to succeed.

Host evolution. It seems most likely that cellular life initially evolved in the oceans. It also seems that there must have existed a pre-biotic, acellular system that led to the [...]. Viruses are defined as obligate intracellular parasites, leading some to conclude that viruses must have evolved after the evolution of the first cell forms. However, as will be presented in chapter 2, viruses are simply molecular genetic parasites and as such they are capable of parasitizing any replication system, including other viruses as well as pre-biotic systems. We thus have reason to think that even prior to the evolution of cellular life forms, there might have already existed molecular genetic parasites. Evolutionary biologists typically follow host evolution by examining the homology of organismal physical characteristics. Although bacteria are simple cells, they still retain much homology, such as cell wall structure, for example. But this process presents a problem for virus evolution.

[...] records indicate that cellular life or the [...] started about 4 billion years before present (ybp). It is often suggested that this first cellular life form was the common progenitor to all life, and it has been called the Last Universal Common Ancestor (LUCA). Recent sequence analysis of the major domains of extant life forms suggests that the number of genes in common to all life is surprisingly small, about 360 genes. These genes are thought to be descended from LUCA, but curiously replication [...] are not included in this conserved set. Archaea and Bacteria are then thought to have diverged early from LUCA, near 4 billion ybp. The Cyanobacteria appear to have been the next major cell type to have evolved, about 2.6 billion ybp. These three orders of prokaryotes all have currently distinct and characteristic viruses, which will be presented in Chapter 3. However, common to all these prokaryotes are the tailed phage, which appear to have evolved prior to the divergence of the host cells. The earliest eukaryote appears to have evolved from about 2.2 to 1.8 billion years ago, corresponding to the first unicellular algae. After the emergence of this algae, there was a period of relative stasis, and for more than one billion years cellular life appears to have evolved slowly and changed little. However, at the end of this period, for reasons that remain unknown, living systems appeared to have acquired a method for evolutionary creativity that resulted in the Cambrian explosion and a burst of new species. Many feel that some functional process of genetic innovation must have been acquired at this time to allow the transition to rapid evolution. This unknown system of genetic novelty appears to have come into existence at this time. However, it was neither the acquisition of sex, [...], or sperm, as all appear to have been creations that followed the Cambrian explosion. The period was immediately preceded by the origin of filamentous algae.

Earliest to diverge after unicellular algae were the diplomonads/parabasalia, which include the trichomonas and giardia species; these are species with di-morphic nuclei, two nuclei that separate germ line from soma function. Another kingdom of microscopic organisms to have diverged relatively early were the ciliophora (tetrahymena, plasmodium) and Euglenozoa (euglena, leishmania, trypanosoma).

The Viridiplantae lineage appears to have descended from green algae and gave rise to green plants (arabidopsis, solanaceae, chlamydomonas). Another divergence gave rise to fungi (saccharomyces, schizophyllum), which split off to form metazoa. In metazoa, a basal divergence was that of caenorhabditis, basal to protostomes from deuterostomes, followed by the divergence of insects from [...].

About 545 mybp, the Cambrian explosion of species occurred, leading to the immense increase in evolution that has led to all modern life forms. This is mainly observed within the fossil record by the abrupt appearance of numerous skeletal forms (trilobites, mollusks, echinoderms), which are totally absent from the prior fossil record. As presented in chapters 5 and 6, the evolution of these skeletal animal forms is also correlated with the likely emergence of numerous types of virus. The very first ocean animals had evolved prior to this explosion, and these were flat, boneless, eyeless, mouth-less and brainless filter feeders that became extinct at the Cambrian period. Also prior to the Cambrian period, there were no predators of these early animals. In addition, various species of algae that had been alive for long periods also became extinct. The mechanisms that could account for this mass planktic extinction remain unknown. Interestingly, C. Emiliani proposed in 1982 that mass extinction of planktic ocean species during evolution may have been due to selective sweeps by lytic virus infections.

In terms of modern life forms, the evolution of fungi marks a most important lineage, as it was directly involved in the origin of animals and indirectly involved in, but central to, the origin of terrestrial plants. About 450 million ybp, plant and animal life emerged from the oceans onto the land. Fungi had acquired characteristics that were able to withstand the desiccation and [...] of this new harsh environment. Fungal symbiosis with plants appears to have allowed them to create root systems with the ability to pull in water and nutrients from soil, in combination with photosynthesis, which produced [...] based source for the fungi. There is also evidence that the emergence of life onto land had a major effect on life in the oceans. The oceans are generally nutrient poor habitats, perpetually in a state of famine, that tend to resemble deserts. Land occupation by life appears to have significantly increased the flow or runoff of nutrients into the oceans, increasing the ability of this habitat to support diverse life forms. In fact, current estimates are that land based species represent about 50 [...] the biomass of the combined oceanic species. On land, modern plants represent a majority of this biomass. Although, unlike the oceans, we are hard pressed to estimate the combined [...] on land species, we do know it to be high for land plants (chapter 7). In addition, the fungi of land plants also support many types of virus. Consider, for example, the Douglas fir, which can host 2,000 species of fungi; most of these fungal species themselves host dsRNA virus (chapter 5).

Dilemmas of host evolution. Throughout this book, we will examine major transitions during host evolution, with an aim to evaluate any possible viral role in these events. We will also consider if viruses might affect host evolution in punctuated or episodic patterns. Starting with simulations of prebiotic evolution (chapter 2), then considering the problem of the acquisition of complex phenotype in bacterial evolution (Chapter 3), I will present an outline of evidence for possible viral involvement. A consideration of what must be the biggest problem in evolutionary biology, the acquisition of the eukaryotic nucleus, will be evaluated in chapter 4. Chapter 5 will examine the origin of innate immune systems in multi-cellular organisms. Chapter 6 will consider the origin of the adaptive immune system in vertebrates. Chapter 8 will examine the dilemma of viviparous mammals and the origin of the placenta and live birth. Finally, we will consider the distinctions between primate and [...]. All of these issues will be examined from the perspective of viral involvement. Finally, the accumulation of endogenous and defective viruses in the genomes of all cellular life forms will also be presented.

How viruses evolve. There are several major difficulties that apply to the study of virus evolution. Viruses leave no fossils in the geological record, thus we have no outside reference to calibrate the time of possible [...] events. Another problem is that viruses clearly have numerous origins and are thus polythetic. Hence, we cannot easily accommodate them into one congruent tree of life. The polythetic character corresponds mainly to the specific genome replication strategy of the individual virus families, each of which seems to have evolved from a distinct common ancestor. For example, all small dsDNA viruses (papilloma or polyoma viruses) seem related to each other and probably have common origins. However, some viral groupings, such as the +RNA viruses, are so large and diverse that available sequence data does not support the view that they evolved from one common ancestor. In this case, it appears there may have been several origins of the positive template riboviruses. There is also the problem that viruses are too simple and show no homology in the classic sense. We cannot follow homologous traits suitable for phylogenetic analysis in virus evolution. Yet it is still clear that viruses do have lineage and evolutionary relationships. Viruses generally conserve information regarding replication proteins and cis-genomic signals for replication. Although high mutation rates can obscure this information, consensus sequences at important [...] domains are generally conserved and provide useful phylogenetic data. Viruses also conserve replication strategy and gene order. Viral morphology and [...] is another generally conserved feature of a virus [...]. All these traits can be used to deduce virus evolution. However, that evolution may not be linear, and different parts of viruses can have different evolutionary histories. In fact it is now generally accepted that, at least for bacterial DNA viruses, evolution has occurred mainly by high level recombination of sub-gene domains. An example of this problem is seen with the lambdoid phage. Although these viruses are clearly similar in life strategy and morphology, and many are capable of recombining with each other, no one gene, including core replication genes, is conserved amongst all these phage. The high rates of recombination within these genomes appear to have erased any record of sequence information that could have linked their lineage. This makes it very difficult to understand evolutionary relationships amongst this phage family.

Understanding the ultimate origin of viruses seems unattainable. If they are as old as all life, their high rates of evolution have erased any useful record of their lineage or age, so their antiquity might only be inferred, not deduced. Various scenarios have historically been proposed to explain viral origins. Viruses with RNA genomes represent the only current entities that use RNA for the purpose of storing genetic information. Thus it has been proposed that Riboviruses may trace their origins to the RNA world, the era prior to DNA genomes. Although seemingly logical, we currently have no way to evaluate this idea. Negative stranded viruses in particular seem not to have any host analogues for their structure or replication in any known cell type. For example, there is no cellular equivalent of this viral replicase, yet all negative stranded viral replicases seem distantly related; thus they may trace their origins to a time before cells. With the plus stranded RNA viruses, although they all have replicases with similar structure, leading some to suggest a common origin, subsequent sequence analysis did not support the existence of a common ancestor for all of them. These viruses appear to exist in 'supergroups' of more related replicases. Although each of these 'supergroups' may have a common ancestor, there is little support for linking these higher-level taxa.

Another idea is that viruses represent escaped bits of host genomes, such as an [...] and a corresponding [...], along with genome binding-coating proteins. This idea has been proposed for various DNA viruses, especially those which use host-like replication processes. However, phylogenetic analysis does not support a cellular origin for any of the DNA viruses. For example, all evaluated eukaryotic viral DNA families originate from progenitor virus, some of which can be traced to bacterial phage, not host cell ancestors. Another related proposal has been that retroviruses may also evolve from host DNA sequences. Accordingly, various retroviruses appear to have originated from sequences found in the host genomes (such as transforming retroviruses). Thus, it has appeared that there was some support for this idea of escaped host elements. Also, individual viral genes sometimes show strong similarity to homologous host genes, apparently supporting the idea that these are host-derived genes. However, the emergence of a full virus, not just one specific gene, from the host genome has not been observed. Almost all of the examples used to argue this idea can be traced to the emergence of an endogenous and sometimes defective [...] that was colonizing the host, so they cannot really be said to have originated from host elements.

Another historic view was that viruses were degenerate unicellular life forms that had lost some genes and become obligate intracellular parasites. This idea was especially applied to the large DNA viruses, such as the poxviruses, which physically resembled bacteria and have complex genomes. However, with the sequencing of these viral genomes, it has become clear that they do not originate from bacterial or other unicellular genomes. They all originate from well established viral lineages. Thus the most supported view is that most viral lineages are old, originate independent of host replication systems and that several independent viral origins have occurred.

Mathematical biology, host populations and virus evolution. Mathematical biology is the application of mathematical descriptions to biological issues. The relationship between a virus and its host can be, and has been, mathematically modeled. Current models can be traced to the initial developments by Vito Volterra in the context of predator prey models. The concept is that a virus behaves much like a predator preying on its host. Virus growth is dependent on 'consumption' of its host, so the virus and host population densities will be linked, but in a predictable way. This results in a differential equation called the Lotka-Volterra equation. In addition, if coefficients are added for the transmission efficiency of the virus, natural and virus induced host death rates or survival from infection and subsequent immunity, and the birth of new hosts, mathematical models can be developed which predict the outcome of viral growth or [...]. With these models, we can understand the epidemic behavior of a virus with respect to host population density and host immunity, as first done for smallpox epidemics by D. Bernoulli, the famous mathematician.

When a virus is initially introduced into a dense and non-immune host population, an epidemic called a virgin soil epidemic will result. This would be the situation, for example, of what happened to the human population of the New World with the introduction of the smallpox virus by the Spanish conquistadors (a new virus infecting a naive dense population). However, once this initial sweep of the virus has occurred in the population, a different virus-host dynamic will be established, depending on the various coefficients. Essentially all surviving adults will have been infected by the virus and be immune to subsequent infection. This creates a situation in which only newly born (or immigrating) hosts will not be immune, and they will constitute the main host for virus propagation. The virus will thus become a childhood disease and endemic to the host population. However, in order to produce a sufficient number of host offspring to maintain the chain of virus transmission, the host population needs to be sufficiently large and interacting. For example, it is estimated with various human acute viruses that single populations of 50,000 can maintain some viral diseases. In the context of modern human (post agricultural) populations, these numbers are easily attained. Thus a large host population density allows the maintenance of acute [...]. For example, European populations during the middle ages sustained childhood infections with acute viruses such as smallpox and [...] virus. However, throughout most of human evolution, human populations were composed of hunter-gatherers, whose populations were much smaller and could not have sustained the transmission chain of essentially any acute virus. Given that most terrestrial host organism populations are not in large dense populations, virus-host evolution will be significantly restricted.

These mathematical models have been highly successful and have accurately predicted the outcome of various human viral epidemics, including the HIV epidemic in Africa. They also allow us to understand that distinct selection conditions can apply to the same virus in the same host, but in different population or immunity structures. For example, in virgin soil epidemics, virus is selected for rapid spread. This presents a selective situation that closely resembles the R or growth dominated selection of newly introduced species in island ecologies. However, in an already infected population, the virus is selected for infection of the diminished host capacity, which resembles the K or equilibria based selection of established island ecologies.

Viral populations, quasi-species and the fitness landscape. A distinguishing characteristic of virus populations is the capacity of their genomes to be highly variable. Some virus populations, such as HIV-1, are so diverse that it is estimated there may be no two genomes in a population that are identical to each other. Furthermore, this diversity can be very rapidly generated. One measurement suggested that about 10,000 variants could arise from one round of a single cytotoxic T-cell (CTL) infection, even from a cloned genetically homogeneous HIV template used to initiate one round of infection. This diverse genetic character of viral populations has been called a quasi-species. The term quasi-species was originally developed by Manfred Eigen and Peter Schuster to explain the diversity of chemical replicators that would result from an inaccurate replication process, as part of their study of the chemical basis for the origin of life. They coined the term quasi-species to name an error prone process of chemical replication which will result in a population of related but non-identical chemicals. However, the term species has introduced some confusion, since this has a different meaning in a biological context, that is, an interbreeding population of living organisms. As it applies to viruses, these two uses of the term species sometimes overlap, since a genetically diverse virus population may also represent a population that exchanges genetic information. In fact, it has been measured that quasi-species of some viral populations can have a distinct relative fitness profile, suggesting that evolutionary pressure may operate on the quasi-species level.

The absolute amount of sequence variation in a viral genome, however, can be enormous and has been called 'hyper-astronomical'. In the case of HIV, with a genome length of 10,000 nucleotides, the total possible sequence variants corresponds to 10^6020. This number is large beyond comprehension. For example, compare it to the estimated number of protons in the universe, at about 10^80. This vast variation in sequence can be thought of in terms of sequence space, where each variant occupies a coordinate in highly dimensional space. The distance between variants corresponds to the smallest number of point changes (Hamming distance). Although actual viral quasi-species can be large populations, they are relatively small compared to this total possible sequence space. For example, all of the combined [...]

[...] through fitness landscapes. In fact it is estimated that HIV-1 can evolve 1,000,000 times faster than its host. The rate of sequence change, although very fast, often appears to be essentially constant with time. Because of this, it can be used like a clock to estimate times of evolutionary sequence divergence (a molecular clock). However, the molecular clocks for RNA viruses are so fast that some have argued that essentially all RNA viruses are recently evolved (about 10,000 ybp). The very high adaptability of an RNA virus or retrovirus is dependent on a high error rate. Inherent in the quasi-species theory is the existence of an error threshold, at which the error rate is so high that no viable virus will be reproduced. This point is also called error catastrophe. It is thought that without error correction systems, RNA genomes will have restricted lengths due to this threshold (about 30,000 nucleotides). Some inhibitors of virus replication have been established to increase the replication error rates and push viral systems into error catastrophe.
sequence variation of all people on that have had HIV might correspond to 1060 Fitness. Fitness is thus thought to be the genetic variants. Because a real quasi-species virus feature that increases the survival and would only correspond to s very small reproductive capacity of an organism. Fitness fraction of this potential sequence space, it would thus dictate the viable landscape of can be though of as occupying a relatively sequence space that a quasi species would follow small of the entire sequence space. to attain higher fitness. However, although we can readily agree on the concept of fitness, it is surprisingly difficult to define in measurable Fitness landscape and error catastrophe. terms. In the above section we have provided Evolution of a sequence often involves the some definitions of fitness. Two terms that are of sequences to higher fitness. commonly used to measure fitness Other parameters, such as sampling experimentally are relative fitness and restrictions (genetic bottlenecks) inherent to reproductive ratio (R0). Relative fitness is the small numbers of virus needed to establish a ratio of the number of progeny relative to the successful virus transmission and neutral average number of progeny expected for the drift also lead to population based sequence population. For reproductive ratio, in terms of changes. The pathway by which one virus we can define this as the number of newly sequence can evolve to another more fit infected organisms that arise from one initially sequence can thus be thought of as a fitness infected organism. Thus fitness is normally landscape that can exist in sequence space. measured in terms of progeny or reproduction. 
Because viruses can have very high error However, as we will see below for persistent rates (10-4 per viral genome base compared viral infections, fitness definitions may need to to 10-9 per host genome base), plus their be broadened from those simply dependent on small genomes and rapid generation times, progeny. they also have the capacity to rapidly move 10 Persistence and fitness. The foundations of how many successful offspring the virus makes the above mathematical biology all have as a is important. Yet if we reexamine our general premise that viruses have a predator-like definition of fitness, we see that the survival of relationship with their host. This is certainly the juvenile individual, not just the number of well supported in the disease based epidemic offspring is important. A long-life individual, if models we have considered, such as stable, can be the organisms that is left after a smallpox, measles, , selective sweep has exterminated its replicating virus, etc. Given the very origins of our competitors. In this case, the fittest organisms or concepts of virus are as disease agents, this virus are the ones left standing, without having predator-prey relationship is fully justified. necessarily made many or any progeny virus. We have previously defined the acute viral Consider as a specific example the fitness of a life strategy as one which does not persist in long term persistent virus, such as herpes zoster individual host. All these models and of humans. After initial infection and examples fit this acute lifestyle definition. establishment of persistence, the virus can spend However, we have also previously defined up to 50 years as a very low level persistent the persistent life strategy of a virus. (latent) infection in a single ganglion, not Examples of viruses that persist in making any progeny. 
However, to be fit it must individual host are numerous and span the be able to reactivate with high probability after a entire spectrum of virus and host types. particular and long duration and make enough However, even an initial consideration does virus to infect a new host, such as grandchildren not appear to support the idea that viral of the host, reestablishing another generation of persistence has a predator like relationship long lived persistent infection. A temperate with its host. As a rule, persistent infections phage in its host bacteria can show similar are inapparent and generally asymptomatic. temporal stability. E. coli with temperate lambda They do not consume their host. Instead phage can be passed for hundreds of bacterial they involve mechanisms or strategies that generations, seldom if ever reactivating. But ensure the maintenance and stability of the with the proper environmental queue, such as viral genome, but may compete with and UV induced DNA damage, the virus will exclude other genetic parasites. Persistence reactivate with high probability in almost all the is also often not associated with high level cells. It seems in this circumstance, the temporal or maximized production of progeny. The stability is part of the viral fitness as well as more typical relationship is that a low retaining the capacity to sense the appropriate amount of virus production is all that can be environmental signals. However, our needed to transmit to either new host or understanding of persistence is poor, both due to progeny host. In some cases, such as experimental and theoretical difficulties. temperate phage, the virus is a genetic Persistence presents a real problem for element of the host genome so its mathematical models and confounds most of the reproduction is dependent on host viral as we currently apply to the reproduction. This is clearly not the study of viral epidemics. As a current example, relationship of a predator to its prey. 
epidemic models were recently developed and presented for the SARS epidemic. These models What then defines the fitness of a stable, appeared to have accurately predicted the persistent infection? We have noted that containment of the epidemic. However, if SARS reproductive ratios are often used to measure had established a persistent infection in some acute viral fitness. These ratios are patients, the models would have failed to address dimensionless metrics, so there is no this situation and also failed to predict the temporal component to this application of outcome. fitness. How long the virus or its host persists is not what is important, but rather 11 Persistence, populations and evolution. In These slower molecular clocks are equivalent to general, it seems clear that persistent viruses the slow molecular clock rates of their host have a different population structure then genomes. The basis for this stability has not acute viruses. It also seems clear that they been determined. It is possible that the virus have a different evolutionary pattern and simply colonizes specific cells in small numbers relationship to their host than acute viruses. to persist for long periods and that the resulting Persistent viruses tend to show much greater homogeneous progeny virus represents relatively genetic stability, which can be observed on few replication rounds from these small numbers an evolutionary time scale. Persistent of colonizing genomes. This issue needs further viruses also show a pattern of co-evolution evaluation. with their host. This co-evolution is in keeping with the fact that persistent viruses In terms of host population structure, viral also tend to be highly host species specific. persistence also differs from that of acute virus Both of these issues are to be examined in and their host. Persistent viruses are not very great detail in all the subsequent chapters. 
dependent on host population densities and can be found highly prevalent in non-gregarious host The genetic stability of persistent virus populations. In terms of humans and their infection was initially noticed with many association with human specific viruses, pre- persisting DNA viruses (herpesvirus, agricultural human populations were likely adenovirus, polyomavirus papillomavirus). infected with most of these persistent human Infections with these viruses tend to be viruses and even primate relatives of humans much more homogeneous and do not to harbor most of these types of viruses. Thus the show the quasi-species population structure persistent virus-host relationship is stable on an discussed above. The viral genetic stability evolutionary time scale and is not dependent on is such that it can be used to follow the host population densities. This viral life strategy migration and even the evolution of its host. is highly prevalent in natural host populations of It was initially assumed that because many essentially all orders. However, this virus-host persistent infections were due to DNA relationship necessitates that the persistent virus viruses, error correction mechanisms has developed a process of transmission that is prevented the generation of the genetic closely linked to host biology. Thus persistent diversity that is characteristic of acute virus viral infections tend to be transmitted from old to populations. However, there are now young host, during sex and birth or by some numerous examples of species specific other process that is inherent to the host life persistent infection with various types of strategy (such as milk borne virus for mammals). RNA viruses that also show genetic stability Thus they are less associated with population and co-evolution with their host. This has structures then acute viruses. 
This means that been observed with hantavirus and persistent viruses must have the capacity to sense in their native rodent host as biological or temporal queues to insure high well as rhabdoviruses in their bat host and probability transmission at the opportune times. influenza viruses in the water fowl host. Persistence will also require that the virus has Because all of these RNA viruses have been some mechanisms that assure maintenance of the experimentally established to be able to virus and prevent elimination. As we will rapidly generate diverse genome populations present, maintenance can sometimes be assured in laboratory settings, it is clear in these by the use of a system called addiction modules cases these viruses also have high error rates (as seen in various unicellular host). Prevention in RNA replication, yet they mostly of virus elimination will require mechanisms that maintain homogeneous populations in nature counteract host immunity systems as well as and show molecular clocks that are much sometimes involving mechanisms that suppress slower then those observed in acute viruses. competition by other genetic parasites. The main 12 point to emphasize is that persistence infectious processes. We will thus consider why requires some phenotype or strategy in order an infectious process was not apparently to be fit, to attain temporal stability and maintained in . A major dilemma in assure transmission. This will clearly evolutionary biology, which will receive special differentiate the concept of persistence from attention, will be the origin of the eukaryotic that of selfish DNA or genes, as by nucleus. The relationship between the origin of definition, selfish DNA has no phenotype the nucleus in unicellular algae and viral for the host. parasites will thus be examined in some detail. The early life of the oceans is thought to be Finally, it needs to be noted that a persistent crucial for the evolution of all higher life forms. 
virus in one host can be an acute virus in Thus we will present one chapter which another host species or host population. evaluates what we know concerning viruses of Some persistent viruses are able to jump aquatic animals and the early evolution of species or shift host to become acute viruses metazoans, maintaining the theme of considering in new host. Because persistence in an also what is known about persistent viruses and inherently more stable evolutionary genetic parasites that have colonized the host relationship between virus and host, this genome. As fungi were so crucial to the means that most acute viral diseases will emergence of animal and plant life from the have originated and adapted from a oceans, we address the relationship of fungi to persistent state. their viruses. Terrestrial plants, insects and their viruses appear to have evolved together to a Organizational overview. The rest of the large degree. Because of this, the chapter that chapters will examine the evolution of presents plants and insects will have the unusual viruses from the context of host evolution. organization of considering the evolution of Chapter 2 begins with issues related to plants, insects and their viruses all together. This prebiotic evolution by considering trinity will hopefully allow the reader to see the simulations of early evolution that use viral threads that link these hosts. The final computers and chemical replicator to model chapter is the longest and addresses the prebiotic origins of life. We (the reader hosts that have most often been the subject of and I) will consider the possible viral role in virus and evolution study; the terrestrial these simulations and the effect on the vertebrates. However, this chapter is presented outcome. We will next consider prokaryotes from an evolutionary context, first addressing and their viruses. 
Bacteria, Achaea, and those animals that were first to evolve and Cyanobacteria will all be presented from the diverge and also considering the viruses that context of viruses specific to these host infect them. Like the chapter on viruses of orders. An emphasis will be to include a bacteria, this chapter also attempts to summarize consideration to persistent (temperate) a detailed and rich literature so the reader will be viruses and how such viruses affect host presented with much specific information. evolution. Along the way, various dilemmas Although some of these viruses and host are of evolutionary biology (especially the indeed well studied, this organization also makes acquisition of complexity) will be noted and clear the major gaps in our knowledge, such as evidence of viral involvement in these the monotremes and their viruses which are so situations will be considered. The bacterial- poorly studied. I end the book with the phage literature is very rich and detailed and consideration of human and primate evolution it is hoped the reader will not get lost in this and our study of their viruses. Attention is paid necessary evaluation of experimental detail. to the evaluation of what makes humans distinct As we will see, bacteria are the most form our primate brethren and the viral adaptable of all cellular life forms and it has associations of this difference. become clear that they mainly evolve by 13 Throughout this book, we also consider which disease is not involved. Thus, the absence those viral agents and their defectives that or rarity of viral disease in large sea urchin farms have colonized the host genome and attempt from Japan, for example, should receive equal to evaluate the relationship of these agents attention as a natural demonstration of viral-host to host and viral evolution. From bacteria to fitness and evolutionary success. 
It is the absence humans, clear patterns exist which are of such balanced observations and our failure to frequently ignored in the traditional consider what such observations imply to the presentation of evolution. In this regard, I evolution of life that has led to the currently pay attention to those sequences within skewed, one-sided view of virus-host chromosomes that have been much less relationships. It is the intent of this book to studied, such as the human Y provide the less popular but more biologically and why it is so colonized by endogenous relevant perspective of persisting viruses. retrovirus. Too often, such sequence is dismissed as junk, thus ignoring any The perspectives of this book will often stem virological inferences. from the consideration of simple, child-like questions, such as where do viruses come from? Why do some viruses persist? Why do some Another perspective addressed in many viruses make us sick, but not others? The reality chapters of this book will be the periodic is the most viruses don’t cause disease in their examination of natural biological native host. So more questions will follow, such populations with respect to their viruses as as: why not and why only sometimes? Viruses well as the experiences with viruses of those will be seen to be fundamental, present at the who are practitioners of biology: the brewer, dawn of life and present today as species the farmer, the fisherman, all those that have differentiate. Pathogenesis can be looked at as had practical experience with large resulting from a failed persistence. New host populations of living organisms and have orders may result from a successful viral witnessed the consequences of virus persistence. Evolution itself may depend on this infections. Natural or studies are colonization process in order to create genetic significantly unrepresented in the virology complexity. 
It is hoped that this book literature but are essential in order to organization will stimulate many other such evaluate the relevance of our many questions and possible answers and serve to laboratory developed viral models to actual remind us that we really have much to learn about virus-host relationships. It is important to what viruses are and what they do to life on our understand the realities of virus-host world. Viruses are part of this world with an relationships in an ecological context and evolutionary power that is immense and not simply consider virus-host relationships unmatched by any other living entity. But how from the perspective of diseases caused by virus evolutionary power applies to host evolution viruses or from laboratory models, most of is a topic in need of much study. We have a which are highly selected for specific strong cultural bias regarding the concept of virus. biological characteristics. Every since Our history and suffering has led us to view human culture first recognized that viruses viruses as the hidden enemy, evil entities that exist and can cause diseases of humans and simply seek to destroy life and many books have diseases in their domesticated plants and been written with titles along these lines. But animals, disease eradication has been the perhaps we have simply ignored those situations main perspective and goal of virology where destruction is not the outcome. And studies. However, in order to better sometimes it appears that the destruction wought understand how viruses affect the evolution by a specific virus are of the enemies or of all living organisms, we now need to competitors of the species that harbors it. For include those numerous observations in example, Humans have been lethally infected by 14 viruses found in African monkeys (HIV), 1991) (Lipsitch, Nowak et al. 
1995) (Kerszberg water fowl (influenza), Gambain 2000) (Szathmary 1988; Szathmary 1992) () or civit or their prey (SARS), all of which host these very same Issues of evolutionary biology: (Maynard Smith viruses as benign persistent infections. Do and Szathmâary 1995) (Orgel and Crick 1980) the viruses of these species really exist only (Doolittle and Sapienza 1980) (Giske, Aksnes et so that they can adapt to destroy us? What al. 1993) (Cracraft and Donoghue 2004) about the silent viruses all humans harbor but will seldom if ever cause disease in us or any Evolutionary virology: D. P. Mindell, , J S Rest other organism, such as TT virus, papilloma and L. P. Villarreal. Viruses and the Tree of viruses and polyomaviruses? Of what Life. In Tree of Life. Oxford University Press, species are they the enemy? Clearly our (in press) 2003. perceptions of virus as simply the enemy of life are simpleminded and faulty. JJ Holland, The Origin and evolution of Viruses, Ch 2 in (Topley, Wilson et al. 1998) (Nowak and However, it is worth considering a perception Schuster 1989) put forward by S. Luria, who helped express the modern definition of a virus as infective (Szathmary and Demeter 1987; Villarreal, genetic material. That definition was first Defilippis et al. 2000) presented by S. E. Luria (: an essay on virus reproduction. Science 111 Viral biology/ecology: (Hurst 2000) (Cooper p.507. 1950). Later that decade, when 1995) (Oldstone 1998) considering the role viruses might play in host evolution of the host he wrote; References. “… may we not feel that in the virus, in their merging with the cellular genome and re- Brock, T. D. (1961). Milestones in microbiology. emerging from them, we observe the units Englewood Cliffs, N.J.,, Prentice-Hall. and process which, in the course of evolution, Brock, T. D. (1999). Milestones in microbiology have created the successful genetic patterns 1546 to 1940. Washington, DC, ASM that underlie all living cells? “ Press. 
In Virus Growth and Variation, 1959 Cooper, J. (1995). Viruses and the environment. New York, Chapman & Hall. I would amend this perception to include that Cracraft, J. and M. J. Donoghue (2004). it is the viruses that can persist in their host Assembling the tree of life. New York, cells which have left their indelible mark on Oxford University Press. and assisted in the cells of all life. Doolittle, W. F. and C. Sapienza (1980). "Selfish genes, the phenotype paradigm and genome evolution." Nature 284(5757): 601-3. Recommended reading. Eigen, M. and R. Winkler (1992). Steps towards life : a perspective on evolution. Oxford ; Historic accounts and definitions: (Brock New York, Oxford University Press. 1961; Brock 1999) (Luria, Kellenberger et Giske, J., D. L. Aksnes, et al. (1993). "Variable al. 1959) (Luria 1959) generation times and Darwinian fitness measures." Evolutionary Ecology 7: 233- 239. Mathematica biology: (Nowak and May Hurst, C. J. (2000). Viral ecology. San Diego, 2000) (Eigen and Winkler 1992) (Nowak Academic Press. 15 Kerszberg, M. (2000). "The survival of slow Szathmary, E. and L. Demeter (1987). "Group reproducers." J Theor Biol 206(1): selection of early replicators and the 81-9. origin of life." J Theor Biol 128(4): 463- Lipsitch, M., M. A. Nowak, et al. (1995). 86. "The population dynamics of Topley, W. W. C., G. S. Wilson, et al. (1998). vertically and horizontally Topley & Wilson's microbiology and transmitted parasites." Proc R Soc microbial infections. London Lond B Biol Sci 260(1359): 321-7. New York, Arnold ; Luria, S. E. (1959). Viruses: A survey of Oxford University Press. some current problems. Virus Villarreal, L. P., V. R. Defilippis, et al. (2000). Growth and Variation. A. Isaacs and "Acute and persistent viral life strategies B. W. Lacey. London, England, and their relationship to emerging Cambridge University Press: 1-10. diseases." Virology 272(1): 1-6. Luria, S. E., E. Kellenberger, et al. (1959). 
Virus growth and variation: Ninth symposium of the society for general Figures and tables. microbiology London, England, Cambridge University Press. 1-1. Table summarizing the distinctions Maynard Smith, J. and E. Szathmâary between acute and persistent life (1995). The major transitions in strategies of viruses. evolution. Oxford ; New York, W.H. Freeman Spektrum. 1-2. Figure (dendogram) showing the broad Nowak, M. (1991). "The evolution of pattern of host evolution viruses. Competition between horizontal and 1-3. Table summarizing the virgin soil of mobile genes." J.Theor.Biol. 150: epidemics, New world. 339-347. Nowak, M. and P. Schuster (1989). "Error 1-4. Table summarizing some historical viral thresholds of replication in finite epidemics. populations mutation frequencies and the onset of Muller's ratchet." 1-5. Table summarizing observed characteristics J.Theor.Biol. 137: 375-395. of acute and persistent viral infections. Nowak, M. A. and R. M. May (2000). Virus dynamics : mathematical principles 1-6. Table summarizing the characteristics of of and virology. Oxford the fitness of persistence. ; New York, Oxford University Press. 1-7. Table summarizing some genes Oldstone, M. B. (1998). "Viral persistence: associated with persistence mechanisms and consequences." Curr Opin Microbiol 1(4): 436-41. 1-8. Table of host genome evolution showing Orgel, L. E. and F. H. Crick (1980). "Selfish acquisition of genetic parasites and DNA: the ultimate parasite." Nature comparison to virus. 284(5757): 604-7. Szathmary, E. (1988). "A hypercyclic illusion." J Theor Biol 134(4): 561-3. Szathmary, E. (1992). "Viral sex, levels of selection, and the origin of life." J Theor Biol 159(1): 99-109.
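
The epidemic models discussed in this chapter add coefficients for transmission efficiency, recovery with immunity, and host birth and death. As a rough illustration only, the sketch below assumes one common textbook form of such a model (a susceptible/infected/recovered scheme with births and deaths); the chapter does not specify the equations, and the function and parameter names (`simulate_sir`, `beta`, `gamma`, `mu`) are hypothetical choices made for this example.

```python
def reproductive_ratio(beta, gamma, mu):
    """Reproductive ratio R0: new infections caused by one infected host
    in a fully susceptible population (assumed SIR form, not from the text)."""
    return beta / (gamma + mu)


def simulate_sir(population, beta, gamma, mu, days, dt=0.1, initial_infected=1.0):
    """Simple Euler integration of an SIR model with host births and deaths.

    beta  : transmission coefficient per day (hypothetical parameter)
    gamma : recovery rate per day; recovered hosts stay immune
    mu    : per-capita birth and natural death rate per day
    Returns (susceptible, infected, recovered, peak_infected).
    """
    s = population - initial_infected
    i = initial_infected
    r = 0.0
    peak = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i / population
        ds = mu * population - new_infections - mu * s  # births enter as susceptible
        di = new_infections - gamma * i - mu * i
        dr = gamma * i - mu * r
        s += ds * dt
        i += di * dt
        r += dr * dt
        peak = max(peak, i)
    return s, i, r, peak


if __name__ == "__main__":
    # A 'virgin soil' introduction: one infected host in a naive population.
    print(reproductive_ratio(0.5, 0.1, 0.0))
    print(simulate_sir(10000, 0.5, 0.1, 0.0, days=200))
```

With `mu = 0` there are no new susceptible hosts, so the epidemic sweeps through and burns out; setting `mu` above zero supplies newborn susceptibles, which is the condition the text describes for a virus to persist as a childhood disease in a sufficiently large population.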

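
The size of sequence space quoted in this chapter can be checked with a short calculation. The sketch below assumes four possible nucleotides at each genome position and works in base-10 logarithms to avoid handling very large integers; the function names are hypothetical and are used only for this illustration. It also shows the Hamming distance, the number of point differences between two equal-length sequences.

```python
import math


def log10_sequence_space(genome_length, alphabet_size=4):
    """Base-10 logarithm of the number of possible sequences of a given length."""
    return genome_length * math.log10(alphabet_size)


def hamming_distance(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


if __name__ == "__main__":
    # A 10,000-nucleotide genome: 4**10000 is roughly 10**6020.
    print(int(log10_sequence_space(10000)))
    # Two variants that differ by a single point change.
    print(hamming_distance("ACGTACGT", "ACGTACGA"))
```

For a 10,000-nucleotide genome the exponent comes out just above 6020, in line with the figure given in the text and far beyond the roughly 10^80 protons used there for comparison.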
CHAPTER II

INSIGHTS FROM SIMULATED EVOLUTION the simulations themselves may be one of our All biological systems, including viruses, only sources of insight as we attempt to are essentially systems that store, copy and reconstruct the process from which life and express information. Because these basic viruses emerged. The concept here is that if we attributes can also apply to manmade are indeed able to understand some of the more systems of information, including theoretical aspects of early evolution, we may be computers, it seems logical to consider that able to understand and predict the emergent theoretical models derived from artificial properties leading to living systems. This computer based or simulated systems could chapter presents some of these insights of provide insight into some of the basic simulated evolution and attempts to evaluate the principles of biological information systems. relevance to extant biological systems and This hope, that simulations are biologically processes that we can now observe. informative, has been the motivation to develop and evaluate a large array of ‘Viruses’(parasites) of the Prebiotic World. chemical and computer based models that attempt to emulate the evolution of What is the ultimate origin of virus? It is biological systems. These models seek to generally accepted that prior to the evolution of create a bottom-up solution to problems of cellular life forms, there must have existed a the early evolution of life and it is hoped period in which precellular chemical life-like that if they are correct, basic features of forms (or autocatalytic replicators) existed which living information systems will also be were the predecessors of cellular life. 
elucidated, including systems that display complex (non-linear, or non-additive) behavior leading to the emergence of the complex characteristics of living organisms. The capacity for more complexity to emerge from less complex systems is a basic feature of evolving living systems which is currently poorly understood. However, a common, and sometimes compelling, argument against these artificial computer and chemical models is that the fact that something can be modeled, or even that the model has good internal consistency or behaves in complex ways, does not necessarily make it applicable to the real biological world. It thus becomes incumbent on the model builders to show that the systems which have been developed have clear relationships to authentic biological processes. However, when we try to model very early events in evolution, such as prebiotic evolution, we have few if any solid facts that can be applied to evaluate the validity of such simulations. In this case, [...] Such replicators would essentially be chemical entities able to catalyze and template their own synthesis from existing substrates, either present in the primordial soup or spontaneously generated. The study of chemical replicators, described below, thus attempts to create models of catalysis in which the pre-biotic characteristics can be determined. Because all existing life now uses nucleic acid based genetic information and protein based catalysis, it seems likely that the prebiotic replicators that led to extant life forms would also have been based on nucleic acid related chemistries. However, DNA is now the main molecule for the storage of genetic information, and DNA is a rather chemically inert and persisting molecule which could not perform the needed catalysis. RNA, in contrast, is known to function as the genetic material of viruses and is also able to perform some catalysis as a ribozyme. Accordingly, one of the more accepted views of this prebiotic world is that in which autocatalytic RNA was the principal molecule for both information and catalysis. This situation
constitutes the prebiotic RNA world. However, there is one problem with this reasoning. Although RNA is known to be catalytic, especially with respect to the cleavage of RNA bonds, it is not efficient at polymerizing RNA from an RNA template. Thus RNA appears to lack one of the basic features required for a prebiotic replicator. However, inefficient RNA based RNA polymerization might still suffice in the prebiotic world to allow pre-living replicator systems to get a foothold, since there would exist no competing or more efficient replicators. If so, the subsequent development and evolution of the much more efficient protein based catalysis might be considered as the emergence of a much faster parasitic protein replicator, superimposed onto the RNA templates and replicators. With the emergence of protein catalysis and fast RNA replication, it has been assumed that replicators solely based on RNA then essentially became extinct. Thus, RNA viruses and viroids may represent the sole descendants of this prebiotic world in that only they retain an RNA based genome.

In the precellular world, it is also frequently assumed that viruses were absent. This is because viruses are obligate parasites of a cellular host and are dependent on host specific systems for replication (e.g. protein translation and energy generation systems). Accordingly, viruses could only have come into existence after the genesis of the cellular life forms needed to support their replication. However, if we recall our definition of a virus as simply a molecular genetic parasite, it becomes apparent that any genetic replicator, even a prebiotic one, will be susceptible to parasitic replicators or viruses. The tendency for replicators to become parasitized, and even for the parasitic replicators themselves to become parasitized, is a well established phenomenon in virology. These would correspond to the defective viruses that can be observed to occur in most types of virus. Defective viruses are thus exactly the parasitic replicators of a functional virus, itself a parasitic replicator. Thus these parasites of parasites are expected to have existed even in prebiotic conditions and will be described further below. In addition, computer based modeling of replicator evolution (below) also suggests a role for parasitic replicators as well as the involvement of parasites of parasites.

Precellular RNA replicators; some dilemmas - As mentioned above, it is currently accepted that the pre-biotic world might have been the world of self replicating RNA molecules or catalytic ribozymes. However, as also mentioned, there are no surviving autonomous organisms or subcellular organelles that use RNA as a genetic template. Only some types of viruses (especially negative strand RNA viruses) and viroids may represent the sole remaining descendants of this RNA world, since they are the only biological entities that use only RNA (and some ribozymes) as a genetic system of information. Also, there are no known cellular analogues of RNA dependent RNA polymerase. If this inference is correct, these families of viruses may retain some basic features present during this prebiotic period, such as the frequent and conserved occurrence of secondary structure in the RNA. In the cases of viroids and negative stranded RNA viruses, there is no clear DNA based process that might have led to the generation of these RNA systems. Thus we cannot propose a DNA based origin for these viral systems. The viral replicase seems to be a very early invention in evolution which might thus predate the evolution of DNA based cellular life. Interestingly, and consistent with this idea, it has also been proposed that RNA dependent RNA polymerase, which is central to the replication of both plus and negative stranded RNA viruses, might be ancestral to DNA dependent RNA polymerase, based on protein structure and phylogenetic considerations. Alternatively, these viral RNA polymerases might have evolved independently after the evolution of DNA. We are currently unable to differentiate these two scenarios from each other since there is no way to calibrate when the differences between RNA viruses and DNA based genomes might have emerged.

Currently, the most conserved features of many RNA viroids and RNA viruses are to be found in the secondary structure of the RNA, especially the stem-loop structures associated with replication, as well as the RNA replicase gene in the case of RNA viruses. Mutation and reversion experiments confirm the importance of secondary structure for many RNA viruses outside of any coding potential. Frequently, there also exists a link between stem-loop structures and priming of RNA replication, as the replicase is usually covalently attached to the 3' end of such stem-loop structures and is needed to prime RNA synthesis. This priming reaction is unique to virus systems as no similar process is used by the host. Also, this priming step appears to define a distinct strategy to mark the molecular basis of virus identification and directs RNA replication to viral, not host, RNA molecules. However, as alluded to above, protein primed RNA replication poses a problem for current concepts of the RNA world as it requires the simultaneous evolution of the template and the replicase, without the existence of a translational system. This problem of simultaneous evolution of complementing functions is actually part of a much more general problem of evolutionary biology, that of the development of complex phenotype. The dilemma of complex phenotype will be discussed further in chapter 3.

Adhering to our basic definition of a virus as a molecular genetic parasite, it can now be argued that even pre-biotic replicators would be prone to the generation of, and infection by, viruses. An auto-catalytic self replicating RNA would need to copy not only its own genetic instruction, but would also need to copy its catalytic ability (ribozyme) to synthesize another self RNA molecule. In a sense the instruction for making more template and the catalysis for replication can be considered as two separable functions. Because of this, a variant template might still function as a template, but have lost its ability to catalyze the synthesis of another template. In addition, for evolution to occur, some process of variation in the template is needed that also results in variation in the catalytic properties of the RNA that yields a more-fit phenotype. If we consider this process from the perspective of parasites, we can see that it is possible for a parasite to separate the template function from the catalytic function of the RNA, and thus drive the evolution of the system through competition by more efficient parasitic replicators. Thus the functional separation of template from catalysis renders the RNA replicator susceptible to molecular parasites and presents a situation that drives evolution to higher efficiency.

For the molecular RNA parasites to initially emerge, all that is needed is a variant or defective version of the RNA template that is able to be copied with greater relative efficiency than the original template, yet fails to synthesize a catalytic version of itself. This defective RNA will therefore be continually copied, without needing to invest the time also needed to catalyze the synthesis of daughter molecules, thus out-competing its ‘host’ RNA template. The resulting parasitic replicator is defective for catalysis, but relatively efficient for replication. Such variations are expected to be frequent since they involve relatively simple loss-of-function mutations. However, such parasitic replicators would remain dependent on the occurrence of the replication competent, ribozyme active templates. Thus parasitic replicators are more likely to initially occur under conditions where the catalytic replicators are prevalent. In addition, this parasitic dependence on the simultaneous or episodic presence of the catalytic replicator would also establish selective conditions that would favor the persistence of the parasitic replicator. If a parasitic replicator can persist in an inert state in the environment until a catalytic replicator is encountered, it will still have the capacity for competitive reproduction. Thus, defective replicators will also be under selection for persistence in the environment. This process is in fact the well established basis for the generation of defective viruses noted above, and can explain various biological phenomena associated with defective viruses. Defective viruses are deletions of infectious viruses that often have a replicative advantage over their infectious host genomes, but may not code for any genes or catalytic activity themselves. The generation of defectives is conditional. They require the presence of an infectious helper virus, which will occur in conditions of high prevalence of infection, such as during the high multiplicity passage of virus stocks. Most virus systems (especially RNA viruses and retroviruses) are prone to the generation of defectives, which can outnumber the infectious virus. Furthermore, defectives have been experimentally shown to mediate persistent infections under some conditions. Thus, like these defective viruses, prebiotic replicators are expected to host parasitic variant versions of themselves and we can expect this process to drive the evolution of the system.

Chemical Replicators.

Another area of research seeks to study the chemical principles of prebiotic replicators. This is the study of chemical replicators. In these systems, chemical substrates are presented to a chemical replicator molecule that is able to catalyze the assembly of these substrates into a copy of itself. The main problem with these systems is that the various chemical replicators that have been found to be able to catalyze their own synthesis for the most part have no clear relationship to biologically relevant molecules. These are mainly simple organic molecules that are able to stimulate the chemical bonding of two or more substrate molecules provided in the reaction media. Thus it is not clear that these models are very informative about the more complicated biological catalysis needed for living systems. Still, it is hoped that some of the chemical characteristics of self-replicating systems can be applied to help understand the process of prebiotic replication. Some of these systems have interesting and complex topological behaviors and are able to form intriguing two-dimensional patterns of products, such as concentric circles and swirls, when conducted on flat surfaces. In some instances, however, it has been reported that these reactions can terminate in circular patterns. It is interesting that at the boundaries of the patterns there can at times be found ‘defective’ versions of replicator molecules that are assembled from substrate, but fail to catalyze the assembly of daughter molecules. The relationship of these chemical models to prebiotic conditions has been questioned since, as mentioned, they often have little apparent chemical similarity to molecules involved in living systems, such as RNA, amino acids or proteins. In addition, chemical replicators frequently lack effective mechanisms to introduce ‘genetic’ diversity, nor do they tend to show complex behaviors associated with living systems, such as increasing informational or chemical complexity. However, these replicators still retain the capacity to propagate information into future chemical generations; thus they may yet display some rather basic principles of information transmission. Nevertheless, these systems lack an essential element in that they do not link the production of the substrates needed for their own synthesis to their own replication.

Chemical replicators and hypercycles. Manfred Eigen has noted that the RNA viruses reveal two principles of organization seen in all living systems, including prebiotic systems. These two principles are cyclic reaction pathways and compartment formation. In considering the dichotomy between genotype and phenotype, he has proposed that these two processes can be spanned by cyclic feedback coupling. With respect to this coupling Eigen wrote “A mutation in the genotype that expresses itself in the phenotype brings about immediate evolutionary response.” This leads to reaction cycles with a superimposed but higher order cycle coupling he called a hypercycle. The expected time dependent behavior of these hypercycles could then be expressed in a series of differential equations. A feedback loop would then exist that connects the replicase to its RNA template, but requires that these remain within each other’s vicinity or are compartmentalized. These compartments will also allow the containment of quasi-species of template. These hypercycles have the feature of limiting competition between different replicators by the cyclic coupling, which allows complementing replicators to share advantages that might develop. They also might allow a quasi-species to maintain information content over many generations. Thus the hypercycle idea seems to account for some important behavior of biological systems and would be applicable to the early evolution of prebiotic chemical replicators. However, although hypercycles unite the workings of several genes and can evolve as a unit, one of the curious features of hypercycles is that they tolerate no internal competition, as the cycle links phenotype, or catalytic activity, to genotype. This linkage does not seem biologically realistic to many. For example, the translational machinery of the cell translates all genes, not just a linked set. In fact this curious feature of hypercycles also limits the evolution of new species of replicators. This is because the hypercycle assumes that the replicase (expressed as E_i) has highest affinity for its own template molecule (I_i). If we consider the problem of n species generating a new species (n+1), we see the dilemma that the new template and replicase (I_(n+1) and E_(n+1)) must both appear simultaneously, which is not accommodated in the hypercycle model. This is essentially the same problem of the acquisition of complex phenotype noted above and is also related to the possibility of punctuated evolutionary events. Thus the evolution of new species poses a problem for the hypercycle model. Even more troubling, however, would be the occurrence of a parasitic replicator (I) with higher affinity for the replicase (E). If this were to happen, the hypercycle would collapse. Although parasitic networks that have interdependent elements with varying half-lives and various links to phenotype and genotype might also be considered, they are not a component of the hypercycle model. Thus the hypercycle model appears not to accommodate parasitic replicators.

Insights from Artificial Life Simulations.

Computer programs have also been used to try to model and understand the behavior of self replicating genetic molecules subjected to Darwinian selection and evolution, including sexual exchange. These computer systems are collectively known as artificial life programs. The intent of these computer based studies and models is to evaluate whether life-like behaviors can emerge from man made systems that utilize self reproducing automata. Like the chemical replicators, the relevancy of these systems to life has also been questioned since most computer based programs don’t need to consume physical substrates, such as food, to exist. However, the behavior of an information system, such as those modeled with computers, poses many of the same basic issues as biological systems, including information content, copy fidelity, error rates, error thresholds and complex non-linear dynamic behavior. From simple behaviors of these information systems, it is hoped (and at times observed) that more complex behaviors of reproduction and evolution will emerge. Thus many feel these systems display non-linear characteristics that make them worthy of investigation. It is important to also distinguish the field of artificial life studies from that of artificial intelligence (AI). AI seeks to create programs that can solve problems from a top down programming approach. The field of AI has often been met with hostility in the biological sciences, as it is accepted by biologists that nature is fundamentally parallel and that more complex properties emerge from the bottom up, not top down.

Artificial life simulations generally depend on some type of self reproduction process. The basic concepts of self reproduction in a computer system were first put forward by Von Neumann. Von Neumann was first to define the simple formal elements of self reproduction, and he noted that such a system would need to be able to copy the machine as well as the description or instruction for the machine to pass on to its descendants. However, these descriptions will have both interpreted instructions, which are needed for self construction, and uninterpreted instructions, which will be passive, unexpressed data needed to form the description of the offspring. Thus inherent in self reproducing automata is the requirement for both silent and active information. From these basic concepts, along with the addition of elements that are basic to evolution, such as heredity, variability, fecundity and fitness, there has developed an array of computer based simulations that attempt to model the behavior of living systems. Frequently, these programs are implemented as graphic simulations which compete for and occupy the computer screen. Some of these programs are of a more practical nature, such as genetic programs that are used to solve problems that may not initially be well defined. These programs emulate genetic processes, including sexual exchange, recombination and cross over, to combine solutions into a maximized set.

Another fundamental characteristic of biological systems or their computer simulations is the issue of identification of self or kin versus non-self or non-kin. All biological systems, including viruses, differentiate their own and related lineages from those of others. In the simplest of genetic systems, this is often accomplished at the level of the catalytic recognition of the template, such as polymerase recognition of an origin of replication. In more complex biological systems (cells and organisms) many other self-identification systems are known to exist. Thus an automaton or program set from a computer simulation will also need to differentiate its own descendants or instruction set from those of other, competing automata. Without a rigid kin definition, the rapid emergence of deceitful automata will be expected, in which these deceitful automata can elicit assistance (complementation, or to ‘sucker’ the other program) without contributing to propagation of the functional automata. These automata would be helped, but not obligated to help either their own reproduction or that of other automata, which would allow such deceitful automata to more efficiently utilize resources. Most artificial life simulations thus have also defined some process by which kin identification is maintained.

Real and electronic viruses. If we consider our operating definition of a biological virus as a molecular genetic parasite which directs the host to maintain and copy the virus, the similarity to a computer virus is clear. A computer virus is basically a file or instruction set that instructs the host computer to maintain or copy and transmit itself. Thus a computer virus is an electronic file able to parasitize the computer for the purpose of self reproduction. Because both biological and computer information systems must copy information, a very basic question can be posed: is it possible to make a system that is able to copy information but is not also capable of being parasitized or of making unauthorized file copies? Can it be made virus-proof? In other words, is it possible to design a computer (or biological) information system that is able to prevent all disallowed or parasitic copies of information? Many would assume that it should indeed be possible to design such a system that would have sufficiently elaborate safeguards to prevent unauthorized file copying or modification. Thus the endless effort to design computer operating systems and scanning software that prevent computer viruses and unauthorized file copying. Still, it seems some enterprising and creative computer hackers manage to design viruses that get past all these protections, hence the endless upgrades to protection software involving the latest set of computer virus definitions. There seems to be no end to this process. And indeed, theory may tell us that this is to be expected. Interestingly, this question about preventing all strategies for unauthorized file copying can be and has been posed in mathematical terms and thus subjected to rigorous mathematical analysis. Clear but surprising results were obtained by an orthogonal proof which established that it will not be possible to design a self reproducing information system that can prevent all forms of unauthorized file copying, although it was clear that very restricted copying can be attained.

If information systems that must reproduce information cannot be made ‘virus-proof’, what does this imply for biological viral and host systems in terms of evolution? Will all biological systems inevitably become parasitized by the viruses that are allowable within that system? In terms of computers, computer viruses are often platform or software specific. Do we expect similar ‘platform specific’ (species specific?) biological viruses to develop? Are such infections inevitable? With computers, can we expect that all computers will eventually be exposed to some form of computer virus? Practically speaking, this certainly seems to have been the experience, as almost everyone in the world with a computer has had to deal with virus infections. And as predicted by theory, this situation appears to be unending. If so, what does this suggest for future computer system development and what does this suggest for the evolution of biological information systems? Clearly not all computer viral strategies are allowed on all computer systems, so some limitations are apparent. Do biological systems also show broad patterns of platform or host restricted virus parasitization? If so, how do these patterns of ‘allowed’ information parasites affect the development or evolution of the system, especially the identity mechanisms of that system?

Artificial life and parasites. Simulated evolution has often been attempted on the 2D world of computer screens. Programs such as Artificial BUGS (W. Packard) provide an artificial ecology in which simple graphical organisms reside within a lattice and seek and compete for ‘food’ so that they might reproduce. Generally, these models have predefined the nature of the replicator and also provide finite lifetimes during which they must succeed at reproduction. The basic features of Darwinian evolution (descent from common ancestors, sexual exchange, competition, survival of the most fit) were also programmed into the simulation. With these added features, the replicator program was allowed to run its course of computer simulated evolution. These models will often display complex behavior, sometimes even ‘creepy’ behavior that appears to emulate living things. However, they generally fail to show the development of the richness inherent in biological evolution, including the tendency to generate more complexity.

One particular model, however, did result in the evolution of greater complexity. This model was developed by T. S. Ray in 1992 and was called Tierra. In this model, the replicator is not specified ahead of time. Furthermore, time slicing of the CPU gives each program limited time to execute, thus simulating a basically parallel environment. The idea behind this model was to allow the replicator itself to evolve and compete, not to be predetermined. The time-sliced character of this model selects for speed of the replicator and removes the offspring of replicators with time. In addition, a built in generator of diversity is provided. When the simulation is run, within a few million instructions parasitic species were seen to develop from single point mutations. These parasitic species are generally shorter replicators that lack copy instructions; thus they are copied more rapidly than parental replicators.
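
The selective advantage of a shortened parasite in a time-sliced environment can be illustrated with a deliberately simple sketch in Python. This is not Tierra and not Ray's method: it is a two-population toy model in which copy speed scales inversely with program length, parasites can only be copied when a host copy loop is available, and memory is fixed. The function name simulate_soup and all parameter values (program lengths of 80 and 45 instructions, the slice size, the starting parasite fraction) are hypothetical and chosen only for illustration.

```python
def simulate_soup(host_len=80, parasite_len=45, slice_size=8.0,
                  parasite_start=0.001, steps=2000):
    """Return a list of (host_fraction, parasite_fraction), one per time slice.

    Toy assumptions (not taken from Tierra itself):
    - each program gets `slice_size` instructions of CPU time per step;
    - copying a program costs a number of instructions equal to its length;
    - parasites carry no copy loop, so they are copied only in proportion
      to the fraction of hosts currently present;
    - memory is fixed, so the populations are renormalised every step.
    """
    host = 1.0 - parasite_start
    parasite = parasite_start
    history = [(host, parasite)]
    for _ in range(steps):
        # Hosts copy themselves: shorter programs finish more copies per slice.
        host_growth = 1.0 + slice_size / host_len
        # Parasites borrow the host copy loop, so their rate depends on hosts.
        parasite_growth = 1.0 + (slice_size / parasite_len) * host
        new_host = host * host_growth
        new_parasite = parasite * parasite_growth
        total = new_host + new_parasite  # fixed memory: rescale to capacity
        host, parasite = new_host / total, new_parasite / total
        history.append((host, parasite))
    return history


history = simulate_soup()
final_host, final_parasite = history[-1]
print(f"host fraction {final_host:.4f}, parasite fraction {final_parasite:.4f}")
```

Under these assumptions the parasite invades when it is rare, because it is copied faster than the host while hosts are abundant, but it cannot eliminate the host it depends on. In this toy model the host fraction settles near parasite_len / host_len, the point at which the two copy rates are equal.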
These parasitic replicators were allowed use of other functional ‘organisms’ code (complementation) and thus were very analogous to defective viruses. However, like defective viruses, these defective replicators need these other functional similar ‘organisms’ to be around to provide instructions. If the parasites become too successful, they consume the available CPU space and don’t copy themselves. This behavior is very reminiscent of the Von Magnus phenomenon known to apply to many defective viruses. If a defective virus is too successful, it prevents the copying of the infectious virus, resulting in inhibition of replication for both the defective and the infectious virus. Eventually, inhibition is so strong that only a few progeny are generated, and these will need to be non-defective infectious virus in order to establish a subsequent infection. Thus the infectious virus is transiently free of defectives and replicates rapidly, but the abundance of infectious virus allows the rapid generation of defectives to start the cycle again. This results in cyclic, sinusoidal production of infectious virus with a phase shifted sinusoidal production of interfering defective virus.

In the Tierra model, variants of these parasitic replicators would then evolve which attempt to block or poison the pre-existing parasites’ use of the CPU. This then results in the generation of a new, more complex order of parasites. Eventually, even more complex parasites develop that are parasitic to the existing parasites, or hyperparasites. Over a long period (2-3 billion instructions) yet again even more complex parasites will develop, which can even be hyper-hyperparasites. This evolutionary behavior can be punctuated, leading to busts and sweeps as new species of replicators become prominent. Overall, during this evolution it is seen that the initial ‘organisms’, which started with relatively smaller instruction sets, would eventually evolve to larger instruction sets (4-8X), but requiring considerable time to take over the population of replicators.

Thus, starting from the premises of an undefined replicator and Darwinian evolution, these computer replicator programs spontaneously developed genetic parasites which in turn drove the evolution of replicators to new and higher complexity. It seems possible that these results may identify some rather basic aspects of systems that replicate information. However, we are hard pressed to state that such results relate directly to the early evolution of biological replicators. We have little evidence to bring to bear on this issue. Faced with an inability either to reproduce early biological replicators or to reconstruct their history, computer simulations may, however, remain our only system to explore some of these issues.

Persistence in simulations

Computer based models or simulations can only be as good as the premises on which they are based. Many of the premises basic to our understanding of the Darwinian evolutionary process have thus been incorporated into the simulations. However, as we discussed in Chapter 1, not all of the basic issues of evolution, such as the fitness of persistence, have been sufficiently defined such that they might be implemented as a computer program. We have noted that acute and persisting life strategies of viruses have distinct features with respect to replicative success and fitness. Mathematical models have been developed which appear to accurately reflect the replication and population dynamics of acutely replicating viruses, using many of the accepted premises of the evolutionary process. However, the persistent life strategy, common to so many viruses, is not a well developed topic, nor are effective models of this life strategy well evaluated. If we consider the time sliced Tierra model presented above, we quickly encounter a conundrum in our definitions of fitness as it applies to persistence. The time dependent character of a persistent life strategy and how this strategy competes with
rapid acute replicators would seem not to fit most of the premises of the simulations that have been evaluated to date. For a biological example, consider the fitness of an E. coli colonized by a lysogenic lambda. One hundred generations of the host can occur during which the virus replicates in direct concert with the host, and this amounts only to the equivalent of a few rounds of an acutely replicating virus. Yet even with this long period of viral silence, it would still be essential for the fitness of the lysogenic lambda that it be able to move to a new host when host survival becomes problematic. When the host sustains mortal damage from UV light irradiation, it is crucial for viral fitness that the virus has maintained a high probability of reactivation, replication and transmission to another host. Thus a temporal component, during which the phage is not replicating but is persistent, stable and still capable of virus replication, is under positive selection. During this period, the virus is not under selection to maximize virus replication. However, during this silent period, the virus may still need to compete with and exclude parasitic competitors (see chapter 3). This fitness profile seems to operate on a different time scale than is normally considered in most acute models of virus replication. The temporal component for silent virus selection may be even more extended than described above for lambda. For example, consider a phage that is lysogenic in a spore forming gram positive bacterial host and that the host cell has undergone sporulation. The lysogenized spore might sit idle for a very extended period, possibly thousands of years by some measurements. Yet the viral fitness would still depend on its ability to survive this extended period in its dormant host but retain the ability to replicate progeny virus at the appropriate time and condition of spore reactivation. Clearly, such longevity can be a way to attain successful continuation of a viral lineage. There have been some attempts to model slow replicators and how they might compete with fast replicators (see recommended reading). However, these models still rely on relative replication rates for fitness definitions, so our existing models of fitness and selection do not properly address persistence. For an extreme example of this issue, consider how fit an eternally living individual might be. With no progeny, our eternal life form would be unfit. Hence models must first develop more formal ways to define the issues of persistence before they can provide useful insights into the successful life strategies of a persisting virus.

Recommended reading.

Chemical replicators: (Shapiro 2000; Szathmary 2000; Hutton 2002)
Hypercycles: (Eigen, Schuster et al. 1980; Eigen and Winkler 1992) (Cronhjort 1995) (Szathmary 1988) (Szathmary 1992)
Artificial life simulations: Ray, T. S. in (Langton 1991) (Adami 1998) (Huberman and Glance 1993; Ward 2000; Wilke, Wang et al. 2001; Szabo, Scheuring et al. 2002; Lenski, Ofria et al. 2003) (Gesteland, Cech et al. 1999) (Yedid and Bell 2002) (Rowe 1994)

Adami, C. (1998). Introduction to artificial life. New York, Springer.
Cronhjort, M. B. (1995). "Hypercycles versus parasites in the origin of life: model dependence in spatial hypercycle systems." Orig Life Evol Biosph 25(1-3): 227-33.
Eigen, M., P. Schuster, et al. (1980). "Elementary step dynamics of catalytic hypercycles." Biosystems 13(1-2): 1-22.
Eigen, M. and R. Winkler (1992). Steps towards life: a perspective on evolution. Oxford; New York, Oxford University Press.
Gesteland, R. F., T. Cech, et al. (1999). The RNA world: the nature of modern RNA suggests a prebiotic RNA. Cold Spring Harbor, N.Y., Cold Spring Harbor Laboratory Press.
Huberman, B. A. and N. S. Glance (1993). "Evolutionary games and computer simulations." Proc Natl Acad Sci U S A 90(16): 7716-8.
Hutton, T. J. (2002). "Evolvable self-replicating molecules in an artificial chemistry." Artif Life 8(4): 341-56.
Langton, C. G. (1991). Artificial life II: the proceedings of an interdisciplinary workshop on the synthesis and simulation of living systems held 1990 in Los Alamos, New Mexico. Redwood City, Calif., Addison-Wesley.
Lenski, R. E., C. Ofria, et al. (2003). "The evolutionary origin of complex features." Nature 423(6936): 139-44.
Rowe, G. (1994). Theoretical models in biology: the origin of life, the immune system, and the brain. Oxford; New York, Clarendon Press; Oxford University Press.
Shapiro, R. (2000). "A replicator was not involved in the origin of life." IUBMB Life 49(3): 173-6.
Szabo, P., I. Scheuring, et al. (2002). "In silico simulations reveal that replicators with limited dispersal evolve towards higher efficiency and fidelity." Nature 420(6913): 340-3.
Szathmary, E. (1988). "A hypercyclic illusion." J Theor Biol 134(4): 561-3.
Szathmary, E. (1992). "Viral sex, levels of selection, and the origin of life." J Theor Biol 159(1): 99-109.
Szathmary, E. (2000). "The evolution of replicators." Philos Trans R Soc Lond B Biol Sci 355(1403): 1669-76.
Ward, M. (2000). Virtual organisms: the startling world of artificial life. New York, St. Martin's Press.
Wilke, C. O., J. L. Wang, et al. (2001). "Evolution of digital organisms at high mutation rates leads to survival of the flattest." Nature 412(6844): 331-3.
Yedid, G. and G. Bell (2002). "Macroevolution simulated with autonomously replicating computer programs." Nature 420(6917): 810-2.

Possible figures.

2-1. Hyper-cycles from M. Eigen.
2-2. Image from T. Ray Tierra software. Also see http://www.genarts.com/karl/evolved-virtual-creatures.html

(I don’t have permission for these)

CHAPTER III

VIRUSES AND UNICELLULAR ORGANISMS

History of bacterial viruses; a conflict between the concept of genetic virus and an acute parasite. The viruses that infect unicellular organisms were amongst the very first viruses to be studied and, to this day, remain the best understood of all viruses. Although the paper of Beijerinck in 1899 first clearly proposed the idea that a virus (tobacco mosaic virus) is a parasite of the cell that replicates within its host using cellular systems for that replication, this idea was not initially widely accepted. Viruses of bacteria, however, were instrumental in changing existing views. It is ironic that in the 1915 paper of F. W. Twort, he was attempting to grow the virus on defined media as an autonomous agent (at odds with Beijerinck’s idea of a virus and assuming it was an autonomous organism) when, as an unintended consequence of bacterial contamination, he observed a filterable fluid that would lyse these contaminant bacterial cells (called a glassy transformation). However, this initial report was generally ignored until a subsequent report introduced the concept of bacteriophage and also gained much attention. Early on, a close relationship between specific bacterial viruses and their specific bacterial hosts was recognized. It was observed that the very identity of some strains of bacteria could be best determined by the specific phage or virus type associated with them. This host identification became known as phage typing and is still used today. Current examples include phage typing used for B. subtilis, staphylococci, and mycobacteria, all of which show virus-specific surface markers. It was the bacteria associated with both healthy and diseased human intestine (E. coli, salmonella and micrococcus respectively) that provided the virus for these initial phage studies. However, as discussed below, phage from healthy and diseased humans can often show distinct biological characteristics.

In the now classical paper of d’Herelle, it was reported that a filterable fluid from patients who were recovering from dysentery could lyse bacteria, suggesting that a virus of bacteria could explain the capacity of these filtered fluids to kill bacteria. Due to its important medical implications, the idea that a very small parasite of bacteria existed which was able to infect and specifically kill bacteria gained much attention, especially prior to the discovery of antibiotics. In addition, early on it was realized by some that if such parasitic agents did indeed exist, they might provide an ideal and simplified way to understand the nature of genes themselves, since the parasite seemed to be using host systems of genetic information. Almost immediately, however, there developed a serious and essentially philosophical schism that was to last for 30 years concerning the nature of phage and their relationship with their bacterial host.

The Bordet school: a genetic virus. In the early 1920’s, Bordet and Ciuca argued that this phage-induced lysis was a product of a normal characteristic of some bacteria. By various methods, they and others showed that a bacterial culture was able to produce phage and lyse susceptible cultures but that this was a hereditary characteristic of the bacteria themselves. If a bacterial culture has the capacity to produce the ‘lytic principle’ as suggested, this was in direct opposition to the view that a bacteriophage was a virulent parasite of bacteria. Bordet vigorously argued with the supporters of d’Herelle that the virus was a hereditary element of the bacteria.

The Delbrück school: lytic bacterial virus. In the 1930’s, Max Delbrück, then a quantum physicist, became interested in the study of phage as a way to understand the very elemental or molecular nature of genetic material

and its reproduction. Using serologically related T-even phages (T2 and T4, and the others of the original phage isolates now named T1-7), there followed from him and later Salvador Luria a series of elegant and precise experiments which established single-step viral growth curves and clearly showed that these phage were parasitic and unfailingly lytic viruses of their host bacteria. The Delbrück school of thought was thus formed and, armed with these clear experimental results, developed a violent disagreement with the followers of Jules Bordet, who still maintained the hereditary nature of phage production and had coined the term prophage to describe this hereditary viral state.

Genetic and lytic virus schools reconciled. However, the ‘hereditary-phage’ views of Bordet were also supported by additional experimental results. Of particular note were studies by A. Lwoff and also Jacob and Wollman, which carefully followed individual bacterial offspring in microcultures and observed results that clearly supported the idea that “..the genetic material of the E. coli and the genetic material of the prophage have originated from the very same material.” This clash of views was to be maintained for several decades. However, in the 1950’s, much happened in our understanding of the molecular basis of bacteriophage, and the field of molecular biology was born as a consequence. It became clear that both the idea of a virulent or lytic virus and the idea of a genetic or hereditary phage were correct. In 1953, the structure of DNA as the genetic material was discovered by Watson and Crick and could now be applied to understanding phage (or viruses in general). In addition, the capacity of one such ‘hereditary virus’ to move host genes, and hence be one and the same with the cellular genetic material, was clearly reported by Lederberg and Lederberg. Also, using the phage T4 as a model, the very nature of the gene and the genetic code was worked out by Benzer and Crick. Thus in the early 1950’s, S. Luria was able to provide us with our now accepted modern definition of a virus as a ‘molecular genetic parasite’ dependent on the host mechanisms for its replication. In the late 1950’s, A. Lwoff published a more extensive definition of a virus as a molecular genetic parasite, which was inclusive of both acute lytic or virulent viruses as well as the persisting ‘prophage’ or hereditary virus, which were called ‘lysogenic’ or temperate phage. The model of Campbell in 1962 finally provided the mechanistic details of how this prophage worked, invoking integration of the viral chromosome into the bacterial chromosome. Thus it was finally clear and accepted that there were two distinct life strategies (acute and temperate or persistent) that applied to the viruses of bacteria and that both these life strategies identified successful molecular genetic parasites of bacteria. A third life strategy for virus replication, ongoing nonlytic replication of the RNA and DNA miniphages (PhiX 174, M13), was also discovered. This life strategy is distinguished by continued or chronic virus production and shedding, without the corresponding cellular lysis associated with lytic infections or the silent lysogeny associated with temperate phage.

Thus the very history of virology and molecular biology is itself intermingled with the conceptual tension and confusion that arises between viruses that are acute or lytic and viruses that can persist and/or colonize the host genome. However, when examined from a modern perspective, the reason for this long-lasting disagreement seems to have been mostly one of perception. These two processes have been (and still are) perceived to be in direct opposition to and mutually exclusive of one another. Yet in one sense this difference in perception seems correct. T2 and T4 phage, for example, do not, with passage or time, become hereditary persistent viruses. They remain lytic phage and will invariably lyse infected cells if their replication is successful. This acute behavior is a stable biological characteristic and is not compatible with the idea of a hereditary virus. This situation is also at odds with the prevalent views of some evolutionary biologists, who felt that lytic viruses must represent an ‘evolutionarily young’ relationship which will tend to evolve into persistent or benign relationships with their host species, given enough time for evolution to attain equilibrium. However, we know that lytic viruses like T4 do not evolve to become benign parasites of their host. Other similar ‘lytic-only’ viruses can also be found in many other host organisms, indicating that this life strategy is common. These acute viruses always harm or kill their host.

However, one source of confusion was that some specific viruses could be both lytic and persistent. Early on, it was clear that a specific viral agent (such as phage lambda) could have both acute and persistent life strategies, but this depended on the specific bacterial host or growth conditions. This capacity to switch between lytic and persistent infections was a characteristic of temperate viruses that tends to be highly host specific. The term lysogenic was originally coined to explain what happens when two strains of bacteria (one harboring virus, the other not) were mixed, resulting in the lysis of the bacteria without the virus. One bacterial strain is thus lysed by the other ‘lysogenic’ strain, but the lysogenic strain itself is protected from lysis. In a sense, the term lysogenic is confusing, since the lysogenic strain does not itself lyse. We now know that the lysogenic strain is protected from lysis by the presence of the prophage, but that this same prophage can reactivate at some low rate to infect and lyse the second, susceptible bacterial strain. In this instance, the selective advantage to a bacterium that is colonized by a prophage appears obvious. The prophage provides protection from an otherwise lytic phage. Thus in a competitive situation, where both phage-colonized and uncolonized bacterial cells might be found together, a colonized bacterium will have an immunity advantage. It was also established early on that in addition to providing immunity against the specific colonizing prophage, a prophage can often confer immunity to similar, and sometimes even dissimilar, phage. Thus, the phage colonization of the bacterial genome results in a clear and selective cellular phenotype. These colonized bacteria will now have acquired a new ‘viral-derived’ molecular genetic identity, which has been superimposed onto the bacterial host identification system. Along with and inherent to this new identity, the host has also acquired the ability to recognize and preclude other competitive genetic parasites. Furthermore, we now also know that even defective versions of prophage, lacking the ability to produce infectious virus, can still provide these immunity functions. Defective phage of various types can and have successfully colonized their host and preclude infection with other related parasites. Thus the phenotype of a host that is colonized by a prophage is distinct and separable from that of an uncolonized host, as well as distinct from that of an acute virus.

Prokaryotes and their viruses; lysis and persistence. In Chapter 1 we discussed the general issue of viral life strategy and the distinct character of viral fitness that applies to acute and persistent viral life strategies. It will be argued that viruses of essentially all organisms will tend to adopt one of these two life strategies. In some cases, as will be presented below, this pattern applies broadly to an entire family of virus or to a particular order of host. For example, fungi are frequently infected with persistent and inapparent versions of dsRNA viruses, whereas eukaryotic algae are susceptible to acute infections with large dsDNA-containing phycodnaviruses. In the case of prokaryotes, we now know that both acute lytic phage and persisting prophage are very common in essentially all microbiological communities. The most common of these ecologically abundant viruses resemble tailed phages like T4 and lambda. Many of these tailed phage are also known to be temperate (discussed below). This type of viral morphology is arguably the most abundant and dynamic life form on the entire planet, as it is highly abundant in the oceans and soil (discussed below). How can this abundant phage affect the evolution of life? How do phage contribute to the origin and evolution of prokaryotic host genomes? Historical accounts in evolutionary biology do not generally pose this question, as phage were generally thought of as simply destructive entities, or the accounts disagreed about the basic nature of phage. What was generally missing from the historical disagreement between lytic and hereditary viruses noted above (and to this day), however, was the idea that there exists a dynamic but enduring tension between these two states. Acute and persisting viruses (or their defectives) exert an enduring tension on each other and their host that is stable on an evolutionary time scale. This means that each of these types of virus must not only adapt to the host they parasitize, but they must also adapt to the prevailing acute and persisting viruses that will inevitably seek to occupy their ecological habitat and their host. This represents a previously invisible adaptation that molds the entire prokaryotic world. A striking example of the existence of this ‘acute-persistent’ virus dynamic can be found in the very first molecular genetic element to be identified as a ‘gene’ at the foundation of molecular biology. This first and foremost gene is the RII gene of T4 phage. The RII gene is considered non-essential, in that it is not needed for growth in most E. coli hosts. However, RII is essential for a T4 phage that infects an E. coli host harboring a lambda prophage (or its lambda defective). The RII gene represents a class of ‘accessory’ genes which are well conserved in clinical isolates of T4 from E. coli. Such genes are not unique to the interaction with lambda and represent a general situation, as will be presented below. Such ‘accessory’ phage genes, whose role it is to counter the effects of other phage genes, are commonly observed, and this even applies to the function of various other T4 genes. Given the high abundance of phage in natural environments, we can expect the existence of related genes in essentially all microbiological communities. The study of such phage-phage gene function is especially well developed in the very large microbiological populations of the dairy fermentation industry, as will be presented below.

Terminology for persistent, lysogenic and temperate infections. The various terms used here to describe the temperate lambda life style should first be clarified. As described in Chapter 1, where I present the distinct life strategy of viruses and the fitness associated with the acute and persistent life strategies, I have defined the term persistent virus to describe the capacity of an individually infected organism to produce virus at a later time. This is a more general use of the term than is typical of the scientific literature, so in order to avoid confusion, this general use needs to be emphasized, and it should also be noted that this use is inclusive of the life strategy of temperate phage. The terms temperate and lysogenic are often used interchangeably, although they can sometimes be differentiated. As defined by Lwoff, a lysogenic infection is one in which the infected bacterium has the hereditary capacity to make lytic virus at a later time. Not all temperate infections go on to make lytic virus. However, the ability of an infected individual host cell to subsequently make virus at a later time requires the persistence of viral genetic information. Thus these infections represent a stable and persistent viral genome within the host. In the case of lambda (and P1), this can be a highly stable relationship and can involve an ‘epigenetic’ type of stability, capable of being maintained for hundreds of generations of bacterial cells. As noted, the term lysogenic was derived from the ability of such persistently infected bacteria to induce this hereditary virus production and lytically infect a mixed culture containing a susceptible (nonlysogenized) bacterial host. However, the term temperate describes a more restricted process in which a phage infection, rather than leading to lysis, establishes a non-lytic, lysogenic state without killing its host. In the case of lambda, the lysogenic state usually involves the integration of lambda DNA into specific regions of the host chromosome. This state of viral DNA when integrated into the host chromosome is called a prophage. However, non-integrated persistence is also known, which involves episomes of lambda. Some temperate phage, such as P1, normally persist as episomes and will seldom integrate; thus DNA integration is not necessarily required. Another related term which is sometimes used is vegetative phage replication. This is the state of production of lytic virus and cell killing associated with either the reactivation of a temperate virus or the lytic infection of a susceptible host.

Overall patterns of prokaryotic viruses; the high prevalence of tailed phage. Some might consider the historical account, listed above, as biologically misleading. This is because the focus of these studies is overwhelmingly on enterobacteria and the phage associated with them. How representative might these systems be of other bacteria and their viruses? There is good reason to worry about this issue. Bacterial ecology and the corresponding viral ecology were not as well studied as many virologists might suppose. Thus, this issue is more difficult to address than might be apparent. However, in the last ten years the topic of phage ecology has become better studied, and it seems clear that both lytic and temperate viruses are common to many bacterial species and in various habitats. In addition, we now know that what was initially considered prokaryotic bacteria actually consists of two distinct and ancient orders of prokaryotic unicellular life known as Bacteria and Archaea, and that these kingdoms can be further subdivided into distinct orders of prokaryotic organisms. The ecology of Archaeal phage is even less understood than that of Bacterial phage, but is developing. However, even if we limit our consideration to the viruses that infect both these prokaryote orders (Bacteria and Archaea), we will observe distinct patterns and relationships of virus to specific host. The great majority of prokaryotic viruses are dsDNA viruses with moderate to large (20-180 Kbp, both linear and circular) genomes packaged into icosahedral capsids (or filamentous forms in the case of hyperthermophiles). Currently, about 96% of known Bacterial phage are tailed and 4% are isometric. The abundance of Archaeal phage differs significantly from this distribution (tailed phage are only 5%). Of the tailed phages, those with linear and circular dsDNA genomes are classified as Myoviridae (contractile tails), Siphoviridae (noncontractile long tails, linear DNA) and Podoviridae (short tails, linear DNA). Both lytic and temperate (episomal and integrated) life strategies are found in all these groups, although this characteristic tends to be associated with the specific phage and host within each of these families. Also, acutely replicating viruses (T4, T7, PRD1) show various other characteristics, such as a strong tendency to code for their own replication, recombination and repair proteins (DNA polymerase, ligase, etc.), as opposed to most temperate phage, which will utilize cellular replication systems and tend to lack such genes.

Membrane-bound or membrane-associated dsDNA viruses (like animal herpes or algal phycodnaviruses) are rare in Bacteria (but found in some Archaea). The next most common virus type are ssDNA viruses of smaller genome size, which show icosahedral and filamentous capsids, such as phiX174 and M13/Coliphage F1 respectively. Unlike the other ssDNA viruses that also replicate by rolling circle replicons (RCR), found in plants (geminiviruses), these bacterial RCR viruses are not segmented. dsRNA viruses are uncommon in bacteria, but a few are known and well studied, such as Phi-6. Phi-6 is clearly related to dsRNA viruses of animals. ssRNA viruses of positive polarity are also much less common in prokaryotes but are known (such as Q-beta); they are currently less abundant or not observed in Archaea. ssRNA viruses of negative polarity are essentially unknown in prokaryotes, as are authentic RT viruses (autonomous animal retroviruses or pararetroviruses of plants). However, defective retroviral elements are found in some Archaea and in micrococcus. As will be presented below, these virus-host relationships are distinct between various Archaeal and Bacterial orders and specific even to subdivisions of host species within these orders. We will consider possible reasons and evolutionary implications for these associations.

Bacterial cells as a viral habitat. Bacteria present viruses with a specific cellular and molecular habitat to which such phage must adapt. Generally, bacteria are unicellular organisms with a rigid cell wall able to withstand high osmotic pressures. The composition of the cell wall will vary substantially, thereby presenting a diverse chemical surface to the virus for recognition, attachment and penetration. Peptidoglycans that make up the Bacterial cell walls will need to be mechanically breached. This physical barrier probably accounts for the fact that the great majority of prokaryotic viruses do not physically enter their host cells, but rather attach to the surface and inject their genomes (usually DNA) into the host cell, often by active mechanisms, such as contractile tails or specific proteins. The rigid cell wall also presents a problem for the exit of progeny virus, as this barrier will need to be breached, often by virally induced lysis, so that virus release is accomplished by host cell burst. In addition, the internal workings of a prokaryotic cell present some specific molecular situations that the virus will need to deal with. Prokaryotic cells have DNA that is not highly organized into a topological superstructure and is not tightly packaged into chromatin. Although there do exist some bacterial DNA-associated proteins, they are not in highly stable structures such as exist in the chromatin of eukaryotes. Therefore free DNA molecules will be more likely to directly interact with the cellular replication and transcription machinery. Bacterial DNA is generally circular, with a unique origin of DNA replication. Most dsDNA viral genomes are packaged as linear DNA, but due to short regions of terminal repeats, viral DNA will generally replicate via circular theta and rolling circle forms. In addition, as there is no nucleus and no cytoplasmic or ER-specific transport system, bacterial viruses will not need to devise methods to move through these cellular systems and breach the nuclear membrane or nuclear pore complex. In addition, prokaryotic transcripts are not capped, spliced or polyadenylated and are not transported from the nucleus. Thus viruses of prokaryotes are expected to have host-specific molecular adaptations to all these situations and will thus differ in many fundamental ways from viruses of eukaryotes.

The prokaryotic system for DNA replication is also distinct from that of the eukaryote. As mentioned above, most Bacterial chromosomes are circular (sometimes multiple) and initiate DNA synthesis from a single origin of bidirectional replication, which is often able to reinitiate before the preceding cycle is complete. Thus bacteria lack the basic components of a cell cycle as seen in eukaryotes. Viruses that persistently parasitize cellular replication systems (such as many temperate or episomal phage) will need to devise mechanisms that ensure their DNA replication is coordinated with that of the host. Conversely, lytic viruses that replicate using their own virally encoded replication proteins will need to bypass existing host controls on extrachromosomal DNA replication. Another common feature of the bacterial cell habitat is the occurrence of restriction modification systems. These consist of two matched enzymes: a DNA modification enzyme (usually a methylase) that will typically covalently modify DNA during replication and protect it from the matching endonuclease, which would otherwise degrade unmodified DNA (usually cutting at a specific palindromic sequence). Restriction modification systems are found throughout both Bacteria and Archaea and represent highly diverse systems. Statistical analysis of the occurrence of palindromic sequences in prokaryotic genomes strongly suggests that almost all prokaryotes have been under intense phage selection by restriction/modification systems, since potential restriction sites are highly underrepresented in their genomes. As discussed below, many restriction modification systems are themselves coded for by both temperate and lytic viruses, and these genes can be virus specific. Other systems of virus restriction are also known, such as small interfering RNAs. These features are common to all bacteria and in prokaryotes are essentially invariant. Thus we expect that a bacterial virus will not be able to evolve replication mechanisms outside of the molecular habitat described above and that the barriers this habitat presents are fundamental. Curiously, some bacterial species, such as mycoplasmas, and likely descendants of bacteria (such as eukaryotic mitochondrial DNA) have genomes that are free of the palindromic restriction sequence bias noted above, suggesting that they are not under phage selection. It seems possible that by becoming a parasite within another (eukaryotic) cell, these degenerate bacteria may have developed a way to escape exposure to and selection by viruses and thus no longer need to retain the avoidance of palindromes that otherwise prevails.

Bacterial Population Structure and ecology of the viral habitat; prevalent virus and host fitness. Besides the cellular and intracellular habitat provided by bacteria, the population structure and ecology of bacteria also present specific circumstances for virus adaptation and replication. Bacteria are haploid and generally do not require sexual exchange for reproduction. This might suggest that bacteria tend to have genetically uniform populations, since a successful individual bacterium would be expected to rapidly generate a large clonal population of descendants. However, due to their very rapid growth rates and large populations, most bacteria have the capacity to rapidly select for rare mutants. This, along with the activity of various insertion sequences to rearrange DNA, can explain why bacteria seem generally able to generate a lot of genetic variation and can quickly select for genetic variants with greater fitness. In nature, bacteria are the most genetically adaptable organisms known and have been observed to adapt to even multifactor changes in their habitat. Intense heat and even intense radiation that can normally break DNA into relatively small fragments have been circumstances to which bacteria have adapted and survived.

Complex bacterial adaptation and infection. However, the very high rates of bacterial adaptation to changes in their environment do not simply stem from their high reproductive rates and associated ability to select rare mutants. Often, bacteria have shown an ability to acquire DNA with genes in complete and complex sets from external sources in their environment. Because these gene sets were not present in any direct cellular predecessors of the bacteria, they cannot be considered to have been a basic component of the essential genetic lineage of bacterial cells. This type of gene acquisition has generally occurred by an infectious, or horizontal, process. Several mechanisms of gene acquisition are known. Some bacteria possess specific systems for DNA transfer, such as conjugative and fertility (sex) systems. However, these transfer systems are not uniformly found and are absent from some highly adaptable bacteria. Furthermore, these sex systems have a clear association with bacterial viruses. For example, the origin of these sex systems is a question whose answer is most probably viral in nature, as will be discussed below. Also, it is often observed that these transfer systems will make the host cells carrying them susceptible to infection with various viruses. For example, PRD1, the multivalent lytic phage able to infect a broad range of gram-negative bacteria, was originally isolated in association with multidrug resistance (an acquired complex set of genes). However, it was not the multidrug resistance that allowed PRD1 to infect these cells, but the pili associated with DNA transfer, as PRD1 infects via pili. It is the presence of these pili that allows PRD1 entry into its bacterial host.

The overriding mechanisms of bacterial genetic exchange and adaptation are basically infectious in nature. However, this infection must result in a persisting genetic adaptation if it is to affect host evolution. The acquisition of multidrug resistance and other such factors is probably the most studied and best known example of this type of rapid but persisting bacterial adaptation and clearly involves the acquisition of complex multifactor gene function. However, phage exclusion is another well-studied example of an acquired phenotype that sometimes clearly involves the acquisition of a complex phenotype, and it can be similarly acquired by many bacteria. This capacity to exclude phage furthermore identifies how various competing phage can themselves select for the adaptation of a complex phenotype. Phage-phage interactions can directly contribute to the acquisition of a complex phenotype, such as phage immunity. A bacterial virus that is well adapted to its bacterial host is, in a sense, in direct competition with other temperate and lytic viruses (and their genetic derivatives) that will attempt to parasitize or colonize the same host. Thus, the ability of one phage to compete with and exclude others will provide the host with a new source of complex genetic information that can result in resistance for that same host. This will have a large effect on the survivability and hence fitness of the parasitized bacterial host. But the main point is that such enhanced host fitness will be an attribute of the successful genetic colonizer (phage genes), and for success, this colonizer must successfully exclude prevalent competing phage in the ecology. Thus these genes will tend to have been derived from a virus, not another host lineage. Therefore, in order to understand host and viral fitness and evolution, we must also consider the phage (viral) ecology and inter-viral interaction of prevalent genetic parasites.

Oceans: a viral soup. In nature, several environments exist that represent large bacterial populations and ecologies from which we can consider viral-host and viral-viral interactions. Probably the largest of these are the oceans. Aquatic systems typically have up to a million bacteria per milliliter. The unicellular cyanobacteria and the filamentous cyanobacteria, as well as heterotrophic marine bacteria, constitute the major prokaryotic component of the oceans. In addition, eukaryotic algae along with their phycodnaviruses are also important oceanic organisms, although less abundant than prokaryotes. Viruses are abundant in the oceans and are generally observed by electron microscopy in at least a tenfold numeric excess to bacterial counts. Temperate phage are also known for most of these bacteria. Physical counts of viruses from concentrated ocean water using electron microscopy indicate that in total the world’s oceans harbor about 10^31 viral particles. The majority of these particles appear to be tailed phage, hence they are large, DNA-containing virions. Measurements of viability suggest that the half-life of these phage is less than one day and that they represent a highly diverse population, most of which is not represented in the current genetic database of known phage families or host genes (measured by mass cloning and sequencing). In cyanobacteria, the major classes of phage are similar to those dsDNA phage of E. coli (Myoviridae, Siphoviridae, Podoviridae). There are also many large, poorly characterized DNA viruses of eukaryotic algae and amoebae, making this oceanic viral soup by far the most abundant and diverse collection of life forms on the planet.

Cyanobacteria are an ancient bacterial lineage and are thought to have diverged 3.5 billion years ago from their prokaryotic ancestors. Both lytic and temperate phages are known to exist in cyanobacteria. Some measurements of the phage and gene transfer rates in the ocean have been attempted. One such estimate is that 10^14 events per year occur in Tampa Bay alone. Thus it is clear that the tremendous viral abundance of the oceans has an established pathway by which some viral genes can become a part of the host genome. Given the age and the volume of the ocean, and its central role in the early evolution of multicellular life, the bacterial and viral adaptive and genetic activity within the oceans must be considered as the main candidate for the creative genetic process that ultimately resulted in higher, more complex life forms. The infectious nature of bacterial evolution and the tendency of phage, the main transmissible genetic parasites, to evolve by shuffling sub-gene cassettes (described below) imply that this ongoing process may have also played a central role in the origin of the higher life to come from the oceans (discussed in chapter 4).

Soil; a viral slurry.

range in streptomycetes host is the FP22 phage. However, this phage is known to form lysogens in Streptomyces ambofaciens. Similar results have been observed for thermophilic Bacillus species. In one study, 19 strains of Bacillus megaterium were examined and all were shown to harbor temperate phage; most harbored several phage, and some also harbored defective phage that could be induced with mitomycin C. It thus appears that, like the oceans, soil bacteria also have a high rate of lysogenization. Unlike aquatic measurements, however, much less is known concerning possible rates of gene transduction in soil organisms or concerning accurate counts of soil phage. Although the size of this soil bacterial population and their phage must be huge, few specific measurements are available concerning this issue. Still, given the enormously large bacterial populations present in soil and the oceans and the known occurrence of both lytic and temperate phage, we can conclude
Soil represents another that soil bacterial fitness, adaptation and that maintains large evolution is likely to be strongly influenced by populations of bacteria and hence would the ecology and activity of these viral genetic also be expected to have a large population parasites. of virus. However, soil virus ecology has not been as well studied as the oceans since The enteric bacterial/phage habitat of it has been technically very difficult to animals. Although soil has relatively low measure populations of phage found in soil. bacterial counts, the gastrointestinal or enteric Soil phage estimates have ranged from 102 tract of animals has exceedingly high counts of to 107 pfu/g dry topsoil. Because of poor bacteria. Bacteria in fecal animal waste is so energy sources, bacteria of soil tend to be concentrated that it can constitute up to 80% of concentrated at and frequently associated its dry mass of the feces. The quantity of with plant root regions. Thus the evolution bacteria growing in these enteric habitats is thus of root systems must have had a big sufficiently large enough to affect environmental influence in the emergence of soil bacteria measurements of bacteria, such as in water and their phage. In addition, many soil runoff from soil, for example. The will be in the form of spores in the animal gut and its ability to support very high which associations with temperate viruses concentration of bacteria , which for the most will not be apparent, unless spore part turns over every day with waste excretion, germination and virus reactivation can be thus represents the development of a significant followed. Streptomyces is an example of a and new habitat for bacterial growth in the soil host bacteria in which both lytic and . With so many enteric bacteria, we temperate DNA phage have been frequently can also expect that this is an excellent habitat isolated. Some of these phages have shown a for virus or phage. 
As mentioned in chapter 2, high degree of polyvalency with respect to the very discovery of lytic and lysogenic phage broad bacterial host genus they will infect, is due to studies of the human enteric . The reminiscent of PRD1 in coliforms. An best studied phage in this context is T4 of E. coli. example of a phage with a very broad host T4 has been frequently isolated from clinical 35 specimens. However, the large majority of division occurs rapidly in a spatially limited clinical isolates of E. coli (the normal host portion after the transition of the small to large for T4) will not support the replication of T4 intestine. In a diarrheal patient, therefore, it is for unknown reasons. Relatively few of likely that these bacterial growth patterns are these clinical isolates were lysogenic for disturbed. This relationship of lytic versus lambda so exclusion by lambda is not temperate phage with respect to common in a clinical setting. In one study, bacterial growth and human intestinal health may only 38 of 200 E. coli clinical isolates were explain early disagreements in the study of phage able to replicate T4. Clearly non-lambda as presented above. d’Herrel, working with mediated restriction of T4 infection is dysentery patients, believed that lytic phage were common in clinical settings. As the P1 the norm, whereas Twort, working with a natural temperate phage is also a potent restrictor of micrococcal contaminant, viewed temperate T4 infection, and as temperate phage and phage as a more normal or typical situation. As colonization is also common in discussed below, with both acute lytic and clinical E. coli isolates, it seems likely that persistent lysogenic phage, virus reproduction is other genetic symbionts, such as P1, might affected by bacterial growth, but lysogenic be involved in generally restricting T4 establishment is more frequently associated non- permissivity in these clinical E. coli isolates. dividing states. 
This issue of host cell growth in This issue, however, has not bee well relation to phage growth is also well studied in studied and needs to be further evaluated. the context of the diary industry which deals The isolation of T4 related phage from with very large and rapidly growing populations or raw , however is also of bacteria (discussed below). Clearly, virulent affected by the health of the individual phage are common in numerous bacterial human subject. For example, from healthy populations in nature. T4 therefore appears to be human subjects, 209 of 607 people were quite representative of a prevailing natural phage reported to harbored a phage that could be strategy. Thus lytic phage appear to be stable in isolated, but these phage was mostly natural ecologies. The populational and temperate (36% related to F 80, 27% related evolutionary stability of a strictly lytic agent has to l Lambda, 17% related to F 28 ). also been considered from theoretical perspective However, when diarrheal patients were and it has been concluded that virulent phage can used, 98 of 140 (70%) were producing exist in a stable dynamic (albeit sometimes phage at high concentration which was chaotic) relationship with their host bacteria. mainly virulent (T4, T5 and TU23 related; see Goyal et al. Phage Ecology). This indicates that lytic phage are more The Lytic Phage of Bacteria; T4 as the associated with disturbed, possibly rapidly Paradigm. Our focus above on the history of growing bacterial populations in enterically viruses of bacteria was presented from the ill humans. perspective of acute and persistent (temperate) viruses. Although this focus will continue in the A healthy human is clearly growing enteric subsequent chapters, it will also be the style of bacteria (including E. 
coli) and replacing the this book to examine the best studied bacterial population every day in association experimental models in each of the with the digestion of food and passage of corresponding sections. The lytic phage T4 is stool. Therefore, it might be concluded that probably the best studied lytic virus in all of the intestinal track harbors a large virology. T4 phage (and the serologically related population of rapidly and continuously T-even T2 and T6) viruses containing large dividing bacteria. However, we know that linear dsDNA genomes (about 170,000 bp) that the bacteria throughout most of the intestinal code for about 140 genes. Other phage with track is not actively dividing as bacterial related acute biology include T5, T7 and SP01 of 36 gram positive bacillus. The DNA termini of prokaryotic examples of group I self-splicing the T-even phage are repetitions of around . The occurrence of such introns in these 400-800 bp and are involved in DNA well conserved viruses of bacteria argues that replication via circular forms. The template introns may have evolved in viruses, prior to the for T4 DNA replication and recombination evolution of eukaryotes. Also, many of these is not a naked DNA but a DNA protein phage code for a set of tRNA molecules, that can complex, the protein being gene 32 which is be deleted but still replicate in laboratory strains a ssDNA binding protein (able to removes of E. coli. Yet the tRNA genes are conserved in hairpins from ssDNA). The DNA is natural populations, suggesting some type of packaged into an icosahedral head and the ‘accessory’ function. One of the very tails are contractile. Interestingly, T4 distinguishing features of the DNA of these lytic packaging accepts greater then the genome phage is the high degree of modified nucleotides length of DNA and will package up to 20% that they contain. In the case of T4, cytidine is additional sequence into the phage. Thus replaced by hydroxymethylcytosine (HMC). 
the virus is always partially diploid and may This synthesis of modified phage DNA serves to also carry host and other resident DNA mark the molecular identity of the phage genome sequences. The virus attaches to the apart from the host DNA since T4 also encodes a bacterial wall via a baseplate at the end of restriction endonuclease (II and IV) that will the tail, which provides cell specific binding. degrade unmodified host DNA. However, HMC Although the T-even phage have DNA modification renders it sensitive to Mcr morphologically complex virion structures, restriction (discussed below). The glycosylation these virions tend to be highly efficient at HMC residues of T4 DNA serves to prevent structures. T4 phage (and many other this Mcr restriction. T4 DNA modification phage, including lambda) can have a particle makes it difficult to study with restriction to plaque forming unit ratio of one, enzymes. However, modification of T4 DNA is indicating that essentially every phage is not essential and deletions of dCTPase and biologically active. This is in sharp contrast endonuclease IV, along with other alterations can most animal viruses which tend to have a be used to make T4 with unmodified DNA when particle to pfu ratio in the 100’s grown in nonrestricting E. coli. SP01 modifies DNA using hydroxymethyluricil in place of Lytic virus, autonomous replicators and thymidine, rather then HMC, but does not marked DNA. All acute phage will lyse degrade host DNA. SP01 also modifies phage their susceptible host bacteria. Generally, DNA by methylation. It thus seems to be a these acute phage also code for virus common principle that lytic phage can mark or specific DNA replication proteins (such as modify their DNA, distinguishing it from host DNA polymerase, DNA ligase, thymidylate DNA. synthetase) in keeping with a replication strategy that is rather autonomous form that T4 and the definition of a viral species. We of the host cell. 
In addition, this group of have been considering the example of T4 as a viruses codes for virus specific DNA repair viral species that is an acute, virulent bacterial proteins, including versions that have no virus. Can T4 be considered to belong to s homologues in their corresponding host species of virus? The nature or any ‘species’ as cells. This is in keeping with the high rates it applies to a virus, however, deserves some of DNA repair (such as following UV additional consideration. An accepted definition damage) that is also a characteristic of T of a biological species is that of a population of even phage (compared with much greater interbreeding individual organisms, as proposed UV sensitivity of temperate phage). by E. Mayer. If we apply this definition to T4, Another intriguing feature of these lytic we might also conclude that possibly all the T phage is the presence of one of the few even phage may represent one ‘species’ of virus, 37 since there is evidence of genetic exchange within this . All the T-even T4 gene conservation thus showssome very phage have 85% . In general characteristic that can be seen in most, addition, all T-even phage show high rates but interestingly not all other viral families (such between them. Thus, as lambdoid phage). Those general much of the differences between T2 and T4, characteristics include the existence of a ‘core’ for example, might then be considered as subset of genes, as well as a conserved genetic the normal variation within one population map or gene order (common in most viral of species. However, this variation would families). However, an even more basic aspect seem to represent more heterogeneity then is of all viral families appears to be the usually associated with genetic variation maintenance of a particular molecular strategy of within one species. Also, distinct gene sets viral replicator (discussed below). 
T4 (and T (sometimes called accessory genes) can be even phage) has distinct, non-host like DNA identified that distinguish the T even phage, replicase (discussed further in chapter 4). so this seem to represent genetic differences Viruses will often display and maintain specific beyond usual species variation. In terms of (non-host-like) systems for their replication, similarity, all T even phages have conserved which includes the replicase and corresponding certain genes that are curiously not simply cis-restricted origins or regulatory nucleic acid those that have been identified as essential sequences. In addition, most virus families also genes. T4 phage codes for 140 genes. maintain a set of ‘accessory’ genes that are Genetic analysis indicates that only 69 of relatively unique to the particular lineage of virus these genes are essential for replication in but also generally dissimilar from host analogue laboratory (non-lysogenic) strains of E. coli. genes. These genes include structural (capsids, base plate etc.) and basic replicative genes Conservation of ‘accessory genes’ and (polymerase, ligase, etc.), which are thought interaction with prophage; the need to compete to be essential for virus replication. with other viruses. The only known biological However, these ‘essentail’ genes are only a requirement for T4 DNA methylation of adenine fraction of the T even conserved genes. The is to protect T4 against degradation by the high degree of conservation of the other 50 restriction/modification genes (or addiction or so ‘non-essential’ genes in different T- module) of the lysogenic P1 prophage (discussed even phage suggests that they also have below). E. coli free of P1 prophage does not essential and selectively conserved function have a restriction endonuclease that will degrade in natural settings. These non-essential methylated T4 DNA. 
The RIIa and b gene conserved genes include the tRNA cluster (conserved in T2, T4 and T6), which played and the RII genes. In general, these such an essential role in the history of molecular conserved genes are not related to sequences biology (e.g., the molecular concept of cistron found in E. coli, so they appear to be derived and the genetic code), functions only with from virus, not host genomes. For example, respect to allowing T4 replication in host the DNA methylation and repair enzymes colonized with prophage lambda. RII has no found in T4 are distinct and have no host function in an E coli free of a lambda (or counterparts. Curiously, neither the highly defective lambda) prophage expressing rexA and conserved HMC content of T4 DNA nor its rexB genes. Even more surprising is that HMC methylation are essential, they have been incorporation into T4 DNA is also not essential called accessory functions. Yet the HMC, and T4 can be grown free of HMC with glucosylation and methylation of T even additional T4 mutations (such as T4 restriction DNA is charateristic of and used to identify endonuclease) in nonrestricted E. coli. HMC all the T even phages which argues against containing DNA, however, renders it susceptible an ‘accessory’ role in vivo. degradation by McrA and MrcB genes 38 (restriction endonucleases). HMC is thus most like T4 corresponds to the head and needed to counteract this restriction that contractile tail regions of these phage, but also glucosylation of DNA allows. However, includes some early genes (DNA topoisomerase, although McrA is often thought of as a host DNA ligse,, ), consistent gene, it actually resides within the e14 with the maintenance of T-even replication genetic element that is itself a cryptic strategy. Yet, the DNA polymerase is not prophage element, not found in all strains of amongst this conserved set. However, more then E. coli. 
The e14 element also codes for one third of the psudeo T-even DNA has no other proteins that inhibit T4 translation. homology to T4 nor is the DNA HMC modified. Thus both T4 methylation and T4 accessory Phylogenetic analysis of all known phage based genes appear aimed at countering the effects on similarity of 105 phage proteins, places T4 at of other genetic parasites of E. coli. the unresolved root of the tree that includes Podophage, and other phage families. Thus T4 In addition, P2 prophage, (a ubiquitous thus seems to show evolutionary connections to phage known to exclude many lytic phage) a broad array of phage. Furthermore, T4 also will also exclude T4 by several mechanisms. shows clear similarity to Eukaryotic DNA This includes the expression of P2 Tin viruses, such as Herpes virus (see chapter 4). protein that T4 gene 32 (ssDNA This is observed via viral morphogenesis as well binding protein) as well as a P2 nuclease as sequence similarity of some replication that will degrade T4 DNA from exposed proteins. Thus T4, appears to represent a most ends. T-even phage appear to have a ancient viral system whose relationship to other monophyletic origin. Various viruses that infect distantly related host is still characteristics, such as common DNA apparent. However, T4 also has conserved modification, capsid morphogenesis, the specific similarities to eukaryotic cells and not nature and order of genes, mechanism of only the viruses of eukaryotes. These replication, all support a common lineage. similarities will mainly be presented in Chapter Yet T4-like phages (lytic with contractile 4. However, we can note, as an example, that T2 tails) can be found to infect many diverse and T-even-like RB3, (as well as SPO1 and its types of prokaryotes. T4 also shows some relatives - SP82, Fe, 2C and even numerous clear similarity to other polyvalent lytic phage of Streptococcus thermophilis) all have tailed phage, such as T7, RD114. In group I self splicing introns. 
Group I introns are addition, it appears that there exist a broad also found in fungal mitochondria, nuclei, distribution of nucleotide bias patterns in the plant and mitochondria, but not in lytic phage (inclusive of T4), but not most prokaryotes. Interestingly, Group I introns temperate phage (discussed below), do occur in some species of purple bacteria and suggesting evolutionary isolation of these cyanobacteria (discussed below), but these are two viral life strategies. The existence of polyphyletic and seem to have been acquired by broadly conserved genetic patterns in the during relatively recent lytic phage suggests an even broader evolution. T4 DNA polymerase, lysozyme and evolutionary connection amongst some lytic several other phage proteins are also clearly tailed phage then can currently be observed similar to Eukaryotic proteins but not the by sequence analysis. homologues of their prokaryotic host. Yet the genes if these lytic viruses do not seem to evolve Another group of viruses with clear by the same process as host genes. The host similarity to T4 are the psudoT-even genes or more stable (as entire genes) and their bacteriophage. This is a diverse group of ancestral relationships easier to discern. The T- viruses that can show cross hybridization, even genomes, (unlike the cellular genomes) even under stringent conditions, to T4 and clearly appear to have evolved by modular sub- RB49 DNA. The psudoT-even sequences 39 gene evolution from a network of both their animal host, but as episomal forms. closely and distantly related genomes. The Prokaryotes seem especially prone to persistent one evolutionary force that binds all these infection to larger dsDNA . Given the lytic phage together appears to be their vast ecological habitats occupied by all these common strategies of replication and prokaryotes, we can only imagine the enormous morphogenesis. 
Thus, the most reasonable impact that these parasites have had on their host view is that these phage constitute a genome. common, compatible and ancient network or pool of genetically interchanging replicators Lambda represents a rather large family of and not a single lineage. Thus, phage related tailed phage (Sipoviridae) whose genome nomenclature, such as the T-even is on the scale of 40 kbp of daDNA with around designation, does have a very strong 40 genes. However, unlike the T-even phage evolutionary significance (compared to host and many other viral families for that , species nomenclature) as the phage recent phylogenetic analysis of 105 phage genes nomenclature represents at best a viral failed to identify any genes that were in common mosaic which we call a virus family, but (core genes) to the entire family of lambda-like which cannot be represented as a traditional Siphophage. Yet the lambdoid phage, like T- evolutionary family tree. even, still represent a viral family with a common replicator strategy that allows genetic exchange and maintains both the virus specific Persistent-temperate Phage; Phage replication strategy and morphology. The Lambda Paradigm. T4 is considered as the closest and best studied temperate phage with bacteriophage that is the best studied clear similarity to lambda are P2 and P22 of example of an acute lytic bacterial virus. Salmonella. These phages all show some We can now examine the example of phage serological cross reactivity to each other. At the lambda to consider the characteristics of a nucleotide level, P22 is the phage most similar to persisting life strategy or temperate phage lambda and shares 22% sequence homology and compare this model to temperate viruses whereas P2 shares only 10% homology with of other prokaryotes. The section below lambda. These three phage can recombine with develops the general and specific features of each other thus they fit our view of a viral the lambda temperate phage system. 
species. As in the case of virulent T-even phage discussed above, the regions of conserved Characteristics of temperate phage and sequence similarity are distributed throughout the role of immunity. When we the genome in a patchy manner often involving contemplate the lambda model for sub-gene regions. The difference with other lytic prokaryotic virus persistence and T-even phage is that within the lambda family integration, a number of broad patterns can there are also more distant members of this be discerned. For one, the existence of a temperate virus family that show little remaining large number of similar dsDNA viruses that similarity within their genes. The lambdoid infect a broad array (possibly all types) of phage are even more heterogeneous the T-even prokaryotic cells seems peculiar to phage. This raises many questions in thinking prokaryotes (including Bacteria, Archaea, about common mechanisms of virus evolution. purple Bacteria and Cyanobacteria). How can we account for this broad difference? Although dsDNA viruses can be found in Why are the ‘core’ genes of this lambda virus and are prevalent in many other organisms, family not conserved? Do these unrelated such as insects and all vertebrates, few if members represent an independent lineage of any of these viruses persist by integrating phage? Are the temperate phage replicators so their DNA into that of the host chromosome. dependent on host cell replication systems that For example, Herpes viruses often persist in they do not need to maintain any core or 40 distinctive viral replication genes? Can the prevention of growth of competing viruses. P2 viral persistent life strategy be involved in is similar in this respect, but involves the genes this difference and do temperat phage exist old, tin and fun which resist lambda infection, in a distinct gene pool from that of their block T-even phage and inhibits T5 phage host? respectively. 
Interestingly, this region also shows a high AT content that distinguishes it as When in a persistent prophage, lambda a more recently acquired sequence. It is also expression is controlled by a bi-stable very interesting that P2, unlike lambda, is a non- genetic switch that will allow the prophage inducible prophage and appears locked into its to express only the genes associated with host (thus not lysogenic in the usual sense). immunity (cI, Rex). Stable protein-protein From this, it would appear that a major selective interactions with promoter sequences are pressure on a persisting phage is to resist used to achieve this epigenetic stability. The competition and by other phage. switch is sensitive to small changes in However, this exclusion has clear limits and it is concentrations of the cro repressor, which still possible to establish multiple prophage will switch expression and lead to the infection (with as many as 8 distinct prophage) induction of lytic virus replication. Due to in laboratory conditions. the expression of one of these genes (cI), a lambda lysogen is immune to superinfection A less autonomous replicator that senses host of phage related to lambda. In addition, as . The establishment of lysogeny is noted above, a lambda lysogen is resistant to affected by the physiology of the host cell at the superinfection with T4 (via Rex). However, time of infection. Cells that are starved for the mechanism of lambda immunity is media or in cold environments will tend to specific to the individual lambdoid phage. establish a lysogenic infection rather then a lytic Other related temperate phage (e.g. P2, P22) infection. Since in natural ecological settings, will differ in the mechanism of and genes bacteria are seldom in logarithmic growth, this associated with immunity and this variation situation is expected to prevail. In E. coli, represents one of the most variable regions stationary phase is also associated with of lambda like phage. 
For example, P22 and phage L are two very similar phages, but differ completely in the immunity region. Yet it also seems clear that these phage are part of the same family, as they conserve their relative gene order. However, in essentially all cases, lysogenic bacteria are immune to similar, and sometimes dissimilar, phage types. This situation is also called lysogenic conversion (although this frequently refers to surface modification).

In the case of P22, three immunity genes (mnt, sieA, a1) are expressed from the prophage and all affect the replication of other phage (via phage immunity, phage exclusion and altered surface proteins affecting phage attachment). Thus 12% of the P22 genome is dedicated to the

hypermutability due to high rates of recombination. In addition, although some temperate phage can infect multiple host cell types (such as P2), most temperate phage are highly specific to their host bacteria. This is in contrast to the lytic phage, which tend to be more host polyvalent.

The induction of lytic phage from lysogeny can be highly efficient under some conditions. In rich media, E. coli lysogenic for lambda will be induced by irradiation with UV light in essentially every cell, resulting in mass lysis of a culture. Many other temperate phage are also induced by UV irradiation, and this is a common assay for the presence of a prophage in a bacterium. It seems that disturbances of DNA replication might be responsible for this induction, as thymidine starvation or treatment with mitomycin C also frequently induce prophage. Using such an assay it has been measured that 20% of the bacteria in ruminant intestines harbor prophage. However, the biological relevance of this UV induction for enteric bacteria is debatable, and it is clearly not always the case that prophage will induce with UV or any other treatment. Defective or episomal versions of lambda, λdv for example, while still providing immunity and inhibiting T4 rII mutants, will not UV induce. More important is the example of P2. P2 is a member of a distinct, very large and widely dispersed family of temperate phage and can infect E. coli, Shigella, Serratia, Klebsiella and Yersinia species. Yet P2 will not induce with UV light and is essentially non-inducible. P2 replicates as an RCR, but this replication can itself be parasitized and induced by the satellite virus P4 (discussed below). In this case, one silent virus needs another for induction.

Another process of induction can also be observed with λ prophage and is known as zygotic induction. When a λ prophage has been transferred as a part of fertility factor (F+) mediated chromosome transfer to an F- and non-lysogenic recipient, λ will be induced in the zygotic recipient to produce lytic phage due to the absence of immunity function in the zygote. In a sense, the λ prophage itself is behaving like an addiction module: it is toxic (lytic) to E. coli lacking λ, but E. coli is protected by the persistence of the λ prophage. More importantly, this situation also suggests how the presence of a persisting virus can lead to the reproductive isolation of its host, since the infected and non-infected hosts no longer make compatible sex partners. This issue has major and general implications for evolutionary biology and will be discussed in subsequent chapters. However, not all prophage (e.g. P2) undergo zygotic induction. Stable persistence that uses both harmful genes and genes that prevent harm, or ‘addiction strategies’, is commonly used by various persisting genetic parasites.

Persistence without integration or induction: P1 and P2, and defective lysogens. We have noted that P2 prophage will protect E. coli against λ infection by expressing the old gene and inhibiting λ specific DNA replication. Thus even amongst temperate phage there is competition and exclusion. It seems likely that this viral capacity to successfully compete in order to colonize the host must be under positive selection and may explain the evolutionary conservation of so many viral ‘accessory’ genes. P2 is clinically prevalent and can be isolated from about 1/4 of clinical human isolates of E. coli. Thus the P2 lysogen is a much more common and successful colonizer than the λ lysogen. Like various other prophage, P2 integrates in a site specific manner adjacent to various tRNA sites (the 7 bp anticodon loop). However, because P2 is essentially a non-inducible prophage, it is interesting to consider the life strategy and fitness of such a system. How does P2 survive and transmit if it cannot reactivate? P2 has clearly conserved the ability to make virus and thus does not appear to be defective. P2 fitness thus appears to require the capacity to produce infectious virions at some time, presumably for transmission to other hosts (but not during extended persistence). The SOS inducible nature for both P2 and the related 186 and HP1 is consistent with this view. Interestingly, 186 also does not interfere with exogenous phage infection nearly as efficiently as P2 does. How do we rationalize the situation of P2, which persists as a non-inducible prophage? A commonly expressed view is that a bacterial host will be under some positive selection for the P2 prophage to persist because it may provide protection of the colonized cell against environmental stress and damage. This relationship is considered symbiotic. But if so, we are still left trying to explain how the virus moves to a new host or why it conserves virion structural proteins. In response to such issues, it has also been reasoned that if the infected host is damaged, it is then in the best interest of virus survival for the virus to induce lytic phage production and seek other

host (‘jump ship’). However, P2 (and various other persisting phage, including episomal versions of λ and RP4), which are clearly prevalent and fit for persistence, do not induce phage production even following lethal host genome damage. It is possible that phage persistence itself has a fitness advantage (such as preventing other viruses and squelching competition). In this case, the life strategy of P2 (and various other defective prophage) might be more like a ‘king of the hill’ game, closely associated with the simple, successful prevention of competition. Thus, the maintenance of persistence, not reactivation, is the main selective pressure, and this can then explain the success of a defective λ or 186 prophage. Any such ‘defective’ genetic parasites that improve the competitive survivability of persistence will be under positive selection. In this regard, the presence of a retron (an RT coding element) within P2 and other similar defective prophage (φR67 and φR86) may identify a parasitic element that is invasive of other potentially competing phage colonizers of the host. By interrupting the competitor genomes, resulting in the loss of the resident immunity/addiction modules, one persisting phage can defeat the already resident prophage. Thus, invasive parasitic elements within P2 can enhance the persistence function of P2. These P2 parasites thus improve P2 fitness, yet do not code for any gene products, nor are they simply selfish, as they must improve P2 persistence. If this is in fact the function of these retrons, it identifies the existence of significant competitive interactions between various persisting genetic parasites which have distinct fitness profiles. Clearly P2 does interact with other genetic parasites, and this is further discussed below. As we will see, P2 is efficiently mobilized to produce some infectious virions, but only following infection with an associated but defective family of satellite phage, P4.

Episomal persistence. The temperate phage P1 (related to P7) represents another family of temperate phage that was first identified due to its ability to exclude growth of λ. P1, however, is a bit more complex than most other temperate phage and contains a genome of about 100 kb, coding for about 100 genes. P1 displays a variety of biological characteristics which make it interesting to consider from the perspective of persistence. Of particular interest is that P1 appears to have several addiction modules that ensure the maintenance of the virus in the host. An addiction module can be defined as a set of functions or gene products that are toxic or harmful to the host (these are generally stable) as well as a matching set of functions and gene products that counteract, inhibit or provide immunity to these same agents (these are generally unstable). The two together are necessary as a set to prevent harm to the host and maintain the parasitic agent. P1 is generally maintained as an episome and thus seems more similar to F factors. Although it is often observed that persisting plasmid versions of prophage, such as λ, are less stable than integrated prophage, P1 has evolved a rather elaborate addiction system that maintains plasmid stability in daughter host bacteria (loss rate 10⁻⁵ per generation). The system involves both coordination of cellular and viral DNA replication and the partitioning of daughter plasmids into daughter bacteria, and precludes the co-existence of more than one plasmid in a daughter cell. In addition, P1 has one of the more complicated systems for expressing immunity among temperate phage and uses three distinct immunity regions. Of particular interest is the expression of a P1 coded restriction/modification system, which is involved in exclusion of other phage, as well as contributing to P1 plasmid maintenance. P1 was actually the very first system in which host restriction of this type was observed. P1 (as well as other parasitic plasmids) codes for fast acting modification enzymes (EcoP1) that act on replicating DNA, along with a more stable, slow acting type III restriction endonuclease that will cleave unmodified DNA. This can function as

an addiction module in that daughter cells that have lost the P1 episome will not express the corresponding modification enzyme, thus resulting in post segregation killing of the uninfected daughter cell.

Mechanisms of persistence; molecular identity markers and addiction modules. Yarmolinsky in 1993 first coined the term ‘addiction module’ to describe the killing action of a serine of phage P1 in order to explain post segregation killing. This killing is a type of programmed cell death that occurs (via phd-doc, ‘death on curing’) along with the curing of the P1 plasmid. This system, along with others, is designed to compel the infected cell to retain the viral genome. It results in a very stable but persistently infected cell lineage. The stability is such that only one cell in 10⁵ cell generations will spontaneously lose the P1 plasmid. To accomplish this, P1 also codes for another addiction module, which consists of a toxin/antitoxin gene set. The toxic gene product (Doc) is stable while the antitoxin (Phd) is unstable; thus, like the restriction endonuclease noted above, Doc will also kill host cells (or daughters) that lack the P1 episome. These are thus two examples of mechanisms (or phenotypes) that compel viral persistence and appear to be crucial for the persistent life style. However, as a consequence of P1 genes forcing viral persistence onto its host, two very important effects can be observed. One, P1 lysogens are immune to superinfection with unmodified phage. Thus a new P1 lysogen identity is acquired that is exclusive of other viral identity systems (as well as multiple copies of P1 itself). Two, P1 lysogenized bacteria are reproductively isolated, in that uninfected E. coli that mate via F-mediated transfer with P1 lysogenized E. coli are killed, as are daughter cells that have lost P1. In addition, only phage grown in a P1 lysogen will be properly modified and able to subsequently infect P1 lysogenic E. coli, so even permissive phage susceptibility is host restricted via P1. This persistence appears to identify an infectious molecular identity process that could lead to reproductive isolation of an organism.

There are several other interesting characteristics of P1 lysogens that may affect virus-host evolution. P1 contains an IS1 element that is involved in the occasional chromosomal integration of P1. In addition, P1 undergoes switching of a genetic module that controls host restriction. Using a site specific recombination system, P1 will invert a coding sequence, allowing for expression of one of two sets of tail fibers, similar to phage Mu. These two sets of tail fibers will have distinct host specificities. A lysogenic bacterial host colonized by P1 will thus have acquired both of these complex but adaptive genetic systems. Another and rather surprising characteristic of P1 is its ability to provide the host chromosome with a second, functional origin of replication. P1 can replicate both as a plasmid (oriL) and as an integrated prophage (oriR). When integrated, P1 (and P7) will allow E. coli with an impaired dnaA gene to replicate the bacterial chromosome from the integrated P1 prophage origin. Thus the phage has the capacity to superimpose a new replication system onto its host in a way that replaces the host ori function. This situation has important implications for the evolution of new host replication systems, as will be discussed in chapter 4.

General implications of persistent phage and fitness. Several general conclusions from the considerations of phage P1 and P2 should be noted. First, a temperate phage (which is a persisting genomic parasite) can be highly successful, hence fit, yet not have the ability to induce the prophage to produce lytic virus, even when colonizing a dying host cell. Thus it would be difficult to define this persistent fitness in the context of R0. In addition, defective and non-inducible versions of inducible phage (such as λ) can similarly be fit, as evidenced by their biological and competitive stability. This fitness and the capacity to subsequently make phage has a basic temporal component, which allows for some subsequent, infrequent but dependable event (infection with a helper) to propagate virus with high probability and success. Second, temperate phage can also be very successful and stable by predominantly existing as an episome, not usually integrated into the host chromosome. In this situation, the phage must express genes that ensure coordination of two replicons, the virus DNA and the host DNA, as well as express systems for the stable partitioning of viral chromosomes into daughter cells. These systems needed for viral persistence, even of defective virus, are elaborate and in sharp contrast to the concept of selfish DNA, which has no phenotype in the host. Viral persistence always seems to require a phenotype or strategy that compels the host to maintain the viral genome, yet maintains the capacity to recognize the distinct molecular genetic identity of the parasite. In many bacteria, these persistent phenotypes usually involve immunity functions or addiction modules, but sometimes no gene products are involved and persistence simply provides a genetic system that is efficiently parasitic to and competitive with other prevalent genetic parasites. Phage genomes contribute a significant portion of their coding capacity to these inter-parasite functions, and this is an especially dynamic component of phage/viral DNA. We must again emphasize that persistent phage fitness (and hence host fitness) requires the capacity to compete with, preclude or be parasitic to other prevalent viruses, both persistent and acute. If we attempt to evaluate the fitness of persistence in a host cell without such competing parasites, we will fail to see their essential contribution and think of them as accessory in function.

Although a persistent phage colonization event is forced onto the host by addiction modules and other molecular strategies, it will often appear more benign, as a mutualistic (symbiotic) relationship between virus and host. These stable persistent states often involve integration into the host chromosome, but can also clearly be episomal. However, the existence of episomal phage persistence diminishes distinctions between a persisting phage and a parasitic episome, especially as many parasitic episomes appear to be defective prophage. This blurred distinction between virus and plasmids will be discussed further below. Finally, we have seen several examples of phage that have themselves become parasitized by other genetic parasites (hyperparasites). In some cases, these secondary genetic parasites are specific to a virus and are not otherwise present in the host (such as invasive introns of T-even phage). In other cases, the parasites are interdependent. As is discussed below, both temperate phage and lytic phage appear to become parasitized by either defective viral or subviral agents. In the case of temperate phage, such hyperparasites can result in the destruction of the immunity module that controls persistence, leading to the induction of a lytic phage. Thus, there is also good evidence that nested interactions of genetic parasites exist in natural settings and that these interactions affect the virus-virus and virus-host relationships, some of which appear to be mutualistic. It is worth noting the similarity of these hyperparasites to those observed in the Tierra simulated life program described in chapter 2.

Distinct gene pools of persisting and acute viruses. In the above section, I have presented the view that the viruses of bacteria can be considered to have at least two distinct life strategies: acute and persistent. I have also shown that these situations involve distinct relationships and fitness between virus and host. In addition to these noted relationships with gram negative hosts, there also appear to be broad evolutionary distinctions between acute and persistent phage and their relationship to host. It is now well established that all organisms, including viruses, have rather distinct frequencies for the occurrence of the four nucleotide bases (such as AT or GC content) as well as distinct patterns of nucleotide ‘words’ (di-, tri- and tetranucleotides) and nucleotide palindromes. It has been observed that acute bacterial phage have nucleotide word frequencies that are fully dissimilar from those of their host. However, temperate phage (and parasitic episomes) have word frequencies that are the same as those of their host. This observation confirms the distinction between acute and persistent viral life strategies in bacteria but also indicates that temperate phage are in the same gene pool as their host. Furthermore, this distinction between acute and persistent word frequencies is not unique to bacterial viruses and can also be seen in essentially all viruses. In the case of bacteria, we will develop the view that persistence provides a pathway by which viral derived genes can contribute to the evolution of the bacterial host. However, we will see that temperate phage for the most part tend to acquire new genes (and sometimes complex gene sets) from recombinational processes involving other temperate phage or genetic parasites.

The P2/P4 satellite phage: a parasite of parasites paradigm; survival of the more defective. Lysogenic phage are highly successful in many natural bacterial populations. But this high rate of colonization and success can itself lead to additional opportunities for viral parasitization. In chapters 1 and 2, we considered the general tendency of viruses to generate defective versions that are parasitic to the non-defective helper virus. We have also discussed defectives of lambda and P2 and the relationship of P2 to P1 above. However, besides affecting P1, P2 is also the helper virus for the satellite phage P4. A satellite virus is defective for autonomous replication and requires a helper virus; thus it resembles a defective virus. However, unlike defectives, a satellite virus is not directly derived from the helper virus. For a satellite virus to be prevalent, the helper must also be prevalent. The best studied satellite/helper virus system of bacteria is the P2/P4 phage of E. coli. Satellite viruses are also very common in plants, as will be discussed in chapter 7.

As discussed above, P2 is itself a member of a large family of temperate phage, related to λ, that are able to exclude T4 and other phage by several mechanisms. P2 was originally isolated from an E. coli strain that also harbored P1 and P3. When integrated into E. coli, P2 itself becomes a molecular ‘defective’ in that integration interrupts P2 transcription. Furthermore, P2 is not readily induced from lysogeny and is not activated by UV irradiation or zygotic induction, and persists seemingly as a stable defective parasite only expressing immunity function. Yet P2 and all of its relatives can function as helper virus for the smaller P4, and P2 is induced to produce phage at low levels by P4. P4 has no gene related to P2 genes. P4 is a 12 kb linear dsDNA phage that packages and replicates via circular DNA using a bidirectional origin of replication (as opposed to the RCR DNA replication of P2). P4 replication is dependent on the P2 helper. P4 provides a distinct and smaller version of a capsid protein, but derives all the other numerous structural proteins, including tail proteins, from the P2 helper. Infection with P4 has several possible outcomes. If the host is not colonized by a P2-like helper, P4 will either integrate as a prophage at a unique tRNA site or, more rarely, establish a multicopy (30-50) episomal state. In both of these situations, P4 will always express immunity function, which is mediated by a rather distinct mechanism involving expression of a small stem-loop like RNA and transcriptional termination, and this immunity function must be suppressed for lytic replication. Thus persistence and immunity is the default mode of P4 infection (in the absence of P2). P4 also codes for various ‘addiction modules’, such as the gop killer toxin and the β gene, which prevents gop cell killing, as well as several other potentially toxic genes and numerous small genes. If the resulting P4 lysogen is subsequently infected with P2, it will suppress P2 replication and produce P4 instead. When P4 infects a host colonized by P2, it can also either integrate or become an episome

resulting in a lysogenic state. A frequent outcome, however, will be the induction of P2 and the efficient lytic replication (using P2 lysis genes) of P4. However, during this induction, some P2 is also produced, thus ensuring the propagation of P2.

Defective versions of P4 are frequently observed. Because of its defective and episomal nature, P4 (like P2 described above) also closely resembles an array of plasmids that resemble cryptic phage elements. P4, however, is a highly successful parasite in nature, hence is clearly fit in a natural habitat. In clinical isolates of E. coli, 1 in 4 are observed to harbor P4 or a defective derivative of P4. Furthermore, also like the P2 situation noted above, P4 itself can be parasitized by a retron. This is apparent with the phage φR73, which is nearly identical to P4 but with a retron element (coding for a reverse transcriptase), which has integrated to the right of the att site and now provides a different tRNA target site for prophage integration. These two P4-like cross immunity to each other. Retrons are generally rare in prokaryotes and found mainly in various phage and other genetic parasites. However, the presence of a retron within a viral genome is not unique to φR73 and is also common in the viruses that infect cyanobacteria, discussed below.

Host fitness in the context of successful defective parasites. The P4 satellite virus makes several important and general points that should now be emphasized. A defective and generally persisting virus can be highly fit and adapted to its host, yet its own transmission can be parasitic to other prevalent and persisting (and possibly also defective) genetic parasites. This almost sounds like an oxymoron. How can two unrelated defectives have evolved to work together? Can two wrongs make a right? Is this an example of group selection? Yet the presence of addiction modules and other strategies makes this persistent life strategy highly successful. It is the viral genes and viral specific phenotypes (such as viral immunity, addiction, etc.) that will generally be involved in the successful persistence. In the case of P4, we must also consider that a host E. coli colonized by P4 will consequently be strongly affected by these same superimposed viral phenotypes and will have a different adaptive and fitness profile and a distinct evolutionary trajectory as a consequence of multiple parasite colonization. Especially affected will be how this colonized host will interact with other genetic parasites. P4 is thus the best studied example of a parasite of parasites (hyperparasite). And in the case of φR73, there is yet another level of retron parasite that can be further considered (a hyper-hyperparasite). This is simply considering the interaction of two persisting parasites (P2 and P4) and not even considering the possible effects of other lytic phage, such as T4 or P1 and P3, all of which, amazingly, were present in the original natural P4 isolate. These lytic phage are also known to be important to the outcome of P4 colonization and host survival. What we see in these interactions is the existence of a highly sophisticated and complex web of parasites in the context of their natural habitat and host. The cauldron of nested sets of competing and interacting and often seemingly defective genetic parasites is indeed reminiscent of the observations in chapter 2 on computer ‘simulated evolution’, in which parasites of parasites lead to the evolution of much more complex and higher order systems. In the case of P2, after host colonization, P2 is otherwise unable to reactivate from a lysogenic state. P2 may be considered to depend on its own genetic parasite, the satellite P4, in order to undergo reactivation. Once liberated by this low level P4 mediated reactivation, P2 can undergo a much more efficient but lytic replication in susceptible host E. coli. But P2 will become host trapped if it again undergoes lysogenization. Thus P4 appears needed to mobilize P2 at a low but successful rate from its colonized host, explaining why P2 is under selection to retain all the gene functions of a virus. Thus we can now

clearly see why the fitness of a persisting virus and its host will be exceedingly difficult to measure in the absence of the other collaborating or competing genetic parasites.

PHAGE THAT INFECT HOST VIA PILI

Acute infections. As mentioned above, enteric bacteria, which are amongst the best studied, will often carry sex plasmids that confer the capacity for conjugational transfer of DNA via a pilus structure and also code for an integrase. These include F (fertility) factors, N pili, and drug resistance factors, such as RP. The likely origins of pili are discussed below, but their clear similarity to the capsid proteins of filamentous phage makes it likely that they themselves are originally derived from persisting viral parasites. The presence of these external appendages that transport DNA, however, also makes these cells susceptible to infection with various types of virus. In fact, the specificity of virus infection will often be via the pilus and not highly dependent on the bacterial host. However, these pilus infecting viruses are of distinct types from the larger dsDNA tailed phage we have already considered. Some of these pilus infecting viruses are acute agents and do not establish either provirus or persisting infections. φ6 is one of the better studied examples of a pilus restricted acute virus. φ6 is a strictly lytic small virus containing a dsRNA genome coding for 14 genes, including an RNA dependent RNA replicase. It is most interesting that φ6 infection is restricted to Pseudomonas bacteria, for unknown reasons, and is frequently found in association with degrading plant material. Land plants also support symbiotic filamentous fungi that are essentially all also infected with various types of dsRNA viruses (see chapter 4). The φ6 family of virus is clearly related to dsRNA viruses found in animals, such as reovirus (see chapter 8), and thus represents a well established viral lineage. In contrast to the tailed phages, however, the virus capsid also contains an essential envelope.

Another lytic phage that infects via a pilus is Qβ. Qβ is also of a distinct phage type, as it has a small icosahedral capsid containing plus polarity ssRNA. This family of viruses is organized into several groups, which can infect a wide range of bacteria, but all are clearly related to one another (such as MS2, f2). The infections are strictly lytic, and selection for resistant bacteria results in the loss of the pilus. Phage of this family are mostly found in association with animal feces, in which case phage counts can be as high as 10⁷ pfu/ml. Curiously, F-specific coliphages are rare in human and cattle feces, but common in pigs and birds. The RNA dependent replicase of Qβ has been especially well studied and has provided considerable insight into the biochemical evolution of a replicase and its intact or defective substrate (see chapter 2). Early on, Spiegelman showed that, when highly purified, this polymerase could spontaneously assemble substrate nucleotide triphosphates into Qβ origin containing templates, which would then amplify in vitro to very high levels, a process called ‘monster’ formation, resembling spontaneous biogenesis. Although the Qβ replicase has one of the highest polymerase error rates ever measured, natural and field isolates of this family of phage show very little sequence variation. However, variant versions of Qβ can easily be isolated in lab settings, and these will grow well in laboratory settings. Only when these variants must compete with wt Qβ will the relative fitness of the variant be observed, as it is lost from the passaged culture.

Nonlytic pilus infections. Filamentous phage, such as Ff and M13, also infect their host via the pilus structures. However, in contrast to the RNA phage above, these infections do not result in lysis. Instead a chronic, ongoing but non-lytic production of virus is established. The cells do not need to lyse, as virus coded functions allow for continuing virus extrusion through the cell

membrane and cell wall. This is a variation of viral persistence since virus production can be continuous, but some of these filamentous phage are also able to integrate as prophage. This relationship also resembles one that is called ‘pseudolysogeny’, in which a phage genome remains as an episomal or plasmid-like prophage and does not integrate, but results in chronic phage production. Filamentous phage have ssDNA circular genomes that are packaged into rod-like protein structures (with very high alpha helical content) and are extruded from the infected host without lysis. There is a clear structural relationship between capsid genes of filamentous phage and those of the pilus. However, it has been observed that M13 coat protein (which is so useful for phage display of cloned protein epitopes) is tolerant of a surprising level of variation. The amino acid sequence of M13 can actually be inverted from its natural polarity, yet still result in a highly efficient yet totally novel capsid protein. Filamentous phage replicate as rolling circular replicons (RCR), involving a virus encoded site specific endonuclease that leads to the covalent attachment of the viral protein to function as a primer for the replication of viral DNA. This process of protein primed replication is found in various families of DNA and RNA viruses, but is absent from the host genome. This family of filamentous phage, however, will also code for addiction modules and toxin genes. An addiction module of specific interest is the CTXphi phage of Vibrio cholerae, which codes for the cholera toxin genes (discussed below).

It seems clear that there is a strong relationship between the pilus mediated sexual system of bacteria and various phage, involving both acute and nonlytic infections. It seems likely that the pilus structure, with the integrase and the transmembrane transport of DNA used for transfer, has itself originated and evolved from an ancient virus infection (discussed below). However, it is also clear that the presence of a conjugative sex plasmid has an important impact on the relationship the host has with various other prevalent viral agents. We therefore expect viruses to have an important and perhaps central impact on the evolutionary potential and consequence of this sexual process. However, there are very few laboratory or direct measurements of these interactions, so we are hard pressed to make any definitive evaluation of this issue. For example, if a bacterium which harbors an F factor is also infected with M13, what consequence does this have with respect to colonization with other temperate phage or infection with lytic phage? How does the expected milieu of virus-virus interactions affect sexual exchange and evolution? We currently lack answers to such questions.

Other types of acute RCR phage: The filamentous phage are very similar in their replication strategy to another well studied E. coli phage, φX174. However, unlike the filamentous phages, φX174 is a lytic virus and is not pilus dependent. Infection with φX174 excludes virus re-infection. φX174 codes for capsid and replication primer genes as well as numerous other small proteins, several of which are not essential for replication in culture and are of unknown function. Although φX174 also has a ssDNA circular genome, it is packaged into an icosahedral capsid, not a filamentous rod. φX174, however, is the best studied of the phage that replicate by an RCR. It also uses a phage specific and sequence specific endonuclease and covalently attaches a primer protein to the viral DNA to prime rolling circular replication of viral DNA using host DNA polymerase III. φX174 is structurally more related to small ssDNA viruses found in plants and animals, so it is considered a better model for an ancestor of these eukaryotic viruses. The φX174 protein primer (gpA) is thus a basal member of a very large family of RCR primer proteins, and similarities to the viruses of plants and animals are apparent. This RCR based replication can generate concatamers, but its use is specific to numerous other virus (including P2, P1, λ) replicons and it is not used to replicate any host chromosome. Thus the presence of an RCR replication strategy is a reliable marker for an ancient virus specific replication system.

Relationship of persisting phage to plasmids and sex in prokaryotes.

How phage resemble plasmids. We have noted above that in several cases, the distinction between plasmids and episomal persisting phage can be very blurred. Yet plasmids and fertility factors are often thought of as distinct entities from bacteriophage since they lack many of the structural genes characteristic of a virus. Entire books that address the issues of plasmids have been written from this perspective. However, as we also noted above, persisting episomal phage closely resemble plasmids in various ways, and may even lack the genes coding for virion structural proteins. As seen in the above section on P4 satellite virus and other defective virus, persisting viral parasites are often defective for most if not all viral structural genes. Yet these persisting phage retain an essentially virus dependent replication strategy, requiring the help of another infectious agent for mobilization and transmission. Clearly these persisting phage closely resemble plasmids. In these examples, the persisting virus may be a hyperparasite. In order for such a persisting and hyperparasitic virus to be fit, however, it must retain some phenotype or strategy which compels persistence by providing either a competitive advantage to the host or a new molecular identity system that recognizes and precludes other genetic parasites. Plasmids are often identical to persisting defective phage in these same characteristics.

Another similarity between prophage and plasmids is observed via their respective coding functions. Both phage and plasmids can code for a specific integrase that directs DNA integration into specific chromosomal sites, associated with specific tRNA sites. In some cases, the integrase itself can function as a phage specific virulence factor. For example, the prophage encoded integrase of Dichelobacter nodosus (ovine footrot) is one such virulence factor. Bacterial virulence itself is a topic that also establishes the similarity between plasmids and phage. Bacterial virulence is probably the best studied example of an important complex host phenotype that is acquired in one genetic event. It has always been clear that a prophage can confer onto its host bacteria the rather complex phenotypes associated with the acquisition of virulence factors. For the most part these factors are phage encoded toxin genes, such as those of diphtheria, erythrogenic toxins, staphylokinase, enterotoxin A, Shiga-like toxin and Clostridium botulinum neurotoxin. In addition, alterations in the bacterial cell surface, such as the O-antigen of Shigella, can be due to phage coded virulence factors. These virulence associated genes are, for the most part, virus derived genes and generally have no host counterpart. Also, they tend to reproductively isolate their host from uncolonized hosts. Finally, it should be noted that by acquiring one of these virulence associated prophage, the host has acquired a complex and new phenotype that can include the acquisition of dozens if not over one hundred new genes, all in one genetic event.

How plasmids resemble phage. All of the above characteristics of phage persistence are also seen in plasmids. Let us now consider how some well studied plasmids closely resemble phage. Bacteriocins are plasmid encoded bactericidal particles that are highly specific to and active against other bacterial strains that lack the plasmid. Bacteriocins exist in two categories, large particles and small molecules. The large bacteriocin particles are clearly related to phage particles (forming both icosahedral and filamentous forms), although such particles often lack DNA. Thus they closely resemble phage virion structural proteins able to form holes in and kill susceptible cells. The bacteriocins can also be due to small molecules, which are toxins that can specifically kill host strains that lack the plasmid encoded anti-toxin. In this regard, they clearly resemble the addiction modules found in numerous persisting phage. K is a plasmid encoded somatic cell wall antigen that confers virulence and also clearly resembles a phage conversion situation. Thus, it seems most likely that these bacteriocin and other plasmids have originated from persisting cryptic phage which retain the associated persisting phenotype. It is also well established that plasmids can code for restriction/modification systems that also function as addiction modules and are widespread in both Bacteria and Archaea.

clear selective pressure, such as the presence of a drug in the growth media. However, it has been experimentally observed that following selection for plasmid persistence, plasmids can confer a competitive advantage onto the parasitized host in the absence of an obvious selective pressure. In several other respects these plasmids resemble phage in that they are stable, can have alternative replication strategies, yet can be made up of sequences that are distinct in base content (GC, AT etc.) from the cellular chromosome. As discussed below, mobile plasmids will often also code for integrases that are clearly related to phage coded integrases, using the same tRNA integration sites.
Although many plasmids are In the case of Lactobacillus, there is a related to one another , plasmids do not generally plasmid based example of both the have the same degree of phylogenetic restriction and modification activity residing conservation as do phage families. It therefore within one peptide which still confers seems most likely that for the most plasmids resistance to phage infection and represents have evolved from polythetic lines of cryptic the simplest know version of a persisting phage. restriction/modification system. Thus these plasmids are essentially identical to Pathogenic islands (PAI) origins and phage. persisting cryptic phage. Pathogenic islands (PAI) are also a well studied plasmid mediated genetic system, due to their Multidrug resistance and virulence plasmids obvious medical importance. However, these are also very well studied and are known to sequences are integrated, not episomal. Studies code for large numbers of genes. In some of PAIs makes it clear that in one genetic event, cases, very large virulence plasmids have a bacterial cell can acquire a very large and been observed (e.g. Bacillus anthracis complex set of interacting genes, which confers pXO1, greater then 180 kbp) which in to the host bacteria the ability to colonize human addition to virulence determinants, these host, affect immune recognition, alter or regulate plasmids appear to be sites for acquisition of cell physiology, and cause disease. Thus PAI’s other plasmids, various addiction and provide a clear example of the acquisition of immunity modules and transposable complex phenotype by the host. 75% of these elements. However, in some cases, these PAI’s are associated with tRNAs at a sequence plasmids are so large that they also appear to junction at the point of integration. This be second bacterial chromosomes. 
In a observation suggests phage involvement since sense, these large plasmids function as sinks phage (not host) integrases target tRNA DNA or traps for other plasmids and prophage, sequences. Consistent with this idea, it is known making the host more able to adapt to that the PAI integrase, which is essential of host complex environmental changes. A good colonization, is related to the integrase of P4 or example of this situation is with the genome f73. As noted above, these phage integrase are of Vibrio cholera, which has a large plasmid encoded by retron present in the phage that exists that can also be considered as a second in two types. However, PAI integrases are chromosome. As a rule, it is often stated frequently defective, indicating that they are that the presence of some of these plasmids inactive. It thus seems likely PAI elements may (especially large ones) represents a fitness need a helper phage to mobilize and colonize burden for the host bacteria in the absence of 51 additional host. Some of these PAIs have in structure of F factors to capsids of filamentous fact been directly shown to be excised and phage as well as the relationship of both their transmitted by helper phage. One example integrases and tRNA att sites to those of phage is that of salmonella SaP1 island, mobilized and how phage are the likely progenitors of sex by f 13 or f80 phage. Thus the PAI factors. However, lack of sufficient sequence colonization process appears to essentially similarity prevents us from generally concluding be an infectious event involving defective that phage were indeed the predecessors of sex replicator elements and phage. The factors. In the case of phage, the integrase can distinction between this process and the also be part of the primer-replication system of defective prophage relationships we have the phage genome and the att sites define outlined above seems minimal. In a sense specific phage replicators (lineages). 
Except the term PAI is really a misnomer and have Mu, all known phage integrases mediate site been better called ‘fitness islands’. They specific recombination (via tRNA sites) and clearly introduce new phenotypes into their belong to the same gene family (l type integrase host and alter host fitness. Thus PAIs can family). Thus it has been argued that all phage also be considered to be persisting genetic integrases appear to be monophyletic and to have colonizers that are defective for evolved from a common ancestor. We have also mobilization. It is clear that PAIs cannot be noted how these sex appendages make bacteria considered selfish elements since they susceptible to various acute phage infections and clearly bestow important and complex are thus directly subjected to phage based phenotype onto their host. However, PAIs selection. In addition, transduction requires the are often thought of as having moved genes trans-membrane movement of DNA. The from one host to another, thus representing plasmid protein responsible for this DNA lateral transfer of gene sets. The problem movement clearly resembles the DNA ring with this view, however, is that it fails to helicase, characteristic of various phage address the origin of these complex gene replicators. Thus, for the most part, these F sets, which are very often unlike any other plasmid essential genes have clear viral host genes. The acquisition of extended counterparts, but do not have host counterparts. regions of DNA constitute the most dynamic Furthermore, in many cases, these factors can portion of the bacterial genome. Since it is affect the outcome of phage infections. These clear that PAI acquisition results from an effects include F-factor mediated phage infectious colonization event, it is also resistance, induction of prophage production, as logical to propose that an infections agent well as F-factor invasion of silent prophage. 
itself, such as a persisting phage, might also This invasion of silent prophage can lead to the have provided the original genetic material loss of the prophage immunity module, from which to assemble these ‘fitness suggesting that F-factors are in cometition with islands’. some prophage. In addition, F-factor mutation- reactivation of lytic phage, along with F-factor Sex Factors and transposable elements invasion of the prophage, can result in the relationship to phage. Of special interest mobilization and hitchhiking of the F-factor are the mobile plasmids and sex factors of within the phage to transmit to another host. All bacteria, both because they are thought to be these characteristics suggest that sex plasmids of major importance to bacterial evolution not only are derived from phage, but are a and because of their close relationships with component of the continuum of genetic parasites viruses. However, F-factor distribution in we call viruses, but perhaps being most natural populations is not uniform as is associated with a ‘defective’ persistent life phage distribution. We have already noted strategy and often dependent on acute virus for the clear physical similarity of the pilis transposition to other host.

F-factors represent efficient transposable elements. However, by far the most highly adapted, complex and efficient transposable element of all is the Mu phage of E. coli. Mu is a temperate linear dsDNA tailed phage (similar to T-even phage) that is able to transpose at rates 100-1000X greater than that of nonviral transposable elements. In addition, Mu can transpose to almost any site in the E. coli chromosome, thus the name Mu for mutagenic. Like other temperate phage, Mu codes for a DNA modifying enzyme and Mu is resistant to P1 phage mediated restriction. Most often, Mu infections result in lytic infection, as lysogenic establishment is not efficient. In addition, unlike λ, Mu prophage is not induced following UV irradiation or mitomycin C treatment. However, unlike all other phage types presented so far, Mu lytic replication is coupled to transposition. Thus the transpositional activities of Mu proteins (which provide att site recognition and nick DNA to prime integration) are essential for Mu lytic replication. About 100 transposition events will occur in each lytically infected cell. It is furthermore interesting to note that Mu can integrate into and inactivate a λ prophage, suggesting that the capacity for such high level transposition might also allow Mu to compete successfully with other prophage that may have colonized the same host cell. One thing is clear, however: viruses are by far the most efficient transposable elements known, and their unmatched rates of genetic adaptation and evolution make it likely that they were the progenitors of both sex plasmids and other bacterial transposons.

The Remainder of the Prokaryotic World: The Archaea Domain and Their Viruses

In our above discussion, we have examined viruses that infect Bacteria and considered their relationship to their Bacterial host. For the most part, however, viruses that infect E. coli have been the best examined models for both lytic and persistent Bacterial infections. However, we know that the prokaryotic world represents two of the three distinct domains of life: Bacteria and Archaea. The Archaea domain is further divided into two kingdoms of life known as the Euryarchaeota kingdom, which includes methanogens and extreme halophiles, and the Crenarchaeota kingdom, which includes the hyperthermophiles and sulfur metabolizing organisms. In keeping with the general observation that all life forms have their own particular types of virus associated with them, we see that Archaea also have unique relationships with their viruses, but differ from Bacteria in this. As in Bacteria, the large majority of the viruses so far characterized are also dsDNA viruses. However, there are some significant overall differences between the viruses that infect Archaea and Bacteria.

Euryarchaeota. One of the best studied Archaeal phage is the halophage φH, which infects halophiles (Euryarchaeota). φH is a tailed phage strikingly similar to T-even phage in morphology, replication and transcription strategy and contains a 59 kb linear dsDNA genome. Hs1 and HF1/HF2 are similar tailed phage of halophiles, but the Ja1 halophage is notable for having a very large genome of 230 kbp (lacking modified nucleotides). However, in contrast to the T-even phage, φH is a temperate phage which is induced in stationary growth. In addition, and like P1, φH persists as a stable episome, or autonomous plasmid, that confers immunity. Like T4, φH also will package almost 20% more DNA than is coded for by the virus, but typically the φH genome will contain tandem copies of the ISH 1.8 insertion element, which also leads to genetic instability of the phage genome. The HF1/HF2 phage are acute versions of tailed halophage; they replicate only in a lytic mode and are also resistant to type II restriction enzymes. Within methanogens, which constitute one of the largest groups of archaea, ψM1 is the best studied phage. Similar to φH, ψM1 is also a temperate

phage and, like T4, has a linear dsDNA genome that occupies less than a headfull of DNA but which contains a multimer cryptic plasmid providing additional DNA to fill the phage head (pME2001). Thus in this Euryarchaeota kingdom of Archaea (which is most related to Eukarya), the type of phage present and its relationship to its host is very similar to, but distinct from, that seen in Bacteria.

Crenarchaeota. However, in the Crenarchaeota kingdom, the types of virus found and their relationship with their host are unique. The overall pattern of phage found in these hosts differs strikingly from that found in Bacteria or Euryarchaeota. In contrast to Bacteria, in which 95% of phage are some type of tailed dsDNA containing capsid, only 5% of phage so far are of this tailed type. 95% of these phage are of some other morphology, with filamentous forms being the most abundant. No RNA viruses have yet been reported for the Archaea. However, 9 distinct morphotypes of various DNA viruses have been observed in these hosts, most of which are of a unique physical structure. The most striking difference, however, concerns the prevailing life strategy of these viruses in their host. All of the 9 types of virus are non-lytic, and are produced by continuous extrusion and not by a cell burst process as is common in Bacteria. One of these viruses had a completely unique double tailed morphology. Thus most or all infections are some type of persistent carrier state, not lytic. The Crenarchaeota include the hyperthermophiles and sulfur metabolizing bacteria. The viruses that infect the hyperthermophiles have attracted the most attention since they present a potentially rich source of proteins that have high thermal stability and might have much commercial value. Recent phage surveys and sequencing projects by D. Prangishvili, W. Zillig and others have begun to give us a better picture of these remarkable viruses and their hosts. These extra-chromosomal carrier infected states are exceedingly common in hot thermal habitats. One of these viruses is also known to integrate into the host chromosome. All cells isolated so far appear to host some, and often multiple, phage infections. Strikingly, in one study, one isolated hyperthermophile hosted all nine phage morphotypes so far characterized. Thus mixed persistent carrier infections are common. Several genomes of these viruses have been recently sequenced. One sequenced genome, PSV, was shown to have a linear dsDNA in which all open reading frames are on one DNA strand. Most remarkably, however, not one of these open reading frames showed any recognizable similarity to any gene in the GenBank database! This would include viral core and replication genes, which often show clear similarity to other viral genes. Furthermore, initial screening of other phage clones suggests that the low similarity (less than 5%) of these phage genes to the database might be a general characteristic of most of these uncharacterized genomes. The implication is that a vast repertoire of unique genes exists in these hyperthermophilic phage. Another phage, AFV-1, was also sequenced, which did show some gene similarity, but also showed some highly interesting properties. One such property is that AFV-1 uses eukaryotic-like TATA promoters (unlike its host cell). Of high relevance to chapter 4 (the possible origin of the nucleus), this phage has a linear dsDNA in which the ends are composed of G/C rich 11mer repeats that clearly resemble the telomeres of eukaryotic chromosomes. Furthermore, sequence analysis suggests that this family of virus is basal to and resembles the chlorella viruses, the poxviruses and ASFV, all large DNA viruses of eukaryotes. No other prokaryotic DNA virus had been shown to occupy this basal phylogenetic position.

Of the viruses of sulfur metabolizing bacteria, the best studied is the Sulfolobus virus SSV1, which has a novel lemon shaped morphology with a very short tail not seen in any other type of virus. SSV1 is also distinct from bacterial viruses in having a closed circular dsDNA genome of about 15.5 kbp. Intriguingly, and possibly uniquely in the biological world, this DNA is packaged as a positive supercoiled topoisomer and requires a reverse gyrase for replication. In addition, this virus is also not lytic. It is lysogenic (with tRNAarg integration) in its host and does not reactivate lytic virus production. Instead, it appears to be rather unique in spreading from lysogenic host to uninfected host by direct cell-cell contact involving low level non-lytic virus production. This virus therefore has a highly inapparent and persistent life strategy, more so than most viruses of bacteria, and seems adapted to essentially never make large quantities of virus. SNDV is another droplet shaped virus with a 20 kbp closed circular DNA genome, so it appears that circular viral genomes are the norm in these hosts, unlike the hyperthermophilic bacteria, which hosted mostly linear DNA viruses. There also exist unique filamentous forms of DNA viruses of Crenarchaeota, such as TTV 1, 2, 3 and 4. Unlike the filamentous viruses of Bacteria, these viruses have linear dsDNA genomes and are highly heat stable. Also in distinction to viruses of Bacteria, the DNA of these viruses is stoichiometrically bound by one or several highly basic viral encoded proteins. Thus we see a chromatin-like structure of the viral genome, more like the dsDNA viruses of eukaryotes. In addition, these virions have lipid envelopes, either internal or external to the capsid. Both temperate (TTV1) and lytic (TTV4) life strategies are found in these filamentous viruses. In Sulfolobus, six unique virus particle morphologies have been observed, three of which were completely novel. Essentially every virus yet found in the Crenarchaeota kingdom is of a unique type, not found in either Bacteria or Eukarya.

Plasmids are also known in Archaea. Besides the ISH elements present in halophage genomes, noted above, conjugative plasmids have been observed. As in Bacteria, many viruses appear to bind to pili. In addition, a widespread plasmid, pDL10, is found in Sulfolobus; it allows alternative oxidative or reductive metabolism of sulfur, and its copy number is amplified in linkage to energy metabolism. Another plasmid, pTIK4, has an addiction module and is able to induce killing in non-colonized cells via a cell-cell contact associated process. However, plasmid-virus interactions have not been well evaluated in Archaea, although the existence of plasmid encoded restriction/modification addiction modules and the mobilization of plasmids by parasitizing infectious phage suggest that considerable interaction must occur.

Overall, Archaea support virus infections that are clearly similar in some respects to those in Bacteria (lytic and temperate dsDNA viruses), which can have restriction/modification and toxin based recognition systems. Yet the two kingdoms of Archaea are distinct from each other and from Bacteria in the types of virus they support. Of special interest is the presence of linear dsDNA viruses which have ‘chromatin’ bound DNA as well as lipid envelopes. These are all characteristics of Eukaryotic chromosomes, which will be presented in chapter 4. Given the tendency of viruses to distinguish their genomes from those of their host by various covalent modifications, we can infer that the tight association of viral DNA and protein seen in Archaea will also function as a molecular system that differentiates viral from host chromosome while also providing protection against sequence specific host recognition systems, such as restriction modification.

T-even like phages predating Bacteria/Archaea host divergence. Although it was noted above that Archaeal and Bacterial phage have numerous distinctions, there is a striking similarity between halophage like φH and Bacterial T-even phage. Their structure (including contractile tails), DNA replication strategy (including concatamers and headfull packaging) and transcriptional organization (including back to back promoters controlling early to late transcription) are all hallmarks found in related viruses. This makes it likely that these phage originated from a common ancestor, and it has been argued that all tailed phage have a monophyletic origin. Given the highly host dependent nature of these phage and the major difference in the life style and physiology of their hosts, it seems likely that the host for the common tailed phage ancestor would be the ancestral cell progenitor to both Archaea and Bacteria (an undifferentiated prokaryote). If so, this argues that at least this lineage of phage has been present in prokaryotes since before the divergence of Archaea from Bacteria, and also argues against the idea that these phage evolved later (or frequently) from escaped host replicons, as has often been suggested. However, the mosaic and network nature of phage evolution makes it very difficult to trace ancestry. It is interesting, however, that sequence analysis of 105 phage genes places T4 at the unresolved center of the major phage tree. Such placement would be consistent with a very old origin of tailed T-even phage.

Phage variation and evolution studied in bacterial populations. In 1981, it was suggested by Botstein that virus genomes undergo ‘modular evolution’ in which new viruses originate by a combination of genes or gene clusters derived from multiple sources including chromosomes, defective viruses, plasmids, transposable elements etc. The observations that have accumulated in the ensuing years have for the most part been consistent with this ‘modular’ view of phage evolution. However, it also now seems clear that the origin of most viral lineages is from other predecessor viruses (possibly networks of related viruses), not escaped cellular replicons as was frequently considered in the early literature. In addition, the source of most new individual viral genes generally is not traceable to non-prophage chromosomal host genes. In an above section, we noted that comparisons of genomes from enteric phage indicated a ‘patchy’ or mosaic character to the similarity between genomes, often consisting of gene or subgene modules. However, phylogenetic analysis has generally shown that the acquired genes that distinguish diverged lineages of phage have few if any counterparts in their host. Phage genes are also mostly unique to the specific phage lineage. In some viral lineages, very few or none of the viral genes show similarity to any host genes. One character that appears to distinguish phage genes from chromosomal genes is that phage (and most viruses) have an overrepresented level of small, single domain genes (100 a.a. or less). The most recognized of these genes are the small genes such as 206 a.a. (Tat, Rev, E6, E7, etc.) of HIV-1 and HPV, but similar small regulatory genes are found in the genomes of essentially all viruses and phage. Gene loss has also been observed in specific viral lineages, but this appears to be much less common than gene acquisition, which is most characteristic of new viral lineages.

The best studied system for phage variation with respect to large bacterial populations is to be found in the dairy industry. Due to the large economic impact of lytic phage, bacterial fermentation of milk lactose into lactic acid in yogurt and cheese (via Streptococcus lactococcus) has been carefully followed for over 30 years. The enormous culture volumes involved (up to 50,000 liters per day) would seem to be ideal situations in which to observe the dynamics of lytic phage adaptation in large populations. Although many lytic and temperate phage interactions have been observed and their genomes examined, it does not appear that the dairy industry provides a situation that puts evolution in fast forward and creates new phage types. Instead, most of the new lytic variants appear to enter these cultures from preexisting outside (natural) sources, such as raw milk. In addition, specific lytic variants can be stable or fit for extended periods, suggesting that they are neither being selected to become temperate or nonlytic nor showing high rates of variation.

The dairy based observations, however, do confirm the patchy or modular nature of phage evolution, likely due to high level recombination. Comparisons of 60 isolated and related lytic phage types, such as Sfi 19 of S. Lactococcus, show extensive cross hybridization and patchy sequence similarity within gene sized and sub-gene sized regions, corresponding to sub-gene domains. The pattern of cross hybridization is often distinct for the different genes; that is, individual genes will cross hybridize with distinct sets of phage. Sequence analysis indicates that these genes are mainly derived from other viruses, including cryptic proviruses, but seldom from genes from host chromosomes. This is especially true for regions coding for phage tails and base plates, as well as immunity regions, which are the most diverse. These genes seem to be assembled from numerous other phage sources and not one single progenitor virus. In keeping with this idea, the phylogenetic analysis of the integrase gene establishes that this is clearly virus specific and not congruent with phylogenetic patterns of other phage genes. Thus it seems that a gene specific network of various viral lineages is contributing to many of the phage genes, although more basic genes, such as helicases, are better conserved within one viral lineage. Of specific interest was the observation that this strictly lytic virus is highly related to a temperate phage, Sfi 21, differing by only 10% in its sequence. The differences between the two phages also show the invasion of an intron into the gene and again suggest that a temperate phage can lose immunity function and be induced, following colonization by a transposon, to generate a strictly lytic variant. This result suggests a clear strategy by which the intron parasitizes a temperate phage for its mobilization. Interestingly, Lactobacillus phage LL-H contains a group II intron, which has not been observed to occur within its host genome, although group II introns (with reverse transcriptase domain) are present in the pRS01 conjugational element. Thus there may exist a dynamic relationship between a persistent or temperate phage in which lytic variants can be derived from a successful persistent state and escape virus specific immunity.

Practical observations of phage control: use of plasmids and defectives. The dairy industry has been mainly interested in understanding how to make starter cultures that are resistant to an array of lytic phage. Thus it provides much practical insight into the genetic factors that most affect phage-host interactions. The observations have been that the colonization of the starter culture by various plasmids, such as those that code for restriction/modification addiction modules, will provide some of the most robust protection against lytic phage. This includes the plasmid W10, which codes for one protein providing both restriction and modification function. However, the most impressive resistance to a broad array of phage (23 of 25 phage evaluated) was accomplished by using defective plasmids of the phage themselves. By using two distinct types of phage based origin sequences, synthetic recombinant replicons were constructed that would amplify following complementing phage infection and interfere with phage replication. This confirms that defective or cryptic phage based plasmids can be highly fit for persistence if subjected to acute phage based selection. This observation also suggests a genetic pathway by which persistence might evolve from acute infection.

The cyanobacteria and their viruses, steps towards Eukaryotic evolution. The marine environment is of special interest to evolutionary biology as it is the birth place of so many other lineages of life. The marine bacterial environment accounts for about 70-90% of organic marine matter, thus this presents a very large cellular and viral habitat. Cyanobacteria and their viruses (phage) are of special interest due to the more developed nature of these photosynthetic and nitrogen fixing bacteria. Cyanobacteria are thought to have diverged from Bacteria and Archaea about 3.5 billion ybp, thus representing one of the first and oldest living domains to diverge from prokaryotes. Cyanobacteria are themselves prokaryotes and exist in five groups or orders. These groups show both asexual and sexual reproduction as well as morphological differentiation. Two groups are unicellular and divide by fission; two groups are filamentous, of which one divides by fission and the other forms sexual heterocysts (e.g. Anabaena); and a fifth group is morphologically complex, showing differentiation, filamentous (multicellular) branching and heterocyst formation. Curiously, lytic phage of cyanobacteria are well known for the first four groups but not known for the more complex fifth group (Stigonematales). Cyanobacteria are more complex than most prokaryotes in that they have photosynthetic, chlorophyll containing multi-membrane structures internal to the cell wall, called thylakoids, which absorb light and CO2 and emit O2. Their cell walls resemble those of gram negative bacteria, from which they appear to have evolved. The phage that infect cyanobacteria are mainly large dsDNA tailed phage that closely resemble T7. Mostly, these are ubiquitous and lytic phage that are specific to their host orders (unicellular, filamentous). Interestingly, some of these phage infect and replicate in the thylakoid structures, whereas others infect the nucleoplasm. LPP-1 is an example of a cyanophage of Plectonema that infects thylakoids, displaces the photosynthetic lamellae and stops CO2 photoassimilation. Thus some of these phage can clearly alter and regulate the bacterial photosynthetic system and, as will be discussed in chapter 4, can even encode the core photosynthetic enzymes.

In unicellular cyanobacteria, cyanophage are mainly lytic (such as SM-1, AS-1, S2-L) and until recently no lysogenic phage had been described. Three genetically distinct phage of unicellular phycoerythrin containing cyanobacteria are known, and these conserve much of a sequence encoding the capsid assembly protein found in T4 (gp20). S-PM2 has a conserved genetic module that includes g18 to g23, which includes the major virion components. However, in filamentous cyanobacteria of the LPP group (genera Lyngbya, Plectonema, and Phormidium) lysogenic phage (such as LPP-1) are common and have long been observed. It is not clear if the paucity of lysogenic phage from unicellular cyanobacteria is due to a lack of screening and of reliable methods for phage induction. LPP-1D is a variant of LPP-1 that is temperate in Plectonema; it is not induced following UV irradiation, but can be induced with mitomycin C.

CO2 fixation and phage production. AS-1 is a cyanophage of Anabaena that also infects thylakoids, but in this case phage replication is obligately dependent on photosynthesis and needs light. Viruses affecting subcellular structures (precursor ‘organelles’) are thus common in this order. There is also a relationship between phage infection and heterocyst formation. For example, A4 of Anabaena infects vegetative but not heterocyst cells. Curiously, selection for vegetative cells that resist A4 infection can result in mutation of the HU (histone-like) gene that is required both for A4 replication and heterocyst formation, suggesting a link between virus replication and heterocyst formation.

As was observed with Archaea and their viruses, with cyanobacteria and their phage there also exists a strong association between host order and the nature of the viruses they support. As we have seen, lytic phage are most common in unicellular cyanobacteria whereas lysogenic phage are more common in filamentous cyanobacteria. In addition, the linkage of virus replication to organelle function could suggest the involvement of viruses in the origins of these organelles. Generally, the accepted view is that the more autonomous organelles of eukaryotes represent an endosymbiotic relationship between a prokaryotic host and a degenerate symbiont cell. This is supported by the presence of distinct 16S and 23S rRNA genes in the endosymbiont, genes characteristic of a cell, not a virus or plasmid. Recently, however, it has been observed that in red algae, there is a large (150 kbp) plasmid (from the RK-1 host strain) which contains these rRNA genes, but is an autonomous replicon containing inverted repeat regions, characteristic of viral genomes. Thus the possibility that phage were involved in the origins of the eukaryotic organelles remains open. The issue of possible viral origins of organelles will be better developed in the context of mitochondria of fungi in chapter 4.

Differentiating bacteria and phage production.

between E. coli and B. subtilis. Comparing the E. coli and B. subtilis genomes shows that bacterial variation occurs in a patchy manner involving gene sets, in that there exist about 230 regions of distinct dissimilarity between these genomes. The great majority of these regions of difference are flanked by tRNA sequences, which mark the integration events associated with these gene sets. As presented above, tRNA primed integration is characteristic of viral integrases; it is also found in some plasmids, but is neither typical of nor essential for host gene function. This clearly defines an infectious process involving the colonization of host
Within prokaryotes, there also genomes by genetic parasites as being primarily exist examples of bacteria that can undergo responsible for mot of the genetic events that cellular differentiation. As this is a lead to the speciation of E. coli from B. subtillus. characteristic associated with higher It has often been proposed that such types of organisms, it is interesting to examine what insertional events would likely be mediated by is known concerning the relationship of such IS elements. However, B. subtilis DNA contains bacteria with their viruses. During its no IS elements or transposons. However, B. differentiation/sporulation life cycle, subtillis DNA is now known to contain 10 Thermoactinomyces vulgaris bacteria show proviral genomes (including cryptic – defective a clear linkage of virulent bacteriophage Ta1 phage) in its chromosome. Thus it seems clear phage replication to cellular differentiation. that IS elements are not always involved in how In this case, the primary mycelium arising bacteria alter or adapt their genomes. from spores was the only stage allowing Interestingly, during the early bacterial genomic phages replication. Curiously, infection of sequencing projects, it was observed that about mycelium or of late sporulation stages 1/3 of B. subtillis sequences won’t clone in (are resulted in a loss of phage. And if phages toxic to) E. coli, such as to prevent exhaustive were added at the beginning of spore phage library construction for a B subtilis formation, this resulted in the phage genome proteins. Thus there seems to exist clear limits becoming integrated in the developing on the degree of species compatibility for many spores rather then a lytic infection,. bacterial genes and could also suggest that some Subsequent outgrowth of these prophage- ‘horizontal’ gene movement may not be tolerated carrier spores, reactivated lytic phage between even closely related bacterial genomes. production. 
This linkage raises the question This observation also raises issue as to the origin of whether the phage life cycle might have of novel bacteria specific sequences. Rather then contributed to the evolution of this cellular considering that new genes tend to come from differentiation life cycle. other lineages of bacterial cells, I would suggest that viral lineages may be a more likely source Comparative bacterial , for the origin of most such new genes. Bacterial evolution & dynamic genomes. With the evolution is now established to be essentially completion of the genomic sequence of infectious, but the resulting changes seen must numerous bacterial species, we can now be persistent in the lineage in order to result in a examine the specific types of global changes new species. Bacteria represent the most associated with speciation between bacteria. adaptable organisms we know and also have the The first of these comparative genomic most dynamic genomes. analyses has already been completed 59 In terms of naturally dynamic genomes, accessory part of the bacterial genome. It is natural isolates of E. coli strains are known interesting to note that many DNA viruses also to vary in DNA content from 4.5 to 5.5 show this same characteristic ‘core genome’ and megabases. Thus 20% of E. coli genome is an ‘accessory’ or dynamic genome. Yet some dynamic. These strains differ significantly virus families, such as the lambdoid phage, show in gene number and this variation includes no conserved core sequences and still these some iconic operons of E. coli, such as the phage families maintain sequence signatures and lactose operon, which are not found in all other characteristics, including compatible natural E. coli isolates. Most of these recombination, which suggest they are clearly in variable genes (755 genes, of which 515 are a common gene sequence pool as that of their in 62 sets) are now proposed to be host. 
We might then ask: What constitutes an associated with mobile accessory elements evolutionary (or phylogenetically) stable such as IS elements or prophage (including genome? What genes go on to ‘persist” in some large defective prophage). The evolution and why? Can these core genes also prophage associated genes comprise the originate from virus? We have seen examples by largest set and include over 120 genes. The which persisting phage (P1) can replace the most most variable and thus most dynamic of basic element of the host replication machinery, these genes include those that code for the origin of DNA replication and the restriction enzymes as well as those coding corresponding origin recognition proteins. Thus for surface lipopolysaccharides. If we it is clear that viruses can create and superimpose consider a possible phage-based origin for the most basic core host replicative functions. such genes, in keeping with our discussions Yet such basic functions would seem to define above, such changes would clearly seem the host itself. It would thus follow that related to phage colonization events, often essentially any host function might similarly involving acquisition of addiction modules, have been virus derived from or replaced by a consistent with the view that stable host stable persisting provirus. Given the above colonization is a major force in sculpting the discussion, might we now conclude that such bacterial genome. We have argued that infectious genetic colonization must indeed be a restriction/modification systems represent major driving force in prokaryotic evolution? examples of such addiction modules. Prior The genomic data that are now available, at least to these recent genomic analysis, restriction within the prokaryotes, indeed appears that enzymes had already been established to be support such a claim for a viral role. However, highly variable and mobile between strains this raises a conundrum. 
If this infectious indicating a strong association with genetic colonizing genetic mechanism were so important colonization. for the creation of genetic novelty and evolution in bacteria, why was such a process not The concept of the stable bacterial maintained in more complex organisms such as genome vs the unstable genome and the eukaryotes? DNA viruses (prophage) or their role of plasmids/phage. Numerically, the defective derivatives are generally not colonizing dynamic portion of the E. coli genome is and excising from the genomes of any now established to be much more a eukaryotes. This process is mainly absent from consequence of phage colonization activity eukaryotes. However, other colonizing genetic then to result IS activity. However, a major parasites may yet play a major role in eukaryotic portion of the E. coli genome is considered evolution. As will be presented in the following ‘stable’ and is maintained in evolution and chapters, germ line persisting genetic parasites does not appear to have resulted from phage are exceedingly common to eukaryotes and also colonization. Evolutionist often consider specific to their host species. These genomic this to be the true ‘core’ genome and parasites, however, are seldom derived from consider the dynamic portion to be DNA viruses but instead are mostly derived 60 related to retroviruses. Such stable genetic help of addiction modules. As noted previously, parasites may provide an answer to the Yarmolinksy first coined the term addiction dilemma of the apparent absence of an module and applied it to the serine protease of infectious mechanism involved in eukaryotic phage P1 to explain post segregation killing, or evolution. . This system, along with others, makes retention of the viral genome very Plasmids as chromosomes; origin of stable. 
I have reasoned that addiction modules, multiple chromosomes – One significant and involving stable toxin and unstable antitoxin or general distinction between prokaryotic and the restriction/modification systems are one of the eukaryotic chromosomes is that Eukaryotic general strategies that allow an infectious genetic chromosomes are multiple and linear, not parasite to successfully attain stable persistence. circular as in Prokaryotes. However, some Often, phage and plasmid addiction modules use prokaryotes do harbor multiple chromosomes various types of toxin and anti toxins for this and warrant examination. Vibrio cholera is purpose. I have also argued that persisting phage one example that has a second chromosomes can themselves function as an addiction strategy of significant size. However, unlike the in that they can kill uninfected members of the primary chromosome, this second same or related bacterial species. In this case, the chromosome has many non-essential addiction module is a general genetic strategy accessory genes, including sequences from since there need not be a specific toxin and anti- plasmids as well as genes for various toxin gene. The lytic action of the reactivated addiction and immunity modules. This vegetative phage provides the harm (toxin) striking occurrence of so many such whereas the protective action of he viral sequences has led some to propose that this immunity module is the ‘anti-toxin’. Bacterial second chromosome constitutes a plasmid sex is also affected by such addiction strategies. capture system which allows the acquisition For example, the loss of an F plasmid can kill its of new genes useful for adaptation. The host due to the action of the small toxic ccd existence of multiple chromosomes in some gene’s inhibitory effect on host gyrase A. 
prokaryotes raises several questions Interestingly, although host gyrase resistance to concerning the mechanisms that allow stable ccd killing can be selected for, such mutations are maintenance of multiple chromosomes as unstable and recessive to the wt gyrase. Thus occurs in Eukaryotes. Could they be derived natural selective pressures favor the maintenance from a common ancestor? Chromosome of host sensitivity to the actions of ccd. Such coordination would seem to require highly toxin genes are often small proteins (less then 100 linked control of DNA replication and a.a., consisting of a single protein domain, often segregation. As we noted above, some phage with an active site). Also, and like ccd, these derived plasmids are persisting and toxins frequently target the most basic host extrachromosomal, such as P1. P1 acheives machinery (such as cellular gyrase) or can create plasmid stability by being able to tightly holes or pores in the target cell. In some cases, coordinated plasmid replication with host anti-toxin can be an antisense RNA (e.g. hok/suk chromosome replication. Thus it seems family). Often one system will have several worth considering if such phage related independent addiction modules suggesting the strategies might have led to the origin of major importance of such strategies to virus multiple chromosomes as seen in Vibrio fitness. Clearly such a persisting virus-host cholera. relationship is under strong selection. An unexpected example of the importance of phage Addiction and multi-genome stability. In biology to various addiction modules and to the P1 episome, the coordination between phage survival can be found in lambda. During plasmid and host chromosome and the lysogeny, lambda expresses only the rex A, B resulting stability is accomplished with the (T4 rII exclusion) along with cII repressor. 61 Although this immunity function controls (Oshima, Kakizawa et al. 
2001) lytic lambda replication, it is also directed at (Tetart, Desplats et al. 2001) excluding unrelated phage. However, the (Bollback and Huelsenbeck 2001) lambda immunity genes can also exclude the (Bernstein and Bernstein 1989) addiction modules of other persisting phage, (Blaisdell, Campbell et al. 1996) such as P1, that might occupy a lambda host. (Rohwer and Edwards 2002) RexB prevents lamdba O protein degradation which is involved in DNA replication. Lytic phage However, by affecting targets of CIpP protease, rexB also inhibits the degradation (Tetart, Desplats et al. 2001) of antitoxin proteins phd of P1 and of Maze (Mathews and American Society for (rel operon), thus stabilizing these anti-toxin Microbiology. 1983) proteins to prevent post segregation cell (Karam and Drake 1994) killing. With respect to the Maze protein of the rel operon, rex B prevents the starvation Lysogenic - episomal phage induced killing that would otherwise occur. Thus lambda rex B can be an ‘anti-death’ or (Hershey 1971; Hendrix 1983) ‘anti-addiction’protein, by stabilizing and (Gordon and Wright 2000) extending the lifespan of the P1 addicted host. Thus RexB provides a competitive Phage Mu advantage to lambda relative to other potential colonizers by allowing a lambda (Symonds 1987) colonized E. coli to preclude other persisting (Morgan, Hatfull et al. 2002) parasites. The occurrence of a large number of similar types of addiction modules within Viruses of Archaea. the second chromosome of Vibrio cholera might well suggest that this second (Zillig, Prangishvilli et al. 1996) chromosome also has attained stability which (Prangishvili 2003) originated from a persisting phage. (Rice, Stedman et al. 2001)

Oceanic and soil phage.

Recommended reading. (Hurst 2000) (Tikhonenko, Belyaeva et al. 1975) Phage history, classification and host (Perova, Tikhonenko et al. 1977) association. (Fuller, Wilson et al. 1998) (Hambly, Tetart et al. 2001) (Delbrèuck 1950; Cold Spring Harbor Laboratory of Quantitative Biology., Cairns P2/P4 helper phage system et al. 1966; Goyal, Gerba et al. 1987; Ackermann 1998; Ackermann 2001; (Lindqvist, Deho et al. 1993; Bertani and Deho Brussow 2001; Tetart, Desplats et al. 2001; 2001) Davis 2003) Addiction modules and persistence Phage – plasmid evolution. (Engelberg-Kulka and Glaser 1999) (Botstein 1981) (Hendrix 1999; Hendrix 2002) Pathogenic islands. (Brussow and Desiere 2001) 62 (Cheetham and Katz 1995) Brussow, H. (2001). "Phages of dairy bacteria." (Finlay and Falkow 1997) Annu Rev Microbiol 55: 283-303. (Hacker and Kaper 2000) Brussow, H. and F. Desiere (2001). "Comparative phage genomics and the Bacterial genomes. evolution of Siphoviridae: insights from dairy phages." Mol Microbiol 39(2): 213- (Gelfand and Koonin 1997) 22. (Karlin and Burge 1995) Cheetham, B. F. and M. E. Katz (1995). "A role (Karlin, Campbell et al. 1998) for in the evolution and (Riley and Serres 2000) transfer of bacterial virulence (Mrazek and Karlin 1999) determinants." Mol Microbiol 18(2): 201- 8. Cold Spring Harbor Laboratory of Quantitative Biology., J. Cairns, et al. (1966). Phage Ackermann, H. W. (1998). "Tailed and the origins of molecular biology; bacteriophages: the order [essays]. [Cold Springs Harbor, N.Y.]. ." Adv Virus Res 51: Davis, R. H. (2003). The microbial models of 135-201. molecular biology : from genes to Ackermann, H. W. (2001). "Frequency of genomes. New York, Oxford University morphological phage descriptions in Press. the year 2000. Brief review." Arch Delbrèuck, M. (1950). Viruses 1950. Virol 146(5): 843-57. Proceedings of a conference on the Bernstein, H. and C. Bernstein (1989). 
similarities and dissimilarities between "Bacteriophage T4 genetic viruses attacking animals, plants, and homologies with bacteria and bacteria, respectively. [Pasadena], eucaryotes." J Bacteriol 171(5): Division of Biology. 2265-70. Engelberg-Kulka, H. and G. Glaser (1999). Bertani, G. and G. Deho (2001). "Addiction modules and programmed cell ": recombination in death and antideath in bacterial cultures." the superinfection preprophage state Annu Rev Microbiol 53: 43-70. and under replication control by Finlay, B. B. and S. Falkow (1997). "Common phage P4." Mol Genet Genomics themes in microbial pathogenicity 266(3): 406-16. revisited." Microbiol Mol Biol Rev 61(2): Blaisdell, B. E., A. M. Campbell, et al. 136-69. (1996). "Similarities and Fuller, N. J., W. H. Wilson, et al. (1998). dissimilarities of phage genomes." "Occurrence of a sequence in marine Proc Natl Acad Sci U S A 93(12): similar to that of T4 g20 5854-9. and its application to PCR-based Bollback, J. P. and J. P. Huelsenbeck detection and quantification techniques." (2001). "Phylogeny, genome Appl Environ Microbiol 64(6): 2051-60. evolution, and host specificity of Gelfand, M. S. and E. V. Koonin (1997). single-stranded RNA bacteriophage "Avoidance of palindromic words in (family Leviviridae)." J Mol Evol bacterial and archaeal genomes: a close 52(2): 117-28. connection with restriction enzymes." Botstein, D. (1981). A modular theory of Nucleic Acids Res 25(12): 2430-9. virus evolution. Gordon, G. S. and A. Wright (2000). "DNA . B. N. Fields, R. Jaenisch segregation in bacteria." Annu Rev and C. F. Fox. New York, Academic Microbiol 54: 681-708. Press: 363-84. 63 Goyal, S. M., C. P. Gerba, et al. (1987). Morgan, G. J., G. F. Hatfull, et al. (2002). Phage ecology. New York, Wiley. " genome sequence: Hacker, J. and J. B. Kaper (2000). analysis and comparison with Mu-like "Pathogenicity islands and the in Haemophilus, Neisseria and evolution of microbes." Annu Rev Deinococcus." 
J Mol Biol 317(3): 337- Microbiol 54: 641-79. 59. Hambly, E., F. Tetart, et al. (2001). "A Mrazek, J. and S. Karlin (1999). "Detecting alien conserved genetic module that genes in bacterial genomes." Ann N Y encodes the major virion components Acad Sci 870: 314-29. in both the coliphage T4 and the Oshima, K., S. Kakizawa, et al. (2001). "A marine cyanophage S-PM2." Proc plasmid of phytoplasma encodes a unique Natl Acad Sci U S A 98(20): 11411- replication protein having both plasmid- 6. and virus-like domains: clue to viral Hendrix, R. W. (1983). Lambda II. Cold ancestry or result of virus/plasmid Spring Harbor, N.Y., Cold Spring recombination?" Virology 285(2): 270-7. Harbor Laboratory. Perova, E. V., A. S. Tikhonenko, et al. (1977). Hendrix, R. W. (1999). "Evolution: the long "[Bacteriophage induction in cultures of evolutionary reach of viruses." Curr C1. botulinum type A]." Zh Mikrobiol Biol 9(24): R914-7. Epidemiol Immunobiol(11): 125-8. Hendrix, R. W. (2002). "Bacteriophages: Prangishvili, D. (2003). "Evolutionary insights evolution of the majority." Theor from studies on viruses of Popul Biol 61(4): 471-80. hyperthermophilic archaea." Res Hershey, A. D. (1971). The Bacteriophage Microbiol 154(4): 289-94. lambda. [Cold Spring Harbor, N.Y.], Rice, G., K. Stedman, et al. (2001). "Viruses Cold Spring Harbor Laboratory. from extreme thermal environments." Hurst, C. J. (2000). Viral ecology. San Proc Natl Acad Sci U S A 98(23): 13341- Diego, Academic Press. 5. Karam, J. D. and J. W. Drake (1994). Riley, M. and M. H. Serres (2000). "Interim Molecular biology of bacteriophage report on genomics of ." T4. Washington, DC, American Annu Rev Microbiol 54: 341-411. Society for Microbiology. Rohwer, F. and R. Edwards (2002). "The Phage Karlin, S. and C. Burge (1995). Proteomic Tree: a genome-based "Dinucleotide relative abundance for phage." J Bacteriol extremes: a genomic signature." 184(16): 4529-35. Trends Genet 11(7): 283-90. Symonds, N. (1987). Phage Mu. 
Cold Spring Karlin, S., A. M. Campbell, et al. (1998). Harbor, N.Y., Cold Spring Harbor "Comparative DNA analysis across Laboratory. diverse genomes." Annu Rev Genet Tetart, F., C. Desplats, et al. (2001). "Phylogeny 32: 185-225. of the major head and tail genes of the Lindqvist, B. H., G. Deho, et al. (1993). wide-ranging T4-type bacteriophages." J "Mechanisms of genome propagation Bacteriol 183(1): 358-66. and helper exploitation by satellite Tikhonenko, A. S., N. N. Belyaeva, et al. (1975). phage P4." Microbiol Rev 57(3): "Electron microscopy of phages liberated 683-702. by megacin A producing lysogenic Mathews, C. K. and American Society for Bacillus megaterium strains." Acta Microbiology. (1983). Bacteriophage Microbiol Acad Sci Hung 22(1): 58-9. T4. Washington, D.C., American Zillig, W., D. Prangishvilli, et al. (1996). Society for Microbiology. "Viruses, plasmids and other genetic elements of thermophilic and 64 hyperthermophilic Archaea." FEMS Microbiol Rev 18(2-3): 225-36.

Possible figures.

3-1. Table of phages (bacterial)

3-2. Dendrogram of phage evolution

3-3. Table of RNA phage evolution.

3-4. Nature of addiction modules and parasites of parasites (needs rendering).

3-5. Cyanobacteria evolution

3-6. Cyanophage evolution

CHAPTER IV

THE DILEMMA OF THE BIG TRANSITION IN EVOLUTION

The evolution of the eukaryotic nucleus and the eukaryotic cell represents the largest discontinuity in the evolution of all life. Beyond the dilemma this presents to the evolutionary biologist, the origin of the eukaryote also raises important issues for the virologist, as this resulted in a distinctly different habitat for viruses. As mentioned in Chapter 1, geological evidence of microfossilized cellular structures suggests that prokaryotes were present for at least 3,500 million years before the present (bp). Fossil evidence also suggests that cyanobacteria had developed by 2,700 to 2,800 million years bp. Thus cyanobacteria appear to have originated prior to the evolution of the first eukaryotes. The earliest eukaryotes for which there exist clear geological fossil data are the microalgae (similar to red or brown algae) which occupy the pre-Cambrian boundary. These algal fossils date to about 2,000 to 2,200 million years before the present. On a geological time scale, the Cambrian radiation occurred relatively soon after the appearance of the first eukaryotes, resulting in the generation of diverse phyla and many new species which included fossilized shell and bone structures. Thus prokaryotes existed for over one billion years before the first unicellular eukaryotic cell emerged, followed by the eukaryotic radiation. Why was a prokaryotic world so stable and why did this change so rapidly? Although it appears that the origin of eukaryotic green algae is associated with the symbiotic acquisition of chloroplast, it seems probable that the eukaryotic nucleus along with various other cytoplasmic characteristics (such as cytoplasmic motility) were acquired even before this time. Giardia species, for example, constitute one of the most primitive forms of eukaryotes in that they lack mitochondria, but they still have nuclei. Recent phylogenetic analysis suggests that the eukaryotic nucleus evolved prior to the symbiotic acquisition of both mitochondria and chloroplast. If so, the acquisition of the nucleus represents the most basal and initial event in the evolution of eukaryotes. In this chapter, I will present the possible role of viruses in this and other evolutionary discontinuities. For the most part, evolutionary biology has not considered the possible role of viruses in host biogenesis.

CURRENT VIEW: SYMBIOSIS BETWEEN SEVERAL PROKARYOTES. A current and prominent view concerning the origin of eukaryotes is that they represent a fusion of two or more symbiotic progenitor cells. A similar kind of symbiosis is also thought to have resulted in the generation of chloroplasts and mitochondria, in addition to that proposed for the nuclear structure, and all of these are thought to have been derived from distinct prokaryotic cellular predecessors. In such a symbiosis, the prokaryotic progenitor of the nuclear structure is thought to have been symbiotically engulfed by another prokaryotic predecessor lacking a cell wall. The resulting unicellular eukaryote would then resemble a primitive algae-like cell. Cyanobacteria are believed by many to be the likely ancestors to both mitochondria and chloroplast. Numerous similarities have been noted to support this view, such as the similarity of chloroplast and mitochondrial 16S RNA to that of cyanobacteria and purple bacteria. The evolution of these bacteria prior to eukaryote evolution is also thought to have resulted in a change in early oxidation status, substantially altering the world's ecology and setting the stage for the emergence of eukaryotes with their oxidative mitochondrial metabolism. Photosynthesis is thought to be a central participant in the origin of eukaryotes because it allows unicellular organisms to live without a dependence on chemical energy and instead use photosynthetic phosphorylation to provide energy. Photosynthesis, however, needs to 'pump' away excess energy to prevent photo-oxidation of chlorophyll, so some system for this purpose (such as carotenoids) would also need to be created in the early eukaryote. More problematic for a eukaryotic cell, however, is the incompatibility between photosynthesis and the free oxygen needed for mitochondrial oxidative respiration. These two features do not appear to be chemically compatible within one cell and would require a strict chemical separation. Unicellular algae have clearly solved this problem by compartmentalization of the two plastids: chloroplast and mitochondria. Thus cyanobacteria or purple bacteria, which could provide the basis for both chloroplast and mitochondria, seem like possible participants in these plastid acquisitions and symbiosis.

However, the origin of the nucleus, the most basal distinction of all eukaryotes, presents the biggest challenge for theories based on a symbiotic origin. As will be presented, most molecular and structural features of the nucleus pose a problem for having originated from prokaryotes. A widely accepted view, proposed by Cavalier-Smith (1987), is that the early eukaryotic cell also must have lacked a cell wall, allowing motility and cytoplasmic engulfment of food. Mycoplasmas are bacteria that lack a cell wall and thus have been proposed as the likely source of this progenitor cell. However, there do not now exist any known prokaryotic organisms (including mycoplasmas) that feed by phagocytic engulfment of food, as do eukaryotes, which suggests that the phagocytic character of a eukaryote did not exist in the prokaryotic progenitor. As mentioned above, recent phylogenetic analysis has suggested that the acquisition of the nucleus in early eukaryotes may predate the acquisition of both the mitochondria and chloroplast. Thus a conundrum seems to exist concerning the origin of the nucleus and, as we will present below, no existing prokaryotic cells stand out as being the likely progenitor to the nucleus.

The nucleus contains numerous basic and distinguishing features of the eukaryotic cell, including all the highly coordinated genes involved in genome replication. The eukaryotic replication proteins and apparatus, although functionally homologous to the replication proteins and apparatus of prokaryotes, are very distinct. Eukaryotic replication proteins have amino acid sequence compositions that differ almost completely from those of prokaryotes. This sequence difference is so large that prokaryotes do not appear able to have been the progenitor to most of these functionally homologous eukaryotic proteins. However, the prokaryotes of the archaeal lineage do have notably greater sequence similarity in some of their replication proteins to those of eukaryotes than do eubacteria. This observation has led some to suggest that archaebacteria were the likely symbiotic progenitors of the eukaryotic nucleus. There are, however, major problems with this scenario, the main one being that it leaves unexplained the origin of too many other features of the eukaryotic nucleus. Poole and Penny (2001) reviewed the evidence for the archaeal origin of the nucleus and concluded that existing evidence argues against archaeal origins. This conclusion is also consistent with the observation that Archaea are much more like Bacteria than Eukaryotes and have 4 times more Bacteria-like proteins than eukaryote-like proteins. Thus Archaea are significantly more related to Bacteria than they are to eukaryotes. This same dilemma has led Smith and Szathmáry to conclude that the evidence for the symbiotic origin of the eukaryotic nucleus is presently weak and that we still lack a sensible scenario for the origin of the nucleus. There currently exists no living cell that has all or even many of the characteristics needed to have provided the nucleus. Below, we list the specific examples of nuclear characteristics that lack a sensible explanation based on having originated from a prokaryotic cell. Each of these characteristics alone raises a dilemma for explaining the origin of the nucleus. Yet all are considered unique cellular and molecular characteristics of all eukaryotes.

THE WORKINGS OF THE NUCLEUS. The existence within the nucleus of numerous molecular distinctions with prokaryotes raises numerous specific issues. Each of these distinctions will require an explanation according to the theory of prokaryotic symbiosis as has been presented by L. Margulis. In addition, these molecular distinctions of the nucleus have major implications for the function of eukaryotic RNA and especially DNA viruses. Not only does the nucleus segregate the process of transcription and DNA replication from that of translation, it also provides a very distinctive molecular and chromosomal environment for both DNA replication and transcription. Essentially all currently characterized prokaryotic organisms have circular DNA genomes with a unique origin of replication that attaches to the cell membrane to allow daughter chromosome segregation. Some examples of large linear plasmids or accessory chromosomes are also now known in some bacteria (e.g., Agrobacterium tumefaciens). However, as previously mentioned, the genes within these uncommon linear plasmids are usually associated with accessory genes (such as the T-DNA transferred into host plant cells) or the presence of addiction modules, pathogenic islands and transposable elements (see chapter 3). The core replication and biosynthetic genes are generally in the circular chromosome, suggesting that the linear plasmids clearly resemble remnants of a colonized host. Another very interesting exception to circular chromosomes of prokaryotes is that of the parasitic Borrelia spirochetes. They can contain sets of both circular and linear ds DNA chromosomes, the latter being associated with pathogenicity and having covalently closed snap-back ends. As will be described below, covalently closed snap-back ends of DNA are molecular characteristics that clearly resemble the genomes of some DNA viruses, such as the poxviruses. Another example of a prokaryote that has additional chromosomes is that of Vibrio cholerae. As mentioned previously, this additional chromosome has also been called a gene capture system, which contains addiction genes and toxin genes and is associated with prophage. In fact, it is the prophage genes within these chromosomes that provide the gene functions responsible for the cholera toxin. The central role of circular chromosomes in prokaryotes is in contrast to the chromosomes of eukaryotes, all of which are multiple and linear, with some type of repeated telomere end. In eukaryotes, circular chromosomes or chromosomes with only one origin are essentially nonexistent. Circular genomes that are found in eukaryotes are all due to either viral episomes or the result of differentiation-linked DNA amplification (endoreduplication) of specific replicons, such as rDNA in diplomonads.

Another major molecular distinction between prokaryotes and eukaryotes is the packaging and replication control of the DNA. Prokaryotic DNA is not tightly associated with stoichiometrically bound basic chromatin proteins, such as histones. This distinction appears to have affected viral strategies in prokaryotes. For example, the great majority of prokaryotic viruses inject naked DNA into their host cells, and integration of viral DNA into host chromosomes is a common viral strategy, especially during lysogenic-persistent states. In contrast, eukaryotic chromosomal DNA is always tightly associated with small basic DNA binding proteins, usually histones, with the interesting exception of the naked DNA present in a gamete just after sperm penetration and uncoating. In keeping with this host characteristic, eukaryotic DNA viruses all use some type of basic polymer or protein-bound chromatin structure that is used to package virion DNA and also to infect host cells. Eukaryotic viruses appear to avoid naked DNA in their replication strategy. In another major distinction with prokaryotes, eukaryotic DNA replication initiates in numerous (generally thousands of) sites, which can have a loosely defined origin sequence (corresponding to regions of initiation) but can also be specific to an origin sequence (such as amplified rDNA ori's). In stark contrast to prokaryotes, the re-initiation of eukaryotic DNA replication is very stringently regulated within a complex cell cycle control system (except for the diplomonad macronucleus discussed below) and shows exceedingly small overreplication error rates (generally less than one in 10^7). Daughter eukaryotic DNA molecules segregate via attachment to tubular proteins which make up the spindles, not by membrane attachment as do prokaryotes. Furthermore, a complex protein set of greater than 10 proteins is involved in the control of the initiation and extension of DNA replication. Although

which must occur out of host cell cycle control. Interestingly, latent eukaryotic DNA viruses often replicate their DNA in coordination with host cell cycle control (which is also typically linked to cellular differentiation). Eukaryotic DNA viruses do not inject naked DNA into host cells as do bacteriophage (except for DNA viruses of algae). Eukaryotic DNA viruses generally use either viral or cellular encoded DNA-associated basic polymers or histone-like proteins to stoichiometrically coat and condense their chromosomes. In addition, all eukaryotic nuclear viruses appear to have specific mechanisms for nuclear entry, often involving viral structural proteins. Human adenovirus, for example, specifically docks subviral-like particles onto the nuclear pore opening and injects viral chromosomes into the nucleus. Human herpes virus has a similar virus-specific process for nuclear entry.
The existence of a functionally analogous, all these proteins are nucleus thus imposes many molecular and distinct in sequence from those of evolutionary constraints on eukaryotic DNA prokaryotes (see Forterre). Thus the viruses. One such constraint may require replication control proteins in eukaryotes eukaryotic DNA viruses to coat the DNA in generally lack close prokaryotic homologues order to protect their genomes for passage and are not part of the universally conserved through the . Yet, as noted above, set of proteins found in all domains of DNA integration is very common in both cellular life. However, essentially all of Bacteria and Archaea DNA viruses, but these eukaryotic replication proteins can be uncommon in eukaryotic DNA viruses so clearly found to exist as identifiable homologues some molecular constraints also appear to exist within various DNA phage and eukaryotic within the nucleus as well. DNA viruses. The most distinctive of these replication proteins are those that are RNA TRANSCRIPTION AND SPLICING involved in the very tight control of the AND THE NUCLEUS. The eukaryotic nucleus initiation of DNA replication (the Origin contains three classes of DNA dependent RNA Recognition Complex, ORC). Eukaryotic polymerases (pol) that lack compelling ORC proteins have no direct prokaryotic homology to the RNA polymerase used by any homologue. Interestingly, the only prokaryote. Although there exist some prokaryotic ORC system that clearly consensus sequence similarity within the resembles that of eukaryote are the ORC catalytic core of the two largest subunits of all proteins of the lysogenic prophage such as DNA dependent RNA pols, this homology is lambda, whereas the lytic phage, such as T4, mainly structural and cannot be seen at the lack these homologues. amino acid sequence level, suggesting it results from . 
Thus the Eukaryotic viruses appear to tightly adhere transcriptional enzymes are distinct for to these same basic molecular characteristics prokaryotes and eukaryotes. Also, the products and strategies that are used by the eukaryotic of these RNA polymerases must frequently nucleus, except for the notable need of DNA undergo post-transcriptional modification (such viruses to overreplicate acute viral genomes, as splicing) prior to functioning as mRNA, tRNA 69 or rRNA in the cytoplasm. This poses unicellular algae, do have both of these another dilemma for the origin of the characteristics and frequently code for and nucleus. In order to prevent mis-translation conserve spliced RNA of various types, of mRNA or prevent unspliced tRNA and including splicing of coding sequence, such as rRNA from entering the cytoplasm, the that of thymidylate synthase in T4. nucleus must separate the transcription and processing of RNA from cytoplasmic 5’ and 3’ RNA MODIFICATION AND THE transport. In fact, it would seem that the NUCLEAR MEMBRANE. Eukaryotes will 5’ nucleus would need to have existed first, in cap their mRNA with 7-methyl-G and add poly- order to allow the evolution of events, such A sequences to 3’ ends. Although some bacteria as splicing of translated mRNA sequences. can also attach short 3’ poly A tails to mRNA, Furthermore, eukaryotes will splice the pre- bacteria use a poly (A) polymerase that is mRNA of coding sequences via complex distinct from the eukaryotic poly (A) protein based spliceosomes, whereas polymerases as only the eukaryotic ones are all existing prokaryotes do not splice within members of the polymerase beta superfamily. In coding regions or use spliceosomes, addition, bacterial polyadenylation of mRNA suggesting that this RNA processing did not decreases its chemical stability and does not evolve first in the progenitor prokaryote. increase mRNA half life as it does in all Thus it would seem logical that the eukaryotes. 
The resulting 5’ and 3’ modified progenitor eukaryotic cell first needs to eukaryotic mRNA is then transported through invent the nuclear membrane in order to nuclear pore structures which reside on the allow the evolution of introns, at least for nuclear membrane (NE membrane) or cage. introns for at least within coding regions. This membrane is itself distinct form the plasma Three types of splicing are known, one membrane and is dissolved after S phase and (Group I) are self splicing, mobile elements subsequently reformed at late anaphase/telophase and often code for a DNA transposase of the cell cycle. No such division associated protein. Group II introns code for an RT- membrane dissolution/reformation process is like protein and although they can be found known for prokaryotes. In addition all of the in the phage and some tRNA genes of complicated molecular modifications of mRNA cyanobacteria, they too are absent from most and nuclear RNA between prokaryotes and prokaryotes. These grroup II introns are eukaryotes are highly conserved in Eukaryotes., thought to have originated in the RNA but absent from any prokaryote Thus these world. All three of these intron systems are traits of mRNA modification appear to have been mainly absent from prokaryotic cells (but rapidly do novo acquired during the evolution of not prokaryotic viruses). Furthermore, the nucleus and cannot now be identified in any genomic analysis now suggests that bacteria existing prokaryotic cell. have never had introns in any of their coding genes. Curiously, in chloroplast, which are considered to have originated from Thus we cannot identify the prokaryotic cell that symbiotic prokaryotes, cytosolic GAPDH might have symbiotically provided the protein have an intron in similar location to eukaryotic nucleus. This leaves us with several nuclear gene, suggesting intron invasion unsatisfactory options. 
One option is that the from the nucleus to the plastid, after plastid progenitor single cell life form to the eukaryotic colonization. Thus no bacteria has the nucleus must represent a distinct order of life molecular characteristics for either RNA from Bacteria and Archaea, but also that all polymerization or splicing that would make members of this order have become extinct, in it looks like the possible progenitor to the spite of phylogenetic evidence that suggest that eukaryotes. However, as discussed below, genes unique to this putative predecessor are as prokaryotic phage and DNA viruses of old as the Bacterial lineage itself. Thus the only 70 surviving cellular descendents of such cells bacterial Fts z protein structurally resembles would be the current Eukaryotes. Another tubulin, has some low but discernable sequence even less appealing possibility is that the similarity to tubulin and can it be assembled into complex molecular distinctions of the tubular sheets, this idea seems viable on the Eukaryotic nucleus all arose at the time of surface. However, closer examination of this the symbiotic fusion of the two progenitor hypothesis shows major problems with it (see prokaryotic cells and that evolution rates below). For this to have happened, resulting in a underwent a major acceleration at that time, complex and distinct molecular system for resulting in a huge increase in genetic chromosome segregation, the scale of change novelty. For example, certain bacteria, such needed is well beyond that which can be as Agrobacteria tumefaciens, are now explained by any existing process. These single known to host more then one chromosome examples involve so much complex change, that and sometimes also host large linear they defy explanations that are based on accepted plasmids. These plasmids appear prone to Darwinian processes, such as genetic mutation, acquisition of new genetic information duplication, and recombination. 
In the specific (prophage and addiction modules). Thus, a case of the bacterial Fts z protein, this protein multi chromosome system with increased sequence is well conserved in all prokaryotes, rates of evolution might have developed but is very different from the tubulin sequence in from such a progenitor. Parenthetically, eukaryotes. Thus it must have assumed a role in some spirochetes have second linear eukaryotes that is different from its conserved chromosomes that show clear relationship to role in prokaryotes. None of the prokaryotic viral genomes. However, it would be lineages can be identified as a predecessor of the necessary not simply to increase evolution eukaryotic tubulin. One would need to propose rates, but to massively accelerate evolution that at the time just after the prokaryotic nuclear to allow the development of all the other symbiosis, the rate of adaptation and genetic eukaryotic molecular traits characteristic of change was transiently much greater then it the nucleus. If such a massive increase in currently can be measured to be. However evolution could be attained, this idea might proposing such a transient but enormous increase also explain the great genetic change and in the rate of evolution following fusion of the massive genetic morphing that must have progenitor cells after almost two billion years of also led the predecessor bacterial stable prokaryotic life on earth is problematic. mitochondrial genome to transfer many How and why could this happen? The problems genes into the nucleus. posed by the origin of the tubulin system are actually even less daunting then the problems However, even considering only one of posed by the origins of the other eukaryotic these ‘rapidly evolved’ complex eukaryotic characteristics of the nucleus, such as pore traits still poses a major problem for known complexes, replication transcription and splicing mechanisms that could have created such systems. 
Thus we are left to chose from several large scale genetic novelty. The highly very improbable scenarios. Finally, there are conserved mitotic spindles and their also other distinctions that also require associated tubulin pose one such large explanations, such as that eukaryotes frequently dilemma for this idea. How did the tubulin have diploid (or sometimes polyploid) based system for chromosomal segregation chromosomes, that eukaryotes generally have originate so quickly? The possibility has sexual meiotic reproduction involving hapliod been suggested that the bacterial Fts z gemetes (the soma-germ line dichotomy) and protein, involved in chromosome also the existence of specialized cells, all segregation, might have evolved to become features with no clear prokaryotic counterpart. the microtubular proteins involved in Thus we come to understand the depth of the chromosome segregation. Since the dilemma of explaining the nucleus and hence the 71 conclusion of Smith and Szatharey that we have contributed to the evolution of the lack a sensible scenario to explain the origin eukaryotic cell? of the eukaryotic nucleus. A VIRAL ORIGIN OF THE NUCLEUS: DO THE CYTOPLASM: DILEMMAS ASIDE VIRUSES HAVE ENOUGH GENES? We FROM THE NUCLEUS. In additional to shall now consider the possibility that a complex the distinctions of the eukaryotic nucleus DNA virus was involved in the symbiotic origin and its associated chromosome structure, of the eukaryotic nucleus. This possibility, other important differences outside of the although it has been proposed on several nucleus also differentiate prokaryotes from occasions, has been essentially ignored or eukaryotes. Prominent amongst these is dismissed in most earlier reviews of the topic of the existence of the endoplasmic reticulum nuclear origin (see Margulis). Therefore, a more and the golgi complex, a complex system of detailed evaluation will be provided here. 
internal membranes involved in protein Simply stated, the hypothesis is that the processing, modification and transport. predecessor of the nucleus is derived from a Also distinct and absent from prokaryotes large membrane bound DNA virus that are the cytoplasmic role of tubulins which in persistently colonized a prokaryotic host cell. addition to chromosome segregation are also This colonized host lost its cell wall (resembling involved in spindle and microtubule a phage conversion event) and subsequently, the formation and participate in several basic virus acquired many of the prokaryotic genes cytoplasmic processes such as motility, cilia (mainly metabolic and translational system and flagellin function. Eukaryotes also have genes) into the proto-nuclear chromosome a complex cytoskeleton and actinomyosin. (similar to the acquisition of transposons and Another issue is the distinct nature of the accessory genes by parasitic bacterial plasmids). eukaryotic translational system. The various This view corresponds to the Viral Origin differences between the eukaryotic and hypothesis. As a corollary, this hypothesis prokaryotic translational systems indicate would also argue that there never existed a free that the eukaryotic translational system living progenitor cell to the eukaryotic nucleus could not have come from one Prokaryotic and therefore that the reason that the eukaryotic source but instead appears to be a mosaic of lineage appears old is because the viral lineages prokaryotic ancestors. There does not exist that created it are themselves old (setting aside an obvious precursor prokaryotic cell which the problem that high virus evolution rates can could have provided all of the above system confound evolution studies). The idea, however, or even one specific lineage which could that a large cytosolic extrachromosomal DNA have provided the origin of the translation virus could have provided all the genes needed system. 
for eukaryotes is generally met with skepticism. How could a relatively small genome of a DNA From the perspective of a virus, it seems virus have been able to provide all the genes clear there are major distinctions between needed to create the eukaryotic nucleus? One prokaryotic and eukaryotic as a molecular point to consider along these lines is that habitat and hence the viruses that infect following a successful and permanent host prokayotes and eukaryotes are similarly colonization, virus transmission would distinct. Eukaryotic viruses must be adapted subsequently occur through the host cell to the much more complex nuclear and reproduction, so that packaging constraints that cytoplasmic structures of the eukaryotic cell. would have previously been necessary for Yet some of these eukaryotic DNA viruses assembly into an infectious virion would be lost, have clear phylogenetic relationships to allowing an increase in the gene content. viruses of bacteria. Might these viruses However, although this proposal might allow a rapid acquisition of genes into the protonucleus, 72 the colonizing virus would still need to was proposed in 1998 by C. Woese. In this provide a substantial number of complex proposal, LUCA is not a discrete organism but is and interacting genes at the start. The gene instead a pool of exchanging genetic elements, content of a large DNA virus ranges only due to high rates of the lateral transfer of DNA. from 150 to over 900 genes. The largest are As such a LUCA would not be a specific entity, dsDNA viruses that infect bacteria (670 kbp, it can be thought of in terms of high ‘genetic’ B. megaterium phage), algae (560 kbp temperatures early in evolution of cells, which Pyramimonus phycodnavirus) and most had not ‘cooled’ or crystallized into organisms recently amoeba (670 kpb mimivirus). with specific genomes. 
Therefore LUCA would Interesting that these viruses of algae and have been rather a diverse community of cells amoeba show some clear relationships to exchanging DNA. Although not addressed by each other. This viral gene content, Woese, it is obvious that such a LUCA can also although large by viral standards, might be thought of as necessarily including viruses, seem inadequate to have formed the genetic since they would represent the originators of the basis of all eukaryotes. It should be noted transferred DNA involved and also represent the though, that these viruses code for more main driving force of ‘lateral’ gene transfer or genes then the genomes of the smallest genome colonization as we have considered in cellular organisms (mycoplasms). However, chapter 3. However, the concept of ‘lateral’ this is still far smaller then the genome size gene transfer between cellular organisms raises of most free living bacteria (about 2,000 some problems concerning the ultimate origins genes), let alone the more complex genome of genes that are considered below. of a eukaryote. POSSIBLE PROTOVIRUSES PREVAILING A Viral LUCA. Recent sequence analysis AT THE ORIGIN OF THE NUCLEUS: Let us of whole genomes of numerous prokaryotic now consider the possible viral candidates that and eukaryotic organisms indicates that the might have been the progenitor to the nucleus. If number of genes conserved amongst all life we assume that viral strategies are both old and forms is surprisingly small. That number is stable during evolution (such as the tailed phage thought to correspond to those genes in which appear to predate the Archaea-Bacteria common to the Last Universal Common divergence), we might be able to identify Ancestor (LUCA) and still found in all candidate contemporary viruses classes from cellular life (prokaryotes and eukaryotes). existing prokaryotic or unicellular eukaryotic Current estimates are that LUCA consists of populations. 
Cyanobacteria appear to have only about 324 genes. Ironically, this evolved just prior to the evolution of the first conserved LUCA gene set does not include eukaryote. We have presented the arguments the proteins that replicate the DNA genome, (Ch. 1) that persisting viruses, rather then acute which might have been considered as viruses are the most likely sources of new fundamental and in common to all life. genetic entities that can become stably associated Thus the gene content of LUCA is well with their host. By examining viruses of within the range of a large DNA virus. A cyanobacteria and their closest eukaryotic viral genome as the nuclear progenitor, relative, e.g. unicellular algae, we may identify would be expected to have its own distinct, these possible proto-nuclear viral agents. The non-cellular viral based lineage of evolution. large DNA viruses that infect unicellular algae As a virus, it could also provide a much high (phycodnavirus/chlorella virus) shows clear rate of evolution then present in prokaryotic relationships both the bacteriophage and large genomes, and could explain both the rapid DNA viruses of mammals. Because this family rate of early eukaryotic evolution and the of virus has clear links to both prokaryotes and current absence of a progenitor prokaryotic eukaryotes, they could be of central importance, cell. A competing and perhaps related idea in the evolution of eukaryotes, although clearly 73 extant viruses may also have developed and However, Sogin (’91) noted that a tree based on diverged after the nucleus was formed rRNA (not proteins) places eukaryotes at root of Eubacteria. However, this placement is very What characteristics might be needed for the difficult to be certain of due to the low proto-nuclear virus? To generate the confidence in the statistical analysis. 
One existing nucleus, we might expect a large suggestion is that archebacteria are secondary dsDNA virus with a linear possibly multiple and specialized, not primary and ancient. Protein DNA segments or alternatively multi ori tree analysis suggests that Eukaryotes and DNA genome with eukaryotic-like telomer Eubacteria are sister groups. This seems to ends. The virus would be non-lytic, yet indicate an old eukaryotic lineage but leaves code for it own viral specific DNA unclear the issue of the likely progenitor replication and transcription proteins that are prokaryotic host for possible viral proto-nuclear clearly related to those of eukaryotes, not colonization. prokaryotes. The virus should be membrane bound (preferably a double membrane) and Cyanobacteria would seem to be a likely its chromosomes should be symbiotic source of chloroplast and stoichiometrically coated with small basic mitochondria, which also implies that they may polymers, histone or histone-like proteins. have colonized eukaryotic host after the This virus should be able to process RNA generation of the nucleus. In fact evolutionary (5’ capping, 3’ poly-A addition, splicing and evidence suggests that the colonization by transport RNA through membrane bound chloroplast of eukaryotes may have occurred pore-like structures). The proto-virus would many (over 30) times due to the diversity of C3 probably be a non-integrating virus, but with and C4 photosynthesis systems found in higher transposases or other DNA mobilization plants. As cyanobacteria are now thought to enzymes that allow acquisition of host have evolved relatively near but prior to the genes. It would have mechanisms origin of eukaryotes and as they also seem to (preferably tubulin based) to segregate and have contributed to the origin of the package viral chromosomes and lead to the chloroplastids, it is worth considering what type evolution of the tubulin system. 
Finally, the of contemporary viruses are known to infect viral persistence and/or reactivation must be cyanobacteria as possible candidates for the compatible with cellular differentiation, proto-nuclear virus. Cyanophages are clearly mitotic replication, gamete formation and related to bacterial phage and both lytic and sex. On the surface, this might seem like we lysogenic versions of these viruses are abundant are asking for way too much genetic in the oceans. Cyanophage CPS1 and CPS2 as complexity of our protovirus. However, well as S-PM2 show close similarity to capsid surprisingly, all these characteristics can be assembly proteins of T4 phage. S-PM2 also found in viruses. encodes a T4-like gp49 recombination endonuclease protein. It is worth re-stating that In terms of predecessor prokaryotic host the morphogenesis of T4 phage and cyanophage cells and possible symbiosis, several is exceedingly similar to the morphogenesis of possibilities are apparent. It was previously Human Herpes Virus I, which strongly supports suggested by Lake and Rivera (’94) that a a common lineage between these evolutionary gram negative Bacterium may have engulfed distant viruses. In addition, this group of cyano- an Archebacterium and that this phage codes for viral specific DNA polymerase, Archaebacterium then evolved to provide and RNA polymerase. However, like the T even the source of the nuclear DNA replication phage, (but unlike bacteria) these viral encoded system. That a bacteria that had lost its cell DNA and RNA pols clearly resemble those of wall might be the progenitor was first eukaryotes. More recently, and rather proposed by Cavalier and Smith (’87). surprisingly, S-PM2 has also been shown to 74 encode the two genes that are central for from infected cells, thus they not only have the photosynthesis (D1 and D2). It is thought desired membrane, but also a membrane that these viral specific genes may allow associated export systems. 
Their genomes are phage infected cyanobacteria (Synechoccus) exceedingly unique and can be highly diverse, to overcome the excess-light mediated and most viral proteins, including replication damage (photoinhibition) that occurs to the proteins, are unique to these viral lineages. Thus photosynthetic complex in the host bacteria these families of viruses represent a very large (discussed further below). Thus this extant and dynamic source of genetic novelty. In terms family of cyanophage could provide a good of specific viral characteristics, the recently starting point for the origin of the eukaryotic sequenced AFV-1 is especially noteworthy. nucleus. However, these viruses are mainly This linear dsDNA virus uses eukaryote-like lytic agents with no membrane thus they TATA promoters to regulate transcription, but of seem to lack some of the other essential even more relevance, it has small direct repeat nuclear components. sequences on the ends of its DNA that are very similar to the telomer sequences at the ends of Archaea, which includes the extreme eukaryotic chromosomes and unlike any telomer- thermophiles and halophiles, such as like sequence yet found in any prokaryotes. Sulfolobus sp. are also important to consider Furthermore, phylogenetic analysis indicates that as sources of possible proto-nuclear viruses. the AFV-1 genome is basal to and likely Based on phylogenetic analysis of 16S ancestral to the major groups of eukaryotic large rRNA sequences, some have proposed that DNA viruses, including the phycodnaviruses Archaea cells may be more related to the (Chlorella virus), the poxviruses and African Eukaryotes then are Bacteria. Perhaps more Swine Virus (an insect transmitted DNA compelling is the observation that the amino virus). This relationship is of special interest (as acid sequence of Archaeal E1-1a contains presented below) since these viruses appear to an 11 a.a. 
insert present in that of Eukaryotes, but absent from Bacteria. Thus there is reason to think that Archaea might be the most likely source of a potential proto-nuclear virus as well. As presented in Chapter 3, Archaea are known to host distinct viruses relative to all other prokaryotes, with especially distinct morphologies, such as ADTV with a double tail or the droplet shaped SIRV1. Overall, these viruses have many of the desired characteristics of our ideal proto-nuclear virus(es). This includes a general and strong tendency to establish non-lytic chronic and persistent infections with dsDNA viruses. Furthermore, these infections are highly prevalent and are often mixed infections of mainly viruses with linear dsDNA genomes, so the maintenance and coordination of complex sets of persisting linear dsDNA genomes is very common in these hosts. In addition, many of these viruses are enveloped and are continuously extruded

have essentially all the desired characteristics for a proto-nuclear virus.

In addition to AFV-1, other archaeal viruses and cells have interesting characteristics worth considering. For example, SIRV1 also has a linear dsDNA genome but with covalently closed ends, a feature of the large eukaryotic DNA viruses that is not typical of phage. Like other viruses of Archaea, this virus is not lytic, so its genetic capacity to persist provides a good starting point for the possible evolution of the nucleus. Possibly more interesting are the TTV 1, 2, 3 and 4 viruses of Crenarchaeota. These viruses have linear dsDNA genomes with stoichiometrically bound and highly basic DNA binding proteins. Thus they might provide a molecular basis for the evolution of eukaryotic chromatin. In addition, the capsids contain both internal and external lipid envelopes, and both temperate and lytic versions are known. Therefore, the viruses infecting Archaea have many, if not most, of the features that would make these agents attractive candidates to have contributed to the origin of the proto-nucleus. Recently, a new order of Archaea has been proposed called nanoarchaea. These are very small cells (N. equitans, 490 kbp genome) that appear to live as symbionts on the surface of larger ‘mother’ archaeal cells (Ignicoccus). This observation suggests the existence of Archaea with genomes as small as that of some large DNA viruses, but which persist on the surface of other cells. The relationship of these cellular genomes to each other or the potential participation of persisting genetic parasites in this relationship has not yet been evaluated.

Mycoplasma, since they lack cell walls, are often thought of as the most likely source for the host cell that was colonized by the proto-nucleus and led to the development of eukaryotes. Mycoplasma virus L2 is a quasi-spherical enveloped virion containing circular double-stranded DNA. This virus family can show viral DNA integration into the host cell genome. However, L2 infection of Acholeplasma laidlawii host cells leads to an episomal, noncytocidal productive infection cycle in most infected cells, with the possible involvement of two origins of DNA replication. Virus early expression is followed by establishment of lysogeny in all (or most) infected cells. This cytosolic characteristic would be good for the putative proto-nuclear virus. However, the small size of L2 DNA (11,965 bp) and the absence of an extensive virally encoded DNA replication and transcription system seem problematic for this family of virus to have become the nucleus on its own.

BACTERIAL VIRUSES WITH BROAD HOST RANGE: PRD1 virus (related to phi-29, see below) is an intriguing candidate for the putative proto-nuclear virus. It has a broad host range and is a dsDNA virus with an internal membrane. P1 is a PRD1-related tailed polyhedral virus, also with broad host range, but is a virus of Mycoplasma pulmonis. P1 has a linear dsDNA genome with inverted terminal telomere repeats. This virus has many of the needed characteristics for a protovirus, such as a virally encoded DNA dependent DNA polymerase, DNA dependent RNA polymerase, ssDNA and dsDNA binding proteins, plus an internal membrane. It is especially interesting that the viral DNA pol and RNA pol are more similar to the eukaryotic counterparts than are the related prokaryotic genes. However, this family of virus follows a mostly lytic or acute life cycle, so it seems to lack the constellation of genes needed for stable host colonization. However, as indicated below, there are other related phage in sporulating B. subtilis that will latently infect spores, leaving this family of phage open as a strong candidate for the proto-nuclear virus.

With respect to extant viruses of prokaryotes, we cannot now be certain which of these viruses might be most related to the putative proto-nuclear virus. Several strong candidates have been identified. Thus it seems clear that viruses infecting prokaryotes still retain most of the features that would be required for this germinal role. However, another way to evaluate this viral origin hypothesis is to examine existing cytoplasmic eukaryotic DNA viruses to see if they retain basal characteristics expected of a proto-nuclear virus.

THE BEST STUDIED EUKARYOTIC DNA VIRUS: The best characterized cytoplasmic DNA viruses are vaccinia virus and the other related members of the poxvirus family, as well as some members of the insect iridoviruses. These viruses have a multiple membrane with an internal core structure containing a viral chromatin. Thus a multiple membrane arrangement is inherent to and conserved amongst these viruses. The virus loses the outer membranes after entry but retains the viral core structure. These membrane-less core structures will re-acquire membranes later in the virus life cycle and clearly resemble mini-nuclei. Furthermore, these core structures have within them a viral DNA dependent RNA polymerase which will polymerize and transcribe mRNA, modify it with a 5’ CAP and 3’ poly A, and extrude it into the cytoplasm through as yet uncharacterized structures. Interestingly, another primitive member of the large DNA virus family (i.e. the only DNA virus that can infect both insects and mammals) is African Swine Fever Virus (ASFV). This virus also codes for a DNA dependent RNA polymerase. However, phylogenetic analysis indicates that this ASFV RNA polymerase is basal to all three classes of eukaryotic DNA dependent RNA polymerases. It is worth emphasizing that no prokaryotic DNA dependent RNA polymerase is a member of the eukaryotic RNA pols I, II or III, let alone basal to all three eukaryotic classes. Only this virus seems to hold that basal position. These cytoplasmic DNA viruses also code for enzymes that CAP the 5’ end of mRNA. Similar to the story with ASFV RNA pol, the ASFV capping enzyme, as well as the capping enzyme from PBCV-1 (a DNA virus of chlorella-like unicellular algae, discussed below), have both been shown by phylogenetic analysis to be basal to the mRNA capping enzymes of all eukaryotic cells. These two viruses also code for a poly-A polymerase which will attach poly-A residues to the 3’ ends of mRNA. The extrusion of the mature vaccinia mRNA into the cytoplasm occurs via an ATP dependent process through as yet undefined exit structures on the viral core membrane.

Vaccinia mRNA becomes tubulin associated soon after synthesis, and this mRNA is also associated with the ER. This association is involved in the cytoplasmic translation of viral structures. It is especially intriguing that the vaccinia viral core structures become wrapped in ER derived membrane, following which the synthesis of viral DNA ensues within these mini-nuclear membrane bound structures. DNA synthesis initiates from viral telomere end repeat sequences and can result in long, multi-ori concatenated DNA structures. Resolution of these concatenated structures occurs via the viral telomere and must occur prior to the packaging of a single chromosome into new virions and the subsequent tubulin mediated transport of viral structures to the plasma membrane. This transport involves newly ER-wrapped cores, which become attached to tubules in order to move to the plasma membrane of the cell. At this point, DNA synthesis stops as the virion becomes membrane unwrapped (probably via the action of viral kinases). A maturation of the virion structure then occurs in which it acquires a second plasma membrane, then an association occurs in which an actin polymerization dependent motility system moves the virion to exit the cell at the plasma membrane and infect nearby cells. Thus DNA synthesis is directly linked to membrane acquisition and subsequent membrane loss (resembling S-phase), and the daughter chromosomes/cores are transported via tubulin action.

The resemblance between these poxviral processes and the activities of a cycling eukaryotic nucleus is striking and clear. This similarity encompasses most of the events and mechanisms that are characteristic of a mitotic nucleus. Viral transcription is fundamentally segregated from translation. The viral DNA is linear, chromatin associated and has telomeres. The mRNA undergoes host-like 5’ and 3’ processing and is exported. A dissolvable multiple membrane is associated with the synthesis of viral DNA. Viral proteins bind and affect both tubulin and actin polymerization and mobilization function, so it is clear that viral genes are directly involved in and dependent on motility. This tubulin associated transport of immature viral cores is also associated with the resolution of multiple viral genomes and the acquisition of a second membrane during viral maturation. These events and processes encompass most of the features that distinguish eukaryotes from prokaryotes and seem to add a powerful argument for the hypothesis that viruses could have provided the origin of all these functions. However, it might be counter argued that the selective pressure on a eukaryotic DNA virus would lead to adaptations in which viral molecular strategies resembled those of the host, and hence this similarity between virus and host could be evidence for virus-host co-adaptation or convergent evolution. Yet we know of clear examples in which these same viral processes differ fundamentally from those of the host, including protein-capped 5’ mRNA, or the completely distinct DNA polymerase and DNA synthesis process of Adenoviruses from that of the host, or the existence of single stranded viral templates for transcription and replication. All of these other (non-nucleus-like) distinct viral strategies are equally old according to phylogenetic analysis. Clearly viruses do not need to be host-like to function properly, even to perform host-like functions. Other eukaryotic DNA viruses that use host-like DNA replication processes can differ from their host, hence the existence of many examples of highly conserved core viral replication proteins, such as the T-Ag of polyomaviruses (or early genes of all papovaviruses), which have no cellular analogue. This leaves us with the question of why the cytoplasmic DNA viruses are so similar to their host in all these mechanisms. Two other points should now be made. First, a proto-nuclear virus offers a solution to the dilemma of the origin of the nucleus. If we eliminate the possibility that viruses were able to provide the origin of the nucleus, we also eliminate the solution to the dilemma of the missing symbiotic cellular progenitor of the nucleus. And second, the viral proteins involved in the host-like processes generally appear to be basal to those of the host, as described below.

That various viral genes (DNA pol, RNA pol, capping, etc.) are phylogenetically basal to those of all eukaryotes does not convince everyone that the viruses were the progenitors of these genes. After all, the ability of a virus to acquire host genes is known, so the old and popular argument that viruses ‘steal’ host genes (especially accessory genes) might account for a virus with host-like genes. We have previously mentioned this issue in chapter 1, but suffice it to say that viral genetic creativity is vast and unsurpassed by any other life form. And, as we noted for phage evolution, new viral genes tend to originate from other viral elements, not host genes. There is the technical problem that, due to the much higher rate of virus evolution relative to that of the host, it can be difficult to be certain of the relationship between virus gene evolution and that of host genes. Yet we know numerous examples, especially in persisting viruses, in which virus and host gene trees are highly congruent, indicating similar patterns of co-evolution. In addition, in spite of prevailing views to the contrary, phylogenetic analysis indicates that there are few (or no) clear examples of viruses acquiring core genes from host sources (such as the T-Ag example mentioned above). Almost all these viral core genes are of an ancient origin and their lineages are generally monophyletic. Phylogenetically, when present, the core viral replication and transcription proteins are as well conserved amongst different viral lineages as any viral gene. This conservation is especially true for virally encoded DNA pol, PCNA, RNA pol, and mRNA capping enzymes of DNA viruses. It is in fact the conservation of these ‘core’ genes that is used to construct the phylogenetic relationship of DNA viruses, and in the case of the DNA pol gene, will link eukaryotic DNA viruses to prokaryotic DNA viruses. Vaccinia DNA polymerase, for example, most closely resembles that of phage T4. Furthermore, some viral lineages are clearly very old; we have mentioned the example of herpes viruses, which still show clear relationships to the T-even phage. Yet the Herpes lineage (and poxvirus) is paraphyletic to the host and does not stem from that of the host. Taken together, these observations strongly argue that these large DNA viruses are not derived from rogue host replication systems and establish that they have the evolutionary and genetic capacity to have been the origin of the eukaryotic nucleus.

NUCLEAR PORES. The nuclear pore structures of the eukaryotic nucleus pose another significant dilemma for the possible prokaryotic origin of the nucleus. These large complex structures have no counterparts in the prokaryotic world. However, pore structures would not seem to be nearly as problematic for the viral origin hypothesis. In extant eukaryotic DNA viruses, we know that some mammalian DNA viruses, such as adenovirus and herpesviruses, will specifically dock onto the nuclear pore structure in order to allow nuclear entry of viral chromatin and initiate infection. Clearly these viruses are highly adapted to nuclear pore function and seem to use the pores as internal receptors. We have also noted a functionally homologous ‘pore’ function in that vaccinia cores will extrude mRNA from viral core/chromatin structures via an ATP dependent process. This indicates the existence of some viral based process that transports mRNA from the point of synthesis into the cytoplasm. In terms of prokaryotic viruses, the idea that viruses could have led to the creation of novel pore structures is not without evolutionary precedent. As we indicated in chapter 3, bacterial viruses frequently use various types of pores as toxins, which are also a component of immunity modules and which compel the host to maintain the persisting virus. In addition, bacteriophage (like lambda) use holins, small membrane proteins that accumulate on the membrane and then at a specific time program membrane permeability for the release of lytic virus. More than 100 distinct viral holin genes are known, which can be organized into over 30 orthologous groups, thus this system is highly diverse. Another related point is that the base plate at the tailed end of bacteriophage clearly resembles a pore in function. It is a highly complex multiprotein structure that attaches to the host cell receptor, generates a hole in the membrane and injects the viral nucleic acid. Furthermore, the proteins making up bacteriophage baseplates and receptors are probably the most diverse of all phage and bacterial proteins. Baseplate-receptor evolution in phage and prokaryotes is a highly dynamic process. It may even be more dynamic than previously thought. For example, recent reports of phage of Bordetella have identified a family of phage that has evolved a reverse transcriptase mediated system that can mutagenize the mRNA of the baseplate receptor gene, followed by integration of the gene, to generate sufficient diversity to allow the rapid adaptation of phage to new or altered hosts with distinct receptor proteins. This is a most remarkable phage RT based system for the generation of protein diversity; it clearly resembles that of vertebrates but must be much older phylogenetically. All this enormous amount of viral based gene diversity is used simply to make pores in bacterial cells in order to allow viral DNA entry, and suggests that viruses would make good candidates to have originated nuclear pores as well.

VIRAL MEDIATED COVALENT MARKING OF NUCLEIC ACIDS: One might pose the question about the origin of the viral enzymes that add modifications to viral RNA (5’ CAP, 3’ poly A). How might we justify the view that these processes first occurred as a viral, and not a cellular, mediated process? In many cases (such as the poly A enzyme of vaccinia), these viral enzymes show no similarity to host enzymes, so in those cases it cannot be argued that they could be of host origin. However, in other cases, such as Chlorella Virus 1 (CSV-1), the viral poly-A polymerase is similar but basal to that found in eukaryotes. Why would viruses modify their nucleic acids in such ways? As we have discussed previously in chapter 3, almost all viruses are known to covalently ‘mark’ their genomes, RNA and proteins with various types of chemical modifications. The most common amongst these is the methylation of various bases of DNA by viral specific methylases. However, it is also clear that viruses, even prokaryotic viruses, have employed non-covalent DNA binding or chromatin-like proteins (e.g. TTV of thermophiles) to differentially condense or mark their genomes. This marking allows other viral enzymes, often hydrolytic enzymes, to distinguish viral genomes and transcripts from those of the host, as well as to distinguish one virus from another, and allows the degradation and recycling of these molecules. Thus the idea that a virus might have marked its mRNA via 5’ capping and/or 3’ poly A addition fits well with known viral molecular strategies of genetic identity.

A SCENARIO FOR VIRAL ORIGIN OF SPINDLES AND TUBULIN: Let us consider a specific and significant molecular dilemma in understanding the origin of Eukaryotes; that is, the origin of eukaryotic spindles and tubulin from the perspective of a putative viral origin. Prokaryotes do not have a tubulin system, so it has been hard to see how this complex process evolved from prokaryotes. In fact, the tubulin problem is considered one of the major dilemmas for a prokaryotic origin of eukaryotes, thus we will now consider this situation in considerable detail. However, the details presented below are largely circumstantial and may present a burden for some non-expert readers. Thus those readers might choose to skip this tubulin section.

Cellular FtsZ. As mentioned, prokaryotes do have the FtsZ gene, which encodes a protein involved in prokaryotic chromosome replication and segregation. This occurs via what is believed to be a ring-shaped septum and membrane attachment. The structural solution of FtsZ indicates that it is physically very similar to tubulin and that it also has some regions of discernible but low sequence homology. So, it seems plausible that prokaryotic FtsZ and tubulin are related. Yet FtsZ is highly conserved in all prokaryotes but not similar to tubulin (less than 20% sequence identity), and tubulins are highly conserved in all eukaryotes. Thus we cannot identify a prokaryotic cellular progenitor (either functional or by sequence similarity) to tubulin.

Acute Phage FtsZ and chromosome segregation. However, bacterial viruses code for FtsZ-like proteins as well, such as phi-29 (a T7-like virus of B. subtilis). But the phage FtsZ-like proteins can have distinct functions from those seen in bacterial host cells. Furthermore, these phage proteins also display some biochemical and structural characteristics that make them appear to be more similar to tubulin than to the cellular FtsZ. phi-29 is a linear dsDNA phage with short terminal repeats and covalently attached terminal proteins. phi-29 also codes for very abundant small ssDNA binding and dsDNA binding proteins which are essential for DNA replication, as well as for a DNA polymerase (a type B pol) and a late expression DNA dependent RNA polymerase. In phi-29, the P1 gene encodes an early protein which is the analogue of cellular FtsZ. It is thought that the P1 protein may bring the viral DNA polymerase to the membrane at the ‘telomere’ ends of phage DNA. This is needed for the initiation of DNA replication and to attach the replication complex to the membrane for segregation. This phage protein is able to form much more ‘tubulin-like’ polymerized tubular and tertiary sheet structures than are observed with the host cellular protein. However, the phi-29 P1 is still rather different from tubulin and thus may not be the direct progenitor to tubulin. Yet, that prokaryotic viruses can code for such proteins clearly raises the possibility that tubulins may also have a viral (phage) based origin. phi-29 is a relatively small DNA virus (15 kb) and mainly lytic, so it does not appear to be a good candidate by itself to have evolved the eukaryotic tubulin structures. In addition, this phage clearly has a ‘non-host-like’ DNA replication system which resembles adenovirus in replicating DNA by a 5’-protein primed mechanism using a DNA polymerase that is related to that of adenovirus. phi-29 would not seem to be the likely direct progenitor to the tubulin system of the nucleus.

FtsZ and phage immunity. However, phi-29 P1 might well represent a remnant of how phage can link DNA replication to tubulin. There is

compelling evidence for the existence of a large number of unassigned members of phi-29 related Podoviridae, infecting a wide range of bacteria. In addition, there is strong evidence that FtsZ related proteins in such phage are directly important for virus persistence. For example, the host bacterial FtsZ protein is very frequently a target of various prophage immunity genes. Numerous eubacteria have been identified that have sequences related to the Kim region of lambdoid prophage, which codes for DicF RNA. This RNA is antisense to FtsZ, inhibits cell division and appears to be part of an addiction module (see Chapter 3 for the role of phage addiction modules in host evolution). Most often, temperate phage use anti-sense RNAs as an anti-toxin to a second stable viral death gene. However, some of these DicF-like RNAs do not affect bacterial host cell division. Thus the intended target for these RNAs may not be the replication of the bacterial host itself. It seems more likely that these RNAs instead target the replication of other persistent or acute viruses. Thus it seems more likely that they are involved in immunity and are under the selective pressure associated with immunity. In addition, the flanking regions of DicF RNA are similar to the immunity region of P4, further suggesting a role in immunity.

phi-29, FtsZ LATENCY AND SPORULATION. B. subtilis can produce sporulating bacterial cells, which are of special interest in evolutionary biology. Spores resemble the sexual gametes of eukaryotes (hence relate to the soma germ line dichotomy) as well as representing an early version of committed cellular differentiation that is otherwise absent from most prokaryotes, yet common in eukaryotes. In the case of phi-29 infecting B. subtilis, vegetative phi-29 replication will be inhibited during cellular sporulation and the virus will incorporate into sporulating cells in a latent state. phi-29 will re-express and replicate following germination of the latently infected spore. Thus latent infection by a DNA virus with many of the characteristics needed to have been the proto-nucleus, including a virally encoded FtsZ protein, is established. Furthermore, this latent infection is associated with characteristics that resemble both sexual reproduction and differentiation as seen in eukaryotes. However, phi-29 itself may lack the genetic carrying capacity to have been the sole or direct progenitor of the tubulin system of the nucleus. But various other temperate and pseudo-temperate (non-integrated) phage of Bacillus (SP-beta, SP15; which establish extended latent infections) are known to have much larger genomes (up to 385 kb) and are frequently super-immune to other phage. As discussed in Chapter 3, some interactions between acute phi-29 and these latent phage are very likely to provide mechanisms able to suppress phi-29 replication and provide the missing mechanism of persistence, as well as other features, such as a membrane and additional genes. Although these non-lytic B. subtilis phage are not well studied, some pseudo-temperate phage of B. subtilis (that continue to make virus) are known to express viral genes that increase cellular sporulation frequency and, intriguingly, also express insecticidal proteins. Clearly these viruses are manipulating and compelling basic host cell differentiation programming as well as providing a survival advantage to latently infected bacterial-plant symbionts. Because these phage need to maintain a latent infection for survival, their fitness is linked temporally to spore cell survival and germination.

DEFECTIVES, VIRAL DEFENSE AND TUBULIN ORIGINS. Prophage, however, such as those that encode DicF-like RNAs, are frequently defective, hence it is often considered that they no longer function as a virus and that viral issues are not involved. However, as presented in chapter 3, we have noted important examples in which a seemingly inactive or defective prophage can strongly enhance viral persistence and also affect the outcome of host evolution and survival from other acute and

persistent virus infections. The example of the P4 defective virus and the non-defective P2 is a case in point. Given the high prevalence of prophage-like genetic elements that encode DicF-like RNAs, it thus seems likely that FtsZ (and the defective prophage that encode it) may also be under selective pressure by competing or latent phage. The phage FtsZ gene (and its antisense) may identify elements of an addiction module, whose purpose would be to ensure the continued prophage colonization of its host. Thus we can propose the possibility that some distant phi-29 like virus was able to create a novel version of the FtsZ gene. This tubulin-like gene could have resulted in a more efficient system for the extra-chromosomal persistence and segregation of linear viral chromosomes. Eventually, this host colonization became permanent, resulting in a superimposed viral mediated system for chromosomal replication and segregation. Eventually, eukaryotic tubulin evolved from this viral system.

VIRUSES OF MICROALGAE: As we noted above, the viruses that can now be commonly found in unicellular eukaryotes, such as microalgae, are of special interest in the evolution of eukaryotes in that they might shed light on the relationship of large DNA viruses with the eukaryotic host and its evolution. Microalgae constitute the earliest representative of a eukaryote for which there exist clear sedimentary fossil data concerning their early origin. Microalgae are both free living and also exist as endosymbionts or zoospores of other species, such as paramecium. Microalgae are abundant, and it is estimated that as many as 100,000 species of marine algae exist, which would contribute up to 40% of planetary photosynthesis, contributing about 10^12 tons of cell wall per year, or 10^11 tons, to the biosphere. Microalgae therefore represent a substantial part of the Earth’s biomass. Bacteria-like DNA viruses are known to exist for many species of unicellular algae. Micromonas pusilla is the best studied free living microalga, which has a simple sexual cycle. Algae of the genus Chlorella are the most widely distributed and frequently encountered throughout the water habitats of earth. Chlorella species undergo mitotic division, and most are free living, but endosymbiont versions called zoochlorella are also well known. Chlorella species have cell walls made of liposaccharide that chemically resembles that of gram negative bacteria. Chlorella have mitochondria, golgi and ER and are photosynthetic, containing a chloroplast. Viruses are known for over 44 taxa of eukaryotic algae (sometimes referred to generically as Chlorella viruses). Similar to phage but unlike DNA viruses of animals, the Chlorella virus virion remains external after injecting viral nucleic acid into the host cell. The virion is not taken into the cytoplasm, in contrast to essentially all other eukaryotic viruses. The Chlorella 16S RNA of both plastids, chloroplast and mitochondria, is more similar to that of cyanobacteria and purple bacteria than to that of other organisms, strongly suggesting that these plastids were both derived from symbiotic free living photosynthetic bacteria. However, plastid RNA genes appear to be composed of mosaics of particular bacterial lineages.

Phycodnaviruses are phage-like. Paramecium Bursaria Chlorella Virus (PBCV-1) is the prototype for the phycodnavirus family (Chlorella viruses). The chlorella-like algae that host PBCV-1 are both free-living unicellular algae and zoospores. As mentioned, zoospores are algae that live symbiotically within paramecium and other eukaryotic hosts (providing photosynthesis). Viruses of Micromonas pusilla, which is a free living microalga, are also known and well studied. The viral genomes are linear dsDNA (330,742 bp) with closed hairpin ends.

PBCV-1 encodes 376 predicted coding regions, 40% of which clearly resemble other known prokaryotic and eukaryotic proteins. Structurally, phycodnavirus virions resemble animal iridoviruses and show some sequence homology to iridovirus capsids. The great majority of viruses of microalgae are related large dsDNA viruses. However, a few RNA viruses have also been observed, such as a rod shaped RNA virus (TMV-like) reported in Chara corallina microalgae. Algal DNA viruses are similar to bacterial phage in many ways, although they tend to be generally larger and more complex than most bacteriophage. Unlike many eukaryotic viruses, PBCV-1 has a high pfu to particle ratio (25-50% of particles are infectious), indicating that these viruses undergo efficient virion assembly. In this characteristic, they are more reminiscent of bacteriophage, which also show high pfu to particle ratios. Also like phage, phycodnaviruses generally have high levels of methylated DNA bases. An additional phage-like characteristic is the ability of phycodnaviruses to digest an opening in the cell wall and inject the viral genome. No other eukaryotic virus appears to operate in this phage-like way. Also phage-like is that phycodnaviruses code for numerous restriction/modification enzymes. In fact, these restriction enzymes are the only example to date of eukaryotic restriction modification systems. Other phage-like features of phycodnaviruses include the presence of transposons, mobile introns and a phage-like DNA repair system. Yet in spite of all these similarities to phage, in many other respects phycodnaviruses are much more like eukaryotic viruses and eukaryotic hosts than they are like prokaryotes (discussed below).

NATURAL HISTORY OF MICROALGAE, PHYCODNAVIRUS AND SYMBIOSIS. In spite of much virological study, the natural history of phycodnaviruses and their microalgae hosts is not well understood. Particles that closely resemble phycodnaviruses are exceedingly abundant and can be found in surface waters of the oceans and freshwater at levels that range from 1-5 x 10^11 per liter. Studies of phycodnaviral populations have frequently been done with the aim of biological control of oceanic algae populations and their blooms. Such blooms can devastate other oceanic ecology by killing other species due to oxygen depletion. Clearly these viruses represent a major and natural constituent of the aqueous habitat. With respect to the symbiotic algae, it is interesting that the paramecium host of the algal zoospore may prevent phycodnaviruses from accessing and infecting the symbiotic algae, as these symbiotic algae are not susceptible to PBCV-1 infection when they are within their protozoan host. However, zoospore algae can frequently be grown as free living cells, on which PBCV-1 will grow and plaque. The natural biology of this relationship is not well understood in that it is not clear how algae colonize their paramecium host in nature. However, the evolutionary implications of a host species escaping acute viral parasitization by becoming engulfed by another and very different cell are very intriguing. If the engulfing cell is sufficiently different from the symbiont (such as lacking the same viral receptors), the engulfed cell would be surrounded by an alien cell type and be shielded from any acute virus. This relationship may well define a virus based pressure that can drive a virus susceptible host into an initially parasitic relationship within another cellular species simply to escape from prevalent acute viruses. This engulfment by another cell, if stable, could provide an initial selective pressure to initiate the evolution of a symbiotic relationship between the two cell types, without the need for one cell to provide a clear advantage to the other cell. Such a ‘virus escape’ idea might also apply to the origin of symbiotic eukaryotic plastids (such as chloroplast and mitochondria), which appear to have originated from free living bacterial organisms. These free-living plastid ancestors might also have been driven into the early ‘aplastid’ eukaryotic cell to escape lytic cyanophages, which are prevalent in the oceans. In support of this idea, plastid sequence data clearly show that they have nucleotide word frequencies that do not avoid restriction/modification or palindromic sequences. Yet all known free-living prokaryote genomes avoid such palindromic nucleotides. As restriction modification is a major bacterial system for immunity to or addiction by phage, all prokaryotes are under pressure by lytic viruses to maintain restriction/modification systems. Thus the lack of restriction word avoidance in plastids suggests that the selective pressure to avoid cyanophage viruses was absent after becoming engulfed. However, as discussed below in the section on fungi, DNA viruses that colonize mitochondria are known, and prevalent in some situations, but such viruses are not controlled by restriction/modification systems.

LIFE IN THE SUN: LIFE AFTER DEATH. Another very intriguing biological feature of phycodnaviruses and marine phage (cyanophage), with broad implications for evolutionary biology, is the ability of these viruses to respond to UV light damage. At the ocean’s surface, where the majority of the microbiological flora resides, UV inactivation from sunlight accounts for the most significant source of bacterial, algal and phage death and turnover. In addition, photosynthetic organisms, such as cyanobacteria and Chlorella green algae, will undergo photo-inactivation of photosynthesis. This occurs when excess light damages the D1 and D2 proteins of the photosynthetic reaction

very efficient light dependent repair machinery for UV damage. As mentioned above, they also encode phage versions of the D1 and D2 photosynthetic proteins that would presumably restore the photosynthetic capacity of phage infected cyanobacteria in excess light. Because these viral genes also have several mobile introns within them, they can be clearly distinguished from the corresponding host genes. With the phycodnaviruses, there also exist various genes that would aid infected algae in dealing with excess light energy. Several specific adaptations for repairing the damage to proteins and DNA caused by UV light are known. In addition, phycodnavirus replication itself can also be affected by light, such as with the virus of Micromonas pusilla, whose production is light dependent and which fails to induce severe disease in the dark.

Surprisingly, phycodnaviruses have been shown to be capable of replication even in a UV killed host cell, such as PBCV-1, which can infect and replicate at reduced but significant levels in damaged cells. This restored replication is due to the expression of various virus specific repair enzymes that can resurrect the capacity of the cell to synthesize macromolecules. A similar capacity has long been known for bacterial phage. Besides restoring the damaged cell, viruses are inherently much more resistant to UV killing than the host cell due to their much smaller genome target size. However, even the UV mediated inactivation of a virus does not necessarily prevent subsequent viral replication. This is because a UV-killed virus can still replicate due to a process known as multiplicity reactivation. Multiplicity reactivation will occur if there is a sufficiently high ratio (or multiplicity) of virus to host cells, such that one host cell is infected with numerous virions.
centers, resulting in decreased In this way, even if each of these virions has photosynthesis. The intense light levels that sustained a lethal UV hit in part of its genome, it can exist in the ocean’s surface have may still be capable of expressing some subset of strongly affected the genetic makeup of genes. If these expressed genes either lead to the oceanic viruses. Accordingly, most marine genomic repair and/or allows complementation viruses and phage appear to have a half-life of otherwise damaged genes, then the virus of less then one day, mainly due to intense replication will be restored by the combined light levels. To counteract this, action of the damaged parts. The selective Cyanobacterial phage (such as S-PM2) have advantage of such a ‘resurrection’ capacity 84 seems clear and large. It provides a selective precisely due to this circumstance in which the coordination of complementation/recombination capacity that otherwise defective genetic elements is some virologist worry that some otherwise strongly favored by being inherently unavailable viruses, such as smallpox, might be dependent on the complementation of these reassembled from subgenomic parts of the otherwise UV-killed genomes. Therefore purposes of . the defective mixture must be able to cooperate as a set to reconstitute virus replication. This process is clearly similar to SIMILARITY OF PHYCODNAVIRUS a ‘group selection’ process that has been REPAIR GENES TO EUKARYOTES: We considered and dismissed as implausible by have noted above many clear similarities of the most evolutionary biologist. But in the phycodnaviruses to the viruses of prokaryotes. context of UV-killed virus, group selection For example, PBCV-1 encodes a UV DNA must operate on a population of otherwise repair enzyme that is clearly T4-like (denV dead viral genomes, not an individual viral gene), to which there are no know cellular genome. 
In addition, to acute viruses, such enzymes (prokaryotic or eukaryotic) that group selection of otherwise defective virus resembles this protein or its mechanism of DNA could also apply to viruses that colonize the repair. However, there are an equal number of host genome. Like the persistence of compelling similarities of phycodnaviral genes defective prophage discussed in chapter 3, a to those of eukaryotes, including most of the host colonized by mixture of defective core viral genes. In the context of repair, PBCV- prophage could be stable but could also 1 superoxide dismutase is thought to protect express complementing phage genes with from sunlight induced reactive oxygen and most the combined capacity to produce virus. probably extends the life of infected cell in The apparently stringent conservation of damaging sunlight. This viral enzyme is of the viral specific repair genes in DNA aerobic form, which according to phylogenetic viruses/phage, and some defective prophage, analysis, is basal to those superoxide dismutases supports the idea that repair capacity is of eukaryotic cells, but similar to those found in indeed highly selected in natural virus the large DNA baculoviruses of insects. populations. Thus, neither the death of the Furthermore, this enzyme shows no similarity to host cell nor the death of the individual virus those of prokaryotic cells. However, similarity is sufficient to exterminate the survival between PBCV-1 SOD and the SOD found in potential of such a virus system. The lysogenic bacteriophage Fels-1 is apparent. implications of this are mind boggling: to think that a mixture of dead virus may still DNA pol. Numerous other PBCV-1 genes persist in its potential for life. 
Only viruses (such as DNA polymerase) also show a related are known to have such a complimenting, pattern of similarity to eukaryotic genes and to resurrecting capacity as no other biological viruses of eukaryotes and prokaryotes, but not to entity can resurrect itself after certain death prokaryotic cellular genes. PBCV-1 does not by mixing defective genomes. Most DNA encode it own DNA dependent RNA virus families not only have highly polymerase, as does ASFV (discussed further conserved DNA repair genes, they also below). However, it does encode a DNA conserve their ability to recombine genomes polymerase. The PBCV-1 DNA pol is a highly and can do so at very high efficiencies. It is conserved core enzyme that has been used to in fact this very ability to recombine identify other members of both the defective genomes into an infectious virus phycodnavirus and families (see that has been of put to practical use in that it below). Natural PBCV-1 isolates conserve this was and remains the method used to core gene sequences, but often differ by having generate recombinant viruses. It is also acquired additional but unknown ‘accessory’ 85 genes. Phylogenetic analysis indicates that absent from lower eukaryotes. However, the PBCV-1 DNA polymerase is basal to the phylogenetic analysis show this PBCV-1 gene to DNA pol beta (extension polymerase) of all be basal to and the likely ancestral to all three eukaryotes. Yet, it is most similar to the versions of the eukaryotic gene. This result best DNA polymerase of the Human Herpes supports a viral, not bacterial origin of this Virus family. In terms of prokaryotes, the mostly eukaryotic gene. PBCV-1 also encodes a polymerase is most similar to those of the chitosanase gene, that may be packaged into the T4 (even) phage, but distantly related to virion. This enzyme makes a linear DNA polymerases of Archaea. 
PBCV-1 homopolymer which is a normal component of DNA pol is not related to the replicative- fungal cell walls, insect exoskeletons, and extension polymerase in Bacteria. Thus, this crustacean shells, but is rarely found in algae. Of viral DNA polymerase occupies a basal considerable interest, PBCV-1 makes its own position in the eukaryotic glycosylating enzyme, found in golgi/ER, which and resembles the progenitor to all is not present in its algal host cell and is also not eukaryotic extension polymerases. a common constituent of algae. Also, a putative cellulose synthase is present as a PBCV-1 . RNA modification. The early PBCV-1 Thus the PBCV-1 virus seems to encode a transcripts are polyadenylated, but late surprisingly large number of biosynthetic RNA’s are not. In this the virus mRNA is enzymes that represent synthetic pathways both eukaryotic-like and prokaryotic-like. mostly associated with eukaryotic cells. In other This viral poly A polymerase does not respects, PBCV-1 appears to span the boundary resemble those of prokayotes but does of prokaryotes and eukaryotes. Even PBCV-1 resemble eukaryotic poly A polymerases. In promoters seem to span the addition, PBCV-1 mRNA are 5’ capped. prokaryotic/eukaryotic boundary in that these This viral capping enzyme is related to those viral promoters are unique and able work well in of yeast and is also basal to all eukaryotic both higher plant and bacterial cells. RNA capping enzymes. Introns. Finally, there is the issue of introns. 19 Biosynthetic enzymes. Viruses are not of 42 viruses that infect Chlorella strain contain generally considered to contribute much short, nuclear-located, spliceosomal-processed with respect to host metabolic enzymes or intron in a viral DNA repair gene (these are U2 metabolic activity as they are not considered type GT-AG introns). Interestingly, the intron to be requiring any virus specific metabolic sequences are more conserved then exon activity. 
Yet, unlike most other viral sequences, seemingly at odds with the protein- families, a large number of PBCV-1 genes domain exon shuffling hypothesis. The highly (12) are made which synthesize or conserved DNA pol gene also has a related metabolize sugars, and intron, but it is found in all strains and its polysaccharides. This includes a hyaluronan greatest sequence conservation is at the exons. synthase protein, which accumulates on the Clearly viral introns do not evolve faster the outside of infected cells. Hyaluronan is of exons in Chlorella virus. special interest because it was previously thought only to be found in (and Thus PBCV-1 appears to span the discontinuity characteristic to) vertebrate species, along between prokaryotes and eukaryotes. It has with some capsules of pathogenic bacteria. prokaryotic characteristics but also has numerous Intriguingly, when the human genome was genes and processes that are thought of as sequenced, this gene was identified as one of characteristic of and basic to eukaryotic the few clear examples of what appeared to organisms, but which are absent from be , presumably prokaryotes. Furthermore, the PBCV-1 version from bacteria to vertebrates, since it was of the eukaryotic specific genes and elements 86 appear to be more basal then those found in by extranuclear virus assembly. This also eukaryotic cells. corresponds to a period in which no cell wall synthesis is occurring by host, so virus is VIRUSES OF FILAMENTOUS BROWN released by the same stimulus that releases ALGAE-MULTICELLULARITY AND spores or gametes. Phaeovirus only infect the SEXUAL REPRODUCTION: 8 species wall-less free swimming spores or gametes of of filamentous brown algae are currently algal host. In natural populations, infections can known each of which harbors its own be highly prevalent. The gametangia (or species specific DNA virus known as sporangia), the mobile gametes or spores are phaeovirus (phaeo Greek for brown). 
These frequently virus infected. In some host species, viruses are clearly related to the all individuals are infected. The extremely high phycodnaviruses, but differ in several levels of virus production during sporulation can important molecular characteristics. disrupt sexual cycle essentially rendering the Ectocarpus species virus (EsV) and host asexual. In this feature, it seems that virus Feldmania species virus (FsV) (virus of reproduction will override host sexual Feldmania simplex) are the most studied reproduction. The virus does not grow during members of these viral families. The vegetative growth of host. EsV-1 will infect biology of these viruses and their host Feldmania zoospores, but does not multiply and differs substantially from that of PBCV-1 causes malformations so there is clear species and its unicellular host. Unlike the strictly specificity and possible to this lytic relationship that phycodnaviruses have virus-host relationship. EsV-1 does not affect with their chlorella-like host, phaeovirus are host rate of photosynthesis or rate of growth. persistent genomic parasites, and are passed Viral DNA becomes integrated into the host in a Mendelian fashion to infected host chromosome and is one of the only eukaryotic offspring. In addition, the host brown algae DNA viruses that does this as a normal and has a much more complex life and sexual essential part of a productive life cycle. cycle, which involves diploid states (not simply haploid as are microalgae). Brown PHAEOVIRUS GENOME AND LATENCY algae will also differentiate sex structures GENES. EsV-1 is a 335,593 bp linear dsDNA and produce mobile gametes. The virus has with inverted repeat ends and codes for 231 circular DNA, not a linear DNA with predicted proteins. Only 28 of these proteins are snapback repeat ends as does PBCV-1. similar to genes found in Genbank. 
EsV-1 EsV-1 occurs worldwide and infects host differs from PBVC-1 in that it codes for no Ecotocarpus silliculosus in all areas. tRNA genes, no poly A pol, no capping enzyme, Furthermore, the complex sexual cycle of has no introns, no DNA dependent RNA pol, but the host is linked to virus replication. Host it does code for a bacteria-like sigma RNA pol algae grows in vegetative haploid state factor. It also differs from PBCV-1 and which can grow male and female gametes bacteriophage in that 1/3 of the DNA is non- which can fuse to form diploids. These coding. This non-coding DNA corresponding to diploid forms grow the filamentous forms both repeated and nonrepeated sequence. These and can produce a diploid spore. These repeats are similar to poxvirus 4 ankyrin repeats diploid spores can undergo meiosis to – and may encode SET-like genes involved in subsequently make the haploid meiospores. protein-protein interacting domains (possibly It is during the production of the sex used for chromatin remodeling). These types of structures that algae make the diploid spore genes are absent from PBCV-1 and are suspected and it is also at this point that EsV and FsV to be involved in coordinating the latency to lytic are produced from infected host. transition of EsV-1. EsV-1 also encodes a lot of This can result in 1-5 X106 pfu per cell, signal transduction proteins, including 6 which degrades host nuclei and is followed histidine kinases, which are rare in bacteria but 87 commonly found in two-component eukaryotic and have the nuclear structure, the transduction systems of eukaryotes. These mitochondrial and chloroplast plastid structure as genes are also suspected to regulate latency well as ER and Golgi structures that are in EsV-1, since PBCV-1 lacks them. characteristic of all eukaryotes, although flagella Interestingly, EsV-1 codes for an H1 like are notably absent. However, in many respects histone and an RCF small subunit protein. 
Red algae seem to either be a sister group to all The EsV-1 DNA pol is much more like that other Eukaryotes or to possibly be the oldest of Feldmania then either host or PBCV-1, so Eukaryote. This is mainly due to the nature of this core enzyme appears to define a the chloroplast (as well as being supported by common viral lineage. EsV-1 also has a rRNA analysis). A distinctive feature of red PCNA gene (which is PBCV-1 like). algae chloroplast are their disorganized Interestingly, EsV-1 has a bacterial-like unstacked thylakoid photosynthetic membranes transposon with ORF that codes for factor as well as the occurrence of phycobilin pigment related to plant defense protein granules which give them their distinctive red (pathogenesis PR-5). This transposon has a color. In these characteristice Red algae more phage like transposase (integrase) as well as closely resemble cyanobacteria then do green a lactococcus phage-like anti-repressor of algae. As consequence of these pigments and the , establishing a clear chloroplast, red algae can a tolerate a wider relationship of EsV-1 to phage. It is range of light levels then any other group of assumed that some of these EsV-1 viral photosynthetic plant and can also thrive in proteins must allow the virus reactivation relatively deep water (up to 268 M) as well as and replication to link to host sexual shallow tropical waters under intense light. In reproduction. This intimate link of EsV-1 addition evolutionary links via 5S rRNA analysis virus reactivation to host sexual between green and red algae are tenuous which reproduction is especially intriguing when would be consistent with a sister group we recall a similar link was observed with relationship in which the Red algae sister group phage that latently infect B. subtillis spores would lack any other out groupings. However, (presented above). 
Given the old Red algae have an uneven fossil record which evolutionary lineage of this virus, it seems limits geological consideration of this issue. possible that this virus system may have also Species of Red algae are not nearly as numerous been involved in the origin of this host as other algae and it is estimated that they process as well. account for only 1-2% of all algae species. Rhodophyta are thus the least studied and RED ALGAE. Since algae are the major understood all algal groups. life form in the oceans and are mainly photosynthetic. Thus by fixing carbon Red algae and viruses. There is considerable dioxide into organic molecules, they provide interest in red algae as a group since they are the major source of the foodweb and energy responsible for toxic red algal blooms (red ) flow for most all oceanic life but are that can be so destructive to shellfish, fish and especially used as food by marine micro- marine mammals. A well studied species grazers such as , ciliates, responsible for toxic blooms is Heterosigma and micro (usually micro- akashiwo Hara et Chihara (Raphidophyceae). larval forms). The above chapter focused Curiously, these red bloom populations have mostly on eukaryotic green microalgae and a highly clonal character and are known to brown filamentous algae and their viruses frequently terminate rapidly. There is now since they constitute the majority of algal strong evidence that such toxic blooms can be species. However, there also exist another terminated by the production of lytic virus distinct order of eukaryotic algae; the red specific to this species. Early observation using algae. Reg algae (Rhodophyta) are electron microscopes showed that at the 88 termination of such blooms were often expect persisting viral agents of red algae to also associated with the induced production of interact with and/or compete with acute viral viral like particles (VLPs). 
Subsequent agents of the same host, these issues have not yet studies have been able to isolate a lytic large been examined in Raphidophyceae. 202 nm icosahedral DNA virus (HaV) that was able to lyse specific strains of Transferred or Infectious Nuclei of Red algae. Raphidophyceae. The virus appears to Although the nuclei of red algae appear to be replicate in the protoplasm of infected cells. typical of eukaryotes in most respects, except for Molecular details about these viruses are there uniformly small hapliod size (1-3 microns still lacking. Since then, numerous other compared to 3-10 microns for higher strains of HaV have also been isolated eukaryotes), and the curious absense of a establishing a broad viral diversity in natural nucleolus, there is one striking characteristic that settings. Currently, it is felt that production applies to all red algae which is worth of lytic virus is often associated with the considering. The nuclei of Red algae have the termination of red algal blooms and thus it characteristic of nuclear migration. All algae appears that these infections are having large that are not strictly haploid seem to cycle effects on natural host population dynamics. between haploid and diploid states in association Although these lytic infections are species to sexual reproduction. This alternative ploidy specific, many natural strains of feature was discussed in some detail above with Heterosigma species are resistant to respect to the sexual reproduction of filamentous infection and resistant clones have been seen brown algae and the production of genomic to develop at the termination of blooms. phaeovirus but it also applies to many fungi (see The mechanism of this resistance has not below). 
In red algae, the formation of the diploid been determined but this behavior is very cell during sexual reproduction occurs by the reminiscent of the establishment of lysogeny migration of the nucleus from a donor haploid and subsequent lytic phage immunity by cell to a recipient hapliod cell via a primary pit bacterial phage. However, species specific connection (PPC), which provides a cytoplasmic non-lytic persistent infections by HaV have bridge between adjacent cells. The transferred not been investigated. Furthermore, it is nucleus then replicates in the recipient cell. clear that HaV is not the only type of virus However, besides sexual reproduction there are that can infect red algae. A rod shaped virus many other examples of nuclear migration in red that can form hexagonally packed inclusions algae involving vegetative cells from the same in the endoplasmic reticulum has been organism that are not derived from a common observed in Audouinella saviana species, cell division that can result in heterokaryons. which seems likely to be an RNA virus. The transfer of replicated nuclei can be on a very However, all isolates of this algae species large scale, resulting in cells that contain seem to harbor this virus (but not related hundreds or thousands of nuclei, sometimes species) so it may be a highly ubiquitous but arranged into hexagonal arrays under the surface species specific persistent infection. In of the plasma membrane of the recipient cell addition, various red algae are know to resulting in striking geometric patterns of DAPI harbor ssDNA plasmids with clear similarity . In addition, nuclear transfer to Geminiviruses in that they have between different species to form heterokaryons covalently attached initiator proteins. Such has also been well established. 
Furthermore, and rolling circle viruses are well known in unique to red algae, the transfer of nuclei various other host orders (bacteria, plant between different species of red algae can also animal) to either be dependent on (e.g. be parasitic. In some cases, the parasitic nuclei satellite viruses) or interfere with the will fuse with the nuclei of the host. In other replication of acute or sometimes larger cases, however, the parasitic nuclei undergoes DNA viruses. Although we would clearly rapid replication and are transferred throughout 89 the host in a most infectious process which essentially no genes that are similar to those of can also spread to new host. All these bacteria and only a few (10%) genes that are nuclear transfers have several features in similar to those of eukaryotes, even though its common, including the migration of newly DNA polymerase does show similarity to replicated nucleus to the plasma membrane, phycodnavirus and herpes virus DNA the formation of a pit connection from the polymerase. Taken together, the viral origin ‘parasitic’ cell to the host and the migration hypothesis for the eukaryotic nucleus is well of the parasitic nuclei into the new host (be supported by the characteristics of these viruses. it from the same individual, another The capacity of large DNA viruses for large individual or another species of organism). scale creation of genetic novelty is well The behavior of these parasitic nuclei is established and the possibility that this viral clearly virus-like. In fact, most of these based genetic creativity can colonize the host, nuclear migration processes are highly resulting in the origin of the eukaryotic nucleus similar to those that were described above can now be well supported. for the movement and transmission of poxvirus. This distinctive virus-like nuclear characteristic is found in essentially all red Recommended reading. algae and may well relate to the biological origin of the nucleus from a virus. 
Another Evolutionary dilemma: ancestor/nucleus. order of organism which also commonly (Maynard Smith and Szathmâary 1995) able to transfer nuclei between cellular host (Poole and Penny 2001) are the fungi (discussed in chapter 6.) (Cavalier-Smith 1975) (Cavalier-Smith 1991) Overall, the viruses that infect algae appear (Cavalier-Smith 2002) to have most of the characteristics that (Lake and Rivera 1994) would be needed to span the prokaryotic and (Sogin 1991) eukaryotic kingdom. They are both lytic and (Woese 1998) latent and the latent life cycle is tightly (Kyrpides, Overbeek et al. 1999) linked to host sexual reproduction. Analysis of the viral DNA polymerase Symbiotic theory. suggests that these two families of (Margulis and Sagan 1997) phycodnaviruses and phaeovirus are clearly related to each other but their corresponding DNA polymerase issues, lytic and latent life styles has endowed them (Spicer, Rush et al. 1988) with distinct gene sets. Curiously, (Braithwaite and Ito 1993) Herpesvirus was amongst the closets algal (Bernad, Zaballos et al. 1987) viral relative. The algal viruses also have a (Wang, Wong et al. 1989; Wang 1991) clear relationship to Herpesvirus, (Forterre 1999; Forterre and Philippe 1999) poxviruses, baculoviruses and African (Edgell, Klenk et al. 1997) Swine Fever virus. Yet all of these viruses, as noted previously, appear to have evolved The viral origin hypothesis from AFV-1 like virus of thermophiles. (Villarreal 1999; Villarreal and DeFilippis 2000) These algal viruses have a lot of genes that (Bell 2001; Takemura 2001) are both Bacteria-like and Eukaryote-like. (Filee, Forterre et al. 2002; Forterre 2002) This mixture of prokaryotic and eukaryotic (Filee, Forterre et al. 2003) genes not a general rule for other DNA viruses, even those found in the oceans. For Viral defectives, group selection. 
example, WSSV is a large DNA virus that (Szathmary and Demeter 1987; Szathmary 1992) infects shrimp (see chapter 5) that has 90 Braithwaite, D. K. and J. Ito (1993). Vaccinia as a mini-nucleus "Compilation, alignment, and (Tolonen, Doglio et al. 2001) phylogenetic relationships of DNA (Mallardo, Leithe et al. 2002) polymerases." Nucleic Acids Research (Mallardo, Schleich et al. 2001) 21: 787-802. (Moss and Ward 2001) Bravo, A. and M. Salas (1998). "Polymerization of bacteriophage variant phi29 replication Phi-29 and tubulin protein p1 into protofilament sheets." (Bravo and Salas 1998) EMBO (European Molecular Biology (Serna-Rico, Salas et al. 2002) Organization) Journal 17(20): 6096-6105. Cavalier-Smith, T. (1975). "The origin of nuclei The phycodnaviruses and of eukaryotic cells." Nature (London) (Van Etten and Meints 1999; Van Etten, 256(5517): 463-468. Graves et al. 2002) Cavalier-Smith, T. (1991). The evolution of prokaryotic and eukaryotic cells. The Fundamentals of Medical , (Delaroque, Maier et al. 1999; Delaroque, Vol. 1. Evolutionary Biology. Xi+333p. Muller et al. 2001) Jai Press Inc.: Greenwich, Connecticut, USA; London, England, Uk. Illus. . 1991. Cyanophage 217-272. E. E. Bittar. (Mann, Cook et al. 2003) Cavalier-Smith, T. (2002). "The phagotrophic origin of eukaryotes and phylogenetic Red Algae classification of Protozoa." International (Cole and Sheath 1990) Journal of Systematic & Evolutionary (Douglas, Zauner et al. 2001) Microbiology 52(2): 297-354. Cole, K. M. and R. G. Sheath (1990). Biology of Possible figures the red algae. Cambridge [England] ; New York, Cambridge University Press. Table of nuclear dilemmas Delaroque, N., I. Maier, et al. (1999). "Persistent virus integration into the genome of its The vaccinia life cycle algal host, Ectocarpus siliculosus (Phaeophyceae)." J Gen Virol 80 ( Pt 6): Tree of DNA pol 1367-70. Delaroque, N., D. G. Muller, et al. (2001). 
"The Characteristics of phycodnaviruses complete DNA sequence of the Ectocarpus siliculosus Virus EsV-1 A picture of Red Algae nuclei genome." Virology 287(1): 112-32. Douglas, S., S. Zauner, et al. (2001). "The highly Citations. reduced genome of an enslaved algal Bell, P. J. (2001). "Viral eukaryogenesis: nucleus." Nature (London) 410(6832): was the ancestor of the nucleus a 1091-1096. complex DNA virus?" J Mol Evol Edgell, D. R., H. P. Klenk, et al. (1997). "Gene 53(3): 251-6. duplications in evolution of archaeal Bernad, A., A. Zaballos, et al. (1987). family B DNA polymerases." J.Bacteriol. "Structural and functional 179: 2632-2640. relationships between prokaryotic Filee, J., P. Forterre, et al. (2003). "The role and eukaryotic DNA polymerases." played by viruses in the evolution of their EMBO J. 6: 4219-4225. hosts: a view based on informational

91 protein phylogenies." Res Microbiol and evolution. New York, Copernicus. 154(4): 237-43. Maynard Smith, J. and E. Szathmâary (1995). Filee, J., P. Forterre, et al. (2002). The major transitions in evolution. "Evolution of DNA polymerase Oxford ; New York, W.H. Freeman families: evidences for multiple gene Spektrum. exchange between cellular and viral Moss, B. and B. M. Ward (2001). "High-speed proteins." J Mol Evol 54(6): 763-73. mass transit for poxviruses on Forterre, P. (1999). "Displacement of microtubules." Nat Cell Biol 3(11): cellular proteins by functional E245-6. analogues from plasmids or viruses Poole, A. and D. Penny (2001). "Does endo- could explain puzzling phylogenies symbiosis explain the origin of the of many DNA informational nucleus?" Nat Cell Biol 3(8): E173-4. proteins." Mol Microbiol 33(3): 457- Serna-Rico, A., M. Salas, et al. (2002). "The 65. Bacillus subtilis phage variant phi29 Forterre, P. (2002). "The origin of DNA protein p16 7, involved in variant phi29 genomes and DNA replication DNA replication, is a membrane- proteins." Curr Opin Microbiol 5(5): localized single-stranded DNA-binding 525-32. protein." Journal of Biological Chemistry Forterre, P. and H. Philippe (1999). "Where 277(8): 6733-6742. is the root of the universal tree of Sogin, M. L. (1991). "Early evolution and the life?" Bioessays 21(10): 871-9. origin of eukaryotes." Curr Opin Genet Kyrpides, N., R. Overbeek, et al. (1999). Dev 1(4): 457-63. "Universal protein families and the Spicer, E. K., J. Rush, et al. (1988). "Primary functional content of the last structure of T4 DNA polymerase. universal common ancestor." J Mol Evolutionary relatedness to eucaryotic Evol 49(4): 413-23. and other procaryotic DNA Lake, J. A. and M. C. Rivera (1994). "Was polymerases." J.Biol.Chem. 263: 7478- the nucleus the first ?" 7486. Proc Natl Acad Sci U S A 91(8): Szathmary, E. (1992). "Viral sex, levels of 2880-1. selection, and the origin of life." J Theor Mallardo, M., E. Leithe, et al. 
(2002). Biol 159(1): 99-109. "Relationship between vaccinia virus Szathmary, E. and L. Demeter (1987). "Group intracellular cores, early mRNAs, selection of early replicators and the and DNA replication sites." Journal origin of life." J Theor Biol 128(4): 463- of Virology 76(10): 5167-5183. 86. Mallardo, M., S. Schleich, et al. (2001). Takemura, M. (2001). "Poxviruses and the origin "Microtubule-dependent of the eukaryotic nucleus." J Mol Evol organization of vaccinia virus core- 52(5): 419-25. derived early mRNAs into distinct Tolonen, N., L. Doglio, et al. (2001). "Vaccinia cytoplasmic structures." Molecular virus DNA replication occurs in Biology of the Cell 12(12): 3875- endoplasmic reticulum-enclosed 3891. cytoplasmic mini-nuclei." Molecular Mann, N. H., A. Cook, et al. (2003). Biology of the Cell 12(7): 2031-2046. "Marine : bacterial Van Etten, J. L., M. V. Graves, et al. (2002). photosynthesis genes in a virus." "--large DNA algal Nature 424(6950): 741. viruses." Arch Virol 147(8): 1479-516. Margulis, L. and D. Sagan (1997). Slanted Van Etten, J. L. and R. H. Meints (1999). "Giant truths : essays on Gaia, symbiosis, viruses infecting algae." Annu Rev Microbiol 53: 447-94. 92 Villarreal, L. P. (1999). DNA virus contribution to host evolution. Origin and evolution of viruses. E. Domingo, R. G. Webster and J. J. Holland. San Diego, Academic Press: 391-420. Villarreal, L. P. and V. R. DeFilippis (2000). "A hypothesis for DNA viruses as the origin of eukaryotic replication proteins." J Virol 74(15): 7079-84. Wang, T. S. (1991). "Eukaryotic DNA polymerases." Annu.Rev.Biochem. 60: 513-552. Wang, T. S., S. W. Wong, et al. (1989). "Human DNA polymerase alpha: predicted functional domains and relationships with viral DNA polymerases." FASEB J. 3: 14-21. Woese, C. (1998). "The universal ancestor." Proc Natl Acad Sci U S A 95(12): 6854-9.

Possible figures.

4-a. Figure of prokaryote and eukaryote evolution
4-b. Figure of viruses in oceans

4-1. Figure of replication component proteins; prokaryote and eukaryote
4-2. Figure of displaced replication proteins
4-3. Table of distinctions of the nucleus absent from prokaryotes
4-4. Viral candidates for protonucleus
4-5. Conserved regions of DNA pol
4-6. Aligned DNA pol alpha
4-7. DNA pol dendrogram
4-8. CAP dendrogram
4-9. Other dendrograms: SOD, RNRd, PCNA
4-10. PBCV1-HAS

CHAPTER V

MICROSCOPIC AQUATIC ORGANISMS AND THEIR VIRUSES

The viruses of microscopic aquatic eukaryotes and their relationship to the evolution of their host is a topic that has historically received little attention. Given the importance of such organisms to the origin of higher life forms, this chapter will consider the virology of these organisms in some detail. Later in this chapter, we also consider the virology of lower and higher fungi since, as indicated in chapter 1, fungal evolution is of central importance for the evolution of animals and higher plants. These aquatic organisms can be defined operationally as being in the size range from 50 to 500 microns and thus include both large unicellular eukaryotes, such as protozoa, as well as microinvertebrates or the microscopic larval forms of invertebrates. Invertebrates will be covered in subsequent chapters. These organisms include many that feed by grazing on algae. However, with only some exceptions, the viruses of these organisms and their hosts present little apparent medical or agricultural risk, so their study has generally not been well supported. Because of this, it is possible that our current understanding of these organisms and their viruses is limited or distorted by the relatively few examples that have been investigated in greater detail, and that there may yet exist other virus-host relationships that await discovery. Some of these organisms, such as Giardia (Diplomonads) and Trichomonas (Parabasalia), are clearly rather primitive versions of eukaryotic cells in that they lack mitochondria or, in the case of the dinoflagellates, lack histones. Generally, these organisms are not thought to be on the same lineage that led to multicellular eukaryotes and appear to have diverged early from that lineage. Other microscopic aquatic organisms, although more like higher eukaryotes, still represent lineages that diverged early in evolution from that of multicellular eukaryotes, and most of their genes appear to have evolved in parallel to those of higher eukaryotes. These organisms include Tetrahymena (ciliates), trypanosomes (kinetoplastids), Euglenida (euglenas), Plasmodium (Apicomplexa) and the plant-associated heterokonts (genetically distinct from, and not, true fungi). The microscopic eukaryotic lineages that are more related to multicellular eukaryotes would then include the algae (discussed in Chapter 6), Dictyostelium (social Amoebozoa) and the true fungi (e.g. S. cerevisiae and Neurospora), the latter two of which are most related to animal lineages.

Protozoa and persistence of dsRNA viruses. Protozoan organisms are most often found in aquatic environments. As such, they are clearly exposed to the large quantities of viral agents known to exist in all aquatic habitats. As mentioned, these aquatic viruses have morphologies that mainly correspond to those of phage (i.e. icosahedral capsids with tails, containing DNA) and phycodnaviruses (large dsDNA icosahedrons), although a significant but small subset includes a mixture of various other viral morphologies (small icosahedrons and rods). In spite of this common immersion in pools of diverse, mainly DNA-containing viruses, the orders of protist, aside from algae, are not typically associated with infections by large DNA viruses but are instead more often associated with RNA virus infections. There may be a caveat to this situation, since our observations may be biased towards acute viruses. For example, the recent discovery of a very large DNA virus of amoeba (Mimivirus) may signal the existence of a larger number of such inapparent viruses. In some cases, however, sufficient investigations have been completed to indicate that these are general relationships between viruses and host. One clear and overall virus/host pattern is that a wide range of these microscopic eukaryotes are frequently infected with related families of dsRNA virus that have small icosahedral morphology, but that these infections appear to be mainly persistent or latent and are non-pathogenic. This is in stark contrast to viruses of both microalgae (Chapter 4) and insects, in which many examples of strictly lytic dsDNA viruses are known. As we have already discussed the microalgae and filamentous algae, in this chapter we collectively consider the non-algal aquatic microscopic species, which make up a rather diverse set of eukaryotic organisms that includes the protists, ciliated protozoa, dinoflagellates and the lower and higher fungi.

Persistence, sex and reproductive isolation. As in previous chapters, this chapter will also examine the best-studied examples of virus and host to consider their host interactions. The main types of virus to be considered are dsRNA viruses (Partitiviridae and Totiviridae), ssRNA viruses of fungi, as well as linear dsDNA viruses of fungi. All these viral agents are generally persistent in their protist host but were not prevalent in either bacteria or algae. As presented in chapter 1, persistence tends to be highly host species-specific, apparently due to the need to closely coordinate the virus with host regulatory systems. Persistence tends to superimpose onto the host mechanisms for virus maintenance and for competition with, or exclusion of, other viral agents, and these maintenance systems frequently operate via addiction modules. Also as presented earlier, viral persistence is often linked to, and transmitted during, host reproduction. Thus, the reproduction of latent virus can be directly associated with the sexual reproduction of the host, as we have noted in chapter 4 for the phaeoviruses of filamentous brown algae and the spore-infecting phage of B. subtilis. These general characteristics of persistent infections are all also clearly evident in various protists and their viruses. In many cases it appears that persistent infections of protists by viruses are highly prevalent in nature. In such circumstances, we can expect that virus-virus competition may also be prevalent and may likewise provide a selective pressure for the persisting virus to exclude host colonization by a competing virus. Since such viral exclusion or competition systems will often involve addiction modules, including toxin/immunity genes that are harmful or lethal to uninfected host, we can also expect that these colonizing viruses may provide a selective pressure that can eventually reproductively isolate infected from uninfected host species. Along these lines, it is clear that some of these persisting agents can result in harmful maternal effects during host sexual reproduction. This situation may be rather general and appears to occur across a broad array of virus and host. Sometimes, this maternal harm involves virus reactivation, mitochondrial infection, or associated virus production. For example, sexual induction of lysogenic virus of bacteria occurs when the male cell harbors a prophage expressing immune functions, but the recipient bacterium is not latently infected or immune, resulting in lytic virus induction. Other examples include the Phaeovirus of filamentous algae, killer virus of fungi, Gypsy virus of Drosophila and ascovirus of locust. All of these are examples in which harm to the offspring will result following sex between infected and uninfected sexual partners, such that uninfected females or eggs will reactivate virus and harm the host. Only infected females (or recipients) will produce viable offspring from mating with infected males. A specific example of this is the killer virus of various yeast species (these are generally species and strain specific, discussed below). Examples of sex-linked virus effects from other protist and fungal species are also numerous and will be presented below. This sex-related virus harm is a characteristic that seems to especially apply to hosts that have been colonized (infected) by various retroviruses, such as the Gypsy virus in Drosophila species, and has been called a ‘maternal’ effect. However, colonization of a host genome by an endogenous retrovirus (ERV) can result in permanent alterations to the germ line of a particular host. Examples of genome-wide stable retrovirus colonization, such as that of the fungal order, will therefore also be presented. It will be seen that these ERV colonization events frequently represent points of bifurcation and divergence of host lineages, some of which maintain these ERV agents, but other lineages do not.

Other cryptic phenotypes of persistence. In other persistent infections, the mechanism of viral persistence and maintenance and its consequence to the infected host are not clear, as the agents appear to be very cryptic. These cryptic viruses may appear defective (not encoding gene products for virion production) or be without obvious addiction module characteristics. Yet even in these circumstances, it is likely that the presence of such cryptic agents in the host modifies the host's biological outcome to infection. Satellite viruses (such as those found associated with yeast killer viruses) are examples of cryptic infections that can affect the replication potential of related killer viruses and appear to allow a more stable persistence. In other cases, more direct effects on the host can be seen, such as the longevity phenotype that can result from Podospora mitochondria infected with the pAL2 virus (referred to as a linear plasmid), seemingly extending the life span and transmission potential of the infected host. Thus, as with bacteria colonized by temperate phage, a main consequence can be to affect the host potential for other virus interactions. As virus-virus interactions are inherently conditional (thus potentially cryptic) and seldom evaluated in laboratory experiments, the literature only has isolated examples of such relationships to consider. Yet in the entire order of fungi and protist organisms, we see a surprisingly high prevalence of persistent-cryptic virus. Most fungi from natural isolates harbor persistent virus. It seems likely that this state of cryptic virus relates in some way to the mainly fused-hyphal life strategy of fungi, which provides much opportunity for the transmission of cryptic or silent viruses. It is striking that there are few, if any, clear examples of lytic virus in protists, unlike the situation observed with microalgae, or as will be described for insects. A number of cryptic plant viruses containing two dsRNA segments have also been observed. These cryptoviruses lack movement functions and are vertically transmitted. These cryptoviruses also appear to have little phenotypic effect on the host cell, although here too they can affect the outcome of other virus infections.

Mineralized algae; a viral paucity. The biology of green microalgae and the filamentous brown algae was presented in chapter 4 and will not be repeated here. However, the term algae covers a diverse set of organisms that have distinctly different characteristics. In fact, historically, the term algae was also inclusive of the prokaryotic cyanobacteria. The current use, however, is restricted to eukaryotic organisms. The mineralized algae are algae, such as diatoms, that form a shell or exoskeleton that is composed of minerals taken in from the ocean. The most common minerals used for these shells are Ca or Si, which are actively pumped into cells as soluble ions from sea water and deposited at membrane interfaces onto the cell exterior to make a solid mineral exterior, usually in plate patterns. The capacity of algae to make a mineralized exterior arose at the Precambrian-Cambrian boundary and evolved rapidly: it is estimated that within 40-50 million years a large number of species of these organisms had already evolved, and these are visible in shale and diatomaceous earth. These algae are especially efficient at fixing CO2 and are thought to have been responsible for lowering the CO2 content of the early atmosphere. They continue to contribute a large quantity of fixed CO2 and mineral sediment (especially calcium) to the biosphere and remain very important on a global scale. Diatoms are responsible for the main sediment of the ocean and have contributed massive mineral production to these sediments. Thus these organisms are both very abundant and of major significance to early eukaryotic life.

Origins of diatom photosynthesis. Diatoms represent the brown algae, which are distinct from both the red algae and green algae and have a distinct (brown pigmented) light harvesting system and proteins. Diatoms have large central nuclei and a Golgi complex involved in shell synthesis. Recent phylogenetic studies of diatom (cyanophora) chloroplast genomes (of about 210 proteins) indicate that they appear to be sister groups to the red algae, their closest relative, as 45 genes were identified as being conserved between cyanobacteria, red algae and diatoms. These data have further suggested that the progenitor organism to both these lineages was likely cyanobacteria. Diatoms are distinct from other algae in that the genes for light harvesting are nuclear, not chloroplast encoded. This observation supports the idea that the early evolution of photosynthesis was nuclear, followed by the migration of photosynthetic genes into the genomes of the chloroplast. These diatom photosynthetic proteins are translated in the cytoplasm, then imported past a distinctive membrane that surrounds the diatom chloroplast. This scenario is also consistent with the proposal in chapter 4 that the eukaryotic nucleus appears to have evolved from the stable colonization of an extra-genomic DNA virus of a cyanobacterial host, followed by the migration of bacterial metabolic and other genes into the proto-nuclear viral genome. The light harvesting genes exist in loci that clearly appear to have evolved by gene duplication. Although the occurrence of duplicated and transposed sequence in the diatom genome is not well known, recent success at introducing foreign plasmid DNA into diatom genomes has shown that these integrated DNAs are frequently in tandem duplications. This establishes that a RIP-like ‘anti-viral’ system of the kind known for some fungi (discussed below) is not operating in diatoms to prevent the acquisition of duplicated DNA into their genomes.

Microscopic studies fail to find diatom virus. Due to the highly recognizable and species-specific shape of the mineral shells, which is well preserved in sedimentary material, shale deposits of these organisms have been intensively studied by transmission and scanning electron microscopy, and it has been reported that some species (such as testate amoeba) have been morphologically stable for millions of generations. In spite of the prevalence of phycodnavirus in non-mineralized microalgae, however, no phycodnavirus, or any other virus for that matter, is known for any member of the mineralized algae species. Specific examples of diatoms for which there are no reports of virus are Navicula pelliculosa or Foraminifera species. This striking situation could simply be due to the lack of a systematic search for viruses in these organisms. These algae can have hard glass-like shells composed of silica in very highly structured shapes, so entry and release of virus could be a highly restricted event. However, these silica-based shells have been of intense interest as possible sources of biological nanofabrication of highly structured silicates, and they have often been extensively examined by electron microscopy. It therefore seems likely that virus-like particles (VLPs), or sub-particles, would have been observed by now, as was the case with the VLPs observed by EM examination of protozoa (discussed below). Yet such reports are conspicuously absent from the literature on mineralized algae. Although it remains possible that the lack of virus observation is simply due to a lack of looking, it seems more likely that lytic virus replication or high-level latent virus reactivation is at the very least rare, or possibly absent, in these species. This raises the possibility that the hard mineral shell may be a very efficient barrier to the entry and exit of viruses and might relate to the long-term stability or nondynamic morphology of these organisms. Whether or not diatoms have virus-like inapparent genomic parasites or their defective genetic transposon relatives has not yet been determined and will need to await the sequencing of one of these genomes as well as the sequencing of possible extra-genomic elements.

Protist: a diverse order. Protists are a very diverse collection of microscopic eukaryotes. The most primitive members of these organisms are Diplomonads (Giardia) and Parabasalia (Trichomonas), which lack mitochondria, Golgi and ER. A large proportion of protist organisms, including those that are the more primitive, are binucleate. Overall, these organisms appear to clearly support infections with viruses, mainly nonsegmented dsRNA viruses that also tend to establish persistent infections, often with continued virus shedding. The accumulation of VLPs in protists has often been reported, and transmission to uninfected host has also been established in some cases. The two best-studied members are GLV of Giardia lamblia and TVV of Trichomonas vaginalis. These viruses are related to each other. However, the RNA polymerase gene of TVV is more related to viruses that infect S. cerevisiae and Leishmania. These viruses belong to the family Totiviridae, which includes a number of dsRNA viruses that infect protozoa parasitic to animals and fungi parasitic to plants. Infected protozoa include Leishmania, Eimeria, Giardia, and Trichomonas. Infected fungi include Saccharomyces, Ustilago, Aspergillus and Thielaviopsis. The viruses show little pathology in their host, exist in low copy numbers and are vertically transmitted. The viruses enter cells by endocytosis, so they clearly are similar to other eukaryotic viruses in this regard. Secretion of progeny virus is ongoing and occurs via peripheral vacuoles. Several can act as helpers to satellite dsRNA viruses, some of which code for and can secrete ‘killer’ toxins and kill sensitive, uninfected host. A virus similar to GLV can be found in the benthic larvae of crayfish. In crayfish, however, the virus is lytic, not latent.

Protist mitochondria and medical studies. The more developed protists (not diplomonads) undergo sexual reproduction and tend to be diploid. They are motile via flagella, engulf food and are abundant in aquatic habitats. These more developed protists also have mitochondria, but interestingly, these mitochondria are often very unusual, almost bizarre, relative to those of higher eukaryotes, such as in replicating their linear DNA genome via rolling circle or protein-primed processes. Such replication processes are used by many ssDNA viruses, but not by any free-living organism to replicate its genome. Although not themselves photosynthetic, some protists harbor symbiotic zoospores of algae that are photosynthetic, as presented in chapter 4. Examples of this would be Paramecium bursaria (a Ciliata) and Hydra viridis (Coelenterata), both of which harbor symbiotic microalgae which provide photosynthesis. It is interesting that these symbiotic algae, but not the protist host, can be infected with phycodnavirus, as discussed in chapter 4. As mentioned, protists are often parasitic to other organisms (such as humans and other animals) and hence they have been well studied from the perspective of human disease, especially intestinal and mucosal diseases. Because of this medical importance, and as a result of extensive experimental evaluation, these agents have been well observed, making it more likely that we have reported the most prevalent virus-host interactions.

Prevalent persistent infections of protist. As mentioned, latent or inapparent virus infection seems to be a normal or common situation in most protozoan species. Early EM-based observations in the 1970s found that VLPs were made by most protozoan species examined. Possibly the best examined of these species has been Entamoeba histolytica. Several physical versions of VLPs were reported, including a small 40 nm ovoid VLP. All protozoan strains that were examined seem to harbor such particles, which were made in large numbers especially in early sporozoites, but not in other cells. These agents thus appear to be latent viruses, but have also been called hereditary viruses since attempts to clone or cure species of VLPs have generally failed. Subsequent experimentation established that some of these VLPs were clearly authentic virus, as they could be used to infect and lyse other permissive cells. Other VLP forms are also seen, including some that are filamentous and beaded string structures in the nucleus. Some of these agents are also associated with toxin production but they have not been characterized.

LRV-1 virus and origins. One of the other best characterized protozoan viruses is that of the protozoan parasite Leishmania. Isolates are frequently persistently infected with a 32 nm virus particle (LRV-1) having a single-segment dsRNA genome of 5.2 kb. This RNA codes for two overlapping ORFs, one an RNA-dependent RNA polymerase and the other a capsid protein. Old and New World specific versions of Leishmania RNA virus are known (13 strains: 12 LRV-1 and 1 LRV-2), which are highly conserved and phylogenetically congruent with their corresponding host, suggesting a long-term stability. A survey of Leishmania isolates showed that 12 of the 71 isolates examined harbored a virus related to LRV-1, and most of these virus-harboring isolates were from the Amazon basin. The RNA pol of LRV-1 is most similar to that of Saccharomyces cerevisiae L-A (ScV-L-A), a yeast killer virus, and thus LRV-1 seems to represent a predecessor to the yeast virus. Consistent with this basal placement of the LRV-1 viruses relative to the yeast viruses, the four dsRNA virus types that persistently infect fungi and protozoa appear to have a common lineage (which also includes Ustilago maydis virus H1 (UmVH) in Ustilago, Trichomonas vaginalis virus (TVV) and LRV of Leishmania). Many strains of Trichomonas vaginalis are also infected, and the presence of the virus is directly associated with host phenotypic variation involving surface antigen switching. Thus persistent TVV infection can result in a specific host phenotype. Infection with these viruses predates divergence of host Leishmania species, which further supports the old and stable nature of this virus-host relationship. However, because most Leishmania are infected, it has been difficult to study the consequences of infection without an uninfected host for comparison. There are a few distinct Leishmania species that lack any virus infection, but these species seem unable to support heterologous virus infections. These virus-free species apparently arose from infected predecessors through loss of the persistent infection of the ancestor organism. In addition to Leishmania, other parasitic protozoa are also known to harbor virus. The dsRNA viruses found in Giardia lamblia (GLV) are clearly related to dsRNA viruses of yeast and other fungi. Like many of the viruses described above, GLV is also continually shed into the media. The protozoan Cryptosporidium parvum (parasitic to the GI tract of mammals) is also frequently infected by a two-segment dsRNA virus. Previously these bipartite Partitiviridae were mainly found in fungi and plants. In C. parvum, the virus is mainly found in the cytoplasm of sporozoites, but it is not found in other species of this genus. Strikingly, there are only a few examples of phycodna-like viruses of protozoa in the literature, which highlights the contrast between algae and protozoa.

Ciliated protozoa. Protists can be considered as several major groups of organisms. The ciliated protozoa (Ciliophora) include species of Tetrahymena, Paramecium, Euplotes and Oxytricha, which all have dual nuclei in vegetative cells. One of these nuclei is a small, diploid and transcriptionally silent micronucleus which maintains the germ line, and the other nucleus is a large, vegetative, transcriptionally active macronucleus which has both elevated copies of active genes and has also lost many intergenic sequences. This genome-wide rearrangement of macronuclear DNA is a striking difference between ciliates and most other organisms. In addition, with the separation of nuclear fates, we see the first evolutionary separation of the germ line from the soma lineage. Many ciliated protozoa are members of the hypotrichs, which will undergo multiple rounds of DNA synthesis (endoreduplication) during the generation of the macronucleus. During sexual reproduction, the micronucleus is able to undergo meiosis, losing 3 of 4 of the resulting haploid nuclei. Also, during sexual conjugation, one of these resulting haploid micronuclei is transmitted via a cytoplasmic bridge into another haploid cell to fuse two haploid nuclei into one diploid nucleus. This process of nuclear transport and transmission from one cell to another via a cytoplasmic bridge is most intriguing and is also very reminiscent of the infectious-like transmission of nuclei between cells that was widely seen in red algae. In a sense, the dimorphic nuclei of ciliates also resemble the two nuclei of heterokaryons (as in fungi), which can also result from nuclear transmission. It is interesting that in some cases, the nucleus of the recipient cell will undergo enlargement and degrade its DNA, reminiscent of macronuclear DNA changes.

Macronucleus-Hypotrichs. The macronucleus of the numerous hypotrich species undergoes continued DNA replication (endoreduplication) without mitosis, resulting in over-replication of the DNA. This initially generates polytene chromosomes, followed by vesicle formation within compartments of the DNA, then followed by excision and loss of much of the intragenic DNA. The presence of these initially polytene chromosomes distinguishes hypotrichs from other ciliates. The macronucleus can be contrasted with the micronucleus in that it overrepresents gene-sized DNA fragments that are highly expressed, but this nucleus is terminal in that it becomes unable to continue the lineage of the organism. In the macronucleus, intragenic excised sequences (IES) are small repeat sequences that constitute a significant part of the genome. IES are precisely excised and degraded. Furthermore, the resulting fragments are 0.4-20 kb of DNA and are present in high numbers (15,000-40,000). These linear sequences acquire telomere sequences via de novo DNA synthesis. This telomeric DNA consists of short sequences made by error-prone RT polymerization. The RT activity is present at high levels during the somatic phase, but quickly diminishes with the sexual phase (during conjugation and autogamy). This nucleus thus has undergone a terminal transition to a state of amplification and high-level expression and has acquired a new molecular genetic identity, but has also lost DNA regions and withdrawn from participation in the germ line DNA.

The excision of IES from the macronucleus of ciliates (Paramecium) is under epigenetic control, and thousands of IESs will undergo excision from the germ line during development. During the formation of the somatic macronucleus, genomic DNA undergoes about 6,000 deletion events which involve sequences containing LTRs. The Tlr1 element is one of the better studied of these eliminated LTR sequences and consists of a 13 kb sequence with an 825 bp LTR. These are clearly transposon-like structures that resemble viruses in their excision. In some striking cases (such as the actin gene of O. nova) the initial sub-gene segments are scrambled and out of coding order, but will be reassembled, along with differential and sequential IES deletion, to form a functional, contiguous and correctly ordered gene in the macronucleus. These IES sequences are short, non-coding elements with direct repeats, and their removal thus closely resembles transposon excision. In fact, longer repeat sequence elements (Tec1, TBE1, Tlr1) can be found in the genomes of various ciliates (Euplotes, Oxytricha, Tetrahymena). These longer elements not only have the IES sequences as inverted terminal repeats at their ends, but also code for transposases. Thus they clearly resemble functional genetic parasites. It seems likely that these longer, ‘less defective’ versions of the IESs may have been the original colonizers of ciliate genomes, leading to the evolution of the much smaller IES direct repeats. Interestingly, a single copy of the longer IES sequence in the macronucleus will prevent excision in the newly developing macronucleus with sexual germination. This is a maternal chromosome effect that probably works by pairing of homologous nucleic acids.

Random DNA amplification and telomere addition. Tetrahymena probably represents the best studied example of a ciliate undergoing macronuclear formation. The two nuclei (macro, micro) differ significantly. In stark contrast to the tight micronuclear control of DNA replication, the macronucleus is able to replicate (endoreduplication) and amplify not only specific regions of rDNA but also almost any DNA sequence. The macronucleus will even allow lambda DNA to replicate when this DNA is microinjected directly into the macronucleus. Furthermore, the injected lambda DNA will also acquire telomeres, which are added via an RT-like terminal transferase activity. These telomeres appear to inhibit end-to-end fusion of linear DNAs such that the resulting lambda DNA will replicate as a linear DNA to high copy, resulting in up to 20,000 copies per nucleus. The single micronuclear copy of rDNA will undergo DNA replication in the macronucleus, resulting in 9,000 copies of the 21 kbp repeat derived from the chromosomal copy during conjugation. This rDNA replicon can also be maintained as a linear plasmid. This distinct difference in control of DNA replication between the micronucleus and the macronucleus is striking. The high-level overreplication of macronuclear DNA can attain levels similar to those attained by some DNA viruses.

RNA splicing. Loss of RNA sequences from self-splicing also occurs in the macronucleus. The group IB introns were in fact first demonstrated in Tetrahymena and were found in the nuclear rRNA gene. Tetrahymena’s group IB introns (splicing with no protein factors) have conserved the same catalytic fold as the ribozyme of T4 (td synthase introns). Thus there seems to be a clear relationship between bacterial phage and Tetrahymena in intron processing. Such a high degree of similarity has led some to suggest that this is the result of a common ancestry. The hallmark of these self-splicing introns is a 16 nucleotide consensus sequence. This element also resembles a consensus found in viroid RNAs, such as potato spindle tuber viroid. It is also interesting that the group I introns found in the Chlorella viruses mentioned in chapter 4 are also very similar to those found in algae, yeast and Paramecium cells, suggesting that this virus could represent the ancestor of these cellular introns or, possibly, that a related virus mediated the spreading of these introns into all these protist lineages.

Origins of dimorphic nuclei. The dimorphic character and variable DNA content of the two nuclei of ciliated protozoa raise some interesting issues with respect to their origins and possible relationships to DNA viruses. The diploid micronucleus transmits the germ line, but maintains strict control of DNA replication in that it only allows each replicon to replicate once per cell cycle. In this regard it is similar to the nuclear regulation of DNA synthesis of most eukaryote nuclei. However, the micronucleus is in a transcriptionally inactive state, which is not typical of germ line nuclei of higher eukaryotes. In contrast, the macronucleus has relaxed cell cycle control of DNA replication, but is highly active for transcription. What might account for the origins of this dual nuclei strategy? It is worth recalling that we have already argued that the nucleus itself may have evolved from a persisting infection with a large DNA virus (Ch. 4). A micronucleus that is silent or repressed would clearly resemble a persisting or latent genome of a DNA virus. This repressed transcriptional state closely resembles the chromatin-repressed DNA that would be expected for a latent virus. Thus the link of this silent nucleus to germ line transmission would also fit the general tendency for latent virus to associate with sexual reproduction. Furthermore, the activation and over-replication of the micronucleus during sexual reproduction also resembles a virus-like behavior. In this regard, the micronucleus closely resembles the lytic reactivation of a latent DNA virus in that it is characterized by DNA amplification outside of cell cycle control, the subsequent degradation of the non-amplified DNA sequences, followed by the high-level global transcriptional activation of the replicated DNA sequences, with corresponding high-level protein expression. These events all clearly resemble the replication and late gene activation of a DNA virus. Furthermore, the fate of such a macronucleus is essentially terminal (i.e. lytic-like), in that the macronucleus is a dead end, destined to degrade, and cannot contribute to the germ line. Explaining the origins of all these characteristics, especially amplification of some DNA but the loss of other DNA, presents a major challenge for theories of evolution based on gradual accumulation of favorable point mutations and recombination events. However, if a nucleus can arise from a viral progenitor, these characteristics themselves could all have come about from elements of known and prevalent viral life strategies. In this case, the various micronuclear characteristics resemble those of a latent DNA virus, whereas the various macronuclear characteristics closely resemble those expressed during the lytic reactivation of a persistent virus for high-level virus production. We could thus propose that both the micronucleus and the macronucleus closely resemble known and prevalent life strategies of DNA viruses.

Why are there few DNA viruses of ciliates? A ciliate macronucleus, with unregulated DNA replication and highly active and terminally committed gene expression, would seem to provide an excellent habitat for the general amplification of viral DNAs and virus replication. And in fact we know that lambda DNA injected into macronuclei amplifies very well in such nuclei. However, in spite of this seemingly inherent capacity to replicate foreign DNA and high levels of gene expression, there are very few examples of nuclear DNA viruses that infect hypotrich species. Possibly this may be related to the frequent parasitic life style of protozoa, in which the parasitized host cell may shield them from exposure to many DNA viruses. Yet symbiotic algae are also shielded from phycodnavirus by their host but can clearly support numerous DNA viruses. It seems more likely that the dimorphic life strategy and the wide distribution of IES in the chromosomes, along with the system for IES excision (protection) and degradation of non-telomere-containing DNA, might pose a major barrier for any DNA viruses that colonized ciliates. Any DNA virus that finds itself in a micronucleus undergoing endoreduplication and the subsequent DNA degradation of the macronucleus is most likely not to survive the dual nuclei process. This would especially pose a viral barrier if, due to the sexual cycle, the virus must first infect the micronucleus. This is because a DNA virus would need to initially infect the micronucleus prior to host genome amplification and silently persist in this nucleus until macronuclear formation.

Dinoflagellates. Dinoflagellates represent another distinct and old form of microscopic oceanic eukaryotic life. However, dinoflagellates are sufficiently different from
The all other eukaryotes to be considered as a sister dimorphic nuclear life strategy may, in group. These organisms are responsible for the fact, be a powerful mechanism to strip out toxic red tides, which is due to the synthesis of rogue DNA replicons that have infected domoic acid, an analogue of glutamic acid that the nuclei by requiring that parasitic DNA irreversibly binds glutamic acid receptors on not amplify in the micronucleus. In neurons. Some of the dinoflagellates have addition, it appears that permissible features of both plant and animal cells in that amplification in the macronucleus is also they are both photosynthetic and mobile, able be subjected to transposon mediated to move towards light. In addition, some excision and degradation. In considering members of this order, such as sea fire, are able how such a dimorphic life strategy might to also emit light. The ability to have originated, our prior reasoning photosynthesize and move towards light is proposed the possibility that a lytic- reminiscent of photosynthetic cyanobacteria , persistent virus system could have which also move towards light. The originated both the micronuclear and photosynthetic capacity of dinoflagellates macronuclear structures. However, we seems to have evolved early in the evolution of might also suspect that an additional these organisms, which may account for their colonization event by a second parasitic very unusual chloroplast. The individual genes element must have also been involved of these odd chloroplast are coded by which could contribute the transposon minicircular plastids, a situation unique to needed to elimination intergenic DNAs, Dinoflagellates. Dinoflagellates are frequently Such a process could have resulted from symbiotic with other organisms, such as the selection needed to compete with and coelenterates which can build coral reefs but eliminate other genetic parasites. 
In this only with the cooperation of the symbiotic second element case the most likely algae. Dinoflagellates are clearly a distinct candidate for this additional parasitic order of life and are distinguished from all element would be a virus related to the other eukaryotes in that they have no histones Tec1 or Tlr1 transposons. The resulting or nucleosomes on their chromosomes. Instead colonized genome would then required the they have variable numbers of chromosomes presence of the transposon LTRs to protect which contain 4 basic chromosomal proteins the replicons from DNA degradation and that are present at 1/10 the mass of DNA, and loss. Perhaps this system now prevents not at the 1:1 mass ratio seen in all other the colonization of ciliate organisms by eukaryotes and protozoa. The DNA-chromatin any DNA viruses. However, as is organized into right handed double helical mentioned above, the large majority of the bundles. Also, distinct form other eukaryotes, extant viruses of these ciliate species are the nuclear membrane does not dissolve during latent dsRNA viruses whose primary mitosis. Dinoflagellates are also bi-nucleate habitat is the cytoplasm and the and undergo sexual reproduction. However, translational system of the host, not the the dinoflagellate nuclei are not dimorphic and nuclear system. It seems clear, therefore, they do not appear to undergo DNA changes that although ciliates species may have (amplification and deletion) noted above for developed systems that exclude most DNA micro and macronucelus in ciliates. viruses, they remained highly susceptible Dinoflagellates have enormous genomes, being to dsRNA viruses. 1 to 10 times the size of the human genome 103 with 75% of this DNA being of low copy are in stark contrast to both the protist and complexity, 18% intermediate repeat and cilliates in that they are prone to infections the rest very simple sequence so it appears (often latent) with DNA viruses. 
that they have been highly colonized by repeated elements. Also distinct, in Fungi The fungi are of a special interest from Dinoflagellate nuclear rRNA is a plastid the perspective of evolutionary biology since (SSU). they appear to have evolved well after many of the protist discussed above. Fungi are thus not DNA viruses. Unlike the hypotrichs, and representatives of the earliest eukaryotes. in apparent support of the notion that However, it is now accepted that the animal dimorphic nuclei may protect against lineage appears to have a monophyletic origin DNA viruses, DNA viruses of which shares ancestry with fungi and the dinoflagellates are known. In this case of marine fungi are likely to represent the earliest the dinoflagellates that are symbiotic to form of fungi. Thus it seems likely that the coral building organisms, there is evidence molecular characteristics of Fungi and their that they can be killed by virus infection. relationships to viruses could identify These coral building dinoflagellates molecular characteristics that also led to the appear to be lysed by an icosahedral evolution of the animals. 1/5 of all known dsDNA containing virus. Interesting, this fungal species are obligate symbionts and are virus may often be latent and may be colonized with green algae or cyanobacteria, induced following exposure to increased such as lichen like Ascomycota (which temperatures. Some observations suggest includes 98% of lichenized species). This that many of these dinoflagellates may be symbiotic relationship meets carbohydrate latently infected. It appears that latent requirement of the fungi when colonized by a virus can then become lytic leading to CO2 fixing symbiont. Current analysis death of the dinoflagellate and bleaching suggest that that lichenized fungal species of the coral. evolved early, followed by multiple loss of symbiosis. 
Furthermore it appears that all the Other DNA viruses of dinoflagellates are higher forms of fungi evolved from such also known. A virus infecting the symbiotic species that later became shellfish-killing dinoflagellate (H. autonomous. For example Penicillium and circularisquama Virus: HcV) was recently Aspergillus appear to have been derived from isolated from such lost symbionts. Japanese coastal waters following initial observations of VLPs from EM. This Overall viral patterns. Overall, there are some virus was also icosahedral ds DNA virus clear patterns of fungi and their viruses. And of about 200 nm diameter, lacking a tail. these patterns are distinct from the other This size is similar to a poxvirus. The microscopic aquatic organisms. Infection of virus was found in large numbers in natural fungal populations with viruses is in ‘viroplasmic structures’ in the cytoplasm. many cases exceedingly common. dsRNA The virus was lytic in 18 strains of H. viruses, ssRNA viruses, dsDNA viruses, circularisquama, but not in 24 other retroviruses and even have all been species. Thus, like frequently found in fungi. In some cases, such microalgae, lytic DNA viruses of as with retroviruses, fungi represent the simplest dinoflagellates are established and organisms which are known to broadly support prevalent. Curiously, no other types of this virus family, although interestingly these virus (e.g. filamentous, dsRNA, ssRNA retrovirus all lack env sequences. However, containing, etc.) have been reported for most, possibly all natural isolates of fungal these organisms. Thus the dinoflagellates species (aside from yeast) harbor persistent and 104 inapparent viral infections. The great appear to represent the earliest evolutionary majority of these persistent infections are versions of fungi and are morphologically close due to dsRNA viruses. In stark contrast to to algae. 
Higher fungi have septate-reticulate eukaryotic algae, dinoflagellates and mycelium and large complex fruiting bodies and prokaryotes, few nuclear or large dsDNA are composed of multihyphal structures that tend viruses have been observed to infect any to be capable of prolonged survival. Higher fungal species, except for some very fungi are much more diverse (such as the interesting agents that infect fungal ascomycetes) than lower fungi. In addition, mitochondria. In this regard, fungal plastids many lower fungi are asexual, whereas the are especially unusual compared to all other higher forms tend to be sexual. Fungi can be eukaryotes in that their mitochondria are both diploid and polyploid but there is a frequently infected by both dsRNA and tendency, like algae, for fungi to be DNA plastids or viruses. Although for the predominantly in haploid states and to cycle most part systemic infections of fungi with between diploid and haploid states along with dsRNA viruses are not pathogenic, they are sexual reproduction. Fungi have notably small frequently associated toxin genes and killer nuclei (1-3 micron) relative to the 3-10 micron phenotypes that will be pathological to nuclei for most eukaryotes and fungi lack the nearby uninfected host. This situation chromosomal plate characteristic of mitosis. clearly resembles the addiction module Another unusual feature of most fungal nuclei is persistence strategy previously described that their mitosis is closed, that is the nuclear for phage and bacteria in chapter 3. As with membrane does not dissolve during mitosis most persistent infections of lower similar to dinoflagellates. 
The DNA content of organisms, transmission (both infection and the fungal genome is rather small relative to production) of fungal persisting dsRNA other eukaryotes, and the occurrence of repeated viruses is vertical or frequently associated sequences in some cases (such as neurospora) is with the sexual reproduction of the host and very limited. in some cases directly associated with the mating type of the host. In addition to the Modular, hyphael and long-lived organisms. highly common infections with dsRNA With the evolution of fungi, we have the first viruses, some specific lineages of fungi can clear example of creation of non-motile modular, be infected with linear dsDNA viruses (often hyphal organisms as well as the development of called plastids). individual organisms that can have very extended life spans. Although not all fungi are Lower fungi. Lower fungi represent a hyphal, this is by far the most common fungal rather diverse and polyphyletic assemblage. morphology since it is estimated that only 1% of Many are zoosporic (especially aquatic fungi species are yeast-like (and these tend to fungi) such as Oomycetes, live on plant surfaces). This is in contrast to Plasmodiaphorales, Thraustochytriales unitary organisms (like animals and most which will grow from small sexually bacteria) which have set morphology for the reproduced uninuclear spores. Lower fungi juvenile and the adult forms. Modular organisms (such as phytophthora) are defined by their grow by branching (tree or root-like) processes relatively simple mycelium in which the and thus have pliable morphology that can adapt separation between cells tends to be via to the local circumstances, such as growing simpler structures and they also have towards food, invading new habitats and fruiting bodies that similarly tend to be growing away from toxins. Such modular simpler then those of higher fungi. 
Many organisms are essentially clonal and some, such simple fungi are also acquatic, such as as deuteromycetes, Aspergillus and Candida Rhizidiomyces, and these tend to have albicans are also asexual (or parasexual). Some motile uninuclear zoospores. Zoospores fungi can generate very large, often clonal 105 organisms via mycelia that grow by degeneration), especially between same or invading adjacent habitat. However, this similar species, although this seems an growth characteristic in which new cells are uncommon outcome in some natural settings. physically in continuity with the parent However, such an interconnected characteristic creates a difficulty for the definition of of filimentous fungi would be expected to fitness, since growth and survival can occur provide a very attractive and possibly unique with little reproduction of independent habitat to viral agents as it could allow rapid progeny. Even in cases where the continuity access to the entire cellular network, without the of parent and offspring is broken, the need to make extracellular virus. Some higher progeny will often be clonal and most often fungi (such as ascomycetes and basidiomycetes) haploid. What then defines reproduction can undergo self fusion, of either the same or and fitness in this modular circumstance? It different parts of the organisms. However, this would seem that the continued existence of a hyphal characteristic of fungi is associated with metabolizing organism with the potential for another somewhat unique biological growth and reproduction would need to be characteristic, that of heterokaryon formation. In considered. This growth characteristic is addition some fungi form stable dikaryon and especially evident in some specific cases, mating types exist in many species. As such as the Armillaria bulbosa fungal mentioned, a rather unique and widespread mycelium (mats) that have been found in biological characteristic is self-fusion. Canadian forest. 
On such mat was estimated Sometimes, self-fusion can result in by aerial observation of ring growth pattern protoplasmic degradation via induction phenolic to cover about 15 hectars and to be about compound oxidation. Such fusion events can be 1,500 years old (Smith, ’92). Other very followed by nuclear replacement reactions large and old fungal mats are also known. (involving very rapidly motile small nuclei that In addition, fungi that grow within the move through hyphae), in which nuclei in the stones found in the Antarctic appear to be recipient hyphael compartment are often the longest lived organisms on earth (Prince, destroyed and replaced by daughter mitotic ’92) and such longevity been referred to as nuclei from the donor. This situation is most the Methusalah factor. Such a long lived reminiscent of nuclei of transmissible parasitic life strategy has been referred to as a K- red algae described in the previous chapter. selected life strategy which would indicate Mitochondria can also undergo elongation and that such species are under competition for move through hyphae, but not via the same prolonged periods for limited resources. It process that moves nuclei, nor are they generally is interesting that long lived fungal species transferred into fused hyphae. It has been tend to be diploid, which is otherwise suggested that such invasive ‘male-like’ behavior sporadic in most filamentous fungi. of the transferred nuclei, allows the nuclei to leave behind mitochondria, or rogue A network of transmissible nuclei. Fungal mitochondria and other cytoplasmic parasites hyphae form interconnected networks and that have colonized the host network of cells. It such connections can result in many nuclei has also been proposed, with some experimental that reside in one shared, communicating support, that the vegetative incompatibility cytoplasm, similar in this respect to some system, is another system of self identification species of red algae. 
These hyphae grow by and may have developed to limit the spread of nuclear division and hyphal extension at the cytoplasmic genetic parasites. Although it is tips. When growing tips encounter other known that mitochondria can also sometimes be parts of the same or different organism, they horizontally transmitted, vegetative can either be repelled or be attracted to each incompatibility clearly limits such transfers and other. Those that are attracted can undergo also affects the transmission efficiency of viral- fusion (anastomosis and septal like parasites. 106 both nuclei; suggesting that trans gene This invasive characteristic of fungal nuclei interaction is occurring and affecting opposite has also been called ‘parasexual’ in that it cis-acting elements between the two allows a form of sexual colonization or chromosomes. With Neurospora, there are 4 exchange (but only sometimes with heterothallic species that will not yield fertile heterokaryon recombination) with out the offspring in interspecific crosses. However, not typical sexual process. This parasexual all fungi undergo nuclear fusion. characteristic can even apply to non-sexual fungal species, which may help explain how The best studied (but possibly overemphasized) such otherwise clonal and haploid species mating type system is in saccharomyces can maintain genetic diversity. In field cerevisiae (which unlike filamentous fungi, isolates, fungal individuality is rather rare doesn’t differentiate between two sexes). Two due to frequent formation of heterokaryons mating type versions are known, a and a, in following hyphael fusion, although this is which a will repress a. These are both expressed not typical of Podospora and Neurospora. in haploids, but when combined in a heterokaryon diploid will act together to make a Sex types incompatibility and nuclear diploid. The Saccharomyces cerevisiae mating interactions. N. 
crassa has 10 loci that will type elements are small genomic regions (600- make heterokaryones incompatible, thus self 700 bp) that code for trans acting DNA binding recognition systems are clearly under strong proteins containing HMG-box motif. The selection in this species. The mating type is diploids express a complemented gene pattern one of these incompatibility regions. In which induces pheromones, mitosis, and addition mating type switching can also generates haploids by mitosis of 2N cells. This occur, although this switching is a highly situation has a clear resemblance to a atypical situation for filamentous fungi. reactivation program between otherwise Neurospora, for example, doesn’t switch defective elements. During type switching, the mating types as do some of the well studied silent mating type copy is transposed into active yeast species, saccharomyces cerevisiae. site in a transpositional activation that clearly When switchable mating types exist, it is resembles that of Mu phage, P2/P4, or Borrelia generally the case that there will be one phage. This transposition allows early stable type and one type which is silent but (unmethylated) DNA replication to occur and switchable. The mating type locus can be activate transcription. Targeted transposition repressed often via DNA methylation and involves flanking repeated elements. Yet some heterochromatin formation. Incompatibility fungi prevent repeat element expression or mating function of Neurospora mtA1 is transcription. Therefore, it is clear that the RIP related to S. cerevisiae a1, so some common system of Neurospora is not present in these processes seem to apply to diverse fungi. yeast species as it would not tolerate such a This gene is needed to be able to respond to duplicated sequence. However, it is interesting pheromones. 
Mating Type genes can often to note that the use of DNA methylation to encode the pheromone precursors that repress mating type bears a clear resemblance to undergoes protein processing, very similar MIP suppression system (described below). to the processing of insulin like hormones. Mating type switching thus resembles a These mating type genes can also resemble transpositional reactivation (invasion) from a addiction modules in that wrong (non-self or state of silence (persistence). non-complementing) combination can induce a damaging response. In dikaryons Asexual fungi, phenotypic diversity and and diploids, the combination of two nuclei repeat elements. Some yeasts are asexual. Due will switch the developmental pattern of to its medical importance the best studied is of

107 these asexual yeast is Candida albicans, special interest if we consider that such repeats which appears to always remain as a diploid. could be the remnants of early colonization by In this state it resembles the transient genetic parasites. Neurospora and other fungi, dikaryon of the sexual phase of the such as Ascobolus, have actually developed induction prior to spore production in other molecular systems that effectively prevent the fungi, such as Nadsonia species. Although accumulation of repeat sequences. The best it lacks sex, Candida albicans has a highly studied such system is in Neurospora crassa switchable colony morphology, which is which has the RIP system (Repeat induced also closely associated with human premiotically, now called Repeat Induced Point pathogenicity. This switching can be Mutation) that will efficiently induce point induced by UV light irradiation, but it does mutations in repeat sequences. This process is not appear to be a mutational event. Instead induced in one of the dikaryon nuclei during it appears likely that it involves a the sexual cycle when two haploid nuclei of transpositional and recombinational process opposite mating types come together to share a involving silent regions of heterochomatin common cytoplasm. In one sexual cross, adjacent to telomeres, but it is not directly between 8-80 point mutations (CG to AT related to MAT system of S. cerevisiae transitions) have been measured to result from described above. It is also possible that the RIPing of one copy of a gene following sexual mechanism involves changes to chromatin (but not vegetative) reproduction. Ascobolus structure and committed expression of gene (also a haploid) will induce high level DNA sets. One idea that has been suggested is methylation (MIP) in repeat sequences (DNA that a system of homology dependent gene methylation is generally at a low levels in most silencing could be involved in a switching fungi). 
The origin and mechanism of this system to induce new gene sets. process are not well understood, since it occurs Regardless of the specific mechanism, on a microscopic cellular level it is difficult to which remains unknown, this is clearly a investigate the biochemistry. However, as it is complex and highly adaptive system induced only premiotically, following the (affecting cell morphology, protein fusion of the two gamete nuclei, it resembles a synthesis, antigenicity, drug sensitivity) nucleic acid surveillance system derived from involving persisting and otherwise silent some genomic element which will preclude genetic elements that can be activated. related elements. This would be along the Although their mechanism of function is lines of DNA methylation/restriction system of currently unknown, it is interesting that phage and bacteria, which can also repress Candida albicans genome is know to harbor extra or related copies of the sequence. This several classes of moderately repeated RIP-like system is not seen in other fungi, sequences (such as 27A, Ca3, Ca7, CARE-1, such as Aspergillus and Podospora so it is not a CARE-2 and RPS1) since such repeats are universal characteristic. In addition, other clearly not found in all fungi. mechanisms of homology dependent have also been described, such as RIP, Repeats and fungal genomes. quelling which is known to Genomes of filamentous fungi are rather postranscriptionally silence repeated sequences small, about 30 Mega bases (ten fold E. in Neurospora. Thus suppression of repeat coli, one hundredth of the human elements is clearly a redundant and well genome). It is most interesting that some maintained system in these species. 
However, fungi are essentially devoid of repeat DNA this observation makes the clear case that the (such as Neurospora crassa), but others, accumulation of genetic parasites, such as such as the lower fungi (phytophthora), retroposons, is not an essential feature for the can have between 18-53 % of their adaptation and survival of specific fungal genome from repeat elements. This of genomes. Thus, systems that prevent 108 accumulation have clearly been copies of highly mutated (RIPed) and successfully developed by some dysfunctional Tad sequences. The presence of organisms. Neurospera crassa in these degenerate sequences indicates that prior particular appears to have a wide array of (or possibly ongoing) Tad colonization has genome defense systems. Neurospera has occurred but has been rapidly and effectively no intact and only inactivated in the Neurospora genome. This 10% of its genome is repeat DNA (much degenerate Tad presence further suggest that of this due to rDNA in the nucleolus). there exist a natural but unknown source of Curiously, the genome of N. crassa is Tad that has continued to colonize the about 40 megabases (about 10,000 ORFs), Neurospora genome. In stark contrast to only 25% smaller then that of Drosophila Neurospora, Crytococcus neoformans has a malenogaster and much larger then S. mating type locus that is composed of 60 Kb of cerevisae (12 Mb), both of which lack the DNA but has both interspersed unique and RIP system. repetitive elements (which also encodes the pheromone precursor peptide). It seems clear Repeated rDNA and TaD. Although that the presence of, or prevention of repeat Neurospora has few repeated sequences, elements varies considerably in specific fungal the clear exception is the rDNA repeat lineages. Such variation suggest that at the found in the nucleolus. Repeated rDNA of origin or radiation of fungal lineages, a Neurospora is present in 185 copies in peculiar and lineage specific pattern of genome chromosomal DNA. 
This is quite unlike what was described above with the unique genomic rDNA in Tetrahymena that amplifies in the macronucleus. As with most higher organisms, rDNA is organized into the nucleolus organizer as a distinctive structure (which was absent from red algae species). It has been proposed that the organization of rDNA copies into a nucleolus may have evolved to protect these repeat DNAs from a RIP-like system that would otherwise destroy repeated sequences, although this would not explain the absence of extra-nuclear rDNA amplification as in ciliates. Retroposons are generally absent from the genomes of some fungi. However, it is known that N. crassa can have a 7 kb Tad, a retroposon repeat element, which encodes an RT. This element is rather unusual, however, in lacking the LTR repeats. Also, when present, Tad is low or unique copy and most field isolates lack the element altogether. Tad was found in an isolate from the Ivory Coast and is not common in field isolates. Tad therefore appears to be a recent colonizer of Neurospora. However, most Neurospora genomes have numerous

colonization by these elements occurred. Along these lines, it is interesting that most filamentous fungi have low copies of the L1, a LINE-like element (which lacks terminal repeats with identical 3’ ends). This element is maintained in plants, invertebrates and mammals, which all appear to have descended from a fungal predecessor.

THE FUNGAL VIRUSES

Penicillium stoloniferum was used to isolate the first source of interferon-inducing activity at Lilly Laboratories in the early 1950’s. An agent named Statolon was identified from cultures of Penicillium stoloniferum as possessing the interferon inducing activity, measured via inhibition of VSV plaque production in mouse L cells. Subsequent investigation established that this activity was due to high level production of the dsRNA of a persistent and inapparent mycovirus (dsRNA as a virus). Strains of another species, Penicillium funiculosum, were also observed to make mycovirus. Subsequently, the list of dsRNA viruses found in molds and other higher fungi grew large, and it now seems that essentially all such species are inapparently infected with related viruses. Ubiquitous and persistent virus infection now clearly seems to be the rule in higher fungi. These viruses have now been well studied and will be presented below. However, the situation may be different in lower fungi relative to higher fungi. Since we wish to examine virus evolution from the perspective of host evolution, we will first consider the lower fungi and the various dsDNA viral agents that have been described in these species.

Linear dsDNA Viruses of lower fungi. Marine fungi, such as those associated with the cordgrass Spartina alterniflora, have no reports of virus in the current scientific literature. If present, it seems most likely that such virus would be inapparent or require induction. However, without direct evidence this can only be speculation. Other aquatic fungi are clearly known to harbor virus. Rhizidiomyces (Hyphochytriales, a lower aquatic fungus) will form zoospores that are parasitic on Oomycetes such as Achlya and Saprolegnia. When standard lab strains were heat shocked in the early 1980’s, it was observed that the resulting zoospores would no longer infect their host. Zoospores failed to develop and instead their nuclei made large numbers of VLPs. In addition, these VLPs are commonly observed following heat shock in many other lower fungi. This VLP assembly occurs in the nucleus and in this respect resembles that of Herpesvirus. However, the VLP capsids are smaller than herpes capsids and are similar in size to those of adenovirus. Like Adenovirus, the icosahedral virus-like particles contain linear dsDNA. Other VLPs that were cytoplasmic were also observed. Interestingly, most natural isolates of Rhizidiomyces have similar VLPs. In all, 8 distinct isolates were evaluated and all were positive for VLPs. When zoospore production was induced by salt or heat shock, all isolates made large numbers of intranuclear VLPs, resulting in the destruction of zoospores. These isolates could not be cured of VLP production by subcloning. The capacity for virus induction was also very stable during long term passage of cells (up to ten years). Other lower fungi, including Rhizidiomyces, Thraustochytrium, Schizochytrium, and Paramoebidium species, may all contain similar inapparent DNA viruses. Clearly the size of these icosahedrons and the DNA content resemble Adenovirus, and not the much larger algal phycodnaviruses (625 S vs 6340 S respectively). No uninfected isolates of lower fungi have been seen. This viral ecology is seldom mentioned in fungal or viral textbooks, but it seems distinct from that of both algae and higher fungi (most higher fungi are persistently infected with dsRNA viruses). This situation clearly resembles a lysogenic state in bacteria or, more specifically, the relationship that phycodnaviruses (phaeovirus) have with their filamentous brown algal host, including the link of latent virus induction to sexual gamete production. Although ubiquitous, this family of virus has not been well characterized. It might be predicted that such persisting agents would need to have genes that compel the host to maintain the infection (such as addiction strategies and modules). In addition, their ubiquitous nature would also predict that competition between host specific viral agents for host colonization should be fierce. However, essentially nothing is known about the genes of these viruses or how they attain stable persistence.

DNA Viruses in mitochondria of higher fungi. The virus situation in higher fungi clearly appears to be distinct from that described above. Higher fungi are known to frequently support linear DNA plasmids (defective virus), but all these agents are organelle associated and they infect mitochondria, not nuclei. Infection of plastids by genetic parasites is essentially unknown in most higher eukaryotes. For example, no animal virus is known for mitochondria. Related linear DNA plasmids, however, are found in many filamentous fungi, yeast and some higher plants. All have terminal inverted repeats with 5’ covalently attached terminal proteins, although a few have covalently closed snap back ends. These plasmids generally encode 2 ORFs, which usually include a DNA polymerase and an RNA polymerase that are clearly similar to those found in Adenovirus (e.g. pMC3-2), but more distantly related to those of the linear phages PRD1 and φ29, all of which also have 5’ terminal protein DNA priming. A distant phylogenetic relationship to linear plasmids found in soil bacteria (Streptomyces) can also be seen. This type of protein primed polymerase is clearly of viral origin and is not found in any host cell. Also, the plasmid DNA will replicate via a process that is the same as that of Adenovirus. It is therefore very curious to note that these mainly mitochondrial plasmids are clearly similar to the mainly nuclear adeno-like DNA viruses seen in lower aquatic fungi. It seems more than coincidental that capsid encoding nuclear DNA viruses are ubiquitous in lower fungi, but that defective, non-capsid versions of such a similar virus family are ubiquitous in the mitochondria, but not the nucleus, of higher fungi.

Killer phenotypes, addiction and DNA parasites. All these linear plasmids can be grouped according to the conservation of their DNA pol sequence, such as that of the prototype pMC3-2. Two major pol groups are apparent, corresponding to those found in yeast species and filamentous fungi respectively. In Saccharomyces kluyveri, the pSKL plasmid has a terminal protein that resembles those in both adenovirus and φ29 of B. subtilis. Some yeast harbor two plasmids. Kluyveromyces lactis and other yeast species have multi-partite linear DNA plasmids (pGKL1/2) that have genes in addition to the DNA and RNA polymerases. The additional genes confer a killer phenotype via encoded toxin and immunity genes. With the pGKL1 - pGKL2 plasmid set, pGKL2 codes for the RNA polymerase but it is pGKL1 that codes for additional genes essential for maintenance and immunity, such that the two elements are complementing. In this case, plasmid persistence is attained via an addiction module in that pGKL1 codes for a three subunit killer toxin gene and an immunity gene against the toxin as well as a DNA polymerase. pSKL is related to pGKL1 in organization but not by hybridization and doesn’t confer a killer phenotype. The RNA polymerase encoded by this plasmid is most similar to the 140 kdal subunit of yeast pol II. Circular forms of related DNA plasmids (LaBelle and Harbin-1) are also known, but are uncommon and confined to a few genera of yeast. Such killer plasmids or viruses have not been reported for any filamentous fungi. Furthermore, in yeast species, all of the linear plasmids appear to be cytoplasmic whereas in filamentous fungi, the linear plasmids are all mitochondrial. Thus these two groups of plasmids appear to make up distinct types, not just in sequence, but in their relationship to their host.

Plasmid phenotype and relation to other elements. These linear plasmids have additional similarities to other genetic agents. For example, the plasmid terminal nucleotides, which correspond to the origin of replication and attachment point of the terminal protein, clearly resemble the copia transposon sequences so common in insect genomes. In Actinomycetes bacteria, related linear plasmids are also known. In these soil bacteria, these plasmids generally code for advantageous abilities, such as nutritional versatility or the degradation of toxins such as phenols, so they clearly have phenotypes. Similar but cryptic elements (with few genes), which are also called hairpin elements, can also be found in plant pathogenic (but not nonpathogenic) Rhizoctonia solani. These plasmids identify a complementing but persisting system of multiple cryptic elements. However, this observation makes another important point. The stable maintenance of a two genome system (a simple version of a dikaryon) can be attained by a combination of an element with an addiction module and a second suppressing defective that allows persistence.

DNA parasites that affect longevity. Although the majority of these linear plasmids of filamentous fungi are not associated with killer phenotypes, many do have other phenotypes in their host and some can affect the host longevity. In Podospora anserina, the pAL2-1 plasmid can integrate into mitochondrial DNA and as a consequence stabilize mtDNA from age dependent degradation and induce longevity in the host, allowing colonies to grow to a ten fold larger radius. Given the evolutionary importance of longevity and hyphal growth to the survival of higher fungi in large fungal mats, this is a very interesting characteristic for a genetic parasite to endow onto its host. Various forms of these pAL plasmids are seen in natural isolates (78 have been characterized), and all appear to have a terminal protein bound to the 5’ end of the DNA. These plasmids all have a common central DNA pol region, although the terminal inverted sequences do vary in a host specific way. It appears that mtDNA with an integrated pAL plasmid becomes stable to degradation. But the integrated copies are not suppressive of wt mtDNA genomes, which is in sharp contrast to the situation seen in Neurospora below. The phenomenon of linear plasmids in Podospora is not a laboratory curiosity, as it is found in natural settings. In Podospora anserina, 14 of 78 isolates from natural populations were shown to carry a pAL2-related longevity inducing linear plasmid. Curiously, in these natural isolates most of these plasmids were not mtDNA integrated and only one strain showed the longevity phenotype. As most of these plasmids were not inserted into mitochondrial DNA, they were being maintained as autonomous linear mitochondrial replicators. The presence of mixed plasmids was also common and one isolate was described as having 12 different plasmids. In addition, giant (50 kb) tandem plasmids were also reported. There was no evidence of plasmid-plasmid interaction. Loss of some plasmids was often seen during sexual transfer, consistent with the above predictions to explain the male-like behavior of nuclei to escape genetic parasites. These linear plasmids are inherited maternally, as they are mitochondrial plasmids, and in field isolates they are readily transmitted between isolates. The transmission of these linear plasmids is clearly restricted by the vegetative incompatibility of the host, which was shown to decrease (but not eliminate) transmission rates by ten fold.

Fungal mitochondria. The association of various genetic parasites with mitochondria of filamentous fungi is a rather unique feature of fungi. It is worth considering what is known about fungal mitochondria that might illuminate this unique situation. The well studied Ascomycetes mtDNAs range from 19 kb to 115 kb in size, which is much smaller than the mtDNA of higher eukaryotes. Fungal mtDNA codes for 2 rDNA subunits, tRNAs and introns (some N. crassa introns are infectious). Unlike some fungal nuclear genomes, mtDNA can tolerate the presence of repeated DNA sequences. For example, N. crassa mtDNA contains a series of GC rich palindromes. Also, unlike other mtDNAs, fungal mtDNA recombines readily, as do the parasitic DNA plasmids of these mitochondria. Clearly mtDNA is not subjected to the RIP system or other repeat limiting systems of nuclear DNA. In Podospora anserina, rogue mtDNA replicons can occur by intron invasion into the Cox gene, leading to senescence. Yet integration by pAL2 DNA can lead to stabilization of mtDNA and extend lifespan. In addition to these linear DNA plasmids, fungal mitochondria also seem to be highly prone to colonization by other types of genetic parasites. For example, dsRNA based virus-like particles are often found in mitochondria. In Neurospora, the Mauriceville and Varkud retroposon-like elements will invade and disrupt mtDNA. As to why these mitochondria are so prone to genetic parasites, there is currently no clear answer. It seems likely that the capacity of fungal mitochondria to undergo recombination might provide a good molecular habitat for the linear and circular DNA plasmids and the retroposons. However, this feature would not be expected to support colonization by dsRNA viruses. Perhaps the answer lies outside of the mitochondria. It might be that the ubiquitous presence in fungi of homology-dependent gene silencing systems, which operate both transcriptionally and post-transcriptionally (such as quelling and RNA interference), would provide a molecular habitat that would prohibit many nuclear and cytosolic genetic parasites.

Neurospora and senescence. Neurospora crassa is also commonly colonized by natural mitochondrial genetic parasites. Linear dsDNAs, such as the Kalilo agent, are the best studied of these elements in Neurospora. These mitochondrial agents also strongly affect host longevity; however, in the case of Neurospora, longevity is always decreased (i.e. senescence induced) and not extended as in Podospora with the pAL2 plasmid. Neurospora mitochondrial genetic parasites include not only these linear DNA plasmids but also retroposon elements such as the Mauriceville and Varkud retroplasmids, which are related to the pFOX plasmid of Fusarium oxysporum. These mitochondrial retro elements use an unusual and possibly primitive form of RT polymerase activity in that it is not primer dependent and may represent a transition between an RNA polymerase and a DNA polymerase. In this case, the 3’ end of the template is similar to tRNA. Also, fungal mitochondria have group II introns, which are self splicing, code for RT and are invasive. All of these agents have a tendency to induce erratic growth and senescence in infected Neurospora. Like the linear plasmids described above, the kalilo plasmid is a 9 kb linear DNA that also codes for an Adenovirus-like DNA and RNA polymerase. The TIR ends are the most diverse part of the sequence and show a phylogeny that is congruent with the host nuclear DNA. Transmission of these elements between species is common but mating type vegetative incompatibility reduces transmission rates ten fold, leading to suggestions that vegetative incompatibility may function to limit such parasite transmission. This plasmid and related plasmids are widely distributed across this genus.

Yeast retroposons. Yeast species also have other types of retroposons, belonging to the Ty3 (gypsy) and Ty1 (copia) families. Unlike Tad (the LINE-like element) of Neurospora, these yeast elements are more typical retroposons and have associated LTRs. These retroposons code for gag, reverse transcriptase, protease, RNAse H and integrase like genes. In keeping with the general absence of fungal viruses with a natural extracellular route of transmission (the DNA plasmids above and the mycoviruses below), the yeast retroposons also do not code for the main external structural protein of the retro virion, the product of the env gene. These retro-elements appear to represent the first instance of genome colonization by defective endogenous retroviruses, although the numbers and diversity of these elements are very much smaller than that which occurs in plants and animals. For example, the LINE-like elements (Tad, CRE1, SLACS-trypanosomes) are present in relatively low (unique) genomic copy numbers in some fungi.

Mycoviruses. As previously mentioned, mycoviruses of fungi are generally latent and highly prevalent in numerous fungal species. The majority of these viruses have double stranded RNA genomes and, like most fungal viruses, lack an extracellular transmission phase.

Totiviridae, as mentioned for protozoa, have a single RNA genome, of which the L1 virus of yeast is the prototype. This dsRNA virus establishes a permanent persistent infection. These agents can be found in 9 genera of fungi and four genera of protozoa (such as Leishmania LRV-1). Totiviruses can also encode killer phenotypes. Partitiviridae have multipartite dsRNA genomes and can be found in an additional 9 genera of fungi. Only in Candida, with no known sexual cycle, and in Podospora have no dsRNA viruses been found. In all cases, only persistent infections are observed.

Killer viruses. In various yeast hosts, these totiviruses (like the DNA plasmids above) can also code for killer character via a diverse set of toxins that kill by membrane disruption, although some (K28) can also stop DNA synthesis. The L-A killer encoded dsRNAs are the best studied yeast killer agents. As these killer agents also code for an immunity protein to the matched toxin, this makes the gene set an addiction module. A major coat protein is also generally encoded. However, as described below, many of these agents are cryptic or are satellites to other agents, with reduced coding capacity. Killer dsRNA was first discovered in Saccharomyces cerevisiae. In 1963, Bevan and Makower first described the killer characteristics, and it was found that corresponding VLPs were produced. Killer viruses were later shown to be ubiquitous in natural and lab isolates and most strains of Saccharomyces cerevisiae have killer virus, although curiously, some field studies show a relatively low prevalence (1-5 %) of these agents in natural yeast species.

Killers and Beer. Yeast have been extensively studied due to the economic importance of their fermentation properties as used in the brewing and bakery industries. The very large industrially grown cultures of yeast are susceptible to infection by agents present in wild yeast isolates, which can result in major disruption to production. Operationally, killer systems of wild type yeast pose a big problem and represent a major potential disruption to the brewing industry. It is therefore interesting that the best defense that the brewing industry has developed against exogenous yeast genetic parasites has been to colonize the industrial yeast cultures with their own protective versions of killer viruses. This is operationally very similar to the dairy industry and lactobacillus mediated fermentation of milk products, which are also threatened by wild phage and best protected by the immunity of various persisting lactophage.

The L-A killer virus has a 4.7 kb L1 dsRNA genome, which undergoes encapsidation into icosahedral particles that are present at several thousand particles per cell. This RNA encodes a major coat protein and an RNA dependent RNA polymerase. As with dsRNA phage and Reovirus, transcription of the dsRNA template occurs within a viral core particle, and this polymerase shows clear similarity to those polymerases, but not to polymerases of ssRNA viruses. Core particle transcription results in the production of uncapped viral mRNAs, which might be a target of the SKI host defense system. However, it seems likely that such virion associated transcription could shield the viral transcription system from some host silencing that targets transcription. In some cases (k1 killer strains), a second satellite dsRNA exists (the M1 satellite to L1), which codes for preprotoxin. The toxin encoded by this second dsRNA, the 1.9 kb M1 RNA, is a dimeric, exocellular toxin and the RNA also codes for specific immunity to the toxin. The toxin alone will kill sensitive yeast strains. Additional satellites (M2, M3, M28, etc.) that encode additional killer toxin genes are also known, so there is clearly much natural diversity in these killer and satellite systems. In laboratory handling, yeast strains are usually used as haploid killer strains and can frequently have multiple killers in different clones. These dsRNA viral genes can also be involved in linear DNA plasmid incompatibility, indicating some clear interaction between dsRNA agents and dsDNA agents. The details of this interaction have not been explored, however. No recombination is seen in these dsRNA viruses. In nature, they appear to be maintained by vertical transmission, but it is likely that hyphal fusion is a major process of transmission in hyphal viral forms. Defective interfering versions of these viruses are naturally occurring and these defectives generally lack the preprotoxin coding sequence. The effects of defectives on host colonization by the infectious versions of virus are not always clear, but in the case of M, their absence results in a massive increase in L-A dsRNA and particles, although cells remain healthy; thus they clearly can interfere with the similar viruses. In this regard, the defectives appear to allow a more stable form of low level persistent host infection. Curiously, replication of these defective RNAs has a different dependence on host genes (requiring the non-essential SKI super-killer genes) than does that of the helper viruses, indicating that defectives have a distinct, albeit poorly understood, relationship to their host. Ironically, some of these host SKI genes are members of a genetic system which is thought to function to search out and destroy uncapped mRNAs.

The killer toxin, when produced by these satellite viruses, undergoes processing and secretion. The toxin acts by binding to the glucan cell wall receptor and kills sensitive strains via plasma membrane interaction and altered permeability. Immunity is not well understood but appears possibly to work by masking this receptor. Similar toxin/immunity systems are seen in Yarrowia, Aspergillus and Penicillium species, so unlike the linear dsDNA viruses, the dsRNA viruses show killer phenotypes in both yeast and filamentous fungi. Virus transmission is likely to occur via hyphal fusion in filamentous species, but in yeast species transmission is not well studied in natural settings. During mating, however, the yeast cell wall is dissolved and it is likely that these spheroplasts can then be infected. Consistent with this, laboratory infection has been achieved during the mating of haploid strains in the presence of killer virus, resulting in a fraction of offspring that acquired the killer phenotype. A curious distinction of these dsRNA viruses is that they package plus stranded ssRNA, which suggests that these fungal viruses are a distinct order of virus.

Although dsRNA viruses are common in many fungi, not all of these agents encode capsid genes or make particles (such as the chestnut blight virus below). Many thus appear to be more defective versions than the killer viruses described above, and it is not clear if they can exploit other less defective dsRNA viruses for packaging or if they provide interfering capacity to the host cell. However, since viral packaging and replication sequences are generally the same sequence, this would require that satellites maintain sequence similarity to potential helpers, and this is not always seen. In all cases, however, infection appears to be persisting and generally inapparent.

ssRNA circles. ssRNA viruses of fungi are also known, although they are not nearly as common as dsRNA viruses. Of these viruses, the 20S replicon has been best studied. This replicon is highly unusual relative to all other RNA viruses in that it is a circular RNA. Thus it requires internal initiation of translation since it would not produce capped mRNA. Interestingly, this might also protect it from an SKI defense system. These 20S RNA virus particles are induced under conditions that induce sporulation (high temperature, acetate medium) and can under such conditions amplify up to 10,000 fold. In this character, they are similar to the nuclear DNA viruses induced in sporulation of the lower aquatic fungi described above.

Fungal dsRNA and partitiviruses. In filamentous fungi, dsRNA virus infections are also prevalent and were first described in 1948 during commercial production of mushrooms. These viruses show clear similarity to viruses found in other protists, such as GLV of Giardia lamblia and Trichomonas virus (a virus associated with host phenotypic variation). In filamentous fungi like Penicillium and Aspergillus, one can also find partitivirus, which has a bipartite genome with two segments that are closely related to those of plant viruses. Other viruses similar to partitiviruses also infect plants. Multipartite viruses are also known. In the case of Ophiostoma novo-ulmi, virus infection involves more than 7 dsRNA species, only one of which encodes an RNA dependent RNA polymerase, one that resembles mitochondrial RNA polymerase. It is not known how the transmission of such a multipartite genome is coordinated. Unlike the situation with influenza and other multipartite viruses, the absence of an extracellular phase would not allow co-packaging of these genetic components. In Cryphonectria parasitica, a dsRNA agent that is transmitted via hyphae and also by meiotic spore formation is found in the mitochondria. This virus codes for an RNA polymerase that is similar to those found in RNA phage.

Multipartite CHV1 and fungal hypovirulence. Among the best studied and more interesting of the fungal dsRNA viruses are the viruses of the chestnut blight fungus. Like most fungal viruses, these also lack an extracellular phase. In addition, the dsRNA does not code for a capsid gene, but it still becomes membrane enclosed and these bound structures are sites of RNA polymerase activity. These viruses came to the attention of biologists because of their ability to induce hypovirulence in some infected fungi. The chestnut blight fungus, Cryphonectria parasitica, is native to the oriental species of Chestnut tree and was first identified as a major problem in North America in 1904, following accidental introduction into America. The fungus also spread to Europe where it had a similar devastating effect on the European species of Chestnut tree. However, in the 1950’s a markedly reduced virulence of the fungus was noted in some trees and this fungal strain was called hypovirulent. Hypovirulence was associated with decreased asexual spore formation and other phenotypes. It was shown to be due to a cytoplasmically inherited factor later identified as Cryphonectria hypovirus 1 (CHV1-EP713). The tripartite (M, L, S) dsRNA of this virus has unusual termini, 3’ poly A (L RNA) and 5’ poly U (M RNA), and codes for two gene products, one an RNA dependent RNA polymerase and the other a helicase, both of which undergo proteolytic self cleavage. The virus shows relationships to plant viruses. Defectives of these viruses are very common. Like the other fungal viruses, CHV1-EP713 is transmitted via hyphal fusion. Infected fungi show various alterations, such as changes in growth rate, female infertility, reduced asexual sporulation and changes in gene expression, including genes involved in signal transduction. Field studies show considerable variation in these dsRNAs, including substantial diversity in the RNA pol sequence as well as diversity in effects on host gene expression. In Europe, about half of the field isolates are infected and virus transmission between types and isolates occurs in most pairings, indicating that vegetative incompatibility did not prevent virus transmission. Because of the clearly beneficial effect on Chestnut trees, there has been considerable effort to use these viruses for biological control of the fungal disease. These efforts have been successful in Europe. However, in spite of considerable effort and clear European success, much less success has been attained using these viruses to control disease in American trees, outside of some areas in Michigan.

The most accepted theory to explain this situation is that a high and diverse level of fungal virus already exists in and colonizes the American Cryphonectria parasitica fungal population relative to that found in the European fungus. These diversely infected American populations severely limit the spread of the hypovirulent European viral strains, such as CHV1-Euro7. Some direct measurements of existing viral diversity and its relationship to virus spread support this theory. This suggests that viral-viral competition (spread in the face of competitor persistence) is a major issue with respect to the fitness of the hypovirulent virus. It would also appear to identify a major element of viral fitness as it applies to a realistic ecological setting. Thus overcoming the seemingly limited and ‘selfish’ genetic effects of a host colonized by defective hypoviruses can have a crucial role in the ecological outcome. It seems likely that mycoviral gene functions have evolved that provide this essential ability to compete with other persisting viruses, thus allowing successful persistence or transmission. Such functions would appear to be nonessential ‘accessory functions’ when evaluated in a host free of viral colonization. Yet in the natural habitat, viral colonization of fungi is essentially inevitable, making the ability of one virus to counter prior colonization by another virus an essential phenotype. In this case, as with the T4 RII gene, it is not simply the host that determines the fitness of a virus, but other competing viruses as well.

It is therefore worth considering why all these distinct characteristics of fungal virus-host interactions (i.e., linear plasmids, killer phenotypes, mitochondrial infections, distorted senescence, ubiquitous dsRNA colonization) are mostly peculiar to the fungal orders of organism yet generally absent from the plants and animals that are descendants of fungi, and also absent from the algal or prokaryote predecessors. Perhaps these genetic parasites are an integral part of the successful genetic milieu of the fungi themselves.

Protozoan viruses.
(Widmer and Dooley 1995)

Viruses of fungi.
(Dawe and Kuhn 1983)
(Ginsberg 1984; Ghabrial 1998)
(Dawe and Nuss 2001)

Interferon and fungi.
(Ellis and Kleinschmidt 1967)

Fungal and Neurospora biology.
(Davis 2000)
(Gow 1994)

Fungal incompatibility and longevity.
(Glass, Jacobson et al. 2000)
(Hirsch, Eckhardt et al. 1995)
(Smith, Duchesne et al. 1990)

Linear plasmids: DNA/adeno-like.
(Rohe, Schrage et al. 1991; Meinhardt, Schaffrath et al. 1997)

Linear plasmids and longevity.
(Hermanns and Osiewacz 1996; van der Gaag, Debets et al. 1998)

Linear plasmids and relationship of killer to virus.
(Klassen, Tontsidou et al. 2001)
(Hishinuma and Hirai 1991)
(Sturley, El-Sherbeini et al. 1998)
(Wickner 1996)

Fungal gene silencing.
(Pickford and Cogoni 2003)
(Cogoni 2001)

Cogoni, C. (2001). "Homology-dependent gene silencing mechanisms in fungi." Annu Rev Microbiol 55: 381-406.
Davis, R. H. (2000). Neurospora: contributions of a model organism. New York, Oxford University Press.
Dawe, A. L. and D. L. Nuss (2001). "Hypoviruses and chestnut blight: exploiting viruses to understand and modulate fungal pathogenesis." Annu Rev Genet 35: 1-29.
Dawe, V. H. and C. W. Kuhn (1983). "Virus-like particles in the aquatic fungus, rhizidiomyces." Virology 130: 10-20.
Ellis, L. F. and W. J. Kleinschmidt (1967). "Virus-like particles of a fraction of statolon, a mould product." Nature 215(101): 649-50.
Ghabrial, S. A. (1998). "Origin, adaptation and evolutionary pathways of fungal viruses." Virus Genes 16(1): 119-31.
Ginsberg, H. S. (1984). The Adenoviruses. New York, Plenum Press.
Glass, N. L., D. J. Jacobson, et al. (2000). "The genetics of hyphal fusion and vegetative incompatibility in filamentous ascomycete fungi." Annu Rev Genet 34: 165-186.
Gow, N. A. R. (1994). The growing fungus. New York, Chapman & Hall.
Hermanns, J. and H. D. Osiewacz (1996). "Induction of longevity by cytoplasmic transfer of a linear plasmid in Podospora anserina." Curr Genet 29(3): 250-6.
Hirsch, P., F. E. W. Eckhardt, et al. (1995). "Fungi active in weathering of rock and stone monuments." Canadian Journal of Botany 73(SUPPL. 1 SECT. E-H): S1384-S1390.
Hishinuma, F. and K. Hirai (1991). "Genome organization of the linear plasmid, pSKL, isolated from Saccharomyces kluyveri." Mol Gen Genet 226(1-2): 97-106.
Klassen, R., L. Tontsidou, et al. (2001). "Genome organization of the linear cytoplasmic element pPE1B from Pichia etchellsii." Yeast 18(10): 953-61.
Meinhardt, F., R. Schaffrath, et al. (1997). "Microbial linear plasmids." App.Microbiol.Biotech. 47: 329-336.
Pickford, A. S. and C. Cogoni (2003). "RNA-mediated gene silencing." Cell Mol Life Sci 60(5): 871-82.
Rohe, M., K. Schrage, et al. (1991). "The linear plasmid pMC3-2 from Morchella conica is structurally related to adenoviruses." Curr.Genet. 20: 527-533.
Smith, M. L., L. C. Duchesne, et al. (1990). "Mitochondrial genetics in a natural population of the plant pathogen armillaria." Genetics 126(3): 575-82.
Sturley, S. L., M. El-Sherbeini, et al. (1998). "Acquisition and expression of the killer character in yeast." Journal of: 179-208.
van der Gaag, M., A. J. Debets, et al. (1998). "The dynamics of pAL2-1 homologous linear plasmids in Podospora anserina." Mol Gen Genet 258(5): 521-9.
Wickner, R. B. (1996). "Double-stranded RNA viruses of Saccharomyces cerevisiae." Microbiol Rev 60(1): 250-65.
Widmer, G. and S. Dooley (1995). "Phylogenetic analysis of Leishmania RNA virus and Leishmania suggests ancient virus-parasite association." Nucleic Acids Res 23(12): 2300-4.

Possible figures:

5-1. Dendrogram of +RNA viruses of oceans

5-2. Diagram of fungal killer agents (need)

5-3. Table of aquatic microorganisms and their viruses (need)

5-4. Table of mitochondrial parasites (need)

CHAPTER VI

A VIRUS ODYSSEY FROM WORMS TO FISH: viruses of the early animals and the aquatic animals

In the prior chapter, we considered the viruses of fungi and their particular relationship to both lower and higher fungi. We now turn to the virology and host evolution of animals (and later higher plants), which have evolved from or depended on fungi for their evolution. As was previously presented, with few exceptions, fungi show a general ability to support both DNA and dsRNA viruses. However, these fungal viruses, although ubiquitous, are generally lacking in an extracellular form. As fungi are accepted as being representative of the predecessors to both higher plants and animals, they represent a most significant point of bifurcation in the origin of these two higher orders of organisms. We can thus propose with high confidence that the ancestors to both higher plants and animals are expected to have been exposed to representatives of the various fungal viruses. The relationship of plants to fungi, especially the filamentous fungi, appears to be clear. Both are nonmotile branching organisms that use growth to acquire nutrients from the local environment. Also, the common symbiosis between fungi and photosynthetic algae makes it easy to envision how higher plants might have evolved from these simpler fungal ancestors. With respect to dsRNA viruses of fungi, we can also observe clear relationships between the viruses that infect fungi and those that infect higher plants, consistent with a linkage of virus to host evolution. Also, many of the molecular processes that the host uses to control virus replication (e.g. non-adaptive gene silencing, RNA interference) can be found in both fungal and plant lineages. However, there are also some viral discontinuities between these host lineages. For example (and most curious), the very prevalent large dsDNA viruses of algae, the phycodnaviruses, are not found to infect any fungi or higher plant species. Nor are the Adenovirus-like parasites of fungal mitochondria found in plants, yet clear relatives of these viruses are common in fungi and animals. These and other issues concerning the general relationship of viruses to higher plants are discussed in chapter 7 to follow.

Dictyostelium and a virus paucity. From a phylogenetic perspective, Dictyostelium is most closely related to fungi and protozoa. However, Dictyostelium displays many of the characteristics of an early animal, so it appears to represent an evolutionary step in the origin of animals. Dictyostelium can be considered to represent the social amoebae, or the Amoebozoa, which is an order that is basal to both the higher fungi and animals but was not an apparent direct ancestor to animals. One distinction between Amoebozoa and fungi, however, is that unlike filamentous fungi, which are predominately haploid and only transiently diploid, Dictyostelium is a stable diploid and is haploid only after meiosis during spore formation. In this feature Dictyostelium more closely resembles higher animals. Dictyostelium has been an intensively studied model system due to the social character of the assembly of its cells. Thus it is most surprising to realize that there are no reports of viruses in the literature for any Dictyostelium species. Given that there has been substantial electron microscopy (EM) study of Dictyostelium, especially from the perspective of the analysis of the formation of pre-spore vesicles and spore coat protein assembly, it seems likely that any prevalent assembly of virus-like particles would have been observed by now. It thus seems that assembly of viruses is rare or entirely absent in the studied Dictyostelium species. This would suggest that Dictyostelium

may have developed some highly efficient systems that prevent colonization by viral agents.

Antiviral systems of dictyostelium - origins. Historically, the presence of antiviral systems was not well studied in dictyostelium. It was known that dictyostelium have SKI8 (super killer) related genes, which are used to control dsRNA virus infections in yeast species. Thus some antiviral genes appear to be present in dictyostelium genomes. However, more recently it has become clear that, like C. elegans (discussed below), dictyostelium also encodes the genes needed for an RNA interference (RNAi) response. The RNAi system responds to dsRNA by interfering with and degrading the complementary mRNA. The RNAi response is also both amplifying and transmissive in that it can spread to nearby cells and transmit the RNA-specific interference. In dictyostelium, three genes related to the rrf genes of C. elegans have been reported. These genes code for a protein highly homologous to the RNA dependent RNA polymerase (RdRP) of C. elegans, which is involved in amplifying the dsRNA signal. It therefore seems very likely that, like C. elegans, dictyostelium also amplifies RNAi via these cellular RdRPs. The likely fungal ancestors of dictyostelium do not appear to have had this antiviral system. Thus it seems very likely that this RNAi system may help account for the apparent absence of RNA viruses from dictyostelium species.

At this point, it would be most interesting to consider the likely origin of such a cellular system. The general virus-like characteristics of RNAi are striking. Besides the amplifying and transmissive nature of RNAi, the participation of RdRP is an especially virus-like characteristic, since these enzymes are the basal replicative enzymes of all RNA viral genomes, but are not basic to the function of the host cell. This functional similarity seems to strongly imply that RdRP may have evolved from a viral ancestor. Yet no apparent sequence homology can now be seen between these cellular RdRPs and known viral RdRPs. However, the complete absence of RdRPs from prokaryotic genomes makes a viral origin the most likely explanation. In this regard, the ubiquitous occurrence of persistent infections of diverse dsRNA viruses in filamentous fungi, described in the prior chapter, and the fact that these cells are the evolutionary progenitors to dictyostelium, make it very plausible that cellular RdRPs resulted from the colonization of the host by some ancient, persisting RNA viral agent that was very successful at precluding competing dsRNA and other viral parasites via the RNAi system. In this scenario, it is worth recalling that stable viral colonization of a host generally requires the acquisition of some viral phenotype that allows or compels persistence and withstands the onslaught of genetic competitors. The RNAi system is known to silence genomic transcripts and degrade those RNAs that show some relationship to the silenced gene. RNAi, for example, is known to silence retroposon elements in the C. elegans genome, but not to prevent their reactivation by other signals. Thus RNAi seems to have the needed characteristics of a system for persistence and preclusion of competitors. However, besides the possibility that an RNA virus might have contributed the RdRP, a retrovirus must also have been involved by providing the reverse transcriptase needed to copy this gene into DNA and integrate it into the dictyostelium genome.

Dictyostelium genome and genetic parasite colonization. In terms of genomic parasitic elements, dictyostelium clearly have some rather unique elements that are related both to retroposons and to type II DNA mediated elements. In contrast to the N. crassa genome, about 10% of the dictyostelium genome is made up of such repeating elements. The most abundant and best studied of these elements is the DIRS-1 element, present at about 200 copies per genome. Thus, unlike higher fungi such as Neurospora, the dictyostelium genome clearly tolerates the acquisition of repeated elements. However, these DIRS elements are most unusual relative to the vertebrate LTR retroposons or endogenous retroviruses in that their LTRs are inverted or ‘split’ and they integrate without creating a duplicated sequence. They do have several ORFs, one of which encodes a reverse transcriptase (RT). This DIRS RT shows a clear sequence relationship to those of Ty3/gypsy and the vertebrate endogenous retroviruses. However, unlike those retroviruses, but like many fungal viruses, no coat or env genes have been seen in the dictyostelium genome, suggesting that these are either defective viruses or that the virus lacks an extracellular phase in its life cycle. Other DIRS ORFs are present but of unknown function. It is most interesting that outside of dictyostelium, DIRS-related retroposons (such as Tdd) are found only in nematodes, sea urchins, fish and amphibia (discussed below), consistent with a clear lineage relationship between these organisms. Another very distinguishing feature of DIRS elements is that one of the previously uncharacterized ORFs is now known to encode a protein clearly related to a lambda-like recombinase. The participation of such a recombinase in DIRS-1 integration would also explain the absence of repeats at the point of integration. The dictyostelium genome is known to harbor several other DNA elements, such as DDT. Thus several transposon families are represented in the genome. It is thus interesting to note that there appears to be strong evidence that interactions between these various retroposon families are occurring. All these elements show an insertion preference for loci in which other elements of the same family also reside. However, these other, interrupted elements lack sequence homology, so homologous recombination does not appear to be involved in this preferential insertion. The reason for this insertional preference has not been studied, but it could result from competition between colonizing elements as they seek to interrupt related competitors, similar to what is seen with some lysogenic phage in bacterial host chromosomes.

Mitochondria free of parasites. The mitochondria of dictyostelium have not been reported to host any genetic parasites, whether dsRNA, dsDNA or retroposon elements. Thus in this regard they differ significantly from fungi (and many plants). The dictyostelium mitochondrial genome (56 kbp) is large relative to those of fungi and is relatively devoid of intragenic or repeated sequences. However, the mitochondrial genomes of parasitic nematodes can be very small (13,747 bp, circular) and in some cases these genomes are multipartite.

Overall, the virology of dictyostelium species is in sharp contrast to that of fungi and algae in that dictyostelium appear to harbor no known acute or persisting autonomous virus in the nucleus, cytoplasm or mitochondria. However, all the dictyostelium genomes are clearly colonized by atypical retroviral-like agents that were not found in fungi or protozoa, and these elements show a clear relationship to bacterial viruses as well as to vertebrate RT-encoding retroviruses. The biological consequences of these colonizing elements are not yet clear. Their mechanism of persistence, their effect on host fitness or longevity, and their effects on other competing genetic parasites have not yet been evaluated. However, their invariant presence in all dictyostelium lineages could suggest a direct involvement of these parasites in the origins of these lineages. Also, with dictyostelium we saw the first representative of the evolution of the RNAi system of RNA silencing. This defense system appears to offer a possible explanation for the absence of many viral agents from dictyostelium. However, because the RNAi system itself is dependent on RdRP, it seems most likely that the RNAi system may owe its own origins to a distant persisting genomic parasite derived from a dsRNA virus.

Nematodes and virus. Nematodes are clearly more developed organisms than dictyostelium and represent an important and clear transition towards the evolution of higher animals. The nematoda order is directly under the Metazoa classification and thus represents a basal component of the animal lineage. Nematodes produce most of the cell types typical of animals, including all three germinal layers: endoderm, mesoderm and ectoderm. The unsegmented nematode is tubular with bilateral symmetry, typical of higher animals. The tissue types include a neural system, a muscular system and a digestive tract. Thus nematodes have been especially examined as models for the study of tissue development and differentiation, and consequently the programmed fate of every cell from embryo to adult has been mapped. Overall, these small worms can have two distinct life styles: autonomous or parasitic. Caenorhabditis elegans is by far the best studied of these autonomous nematodes and its genome was the first animal genome sequenced. Nematodes have a small number of chromosomes (4-5), for a total DNA content of 97 Mb. Most interestingly, some of the parasitic nematodes have an X/Y chromosomal system for the determination of sex, which represents an early example of this sex strategy, common to so many higher animals. Nematodes undergo cell aging and programmed cell death like higher animals and do not appear to show the highly extended life span of some higher fungi. Parasitic nematodes can also undergo a process of chromosomal diminution (in non-germ somatic cells) involving loss of DNA sequences. Since this process is limited to somatic, not germ line, nuclei, it is reminiscent of the macronuclear (but not micronuclear) DNA rearrangements of ciliates.

Parasitic (endoparasitic) nematodes have several distinctions from non-parasitic nematodes and present a global agricultural problem, as they are known to be important vectors for the transmission of many virus infections to crop plants. They do this by allowing virus to adsorb to specific mouth parts and mechanically transmitting virus to the plants being fed on by the nematodes. These plant viruses can physically persist but not replicate in the nematode mouth tissue (such as nematode transmitted TRV). Some fungi (Polymyxa betae) can also function as vectors for transmission (e.g. + ssRNA , Beet necrotic yellow vein virus (BNYVV) and Beet soilborne mosaic virus (BSBMV)) by specific persistence (but not acute replication) in resting motile fungal zoospores. In both the fungal and the endoparasitic nematode situation, it has been clearly established that various plant viruses have genes for the specific interaction with receptors and other molecules present in the fungal and/or nematode vector. Thus the plant-virus-nematode interaction is highly specific to both the virus and the nematode and involves specific viral structural proteins. Nematodes, like fungi, can clearly be involved as important vectors for viruses. This contrasts with the viral situation with which are also very common vectors for plant viruses (Ch. 7). However, unlike nematodes, cells replicate these RNA viruses. In spite of this frequent association between nematodes and virus as a vector for plant virus transmission, and in contrast to fungi, no persisting or acute virus has been described for nematodes of any kind. Nor are any parasites of nematode mitochondria known. In these two characteristics, nematodes seem most similar to dictyostelium and distinct from fungal and higher animal species.

Nematode non-adaptive immune systems. It seems likely that nematodes have developed (possibly from a dictyostelium-like predecessor) some system-wide defenses against most cytoplasmic and nuclear viruses. Like dictyostelium, nematodes also show a genome-wide occurrence of RNA interference. This RNAi system confers the epigenetic capacity of dsRNA to induce gene inactivation that will transmit to other tissues. In C. elegans and hydra species, the capacity to induce gene silencing is striking in its systemic nature and can be expressed in all tissues following regional exposure. The C. elegans RNAi response is even transmitted to the progeny of the individual RNAi-exposed worms. This systemic and transmissive response to RNAi is not found in any other animal, although regionally transmitted RNAi responses are found in plants. The systemic RNAi response is so efficient and prevalent in C. elegans that it was operationally used (via bacteria producing dsRNA) to genetically map the function of most (96%) of all its genes in particular chromosomes. Such a systemic system would clearly be expected to exclude the dsRNA (and ssRNA, via transcription and replication intermediates) viruses that are so prevalent in fungi. Although the evolutionary origin of the RNAi system has not been well evaluated, the genes involved have been mostly characterized. Recently the sid-1 gene was shown to encode a needed membrane spanning protein thought to be involved in some type of signal transduction. This protein is found in non-neuronal cells, but curiously is not found in neuronal cells. It is interesting that neuronal cells are resistant to the systemic effects of RNAi, yet neurons are not known to support infection by any virus, but they do respond to autonomous RNAi. Homologues to sid-1 are not uniformly conserved in other species that have elements of RNAi, such as drosophila, although homologues are present in mammals (mouse/human). The various genes that are related to RNA dependent RNA polymerase (RdRP), such as the ego-1 gene, appear to be of more central importance since they are needed for the germline RNAi response. The rrf-1, rrf-2 and rrf-3 genes are involved in somatic RNAi. All of these genes are also homologues of the RdRPs found in plants, but absent in drosophila and human cells. An additional gene is the dicer gene, which encodes an RNAse III endonuclease that is specific for dsRNA. The germ line ego-1 gene is also involved in suppression of the Cer retroviruses and retroposons present in the C. elegans genome, but as shown by the rde-1 mutant, RNAi can be inactivated without a corresponding increase in transposition, so the relationship of this function to retroposon control is not clear. As presented above with dictyostelium RNAi, the nematode RNAi system is also virus-like in its transmissive and amplifying nature. Furthermore, and like dictyostelium, the central role of RdRP suggests that this defense system might also have evolved from a stable colonization by a persisting RNA viral agent. That original agent could have superimposed the RNAi recognition system of gene silencing (possibly for persistence) while allowing preclusion of competing genetic elements.

Nematodes also lack DNA viruses. The additional absence of nuclear or mitochondrial dsDNA viruses (not just RNA viruses) from nematodes presents another problem. Perhaps this absence is only apparent, due to the limited search for DNA viruses of nematodes. Also, the apparent absence of DNA viruses would not seem to be satisfactorily explained solely by the presence of the RNAi defense system, since not all DNA viruses code for the dsRNA needed to induce the RNAi response. Since nematode genomes (unlike Neurospora) appear to tolerate repeated sequences (e.g. Cer retroposons, see below), there is no obvious genomic barrier to prevent genomic or extra-genomic colonization by a DNA replicon or episome. Currently, we lack a sensible scenario to explain the absence of nuclear DNA viruses in nematodes. It remains possible that nematodes have simply not been sufficiently studied to have identified genetic parasites (especially latent ones) that are prevalent in natural field populations.

Nematode genomes and genetic parasites. Overall, nematode genomes are compact and they have a relatively small number of retroposons (less than 1%) compared to the much larger numbers in plant (up to 90%) and animal genomes (about 35%). These worm retroposon numbers compare to about 300 LTR elements in various yeast species. In C. elegans, the most abundant of these repeat elements are the 124 members of the Cer family of elements. These Cer elements show a clear relationship via RT coding sequence to Ty3/gypsy and vertebrate retroviruses. Nineteen families of these elements are known and have been defined by RT and LTR similarity. A majority of these elements are located on the ends of the chromosomes. Most interestingly, and in contrast to fungi and dictyostelium, in Caenorhabditis elegans some of the Cer elements can be non-defective, encoding the full complement of retrovirus genes, including an env gene. This occurrence of env ORFs marks the first example in eukaryotic evolution in which full (nondefective) endogenous retrovirus genomes are present in the host genomes in significant numbers. However, similar to most higher eukaryotic genomes, the defective versions of nematode retroposon elements are significantly more numerous in the genome than are their full length retrovirus counterparts. These defective retroposons are often deleted for most of the coding regions, but most commonly they have deleted the env sequence. Yet, most interestingly, phylogenetic analysis of the Cer elements of the C. elegans genome indicates that the full length versions of the Cer retroviruses are basal to the larger numbers of defective family versions. Most of the 19 Cer families have one corresponding full length Cer element that is basal. Since the coding sequences of most of these full elements have been conserved, this strongly suggests that some selective pressure is maintaining the Cer ORFs and hence maintaining the colonization of the genome by the intact Cer viruses. The selected presence of the intact Cer sequences could also be allowing accumulation of the more numerous defective Cer copies. As we will see below, this pattern of a small number of conserved intact endogenous retroviruses but a much larger number of related defective retroposons will be seen in the genomes of various other higher animal species. Like the DIRS elements of dictyostelium, Cer elements also show a strong tendency to insert into other elements of their own retroposon class. It is interesting that the C. elegans Y chromosome, like that of animals, is particularly rich in retroposon sequences, such as the RT encoding TOY element. C. elegans is colonized by about 30 dispersed elements related to Tc transposons (Tc1/mariner), but these numbers vary in a strain specific manner. These Tc-like elements are either inhibited from activity in the germ line or defective for their corresponding transposase, although in the presence of functional transposase these elements will undergo transposition. However, unlike what is seen in larger eukaryotes and drosophila, the centromere is not a site of transposon colonization in C. elegans. C. elegans also has an abundant but rather unusual rolling circle Helitron DNA transposon element. The miniature inverted-repeat (MITE) elements are also very abundant in the C. elegans genome. ‘Tourist’ family members are also present; these encode a transposase and can also be found in plants, insects and vertebrates. Overall, we see in the C. elegans genome the colonization by full length retroviral elements and their more numerous defectives. We also see a striking lack of any other viral or other transmissible genetic parasites, similar to the dictyostelium virus situation.

Parasitic nematode species genomes. Parasitic nematode species, like Panagrellus redivivus, have genomes that are colonized by atypical DIRS-1 genetic elements that are nevertheless recognizably similar to those of dictyostelium species. The most obvious of the worm retroposon elements was also the most abundant element of the dictyostelium genome: the DIRS-1 group of retroposons. And like those of dictyostelium (but unlike vertebrate retroposons), the termini are inverted rather than direct repeats and they also conserve an ORF that encodes a lambda-like recombinase (a similar recombinase is found in insect baculovirus and in some plant mitochondrial genomes). Although related DIRS-1 elements have been found in other parasitic nematodes, they are not present in the sequenced genome of Caenorhabditis elegans. Thus genome colonization by these elements is host lineage specific and appears associated with the origin of nematode species diversification. Similar to Cer elements, DIRS-1 elements also insert preferentially into copies of themselves, often interrupting resident elements. However, about half of the DIRS-1 elements characterized so far insert in Cer elements as well, suggesting some interaction or competition between these two types of elements. Although found also in deuterostome species such as fish and amphibians, no DIRS-1-like elements have been found in insect or plant genomes. Phylogenetic analysis indicates that these elements are basal to the retroviruses of the Ty3/gypsy viruses, and also basal to the caulimoviruses of plants. Parasitic and free living nematodes also often have moderate to low copies of transposase encoding mariner-like (mle-1) and Tc1-like elements as well as numerous defective, non-transposase versions of such elements (frequently found on the X chromosome). Thus the parasitic nematodes can be differentiated from the free living nematodes by their patterns of colonization by genetic parasites and are more related to the parasitic elements of dictyostelium.

Nematodes basal biology. The small C. elegans worms also very much resemble in their biology that of the and insect larvae, in that they both are small motile worm-like forms that feed on algae. Nematodes are micro-consumers and will eat many types of algae and other cellular forms. Thus it is felt that these simple worm forms are basal to the more complex aquatic animals. All these worm-like larval animal forms also feed on algae, which, due to photosynthetic growth, are the most abundant source of nutrition in the ocean.

INVERTEBRATE AQUATIC ANIMALS: DEUTEROSTOME – PROTOSTOME DIVERGENCE

Aquatic animals are thought to have evolved from simple unsegmented worm-like organisms, similar in biology to C. elegans. Thus the very diverse array of animal lineages currently found in the oceans is the result of a major diversification of simpler worm-like forms. From the perspective of higher animal lineages, an early distinction in metazoan evolution would be the differences between deuterostomes and protostomes. The Deuterostomia order includes the chordata (chordates - vertebrates), urochordates and echinodermata (echinoderms). As the deuterostomes are progenitors to bony fish (and all vertebrates) with their highly developed adaptive immune systems, they are of special interest from the perspective of evolutionary and virology. Protostomia includes Mollusca (mollusks: bivalves, gastropods) and Panarthropoda (arthropods such as crustaceans (shrimp) and insects). The crustaceans are considered the progenitors of terrestrial insects; the virology of arthropods is considered in detail in the next chapter.

Aquatic farming and viruses. Interest in the virology of aquatic animals has been enormously stimulated in recent years by the large worldwide increase in marine aquaculture. Huge losses in farmed fish and shellfish have compelled greater attention to the study of the viruses of these organisms. Yet the farming of fish is a very old practice, dating back 3000 ybp in China. In addition, the occurrence of viral disease in fish farms was first reported as long ago as 1563 and was then associated with enlarged livers and spleens (due to carp poxvirus). Bony fish (like their other vertebrate relatives) are especially known to support the replication of a large number of types and strains of virus. Frequently, viral induced mortality in fish is associated with the infection of juvenile or larval forms, as some (but not all) adult fish tend to be persistently or inapparently infected. Viral infection of protostomes is also common, and here too, viral mortality is frequently associated with infection of larval or juvenile forms. Correspondingly, commercially exploited shellfish have also suffered huge losses due to viral infections. These farmed shellfish include mollusks, echinoderms, and crustaceans.

Overall patterns of virus in aquatic animals. Observations from mariculture have led to the identification of some overall and sometimes striking virus-host patterns. One general observation is that along with the diversification of aquatic animals, there has also occurred an associated diversification and radiation of the viral species that infect aquatic animals. This is especially apparent within the bony fish. Bony fish are known to support infection by almost all the types of viruses that are found in mammals, plus some types of virus that are otherwise found only in insects. These include many kinds of large DNA viruses, such as the iridoviruses (of which over 180 types are known in fish), the baculoviruses (especially found in shrimp and insects), the herpesviruses (found in both shellfish and bony fish), poxviruses, ascoviruses, adenoviruses and polyomaviruses, as well as other distinct classes of DNA viruses, such as WSSV. ssDNA viruses, gemini and densoviruses, are also known for aquatic animals.

With respect to RNA viruses, an equally broad array of viral types is known to infect fish, which includes + and +/- RNA viruses: togaviruses, picornaviruses, birnaviruses, nodaviruses, , and reo-like viruses. In addition, in the bony fish we now see, for the first time in evolution in any life form, the occurrence of negative stranded RNA viruses, which include rhabdoviruses, myxoviruses and orthomyxoviruses. In fact, within teleost fish, rhabdoviruses (such as VHSV, IHNV) constitute the single largest group of isolated viruses and are responsible for epizootics and heavy losses in aquaculture. Thus the sudden evolutionary appearance of this family of viruses is striking in fish species. In bony fish we also see, for the first time in evolution, the occurrence of an authentic autonomous, prevalent and disease associated retrovirus. Furthermore, some of the virus-tissue associations that are characteristic of mammals are seen in aquatic animals and their viruses. For example, alpha herpesviruses, which are known to establish latent infections in peripheral nervous tissue, are also seen in clams and specific fish species to establish persistence in ganglia. It is therefore highly ironic that with the origin of the highly complex adaptive immune system of vertebrate fish, we can also observe the surprising radiation of fish specific viral diversity. This fish virus-host association is in especially striking contrast to the virus-host relationships we described above for nematodes, but is also very different from that of the pre-chordate deuterostomes, such as echinoderms, urochordates and their viruses, which will now be described.

Echinoderms, the simplest deuterostomes, show a paucity of virus. Echinoderms are an example of a deuterostome that diverged early in the evolution of the chordates. By far, the best studied example of a basal deuterostome is the echinoderm sea urchin, Strongylocentrotus purpuratus. Though they are deuterostomes, sea urchins are clearly primitive relative to vertebrates, lacking even notochords. One of the more striking differences between echinoderms and bony fishes and amphibians is the presence of the much more complicated adaptive immune system of the vertebrates. Also, and of special interest to virologists, however, is the absence of a basal layer in echinoderms and their simple epithelium relative to vertebrate skin, since it is these rapidly differentiating basal cells that support the replication of so many fish viruses. One surprising fact, however, with respect to viral genetic parasites of sea urchins already seems clear: because sea urchins are commercially produced in substantial numbers in Japan and because of the intense interest sea urchins have received as an experimental model for the study of animal development, there has been ample opportunity to observe viral induced disease in these hosts. However, few, if any, autonomous viral agents have been observed for sea urchin or any other echinoderm or urochordate. Sea cucumbers are also cultured in Japan and have similarly not been observed to support any virus infection. Similarly, lampreys and hagfish urochordates are also conspicuous in their lack of reported viral associated disease. More recently, however, reports of a lamprey herpesvirus have surfaced, suggesting that at least herpesviruses can infect these hosts. Both aquatic vertebrates and the aquatic Protostomia orders (mollusks and arthropod species) are subjected to intense marine aquaculture and are known to be highly susceptible to acute disease by a broad array of viruses. Some of these viruses are also able to infect both deuterostome vertebrate species and protostome species (such as fish and crabs or shrimp). Thus the general absence of virus in sea urchin and most urochordates presents a stark contrast with vertebrates and the echinoderms. Importantly, in this absence of viral agents, echinoderms appear very similar to C. elegans.

In the case of sea urchin, it is not fully clear how they might prevent viral infections. Systems of genome defense, such as RNAi, are known in sea urchin. This might allow sea urchins to generally limit infections with RNA viruses. Sea urchin RNAi, however, does not appear to be transmissive and systemic as it is in C. elegans, so it is not clear that sea urchin RNAi would be as effective as it appears to be in C. elegans. Furthermore, no specific process (such as the sex associated RIP of Neurospora) is known for echinoderms which might help prevent the presence of duplicated autonomous or genomic DNA sequences resulting from infection by various DNA viruses (iridoviruses, baculoviruses, herpesviruses, parvoviruses). DNA viruses are known to frequently infect shellfish, crustaceans and vertebrates, sometimes by the same virus. Thus, their absence from sea urchin is enigmatic. It is thus exceedingly ironic that these relatively simple lower animal organisms (dictyostelium, C. elegans, sea urchin), which lack the highly sophisticated adaptive immune systems of vertebrates but maintain seemingly simple innate immune processes, appear so inert to all (nongenomic) viral agents.

Invertebrate immunity is non-adaptive. Lower deuterostomes are considered a sister group to vertebrates but are distinct from protostomes in having a complement system. Sea urchin genomes code for an analogue of the C3 component of complement, SpC3, as well as coding for a factor B and a mannose binding protein associated serine protease. This C3 component of complement is considered part of the ‘non-classical’ or alternative pathway. However, given that it appears to be the earliest component of the complement system to have evolved, it is likely that these genes are representative of the origin of the complement system. Sea urchins lack many of the associated molecules and receptors involved in the induction of an adaptive immune response in vertebrates. They have no TNF, TNF receptor, no cytotoxic , no and no acute phase or inflammatory cytokines. Like other lower animal forms, however, they do have several types of nonspecific cytotoxic cells.

The origin of complement. Complement attacks and destroys membranes of invader organisms, but also assists in allowing phagocytosis of antibody-antigen complexes. However, the existence of this membrane attack complex also requires a corresponding safety lock system to prevent self killing. In mammals, this safety lock is provided by the RCA (regulators for complement activation) system of proteins, which prevent inappropriate complement activation. As both the attack complex and the RCA proteins must act together and as both consist of a complex set of interacting proteins, this situation represents another apparent example of a complex phenotype that appears to have evolved and appeared together. There are no examples of progenitors that have C3 analogues but lack RCA-like proteins. What might have been the origin of this complex recognition phenotype? Pore forming protein complexes that are able to generate holes in membranes, as well as matching inhibitors of pore formation, are a well established component of various addiction modules found in unicellular organisms (killer phenotype in yeast and bacteria). Thus C3 can also be functionally considered as the toxic half of a two part addiction module, since it needs the matching RCA complement binding protein to prevent self killing. These RCA proteins all have characteristic short consensus repeat sequences (SCR). Tunicates (urochordates) also have proteins called HrSCR-1,2,3 which maintain the SCR elements and appear likely to represent an RCA system. In the case of tunicates, it is interesting that gonad expression of these proteins is also apparent. It thus appears likely that all tunicates and lampreys maintain a C3-like complement system and a corresponding RCA system of nonadaptive immunity. This is in keeping with the monophyletic character of urochordates. These orders also have non-specific cytotoxic cells, which are discussed below.

Cytotoxic cells/tissue recognition. Sea urchins do have at least four types of cytotoxic cells that will kill target foreign cells (such as human red blood cells). Only some of these cells express C3, which is inducible. Other cytotoxic cells appear to be phagocytic amebocytes. These cells express various markers characteristic of mammalian cytotoxic cells such as CD14, CD56 and CD158b, but do not express other common CTL markers such as CD3, CD4, CD6, CD8, CD16. These cytotoxic cells are able to lyse contacted target cells via calcium dependent hemolysins and contain lytic granules, with acid protease, that seem similar to those found in NK cells. These amebocytes are also involved in an encapsulation response that will wall off invading organisms and can involve the deposition of calcium and other substances. These cytotoxic cells can make , agglutinins and . In addition, it appears that anti-fungal peptides can be generated from the C-terminus of hemocyanin (a copper binding protein involved in O2 transport). All of these non-adaptive immune systems evolved prior to the evolution of the gene rearrangement system required for adaptive immunity. In addition, it appears that protostomes have additional mechanisms for countering virus infections that are poorly characterized. In blue crab, tissue extracts can be demonstrated to inhibit a broad array of viruses (including sindbis, vaccinia, VSV, mengo) via what appears to be an inhibition of viral attachment. However, it now seems clear that at least some urochordates do have a cell based system of tissue recognition. Many tunicates are able to undergo fusion with other individual organisms and will form chimeric tissues in which one partner will often reabsorb the cells of the other fusion partner, but maintain some of the partner’s cells in its germ and stem cells. However, this fusion is restricted between genetically distinct partners by a non-adaptive compatibility system (Fu/Hc) that involves one highly polymorphic gene with up to 500 alleles. The origin of this

genes, which indicates that they are not hypervariable. Furthermore, analysis of rates of synonymous substitutions in RT ORFs indicates that these ORFs are under strong positive selective pressure to be maintained in the genome. In other echinoderm species, SURL-like elements are frequently interrupted, although most species also conserve RT elements with an intact RT ORF.
Of particular compatibility system or its possible interest is the observation that env-like ORF relationship to viruses is not known. also exist for at least a smaller number of these However, it seems likely that such a SURLs in sea urchin. However, unlike C. recognition system was present in elegans, it has yet to be determined if these eurochordates before the origin of the complete retroviral Sea Urchin elements are adaptive immune response. basal to the more numerous defective copies. The SURLS exist in distinct families and the Sea Urchin genomes and ERV genetic specific family of SURL shows a clear link to parasites. Sea urchin genomes and those its host species. SURL clones from the same of other urochordates (tunicates and species are very similar to each other, but lampreys) are relatively simple compared distinct from the SURL families of other to vertebrates. The biggest difference is echinoderm species. Thus it appears that large that the vertebrate genomes appear to have scale colonization of echinoderm genomes by undergone large scale sequence specific families of retroviral-like elements duplication relative to urochordates. The occurred at the divergence of these species sea urchin genome sequencing project is from one another, but these species specific currently underway so we cannot yet SURLs have been maintained under positive evaluate all the specifics of these selection. This pattern of lineage specific differences. Correspondingly, the various retroposon colonization not only applies to genetic elements that have colonized the other sea urchin species, but also applies to Sea Urchin DNA have not been tunicates, starfish and herring in that the determined as has been done with C. specific retroposons are clearly conserved elegans described above. However, it is within a lineage, but distinct from each other. 
still clear that the urochordate genomes are Many of these SURL sequences are also colonized by both defective and full known to be highly expressed in the sea urchin retroviruses, but the level of such egg, prior to fertilization, suggesting a large colonization (well below 1%) is much less scale ERV reactivation early in development. then that of vertebrates. For example, the In sea urchins, very high levels of transcription genome of Sea Urchin is colonized by of poly-A containing retroposon occurs in pre- retroposon such as Sea Urchin Retroviral- fertilized egg – prior to gastrulation. This Like elements (SURL). From the genome constitutes up to 50% of total RNA at this project, Sea Urchins are predicted to have point. The purpose of such intense retroposon 27,350 major ORFs. Within this genome, transcription is not clear and it has been 315 copies of SURLs have been identified. proposed to be a system for the storage of Most of these elements conserve the RT nucleotides needed for early development. coding domain although 46 are also However, other rapidly developing eggs, such known to be interrupted for RT ORF as frogs, don’t accumulate retro-transcripts to (which is related to Ty3/gypsy elements). such high levels. The sea urchin SURLs The mutation rate of the RT ORF is clearly resemble the endogenous retroviruses similar to that of single copies of other of mammalian genomes in two ways: both 129 types of organisms are colonized by defense system to counter the colonization of lineage specific ERV families and both the germ line by retroviruses, similar in types of ERVs become highly activated concept to the RIP system described above for for expression in the early embryo. As N. crassa. Methylation of retroposons would mentioned, the function of this SURL thus be a genome defense system. However, reactivation in sea urchin remains DNA methylation in urochordates differs unknown. The possible effect of these significantly from this mammalian pattern. 
In elements on other, potentially competing urochordate (ciona intestinalis), DNA genetic parasites or viruses has not been mehylation is fractional and converse to the evaluated but it is conceivable that they global methylation in mammals in that could be interfering with other genetic retroposon or repeat DNA is not methylated parasites. whereas gene coding DNA is methylated. This appears to be the opposite of what might be Retroposon DNA methylation and the expected from the genome defense theory. In genome defense theory. The general fact, both the mammalian and tunicate patterns linkage and phylogenetic congruence of of DNA methylation are in direct conflict with retroposons colonization to a specific host the theory that methylation as a genome species suggest retroposons play some role defense system selected to suppress the in the process of host speciation. But how colonization of germ lines by retroviruses. In ERVs might contribute to host speciation the case of the urochordate genome, the ERVs is not clear. It might be expected that are in the un-methylation fraction of embryo patterns of ERV activation or transcription DNA, the opposite of what would be expected could relate to any putative role they might for ERV suppression by the genome defense have had in host speciation. However, it theory. In the case of the early mammalian appears that ERV expression patterns are embryo, the genome is specifically susceptible distinctly different in Sea Urchins and to retrovirus expression and integration prior to mammals. The differences in the pattern the differentiation of germ line cells (in the of transcription of retroposons in blastula) and it is the somatic cell lineages that urochordates are most apparent in early are suppressive of high level retroposon embryo. In Sea Urchins, high level ERV expression, after germ line commitment. Ergo, transcription occurs in the blastocyst, after the genomes of gem cells of the early fertilization. 
In the mammalian embryo mammalian embryo are open to retrovirus (but not avain or marsupial embryo), high colonization, not protected against ERVs by level ERV expression is also observed DNA methylation. Thus the viral defense which is due to un-methylated DNA, theory fails to explain these well conserved which then becomes methylated with patterns of ERV expression and suppression. blastula formation. This DNA Yet the pattern of retroviral colonization and methylation in a mammalian embryo expression remains a major differences suppresses retroposon transcription in between genomes of urochordates and most of the somatic cells. Somatic cells vertebrates. This issue will be further for the most part are responsible for high discussed below. level gene expression as observed in most tissues. Thus global DNA methylation of The order of aquatic protostomes. retroposons, but not global methylation of Protostomes, as a group, are organisms that are expressed genes, is typical of mammalian not as well studied as duterostomes and no geneomes. This suppression of somatic member has had its genome sequenced. Like retroposon expression has lead to theories echinoderms, this order of animals all lack that propose that this methylation and adaptive immunity systems and in this feature ERV suppression has evolved as a genome they resemble the early duterostomes. 130 Chelicerates are a basal protostome order some specific examples in which a specific and appear to represent a progenitor to the viruses will be able to persistently and arachnid order, which likewise appears to inapparently in infect younger forms of have conserved the RNAi system. specific host. In these cases, persistently Horseshoe crabs would be an example of a infected host can often function as a reservoir species that is also basal to arachnids. 
for viruses that cause disease in other, but Protostomes includes mollusca species sometimes related species (some specific which is a very diverse order with greater examples are discussed below). This situation then 110,000 species, having many suggests the existence of viral/host ecology members with mineralized shells. that could have major impact on the host Molluscs support the replication of many population dynamics and probably accounts for and various virus types, including the numerous examples of population crashes iridoviruses, baculoviruses and in farmed populations of protostomes. herpesviruses. Crustacea order is However, for the most part, this type of host characterized by chitinous mineral dependent persistent-acute viral ecology not exoskelleton and includes crabs, lobsters, well studied, making it difficult to generalize barnacles and shrimp. This order also this issue. supports many types of virus. The most evolved member of the protostome order Protostome immunity. Although not well is the octopus, which is also know to be studied, the immune systems of protostomes is susceptible to viral infections although clearly nonadaptive. In addition, protostomes these agents are for the most part poorly appear to also lack the C3 complement system studied. Crustacea are progenitors to the found in deuterostomes as described above. arthropods which includes the terrestrial Nor do they encode any of the other receptors insect orders that will be considered and signal molecules associated with the further in chapter 7. It is curious that adaptive immune system, such as TNF, TNF although crustaceans are predominately receptors, imflammatory cytokines, acute water dwelling organisms, they are phase proteins or CTL receptors. All predecessors to insects, but almost no protostomes lack hemoglobin and instead use insect dwells in the oceans. 
However, hemocyanins, copper containing globin family viruses that infect both insects and proteins for O2 delivery to tissues. Chelicerate, crustaceans, such as baculoviruses and crustaceans, myriapods, and a few insect nodaviruses, are often rather similar to species also use hemolymph as an element of a each other, but can be quite distinct from defense system that will cause coagulation to the viruses that infect vertebrates. wall off parasites. This process is also clearly apparent in the horseshoe crab. This Viral/host ecology. The protostomes all coagulation is not like fibrinogen of have some general similarities in their life vertebrates, however, in that it uses no proteins strategy. All are hatched from eggs that common to those of vertebrates. Protostomes are generally produced in large numbers. do appear to have an RNAi defense system. All give rise to small larval and juvenile This RNAi system appears to represent a basal forms that tend to feed on and adaptation since it is found in Chelicerates – algae. Overall, all of these orders appear which are basal to the arthopod group (RNAi is to host viruses, of various types, often lots also find in paraphyletic spiders). However, of them. These viruses have a tendency this RNAi system not fully preserved in all to infect and cause pathology in younger decedents, such as in drosophila. Protostomes or juvenile forms and either not infect or also have cytotoxic cells, although the specific inapparently persist in adult forms. features and functions of these cells are not However, it often appears that there are well studied. However, it still appears clear 131 that these cytotoxic cells resemble NK tissue, but surprisingly is also specifically cells and may even kill contact target cells found in oyster nervous system such as visceral by a Fas/FasL like process. One common ganglion. High level viral expression in response seen in most shellfish (and hemocytes is also seen. 
It is most interesting conserved in insects) is that foreign cells that the tendency of alpha herpes viruses to or organisms and irritating materials tend persist in ganglions is a rather unique to be walled off by a host response. The biological characteristic that is maintained in cells that do this are amoeboid cells from the alpha herpesviruses of vertebrates. the heamolymph. The material that walls OsHSV-1 represents an early occurrence of off the invader or irritant can be either a this viral-ganglion biology in the evolution of a chiteneous material, a mineral (most often true herpes virus. ,OsHSV-1 is a member of a calcium) or a like coagulated matrix. large family of multi-membrane, large nuclear In pearl oysters, this process of DNA virus with well defined genome mineralization can be exploited by the organization and replication strategy. introduction of a foreign irritant leading to Additionally, and like the alpha herpes viruses the deposition of calcium carbonate pearls. of mammals, OsHSV is able to efficiently It seems possible that the use of a walled establish latent or persistent infections. off process to protect the host from Furthermore, this persistence by OsHSV is predation and other assaults, could have common as the virus is found in 80-90% of all originally stemmed from such innate adult Pacific oysters, but is asymptomatic in immune reactions. It is curious that except these host. Some of these viruses may also for RNAi, none of these defense systems replicate as an acute infection in different host. clearly suggest processes that would be There appears to exist many versions of expected to be effective against viral OsHSV-1 related herpesviruses, which might infections. explain why OsHSV-1 cross reacts immunologically with Channel catfish herpes Viruses, farming and shellfish and virus. It would be interesting to also know the herpesvirus. 
The problems posed by relationship of OsHSV-1 to herpes viruses virus infection of shellfish became very reported from lampreys and sharks. apparent with the expansion of mariculture. Mass mortalities were Oyster farms and other viruses. Besides especially noted in 1994 when the French herpes viruses, shellfish are susceptible to oyster aquaculture industry crashed, as other types of virus. Japanese pearl oysters can well as in 1999 when a similar crash was be infected by a marine birnavirus MABV (of seen in Japan. Other major crashes in which 7 strains are known). The virus infects commercial shellfish populations have many deuterostome fish species and provides also been experienced, such as the shrimp another example of the tendency of aquatic aquaculture in China. These crashes have viruses to cause disease in highly divergent been mostly due to infections with various species (often inclusive of prostostomes and types of virus. Nodavirus (biRNAvirus) is deuterstomes). Infections are especially an especially big problem in cultivation of prevalent in summer, possibly die to higher clams (Mediterranean Sea). Furthermore, water temperatures. This virus has caused in clams, herpesviruses are also a major commercial population crash in 1997-98. This impediment to commercialization. For virus also causes a high mortality in many, but example, Ostreid herpes virus (OsHSV-1) not all, infected species. In those species is generally highly lethal in juveniles and which establish persistent infection, the virus larvae of various oyster species (marine can be re-isolated suggesting persistence is a bivlaves) at 4-5 d post fertilization. stable reservoir of virus and a source of virus OsHSV is expressed mainly in connective spread to other host. Virus replication is 132 especially high in hemocytes, where established to function in shrimp or any persistence also appears to be established. 
protostome since they lack inflammatory Inapparent persistence in wild caught cytokines. shellfish thus poses a major threat to the commercial farming of shellfish. WSSV therefore has a large number of relatively unique virus specific genes of unknown Shrimp farms and baculovirus and function, not found in the host or in other origins of WSSV. In the shrimp industry, viruses. As such, the origin of this virus and baculovirus infection has been a major these genes is obscure. Although WSSV cause of problems, such as Baculovirus resembles a rod shaped version of a penaei (BP), which has been responsible baculoviruses in several morphological for massive farm mortality. However, characteristics, it shares no sequence homology Virus (WSSV; a to that group of viruses. The closest virus Nimaviridae DNA virus) has been an genes by sequence similarity are the DNA pol especially important pathogen. As with the of the animal herpesvirus. However, major oysters described above, baculovirus distinctiveness of WSSV from herpes virus (a infection of shrimp tends to be pathogenic much bigger genome, distinct virion structure, in larval and juvenile forms of shrimp, but distinct genome organization, no inapparent and persistent in adults. WSSV multimembranes, no nuclear virus assembly, is probably the best studied of the DNA no genes for transcription and no other gene viruses that infect shrimp and has a wide similarity), indicates that WSSV represents a host range in various shrimp species and is distinct family of DNA virus from the herpes generally disease causing. WSSV is a rod viruses. This along with the high number of shaped enveloped virus and was the only virus unique genes in WSSP, suggest that it large DNA viruses of shellfish that has represents a new and distinct clad from some been sequenced. At 292,967 bp, it distant progenitor DNA virus or polyphyletic represents one of the largest animal DNA mixture of DNA viral progenitors. 
virus, yet characterized, surpassed only by eukaryotic phycodnaviruses and Unlike many viruses described above, the Mimivirus of amoeba in genome size. It specific WSSV strains tends to be highly has a circular DNA virus with 184 ORFs. species specific with respect to host mortality. Interestingly, only 6% of these ORFs are Yet WSSV are found in a broad array of related to any sequence in the Genbank species. This may indicate the existence of database, and these related sequences are many species adapted versions of WSSV. mainly proteins involved in viral DNA Virus replication is especially high in replication. Thus, the WSSV genome hemolymph. In fish host, WSSV shows high represents a large amount of genetic mortality only in young fish and has been novelty. One of the WSSV isolated from fish kidney tumor cells. In wild helicase/nuclease genes, however, clearly caught specimens of shrimp, WSSV can be appears similar to those found in found in 60% of larvae, thus it is ubiquitous in arthropods, but may be basal to these host nature and not simply a product of aquaculture. genes. One very interesting WSSV gene It has been isolated from 5 shrimp species, 2 is that for collagen, in that such structural fresh water prawns, 4 crab species, 3 lobster genes are not usually observed in virus species – all these species were susceptible to genomes. Another very interesting and some level of disease. In some shrimp species, curious gene of WSSV is a gene that the mortality can be as high as 100%. appears to regulate TGF beta, which However, WSSV can also infect Blue crabs as controls inflammatory reactions in well as other species. Interestingly, however, in vertebrates, but has no yet been 2 species of mud crabs, WSSV infection 133 results in no symptoms but virus can be re- polyomavirus. Viral particles with icosahedral isolated from these previously infected symmetry were found in both the cytoplasm host. 
This is especially the situation with and the nucleus of numerous cell types. Virus- benthic larvae of mud crab, which can infected cells showed severe alterations, frequently be found to harbor virus, but including hypertrophy, reduction of the without viral induced disease. In addition, intracellular compartments and extrusion of the wild caught Metapenaeus dobsoni also nuclear envelope. But this may have also been carry virus with no disease. Others have due to a superimposed bacterial infection. reported that in addition to crabs, prawns Moreover, gill epithelial cells showed and lobsters can also be asymptomatic disorganization and swelling of the apical carriers of WSSV infections. The WSSV region, which affected the ciliary structure. It pattern of viral ecology thus seems to be is interesting that these cellular alterations are that WSSV persistence is asymtomatic and very similar the effects of mouse polyomavirus highly host specific, but acute replication has on newborn mouse epithelia, and viral induced disease is seen in related suggesting a strong conservation of the but distinct host species. polyomavirus-host biology.

There is one interesting exception to the RNA viruses of protostomes are also known, normal larval vs. adult biology of shellfish such as the Reo-like virus that infects Chinese virus infection. The usual pattern of mitten crabs. In addition, various rhabdoviruses juvenile crustaceans being more are known such as yellowhead virus of shrimp, susceptible to viral disease compared to which seems to identify one of the few adults is not true for the baculovirus mid- characterized negative stranded viruses of gut gland necrosis virus which infects protostomes. Of some note are the picorna-like larval stages but induces disease only in virus present in the oceans and known to infect late post-larval stages. In this regard, this oceanic animals. Throughout this book, we will shrimp baculovirus is more similar to the use a lose definition of a picorna-like virus to biology seen with insect baculoviruses mean a + strand RNA viruses that includes the which also often infect gut tissue and families of Picornaviridae, Calciviridae, cause disease in post-larval host. Comoviridae, Sequiviridae, and . It should be noted that in the context Polyomaviruses are a family of small of plant RNA viruses, this usage is often nuclear dsDNA viruses with circular controversial since many feel that members genomes that commonly establish within these viral groups share little genetic persistent infections in many mammals, homology. However, since it is the perspective of including most humans, and can cause this book to consider the possible evolutionary acute disease in avian host. history of viruses from the perspective of host, the Polyomaviruses have not been seen in use of the term picorna-like virus can be well plants and insects, or lower eukaryotes, justified. Recent surveys of ocean samples based but it appears that clams may have this on consensus PCR primers to RdRP, have now virus family. 
This would suggest the identified the existence of large and diverse simplest metazoan are able to supports populations marine picorna-like viruses that can replication with this family of virus. be persistently isolated from all oceanic sources. Tapes semidecussatus clams from the Phylogenetic analysis of these amplified northern Mediterranean coast of Spain sequences established the existence of two revealed that 86% of the clams were entirely new families of virus. One of these infected with a virus that shared families was found to contain a new virus that ultrastructural, morphologic and was lytic to , a toxic red cytopathic characteristics of a tide algae responsible for fish . Members 134 of this family were the most numerous in the that viral-viral interactions can indeed have a oceans, but also the most diverse and distant large impact on their infected host. However, from the animal and plant picorna-like the inerent complexity of mixed infections viruses. It seems likely, that such members makes it difficult to experimentally evaluate may be the oldest and most basal family of the evolutionary pressure exerted by such the picorna-like viruses. that situations, either on the host or on the viral belong to other families (Dicistroviridae), genome. Prevalent mixed infections would such as Virus (TSV), have appear to provide a selective pressure for the received much attention since they cause function of viral and host genes in the context large scale disease in shrimp farms. This of the genes of other viruses, not just the ss+RNA virus (TSV) shows clear specific host. This mixed virus circumstance is relationship to both viruses of plants and the not unique to aquatic animals, as we will see cricket paralysis-like viruses of insects, below, pathology due to mixed virus infection including the use of an alternate to AUG is also a well established situation in many initiation. Thus TSV may represent an higher plants. 
evolutionary intermediate towards the evolution of terrestrial picorna-like viruses. Protostome genomes, a paucity of data. It is interesting to note that seagulls, which Although it would be interesting to compare the often feed on shrimp, can excrete TSV virus pattern of acute viral infections in protostomes to in their droppings, suggesting some any genomic colonization by viral-like genetic interesting inter-species viral ecology as well parasites, unlike echinoderms, these genomes as a system for terrestrial viral transport. have not been well studied so little can be said currently. It is known that internal DNA Mixed viruses. The abundance and deletion occurs in during development in some diversity of marine viruses acutely and crustaceans, reminiscent of the ciliate genomes persistently infecting shellfish species and of a related process seen to occur in the makes it appear likely that some viral-viral parasitic nematodes. However, details of this interactions may also be occurring in these genomic rearrangement are lacking. It is also host. If so, it is expected that persisting clear that retroposons are present in protostome viruses, in particular, would often have genomes, but until more sequence data is genes that affect the ability of other available, we cannot evaluate if, like sea urchin, viruses to colonize the same host. The there exist intact basal and species specific resulting interactions could be either versions of retroviruses that are associated with interfering or complementing, as seen with more numerous defective ERV derivatives . It is satellite and their helper viruses. In however, known that molluscs genomesdo have shrimp, support for complementing mixed non-LTR RtE-1 like elements as found in C. virus interactions is seen in at least one elegans, which also code for 1,200 a.a. ORF. 
situation; that is the shrimp midcrop mortality syndrome (MCMS), which has caused major losses in the shrimp farming industry. Viral analysis has shown that this syndrome is multifactorial and is known to involve simultaneous infection with four distinct viral types. These four viruses include a reo-like virus, a parvo-like virus, and two rhabdo-like viruses. Although it is felt that only two of these agents are directly contributing to disease, the occurrence of such a mixed state suggests

As mentioned previously, this type of retro-element was distinct from other retroposons, involving a distinct mechanism of integration. The presence of such elements appears to link the evolution of protostome genomes to the genome of C. elegans. Similar elements are also found in flatworms. However, these RtE-1 like elements are absent from the genomes of vertebrates, so this seems to mark a boundary between the genomes of protostomes and deuterostomes/vertebrates.

JAWED VERTEBRATE FISHES – another major evolutionary discontinuity in host and virus

Genomic changes. Urochordates had developed most of the types of tissue present in vertebrates. They are characterized by the presence of a notochord, which is a collagen-containing structure that follows the neural crest and provides a main structural element for the urochordates (including jawless fish such as lamprey). However, there are many differences between urochordates and vertebrates, which identifies an evolutionary discontinuity. There occurred a general expansion (duplication) of the genes and genomes of vertebrate species relative to the smaller and simpler genomes of urochordates. Along with this vertebrate genome expansion, there also occurred a significant increase in retroposon colonization of all vertebrate genomes, yet most (but not all) of the retroposon, transposon and repeat families previously existing in urochordates were retained.

Viral types. Bony fishes are the earliest divergent vertebrate lineages to have both an innate and an acquired system of immunity. Thus all of the required gene complexity of these complex immune systems was also acquired at the origin of all vertebrate lineages. In bony fish we also see the evolution of jaws, spinal cords, vertebrae, cranium and bones. Thus there are numerous distinctions between bony fishes and their urochordate predecessors, and this identifies a major evolutionary discontinuity. However, along with this discontinuity at the origin of aquatic vertebrate species there also exists a parallel, although more shadowy, evolutionary discontinuity; that is, a matching and major radiation of the viral families that infect vertebrates also occurred. At the origin of vertebrates, there is evidence that many viruses also evolved. This includes several classes of virus families that are not represented in any lower life form that we have so far considered. New viruses of this type include four families of negative stranded RNA viruses (rhabdovirus, bunyavirus, paramyxovirus, orthomyxovirus) as well as the first example of non-defective, autonomous and prevalent retroviruses. Some new DNA viruses are also seen, such as papillomavirus (distantly related to polyomavirus), as well as the first example of autonomous or extracellular adenoviruses. In addition, other DNA viral lineages became established, apparently derived from previously established DNA viruses. These include the herpesviruses, baculoviruses and poxviruses, all of which appear to be descended from distinct but apparently older large DNA viruses related to phycodnaviruses, mimivirus and T-even phage. We also see the evolutionary re-entry of some viruses (reovirus, parvovirus, birnaviruses) which were present in ancestral lineages (such as fungi), but had been absent from less distant basal metazoan predecessors (worms and basal deuterostomes). This aquatic viral expansion mirrors the Cambrian host radiation and accounts for a large fraction of the virus families we currently see in mammals, plants and insects, which all evolved near the origin of aquatic vertebrates. Many of these viruses were also seen to infect the protostomes as described above, but with a notable reduction in viral types.

Bones. A characteristic difference between urochordates and vertebrates includes the development of both a central nervous system and a peripheral nervous system surrounded by bone (cranium and vertebrae respectively), as well as an association of immune and haemopoietic generating cells with bone (bone marrow). What was the origin of these bony structures and why are they associated with nervous and immune tissues? Although the shells of protostomes also use calcium carbonate deposits to form structural elements, the phosphate-containing hydroxyapatite used to form bone is distinct. The origin of bone calcification is unknown. However, this type of calcification resembles that associated with the nonspecific cellular immune responses of early metazoans, which will wall off parasitic agents by calcification from surrounding lymphoid cells, and it is still characteristic in vertebrates of a chronic inflammatory response.

Skin and virus. In addition to bone, the architecture of various other tissues is also different in bony fish. One notable example is that the vertebrate skin (and other tissue, such as gut) now has a basal layer that undergoes continuous terminal differentiation. Along with this continuous differentiation capacity, we see the first evolutionary occurrence of high rates of viral induced skin growth anomalies (hyperplastic dermal lesions or benign tumors). No lower metazoans and few protostomes show any such related growth effect of virus infection. Both large DNA viruses (iridoviruses, WSSV related viruses, herpesviruses and adenoviruses) and retroviruses (e.g., Walleye dermal sarcoma virus) show a clear tendency to cause growth abnormalities in fish and amphibian skin (but not in shrimp, crabs, oysters). Some of these viruses can also cause occasional growth abnormalities in other vertebrate tissues (lymphatic, connective tissue, kidneys). In many cases, the altered skin growth is the main consequence of virus infection, as mortality is not necessarily associated with such vertebrate viral infection. In these cases, the viruses tend to have virus specific genes that can control vertebrate cellular division or differentiation, as discussed below. Furthermore, these skin infections and growth anomalies are often prevalent in wild fish and amphibian populations, so they are not products of aquaculture with its attendant large and crowded fish populations.

Adaptive immunity. Possibly the most striking difference between urochordates and vertebrates is the invention of a full adaptive immune system. The adaptive immune system uses elements from both the to initiate and control the acquired immune response. The basal phylogenetic position of bony fish makes these hosts of special interest for understanding the origins of immune systems. Adaptive immunity is a highly complex phenotype that appears to have been acquired in its functional totality, thus it is monophyletic. It represents a linked interaction between inflammatory and acute phase proteins (SAA, SAP), cytokines (IL1, IL8, interferon, TGF beta, TNF alpha) and chemokines (CC, CXC), their receptors (various CD types), and a signal transduction system that initiates both humoral and cellular antigen specific immune responses and then generates and selects sequence diversity. It also involves complement. As described above, complement C3 and its RCA control proteins were clearly present in urochordates. In addition, a non-adaptive MHC-like system was present in tunicates. The complement system in higher vertebrates is composed of about 30 proteins. Many of these seem to have evolved by duplication of earlier genes, such as C4 and C5 from the C3 present at the protostome-deuterostome boundary. In bony fishes, it appears that C3-C9, MASP, MBL, and Bf are all present. C3 is a single copy gene in all . However, in fish species C3 is a multiple gene, with 11 C3 genes known in some species. Usually such a gene increase is considered to have occurred following . With respect to antigen specific cytotoxic T cells, Channel catfish have been established to have cloneable CTLs. These cells likewise appear to kill target cells via a Fas/Fas-L like process. It also appears that they may use a perforin/granzyme response for killing. These CTLs have receptors for Fc and IgM. Thus bony fish appear to have essentially all the elements of the humoral and cellular adaptive immune system. All of these complex characteristics appear to have evolved together, seemingly at the same time, since they can all be found in all examined bony fishes (e.g. catfish and salmonids) but, aside from C3-like complement and non-specific cytotoxic cells, none of these elements are present in urochordates (or protostomes).

The accepted view is that this surprisingly complex adaptive immune system has evolved to control invasion, especially by viruses. However, the extreme irony, as noted above, is that aquatic vertebrates also represent an expansion of the viral families that can infect a vertebrate host, so it is not apparent that adaptive immunity is more protective against viral infections than the previously existing innate immune systems. Yet it is possible that this viral expansion may have followed an evolutionary period in which viral infection of jawed fish was rather restricted. The early vertebrates may indeed have been rather resistant to most viral infections, but vertebrates that evolved later became more susceptible to virus. One reason for thinking this is that the most primitive jawed fishes are represented by the shark family. Sharks appear to have essentially all the basic elements (albeit simplified) of the adaptive immune system: an antigen specific humoral response and an allogeneic cellular response (in which CTLs have clear TCR homologues in extensive gene families). Sharks are also notable in several other features. One is that, aside from their jaws, they use cartilage, not bone. Another is that various taxa are viviparous and give live birth to their young. This raises an immunological dilemma of how an adaptive immune system of the mother fails to recognize an allogeneic embryo, which is also a characteristic dilemma of placental mammals and is considered in detail in chapter 8. Little is known concerning the details of this immunological dilemma in sharks except that the placental yolk sac is a site of high level expression. However, with respect to virus susceptibility and viral induced tumors, sharks seem to differ noticeably from bony fishes, as almost no viral associated pathology has been described for sharks. Although sharks cannot be reared in aquaculture and thus might not display viral diseases that become more apparent in such a farmed setting, they are fished in large numbers and at least viral induced skin growth anomalies occurring in nature should have been identified. The one exception to the observation that sharks are essentially devoid of any virus is a single report in 1985 of the presence of a herpes virus in wild and captive smooth dogfish showing necrotic skin lesions in fins and trunks. Thus herpesvirus may be one of the only viruses found in both lampreys and sharks, organisms that span the development of the adaptive immune system. Sharks also seem remarkable for their extremely low tumor incidence relative to bony fishes (although sharks have a simpler skin tissue architecture than bony fish). If sharks are representative of the very first organism with an adaptive immune system, then it seems very possible that these organisms were in fact successful at preventing the infection and colonization of many viral types and that the expansion of virus susceptibility in bony fishes was a later development in evolution.

Fish retrovirus

In fish, we see the first clear example of extracellular, autonomous and acute retroviruses in any organism, best exemplified by the Walleye dermal sarcoma virus (WDSV). WDSV is the best-studied example of a fish retrovirus and is well established as being able to induce skin tumors. This virus is most related to the vertebrate Moloney murine leukemia virus family (MLV), which is basal to a large family of mammalian retroviruses. Three types of WDSV are known, but they are unusual relative to other retroviruses in several respects. For one, they share a surprising level of sequence similarity amongst themselves and also conserve a very unusual protein cleavage site. In addition, the WDSV viruses are highly atypical of mammalian retroviruses in that they all encode a D-type cyclin that is distinct from that of the host cyclin. No other retrovirus has these two characteristics. Regulatory proteins found in retroviruses (such as Ras and Myc) are often considered to have been host acquired accessory proteins in the mammalian retroviruses (such as in the transforming retroviruses like RSV), since in the absence of such proteins retroviruses can often still replicate in dividing cells. However, because WDSV represents the first example of an autonomous retrovirus and shows a predilection for replication in differentiating cells that were not represented in predecessor organisms, it seems equally likely that these fish retroviruses may also be the predecessor of the autonomous vertebrate retrovirus. This view is also consistent with phylogenetic analysis of the retroviruses, which supports the basal positioning of the fish retroviruses.

A most intriguing biological characteristic of these fish retroviruses is that they are biologically prevalent in natural settings. The prevalence of WDSV in fish of Oneida Lake in New York has been studied, and the virus was shown to cause a seasonal pattern of a proliferative skin disease in fish. This study suggests that the seasonality was due to fish migrating to spawn and acquiring WDSV infection from unknown sources. It appears that fish which returned from migration to the lake only showed skin growth disease for a short period and then recovered. WDSV virus production did not persist in these fish, thus these appear to be acute infections. However, it also appears that essentially all fish in the lake eventually get infected with WDSV. These observations further suggest that there exists an exogenous source of WDSV virus and that the fish migrating from Oneida Lake encounter this unknown natural source of WDSV. The oceans would seem to present a most probable source of this exogenous virus. It is suspected that WDSV may persist in another host, possibly as an expressed endogenous retrovirus in this putative unknown host. However, there is no information concerning what other organism might harbor WDSV. Other aquatic retroviruses are also known to be prevalent in specific hosts. Retroviruses have been reported for Eel, Snakehead fish, Sea Bass (associated with erythrocyte growth), and Salmon (associated with plasmacytoid leukemia). However, these viruses are rather distinct from each other. For example, Snakehead fish virus has a unique map location for its env gene, and shows little sequence similarity to WDSV. In general, however, these viruses are not well studied, although there is some evidence that these viruses can suppress the host immune response.

Retroviruses of amphibians are also known which show a clear relationship to viruses in fish and reptiles. Many of these, however, are endogenous viruses in their respective hosts and are only distantly related to the seven currently accepted retroviral genera. Retroviruses related to these amphibian viruses are not widespread in other vertebrates, so they appear to be restricted to fish and amphibian lineages. It is interesting, however, that a recently described complete endogenous retrovirus of pythons (Python molurus) is highly expressed in all molurus pythons, but absent from all other python species, although P. curtus had a related endogenous retrovirus. This molurus virus also had an additional and unknown ORF within its pol sequence. The python viruses show little relationship to retroviruses of higher vertebrates and cannot be classified with them. Similarly, retroviruses of fish do not group with the true retroviruses of mammals or birds, such as the five genera; , human T-cell leukemia virus, avian leukemia viruses, type D retroviruses, mammalian type B retroviruses. Thus the fish and reptile retroviruses are mainly novel retroviral genera and show interesting links to each other and their hosts, but not to mammalian and avian host lineages and their corresponding retroviruses.

Retroposons and fish genomes. In the genomes of fish, we see a widespread colonization by retroposon elements derived from retroviruses. However, as we will see with the genome colonization of many other vertebrates, the retroviral derived sequences present in fish genomes represent distinct virus groups that are specific to and highly reiterated in the fish genome but are not part of any other major retrovirus group. Fish genomes have vertebrate short interspersed elements (V-SINEs) and related long repeat LINE (RT encoding) elements in relatively large numbers. Both these elements appear to have evolved from common fish specific retroviruses. For example, fish have Poseidon and Neptune elements, which are fish specific retroposons, related via RT similarity to Penelope in Drosophila virilis, but highly reiterated in fish. In Fugu fish genomes, we can find the Xena element, as an example of a lineage associated retrovirus. In fish genomes we can also find the Jule element, with gag and pol ORFs but no env ORF. Jule is a member of the MAG retroelement family found in C. elegans and silkworm (described above), consistent with deuterostome evolution from worm forms. Furthermore, this Jule element is related to the sea urchin SURL element, but Jule is only present at 3-4 copies in the zebrafish genome and not in several hundred copies per genome as is SURL in sea urchin. A Ty1-copia element is also found in fish, amphibian and reptile genomes, establishing a link between these organisms. The lineage specific nature of these endogenous fish retroviruses and retroelements, and the relatively unique class of retroviruses involved, suggest that the ERV colonization of the fish genomes occurred early in the origin of these species and that further horizontal transmission of retroviruses between vertebrate classes occurs relatively infrequently. As mentioned previously, the endogenous retroviruses of amphibians and fish are clearly related to each other, but distinct from those found in mammals and avians.

Fish genomes, adaptive immunity and addiction modules. The acquisition of the adaptive immune system represents a punctuated and transforming event in the evolution of animals, involving the acquisition of a highly complex phenotype. This phenotype includes an evolving and dynamically adapting genetic system, able to recognize and attack non-self, but preventing the development of self recognition and self attack. This was accomplished by acquisition of a new and complex set of communicating molecules (cytokines), receptors and transducers (the TCR family being the most basic), with no clear evolutionary predecessors for the most part. This acquired gene system was adapted (designed) to detect new non-self agents in the context of a polymorphic self detection system (MHC). Following this detection, the system then generates a response that creates and utilizes a new molecular process (not present in predecessors) to generate genetic diversity and stimulate the clonal growth of specific cells which recognize these non-self agents. These resulting cells then either secrete a novel class of molecules that bind and inhibit non-self agents or allow amoeboid cytotoxic cells to find, contact and destroy cells harboring these agents. Most of the features of this adaptive immune system were not present prior to the evolution of vertebrates. We know that predecessor urochordate tunicates had a polymorphic MHC-like system linked to non-adaptive amoeboid haemolymph cell induced killing. But this tunicate system has no molecular similarity to the MHC system. To create the adaptive immune system, a parallel MHC system like this one of tunicates needed to have acquired an adaptive component that is linked to non-self recognition, but preventive of self-recognition. That is, it is both destructive of and preventive of destruction of cells at the same time. These linked features are characteristics of an addiction module and we can propose that this adaptive immune system can be considered an example of a most elaborate addiction module. Hosts that have acquired adaptive immunity have also acquired a very destructive system (equivalent to a powerful toxin) that can kill essentially any cell it contacts. The host must, however, also be protected from this same destructive power by the simultaneous acquisition of a self-recognition system that prevents self-killing (equivalent to the anti-toxin). As in other addiction modules, the killing (toxic) capacity of adaptive immunity is stable and long-lived, but the anti-toxic capacity (self recognition) is transient, occurring mainly during the development of the immune cells. Hence adaptive immunity displays both parts of an addiction module, toxin and anti-toxin, as well as the differential stability of the toxic component relative to the protective anti-toxin.

A Scenario for possible viral origins of adaptive immunity. Many of the individual strategies or elements needed for an adaptive immune system have been used before in biological evolution, but not by pre-existing cellular organisms (as noted above). Rather, these strategies and systems can be found in various types of virus and are used by them to colonize their host. For example, the capacity of a virus (a bacterial prophage) to recognize and alter host bacterial surface receptors is the well known phenomenon of phage conversion and represents the most dynamic genetic feature of phage-host systems. This conversion is related to both the success of phage colonization and bacterial virulence (a resulting host phenotype). Such phage receptor conversion is still currently used to type specific bacterial strains. With respect to the origin of the adaptive immune system, we can start by posing the question: what is the most basal gene function in an adaptive system for the generation of the diversity needed for non-self recognition, and from what source did this gene likely evolve? The RAG proteins are responsible for the DNA rearrangements and recombination that generate the essential genetic diversity required to generate surface receptor diversity, and this is the starting material for selection by adaptive immunity. Phylogenetic analysis of these RAG proteins indicates that they seem to have no predecessor genes in early eukaryotic or prokaryotic genomes. Instead, RAG genes are more closely related in both sequence and function to the integrase of various retroviral genomes. Can a virus employ such an integrase-RT protein for the purpose of generating diversity of surface receptors as we see in the adaptive immune system? Along these lines, it has recently been reported that prophage of Bordetella bacteria are responsible for tropism switching by altering surface receptor expression. This phage directed tropism switching was established to be the result of a template dependent process that uses reverse transcription of the RNA of the phage encoded surface gene to introduce nucleotide alterations and genetic diversity and thus generate a vast repertoire of possible ligand-receptor interactions. This represents the only currently known prokaryotic example of a genetic cassette that uses RT and can adaptively generate such a large pool of genetically diverse receptors. Furthermore, the reverse transcriptase of this phage most closely resembles that of murine leukemia virus (MLV). MLV is a retrovirus that is basally related to many other retroviruses and retroposons of vertebrate organisms and their genomes, including the endogenous retroviruses found in fish genomes. Thus we see that the molecular machinery needed to have generated the adaptive component of vertebrate immunity can be found in various persisting viral, not cellular, genomes.

Other novel components of the adaptive immune system can also be found in viral examples. For example, the ability of a virus to express cytokines or cytokine receptors, or to alter host cell signal transduction and growth, is a well established viral capacity and is characteristic of most of the DNA viruses and retroviruses that infect bony fish. These are clearly molecular strategies highly used by numerous and prevalent viruses in fish species, but viral strategies that appear to have been absent prior to the evolution of bony fish. Some of these viruses, especially the poxviruses and herpesviruses, are also known to encode viral versions of cytokines and receptor molecules. A common view concerning the occurrence of such ‘host-like’ immune regulatory genes in these viruses is that they appear to have ‘stolen’ such genes from their host during evolution in order to suppress host immunity and allow active viral replication. However, phylogenetic analysis of these viral immune and growth regulatory genes does not generally support this ‘thieving’ hypothesis, because the viral versions of such genes are usually either unrelated at a sequence level to host homologues or basal to those homologues that show sequence similarity. Rather, viral regulatory genes often appear to represent more primitive or basal proteins (such as those using single protein domains) relative to the homologues in the host genome. In addition, as mentioned, some viral versions of such genes have no host homologue. In this light, it is highly interesting that herpesviruses that infect both lampreys and sharks are known, implying the existence of persisting virus in a predecessor host that could have spanned this evolutionary transition to vertebrates and provided a source of these signaling systems. Also relevant to the possibility that viruses can provide genetic novelty for the origin of host regulatory molecules are observations with fish retroviruses. The fish retroviruses have viral-specific versions of growth regulatory genes, such as the D-cyclins, that are absent from the host. These genes are clearly of viral origin. In addition, retroviruses themselves have a highly distinctive characteristic relative to all other mammalian viruses in that they use the greatest diversity of viral and cellular receptors to infect cells. They are especially notable for using receptors associated with the haemopoietic and the adaptive immune system. Vertebrate retroviruses have long been noted to have an inherent propensity to infect cells of the immune system.

Where is the proto-viral ancestor of adaptive immunity? One seemingly major problem with the viral origin hypothesis is that we cannot currently identify a specific viral candidate family that might have been the progenitor of the adaptive immune system. No one viral family has all these needed functions. Beyond the basal requirement for the acquisition of RAG function and its control, which clearly could have come from a retrovirus, the extensive complexity of the adaptive immune system suggests that a specific or single viral agent would be unlikely to be able to provide all the required gene functions. This point, along with the genome wide and high level ERV colonization that now characterizes all vertebrate genomes, suggests that the acquisition of adaptive immunity was part of a complex and punctuated genetic event. That event was not the product of a single genetic parasite or virus, but rather resembles a stable colonization by sets or swarms of complementing and defective genetic agents. These agents must have superimposed a most complex addiction module, able to compel stable colonization but also able to exclude competing genetic parasites, onto a previously existing notochord host. That predecessor host most likely had some form of existing MHC recognition, a system of cytotoxic cells and a complement system, which were parasitized and regulated by the new colonizing agents, resulting in the creation and evolution of the adaptive immune system.

Fish Iridoviruses. Iridoviruses are large ds DNA viruses and represent one of the most abundant types of virus that infect fish. Over 100 types of fish iridoviruses are known, some of which can also infect amphibians, reptiles and turtles. These viruses have linear DNA genomes that range from 100-200 kbp and encode capsid proteins that are well conserved. The virion membrane is non-cellular and the virus is very stable, allowing aquatic persistence. Two large groups of these viruses are known, consisting of the LCDV group (Lymphocystivirus) and the IIV group (). Mandarin fish infectious spleen and kidney necrosis virus (ISKNV) is among the better studied LCDV members. LCDV has 111,362 bp of DNA with 124 ORFs identified. Several of these viral genes are related to genes of other DNA viruses, especially the replication proteins. All iridoviruses code for a DNA pol gene and 2 subunits of DNA dependent RNA pol, and show some similarity (via capsid genes) to ASFV. This viral DNA pol is most similar to the phycodnavirus DNA pol gene, thus iridoviruses likely evolved from these older aquatic DNA viruses of algae or other marine protists. With iridescent virus of marine animals, a complex DNA dependent RNA polymerase is encoded, consisting of between 8 and 14 subunits, with two large subunits, of which the largest (RP01) conserves a universal hexapeptide found in all RNA polymerases. This large subunit is more similar to the RNA pol of insects than to that of other cytoplasmic DNA viruses. The fish iridoviruses are clearly most related to insect iridoviruses, and were their likely ancestors. However, it is curious that there are no mammalian or warm blooded versions of iridoviruses. Iridoviruses have a nuclear phase but, unlike herpesviruses and more like poxviruses, assemble nucleocapsids in the cytoplasm, resulting in the characteristic iridescence observed in insect iridoviruses. The early nuclear phase requires host RNA pol activity leading to initial nuclear viral DNA synthesis, followed by a cytoplasmic phase of viral DNA replication via a concatenated DNA intermediate. The DNA appears to be packaged by a headful DNA packaging process. The DNA is also circularly permuted and terminally redundant and thus resembles that of phages P22 or T4, which are the only other examples of a virus family with this type of replication strategy. This unusual process of DNA replication suggests that iridoviruses may have evolved from a mixture of several DNA viral lineages. The insect and vertebrate iridoviruses differ from each other in some general characteristics. Insect iridoviruses have methylated DNA, whereas the vertebrate viruses are not methylated, suggesting some host dependent link to viral DNA methylation.

Iridoviruses induce cellular growth in fish, octopus and amphibians. The iridoviruses are generally ubiquitous in their host. These viruses tend to have many genes that affect the host cell cycle, which appears to account for viral induced growth alterations in infected cells, but these growing cells are not invasive. For example, fish lymphocystis disease virus (LCDV) induces no substantial pathology in infected flounder, as virus replication results only in transient and benign surface lesions of skin growth, which eventually disappear. Goldfish iridovirus (GFV) shares little sequence similarity with LCDV, but is more closely related to frog iridoviruses, such as FV3. As with LCDV, no viral induced disease is observed in infected goldfish. The iridoviruses thus tend to show persistent, inapparent life strategies in their natural host. However, some viruses can show severe acute disease, such as with Pacific Herring, and viral hemorrhagic septicemia virus (VHSV), a disease situation that poses a big commercial problem. It is interesting that fish pathology can involve myocardial mineralization and hepatocellular necrosis. Infected fish can show chronic and display severe focal skin reddening. In some cases, an inapparent virus in fish (e.g. LCDV) appears to be lethal in marine bivalves, establishing that these viruses can also jump species and infect protostomes. A related iridovirus is also known to infect Octopus vulgaris, which is also associated with tumors but otherwise shows little disease. Related viruses are also found in amphibians and reptiles, and such viruses correspond to a large amphibian virus family. Frog virus 3 and 23 related viruses were isolated from renal tumors of field frogs and toads. Thus the amphibian viruses are also highly prevalent in natural populations. These amphibian virus families, like all iridoviruses, are limited to poikilothermic (cold-blooded) animals. Also, all these hosts have an aquatic phase in their life cycle. No warm blooded animal versions of iridoviruses are known.

Fish Herpesviruses.

With the , we also see the evolutionary introduction and expansion of true herpesvirus members. Channel catfish herpes virus (IcHV-1) was one of the best studied fish herpesviruses, due to its ability to induce acute disease with high mortality and large population losses in farmed catfish. The virus tends to infect the gills of juvenile fish but will establish persistence and latency in adult fish. Channel catfish herpes virus has discernible homology to the capsid genes and the DNA polymerase gene of Herpes Simplex virus, but no other sequence homology to other herpesviruses is apparent. By virion morphology, morphogenesis and genome replication, CHV (IcHV-1 in the new nomenclature) is clearly a herpes virus. The fish viruses also have multimembranes with nuclear replication, assembly and as do other herpes viruses. Conserved genetic maps and morphogenesis are characteristic of these herpes viruses. These fish viruses show a tendency to cause epithelial tumors, especially at higher temperatures, but all tumors are benign. Viral epidermal hyperplasia (VEH; aka Walleye epidermal hyperplasia, WEHV) is due to a herpes virus causing mortality in hatchlings, in which affected fish show epidermal hyperplasia in fins and skin. Herpes viruses are also known for salmonid species such as White bream and Salmonid herpes virus (SalHV-1) of rainbow trout. Like Channel Catfish virus, this acute disease has high mortality, due to infection of the gills. Frog (Rana pipiens) versions of herpes virus are also known, such as Lucke frog tumor herpesvirus (RaHV-1), which can induce renal adenocarcinoma in the American leopard frog Rana pipiens. This virus is distantly related to fish herpesvirus but different from mammalian and avian herpesviruses. These herpes viruses encode growth factors, some of which are clearly non-host-like growth factors. Although latency seems to occur frequently, especially in adults, it is not understood in any great detail. However, in herpesvirus cyprini (CHV), virus DNA is found latent in spinal nerves even after viral induced papillomas have regressed, suggesting a tendency to establish latency in nervous tissue, similar to alpha herpesviruses.

Herpesvirus inapparent persistence and acute disease is host species dependent. In the above examples we have seen that some fish herpesviruses appear to induce acute disease in juvenile fish. Other disease inducing herpes viruses are also known, such as the lethal Koi Herpesvirus of carp (KHV), which is also distinct from other fish herpes viruses. However, not all herpesvirus induced fish or amphibian disease is restricted to juvenile hosts. A herpesvirus of Green turtle is known to induce fibropapillomas even in adults. Interestingly, the same turtle virus can be found in some fishes (Saddleback wrasse), which are asymptomatic carriers, suggesting that persistence with this virus may be restricted to specific fish hosts. In fact, the ability of some herpesviruses to persist in a specific host but induce acute disease in other, sometimes very related, hosts appears to be a rather general characteristic of the entire herpesvirus family. For example, in the European eel, herpesvirus anguillae (HVA) establishes an inapparent persistent infection, and infected eels are prevalent and remain healthy. Even when these eels are treated with dexamethasone, which can induce herpesvirus reactivation and production in some hosts, the eels remain healthy. Yet this same virus is lethal to Japanese eel, causing a gill disease and strong replication in fibrocytes. Because the natural prevalence of herpesvirus is high in European eel, any contact between Japanese and European eel seems destined to eventually expose the Japanese eel to a lethal HVA infection. This host dependent persistence and host dependent lethality would seem to predict major consequences with respect to interspecies competition and the natural selection of these populations. Along these lines, it is interesting to consider that all herpes viruses are monophyletic. Yet most individual types of herpesviruses are also mostly phylogenetically congruent with their persisting host, but not their acute host. This suggests that the long term evolutionary stability of herpes viruses is maintained in the persistently infected host, not the acutely infected host. Although the herpes families conserve replication proteins, genetic organization and morphology, they tend to show large scale gene rearrangement with respect to different orders, especially in genes that regulate the host.

Other fish DNA viruses and altered cell growth. Other DNA viruses are known to also infect fish, and for the most part these virus families were also absent in urochordates. One such virus is papillomavirus. Atlantic salmon, brown bullhead and winter flounder are all known to support the replication of fish papillomavirus and show associated alterations in cellular growth. This altered growth characteristic is also apparent with adenoviruses of fish. Atlantic cod is known to support adenovirus infection, associated with hyperplastic dermal lesions. Thus these ‘new’ fish specific DNA viruses are very much like the fish iridoviruses, fish herpes viruses and fish retroviruses described above in that all are associated with cellular (mainly epithelial) growth abnormalities. Bony fish (but not shark) thus have this surprisingly general pattern: these diverse families of fish viruses, all of which depend to a variable but vital degree on the host nuclear machinery, induce cellular growth. Additionally, although the ability of mammalian members of these same virus families to alter cell growth has long been recognized (hence the name tumor viruses), it is striking that for the mammalian versions of these viruses, altered cell growth is a biological abnormality, not a normal outcome of viral reproduction. In natural infections of mammals, few herpes virus, adenovirus, or retrovirus infections commonly result in cell growth alterations. Of these mammalian viruses, papillomaviruses seem the most prone to induce and other epithelial growth alterations, but even here it is well established that the vast majority of HPV infections of human cervical epithelia are silent and persistent, showing normal epithelial growth. It seems likely that this situation relates to a distinct cellular or nuclear habitat for these viruses, specific to fish cells. Also, it is clear that this growth does not suggest a general host response to any virus infection, since such growth abnormalities are not characteristic of the numerous RNA viruses that infect fish, as described below.

Fish RNA viruses. In bony fish we can observe a large array of RNA virus species, many of which were not present in the predecessors to the vertebrates. These RNA viruses include fish rhabdovirus, paramyxovirus, orthomyxovirus, , coronavirus, calicivirus and birnaviruses. Of special note from the perspective of virus evolution is that these RNA viruses now include several families of negative strand viruses, which were essentially absent from all the organisms that would represent predecessors to the bony fish. It is striking that rhabdoviruses in particular are such a common source of fish infections. Currently, we have no coherent explanation for this observation, especially since these viruses would seem to be relatively less dependent on the host machinery for their own replication. Yet surveys of natural epidemics show important and devastating infections with VHSV, IHNV and SVCV, indicating that these rhabdoviral infections can occur in nature and are not restricted to mariculture.

This virus is also found in healthy shellfish species. Overall, birnaviruses clearly resemble arthropod-borne viruses (), EEV and Sindbis, and are situated between these insect groups by phylogenetic analysis. In terrestrial hosts, these arboviruses are of interest in that they cycle between two widely separated taxonomic host groups, vertebrates and invertebrates, both of which may replicate the virus.
Many marine birnaviruses also appear to However, as was the case with the fish cycle between different host orders and are DNA and retroviruses, many of these viral able to replicate in bony fishes and prtostome agents have come to the attention of fish species. biologist due to their strong impact on fish farms. For example, Japanese flounder Nodavirus are also a major source of fish and yellowtail are two of the most heavily mortality in farms. These viruses are farmed fish species. They are both prone ubiquitous. the disesaes induced by them to infections and disease with various fish include viral encephalopathy and retinopathy RNA viruses. The most problematic and in sea bass. As noted above, in contrast to the numerous of these viruses are the fish DNA and retroviruses, viral induced birnaviruses (causing viral ascites) and the growth abnormalities are not characteristic of nodaviruses (causing viral nervous nodavirus or birnavirus infections. necrosis). Birnaviruses are abundant in fish and some shellfish. Currently, 231 Host-virus diversity and host evolution. From strains (in 6 genogroups) of birnaviruses the perspective of virology, vertebrate fish have isolated from various fish species. represent some truly important biological For the most part, these infections are transitions, including a large scale disease associated and appear to be acute diversification of host species, the invention of infections. In some instances, however, the adaptive immune system and the origin of birnaviruses can also be isolated from the predecessors to terrestrial vertebrates. The healthy fishes, which establishes that most striking virus-related change amongst inapparent persistent infections are also these is the creation of the adaptive immune occurring. Birnavirus persistence may also system, with its breathtaking complexity and be occurring in some shellfish. In wild adaptability. 
Such a highly sophisticated molluscan shellfish species, 60% have system would clearly be expected to limit viral been reported to be positive for viral RNA parasites of these host. So it is most ironic that by PCR based assays. Because these instead we see a large scale radiation of viral shellfish show no disease, this suggests the species infecting bony fish, corresponding to virus may be in a persistent state. the radiation of these same host species. This However, with filter feeders, the presence may identify for us a broad pattern we will of viral RNA does not necessarily indicate again see with other hosts and their viruses. an infectious process is responsible. Yet On a large scale of evolution, there appears to the RNA sequences of birnaviruses appear exist a discernable virus-host connection. Host distinct from those of fish, suggesting that species diversity appears associated with a shellfish are in fact persistently infected by corresponding host specific viral diversity. birnaviruses. A specific example of this This host-virus pattern is not restricted to relationship can be seen with fish oceanic organisms presented in this chapter as infectious pancreatic necrosis virus similar host-virus diversity-linked patterns will (IPNV), which causes serious disease. be presented in the next two chapters on the 146 origin of land plants, land animals and vertebrate fish, such as the puffer fish, their corresponding viruses. However, at represent much more diverse orders and have the origin or base of these important highly compact genomes, devoid of many of in host species, we can these genomic parasites. We do not known frequently observe an order of host species how or if these genetic colonizers changes or (such as sharks, sea urchins, nematodes), affect the interaction of these species with their that appear to remain relatively devoid of viruses. We might propose that the evaluation viral parasites. 
Rather it appears that it is of such a question might help us to better the descendents of these early virus-poor explain the selective pressures that created the organisms that will not only develop much host genome. greater species diversity (such as mollusks and bony fish), but they will also develop Recommended reading. a correspondingly large diversity of viruses. It seems possible that this perceived broad pattern is misleading, due perhaps to experimental bias in the study of mainly those viral parasites of Fish viruses economically important host organisms. (Ahne 1993; Essbauer and Ahne 2001) However, in several specific examples this (van Hulten, Witteveldt et al. 2001) does not appear to be the case as we have (Muroga 2001) noted above. If this broad host-viral pattern is indeed real, how are we to explain it? How does the existence or Endogenous retroviruses. absence of large numbers of viral parasites (Gonzalez and Lessios 1999) affect host evolution and species (Leaver 2001) formation? In bacteria, there was strong (Herniou, Martin et al. 1998) evidence that viral parasites do indeed (Goodwin and Poulter 2001) sculpt the basic nucleotide word bias and (Ganko, Fielman et al. 2001) evolutionary capacity of host genomes. What might be the evolutionary consequence to the eukaryotic host RNAi/DNA methylation genome in a situation of either prevalent viruses or a viral paucity and how might (Barstead 2001) this relate to host speciation or viral (Tweedie, Charlton et al. 1997) colonization of that host genome? Perhaps by simply posing this question, we can now begin to evaluate the contribution of deuterstome immunology virus-associated forces to the evolution of species. The Lungfish, genome, for (Miyazawa, Azumi et al. 2001; Nonaka and example, has 40 times the content of DNA Miyazawa 2002) then that of human DNA. Why is it so (Magor and Magor 2001) heavily colonized by repeat sequences (Lin, Zhang et al. 2001) (genetic parasites)? 
The lungfish (Gross, Al-Sharif et al. 1999) represents the acquisition of numerous important biological characteristics that Genomes. were basal to the evolution of many diverse terrestrial vertebrate species. Yet (Cameron, Mahairas et al. 2000) as a species lungfish are not diverse. We (Bowen and McDonald 1999) know that some representatives of early 147 horizontal transmission." Gene 271(2): 203-14. Lin, W., H. Zhang, et al. (2001). "Phylogeny of natural cytotoxicity: cytotoxic activity of Ahne, W. (1993). "Viruses of chelonia." coelomocytes of the purple sea urchin, Journal of Veterinary Arbacia punctulata." J Exp Zool 290(7): Series B 40(1): 35-45. 741-50. Barstead, R. (2001). "Genome-wide RNAi." Magor, B. G. and K. E. Magor (2001). Curr Opin Chem Biol 5(1): 63-6. "Evolution of effectors and receptors of Bowen, N. J. and J. F. McDonald (1999). innate immunity." Dev Comp Immunol "Genomic analysis of Caenorhabditis 25(8-9): 651-82. elegans reveals ancient families of Miyazawa, S., K. Azumi, et al. (2001). "Cloning retroviral-like elements." Genome and characterization of integrin alpha Res 9(10): 924-35. subunits from the solitary ascidian, Cameron, R. A., G. Mahairas, et al. (2000). Halocynthia roretzi." J Immunol 166(3): "A sea urchin : 1710-5. sequence scan, virtual map, and Muroga, K. (2001). "Viral and bacterial diseases additional resources." Proc Natl of marine fish and shellfish in Japanese Acad Sci U S A 97(17): 9514-8. hatcheries." Aquaculture 202(1-2): 23-44. Essbauer, S. and W. Ahne (2001). "Viruses Nonaka, M. and S. Miyazawa (2002). "Evolution of lower vertebrates." Journal of of the initiating enzymes of the Veterinary Medicine Series B 48(6): complement system." Genome Biol 3(1): 403-475. REVIEWS1001. Ganko, E. W., K. T. Fielman, et al. (2001). Tweedie, S., J. Charlton, et al. (1997). "Evolutionary history of Cer "Methylation of genomes and genes at elements and their impact on the C. the invertebrate-vertebrate boundary." elegans genome." 
Genome Res Mol Cell Biol 17(3): 1469-75. 11(12): 2066-74. van Hulten, M. C., J. Witteveldt, et al. (2001). Gonzalez, P. and H. A. Lessios (1999). "The white spot syndrome virus DNA "Evolution of sea urchin retroviral- genome sequence." Virology 286(1): 7- like (SURL) elements: evidence 22. from 40 echinoid species." Mol Biol Evol 16(7): 938-52. Goodwin, T. J. and R. T. Poulter (2001). Possible figures. "The DIRS1 group of ." Mol Biol Evol 6-1. Figure of DIR map 18(11): 2067-82. Gross, P. S., W. Z. Al-Sharif, et al. (1999). 6-2. Figure of DIR recombinase dendogram "Echinoderm immunity and the evolution of the complement 6-3. CER dendogram from C. elegans system." Dev Comp Immunol 23(4- 5): 429-42. 6-1. Table of characteristics of duterstome Herniou, E., J. Martin, et al. (1998). immunology (needed) "Retroviral diversity and distribution in vertebrates." J Virol 72(7): 5955- 6-2. Table or schematic of characteristics of 66. adaptive immunity (needed) Leaver, M. J. (2001). "A family of Tc1-like transposons from the genomes of 6-3a. Table of viruses of bony fish – RNA table fishes and frogs: evidence for 148 6-3b. and DNA virus table

6-4. Characteristics of genomic evolution and ERVs.

6-5. Adenoviruses dendrogram and fish

CHAPTER VII

Viruses, Land Plants and Insects: A Trinity of Virus, Host and Vector.

Rationale for the Trinity. In this chapter we will examine deep issues concerning the origin and evolution of land plants, insects and their viruses together. Earlier chapters examined the relationship between viruses and their hosts from the perspective of host evolution. These prior chapters, however, had all been able to consider one specific lineage of host and overlay the known relationships with the persistent and acute viruses found in that host lineage. Accordingly, in chapter 5 we examined the oceanic algal species of green microalgae, red algae and filamentous brown algae. In chapter 6, we examined the evolution of aquatic animals and their viruses. In this chapter, however, we will examine the evolution of higher plants and insects together along with their viruses. The reason for combining these hosts will be addressed in more detail below, but suffice it to say that plants, insects and their viruses all show intimate linkages that indicate they are often co-evolving. As it is the premise of this book to evaluate situations of co-evolution for the potential role of viruses, this chapter will seek viral links between plants and insects. The earliest insects date to the Devonian period (400 MYBP). The major evolution of land plants and insects corresponds to a major evolutionary explosion (via fossil data) at the start of the Cretaceous period (about 135 MYBP). We will begin by examining the relationship of algae to the evolution of green plants. As the green algae are accepted as representing the progenitors of land plants, they will be worth some re-examination at this time. The strong preponderance of large DNA viruses (phycodnaviruses) that infect green algal species was previously considered. In this chapter, we will now consider the oceanic crustaceans along with their viruses and the evolution of terrestrial insects. As the crustaceans are the accepted progenitors to land insects, it is hoped that the consideration of these oceanic crustaceans and their viruses may help us better understand insect origins. It was noted in the last chapter that viruses infecting crustaceans are surprisingly diverse relative to those of more primitive hosts (nematodes, Dictyostelium). Viruses infecting shrimp in particular have been well studied, and a shrimp-like organism was the most likely ancestor to insects.

With respect to land plant evolution, there is reason to believe that various fungal species may also be important as symbionts in the evolution of the root systems of land plants. In addition, fungal species also function as important vectors for plant virus infection. Thus, we will also briefly examine the filamentous fungi and their ubiquitous infection with RNA hypoviruses in the context of plant evolution. Finally, we must not forget that understanding virus-host evolutionary dynamics also requires us to consider the various virus life strategies, which includes the evaluation of virus-virus interactions. It will thus be important to consider the relationship between the persisting and the acute viral agents of plant and insect hosts. This consideration will also include those persisting viral agents (and their defective derivatives) that have colonized the host genome in lineage specific ways. As we will see, virus-virus interactions seem to be an especially prevalent situation with respect to RNA viruses and land plants, and it is common to find mixed virus infections in natural settings.

Conditions and limitations of the trinity. We seek to integrate two major lineages of host evolution along with their viruses and the vectors that transmit them. This is a daunting task that would seem to pose a risk of adding an unnecessary layer of complexity onto an already complex issue. To begin with, we will have little to no archeological record of viruses to use in order to calibrate and understand the origin of various virus-host relationships that may have existed early in the evolution of these hosts. This leaves us with mainly phylogenetic analysis with which to provide inferences concerning virus and host evolution. However, there are clear limitations to such an analysis. Furthermore, knowledge about persisting plant and insect viral agents is most often incomplete relative to our understanding of viruses that cause acute disease in these hosts. Thus it will generally be more difficult to evaluate the contribution of persistent viruses to host evolution. In addition, the literature on viruses that infect plants and insects is highly biased towards the study of viruses infecting crop species of plants or viruses that can be used to infect insects as biological control agents. This literature bias must always be kept in mind as we seek to evaluate what is known about broader evolutionary patterns. Furthermore, human activity, especially agricultural activity in the last 10,000 years, has generated large, closely spaced and genetically homogeneous plant populations, which have frequently been introduced into new habitats. This human agriculture has almost certainly affected the ecology of both the viruses and the insects that are now prevalent.

Paucity of natural biology. As a rule, we have little knowledge concerning the natural biology of most plant viruses and their hosts. For example, TMV (tobacco mosaic virus) was the very first viral agent to be discovered and isolated. TMV is easily the best studied of the plant viruses. However, our understanding of the possible natural origin of TMV and its relationship with its natural host is surprisingly incomplete and has only recently received attention (discussed below). 55 types of virus are currently known to be capable of infecting tobacco crop plants. Thus, with this one most important cash crop we can see a very large diversity of viral parasites. However, for the most part we know little concerning the natural biology or origin of these same viral agents. Given the overall high diversity of viruses of higher plants, especially +RNA viruses, we must seek explanations for such distinct virus-host patterns and appreciate how little we actually know of the natural forces that have led to such broad relationships. In natural non-agricultural settings rich in plant life, such as in a tropical forest, viral persistence appears to be a common situation and little virus mediated disease is evident. However, there are few systematic studies on this topic. In this chapter, we shall seek to compose an overview of this issue with what can at best be considered incomplete and at worst possibly misleading information. To address this lack of balance in our studies, at times it will be necessary to draw strong inference from a relatively few better-studied examples of natural virus-host relationships. However, such inference from few examples also has the potential to be misleading when generalized. Land plant and insect evolution is highly linked, especially with the angiosperm plants, which depend on insects for pollination. That linked evolution, however, must occur on a fitness landscape in which competing and interacting viral agents represent very important and ubiquitous agents that give shape to this selective landscape of their hosts. The role that viruses have played in the evolution of their hosts has seldom been addressed in the context of either plant or insect evolution. In this chapter, we will first present the overall patterns of the evolution of the host plants, then host insects. We will then consider their viruses. Finally, we shall consider how insect and insect viral evolution intersects with the evolution of plants in an effort to integrate these issues.

The co-evolved trinity of plant, insect and virus. As presented in earlier chapters, early plant and animal life forms both initially evolved in the oceans. From the perspective of a virus, the oceans represent a very supportive habitat in that the aquatic media conducts virus to all nearby hosts, avoiding desiccation and without the explicit need to use vectors. In the oceans, viruses are susceptible to UV light mediated damage and inactivation, which accounts for the majority of virus killing in this habitat. At depths of a few meters, filtration of damaging UV light would aid virus survival. Thus, the interaction of viral systems with light is a crucial aspect of the oceanic habitat. As plant and insect hosts adapted to a terrestrial habitat, clearly the viruses of these hosts were also under selective pressure not only to survive the more direct and intense irradiation from sunlight, but were also now susceptible to much greater desiccation. In addition, the absence of a water media would require viruses to develop much less passive or diffusion based systems for virus transmission to new hosts, such as the use of mobile vector hosts. Thus a major problem confronts virus evolution as it moves to the terrestrial habitat and encounters a major shift in this new habitat and its land adapted hosts. Except for hosts that still required water for egg and larval development (such as amphibians), terrestrial viruses must now find new hosts for virus infection no longer by diffusion in water, but by specific adaptations that would allow multi host transmission. In many cases this will require a motile vector (e.g. insects) to infect an immobile host (e.g. plants). Thus we can see how a virus may provide a direct linkage between the insects and plants.

Early plants and insects. It is now accepted that the progenitors of land plants were green microalgae, based on phylogenetic analysis of chloroplast and mitochondrial DNA. The earliest land based plant descendants of these oceanic algal species would be best represented by the bryophytes, which are simple moss-like plants. The progenitor organism of terrestrial insects now appears to have been some type of shrimp species that resembles fairy shrimp. This first land based descendant can be represented by Blattaria species (which include modern cockroaches). Of these two orders, clearly the insect progenitors are the more complex organisms than early plants, since they have already developed complete organ systems. However, virus infections appear to be prevalent in the representative progenitors of both early plants and insects. As already mentioned in chapter 4, algae are highly prone to infections with large DNA viruses of the phycodnavirus-like family. Both lytic-acute infections and persisting, integrated and inapparent infections (transmitted through gametes) are known. Other virus types, such as RNA viruses, are also known but are much less common in algae (discussed below). Shrimp species are also known to be infected with an array of viruses which includes almost all families of DNA and RNA viruses. As mentioned in the last chapter, the picorna-like TSV has especially been a problem for shrimp farming. Since most of these viruses appear highly specific to and congruent with their host, it is inferred that similar virus relationships prevailed in the oceans at the origin of land plants and insects. Thus we fully expect that at the origins of these two orders of land species, viruses that infected them were prevalent in the oceans.

At the evolutionary junction corresponding to the initial land colonization by life, we can also see a corresponding sharp shift in the virus-host relationships with respect to both plants and insect animals. Plants underwent adaptation to land, creating bryophytes, acquiring root systems, then acquiring vascular structures. This led to the emergence of the ferns, and later to the origin of gymnosperms. Along with these adaptations we encounter what appears to be another transitional viral void in evolutionary biology. Although viruses of modern plants are very numerous and prevalent, few viruses appear able to infect the extant representatives of early terrestrial plant hosts. This situation is similar to the 'viral void' seen in the evolution of Dictyostelium and nematodes as presented in chapter 5. There are relatively few, if any, viruses of these early plant hosts; i.e. there are essentially no reports in the literature of such viruses. This pattern of viral paucity also includes the ferns, which represent a very successful plant that was an early colonizer of land. It must be stated, however, that such an absence of data is always suspect and possibly distorted due to insufficient or uneven analysis. However, we can state with some confidence that at least the acute viruses which infect such plants are exceedingly rare. Admittedly, little data exist concerning the possibility of persistent or inapparent virus infections in these same hosts, but unlike the situation often observed with other persistently infected hosts, such as the filamentous algae or filamentous fungi, virus reactivation in reproductive tissues and gametes has not been reported with the representatives of these early land plants.

The genomes of flowering plants are known to be highly colonized by various genetic parasites (mainly retroposon derived elements). However, the evolutionary patterns of the colonization of early plant genomes by genetic parasites are not well studied. The main problem is that the genomes of such plant species have not been sequenced. However, the little information that currently exists nevertheless supports the idea that high level colonization by retroposons, retroviruses and their derivative elements is much less prevalent in these early plant species compared to the genomes of flowering plants and gymnosperms. It is not known, however, if such genome colonization by genetic parasites has any consequence for some virus-host relationships. Thus with the evolution of higher plants, we see a great accumulation of genetic parasites.

Angiosperms and virus linked species radiation. With the evolution of gymnosperms, some viruses that are able to infect these species become apparent. However, a very large radiation in both host species and virus types occurs with the evolution of flowering plants, the angiosperms. Here we can see an explosive radiation of both host species and the viruses that infect them. In fact, this same origin and radiation of so many angiosperm species was noted by Darwin to pose a puzzling and significant problem for evolutionary biology, as there seemed to be no apparent driving force for such large scale speciation. With respect to the corresponding radiation of viruses, the great majority of viruses that infect angiosperms are various types of +ssRNA virus families that appear to have several distinct origins. However, some of these virus families are unique to plant species. As will be discussed below, the confusing complexity of the nomenclature of these viruses poses a significant problem for non-plant virologists, similar to the problem with endogenous and autonomous retrovirus nomenclature discussed in chapter 8. A most difficult issue will be to evaluate the possible evolutionary origin of all the families of viruses extant in modern plant species. As will be discussed below, there is reason to think that many of these viral families are polyphyletic and may not share common origins. For example, ss+RNA virus families exist in both rod and isometric forms, which are thought to have distinct lineages. In addition, some of these virus families have segmented genomes, which are also considered distinct. Yet we would expect that all of these extant viral families can most likely trace their origins to some oceanic virus, similar to those found in oceanic algae, animals and fungi. The most prominent and diverse families are +RNA plant viruses. As noted in chapter 6, if we loosely define the 'picorna-like' viruses as ss+RNA viruses, we can see that the oceans are now known to contain six known and four previously unknown families of such viruses. In plants, two of these known families, the Potyviridae and Comoviridae, are amongst the most numerous plant viruses in terms of species counts. Thus some linkage to the 'picorna-like' viruses found in the oceans seems clear. The , and the rod shaped are the next most species rich plant viruses, but are not considered to be 'picorna-like' viruses. However, as presented below, we may still be able to trace some of these virus families to those that infect likely predecessor hosts (such as fungal ). Of special interest are the plant , (a floating genus) and the family whose taxonomy is tightly linked to that of its angiosperm host. These two viral families appear to have distinct relationships to their hosts, discussed below. Some viral families are associated with high mortality of their host whereas others are not associated with much host disease. In terms of early evolution, we might infer that the various plant +ssRNA virus families most likely evolved from viruses that might be found in oceanic hosts. Although some +ssRNA viruses may infect algae, such viruses are especially known to infect shrimp species. Thus it will be important to consider the possible interactions between insect and plant orders via viruses. It is curious that though green algae were prone to infections with large nuclear DNA viruses, no large or intermediate DNA viruses are known for any land plant species, so there is clearly a sharp demarcation of the plant-virus relationship.

Insect-plant-virus radiation. In addition, with the origin and radiation of angiosperm species we also see a corresponding radiation in insect species. Like flowering plants relative to other plant orders, insects also show the greatest diversity of all of the animal kingdom, so they represent a highly diverse and successful order. Flowering plants are, for the most part, dependent on insects for their pollination. Conversely, many insects depend on plants as food sources. This has the practical consequence that in commercial terms, insect predation accounts for the majority of agricultural losses. Pollinators are mostly flying insects (hymenoptera, dipterans, ), which evolved at the time of angiosperm evolution. Most flying insects also undergo metamorphosis from larval 'worm-like' forms into adult winged flying forms. The origin of insect flight and the evolution of metamorphosis both appear to be linked to plant evolution. As an example, a linkage of flying insect evolution to the origin of vascular plants with tall stature has been proposed. In addition, many larval forms of flying insects are responsible for much of the foraging on and consumption of plants, but these larvae may themselves be subjected to being parasitized by parasitoid wasps. As discussed below, endogenous DNA viruses of these wasps can be essential for this parasitoid-host relationship. Intriguingly, it appears that plants may be able to manipulate this parasitoid-host insect relationship by producing signal molecules that recruit the parasitoid, following predation by the larval host of the parasitoid. Finally, it needs to be emphasized that by far the greatest number of insect species that are vectors for plant viruses fall within the hemiptera order (true bugs). This order includes aphids and , which is a very diverse order. For the most part, plant viruses transmitted by these insects are specific to the insect vector, although they seldom replicate in the vector. In the case of aphids, this virus-vector specificity can be very tight. The unusual biology of aphids (e.g. aphids are often parthenogenic, haploid and clonal) and their link to viruses make an important point of the intimate relationship that exists between viruses, hosts and vectors. Several explanations account for the specificity of a plant virus to its insect vector. Often, this specificity relates to the specific binding of the virus to surfaces in the insect mouth parts. In the case of some aphids, however, it has been established that virus specificity relates to effective viral persistence within aphid cells, and this is mediated by virus specific binding to heat shock proteins (GroEL) expressed by endosymbiotic bacteria (Buchnera) of the aphid. Like the various orders of plants, the insect orders also have broad patterns of virus-host relationships that are well maintained.

The earliest land insects appear to be of the Blattaria order, which includes modern cockroaches. Although some +RNA viruses have been reported for roach hosts, they are more commonly found in other species, such as bee and cricket species. Overall, however, there appears to be a clear paucity of +RNA viruses in insects, not only in representatives of early insect orders but including those insects, such as aphids, that function as vectors for plant RNA viruses. Other major insect viral families appear to be absent from these early insect orders (i.e., baculoviruses, entomopoxviruses, negative strand viruses etc.). This contrasts with Orthoptera insects (crickets/), which are known to support infections with ascoviruses and entomopox viruses. Overall, there appears to be a clear and general trend in the insect-virus relationships. The majority of insect viruses are DNA viruses of moderate to large genome size. Mostly, these viruses (such as baculoviruses) appear to have lytic/acute relationships with their host and also tend to infect and cause acute disease in larval forms of insects. Unlike bacteria and algae, however, there are currently no reports of insect dsDNA viruses (non-gemini virus) that are able to integrate and persist in host genomes as do the phaeoviruses of filamentous algae or lysogenic phage of bacteria. However, as discussed below, the polydnaviruses are integrated genomic viruses of parasitoid hymenoptera.

Thus there are many observations that suggest strong linkages between plants, insects and their viruses. If we are to consider the combined evolutionary relationship between land plants, insects and the viruses that infect them, it seems clear that they will not be represented by a simple tree-like topology with a common trunk. Instead, some type of overlapping, reticulated or superimposed network will be required. A simple tree topology will not suffice. This superimposed network will need to show clear patterns of co-evolution between these distinct orders of life. Along these lines, we might envision two major lineages (plants and insects) that are mostly congruent with each other, but symbiotic with fungi at their origin. Viruses will have both host congruent and incongruent patterns of evolution. Typically, viral agents are not considered as a basic element of the evolutionary tree of their host. However, as presented below, viruses may well provide the thread that binds the fabric of co-evolution in plants and insects.

Plants and their viruses

All green plants can be considered as belonging to two major phyla: the Streptophyta (which includes all land plants and their charophyte relatives) and the Chlorophyta (which includes the rest of the green algae). All land plants appear to have evolved from one common progenitor, which underwent diversification to currently existing families and some extinct plant families. A common origin is inferred from various features common to all land plants, such as the characteristics of chloroplasts and RNA editing in plastids. The overall pattern of land plant evolution begins with green microalgae in the oceans, which were associated with a particular type of chloroplast. Plant adaptation to land is now accepted to have occurred along with the acquisition of a root system via endosymbiosis with filamentous fungi. The origin of leaves, followed by the origin of vascular plants, were subsequent major developments. The origin of seeds in gymnosperms and the origin of flowers were the subsequent developments, but these acquisitions are associated with major radiation of plant species. The currently extant representatives of early divergence (for example moss) are considered paraphyletic to the lineages that led to angiosperms. Major taxonomic clades of extant plants accordingly are Gnetales, Sphenopsida, ferns, conifers, gymnosperms, monocotyledons and angiosperms.

Early plants/virus – Land plants (embryophytes) are considered to have evolved from charophycean green algae, as both of these orders have lignin, which is absent from other green algae. In addition, both orders also have similarities in chloroplast, such as the presence of group II introns in their chloroplast tRNA, which is a conserved characteristic of all land plants.

barnaviridae and further that it has clear similarity to some plant RNA viruses (discussed below), it seems quite possible that this virus
Bryophytes could represent viruses of fungi that were present represent a lineage that diverged early from early in the evolution of land plants. Thus, some that of land plants and is now considered viral lineages might date back to the very origins praphyletic to them. These bryophyte of land plants and their associations with fungi. species resemble some algae in that they produce diploid motile spores, but no seeds. RNA viruses of fungi. Mushroom Bacilliform The evolution of early land plants is also Virus is the sole member of a virus family and is associated with the development of a root unique to fungi. Its genome consist of one RNA system and it is now generally accepted that segment of 4 kb that encodes a replicase, a coat a symbiotic relationship with filamentous protein and a 72 kd Orf Z which appears to be a fungi, with their capacity to make helicase. It has no movement proteins. Thus penetrating hyphae was the likely source of MBV represents a relatively simple RNA virus. the land plant root system. A symbiosis MBV has striking resemblance to Alfalfa mosaic between plant root systems and separate virus (AMV), which is one of the better studied fungal organisms is especially evident for members (of which there are many). tree species, such as pines, which generally Within the potyvirus group, Spanish Latent virus have a mantel of mycorrhiza covering the (SPLV) appears to be the basal member based on root system of essentially every tree. Fungi phylogenetic sequence analysis. AMV has 3 can also be found associated with lower RNAs, one of which encodes a movement plants and can produce structures analogous protein, thus it is a more complex virus then is to vesicular arbuscular of higher plants. MBV. It has been suggested that MBV may Various extant symbiotic fungi show such represent a progenitor virus with one RNA associations with lower plants, such as segment that evolved to the multiple RNA zygomycetous, ascomycota and segments of potyviruses. 
This idea may be basidiomycota. These associations are general in that most RNA viruses with similar to that of root fungus mycorriza and segmented genomes may have evolved from would be consistent with hypothesis that progenitor viruses with single RNA segments. symbiosis between fungi and plants was However, the possibility that multiple viral important for land plant evolution. lineages may have converged to generate multisegmented viruses is also viable. Bromoviruses also have multipartite RNA However, this hypothesis of symbiosis with genome (typically 3). Its RNA replicase shows fungi poses an interesting situation with some similarity to that of TMV. However, in respect to possible viral relationships. As other genes Bromoviruses are more similar to essentially all extant filamentous fungi cucumoviruses so it is possible that the appear to be persistently infected with Bromoviruses may have had multiple viral various types of RNA viruses (mainly progenitors. hypovirus and other RNA virus, see Ch. 5) we fully expect that the progenitor to the land plant root system would also have been RNA viruses of algae. Although some DNA persistently infected with RNA viruses. If virus families known to be prevalent in algae so, this would provide a good habitat for the (large DNA phycodnaviruses and phaeoviruses), origin of some land plant RNA viruses by these viruses have no counterparts in land plants adapting to the new plant host. Given that and clearly could not have been direct Mushroom Bacilliform Virus (MBV, + ss progenitors to viruses of modern plants. RNA) is a unique and the only member of

However, some RNA viruses of algae are also known which might be plant progenitor viruses. Some rod shaped RNA viruses are known to infect algae. Of these the best characterized and most relevant is Chara Algae Virus (CAV). This virus indeed has several characteristics which suggest that it could represent an ancestral virus to various extant plant viruses. A relationship between CAV and TMV can be seen by serological cross-reactivity with capsid proteins of TMV. Thus CAV is a tobamovirus-like agent that resembles viruses known to infect a wide range of angiosperm species. Furthermore, phylogenetic analysis supports the idea that CAV might represent the oldest and most basal member of this viral clade. The implications of these results are that CAV does not seem to represent a more recent adaptation of a modern plant tobamovirus to algae, but rather that CAV may be the oldest member of this virus family, suggesting further that such viruses may have been prevalent prior to the origin of angiosperms.

Fungal vectors and zoospores. Besides directly supporting the replication of fungal viruses, fungi are also known to transmit RNA viruses of plants, especially the furoviruses. Of these, CGMMV is a well-studied example. CGMMV also has the curious ability to parasitize algae. Partitiviruses are also known to be able to infect fungi, as well as to infect two genera of plants. In plants, partitivirus infection is typically a cryptic or inapparent infection with no or few symptoms. Thus this seems to be a persistent virus-host relationship. These infections tend to be seed transmitted. The biological relationship between these plant viruses and their fungal vector is striking. Fungal mediated virus transmission often involves the motile fungal zoospores, which are able to swim to the roots of host plants, carrying virus with them. In a sense, the motile zoospore provides the same function that an aphid does for many angiosperm infecting viruses: motility. These viruses are mostly rod shaped, +RNA containing viruses which will become encysted within the zoospore. In most cases, the virus enters the spore while in the fungi, in vivo, establishing a tight biological relationship between virus and fungi. Once encysted within these zoospores, virus is protected and can survive over 20 years in some vectors.

Spores, seeds and early plants. It is likely that in both algae and early land plants there existed motile spores, prior to the evolution of seeds. Most algae are haploids that occasionally become diploid during sexual reproduction and zoospore generation. Interestingly, in filamentous brown algae and higher fungi, zoospore formation is frequently associated with reactivation of species specific persistent virus replication (see Ch 4). Along with the evolution of seeds in plants, there also must have occurred a shift from these diploid algae-like spores to haploid seeds as well as a shift from algal-like haploid to diploid stoma seen in many plants. The forces that may have led to this shift in life strategy are unknown. However, it seems likely that examining representatives of early land plants might allow us to gather some information on this issue. Identifying the likely early ancestors of land plants is based mainly on the analysis of mtDNA and cpDNA. Such analysis suggests that the prasinophyceae algae appear as early descendants of green plants.

Plastid DNA and early plants. Algae such as Mesostigma viride appear to be basal to both streptophyta and chlorophyta. This plant has a circular mtDNA that resembles that of other green algae in that it is small (about 42,000 bases) with high gene density (87% coding sequence, 65 genes). This mtDNA has 4 group I introns (in cox1) and 3 group II introns (in cox2), which are characteristic of all plants. Mesostigma viride is a scaly green biflagellate with very large chloroplast DNA (135 genes), which distinguishes it from other plant species. Mesostigma viride appears paraphyletic to higher plants and may be the earliest green plant to diverge from the lineage that led to land plants. The biological and reproductive characteristics of Mesostigma viride, however, resemble algae more than land plants, strongly suggesting that early land plants were very algae-like.

In bryophytes, such as Codium fragile, the cpDNA is circular. At 89 kb, it is the smallest cpDNA known and also lacks large repeat elements present in other plant cpDNAs. Curiously, within these hosts, the cpDNA has a low evolutionary rate. This is in stark contrast to plastids of other plants, which show high evolutionary rates. It is interesting that the bryophyte plastids also code for an RNA polymerase which resembles the RNA polymerases of bacterial phage. However, in moss, such as Physcomitrella patens (which is a sister group to all land and flowering plants), the plastid RNA polymerase has become a nuclear encoded chloroplast RNA polymerase, PpRPOT 1/2. These are clearly phage type RNA polymerases and appear to have been a relatively recent acquisition into the genomes of these land plants. These observations suggest that viral-like agents participated in the early evolution of cpDNA in these descendants of early land plants. It is interesting to note that a virus, SBV, also encodes a phage-like (T7-like) RNA polymerase. Furthermore, SBV also infects chloroplasts, an otherwise unusual situation in viruses of higher plants.

Early land plant characteristics. The origin of vascular plants and the development of seeds represent two linked, early and major developments in land plant evolution. Vascularity appears to have evolved from bryophyte-like ancestors, which tend to be parasitic sporophyte species whose spores are diploid. Vascular plants have two major classes: lycophytes (clubmoss) and euphyllophytes (seed plants, ferns, horsetails). Lycophytes have leaf-like structures, but have a low plant stature and do not make seeds. Euphyllophytes are vascular plants that can be of high stature and with seeds. These are two major characteristics whose development allowed plants to adapt to new, previously unavailable habitats. The increased plant stature of these vascular plants (such as with trees) also required a corresponding increase in root penetration, so there must have been some coordination with other characteristics. The creation of seeds clearly enhanced the habitat available to plants and allowed plants to colonize drier land. This colonization most likely led to enhanced soil production and subsequent runoff. It has been suggested that, as a consequence of increased runoff resulting from land colonization, increased algal blooms, resulting in widespread anoxia, lowered CO2 production and a possible early glacial period, would have affected the oceans themselves. However, as discussed in chapter 4, most algal blooms are also associated with or terminated by corresponding increases in lytic virus production. This suggests that a major shift in the global virus/host dynamic was likely to have occurred at the origin of vascular land plants.

Virus paucity in early land plants. The situation with respect to viruses and early land plants is curious. Although these plant species are relatively well studied (6,000 scientific papers published in the last 10 years), and viruses in both algae and fungi, the land plant predecessors, are prevalent, there are no reports of viruses infecting any of these early land plant species. No virus, for example, has been reported for any bryophyte. The paucity of virus reports also applies to ferns, which like bryophytes are also spore formers. Very few ferns have been observed to support the replication of any virus, although tobra-like virus particles have been found in hart's-tongue fern. Ferns are also sufficiently well studied to have made likely the identification of prominent virus infections. It is therefore clear that lytic virus infection of these plant species is either rare or non-existent. However, as has been argued in earlier chapters, the lack of virulent virus infection does not preclude the possibility of prevalent infections with persistent, non-lytic virus, as we have described for higher fungi and algae. The observation of tobra-like virus particles in one fern might be such a situation. Yet, persisting viruses are often identified by the presence of their nucleic acids in host cells or the transient production of virus particles (such as in zoospores of brown algae and higher fungi), and no such observations have been made, suggesting that persisting viral infections are also not common in these plants. Given the technical difficulty of observing viral persistence, however, it remains very possible that viral persistence is prevalent and has simply been overlooked in these plant species. Still, relative to higher plants, there is clearly a void of virus disease in lower land plants. As mentioned, this putative 'viral-void' is most reminiscent of a similar void described for the early aquatic animal life forms (such as C. elegans and sea urchins) in chapter 6. In that animal situation, it was noted that these hosts had also acquired the RNAi system, which in these early animals has the added characteristic of being a systemic and transmissible response and could preclude most viral infections of these hosts. However, in the case of early land plants, we do not currently know of the existence of a general host antiviral defense system that might be able to preclude infections by various viruses. Since the genomes of these plants are poorly studied, however, there may well still exist some yet to be discovered system that can preclude infections by numerous viral families.

Gymnosperms (gym = naked, sperm = seed) represent a major order of land plants characterized by seeds not enclosed in an ovary, such as the cycad/conifer families and various palm-like plants. Extant gymnosperms are considered to be a sister group to angiosperms and diverged early on from other higher plants. Thus gymnosperms do not appear to have been direct predecessors of angiosperms. Gymnosperms are monophyletic. Within the gymnosperms, Gnetales appear to be the closest relatives to the conifers. Based on fossil evidence, Cycads appear to have been the first plant lineage to diverge from gymnosperms, followed by ginkgo, then Gnetales and Pinaceae, which is a sister group to the monophyletic conifer families. Extant Cycads are thought to be essentially living fossils. Some viruses of gymnosperms are known, but their numbers appear to be very much reduced compared to angiosperms, discussed below. Some reports can be noted, such as with Cycas revoluta, which has been reported to harbor a nepovirus-like virion particle. It seems likely that such an observation could represent persistent infections. However, as is often the case concerning persistence, this issue has not been evaluated carefully. Prevalent infections with acute viruses, however, are clearly not observed in most gymnosperms.

Genomes. Especially noteworthy are the genomes of gymnosperms, which are often highly colonized by various families of retroposons. These retroposons can constitute the majority of the DNA of gymnosperm genomes. Curiously, authentic autonomous retroviruses for gymnosperms, or any other plant for that matter, have not been observed. This poses the question of how these retro-agents gained initial access to plant genomes. Even more curious, and in apparent distinction with most other eukaryotic genomes, plant genomes appear to lack any full length versions of endogenous retrovirus which contain env sequences, although a few copies of gag and RT containing plant retroposons have been identified in rice (Oryza sativa) genomes. Pararetroviruses (which replicate via extranuclear DNA genomes) are known for many plant species and are also represented in relatively small numbers within the genomes of gymnosperms. However, as these agents lack integrase, their integration could be dependent on other sources of integrase to allow genome colonization. The presumed source of such integration would likely be activated RT from endogenous plant retroposons.

The analysis of gymnosperm retroelements was initially accomplished using PCR consensus primers to amplify and sequence the various families of retroposons. From this, it was observed that gypsy-like elements, their non-LTR LINE derivatives and copia-like elements were the most widely distributed. Most of these element families appeared to be monophyletic and well conserved within a specific plant lineage, even though the different retroposon groups, such as gypsy, copia and LINEs, were widely separated from each other within one plant lineage. The great majority of these elements lack intact gag and pol ORFs, so no functional proteins could be expressed from these elements. The gypsy-like elements are rather similar to the retroviruses found in animal genomes. However, in no situation has a transmissible version of a plant retroposon been reported to have been propagated, so there seems to be no example of reactivation of an endogenous plant retrovirus. Badnaviral related elements formed a distinct clade from those of the retroviral related elements. These Badnaviral related elements were basal. Curiously, the clades as defined by the various retroelements did not clearly define major taxonomic plant clades (conifers, ferns, gymnosperms, etc). This is rather different than the ERV-genome situation in mammalian genomes (ch. 8). Within various families of trees for both gymnosperms and conifers, gypsy and copia are both monophyletic. This monophyletic pattern was evidence against the recent horizontal transfer of these elements to these plant lineages and suggested they were acquired at the origin of these lineages. We are thus left with a pattern of retroposon colonization that is rather reminiscent of that seen in the early animals (e.g. C. elegans, sea urchin). For unknown reasons, gymnosperm lineages, like the simple animals, are associated with early events of widespread colonization by lineage specific retroelements. The great majority of these elements are defective, especially for the env sequence. However, some intact copies appear to have been conserved, but the possible significance of such conservation will need to await comparative full genome analysis of various gymnosperm genomes.

Angiosperms are land plant species characterized by ovules enclosed in an ovary or the flower of flowering plants. Angiosperms have the most complex reproductive organs of land plants. Furthermore, flowering plants represent by far the largest and most diverse of all land plants with over 250,000 species, and within angiosperms orchids are the most diverse and also an old lineage. This enormous diversity was an enigma to Darwin and was referred to as Darwin's abominable mystery. The outstanding mystery remains in that there is still a need to explain the forces that might have led to such variation. A linkage between flowering plants and insects seems clear. The large majority of flowering plants rely on pollination by insects for their reproduction, thus plant reproductive success is partially determined by insect behavior. However, less well appreciated is the large diversity of plant viruses also associated with angiosperms, discussed below. Flowering plants exist in two major families. Monocotyledons are flowering plants with a single cotyledon (seed leaf / embryo) such as lilies, orchids, grasses, and cereals. Monocotyledons constitute about 1/4 of all flowering plants. The origin of angiosperms can also be traced to the early Cretaceous. In flowering plants, there is strong evidence of repeated, recent and diverse transfers of mtDNA genes to the nucleus. Therefore some system or process must exist that allows nuclear gene migration. In addition, angiosperms have experienced a lot of what appears to be horizontal transfer of sequences, especially into mtDNA. The invasive homing group I introns of the cox gene have been acquired over 1000 times in various angiosperm species. The mechanisms that allow these sequences to move, seemingly from one species into another, are obscure. However, the possibility that these 'horizontal transfers' might actually reflect independent colonization events mediated by related genetic parasites, such as viruses, cannot be ruled out and also seems probable. Thus unlike the early land plants, the plastids of angiosperms are rapidly evolving and under constant invasion by genetic parasites. Of 14 extant monocot lineages, all appear to have diverged from each other in the early Cretaceous period (~100 MYBP).

Viruses and angiosperms; overall patterns. Like the angiosperms themselves, the viruses that infect these plants are by far the most diverse and numerous of the entire plant kingdom. Because of this diversity, most published considerations of viruses of plants are actually focused almost exclusively on viruses that infect angiosperms. Consequently, in this chapter, there will also be a more detailed examination of these viruses and their possible origins. Overall, several patterns of viruses and their angiosperm hosts can be noted. As mentioned repeatedly, most viruses of angiosperms tend to be RNA viruses of plus stranded polarity. An additional distinction with other orders of host is that angiosperms are also frequently infected with mixtures of virus agents, including satellite viruses. This mixed infection is also common in field isolates. As also previously noted, angiosperms lack any of the larger dsDNA viruses, beyond the geminiviruses. Unlike insects and algae, no members of the phycodna-like viruses, pox-like viruses, baculoviruses or herpes viruses are known for any plant species. A prevalent explanation for this lack of DNA viruses has been that plants, with their very thick and relatively impervious cellulose cell walls, present an impenetrable barrier for any moderately sized DNA viruses (although Agrobacterium can penetrate plant cells, leading to integration of T-DNA plasmids). Communication between plant cells occurs via plasmodesmata. Plasmodesmata have relatively small openings that normally restrict movement of macromolecules between cells and could preclude DNA virus transmission. These structures are targets of many RNA viral proteins and can be enlarged by movement proteins of various +RNA viruses. That only plant RNA viruses have such movement proteins, and that these proteins are known to be crucial for plant virus viability, strongly supports the idea that virus movement is a special problem in plants. However, DNA viruses of algae and other aquatic cells are clearly able to penetrate tough cell walls by hydrolyzing an opening and injecting DNA from the outside of the cell. Some phycodnaviruses actually have ORFs that appear to code for various hydrolytic enzymes and even cellulose synthase, suggesting that they can code for enzymes that might be needed to breach plant cell walls. The absence of any such plant DNA virus might indicate that the thickness of plant cell walls is simply beyond the capacity of such external viral injection mechanisms to overcome. Another very curious but general issue of the angiosperm virus relationship concerns the retroviruses. Angiosperms lack any true retrovirus that is able to integrate into host genomes following reverse transcription of RNA viral genomes. Yet angiosperms do support various pararetroviruses, such as Badnaviruses, that replicate using reverse transcriptase via cytoplasmic DNA forms. Thus it is difficult to explain the absence of true retroviruses in plants. This observation is especially puzzling when it is noted that the genomes of most plant species show high level colonization by various retroposons, clearly derived from retroviruses, which include integrase ORFs. It seems that retroviruses may have been highly active early in the evolution of plant genomes, but are no longer supported by these host orders.

In terms of presenting a habitat for virus, plants have several features worth noting. Plants lack adaptive immune systems, but clearly retain an array of post transcriptional control and other defense systems. Although plants do not retain the fully transmissive RNAi system, they do show strong regional response to dsRNA and a regional RNAi induction to affect the possible expression of viral genomes.

Because of the preponderance of plus stranded plant RNA viruses, at this time it is worth reconsidering in greater detail the likely origins of these virus families. Currently, 14 families of plant virus are known, covering 70 genera. Mostly, these viral families are distinctive. As previously mentioned, overall there are two clearly recognizable physical types of plant RNA viruses: rod shaped and isometric. In addition, within these broad morphologies, there exist numerous viral families with variations in the following characteristics: single or multiple segments, membrane coverings, distinct 5' and 3' RNA ends (with or without 5' terminal proteins, 3' poly A), gene order, and expression of polyproteins or use of internal ribosome entry sites for host translation. All of these features, along with methods of transmission (plant host, insect vectors etc.), are used to define specific virus families and result in a large set of angiosperm viruses.

Such a large array of virus types immediately poses the question of whether there can be any common lineage to these viruses or whether they must be polyphyletic. However, the most common feature that seems to link this entire set of viruses is the need to replicate RNA using an RNA dependent RNA replicase (RdRp). Thus, much attention has been directed towards the analysis of RdRp to determine if it can be used to trace viral origins. However, there also clearly appears to be a fundamental problem concerning the rate of sequence change in RNA viruses and using this information to infer origins. It has been recently argued (Holmes, 2003) that essentially all extant RNA viruses appear to have evolved as recently as 50,000 ybp. This result is based on molecular clock estimates from various RNA viral genomes. Because the rate of error in replicating RNA templates is high (~10^-3 substitutions per site per year), viral RNA genomes can essentially erase their ancestral or historic sequence information rapidly. This poses a significant practical problem for establishing the phylogenetic relationships of RNA viruses. Yet it is equally clear that various RNA viruses are phylogenetically congruent with their host, strongly implying co-evolution with host on the much longer time scale of host evolution. The conservation of molecular characteristics (active sites, capsid types, molecular strategies etc.) might then reflect the need for the virus to retain limited sequence solutions to meet functional constraints on a background of very high rates of sequence change in the rest of the genome. Such an overall high rate of change would result in phylogenetic branch lengths that are too long to quantify, hence too long to provide meaningful lineage identification. Virus-host co-evolution would then be visible due to the maintenance of restricted molecular constraints on otherwise high rates of viral evolution. In addition to the problem with high clock rates, RNA viruses appear to have undergone frequent recombinational events that appear to have resulted in new viral lineages. Such a situation would also clearly require that these viral lineages have mixed ancestry, thereby further complicating phylogenetic analysis. Thus the existence of numerous +RNA viral families that infect angiosperms does not really raise a theoretical problem in terms of understanding how such viral variation might have come about, but does pose a practical problem in that our methods of analysis seem unable to deal with such high diversity.

Several early reports have argued that all +ssRNA viruses appear to have evolved from a common ancestor (see Argos '84, '88; Gorbalenya, '95). A 'GDD' sequence motif was identified that was essential for the RNA polymerizing activity of the enzyme, and thus this motif appeared to link most viral RNA dependent RNA polymerases (RdRp) together. However, others (Zanotto et al., '96) have argued that phylogenetic analysis does not support the inclusion of all the plus RNA viruses as having evolved from a common ancestor. Yet, more recent analysis by Gibbs et al. (2000) did support a monophyly for RdRp and postulated that an alpha-like virus supergroup represented the predecessor for the +ssRNA viruses. Such a supergroup might have conceivably evolved from the single segment isometric Leviviridae family of phage. Some have also suggested that dsRNA viruses may have originated multiple times from +ssRNA virus predecessors (Koonin and Dolja, 1993; Ward, 1993). RNA viruses have also been classified into six supergroups. These supergroups include Carmo-like, Sobemo-like, Picorna-like, Flavi-like, Alpha-like and Corona-like viruses. Interestingly, most of these supergroups appear to be represented in ocean isolates. Clearly +RNA viruses can be split into major families based on RdRp. These families are distinct from those generated using plant movement proteins (which are polyphyletic). The replicases for all +RNA viruses can also be placed into three distinct supergroups. Group II includes dianthoviruses, tobamoviruses, carmoviruses and a subset of other viruses (described below). Also, the replicases of RNA phage Qβ and yeast dsRNA elements are within this group. The inference of this inclusion of RdRp from viruses of lower organisms is that this lineage of RNA virus predates even the origin of green algae, the predecessors of plants, but includes major families of all the +RNA plant viruses.

Additional results appear to support long range evolutionary linkages between +RNA viruses. The capsid of tobacco ringspot virus shows an antigenic link to the picornavirus superfamily. This virus is of the nepovirus genus, which is known for causing disease in fruit crops. In fact, these viruses and picornaviruses appear to be within one superfamily according to similar structures determined from crystallized capsids. Yet we can see that the Comoviridae have 2 RNA segments, expressed as polyprotein, and appear to be distinct. In contrast, the Bromoviridae have 3 +RNA segments. Multisegmented viral families are expected to most likely have evolved from single segmented ancestors. One member of this family that may represent a basal version is OLV-2, which infects olives in a latent way. The virus has no vector and is not associated with disease, which appears to indicate a persistent viral life strategy. In addition, and also consistent with a persistent life strategy, it has a narrow host range.

Virus-plant dynamic recently altered by human activity. As we consider the relationship of plant viruses to their angiosperm host, we need to raise a note of caution. There are an estimated 250,000 species of angiosperms, by far the most diverse of plant species. Given that most angiosperms are susceptible to virus infection, and given the numbers of viruses infecting these hosts, it is also estimated that there may be up to 26 × 10^6 possible virus/host combinations in the world. This is a staggering number that has yet to be sampled sufficiently, but there are other reasons to think that our understanding of these possible viral-host combinations is very narrow or distorted. For the most part, as mentioned, plant viruses have been studied due to their ability to cause acute disease in crop species. Besides the fact that this focus tends to ignore non-pathogenic or persistent virus-plant relationships, crop disease is strongly influenced by human activity. For example, humans are known to have introduced viruses into new areas, introduced vectors into new areas, created large, dense monocultures that are highly virus susceptible, and prolonged crop seasons, thus allowing virus transmission. All of these situations tend to result in viruses as agents of disease in crop species. It thus seems possible that human agricultural activity is distorting the natural relationship between plant viruses and their hosts as they have existed for much of plant evolution.

Angiosperms, though, appear to have arisen about 120 MYBP. Within these, Solanaceae appear to be the earliest angiosperms that can be identified from fossils of the Cretaceous period (65 MYBP). Within the Solanaceae, Nicotiana species have been most studied as models of plant-virus relationships. Currently the major center of species of Nicotiana is in South and Central America. This area also corresponds approximately to the center of native plant resistance to TMV (see below). It has been proposed that the current distribution of Nicotiana species may relate to the early continental break up. A group of viruses also clusters with these Nicotiana hosts, and this clustering has been used to suggest co-evolution between these viruses and host. Such suggestions would be consistent with the idea that these host

be explained. However, sometimes Tobamoviruses are isolated from unexpected sources. A few of these unusual sources, such as ORSV of orchids, are known to establish persistent infections resulting in inapparent maintenance within their host. Sequence analysis suggests that the most basal members of the tobamoviruses appear to be Sunn-hemp Mosaic virus and CGMMV. CGMMV is of special interest as fungi, as described above, also transmit this virus. Assuming these two viruses represent early tobamoviruses, the filamentous viruses of Charophycean algae, but also possibly of plasmodiophoromycete fungi, would seem to be possible viral progenitors of modern Tobamoviruses.
and the viruses that infect them are biologically linked at the earliest period of Other Tobamoviruses have also received much angiosperm evolution. attention. ToMV is of interest as it has been isolated from glacial ice cores, which would correspond to viruses prevalent about 10,000 Models of acute plant viruses ybp. Such isolates are identical to contemporary ToMV isolates, further supporting the view that Tobamovirus (a floating viral genus) can be tobamoviruses tend to be genetically very stable. considered as a well studied example of rod- This result also indicates that molecular clock shaped RNA viruses of plants. As such we estimates noted above must be highly inaccurate can examine them as models for virus host when applied to such stable virus populations. relationship and possible origins. In most The most studied Tobamovirus is clearly TMV, host populations, Tobamovirus populations the very first virus ever characterized. Despite are both stable and broadly congruent with this distinction, as noted above the natural host angiosperm. Furthermore, biology of TMV and TMV-host relationships Tobamovirus taxonomy mostly correlates was not well understood. Recent studies now with the taxonomy of it host plant. Many of suggest that the original habitat for TMV appears these host are solanaceous plants (members centered about Peru, Bolivia, Brazil. (see of the night shade family), thus they would Holmes). The area is known to have 3 species of be expected to have emerged early in nicotiana (glauca, raimondii, wigandiodes) that angiosperm evolution. Most of these viruses tolerate TMV infections. These nicotiana appear to have co-diverged with host. This species show few or no symptoms when infected would include tobamoviruses that infect by TMV and thus appear to show a persistent life Solanaceous, Brassicas and legume host. strategy in these host and provide a long term The majority of these infections appear to be niche for TMV. 
TMV was first studied due to its acute infections as they are disease ability to cause severe disease in the crop associated, which would not appear to offer species, N. tabacum. In this host, TMV is only an explanation for the virus-host an acute agent. As N. tabacum is an congruence. Possibly, a narrow host amphidiploid and thus a species found only in restriction can maintain virus-host crops, TMV from N. tabacum is most likely a congruence but then maintenance of genetic ‘crop fugitive’ and thus represents an unlikely stability would pose a problem that needs to origin of this virus. This point serves to

emphasize the consequences that human activity can have on virus host relationships.

Other rod-shaped plant viruses show distinct host relationships. The Carlaviruses (floating genus) are distinguished by their long filaments and a single-segment 8 kb RNA with a poly A 3’ end. Disease caused by Carlaviruses is rare, thus this virus family is poorly studied. However, persistence with this virus can be highly prevalent. Carlaviruses tend to be found wherever the host plant is found. The Carnation latent virus is possibly the best studied example of the Carlaviruses and tends to induce either mild disease or asymptomatic infections. Another well studied member is PSV, which is found as a persistent or latent infection in most potatoes. In the case of some Carlaviruses, transmission is via aphid vectors, but some Carlaviruses have no known vector, nor are they known to be seed transmitted. Examples of other latent infections include MLV (mulberry), PLV and CPMMV. Carlaviruses are not known to infect cereal crops. These observations strongly suggest that, for the most part, Carlaviruses have a stable persistent life strategy with their host.

In contrast to the Carlaviruses, the Closteroviridae are another family of filamentous ss+RNA viruses. However, these viruses can have either mono- or bipartite RNA genomes. In addition, these viruses are associated with a lot of crop disease.

Luteoviruses can be considered as a well studied model of an isometric +RNA virus supergroup. In contrast to the Tobamoviruses, Luteovirus phylogeny does not correlate with the phylogeny of its host. In addition, Luteoviruses are associated with causing a lot of disease in their crop hosts. There seems to be conflicting evolutionary pressure between Luteoviruses and their host compared to Tobamoviruses and their host. For the most part, Luteoviruses appear to have an acute life strategy with their host, and it seems likely that the difference in viral life strategy may be the important difference that determines the evolutionary pressures faced by these supergroups of virus. In addition, Luteoviruses are frequently observed to assist in transmitting other viruses. Thus there also appear to be many virus-virus interactions associated with this supergroup. Luteoviruses are diverse. Many more Luteovirus lineages have evolved through interspecies recombination than have Tobamoviruses. Thus, in contrast to the genetic stability of Tobamovirus, Luteoviruses appear to have adopted a life strategy characterized by high rates of genetic change and exchange. In this light, a major event in Luteovirus evolution appears to have been the replacement of the RdRp within the lineage, such that RdRp can now be seen to exist in two distinct RdRp sets which have little or no homology to each other. One RdRp set is closely related to Carmoviruses, while the other RdRp set is related to . Recombination may also have been important for the evolution of OSRV. OSRV appears to be an atypical Luteovirus in that it is a virus of orchids (a monocot). Sequence analysis suggests that OSRV may have originated from a recombinant of parental viruses that had unrelated dicot hosts. However, in spite of these recombinational events that appear to have resulted in a new order of RdRp and RSRV lineage, the virion structural proteins of all Luteoviruses are clearly members of a single family according to structural considerations. This indicates that at least some Luteovirus descent should be from mixed lineage. As to the origins of Luteoviruses, it has been suggested that Luteoviruses may have arisen from an RNA virus of animal or insect origin that adapted to plants. However, a fungal origin also seems possible given the similarity of these viruses to those of animals.

Other viruses have also been considered as possible progenitors to the Luteoviruses. For example, the Necroviruses, such as TNV, have a single segment of +ssRNA, with no capped 5’ end. They are associated with limited disease, and are host restricted. They are found to naturally persist in roots of host plants worldwide. In addition they are commonly associated with the smaller satellite virus STNV-1. Another possibility for the origins of Luteoviruses are the Carmoviruses, as they also seem to be a source of RdRp in some Luteoviruses. These are small icosahedral +RNA viruses of about 4 kb in length. Carmoviruses tend to cause mild or asymptomatic infections in their host. TCV is a well studied member which shows a very restricted host range. Carmoviruses, such as CarMV, can be found in ornamental host plants. That it can be transmitted by fungal zoospores (and beetles) suggests that these viruses could have originated from fungal sources.

Bromoviridae are amongst the most common plant viruses known (over 1000 host species, 85 plant families are represented). These viruses have the broadest host range as a group, although individual members can have a narrow host range. Curiously, all appear to be beetle transmitted, in apparent emphasis of the importance of virus-vector evolution. CMV and PSV represent the best-studied members of this family. One genus of virus with distinct biology is the (OLV-2), olive latent virus. This virus has no known vector and induces no known disease. Because it is also a member of the supergroup, which is also common to animal viruses, it may be representative of a progenitor virus, possibly of fungal origin.

Vectors and plant virus. We will be discussing insect vectors below, but some comments on insect vectors should also be noted now. Up to 30% of all plant viruses are of the Potyviridae family. These are amongst the most damaging of crop viruses and are generally transmitted by aphids. Each potyvirus genus appears to have adapted to a specific vector organism, so virus-vector biology is tightly linked and can also involve aphid symbionts, as discussed in detail below. Yet other insects seem to seldom transmit plant viruses. Orthoptera and Dermaptera, for example, have few vectors for plant virus, whereas Coleoptera (beetles; 55,000 species) have lots of vectors for tymo-, como- and bromoviruses. Some pollinating insects can also transmit virus, such as BLMV via honey bees, but given the numbers of pollinating insects, this is not very common. For the most part, these striking patterns of virus-vector biology are not well understood.

Other viruses. There are many individual families of plant viruses that could be considered in greater detail. From the perspective of virus or host evolution, however, these other viral families are not very informative on the whole and will not be presented. Several additional points, however, are worth noting. The Geminiviruses of plants are ssDNA viruses that replicate via rolling circle mechanisms. They have some similarities to , multipartite ssDNA viruses. Viruses with related replication mechanisms are found in most living orders. With the Geminiviruses, however, plant viral DNA replication is also supported by bacterial replication machinery. ToLCV replicates its DNA in Agrobacterium tumefaciens, suggesting origin from ssDNA phage and that are injected into plant DNA. Thus, this replicon functions in both a prokaryotic and eukaryotic system. Furthermore, sequences related to geminiviral DNA can be found in some plants, such as Nicotiana, suggesting some involvement at the origin of those plant genomes. This result has been used to argue that Geminiviruses evolved from prokaryotic replicons, but have invaded plant genomes. Nepoviruses also add an interesting note. These viruses are isometric RNA viruses that are all nematode transmitted. They show a wide host range throughout North America and Europe, but not worldwide. They are frequently associated with satellites and other RNAs. It is curious that these viruses can be highly specific to their nematode host, but as discussed earlier (Ch. 6), nematodes are not known to support the replication of any virus. Plant rhabdoviruses are known to often be able to replicate in insects and are highly specific to their insect vector. However, plant rhabdoviruses can also be adapted to plant-only replication within plants. This has been used to suggest that plant rhabdoviruses likely evolved from insect viruses that also acquired an additional plant movement protein. Thus there are several instances in which it appears that an animal virus may have adapted to plants.

Mixed autonomous and satellite virus. It was noted above that, overall, plants appear to have a strong tendency to support mixed infections with RNA viruses. Numerous specific examples exist in the plant virus literature that establish this situation. Less clear, however, are the evolutionary consequences (to both the viruses and host) of such mixtures. In some cases, interactions between two different viruses are able to affect viral host range. For example, TMV does not normally infect , but if the plant is also infected with BSMV, which supplies the needed movement proteins, TMV will be complemented to replicate. Another example of mixed virus disease is Groundnut rosette disease. This is caused by a complex of two autonomous viruses plus a satellite virus. GRAV is aphid transmitted, and provides the coat protein. GRV and satellite GRV (sGRV) are needed for disease development. It was mentioned that the Luteoviruses can sometimes help other viruses. Some Umbraviruses can persist in their host plants with no helper virus. During such persistence they can make a lot of virus-specific RNA, but produce little virus or disease. These viruses have no coat proteins, so it seems they have fully adapted to use the coat proteins of other Luteoviruses. These viruses are related to . These infections can be highly host specific. In addition, infected plants can also support dsRNA satellite GRV. The presence of GRV can be so prevalent that it can be found anywhere worldwide in which the host crop is growing. Thus it is very clear that an essentially defective and persisting life strategy of GRV can be highly successful in nature. It is not clear, however, what effect this would have on the fitness and reproductive success of the helper virus or the host, which is discussed below.

Tight linkages between helper virus and satellite can also be observed. For example, TNV can be classified on the basis of the specific satellite virus it activates. This link of helper (host) to satellite (parasite) is reminiscent of the typing of bacteria based on phage sensitivity and appears to support the concept that virus-virus relationships can be crucial for success in the field. STNV is a well studied satellite of tobacco necrosis virus. This was the first plant satellite, discovered in 1962. STNV has an uncapped mRNA that works in both prokaryotic and eukaryotic translation systems. Also, the 5’ RNA lacks a terminal viral protein and the 3’ RNA resembles a tRNA in structure, lacking a poly A tail. These features make STNV very useful for efficient in vitro translation from wheat germ extracts. STNV codes for one gene, a coat protein, but codes for no other structural proteins. The helper-satellite virus activation is strain specific. There is also high specificity for the fungal vector involved and some specificity to host plant (this being a common situation for satellites). This STNV gene shows some relationship to phage proteins, suggesting a phage-based origin. There also appear to be relationships amongst various satellite viruses.

Parasitic RNAs. In addition to satellite viruses, there are also many small satellite RNAs found in plants. These small RNAs tend to have a lot of secondary structure, due to RNA base pairing, and they do not code for gene products. They can also be circular. In being non-coding but parasitic replicators of helper virus, they clearly resemble fully defective viruses, and in some cases (e.g., CMV) they can attenuate symptoms induced by the helper virus and can also interfere with other viruses. Other satellite RNAs, however, are associated with enhanced virulence during helper virus infection, so their relationship with the host is complex. In addition to satellite RNAs, plants can also harbor endogenous dsRNAs. These tend to be found in various cultivated crop species, such as rice, bean and barley. These dsRNAs appear to resemble hypoviruses in that they are cryptic and code only for replicase and coat protein. They do not code for movement proteins. Also like hypoviruses, they show no clear association with viral particles, but are often prevalent and persistent. However, unlike hypoviruses and fungal hyphae, plants do not tend to fuse in the field to transmit them. This means that their mechanism of movement between plants is unclear. It is also unclear what effect they have on either the host plant or the ability of the plant to support other viral infections, although it seems reasonable to predict some interaction with other infectious agents. It has been suggested that these dsRNAs may have originated from defective ssRNA viruses.

Seeds, Angiosperms and Virus. In filamentous algae, zoospore-associated transmission of virus is commonly observed. It might thus be anticipated that a similar type of virus transmission might be associated with seeds of angiosperms. Such seed-associated transmission could allow persisting viral infections to attain a high probability of infecting subsequent host generations. The invention of the seed allowed land plants to colonize drier lands away from shore, and thus was a major determinant of plant reproductive success. Seed plants appear to be monophyletic and thus seem to have evolved from one common predecessor. The major groups of seed plants are found in three extant clades: cycads/ginkgo, Gnetales and conifers. Ferns appear as a sister group to these clades. Seeds are notable for the large quantity of storage proteins they can contain. Tonoplast proteins, for example, are abundant and highly expressed in seed in many monocots, dicots and gymnosperms. In fact plant seeds and embryos (i.e. wheat germ) are excellent sources of translational systems, but also contain ribosome inhibitors that self-regulate translation, as described below. Seeds can also express self-incompatibility functions. With respect to infection by virus, seeds would appear to offer some very specific advantages. Efficient seed-based transmission would reduce the need for vector-dependent virus transmission. Viral persistence in infected plants would seem necessary to allow efficient host reproduction, and thus the fitness of persistently infected hosts would need to be maintained. One way persistent infections could affect the fitness of infected host plants would be to interact with or prevent other acute viruses from host colonization. This idea, however, has not been evaluated. Tymoviruses, which are mostly seed transmitted, appear to have originated from a common source about 200-240 MYBP. Thus this virus-seed association may date to the origin of angiosperms. Some Potyviruses and Partitiviridae can also be seed transmitted. As noted above, many fungi also appear to be involved in zoospore-based virus transmission. One example of a Bromoviridae is OLV-2 (which has 3 ssRNA segments).

However, the expectation that seed-transmitted viruses might frequently be persistent would also suggest that such relationships might be difficult to observe in the field. There is evidence that this may indeed be so. A case in point relates to the plant pararetrovirus TVCV (tobacco vein clearing virus) of Nicotiana species. This virus is 100% transmitted vertically via the seed. It is possibly the only example of a seed-transmitted pararetrovirus so far studied. Prior to this report, Caulimoviruses were unknown in Nicotiana. It is interesting that TVCV came from a hybrid plant, Nicotiana edwardsonii, which is used as an indicator plant for various viruses. The parents of this hybrid were a male N. glutinosa and female N. clevelandii. TVCV was not observed in either parent. No TVCV transmission can be demonstrated by mechanical methods or by grafting infected plants to 7 other Nicotiana species. Nor can TVCV transmission be accomplished by aphid vectors. TVCV DNA hybridizes to genomic DNA of N. edwardsonii and male parent N. glutinosa. TVCV DNA does not hybridize to female parental DNA. Interestingly, TVCV DNA shows a 78% sequence identity to some pararetrovirus-like sequence present in high copy in N. tabacum (and N. rustica) genomes. TVCV is unrelated to other Caulimoviruses with related biology, such as banana streak virus (BSV), which can also be integrated. Caulimoviruses, such as CaMV, exist in 4 known groups and show very narrow host range. There is no sequence relationship of GAG to any animal retroviruses, although its reverse transcriptase (RT) has a clear relationship to animal retroviruses. CaMV are mainly aphid transmitted and not typically seed or pollen transmitted. However unusual it may seem, TVCV makes an important point. The hybrid-based reactivation of TVCV appears to identify a situation in which endogenous viruses are reactivated from the male genome following mating to an uninfected female. This link of endogenous virus reactivation to sexual reproduction very much resembles the situation with hybrid dysgenesis in Drosophila, described in detail below. Such sex-based virus reactivation has major implications for the sexual isolation of related species and separation of interbreeding populations. These implications are also considered in the Drosophila section below.

Antiviral and self responses: Plants have clearly developed an array of antiviral response systems encoded by their genomes. First, however, we will examine extra-nuclear plant systems that can affect virus infection. One such process has been named ‘pathogen derived resistance - PDR’. This type of protection consists of a virus-mediated cross protection that can occur during mixed virus infection. An early example of this was first reported (McKinney, 1929) for tobacco plants infected with Green mosaic virus (TMV). These infected plants developed no further symptoms when infected with yellow mosaic virus. This type of interference is very common in related strains of virus. Tungro virus is found in South East Asia from asymptomatic infections of its natural host and provided cross protection in that host. Another process of virus protection was discovered when transgenic plants were first generated. Tobacco plants expressing TMV coat protein were shown to resist infection with TMV. Such experiments led to the identification of various systems by which plants can post-transcriptionally repress expression of foreign genes. One such response system was described in detail in Chapter 6, dealing with Caenorhabditis elegans. Like the nematodes, plants also have an RNAi system. This consists of an RNA-directed RNA polymerase along with other genes (nucleases) that are induced to degrade and silence foreign genes. In Arabidopsis, the relevant genes are the RdRp SDE1/SGS2 genes. Homologues to these genes are not found in Drosophila or humans, but they are clearly related to genes found in C. elegans. However, unlike C. elegans and other simpler animals, the RNAi response of plants is not systemic or transmissive. It is restricted to a more local response. It should be remembered that no known RNA viruses have been observed for those early animal systems, yet RNA viruses are highly prevalent in angiosperms. It would therefore seem that the extant plant RNAi response is not sufficient to prevent RNA virus parasitization of plants.

Plants have additional systems that can recognize self and differentiate self from non-self. For example, 3 distantly related families of flowering plants, Solanaceae, Scrophulariaceae and Rosaceae, all have self-incompatibility RNAses that limit self fertilization. These systems also depend on RNAses that form monophyletic clades. Although many other plants limit self fertilization, the mechanisms employed vary. These types of RNAses, however, are reminiscent of the SK18 system of yeast for controlling dsRNA or dicer of C. elegans. The possible relationship of these systems to virus colonization has not been evaluated.

Another plant regulatory system can be found in seeds and embryos. As mentioned above, plant embryos are an excellent source of translational systems (i.e. wheat germ). However, most of these seeds have potent protein inhibitors of translational systems called ribosome inactivating proteins (RIPs). RIPs are often localized within the endosperm, which must be removed to give stable and ongoing in vitro translation. Ribosome inactivating proteins can constitute major proteins in seeds and can depurinate a single adenine from the stem-loop of 28S rRNA. Ricin is perhaps the most potent example of such a RIP. Both single and double chain versions of RIPs are known, but often RIPs are stored in non-toxic forms as polymers or bound to RIP inhibitors (resembling a toxin/antitoxin addiction module). Tritin is the major RIP found in wheat seeds and RA39 is found in rice (also pokeweed, melon seeds, kernels of camphor tree, maize, barley seed and iris bulbs all have RIPs). When first discovered, it was thought that the main function of these proteins was to inhibit pathogenic fungi, since several RIPs were observed to be antifungal. However, it is now known that some fungi have their own RIPs. Fungal RIPs are found in fruiting bodies. Other plant examples also include RIPs expressed in roots: Agrobacterium rhizogenes transformed hairy roots make RIPs, as do Mirabilis jalapa roots. In addition, it has been established that not all plant RIPs are antifungal, as they fail to affect important fungal pathogens. However, another function of RIPs appears to be as antiviral factors, and several RIPs were discovered with potent antiviral activities. RIPs with established antiviral activities include Pokeweed Antiviral Factor (PAP and PAP II), TMV, and PVX which were both highly active against PVX, PVY and PLRV. These RIPs appear to operate in response to virus-damage-induced release into the cytoplasm to prevent ribosomal EF2 binding and inhibit virus translation. Yet RIPs are most frequently associated with seeds and embryos, which suggests that these may not be RNA virus-friendly tissues. However, the consequences of abundant RIP production to seed-mediated virus transmission have not been evaluated.

Nuclear and plastid genome colonization. With the sequencing of entire plant genomes we can start to get a clear picture of the effects viruses and virus-like transposons have had on the evolution of plant genomic DNA. Overall, retroelements are found in high numbers in most, but not all, plant chromosomal DNAs. The two major groups of retroelements found in plants are copia-like and gypsy-like elements, which are both better studied and more widespread than are other elements. Gypsy and copia elements differ from each other by the order of RT and integrase domains. Related fungal-derived retroposon sequences appear to be basal to those of plants and Drosophila and thus support the view of fungal involvement in early plant and animal evolution (or possibly colonization of fungi, animals and plants by the same elements). Curiously, these fungal elements have not been found in algae. Also curious is that phylogenetic analysis indicates that the Drosophila gypsy element is basal to those found in plants, suggesting the possibility that an animal species might have been the original source of these elements now in land plants. Overall, in most plants, these copia-like elements are monophyletic. This strongly suggests that retroposon colonization is associated with the origins of the various plant lineages and that subsequent selection has not eliminated these agents or contributed to additional large scale colonizations.

In plants, the large majority of retroposon copies are inactive, defective and unable to transpose or make virus. Very few ORFs with intact gag or pol exist and no copies of env have been sequenced. Yet a small number of retroposons do appear to have retained gag and RT activity. There is one exceptional retroposon, however, that has maintained the gag ORF, that is the BARE-1 copia-like element found in large numbers in the barley genome. This retroposon appears to retain transposition activity. The degree of genome colonization by retroposons varies significantly. Some plants, such as Lilium henryi, have far fewer gypsy-like elements. It appears that increasing genome colonization by retroposons occurred during the evolution of higher plants. In cereal genomes, the family Gramineae (grasses), there is an especially large fraction of retroelements. For example, retroposons are present in high numbers, such as ~50,000 BARE-1 copia-like copies in barley, or in wheat (Bis-1 is 5% of the genome) and maize (50% of the genome is retroposon). The maize situation is of special interest as it has been reported that the maize genome only recently (e.g. 3 × 10^6 ybp) underwent such a large scale retroposon amplification. A similar situation applies to the BARE-1 element of barley, but here evidence suggests that habitat-specific amplification of BARE elements in wild barley species has occurred. Gymnosperms (conifers) are similar to wheat with regard to retroposon content. Gymnosperms tend to have very large genomes (but not much polyploidy), and retroelements tend to be about 50% of these genomes. One explanation for this high level of retroposon-based genome plasticity is that amplifying retroposons may provide the needed genomic rearrangements required during periods of rapid adaptation. A puzzle that remains, however, is understanding how each plant lineage became initially highly colonized by particular families of retroposons. It seems likely that some exogenous source of ERV was involved at the origin of the species. For example, it is known that each major clade of rice retroposon element is more related to elements in other species than to other elements in the rice genome. This suggests that the original source of these amplifiable retroposons was other organisms, not other retroposons already in the rice genome. However, grasses (rice) in particular appear to have high numbers of retroposons. Two complete plant genomes are now known. The first sequenced was Arabidopsis thaliana, which has a 460 kd genome. AtC2 is a retroposon found in Arabidopsis that is present as a full copy copia-like element. 300 AtC2-related elements can be observed in the genome, constituting about 1% of the total genomic DNA. Of these 300 copies, 23 are full-length copies with putative ORFs, but lacking env. Interestingly, all of these full copies appear to represent distinct retroviral families, similar to the ERV pattern observations with C. elegans. In Arabidopsis, the gypsy-like elements are monophyletic and appear to have been early colonizers of the genome in this lineage. The subsequent gypsy element sequence diversification appears to occur mainly during vertical, not horizontal, transmission. A similar conclusion with respect to gypsy element colonization applies to the rice genome.

All three of these major plant retroposon families are abundant in animal lineages. LINE elements are related to these families and can be found in both plant and animal genomes, but generally at very different levels. LINE elements can be defined as retroposon-derived sequences that lack LTR or 3’ polyA sequences. In placental vertebrate animal genomes, LINE elements are present in exceedingly high numbers (10^6/genome). Generally, LINEs will contain degenerate GAG and RT ORFs. But some of these retroposon families are not well represented in plants. For example, plants may have far fewer or no gypsy/Ty3 elements or Pao-like elements. In Arabidopsis, Tal-1 and Tal 17 LINE elements have been identified. However, Thal is present at a very modest level, between 1-3 copies per genome. A related sequence found in potato is the Tst1 element, also found at between 1-3 copies per genome. In plants, the Ty1/copia-like LINE-related element is also known but has been given the unfortunate (and obscuring) name of Pseudoviridae. The 276 Pseudoviridae elements are simply the LTR-deficient Ty1/copia-like (LINE-like) elements present in Arabidopsis DNA.

The rice genome is similar in many respects to that of Arabidopsis thaliana. Rice has a 466 Mbp genome in which repetitive DNA is at about 42%. MITEs, miniature inverted repeats, are present at about 33,000-50,000 per genome. Most of these repeats can be found in fungal genomes, but 1/3 are not found in any fungal species DNA. Rice has about 98,000 transposable elements, 18% of these being retroposons. Thus the ratio of DNA to retrotransposons is very different from that of vertebrates, and much more similar to C. elegans. Of the rice retroposons, 80% are similar to those of Arabidopsis. Rice has 20 copia-like families (compared to maize, which has 10 copia-like families). The rice genome project has the added advantage of being able to compare to at least a partial sequence for another species of Japanese rice. Little difference was seen in coding sequences between two rice species.

infectious and transmissible viruses, especially in reproductive and embryonic tissue (discussed below). Also, phylogenetic analysis supports the idea that the fungal and insect versions of these genomic retroviruses are more basal than those found in plants. This leads to the startling possibility that insects or early animal ancestors of insects may have hosted and transmitted the retroviruses that colonized plant genomes in significant numbers, forming large numbers of ERV defectives, at the very origins of land plant lineages. This counterintuitive idea is that, in a sense, insects begat the higher plants, leading to the creation of flowers that now feed them. Another observation that may be consistent with such a counterintuitive idea is that the DNA replication genes of rice are phylogenetically incongruent. These rice genes are more similar to the human genes than to those of Drosophila. Some important animal-like replication genes appear to have entered the rice lineage from unknown sources. As strange as this idea may seem to some, it is not without some precedent. Others have previously suggested that the evolution of plant flower structures themselves may have stemmed from insect parasitization in
However, a major difference which insect embryos were deposited in plant with Japonica species was found in their tissue, induced growths () in plant tissue, transposable elements – of which 63% of leading to the generation of the protective flower which were retroposons. The different structures of angiosperms. Such a scenario, retroposons in Japonica were mainly of the however, would also require the permanent Ty1/copia type, plus some Ty3/gypsy-like. colonization of the plant genome by insect Wild rice has similarly acquired the RIRE1 derived genes from the insect embryo. If this copia-like element and the idea is correct, it would suggest a truly amazing RIRE1 pol sequence conserves sequences evolutionary cycle in which insects were similar to those found in Drosophila. This involved in the origin of flowering plants, which observation again raises several major then contributed to the evolution of insects! questions on what drives plant speciation. Although this idea may account for many of the Two big questions can be posed. One, why currently observed genetic characteristics of is speciation associated with genome plant and insect genomes, we must await a more colonization by specific retroviral derived comprehensive sequence analysis of both agents? Two, how are these new elements putative insect and plant progenitors to support entering the rice genome, especially since or dispel such a hypothesis. plants are not known to support autonomous retroviruses, nor are insect vectors known for plant retroviruses? Plastid DNA colonization. Unlike the situation observed in lower plants, the plastid DNA Insects, however, (such as drosophila) can genomes of higher plants show much evidence express endogenous retroviruses as of evolution via repeated colonization and 172 invasions with genetic parasites. The most to do this way. 
If so, DNA plasmid - ‘virus’ remarkable of these is the observation that in transmission in a plant requires a complete angiosperm mtDNA, Homing group I Agrobacterial cell to accomplish. introns within the cox gene have been acquired over 1000 times in evolution of various angiosperm mtDNA. The open Overall, we can see clear patterns between plants question then is how these homing introns viruses and insects, when examined from the might have been transmitted to so may perspective of plant evolution. Next we consider angiosperm mtDNAs. The insertion process the perspective of insect evolution. very much resembles an IS-like insertion Viruses and the origin of insects. event, such as that used by ssDNA plectovirus of bacteria. It would thus seem We have previously mentioned some overall reasonable to hypothesize that some relationships that link plant and insect evolution. infectious agent must have existed that The most prominent link is the congruent allowed the transmission of these introns to explosive radiation that occurred in both all of their host, but this putative virus is not angiosperms and insect species. We now seek to currently known. Lacking a system of include the patterns of virus infection and packaging and transmission, how introns transmission in insects as we consider insect move remains a mystery, but mechanical evolution. In some cases, very tight links insect mediated transmission has been between plant viruses and the insects that proposed. One other example of plasmid transmit them are known. Some of these viruses DNA transmission in plants should be are transmitted by one insect vector, and not mentioned. That involves the T-DNA that is transmitted by others. Plant virology chapters transmitted from Agrobacterium symbionts will frequently point out such relationships. to plant root cells, inducing the cells to form However, the viruses that infect insects N2 fixing nodules. 
Bacterially derived themselves are not generally considered in those plasmid DNA, is able to enter plant cells, chapters, which we will now include. When this pilot to the plant nucleus and integrate into is done, it becomes apparent that the insect host cellular DNA. This bacterial plasmid orders and families show a very uneven pattern clearly shows a ‘virus-like’ behavior. Yet it of virus susceptibility and vector function. For does not replicate in the host plant cell, but example, tetraviruses appear to exclusively infect simply integrates, expresses some genes and lepidopteran host. Such issues have not persists, very much like a lysogenic state of previously received much attention. phage. No movement or transfer genes are expressed in the plant cell. The plasmid Arthropods-Arachnids. This is little doubt that does express plant specific growth factors the fist insects evolved from arthropods in the that allow the nodules to support they oceans prior to angiosperm evolution. We can symbiotic bacteria. It is however, use fossil evidence to trace the evolution of remarkable to consider that this T-DNA arthropods to Arachnids and hexapods, the shows an ability that is absent from any ancestors of insects. Arthropods consist of plant DNA virus. It can penetrate host cells, hexapods (insects), myriapods (centipesed, move to the nucleus via the well conserved millpedes), crustaceans (shrimps, crabs) and virD2basic protein, enter open chromatin chelicerates (sea spiders, horseshoe crab, and express proteins, leading us to wonder arachnids), all characterized by exoskeletons why no plant virus can do the same thing. It made of chitin, segmented body plan and paired seems possible that the all the functions appendages. These represent the most successful needed to penetrate plant cells might be animal species numbering about one million beyond what could be expressed on the species in total and representing the majority of surface of a virion. 
It is simply to complex 173 animal biomass. The smallest of these, are which was coincident with angiosperm radiation, mites and parasitic wasp, which are as observed with megafossils during the measured at less then 1 mm. Fossil Cambrain explosion. Also at the Cambrain evidence clearly suggests that early explosion (about 500 mybp), was the continental arachnids occupied seaside habitats. These 8 breakup of Gondwanaland, resulting in the legs arachnids included trilobite, to the separation and isolation of S. America. horseshoe crab then sea scorpions. To this Hexapoda or insecta fossils became abundant at day, however, we know little about viruses this same time, as did the first plant fossils that infect these species. With respect to (about 428 mybp). horseshoe crab, viruses do not seem to be prevalent, but the topic is poorly studied. However, it is clear that the arachnid order The basal hexapods that first appeared on land was not nearly as successful on land as the were related to blaterria species as well as other insect orders became. Furthermore, with wingless organisms resembling Protura and respect to plant virus interaction, only one Collembola species. Paired wings, common to order of 12 terrestrial arachnidia is involved so many insects, arose later, around the time in virus transmission and infection. These vascular plants developed. The earliest insect are mites/. Arachnids have been mostly fossils correspond to those for Blaterria (roaches) studied as source of fungal pathogen species. Blatteria are estimated to have evolved transmission to plants, such as the about 380 mybp, Orthoptera (crickets), and Eriophyoids spider mites. Virus disease of Hemiptera (ture bugs, leafhoppers, aphids) mites themselves are known, but much less originated much later, about 95 mybp. Blaterria common then corresponding diseases of now appears to be the common ancestor to all insects. 
Two known virus infections of insects as inferred by limited sequence analysis arachnids includes those of the Citrus red of mtDNA cox sequence. However, it now also mite and European red mite that can be appears that the oceanic ancestor to these early infected with non-occluded, but poorly insects were the crustacea, Anostraca; or fairy characterized DNA viruses. Although not shrimp and brine shrimp. These shrimp species sufficiently well studied, these viruses are are both bisexual and parthenogenic. Recent possibly related to iridoviruses, such as analysis of mtDNA sequence confirms that reported in mites associated with Anostraca mtDNA is almost identical to that moribund honey bee colonies. However, found in drosophila species. Unfortunately, the mites are often parasitic to insects and viruses of these simple crustaceans are not well directly associated with transmitting insect studied. As presented in Ch. 6, other viruses, especially bees. Generally, these crustaceans, such as panaeus species are known viruses do not infect their mite vectors but to support diverse families of both DNA and can show some vector specificity. RNA viruses, so we might expect prevalent viral infections of Anostraca. However, within Hexapoda (insects). Because of their Anostraca species, it has only been reported that enormous numbers, hexapods will be the WSSV DNA can be found associated with cysts, main focus of this section. Insects are by far and no virus replication could be confirmed in the most diverse of any animal. 30 orders adults. Thus the virus might be absorbed to the are known, of which nine are known to feed shrimp cyst and not be replicating. WSSV on green plants (mostly chewing insects). shares very limited sequence homology with any Insects are especially diverse in their mouth other DNA viruses, including insect DNA part structures. Insects exist in hundreds of viruses so it does not appear to have been a thousands of species. 
Although the insect direct ancestor to modern insect DNA viruses. order is older then many modern lineages, Brachiopodo species now appear to be the insects underwent explosive radiation, closest relative to the insect order, inferred from 174 cox-1 and EF-1a sequence analysis. Crustacea , such as fairy shrimp and In addition to plant predation, many insects have blaterria would both be sister groups to predatory and parasitoid relationships with other Branchiopoda. The transformation of insects and will feed on or parasitize insects. As arthropod to terrestrial habitat associated this relationship presents clear opportunities for with early vascular plants (~ 425 MYBP). virus transmission, this topic is considered in During this early adaptation, there is also greater detail below. evidence of insect predation on these early plants. The most related insects to cockroaches are the orders of mantids, As was mentioned above, insect-plant termites. Termites appear to have evolved is especially apparent due to the role from wood eating blaterria following of insects as pollinators of angiosperms. Within symbiotic acquisition of gut bacteria for the insect orders, Diptera (flies), Lepidoptera cellulose digestion. Viruses of cockroach (moths), Hymenoptera (Bees, wasp, ants) are orders are not well studied as there are very three orders are major plant pollinators. These few reports in the literature on this topic. orders, however, do not constitute the most However, some positive strand icosahedral efficient vectors for plant virus transmission viruses have been observed in cockroaches. compared to true bugs and beetles. However, Yet it seems clear that acute infections of insect success as pollinators clearly has a strong cockroaches by viruses is not a widespread behavioral component, so insect behavior is very phenomena. 
Persistent infection has not important for angiosperm reproductive success been observed so it remains an open and flower recognition by insects can be specific question if this is common to this insect to the plant. The molecular basis insect behavior order. is poorly understood. However, as noted below, parasitoid wasps (hymenoptera) are known to From the perspective of plant evolution, we greatly modify and control behavior of larval are interested in insects that are both lepidopteran host in very specific ways. As we predators of plants, and insects that are will see, there is a viral component to this vectors for plant virus. The largest groups behavior alteration. of plant eating insects are the coleoptera and opthoptera. The next order that most Some insect viruses are highly insect and vector commonly feeds on plant eaters are the restricted, implying intimate molecular thysanoptera (thips). In terms of virus relationships between insects and viruses. We transmission, but also predation, Aphids will consider a few situations in which the nature (members of true Bugs) are clearly most of such linkages have been studied. successful plant feeding insects and it is estimated they are responsible for the transmission of up to 50% of all insect Major classes of insect viruses; DNA viruses. transmitted plant viruses. Within these aphid The major groups of viruses of insects are clearly transmitted viruses, over 600 viral species distinct both from those found in plants and are known. For the most part, plant viruses animals. The most apparent distinction is that do not replicate in aphids. In a few aphid insects support the replication of many large species, however, plant virus can persist DNA viruses, none of which are found in plants. within animal aphid tissue and be The great majority of these viruses have transovarial transmitted (these are called relatively large genomes of dsDNA and encode persistent propagative vectors). 
For these their own replication proteins. Some of these reasons, the aphid-plant virus relationship is DNA viruses are found in other animal species, examined in greater detail below. but some (Ascoviruses, Polydnaviruses) are

unique to insects. Another distinction with plants is the relatively small number of disease causing RNA viruses of insects. Yet insects are a main vector for such viruses. Evolutionary relationships between insect viruses and viruses of other orders are clear in most cases. Within the large DNA viruses of insects, the Iridoviruses show a clear relationship to viruses of aquatic vertebrates, such as LCDV of fish. No related virus of land animals is known for this family, however. The Ascoviruses are unique to insects, with little sequence similarity to other DNA viruses, thus the immediate origins of Ascoviruses are obscure. However, Ascoviruses do show a more distant sequence relationship to other large DNA viruses (phycodnaviruses of algae, T-even phage), which can be seen with the Ascoviral DNA replication proteins. In addition, Ascoviral biology is similar in several important respects to that of the Polydnaviruses and this will be considered below. There is one animal DNA virus that appears to occupy a relatively unique position between insect DNA viruses and DNA viruses of land vertebrates, that is African Swine Fever Virus (ASFV). This virus has the distinction of being the only DNA virus known to replicate in both insects and animals. It is also phylogenetically distinct in that it is often an outgroup for both vertebrate and insect DNA viruses and frequently shows sequence relationships to several viral groups that otherwise fail to show strong links to any other virus groups. The Baculoviruses are a very important and large group of insect viruses that show a clear relationship to viruses of aquatic animals, so it seems likely they can trace their lineage to those viruses. As baculoviruses are an important group of insect viruses, they are examined in greater detail below. However, as discussed below, the nonoccluded ‘baculoviruses’ (such as Hz-IV) are sufficiently different from both baculoviruses and other DNA viruses (both biologically and by sequence) to be considered another DNA viral group, with little homology to any aquatic or vertebrate animal virus. The Entomopox family of insect virus is clearly related to the poxviruses of vertebrate animals. Entomopox replication appears especially associated with metamorphosis of its host insect and is also found in various flying insects. The Polydnaviruses (PDV) are multi-segmented circular DNA viruses of parasitoid hymenoptera species. PDV presents an especially interesting case from the perspective of virus host co-evolution. These are endogenous DNA viruses that replicate via circular dsDNA and are found in the genomes of many parasitoid hymenoptera species. Polydnaviruses are examined in some detail below. Finally, insects also support the replication of Densoviruses, which are ssDNA viruses that replicate via rolling circle mechanisms and show a clear relationship to parvoviruses of animals and geminiviruses of plants. One curiosity, however, is that although insects support the replication of many types of DNA viruses, there are no Herpesvirus family members within the insect orders. This is rather curious, especially since herpesvirus replication proteins are clearly similar to the DNA replication proteins found in both algae and bacteria (T4 phage). However, as we discussed in Chapter 6, aquatic herpesviruses often show a clear preference for nervous tissue, and insects might present a very different cellular habitat for herpesvirus.

Insect RNA viruses. As mentioned above, plus stranded RNA viruses were highly common in angiosperm plants. Relative to this plant-virus abundance, insect orders show an overall paucity of diseases caused by RNA viruses, especially in some specific insect orders. Yet, picorna-like viruses (nonsegmented +RNA, polyprotein, icosahedral) are known to infect several orders of both pollinator and plant predator insects, including hymenopteran (bees, wasps), dipteran (flies, mosquitoes), hemipteran (aphids, leafhoppers), lepidopteran (moths, butterflies) and coleopteran (beetles) hosts. The best studied of the picorna-like insect viruses is Cricket paralysis virus (CrPV) and its relatives, the Drosophila C virus and Gonometa virus. Cricket paralysis virus has a 3’ polyA tail and a 5’ terminal protein. CrPV was discovered during mass breeding of the Australian field cricket, and showed up as a hind limb paralysis. The virus was also a major problem for silkworm growers in Japan. In crickets, CrPV can be found on the surface of eggs and this feature seems important for natural transmission. Curiously, CrPV is hard to find in natural field populations of both Drosophila and crickets. This suggests that CrPV has recently adapted an acute infection to these hosts. However, CrPV can be found in natural populations of D. immigrans, but in this host the virus multiplies with no disease. These results strongly suggest that CrPV naturally has a persistent life strategy with specificity to the D. immigrans host, but is able to jump species to cause acute disease in other hosts. Tetraviruses are exclusively isolated from lepidopteran hosts. Curiously, these viruses don’t grow in insect cell cultures. Some of these tetraviruses show a high host specificity in which closely related viruses (i.e., with 90% replicase sequence identity) will have a distinct host range. Vertical transmission of tetraviruses is known in some hosts, and in these cases the virus cannot be eliminated from lab bred hosts. Clearly tetraviruses have a persistent life strategy in these wild host populations. However, even in this relationship, host population structure can affect virus host dynamics and virulence. For example, in the case of NbV, when high host densities are attained, horizontal transmission becomes efficient, resulting in enhanced transmission and high host mortality. It seems likely that similar (but uncharacterized) viruses and virus-host relationships will apply to orthoptera (locust), isoptera (termites) or acarine (mites) hosts, as unclassified viral agents have been reported to infect some of these hosts.

Bees and RNA virus. There appears to be a curious general link between RNA viruses and bees. In the 1960’s in Britain there was an initially observed outbreak of Sacbrood paralysis that was specific to the European honey bee. The causative virus was an ovoid or ellipsoid ss+RNA virus, chronic paralysis virus (CPV). However, in spite of this initial isolation as a disease agent of bees, most bee viruses are now known to establish persistent infections in their host with slight or inapparent disease. This has resulted in reduced attention to these agents, as their effects on bee populations are often minor. The ability of these viruses to induce disease in particular bee colonies is often dependent on the parasites of the colony, such as mites, which will transmit CPV to adults. CPV is common throughout the world and can be found endemic in healthy bee colonies. In this endemic state, it appears that the queen bee is needed to provide protection to the hive. CPV can also be frequently found with an associated virus (CPVA), which is also an isometric RNA virus. Clearly virus-virus interactions are common in this persistent state. CPVA multiplies at low levels and uses CPV as helper, but interferes with CPV replication. When permissive, CPV reproduction is most evident in the female reproductive caste (i.e. queens) but can also be highly produced in other species, such as in an ant head. Other RNA sequences with a clear sequence relationship to CPV have also been reported. Some viruses can be highly productive, such as Sacbrood virus, which can make huge quantities of virus in bee larvae. APV is another common bee virus that persists in adult bees and shows no disease in nature. However, when persistently infected bees also become infested with mites, APV replication is induced by the mites and will kill the bees. The apparent mite induction of APV multiplication occurs along with mites feeding on bees. Another mite related bee infection is seen with Kashmir bee virus (KBV). KBV is naturally found persisting in A. cerana bees. Along with infestation by the V. jacobsoni mite, KBV becomes activated, resulting in overt infection, especially in European bees. Other viruses, such as Black queen cell virus (BQCV), can specifically kill queen larvae but depend on a microsporidian vector for this killing. In addition, several other bee viruses are asymptomatic and distributed worldwide, such as DWV, SPV, and Bee Virus X & Y. Interestingly, some of these infections are associated with shorter life spans for the infected host. A few additional viral types of bees have been reported, including a rod-shaped DNA virus and one iridovirus, Apis iridescent virus, which causes clustering disease in bees and is isolated in Old World hives. Overall, persistent infection of bees by RNA viruses is common and can be accompanied by complicated biological interactions with other agents and parasites, which are associated with bee disease. It is worth noting, however, that bees are not native to the New World, so New World angiosperm evolution is linked to other insect pollinators, such as moths. Thus the biology of bee-virus interaction in the New World is mainly a product of human activity associated mainly with commercial hives.

Nodaviruses are a distinct family of insect ss+RNA viruses that differ from the picorna-like viruses in that they are bipartite, with the dual genome encapsidated into one virion. Similar viruses are found in plants but not vertebrates, so a relationship to plant viruses seems likely. One of the nodavirus RNA segments codes for an RdRp, the other segment codes for the capsid protein. These RNAs are also distinct from picorna-like genomes in that they have capped 5’ ends, although they retain a 3’ polyA tail. Some of these viruses have surprisingly broad host specificity. Flock house virus (FHV) is one such example, but curiously the natural distribution of this virus is unknown. This virus will not only replicate in many insect hosts, it will even infect mice. FHV is known to kill mosquito larvae and will also infect adult honey bees.

It has been proposed that some animal viruses may have evolved following recombination with a plant virus. Gibbs and Weiller proposed that some plant viruses may have switched host to infect vertebrates. One possible example of this are the ssRNA in vertebrates which might have acquired a segment of a plant . The N-terminal region of the rep protein is similar in these two families. However, the C-terminal region is similar to the 2C RNA binding protein of picorna-like viruses, such as Caliciviruses, which only infect vertebrates.

dsRNA insect virus. Silkworm midgut virus is a dsRNA virus that is distinct from vertebrate dsRNA viruses. One member, insect Cytoplasmic polyhedrosis virus (CPV), shows a wide host range, with cross species transmission. It can be highly infectious and is able to be transmitted from one host larva to another. This appears to indicate that the virus has an acute life strategy. Curiously, the virus is not lytic to cells in culture. Finally, Drosophila enhanced sensitivity to CO2 can be due to infection with an alpha virus, which is a bisegmented dsRNA virus. The natural distribution of this virus is not known, however.

Insects classified as vectors for plant viruses. Most plant viruses need an insect vector for transmission, but in almost all cases the virus does not replicate in the insect. The most common vectors for plant viruses are homopterans with their piercing and sucking mouth parts. Arthropods, nematodes and fungi are all also numerically important vectors for plant viruses. Various kinds of virus-vector interactions appear to occur. Plant viruses must generally interact specifically with insect mouth parts. These mouth parts are also the most variable part of insect morphology. Three major classes of plant-virus transmission by insects are known. These biological classes for the insect vectors reflect different levels of intimacy between plant-virus and insect and are termed non-circulative, circulative persistent, and propagative. The first (circulative) involves the binding of virus to insect mouth parts, then movement of virus to the salivary gland and crossing of the insect cellular membranes to infect a subsequent plant at feeding. Viruses that move this way include luteoviruses, geminiviruses and fungally transmitted furoviruses. Viruses can also persist in a specific host, which is called a persistent vector. However, in general this persistence is really a form of virus storage in insect cells, in that viruses generally do not replicate in the insect vector. The term noncirculative applies to viruses that associate with the cuticular lining of mouth parts or foregut. Virus then de-adsorbs to infect the plant without moving through insect tissues, in an almost mechanical transmission. Aphid transmission of potyviruses can occur this way. The nature of the insect vector is an important and often specific determinant of the virus and can be used to identify a plant virus. Thus it is often the case that the best protection against plant viral disease is via that destroys insect vectors (e.g. aphids).

Homoptera can vector various plant viruses. There are two major groups of vector insects, tree/leaf hoppers and aphids/white flies, that diverged about 180 mybp. Aphids especially seem to have their own patterns of association with plant viruses, and some aphid species transmit only one virus. For example, closteroviruses show some co-association with aphid, mealybug and white fly vectors. White flies are also often associated with geminiviruses, which can produce salivary gland toxins that are general growth retardation toxins, but only female flies make the toxin. Some viruses of leafhoppers (such as HLRV, a fiji-like virus) may represent the earliest type of insect virus, as they are insect egg transmitted and show no multiplication in the plant host (maize). In this case, the roles may be reversed and plants appear to be the vector leading to persistent viral infection of the insect. There are a few other similar examples, such as rice and barley leaf transmission via aphid of RhPV.

Aphid biology. Aphids have unusual biology in several respects. For one, some aphid species have live young. Also, aphid species have a tendency to reproduce by parthenogenesis. This can result in predominantly clonal reproduction with occasional sexual reproduction in a species. The extensive diversity of aphids is striking. There are about 15,000 species of aphids in about 2000 genera. Of these, 21 aphid genera have been established to be vectors for plant virus transmission, especially within the Luteoviruses. It is interesting that aphids can also transmit some rhabdoviruses (SYVV, LNYV), suggesting that such rhabdoviruses may have originated in a predecessor insect to aphids. In fact, these viruses are most related to VSV, an animal virus. This has led some to suggest that plant and animal rhabdoviruses may both have originated in some early insects. Along these lines, the insect vector is normally highly specific to the plant virus, involving long latency and also transovarial transmission via insect eggs (such as SYVV, PYVV). Such viruses can sometimes be adapted to insect only transmission via insect passage and conversely can be adapted to plant only replication following serial plant passage. Entry into plant and insect cells is clearly distinct, suggesting that insect or plant cell entry characteristics can be under independent selection.

Aphids tend to be persistent-circulative insect vectors for plant virus transmission. Thus they have a more intimate relationship with the plant virus than many other vectors. Leafhoppers/plant hoppers tend towards plant virus persistence as well. In some aphids, specific plant virus transmission is restricted to one or a few aphid species. Because persistent-circulative transmission requires the plant virus to reach the salivary gland, the virus must avoid degradation within the insect cell. The specificity of this process is often the reason for vector based virus restriction. The basis of this specificity is discussed below. Sometimes, a second virus can help a specific aphid species transmit an otherwise aphid restricted virus to a host plant, such as PAMV needing PVY for aphid transmission, so virus-virus interactions are known to be important for some persistent-circulative insect vectors.

Endosymbiont-virus specificity. Another common and curious distinction concerning aphid biology is that most of them support bacteria-like endosymbionts, such as Buchnera, which is related to the E. coli genome. Outside of a few exceptions, all aphids, mealy bugs, and tsetse fly have endosymbionts, although these endosymbionts all have different origins. The best studied of these is the pea aphid, Acyrthosiphon pisum. This aphid can be parthenogenic and thus can be observed in clonal populations in field isolations. Analysis of mtDNA sequence shows little variation in these field isolates. These are important crop feeding aphids and it appears they have undergone substantial species diversification in the last 100,000 years. Buchnera DNA is present at about 120 copies per aphid cellular genome. The DNA is a single origin containing circular DNA and is maternally transmitted. The 200-250 mybp genome corresponds to about 1/7 the size of E. coli and thus is smaller than many large DNA viruses. However, unlike bacteria, the 16S and 23S rRNA transcription units are separated in this genome. The endosymbionts are maternally transmitted. In addition, they are often associated with two additional small DNA plasmids. Some aphids also have a secondary symbiont, which varies within the specific host lineage.

makes these also similar to virulence plasmids of bacteria. It thus seems likely that this rep-gene and inverted repeats can account for the high level of natural sequence variability observed in these plasmids.

Virus-like origin of endosymbiont plasmids. Phylogenetic analysis suggests that these plasmids may have a single evolutionary origin in Buchnera. Furthermore, these plasmid phylogenies are incongruent with chromosomally encoded genes. Plasmids, however, are not exchanged between hosts, so their lineage is host restricted. These observations suggest that the plasmids have independently colonized specific host lineages. Two well studied plasmids are the trpEG and leu plasmids, which encode genes for amino acid biosynthesis. These plasmids can undergo amplification and express elevated levels of biosynthetic enzymes for Trp and Leu. Other, smaller, non-coding plasmids, resembling defectives, are also known but are of unknown function. As noted above in the plant section, these plasmids also express GroEL (hsp 60 homologue). These proteins are essential for binding plant viruses within the aphid cell and they will provide molecular specificity for the aphid-virus interaction, such as for the aphid specificity for TYCCV. The plasmid encoded GroEL proteins appear to allow the plant virus to persist in the aphid cell and avoid degradation. Interestingly, the ORF (GroEL) is the most variable sequence of aphids and their endosymbionts, suggesting it is under strong selective pressure to change. Although it seems that plant virus might be involved in this GroEL
Overall, three types of diversity, it is not obvious how the plant virus bacteria-like symbionts are known and these would positively select a plasmid derived gene can be found in sexually reproduced eggs of diversity in an aphid vector. Some aphids have the aphid. Also, APSE-1 bacteriophage been reported to produce a lot more young by DNA sequence is present in these plasmid feeding on virus infected plants (i.e. BtMV, sequences. It is curious that unlike mtDNA, BYV), so this could presumably provide some Buchnera associated plasmids show selective pressure for virus-plasmid interaction. considerable sequence variation. These In some cases, however, aphids feeding on virus plasmids have inverted repeats and rep infected plants may have shortened life spans. genes that are consistent with belonging to The Pea aphid is also subjected to parasitization the phage-like Rep A1 family. The presence by hymenoptera Aphidus ervi Haliday of these rep genes and repeat sequences braconidea (discussed below). In a most 180 amazing case of intimate but complex interaction between parasites, Iridoviruses are known to infect both insects endosymbionts, host and possibly an and aquatic animals. Thus these viruses are endogenous virus, the parasitoid wasp larvae clearly expected to have been in the oceans, will redirect aphid host reproduction and infecting host such as shrimp, prior to the metabolism to favor the development of evolution of insects. Insect iridoviruses are juvenile stages of parasitoid. Presumably known to favor infection of Diptera species this is via the that is co-injected (anopheles, flies, drosophila). In 1953, the first along with the wasp larvae which is know to report of an insect Iridovirus was found in larvae be the main parasitoid derived agent to alter of crane fly of N. American aedes mosquito. It aphid biology. 
However this as yet was observed that the virus could be transmitted uncharacterized process works, ultimately it by transovarial transmission is some, but not all, must affect the Buchnera endosymbiont to host. In aquatic dipterans, vertical transmission is increase plasmid mediated expression and frequent. As previously mentioned, there are no amino acid biosynthesis in the aphid, iridovirus in higher animals. CIV and TIV are especially that of tyrosine. Thus it may be the most studied insect iridoviruses. that the fitness consequence to the host of Phylogenetic analysis suggests that older these endosymbiont, ‘phage-like’ plasmids versions of this virus family appear to have may also, to a large part, be determined by smaller genomes, such as coleoptera iridovirus interactions with other parasitic agents, such virus (CzIV) which appears older then as such as parasitoids and plant viruses. Lepidoptera iridovirus (WIV) and has extra DNA in its genome relative to CIV. The viral DNA of iridoviruses is packaged into a Wolbachia is another bacteria-like nucleosomal-like structure. The DNA is about endosymbiont of some hybrid/aphid species. 110-155 kbp and is terminally redundant and It is interesting that Wolbachia also have circularly permuted. The level of redundant virus-like phage particles ‘WO particles’ DNA corresponds to between 4-50% of the total which appears to be carried as a prophage. genome. With respect to genome and replication Wolbachia has received much attention by topology, Iridoiruses clearly resemble p22 phage. evolutionary biologist because it can be In natural host populations, high rates of responsible for sex distortion in these hybrid infection (70-90%) can be observed. Thus these species and can result in sexually viruses are highly prevalent in nature. However incompatible populations. 
Such distortions in some specific populations, such as the Black may represent a first step in the speciation of fly, viral associated pathology is rarely observed, host. yet these populations can be 37% positive for viral DNA by PCR based DNA analysis. This Families of insect DNA viruses. suggests that host specific persistent but inapparent infections are common. Iridoviruses Insect Virus were first noticed early in the have also been observed to be in mixtures with 1800’s. In 1808, there occurred an outbreak picorna-like viruses. However, the biological of Silkworm jaundice that was shown to be significance of the mixed virus has not been due to a transmissible infectious agent. We evaluated. now know this outbreak was due to polyhedrosisvirus, a large DNA virus which Entomopoxvirus. Othoptera (crickets and generated refractive crystal-like bodies in locusts) appear to have a notable affinity for hypertrophied nuclei. The virus infects the Entomopoxvirus. There are currently no obvious gut epithelium, resulting in silkworms that reasons to explain this tight host-virus stop feeding and leads to large losses of relationship. Although entomopoxviruses are infected host. clearly related to the orthopox vertebrate animal 181 viruses, phylogenetic analysis suggests that common to all baculovirus members. Within the entomopoxviruses are basal to and more this entire tree, 17 gene losses in specific variable then the orthropoxviruses. This branches could be seen, but 255 new gene would suggest that the insect acquisitions observed in the in total tree. Of entamopoxvirus evolved first from some these acquired viral genes, 80% of then were large DNA viruses present in the oceans. these new to the Genbank database. The Possible candidates would be Iridovirus or conclusions seem clear. 
For the most part, relatives of baculoviruses, although both of baculoviruses evolve from a common core of these virus groups are highly diverged from genes (mostly replication and structural genes) entamopoxvirus. Entamopox sequences are by the acquisition of or creation of novel genes. most related to ASFV and Granulosis This virus family is mostly composed of acute, viruses. In othoptera, EPV is known to lytic viruses. Although not highly host infect greater then 31 species of insect. restricted, some host specificity can be noted. Entomopoxvirus can also infect some With NPVs, all subgroup II-A taxa are from Lepidoptera and coleoptera species. The noctuid host (night moths), thus there are some interest in entomopoxvirus was initially virus –host connections. Some viruses harbor focused on its possible use as a biocontrol TED ‘gypsy’ element in DNA, but how this agent, especially since it infects insects element affects viral or host biology is not clear. families that are major perditors of crop Thus entamopoxviruses themselves can be species. However, the close similarity of colonized by retrovirus genomes. Although the entamopoxviruses to animal poxviruses has viruses are generally lytic, some can persist in prevented its development into a biocontrol the larvae of their host, but kill the host at post agent. larval development. In these cases, it is likely that the lytic viral genetic program is linked to the differentiation of the host tissues. Baculoviruses as models for DNA virus evolution. Baculoviruses as a group includes possibly the largest number of Nucleopolyhedrosis virus. AcMNPV is best known insect viruses. These viruses have studied of the nucleopolyhedrosis virus group. dsDNA genomes of 90-180 kbp. For the most part, this is due to the existence of Baculoviruses are mostly pathogenic for cell culture system for virus growth which has their arthropod host. The occlusion allowed experimentation to proceed. 
The morphology that results form infection is polyhedrons are group . This virus used to classify baculoviruses into two group is one of the largest dsDNA viruses. As major groups. Nucleopolyhedrosis virus cytoplasmic viruses, they have no RNA splicing, (NPV) infection results in polyhedral bodies but do have 5’ leader sequences. The viral DNA in infected cells that contain large amounts pol is unusual in that it is not aphidicolin of virus. Granulosis virus (GV) results in sensitive which would identify it as distinct ovoid occlusions, with few virions. Because lineage from T4, phycodnaviuses or herpesvirus. of their possible use as agents for biocontrol, AcMNPV has been extensively used as a system baculoviruses are well studied. The entire for recombinant expression which has the added phylogenetic tree of all family members has advantage of allowing post translational protein been evaluated. Although the tree shape modifications, such as glycosylation, to occur. will depend somewhat on the specific genes Most NPVs are not species specific but their host used, to generate the tree, it is now possible range is rather narrow. These viruses are highly to include all the genes in one lytic, thus they appear to have a strictly acute life comprehensive phylogenetic analysis. From strategy. Infected larvae will undergo virus such a comprehensive analysis, it was induced liquefaction of the internal tissue along reported that 63 ‘core’ genes were in with lytic virus replication. This results in a 182 dissolved larvae which contaminates soil plants, following predation by an insects, can and the environment and leads to the emit chemical signals that attract parasitoid infection of other larvae via feeding. In wasps assisting the wasp in finding its host. field situations, about 5% of population of Parasitoid species are surprisingly diverse its host tend to be infected, but in some (25,000 species). 
Such large numbers of species situations of high host population density raises the possibility that these wasp species infection rates can be can be up to 90%. could provide a major vector function for insect virus, akin to the insect equivalent of aphids as Granulosis virus – There are no vertebrate vectors for plants or mosquitoes as vectors for DNA virus that show a clear evolutionary viruses of vertebrates. If so, we can expect some relationship to this group of baculoviruses. very intimate molecular linkages between Even the most conserved gene of most viral parasitoid insects vectors, their transmitted virus lineages, such as the DNA polymerase, are and their host larvae. Finally, there is some not highly related to other eukaryotic DNA evidence of mixed infection between GV and viruses, probably owing to the fact the NPV have distinct biology. Mixed infections Granulosis viruses have an unusal gama with GP & NPV (and some epizootics) in the like DNA pol. Thus the Baculoviruses have field have been reported. It seems possible that distinct DNA pol phylogenetic patterns. these mixed infections may involve some type of The most similar DNA pol sequence to that complementation between the two viruses as of baculovirus are in the entamopoxviruses some larvae are 50 fold more susceptible to (TC1). Mostly, the viruses show a narrow mixed infections than to single infections. host range, but with a high and a However, such double infections are usually not high mortality. Like the other observed in the same cell type, but are in the baculoviruses, Granulosis viruses appear to same larvae, suggesting infections compliment at adhere to a strictly acute life strategy. organism level. These mixed infections show Natural infection with Granulosis virus is high levels of field mortality suggesting that the poorly studied. PrGV is best studied of this mixture affects the viral-host phenotype. group and natural infections of P. rapae have been reported. 
One particular epizootic Non-occluded ‘Baculovirus. Until recently, the event in Southern California was observed non-occluded baculoviruses have not been well in which > 90% of the host larvae in 1000 studied, although reports of their observation km2 range were infected. This is a highly have been scattered in the literature for some impressive level of infection for an acute time. These viruses resemble baculoviruses, but agent. In this case, the virus was being lack the cellular accumulation of occlusion transmitted by parasitoid Apanteles bodies typical of baculoviruses. Several have glomenrates and was shown to account for now been better characterized which includes up to 84% of GV transmission. This Hz-1, Oryctes virus and Gonad specific Vrius observation makes several important points. (GSV). In addition viruses resembling Hz-1 Clearly these acute viruses can have big have been reported to persist in some host. In impacts on host population in the field. fact, it now appears that persistent infections Second, the role of a parasitoid vector for may be the normal kinds of infections insect virus transmission had not previously established by this family of viruses. Gonad been well appreciated. Such high vector Specific Virus (GSV) is known to infect the mediated transmission efficiencies suggest reproductive tissue of helicoverpa zea moth and that the reproductive success of this GV is associated with ovarian atrophy and infertility virus itself, very much depends on the in some host. This virus is sexually transmitted behavior of the parasitoid vector. Plants are and is generally latent. Although infertility can also involved in communicating to the result, some moths are fertile when infected. parasitoid and it has been reported that some The virus is a rod shaped capsid and contains a 183 circular supercoiled dsDNA of 236 kbp. 
production as plaque purified Hz-1 virus does This virus was initially isolated from cell not readily establish persistent infections, line in culture that were persistently and suggesting an important role for defective virus inapparently infected. Hz-1 can also infect in the natural life strategy (and fitness) of Hz-1. and persist in many insect cell lines. Hz-1 This also poses the theoretical conundrum to has low sequence homology to baculovirus- evolutionary biology in that inhibition of virus GV (only 3% similarity). Hz-1 shows only reproduction (not maximixed replication) is 0.1 to 1 % similarity to other DNA viruses. important for natural survival and reinforces the The viruses also have highly unique genes, concept that persistence, not replication, may be not found in the Genbank database. Thus crucial for the fitness of some viruses. It is although they are called non-occluded worth emphasizing that defective virus mediated baculoviruses, they are distinct and are persistence could result in a virus-host currently an unclassified virus family. Of the relationship in which little if any virus is made 154 ORFs, in Hz-1, only 29 showed some but all host are infected. This situation would relationship to both baculoviruses and/or suggest way to evolve fully defective viruses that cellular genes, 16 of these related ORFs establish stable persistent infections, such as the were mostly genes involved in replication, polydnaviruses as discussed below. Hz-1 was transcription factors or as structural genes. natural isolated from adult ovarian tissue of 81% of Hz-1 ORFs show no homology to Heliothis zea grown in culture. These infected any other genes. The DNA is also distinct cells can be cultured. If these cells are also from baculovirus DNA in that it has repeat transfected with NPV, DNA, (even UV killed sequences, high in AT content with DNA), Hz-1 virus production will be induced, tandemly highly repeated 21-75 bp, even though NPV will not produce inclusions. 
elements. Interestingly, Hz-1 DNA Thus, there seems to be good evidence for virus- depRNA pol is most related to pSKL virus interactions. Hzv-1 expression is mainly plasmid DNA pol, thus it has a ‘Yeast-like’ silent during persistence and the discovery of RNA pol. The viral DNA dep DNA pol Hz-1 was essentially fortuitous. This raises the most resembles that of the host whereas the question of how common are similar persistent DNA ligase is mot similar to that found in infections by other viruses and how many more Humans. Also distinct from most other persistent viruses like Hz-1 have yet to be DNA viruses, Hz-1 encodes several host- discovered. like nuclear metabolic enzymes. Like Hz-1, OrL is another member of this virus Persistence by Hz-1. Because of the family, which was isolated from the Philippines. readily established persistence by Hz-1, it However, the virus is known to infect O. has been possible to study some of the rhinoceros beetle host in a strictly acute, non- molecular characteristics associated with persisting way. Instead it causes a highly lethal this persistence. In several respects, infection. Why might such similar viruses differ persistent Hz-1 strongly resembles that of so much in their ability to cause disease? The O. Human Herpes Virus type I. During Hz-1 rhinoceros beetle is not a native species of the persistent infection, only a non-coding Philippines and was an accidental pest species ‘persistent-associated’ RNA is expressed. introduced by humans into the Philippines. OrIL This is similar to LAT of HSV-1 infections emerged after the O. rhinoceros beetle persistence. The Hz-1 expressed sequence introduction, resulting in considerable mortality is called PAT 1 RNA (persistence associated to beetle, especially in larvae. The virus transcript). However, ther are some replicates widely in beetle tissue, especially in distinctions between Hz-1 and HSV-1. 
the midgut epithelium and is normally a fatal Mainly, Hz-1 persistence appears to also infection. In fact, OrL is considered the best involve defective interfering virus field model for the use of a virus as a biological 184 control agent of a pest insect species. poorly transmitted by mechanical and other However, OrL did not originate from the O. means, but it is highly infectious in natural rhinoceros host but can be found in native transmission following oviposition. DpAV will Philippine beetle populations, with little prevent the development of larval host of wasp – disease in these host. Thus the virus appears the moth pupa. In addition, DpAV codes for to have a persistent life strategy with its genes that are suppressive of moth immunity – natural beetle host, but can undergo a including a protein highly related to scorpion species jump to the O. rhinoceros beetle short toxin. In many respects, Ascoviruses host, causing a highly acute infection. resemble polydnaviruses (PDV) biologically, but is not a genomic provirus, as discussed below. However, unlike PDVs, DpAV4 will replicate Ascoviruses. This virus family is well in the moth larvae. The capacity for associated with the Hymenoptera ascovirus to replicate well in moth larvae but parasitoids. These wasps will function as persist in wasp, suggests that Ascovirus have a vectors of the ascovirus to lepidopteran host. hybrid and host dependent life strategy, are classified in 4 species, in persistent in the wasp vector, but acute in the which 3 groups are opportunistic and one moth larvae. group is obligate for the Hymeoptera. In some cases, Ascovirus replication has been linked to iridovirus infection. Ascovirus are Parasitoids wasps and virus. The relationship sexually transmitted via oviposition by the between parasitoid wasp and it host were noted Hymenoptera wasp host. Mechanical by Darwin as an example of Nature’s cruelty. 
transmission is accomplished to the Various parasitoid wasp species (Hymenoptera) lepidopteran host from a parsitoid wasp at will inject their eggs via an ovipositor into larval oviposition. In terms of wasp biology, the host (i.e. Lepidopteran) in which the wasp larvae specific Ascovirus, DpAV, is beneficial to develop, thereby consuming the parasitoid host Diadromus Pulchrellus wasp, and can from within. These are highly successful species enhance wasp development and has no and there are an estimated 200,000 parasitoid adverse effects on the wasp otherwise. wasp species that have various biological However, in Itoplectustics tunetana, DpAV strategies for parasitizing their host. Parasitoid infection has a most adverse affect, causing hymenoptera, using caterpillar and lepidopteran an acute infection with rapid virus host, are ubiquitous in terrestrial habitats. Wasps replication and nuclear lysis. Ascoviruses were an early insect to evolve and are have large bacilloform virions with circular predecessors to ants, another highly successful dsDNA of about 116-190 kbp. Ascoviral insect group. Fossil evidence indicates that the DNA pol clearly clusters with Iridoviruses oldest parasitoid wasp is about 140 mybp. In by phylogenetic analysis and is also linked some orders, wasp are known to harbor to those of phycodnaviruses and T even endogenous viruses that are involved in their phages. ASFV DNA pol is basal to many parasitoid life style. These viruses were first seen DNA pol sequences as well as basal to in 1967 when virus-like particles observed in other eukaryotic DNA viruses. Ascoviruses hypertrophied nuclei in oviducts of parasitoid resemble entomopox viruses and wasp. In 1981, it was shown that the DNA of iridoviruses in virion structure and vesicle this ‘virus’ was present in every female of the formation. SdAV1 and HvAV3 are found wasp. 
Since then, various observations suggest Noctuid host, but DpAV4 is found in wasp that the ovaries of ichneumonids are susceptible host. In D. Pulchellus, DpAV is transmitted to a wide variety of silent virus infections, most vertically, but is not integrated into the wasp of which are not well characterized. These chromosome. It is transmitted only by viruses are now known as the polydnaviruses female wasp. Curiously, DpAV is very (PDVs) which exist in two orders (Brachoviruses 185 and ). 30,000 parasitoid wasp including innexins or ‘vinexins’ which code for species appear to harbor PDVs. This gap junctional structural proteins involved in the wasp/virus relationship appears to have selective transfer of small molecules and ions existed for a period of 60 million years. At across cells. These proteins may be needed to least three independent lines of polydnavirus affect histocytes (or encapsidation response of have evolved, indicating that this virus-host plasmocytes) since vertebrate neutrophils and relationship is polyphyletic. In addition, are known to use these molecules other viruses (ascovirus, entamopoxvirus, to communicate via gap junctions. The normal unclassified virus) are known to exist in larval host immune response is via fused parasitoid reproductive tissue, but these melanized hemocytes that surround and suffocate agents are poorly characterized and the the parasitic egg. The PDV are known to significance of these viruses is unknown. prevent this response, allowing egg survival. Some PDVs (MdBV) are known to induce high This diverse and stable PDV virus-host level expression of viral genes in larval relationship establishes that the combined hemocytes. PDV DNA is replicated from fitness of a parasite and its host can result in genomic copies within the female reproductive highly stable lineages. The polydnavirus tissue, where it is assembled into virions. 
The relationship is therefore one of the most PDV DNA is thus amplified from proviral complex and highly evolved interspecific sequence, that are clustered in the wasp genome. interactions known among insects. The CsIV The mechanism of DNA amplification is not PDV is the best studied model of these known but could provide important clues as to viruses. They are composed of numerous the evolutionary origins of PDVs. A rolling circular dsDNA genomes, packaged into one circular DNA amplification has been suggested, virion, which is injected along with the egg which would imply RCR-viral ancestors since into the larval host. CsIV capsids are such a process is not cellular replication complex and are composed of over 15 mechanism. Excised and amplified PDV DNA peptides. The virions are rather distinct appears able to re-integrate into the genome. No from other DNA viruses and so far only sequence relationship to ascoviruses or show some structural similarity to two iridoviruses has been identified, so the possible unclassified viruses observed in fire ants and progenitor of PDV has yet to be identified. whirligig beetle (Avery, ’77). The viral genomes are very distinctive. They are The wasp-PDV story is such a striking and composed of numerous small super coiled ds successful virus-host relationship that it is worth DNA range from 6.2 to 18 kbp in size and further considering the possible origins of PDVs. are composed of 10-30 segments. The DNA A main conundrum is that PDVs express their is episomal, present in uneven molar ratios genes in the larval host, not the wasp vector. and 22 are non-redundant segments. In this This suggests that the PDV genes were originally feature, they do not represent any other adapted for functions needed in ledpidopteran insect DNA virus. 
Although their genomes larvae and further suggests that they most likely are still being evaluated, it appears that these originated from a virus that infects and expresses viruses do not code for their own in these larval host. However, PDV’s persist in polymerases. There are two major classes of the wasp, and given the mechanistic difficulty of PDVs, Brachoviruses have fewer segments jumping persistent infections into a new host and bigger DNA genomes relative to species, this wasp persistence suggest that the . About 2/3 of the DNA is non- PDV progenitor virus was also adapted to wasp coding and resembles defective genomes. persistence. Both of these biological PDVs have lots of non-coding sequence in characteristics are known to apply to the all segments plus they also have introns and Ascoviruses, which persist in wasp reproductive encode genes from multigene families, organs, but amplify and express genes in 186 lepidopteran host larvae. However, PDVs Nuclear Polyhedrosis virus (AcMNPV) along also appear to resemble defective virus, and with CsIV allows BV to replicate in a normally in this they resemble the Hz-1 virus during resistant host – Helicoverpa Zea. This affect persistence in reproductive tissue. Also, the appears to be mediated by CsIV depletion of the numerous small, supercoiled circular DNA hemocyte population, allowing BV infection and genomes and possible rolling circular further indicating the importance of hemocytes replication resemble no other insect virus, for BV infection control. Mixed infection has but do resemble other animal DNA viruses, also been shown to be important for BV co- such as TT virus in replication mechanism infection with TrIV. and size. This group of viruses is a recently discovered group, which shows no homology to any other virus group. Finally, Insect and Densonucleosis virus. the fact that there are three independent Small DNA virus infections are known for origins for PDVs (one Ichnovirus plus two insects. 
bracoviruses) and that these groups are paraphyletic to each other but monophyletic within the bracoviruses suggests that PDV colonization was highly favored by selection in order to allow multiple viral colonization events. All wasps within each of the host groups have PDV. No wasp descendant within these groups appears to have lost PDV sequences during evolution. Thus, the acquisition of PDVs appears to be a clear example of virus-mediated acquisition of a stable and most complex phenotype.

It is interesting to note that there is also a behavioral component to the PDV-larval host relationship. PDVs express genes that directly affect the behavior of infected host larvae, resulting in lethargy and also inducing climbing activity later. Interestingly, as previously mentioned, some plants infected by larvae can respond to salivary gland secretions from predation by the larval host by producing volatile compounds that attract parasitoid wasps.

Typically, viral origins and fitness are evaluated from the perspective of virus or host reproductive success. However, as we have argued before, persistence and the sometimes associated virus-virus interactions might also be important for understanding how PDV may have originated. Some virus-virus interactions are known between PDVs and BVs. For example, baculovirus co-infection of Autographa californica M

Generally, insect parvovirus infections are acute and typically fatal. Some of these viruses, such as the DNVs of the greater wax moth and the silkworm, show the ability to encapsidate both + and – strands of template DNA. However, there are some examples of what may be persistent infections, such as BmDNV-2, which will chronically multiply in the mulberry pyralid Glyphodes pyloalis with no corresponding disease.

Insect Genomes?

Like all other animals, insect genomes show considerable levels of colonization by DNA transposons and retroposons. Below, we seek to evaluate the changes in insect retroposon composition associated with insect speciation and genome evolution. We will also examine the possibility that insect-derived retroposons might have contributed to the large-scale retroposon colonization of plant genomes described above for the angiosperm plants. Not all retroposons are conserved in the evolution of related lineages, and some retroposon elements that were present in aquatic animals appear to have been lost in insects. These include the Poseidon and Neptune (Penelope) elements, common in fish, shrimp, sea urchin and frog but absent in Drosophila and human genomes. In addition, as discussed in chapter 5, some organisms, such as Neurospora, actively destroy or eliminate repeated retroposon sequences, so the ability of these elements to be maintained in genomes after initial colonization is not necessarily assured. It would appear that some type of selective pressure must maintain these elements in those lineages to which they are common.

The Drosophila genome, retroviruses and speciation. Speciation and diversification are associated with germline acquisition or colonization of DNA transposons and retroposons in both insects and plants. However, we currently lack a solid explanation for this observation. The concept of selfish DNA has often been applied to explain the accumulation of such genetic agents, as discussed in chapter 1. However, the lineage-specific maintenance of sometimes large numbers of non-replicating, defective versions of transposons is difficult to explain by the selfish DNA hypothesis. With the completion of the Drosophila melanogaster genome, we can now evaluate the entire genome with respect to these transposon elements. Drosophila has 178 full-length LTR retroposons, or endogenous retroviruses. Most of these full elements are of the ty1/copia or gypsy element family. In addition, some other retroposon family members are present at low numbers, such as ROO elements and the BEL element, which is also seen in maize. These two elements appear to be a more recent acquisition in the Drosophila genome and show high similarity to the more abundant plant elements. In Drosophila, retroposons exist in 17 distinct families, but two of these families are present in the greatest numbers: the copia/gypsy family and the BEL/ROO family. In D. melanogaster, the ty1/copia family is present at about 10 copies per genome. It is important to note that no flies lack the gypsy element, but as noted above, we currently lack an explanation for this conservation. Furthermore, all Drosophila melanogaster strains have defective gypsy provirus in pericentric heterochromatin, and a few wild strains of Drosophila have additional gypsy elements that are active in euchromatin, but are generally present at less than 5 copies per genome. In Drosophila, the telomerase (TART) region is clearly distinct from that of vertebrate animals and resembles retroviral RT in sequence and in its function of maintaining the telomeres. The RT nature of these fly telomeres suggests that they may have a viral origin. In addition, TART-related LINE-like (ty3-gypsy) elements have been reported to be expressed in oviduct tissues. Furthermore, this oviduct ERV expression can be induced by mating pheromone. Euchromatically located LTR-containing retroposons are also present, but these appear to be much younger than the Drosophila species as a whole, as they are not congruent with Drosophila genomes. In some Drosophila strains, there is a ZAM element, which is an env-containing full copy of the gypsy element. ZAM is, therefore, a full retrovirus, which is also overexpressed in the follicular cells that surround the oocyte and can be sexually transmitted. In the Rev I strain of D. melanogaster, the mobilization of ZAM occurred following a spontaneous insertion event by another transposon. These activated viruses can lead to segregation distortion following mating between virus-producing and nonproducing Drosophila strains.

Plant/insect retroposons. As mentioned above, plants lack authentic retroviruses, yet plant genomes are known to have acquired many copies of retroposons that resemble the gypsy-like and copia-like retroviruses found in insects. In fact, it is precisely such blocks of retroposon elements that differentiate recent events in plant genome evolution, such as the difference between maize and teosinte DNA, maize and sorghum DNA, or barley and rice DNA. Even wild rice DNA can be differentiated from crop species rice DNA by such retroposon blocks. We should also recall that in gymnosperm/conifer lineages, the clades of gypsy elements were monophyletic, suggesting a congruence between plant orders and retroposon block acquisition. Cereals, in particular, appear to have evolved more recently than other angiosperms and show genomes that are especially high in copia and gypsy families of retroelements. In considering how these changes might have come about, two events appear to have been required for these genomic retroposon-block changes. One, genome colonization by a specific family or variant of a retrovirus must have occurred at the origin of these plant species. Two, diversification (with deletions), amplification and expansion of genomic retroposons must have been relatively rapid, but perhaps remains an ongoing process. This amplification and diversification especially involves defective derivatives (LTR, LINE and non-LTR) of the colonizing retroposon. The latter event has occurred on a relatively recent time scale. The question this poses is why there is an association between speciation and retroposon colonization, and what role the colonizing retroposon and its subsequent expansion play in the origin of plant species. As we present below, the gypsy elements can be fully expressed as an infectious retrovirus in large numbers in insect reproductive tissue. Furthermore, this virus can be passed as a sexually transmitted agent. Given the co-evolutionary relationship between insects and plants (as mentioned above), it thus seems plausible that insects might have been the source of the gypsy elements that colonized various plant genomes. This would require that insect-derived retroviruses could somehow find their way into the DNA of a plant germ line.

We can now propose some speculations as to how insects might have contributed to flowering plant evolution. Much of our understanding of insect retroviruses and retroposons comes from studies of Drosophila. As described below, Drosophila speciation may directly relate to the expression of transposons and retroviruses by reproductive tissue. However, these Dipterans were not yet present at the evolution of the early flowering plants (nor were many of their larval insect hosts). Flies (and ants) most likely evolved from solitary wasp predecessors (which include the very numerous parasitoid species). Parasitoid wasps are also well established as being able to make various types of endogenous (i.e. Polydnavirus) and exogenous (e.g. Ascovirus) virus in association with their reproductive tissue, and some of these viruses are essential for wasp reproduction and egg survival. These viruses can even control wasp sexual behavior. The earliest flowering plants are thought to be represented by orchids (which are also highly diverse species). In the section on the evolution of plants above, the idea was proposed that insects may have been directly involved in the origin of flowers. If this speculation is correct, can we observe in extant wasp and orchid species some remnants of relationships consistent with this insect role? Parasitoid wasps must be able to implant their eggs in an appropriate host. Although larval insects are currently their normal host, it would seem that at some distant time they might have had similar relationships with plants as hosts for egg development. Some relationship like this seems essential if large numbers of genetic elements from insect lineages, such as retroposons, are to enter the germ line of plants or lead to the evolution of plant embryos/flowers. Some orchids, such as the Australian Chiloglottis, rely exclusively on specific wasps (e.g. Neozeleboria cryptoides) for pollination. These orchids emit volatile compounds that sexually attract specific wasp species and also make flower structures that induce male wasp mating behavior, which results in flower pollination. This complex relationship has been called 'sexual deception', as practiced by the orchid flower against the wasp, and is considered a relatively unique relationship. Yet many orchids use wasp pollinators. Moreover, we have good evidence that wasp sexual behavior can be directly controlled by viral parasites. If insects were indeed involved in the origin of flowering plants, relationships like the one above may actually be much older than previously considered and may represent remnants of early wasp-plant relationships. Future phylogenetic analysis of these wasps, their viruses, and orchid genomes and retroposons may help clarify this possibility.

Insect speciation and endogenous retrovirus. D. melanogaster and various other species of Drosophila have been extensively used by evolutionary biologists to study the issues related to speciation. D. simulans and D. melanogaster are currently estimated to have diverged about 2.3 × 10^6 ybp, and this has served as a useful time to estimate species diversification. Drosophila species diversification can also be considered from an old world and new world perspective, as these species are distinct. The old world D. melanogaster species has successfully colonized the new world. With the completion of the D. melanogaster genomic sequence, however, we can now begin to evaluate the patterns of transposon and retroposon acquisition with the evolution and speciation of these flies. As many of these retroposon elements are found in heterochromatin regions of the genome (especially the X and Y chromosomes), as noted above, it has been proposed that they reflect simple selfish transposons that lack phenotype and thus accumulate in inactive, less damaging regions of the fly chromosome. However, more recent evaluation of this idea does not support the selfish DNA concept. Not all families of retroposons are found concentrated in heterochromatin, thus there seems to be no link between the stability of the retroposon and its location in heterochromatin. In contrast to the plant retroposon sequences, flies conserve intact ERVs: 10 of the retroposon families of the Drosophila genome are complete with all ORFs, including an env ORF (e.g. 17.6, 297, gypsy, Idefix, opus, Quasimodo, roo, springer, Tirant, ZAM, Osvaldo). The conservation of the env gene strongly suggests some functional selection for this gene. In addition, various other retroposon family members encode all the other normal retroviral genes minus env, also suggesting a functional env conservation. Nor is the presence of full endogenous retrovirus sequence unique to Drosophila genomes, as the TED element from lepidopteran species is also a complete retrovirus. As is always the case, larger numbers of defective copies of these retroposons are also found in insect genomes. However, all species of Drosophila have maintained some prototypical version of the gypsy endogenous retrovirus, related to that found in D. melanogaster. In the D. obscura species, the obscura-specific gypsy element is distinct in that its env sequences are monophyletic, indicating a conserved and vertical mode of transmission. However, D. hydei and D. virilis have gypsy env sequences related to the D. obscura element that are not conserved in a monophyletic clade and appear to have been acquired by more recent horizontal transmission. In these species, an interrupted env ORF has been maintained. Thus intact gypsy (env-containing) ERVs show both conservation and congruence with the host as well as recent colonization and incongruence, in distinct ways that relate to the specific Drosophila species. For example, D. obscura has a gypsy env-containing element related to the D. melanogaster element (GypsyDm). This element appears to be recently acquired and is inactive. It seems likely that sexual transmission may account for these transmission patterns from a stably colonized host to another recently infected and sexually isolated host genome. For example, the gypsy-like Osvaldo retroposon (with known similarity to the HIV-1 and SIV animal viruses) is maintained in D. buzzatii and is known to undergo activation of transposition in hybrids between D. buzzatii and its sibling D. koepferae. In D. melanogaster, the gypsy element normally persists as an inactive and inapparent provirus under the control of the flamenco gene, which must be repressed to activate the gypsy virus.

Speciation and DNA transposon colonization. Another genetic element that appears to distinguish old and new world Drosophila genomes is the P element. P elements are 2.9 kb transposable DNA elements with 30 bp repeats at the DNA ends. They encode four ORFs, which include genes coding for transposase and a repressor of immunity. This organization and these coding functions clearly resemble phage Mu, but P element transposons resemble a defective relative of phage Mu in lacking any capsid genes. In this absence of capsid genes, P elements also resemble the hypoviruses of fungi. Nevertheless, there is evidence that P elements undergo mobilization and colonization of new genomes. P element activity is normally regulated via differential RNA splicing in which 4 introns are transcribed in non-germline cells. Failure to remove these introns via RNA splicing results in a truncated transposon that fails to express gene products and is also inhibitory to other transposons, rendering the transposon immobile and persistent in the genome. Differential splicing in germline tissue would seem necessary for P element germline colonization. Like the gypsy element, P elements can be both congruent and incongruent in various Drosophila species. In fly species of African origin, such as D. melanogaster, P element phylogenetics are incongruent with the host genome, strongly suggesting a recent colonization by P elements. However, in American fly species, such as D. willistoni, P element phylogenetics are congruent with the host genome. Thus it appears that D. melanogaster lacked the P element before American colonization. In keeping with this hypothesis, the P element is also associated with hybrid dysgenesis between Drosophila species containing and species lacking P elements. By evaluating codon usage differences, estimates of the time of transfer of the P element to the melanogaster species have been made. This transfer appears to have been a relatively recent event. However, it seems clear that hybrid dysgenesis is associated with both P element DNA transposition and LINE-1 retroposon genome transposition. The implication of these observations is that genomic colonization by DNA transposons and retroposons can result in segregation distortion and initiate a process of sexually isolating interbreeding insect populations. This could thus represent the initial genetic events needed to generate a distinct species, which would essentially result from a viral strategy for the competitive colonization of its host.

In the above section we have noted that the major patterns of genomic change and the distinctions between the genomes of closely related species all appear to involve changes in and acquisitions of retroposons and DNA transposons. Furthermore, there is a striking contrast in the distribution of endogenous retrotransposons in various lineages, such as between Drosophila and human genomes, or Drosophila and plant genomes. In the fly genome, most families of retroposons are found in relatively low frequency (1-10 copies/genome), but at variable sites. In human genomes, most retroposon elements are numerous and fixed in their location as such are rare. However, other insects can clearly have much larger genomes with a much greater content of repeated elements than Drosophila. One striking example of this is the mountain Podisma grasshopper, which can have genomes that are 100-fold greater than those of humans (yet curiously does not spontaneously delete DNA at high rates). Clearly the relationship between genome colonization and host speciation (or competition with other genetic parasites) is poorly understood, but such colonization is associated with and possibly responsible for the events leading to speciation.

Additional reading.

Invertebrate and viruses.

(Gaunt and Miles 2002)
(Kurstak 1991)
(Darai 1990)
(Dowton, Austin et al. 2000)
(Whitfield 2002)
(Herniou, Luque et al. 2001)
(Rahbe, Digilio et al. 2002)
(Stasiak, Demattei et al. 2000)
(Cheng, Liu et al. 2002)
(Turnbull and Webb 2002)
(Pfaff 2002)
(Liu and Beckenbach 1992)
(Beckage 1998)

Transposons/ERVs.

(Leblanc, Desset et al. 2000)
(Canizares, Grau et al. 2000)
(Lerat, Rizzon et al. 2003)
(Alberola and De Frutos 1996)
(Andrianov, Zakharyev et al. 1999)
(Dimitri and Junakovic 1999)

Plant and viruses.

(Matthews and Hull 2002)
(Gibbs 1999)
(Kenrick 2000)
(Bremer 2000)
(Gray 1996)
(Lockhart, Menke et al. 2000)
(Raubeson and Stein 1995)

RNA virus evolution.

(Holmes 2003)
(Zanotto, Gibbs et al. 1996)
(Argos, Kamer et al. 1984)
(Gorbalenya, Pringle et al. 2002)
(Koonin and Gorbalenya 1992)

Alberola, T. M. and R. De Frutos (1996). "Molecular structure of a gypsy element of Drosophila subobscura (gypsyDs) constituting a degenerate form of insect retroviruses." Nucleic Acids Research 24(5): 914-923.
Andrianov, B. V., V. M. Zakharyev, et al. (1999). "Gypsy group retrotransposon Tv1 from Drosophila virilis." Gene 239(1): 193-199.
Argos, P., G. Kamer, et al. (1984). "Similarity in gene organization and homology between proteins of animal picornaviruses and a plant comovirus suggest common ancestry of these virus families." Nucleic Acids Res 12(18): 7251-67.
Beckage, N. E. (1998). "Parasitoids and polydnaviruses." Bioscience 48(4): 305-311.
Bremer, K. (2000). "Early Cretaceous lineages of monocot flowering plants." Proceedings of the National Academy of Sciences of the United States of America 97(9): 4707-4711.
Canizares, J., M. Grau, et al. (2000). "Tirant is a new member of the gypsy family of retrotransposons in Drosophila melanogaster." Genome 43(1): 9-14.
Cheng, C. H., S. M. Liu, et al. (2002). "Analysis of the complete genome sequence of the Hz-1 virus suggests that it is related to members of the Baculoviridae." J Virol 76(18): 9024-34.
Darai, G. (1990). Molecular biology of iridoviruses. Boston, Kluwer Academic Publishers.
Dimitri, P. and N. Junakovic (1999). "Revising the selfish DNA hypothesis: new evidence on accumulation of transposable elements in heterochromatin." Trends Genet 15(4): 123-4.
Dowton, M., A. Austin, et al. (2000). Hymenoptera: evolution, biodiversity and biological control. Melbourne, CSIRO Publishing.
Gaunt, M. W. and M. A. Miles (2002). "An insect molecular clock dates the origin of the insects and accords with palaeontological and biogeographic landmarks." Molecular Biology & Evolution 19(5): 748-761.
Gibbs, A. (1999). "Evolution and origins of tobamoviruses." Philosophical Transactions of the Royal Society of London B Biological Sciences 354(1383): 593-602.
Gorbalenya, A. E., F. M. Pringle, et al. (2002). "The palm subdomain-based active site is internally permuted in viral RNA-dependent RNA polymerases of an ancient lineage." J Mol Biol 324(1): 47-62.
Gray, S. M. (1996). "Plant virus proteins involved in natural vector transmission." Trends in Microbiology 4(7): 259-264.
Herniou, E. A., T. Luque, et al. (2001). "Use of whole genome sequence data to infer baculovirus phylogeny." J Virol 75(17): 8117-26.
Holmes, E. C. (2003). "Molecular clocks and the puzzle of RNA virus origins." J Virol 77(7): 3893-7.
Kenrick, P. (2000). "The relationships of vascular plants." Philosophical Transactions of the Royal Society of London B Biological Sciences 355(1398): 847-855.
Koonin, E. V. and A. E. Gorbalenya (1992). "An insect picornavirus may have genome organization similar to that of caliciviruses." FEBS Lett 297(1-2): 81-6.
Kurstak, E. (1991). Viruses of invertebrates. New York, M. Dekker.
Leblanc, P., S. Desset, et al. (2000). "Life cycle of an endogenous retrovirus, ZAM, in Drosophila melanogaster." J Virol 74(22): 10658-69.
Lerat, E., C. Rizzon, et al. (2003). "Sequence divergence within transposable element families in the Drosophila melanogaster genome." Genome Res 13(8): 1889-96.
Liu, H. and A. T. Beckenbach (1992). "Evolution of the mitochondrial cytochrome oxidase II gene among 10 orders of insects." Molecular Phylogenetics & Evolution 1(1): 41-52.
Lockhart, B. E., J. Menke, et al. (2000). "Characterization and genomic analysis of tobacco vein clearing virus, a plant pararetrovirus that is transmitted vertically and related to sequences integrated in the host genome." Journal of General Virology 81(6): 1579-1585.
Matthews, R. E. F. and R. Hull (2002). Matthews' plant virology. San Diego, Academic Press.
Pfaff, D. W. (2002). Hormones, brain, and behavior. Amsterdam; Boston, Academic Press.
Rahbe, Y., M. C. Digilio, et al. (2002). "Metabolic and symbiotic interactions in amino acid pools of the pea aphid, Acyrthosiphon pisum, parasitized by the braconid Aphidius ervi." Journal of Insect Physiology 48(5): 507-516.
Raubeson, L. A. and D. B. Stein (1995). "Insights into fern evolution from mapping chloroplast genomes." American Fern Journal 85(4): 193-204.
Stasiak, K., M. V. Demattei, et al. (2000). "Phylogenetic position of the Diadromus pulchellus ascovirus DNA polymerase among viruses with large double-stranded DNA genomes." J Gen Virol 81(Pt 12): 3059-72.
Turnbull, M. and B. Webb (2002). "Perspectives on polydnavirus origins and evolution." Adv Virus Res 58: 203-54.
Whitfield, J. B. (2002). "Estimating the age of the polydnavirus/braconid wasp symbiosis." Proc Natl Acad Sci U S A 99(11): 7508-13.
Zanotto, P. M., M. J. Gibbs, et al. (1996). "A reevaluation of the higher taxonomy of viruses based on RNA polymerases." J Virol 70(9): 6083-96.

Possible figures.

7-1. Plant dendrogram (also early plants, needs re-rendering or permission)
7-2. Plant retro-elements (needs permission)
7-3. Insect dendrogram (needs simplified re-rendering or permission)
7-4. Insect retrovirus (don't have this yet)
7-5. Baculovirus evolution (this is modified from a JV paper, permission needed)
7-6. Polydnaviruses (needs permission, Stoltz, journal: Virology?)
7-7. Parasitoid wasp (I have a hard copy of a drawing I already made)
7-8. Insect origin of plant retroelements (I need to render a schematic)

CHAPTER VIII

Evolution of Terrestrial Animals and their Viruses

Vertebrate emergence from the oceans

As vertebrates emerged from the oceans to become land-dwelling animals, numerous basic changes in their physiology and organ structures were necessary. The ability to breathe air required the physiological adaptation to respire directly from the air and the corresponding formation of lungs. Living in a non-aqueous habitat also requires the development of a skin that is able to retain moisture and prevent desiccation. Also, limb and skeletal structure must become robust enough to support the non-buoyant mass of the animal. All these adaptations represent some of the basic evolutionary events leading to the origin of terrestrial vertebrate animals. Along with these developments, clear changes in the habitat for viral species that infect terrestrial animals would also occur. The creation of lungs and keratinized skin, for example, would provide novel and distinct habitats for animal viruses and, as we will see, numerous viruses have indeed adapted to both terrestrial animal lungs and skin. For example, in humans respiratory infections represent the most common type of modern viral infection. In the previous chapter, we briefly considered some of these issues with respect to the viruses that infect terrestrial insects and plants. We noted that the loss of the aqueous habitat would require viruses to evolve non-water-borne transmission mechanisms as well as resistance to desiccation and solar damage. Many of these same issues apply to viruses of terrestrial vertebrates. Curiously, however, the types of viruses that infect insects are often very distinct from those that infect vertebrates, so it does not appear that this common terrestrial habitat led to common virus-host relationships. Another difference is that insects and early vertebrates were poikilothermic, whereas mammals and birds are homeothermic. As successful viral infections are often associated with host temperatures (such as fish and amphibian viruses), and in mammals fevers are associated with viral infections, we can also identify this as a distinct thermal viral habitat which developed with birds and mammals.

Viruses of fish, host immunity and viral evolution. As discussed previously in chapter 6, oceanic vertebrate animals represent the first animals to have created the adaptive immune system. The acquisition of this complex and protective genetic system would certainly have affected the relationship between these hosts and their viruses. But how might the selective pressures brought about by infection with various types of virus have affected the various host lineages that later developed? In the above chapters, we have noticed a strong tendency that suggests links between host diversity and virus diversity: the more diverse the host species for a particular order, the more diverse tends to be the corresponding type of virus that infects that host (e.g. Ch. 6, bony fish, and Ch. 7, flowering plants). We cannot at this time offer a validated explanation for this observation. We are unable to discern if viruses are simply adapting to more diverse hosts or if host diversification itself is related to viral loads. Yet the invention of the adaptive immune system would seem to pose a major problem for any virus or any putative linkage of virus to host evolution. The adaptive immune system should severely limit these hosts as potential viral habitats. Can we now observe any evidence that supports this expectation? What patterns of virus evolution are seen in the vertebrate lineages, and do they suggest any linkage to the creation of the adaptive immune system? Sharks and skates are jawed, cartilaginous vertebrates that had no developed bony skeletal structures and represent the earliest animals with an essentially complete adaptive immune system. In the case of these vertebrate

hosts, which are not highly diverse relative to bony fish, it indeed appears that they are infected by relatively few types of virus. However, until recently, there was little data available to evaluate this issue as these organisms were not well studied. Can we examine any molecular data that might illuminate this issue?

One gene that closely defines the adaptive immune system is the V-H gene, which is essential for the DNA recombinational activity that generates the needed diversity of the immune response. Recent comparative studies of V-H genes can now be used to generate a gene tree of all vertebrates based on the evolution of the adaptive immune system. When this is done, five major groups of V-H genes can be identified (A, B, C, D and E). Of these, the E group was monophyletic and included the cartilaginous fish such as sharks and skates. Group D corresponded to the bony fish. Group C was more mixed, containing fish, amphibian, reptile, bird and mammal genes. Groups A and B were less mixed and found in mammals and amphibians. However, it was very interesting that the earliest representatives of these genes (E) were not found in the higher oceanic or terrestrial vertebrates. Conversely, the V-H gene version found in bony fish (D) was not found in terrestrial vertebrates. And some examples found in terrestrial vertebrates (A/B) were not found in oceanic vertebrates. These observations appear to identify a pattern in which early V-H genes of the adaptive immune system were initially replaced by subsequent versions of these genes found in bony fish, but then again replaced by different versions found in the terrestrial animals. These transitions or gene replacements also correlate with changes in host diversity. It is therefore of interest to us to note that the radiation of viral diversity was most apparent in the transition from cartilaginous fish to bony fish, and that this transition also corresponded to a replacement of the V-H genes as well as a radiation in host diversity. The resulting teleost fish are now much more diverse than their vertebrate ancestors.

Teleost fish radiation. It is now estimated that bony fishes and aquatic lower chordates encompass about 18,000 species. The large majority of these species are teleost fish. This compares to about 6,300 species of reptiles, 9,000 species of birds and 4,000 mammalian species in terrestrial habitats. Although bony fish represent a major radiation in vertebrate species relative to terrestrial vertebrate species, it should be remembered that this compares to 50,000 species of mollusca and 6,100 species of echinoderms in the oceans. Also for comparison, as mentioned in the prior chapter, there are an estimated 69,000 species of higher plants and 751,000 species of insects in terrestrial habitats. As a calibration of species generation, it is worth recalling that polydnaviruses specific to wasp hosts alone accounted for over 20,000 species of parasitoid wasp (chapter 7).

Fish viral diversity. Overall, we have previously observed that bony fish harbor a large number of viral types that include most of the viral families that are also found in terrestrial animals. In terms of known viral families, fish only seem to lack parvoviruses and arboviruses. In teleost fish, a large number of viruses are DNA-containing LCDV-like viruses, or iridoviruses. LCDV alone is known to infect over 140 fish species, resulting mainly in benign skin and proliferative connective tissue disease. Fish also harbor a large number of RNA viruses. Curiously, 2/3 of fish RNA viruses are thought to be rhabdoviruses, which, outside of bats, are not common in other host species and represent a rather unique virus-host association. There are several strikingly different virus/host patterns in vertebrates and insects or plants. For example, although DNA viruses are common in both mammalian and insect orders, there are no ascoviruses or iridoviruses in mammals, yet these are common in both insects and cold

blooded vertebrates. Thus such considerations of general virus/host patterns have allowed us to identify several clear virus-host themes in the vertebrates. One theme is that mammals maintain the ability to host most of the diverse families of viruses that were found in bony fish, but apparently not those of amphibians and reptiles (discussed below). One family of virus that is of special interest, due to its close co-evolution with its host, is the herpesvirus family. Fish have several distant relatives of herpesvirus, but some lineages of herpesvirus are unique to fish whereas other families link fish herpesviruses to herpesviruses of mammals. Fish herpesviruses, like fish retroviruses, are readily recognized in field surveys as they are often associated with benign skin growths (also a feature not seen in herpesviruses of mammals).

Fish herpes relationship to mammals. CCHV (Channel Catfish Herpes Virus) is clearly a herpesvirus as judged by its morphology and replication strategy, yet it shows almost no sequence similarity to other herpesviruses. For example, the CCHV TK gene is neither similar to the host version of this gene, nor is it similar to the TK gene of other herpesviruses. This family of fish herpesvirus therefore does not appear to be an ancestor to the current mammalian viruses. In addition, there are other fish herpesviruses that appear to be members of the CCHV-like herpesviruses, such as IcHV-1. These two viruses are biologically interesting in that they both establish latency in leucocytes and result in fish that persistently shed virus. Thus, this appears to identify a lineage of fish herpesvirus that has no apparent direct descendants in mammals (or birds) and appears to represent a distinct evolutionary origin that has been maintained in fish. However, the salmonid herpes virus (SalHV-1) has a genome that clearly resembles mammalian alpha herpes viruses. SalHV-2 shows low virulence in both Coho salmon and some trout. This virus also establishes an inapparent infection in Kokanee salmon. Some fish appear able to shed persistent virus (especially via sexual products), and thus represent a source of virus infection for other species. Interestingly, co-infection of salmon with the retrovirus, lymphocystitis virus, can result in SalHV-1 reactivation. In other species, such as rainbow trout, SalHV-1 is lethal. However, Steelhead herpes virus (SHV) is biologically converse to SalHV-2 in that it shows low virulence in rainbow trout but is virulent in salmon. Numerous other related herpes viruses of fish are also known (AciHV-2, PlHV-1, PcHV-1, VENF), but often the biological relationships and viral characteristics are not well established. However, this set of herpesviruses does show discernible similarity to herpesviruses of mammals, thus these salmon herpes viruses may represent ancestors to the mammalian viruses.

Are fish and human herpesviruses related? Humans are especially well studied with respect to herpesviruses and are known to host seven (or possibly eight, depending on criteria) different human-specific species of herpesvirus, most (but not all) of which are phylogenetically congruent with human evolution. These herpes viruses appear to represent an ancient lineage of virus and have even been linked to bacterial DNA viruses (T-even phage). It seems likely that this evolutionary lineage would also include the salmon (but not catfish) herpesviruses. As we have previously noted, similarities between animal herpesviruses and the DNA viruses of green algae (phycodnaviruses) and filamentous brown algae (phaeoviruses) are also very clear. Thus it seems clear that some forms of aquatic herpesvirus do share similarity to human herpesviruses. In addition, herpesviruses have been observed in mollusk, lamprey and shark species. Although we know relatively little about these aquatic herpesviruses, some are known to maintain common biological characteristics, such as ganglion infection, that suggest that these viruses likely share a common heritage with the mammalian viruses. Also, herpes viruses frequently persist in specific species, but do not integrate into host chromosomes. Both these characteristics appear to have been maintained in animal hosts. Curiously, as noted above and unlike large DNA viruses of bacteria and filamentous brown algae, viruses of vertebrates do not normally integrate their DNA into the host chromosomal sequence.

Another ancient viral lineage that links aquatic vertebrates to terrestrial vertebrates is the retroviruses. As noted in chapter 6, retroviruses as prevalent and autonomous viral infections were first apparent in the bony fish. These fish viruses account for significant infections of commercially important species. Relative to these fish retroviruses, however, most retroviruses of terrestrial vertebrates are not as frequently associated with autonomous acute infections, but instead tend to more commonly occur as inapparent or endogenous retrovirus infections (ERVs), sometimes associated with cellular proliferation, especially of cells of the immune system. Some of these ERVs are highly genome associated and tightly linked with the lineage of their host and can be highly prevalent (at 100% levels), whereas other ERVs are much more restricted in their distribution. In addition, there is significant variation in the quantity of ERVs, but especially in the quantity of defective ERV derivatives that are found in the specific lineages of terrestrial vertebrates. In general, amphibians, reptiles and birds have a significantly smaller number of ERVs and ERV-derived retroposons than do mammals.

In summary, overall, we can clearly see broad and well maintained patterns between the terrestrial vertebrates and their viruses and, further, that these patterns have undergone noticeable shifts with the evolution from oceanic to terrestrial hosts. There have been few prior attempts to explain these associations.

OVERALL CHARACTERISTICS AND EVOLUTION OF TERRESTRIAL HOST LINEAGES

Although we have outlined some general differences between terrestrial animals and their predecessors, it is now worthwhile to consider the biological characteristics of the individual lineages of terrestrial animals so that we can subsequently evaluate how these lineages relate to the viruses that infect them. Compared to insects, terrestrial vertebrates are much less species diverse. However, compared to fish species, terrestrial vertebrates are of similar diversity. The biological adaptations needed for the transition from an aqueous habitat to a land habitat link the evolution of all the land animals, including insects and vertebrates. This transition is especially expected to affect egg and larval development, which must now occur outside of the ocean. Terrestrial eggs would need to either return to a water habitat to develop (as with amphibians), acquire a non-desiccating covering (as with reptiles and birds), or become internal (as with mammals). Similarly, aquatic
larval forms would also need to either be Of special note is that all mammal lineages returned to a water habitat to develop (as in not only tend to have much higher numbers amphibians) or be lost completely due to a more of ERV sequences, but many of these ERVs complete transition to a terrestrial habitat (as are also specific to and maintained within with reptiles, birds and mammals). As aquatic the particular lineage. Thus each mammal viruses were often associated with both host lineage is associated with its on peculiar larval forms and eggs of their host, the terrestrial version of ERVs and the ERV associated viruses would also need to adapt to changes in LINES (poly-A retroposon derivatives of the larvae (embryo) and egg biology. In addition, same ERV family). Also, the mammalian and of relevance to viral replication strategies, sex chromosomes (X but especially Y) are the transition to land would also generally affect especially highly colonized by ERVs. This the population structure of the host as the issue and how it relates to host evolution common ‘school’ population of fish are more will be considered in much greater detail at difficult to attain on land, at least in the earlier the end of this chapter. terrestrial vertebrates (amphibians, reptiles). 197 Such a population structure would disfavor amphibians is that the sex is environmentally acute viral agents and favor persistent viral determined, and not the direct result of a sex agents. An example of this issue with chromosome. Amphibians have advanced visual respect to mammals would apply to rodents. and auditory senses relative to fish and distinct Rodents are the most diverse and best teeth. Their breathing is dues to positive pressure studied mammals. However, rodents mechanisms, not aspiration as in higher seldom exist in gregarious populations in vertebrates. The skin of amphibians not only natural settings. 
Birds, on the other had, do contributes significantly to respiration, but is frequently exist in large flocks. Such also often associated with the secretion of variations in population structure will relate poisons and . It is interesting to note that to viral strategies in that in large populations in fish species, viral skin infections were highly are much more able to support acute viral common and were also often associated with agents, whereas persistent viral agents can mucus and toxin production. Within the be sustained in non-gregarious host amphibians, Salamanders are the more basal populations. As noted previously, species and are distinguished by internal egg persistence is also associated with genomic fertilization. As will be discussed below, all colonization, sexual transmission and old to amphibians are host to Idioviruses, and most also young transmission so we will also consider host Herpesvirus and Adenovirus. All these these issues with respect to the various viruses are associated with persistence in tumor vertebrate lineages. tissue and also with skin infections. Curiously, no RNA viruses have yet been described for Amphibians represent about 4,800 species. salamanders. Within amphibian species, about 90% are frogs (4,200 species), thus frogs are by far Amniotes: as mentioned above, amniotes have a the most successful of amphibians. monophyletic origin but are paraphyletic to Amphibians appear to be monophyletic so amphibians. Amniotes are distinguished by the their decent from a common ancestor seems origin of eggs with reparatory (amniotic) most likely. However, amphibians are membrane that are adapted to non-aqueous distinct from and paraphyletic to the early 4 habitat. This also requires that eggs have legged vertebrates that were present in protective, nondesiccating coverings and this Paleozoic period (such as Labyrinthodonts, covering in reptiles and birds is distinct. We an extinct order that gave rise to amniotes). 
have previously noted in the development of Thus amphibians diverged early from the insect eggs, that an egg covering resembled the lineage that went on to develop into ‘walling-off’ process of the innate immune-like mammals. Amphibians are also response. In the case of internal fertilization, poikilothermic so they have not developed such as birds and marsupials, the egg shell can thermal regulation. They generally have provide a barrier that seals off allogeneic embryo two life phases; an aquatic larval stage and a from the mother’s adaptive immune system. In terrestrial adult stage. In addition, the eggs the case of bird eggs, the amniotic membrane is are laid in and developed in an aquatic frequently used as a tissue for the production of habitat, although egg fertilization is external. various types of virus and forms the basis of the Although the large majority of amphibians use of eggs for vaccine production. The shell have aqueous larval forms, some have covering is made of various materials, but is evolved to lose such a life stage. 20 species deposited by maternal cells following internal of frog are known to have lost the tadpole fertilization of the egg by sperm. The specific phase and also to have acquired a terrestrial nature of the shell is distinct for the various egg. Unlike the evolution of amniotes, this lineages of amniotes although in all cases, it process has evolved multiple times, not must still allow respiration via the shell simply once. Another distinction of membranes. In amniote evolution, there was and 198 early divergence of reptiles from synapsids expression. This process of skin differentiation and subsequently to pelycosaurs which led is associated with virus reproduction in to monotremes, marsupials and mammals. mammalian skin. Reptiles, like birds have nucleated RBS, which can support some types of Sauropsids. These non-mammalian virus. Crocodiles are the basal clade of the terrestrial vertebrates include five groups. 
reptiles and appear to have diverged prior to the Of these, the turtles are the most basal divergence of the lineage leading to group. The squamates are another group and avians. The evolutionary origins of turtles which includes lizards and snakes, and are are not clear, but it appears that they diverged more recently evolved lineages. Crocodiles early from the other lineages. The Iguana-like are also a member but are paraphyletic to lizards are basal to other lizards and snakes, turtles but basal to sphenodon (dinosaurs) (which constitute about 1,000 species). and birds. The characteristics that are Gekkotan lizards also rather basal and include common to all these lineages includes the some species that are snake-like and lack fact that they all have terrestrial eggs. substantial extremities (1,500 total species). Although some of the genomes of these Snakes are the most recently evolved reptiles and animals are poorly studied, it appears that also the most diverse (2,900 species). These overall their genomes are relatively smaller rapidly diversifying species are all predators to then those of mammals and have much less others species, including lizards and other repeated and ERV DNA. Of the sauropsids, snakes. As mentioned below, snakes host lots of the birds are best studied. However, it is virus types but are especially noted for hosting now clear that the bird and crocodile paramyxoviruses as lung infections. genomes are similar. Birds. Birds are the most diverse of the Reptiles. These are diverse terrestrial terrestrial vertebrate and about 9,000 bird species species with 6,300 known members. are known. Birds are homoeothermic and have Reptiles have internal fertilization of non- evolved from the lineage. Birds are aqueous eggs. The eggs are shelled with a distinguished by the presence of feathers, bills, large yolk to store nutrients for the embryo external eggs, and complex reproductive and also provide a protective egg membrane. behavior. 
They have nucleated RBCs, which can In most, but not all, species, sex is replicate nuclear DNA virus. As mentioned environmental determined. This sex above, bird genomes are smaller and less determination is similar to that of fish in that variable across species compared to other the temperature at which the reptilian tetrapods and show slower rates of molecular fertilized egg is incubated can often evolution (although high rates of point determine the resulting sex of the offspring. mutation). They also have much less For example, in alligators, elevated endogenous retroviral elements (and other temperatures result in male offspring repeats) then animals or higher plants. Unlike whereas in turtles elevated temperatures reptiles, birds have genetically determined sex result in female offspring. Reptiles have no via a sex chromosome. However, in contrast to larval stage and are cold blooded. They are mammals, sex is not determined by the presence also characterized by the presence of a of a Y chromosome but is instead due to highly keratinized skin that uses distinct chromosome pairs. The female bird is forms of keratin protein and resists heterozygous and uses a ZW sex chromosome desiccation. The keratin skin is also unique system and the male in ZZ homozygous. As in undergoing periodic shedding with mentioned below, birds support many virus growth. The keratinized skin cells are types. produced by terminal differentiation of basal cells resulting in highly committed gene 199 Mammals. There are an estimated 4,692 monotremes, marsupials and placentals with the current species of mammals and an latter being much more diverse. additional number 5,000 prior mammalian species which are estimated to have become Monotremes have few current species. extinct. The defining biological These egg laying mammals are distinguished by characteristic of mammals is the mammary having one orifice used for both sex and waste gland, or fur and homoioendothermy. 
excretion (mono-treme). The two most studied The mammary glad appears to have evolved species of monotremes are the Platypus and from an adaptation of a mucus or subsequent Echidna. In the monotremes, ovulation is not sweat skin gland for secretion of nutrients to determined by hormonal cycling as in the feed the embryo and early young by the placentals, but is genetically programmed. There mother. Many mammals also show some are few studies of viruses of monotremen and link between mammary gland development little is known about their genomes. They do, and retrovirus production, such as with the however, seem to lack most LINE-like elements MLV association of mice (discussed below). of other mammals, although some ERVs are Mammals are called therians and closely clearly present. Also, and like the marsupials regulate their body temperature. below, monotremes, like most mammals, Accordingly, they do not determine sex of determine sex via an X Y chromosome system, the offspring via environmental but have very small Y chromosomes, on the temperatures as did the fish or reptiles. scale of 10,000 base pairs. Rather almost all mammals use an XY chromosome system for the determination of Marsupials are metatherians and bare sex (with a few interesting exceptions live young. 140 genera are known for the discussed below). All mammals have four marsupials so although much more diverse then chambered hearts and endothermy, and can the monotremes, they are much less diverse than generally respond to viral infections by their sister group, the placentals. mtDNA increasing their temperature setting. In this analysis confirms that monotremes and fever response, they are distinct from fish, marsupials and placental mammals are sister amphibians and reptiles. Mammals have groups. Although they bare live young, the been studied in the archeological record by duration of direct embryo-mother attachment is the related dentation of molars. 
All therians short relative to placentals (7-14 days), and breath by aspiration using a diaphragm. always corresponds to less then one estrous cycle Also, unlike birds, their RBCs lack nuclei. of the mother. Thus, they do not confront the Mammals evolved early, up to 148 million immunological dilemma of rejection by the years before present and before the mother’s adaptive immune system to nearly the dinosaurs, with all the characteristics listed same degree as do placental embryos. The above. However, prior to 65 million years marsupial embryo has a shell membrane, which ago, there was only one mammalian lineage, is made by the maternal oviducts and encloses which most likely resembled a rodent-like the embryo and can allow long term storage of egg laying monotreme. Multituberculates unfertilized eggs some species (i.e. wallaby). are an early mammalian lineage which can The major source of nourishment for the early be identified via dentation pattern. embryo is from the large egg yolk sac. Lactation Although they survived the dinosaur is prolonged and occurs within a patch extinction and subsequent mammalian specialized secretory skin within the pouch. radiation of about 65 million years ago, they Ovulation (estrous) is not hormonally all became extinct about 30 million years determined via estrus cycle, as in placentals, but ago for unknown reasons. The major is genetically programmed. Thus, aside form lineages of mammals alive today are the lactation control, the strong hormonal regulation of placental pregnancy is largely missing from 200 marsupials. The marsupial reproductive includes elephant shrews, elephants, aardvarks, tracts are distinctly different from those of manatees, anteaters and sloths. Current rodent the placentals. Marsupials have three species, which physically resemble shrews, vaginas which open one at a time for birth of however, evolved later. Rodents represent the young and the males have a bifurcated . 
most successful and diverse of all the placental Marsupial blsatocyst have no true species (about 2,300 species). Human (primate) trophectoderm, the layer that surrounds and and rodent lineage diverged from basal lineage supports the placental embryo. The that included bats, carnivores and the modern mammalian trophectoderm also develops horse. In placental species, ovulation and into the placenta, which is absent from lactation are hormonally controlled, via estrus marsupials. Marsupials do have a cycle. The process of lactation is mostly surrounding ‘trophoblast-like’ membrane in controlled by a prolactin hormonal system. It is their early embryos, but this marsupial interesting to note that this regulatory system is ‘trophoblast’ is not protective of embryo or very old in evolutionary terms and in its earliest involved in embryo implantation as is the appearance can be found in mucus producing trophoblast of placentals. Marsupial skin cells in fish. This observation supports that embryo parturition is very rapid relative to idea that the origin of lactation stems from such placentals and is associated with the skin cells. The most variable biological aspect of presence of inflammatory cells at the uterine placental biology is to be found in the anatomy wall. Unlike the studies of placental of their corresponding reproductive organ (the embryos, little is know about the expression uterus and placenta). Many different of endogenous or autonomous retroviruses characteristics (such as the number and in marsupial embryonic tissues. The viral- placement of embryos) will differentiate these like particles often reported in placental organs amongst the various placental lineages. embryos have not been reported in However, these reproductive characteristics are marsupial embryos. The genomes of specific to and maintained by each placental marsupials show some endogenous lineage. 
In contrast, non-reproductive organs retroviruses, but the amounts are much less (kidneys, hearts, etc) are much better conserved then found in placental genomes. As with amongst the placental species. Placentals are the monotremes, marsupial sex is also characterized by long gestation times, which determined by an X/Y chromosome system, involve an extended and intimate contact which like the monotremes, is also between allogeneic embryo and the mother. This characterized by a very small Y situation poses an unresolved issue of how the chromosome (about 10,000 bp). As this mother’s adaptive immune system fails to reject chromosome contains the male determining the embryo. The trophectoderm is known to be genes, there can be little room for the directly involved in this protective process. The accumulation of the large numbers of ERVs, placental trophectoderm has a variety of LINES and SINES seen in placental Y distinctive characteristics. For one, it is the first chromosomes. tissue to differentiate in the fertilized embryo, prior to implantation, and is involved in allowing Eutherians (placentals) includes a uterine implantation, uterine wall penetration and diverse set of animals from rodents to protection from mother’s immune response. The primates (all together about 4,400 current tropehctoderm will differentiate into onto the species). Early placental mammals appear placenta, mediating establishment of blood flow to have been shrew-like carnivores that and nourishment from the mother (via diverged into existing . The basal synsytiotrophoblasts). As described in greater placental thus remains as shrew-like detail below, placental and reproductive tissue mammals. An early taxon to have diverged (especially trophectoderm) of all these mammals from this lineage is Afrotheria, which is also associated with high level expression of 201 the RNA, gene products and particles of maintained in amphibians. The Iridoviruses of various endogenous retroviruses. 
These fish were the most commons fish DNA viruses endogeneous retroviruses are generally host and are also commonly found in amphibians and lineage specific and are highly related to the insects (but not birds or mammals). However, lineage specific LINES and SINE elements the most common RNA viruses of fish were the found in all placental species. These retro- rhabdoviruses. Although rhabdoviruses are elements are significantly more abundant commonly found in some mammals (especially then similar elements found in marsupials, bats), they are not common in amphibians, and much more abundant then distantly reptiles or birds. As mentioned above, Herpes related elements found in avian species. viruses can be found in many fish and clam Also, placental mammals are host a wide species. They are also common to essentially all family of both RNA and DNA viruses. mammals and birds (but were absent from Many of these viruses (especially the insects). Poxvirus, in contrast, were not found in persisting ones) are highly species specific fish species, but are found in insect mammal and and phylogenetically congruent with specific bird species. However, the literature available to placental lineages showing a long understand virus-host relationships in the evolutionary relationship. terrestrial vertebrates is very uneven and thus it seems possible that the viral sampling from these In summary, we see that the terrestrial host is distorted. Virus studies are largely animal lineages have acquired an array of ‘mammal-centric’ and also ‘acute-centric’. distinguishing biological characteristics that Mostly, we have studied the acute disease are maintained in a lineage specific manner. causing viruses of mammals and birds that are Particularly variable are the biological either domesticated or commercially significant characteristics associated with reproduction, to humans. Yet, as I have argued in chapter 1, it sex determination and egg development. 
As is the persisting virus-host relationship that most we will see below, there are also many often provides long term evolutionary stability. lineage specific associations with viruses. As such inapparent relationships are generally In addition, there are important distinctions not well studied, we have a very restricted with the endogenous retroviruses of these literature in this respect which will limit our lineages which will be presented. understanding of some viral lineages. In the section below, we examine the best-studied examples of viruses of the various terrestrial THE VIRUSES OF TERRESTRIAL vertebrate species. Like in prior chapters, we VERTEBRATES. will pay most attention to these well-studied systems. However, a theme to be developed in Overall viral patterns. As noted, with the this chapter will be to examine the role that evolution and diversification of the teleost persistent and especially endogenous retroviruses fish species, we saw a related diversification have had in the evolution of terrestrial of the types of virus that replicate in fish vertebrates. This will be especially true for the species. As will be presented below, placental mammals. This theme will relate to mammals and birds have maintained most, overall host genome evolution, including the but not all, of these same fish viral families. evolution of ‘lumpy genes’ and LINES in animal However, some of the intervening terrestrial genomes. vertebrates show much more restricted patterns of virus replication. For example, Amphibians and their viruses. As noted the overall range of viral families found to above, the salamander is considered to represent infect reptiles and amphibians appears very the most basal of amphibian species. reduced relative to that found in the oceanic Salamanders maintain a life cycle that is more animals. However, some relationships are aquatic then the other amphibian lineages. 
With 202 respect to viruses, however, salamanders dependent then FV3 on host nuclear systems. appear to be more like sharks in that there Unlike FV3, ENV does not require host nuclear are few reports of viruses that infect them. enzymes for its replication. Thus, these DNA No acute or persistent RNA viruses have yet viruses of frogs, although clearly iridioviruses, been reported to infect any salamanders. also more closely resemble the poxvirus in that Only infection with an iridiovirus has been they can support extra-nuclear, cytoplasmic described. Although salamanders are not DNA replication. Consistent with this line of commercially produced or otherwise grown reasoning, the Iridiovirus encoded eIF-2a is in large numbers, hence not well studied, most related to that same protein found in there does appear to be a limited set of swinepox (PK3L). These amphibian studies to evaluate. Currently, however, it iridioviruses appear to have become less seems salamanders show a dearth of viruses dependent on host nuclear replication systems, as associated with them. In the case of frogs, seen in the fish and insect iridioviruses and show iridioviruses are well established to be more similarities to poxviruses. important viral parasites. As mentioned above, frogs constitute 90% of all Iridioviruses are known for their ability to cause amphibians. Frogs are known to support the high mortality in tadpoles of various toads and replication of iridioviruses, retroviruses, frogs. Thus they are acute agents in this calciviruses, poxviruses, herpes virus, situation. However, it appears that some frog adenoviruses and polyomaviruses. Thus species are persistently infected with frogs are an established host for a broad iridioviruses and that these species are the source array of viruses. However, on closer of acute infection of the other species. 
For inspection this appears to be a most curious example, Venezuelan toads (Bufo marinus) are list of viruses and seems to be especially often persistently infected with gutapovirus underrepresented in the RNA viruses. With (GV), which is highly pathogenic to tadpoles of respect to DNA viruses, this list essentially other toads and other amphibians. It is worth resembles the list of DNA viruses able to recalling that the Ascoviruses are the most infect teleost fish, with the notable exception related virus to GV, but there are no ascoviruses that poxviruses are now present. The most in any vertebrate. The one DNA virus that has prominent DNA Iridiovirus of bony fish yet to be reported in frogs are parvoviruses, (IHNV) are also prominent amphibian which curiously have also not yet been reported viruses. In fish, LCDV was known to infect for fish. Entamopoxviruses were observed to many species. Similarly, 30 types of infect insect (especially locusts) orders, but the iridioviruses of frogs are known. A well amphibian poxviruses are the first true example studied frog iridiovirus is FV3, which was of a vertebrate poxvirus. From this pattern we originally discovered due to mass mortality might suspect that mammalian poxviruses may of United Kingdom frog species. FV3 is of have evolved from these insect poxviruses. This special interest in that this virus, unlike fish possibility is developed in detail below in the iridioviruses, undergoes DNA synthesis in section on the emergence of poxviruses. In two stages. The initial FV3 viral DNA frogs, these viruses are associated with skin synthesis is like that in fish virus, restricted lesions and haemorrhages ( also to the nucleus. However, subsequent viral associated with mammalian othopoxviruses). DNA replication and amplification is cytoplasmic. Furthermore, the cytoplasmic Amphibian Herpesvirus. Amphibians are also replication process resembles that of phage known to support Herpesviruses. 
in that DNA is synthesized into concatenated molecules that undergo headful processing into the mature virion. ENV, another frog virus, is even less

Ranid herpesvirus (RaHV-1) and Lucké tumor herpesvirus (LTHV) have both been studied in some detail. These viruses are distantly related to fish herpesviruses. Infection with these herpesviruses, as in fish, is associated with tumors, which are not metastatic and are normally benign. Infection also frequently results in surface skin lesions (mucosal and growth associated). These tumors are able to later produce virus, and this appears to provide a mechanism for viral persistence. To establish persistence the virus appears to induce a tumor. The viral DNA has been found to persist in tumors in the absence of infectious virus production when the frog is maintained at low water temperatures. At high water temperatures (a seasonal occurrence), virus production is again induced. Persistently infected cells are also immortalized by viral gene products. Thus frog herpesviruses appear to persist as tumors.

Amphibian Adenovirus. Frog adenovirus (FrAdV-1) was initially isolated from a naturally occurring frog renal tumor, but it can also be isolated from healthy wild frogs. The virus can also infect fish species. In codfish, this virus can cause epidermal hyperplasia. In amphibians, FrAdV-1 appears to generally be a persistent virus. FrAdV-1 is the smallest of the sequenced vertebrate adenoviruses (26,163 bp), and it lacks various 5’ and 3’ situated regulatory genes found in other vertebrate adenoviruses. By phylogenetic analysis, FrAdV-1 appears to represent an ancestor to both avian and mammalian adenoviruses. These latter viruses have apparently acquired additional, non-core genes during their evolution. This pattern of gene acquisition with evolution clearly resembles that of the baculoviruses discussed earlier, but is the opposite of the pattern of genomic evolution we will present with rodent and other mammalian poxviruses below (associated with gene loss). Recently, adenoviruses have been classified into three large groupings: Mastadenoviruses of mammals, avian adenoviruses and now a third grouping, which included amphibian and reptile adenoviruses. In all these groups, only 16 core viral genes are conserved, and these are mostly replication and structural proteins. With the recent sequencing of FrAdV-1, it is now apparent that FrAdV-1 is most related to the avian turkey adenovirus Ad3. Previously, Ad3 was not known to be closely related to any of the other adenoviruses. As noted in Chapter 6, adenoviruses were also described in fish species, associated with benign skin metaplasia. The fish adenoviruses show very limited similarity to other adenoviruses and appear to be the most basal of all adenoviruses. However, the morphological similarities, the similarities of replication strategy and the similar genome organization between all adenoviruses and the PRD1 phage of Bacteria have been used to argue that bacterial phage were the direct progenitors of all vertebrate adenoviruses, even though the genomes of PRD1 lack observable sequence similarity to adenoviruses.

Frogs also support replication of polyomavirus. A frog (leopard frog) polyomavirus, which induces mainly benign skin growths but can also induce kidney tumors, has been described. However, these viruses are not well studied.

Frog genomes have not yet been well characterized. However, retroviral elements are known to exist in frog genomes. Essentially all vertebrates appear to harbor some endogenous viruses related in various degrees to spumaviruses and MLV. In dart poison frogs, the Dev I, II, III endogenous retroviruses have been reported. However, so far, these ERVs appear to be defective, yet they represent a distinct family of retroviruses, unrelated to the seven currently recognized retroviral genera. The amphibian retroviral fragments are equally distant from MLV and WDSV (walleye dermal sarcoma virus) genomes. As WDSV is a fish genetic parasite, it would appear to most likely represent the ancestral retrovirus to the frog ERVs. These frog ERVs are only distantly related to avian retroviruses. It is interesting that transcripts of Dev sequence, however, are present in high copy number in ovum of frogs. Thus, as will be seen with the reproductive tissue of other terrestrial animals (both vertebrates and insects), high level ERV transcription is associated with amphibian reproductive tissue.

Reptile viruses: RNA viruses. As we noted with the salamanders, other amphibians and the most basal representatives of the reptiles (the crocodiles and alligators) all appear to show a paucity of disease causing RNA viruses. Since some farming of alligators is done, the opportunity for observing viral induced disease in these species is considerably better than for salamanders or frogs. So far, there are few RNA viruses reported for alligators or crocodiles. However, very recently, alligator farms in the southern USA have reported some infections and deaths due to West Nile virus, which itself was recently introduced into the USA. Yet, it was not completely clear how these animals became infected, as they did not appear to have acquired the infection via mosquito transmission from environmental sources. Instead, it appears that the farmed alligators were infected by being fed West Nile virus infected horse-meat. Thus, although it seems clear that alligators are seldom observed to be infected by natural means, alligators can be infected by at least some RNA viruses that are also known to infect birds, but the farming practices may have contributed to their infections.

In addition to a paucity of RNA viruses, no retroviruses have yet been observed in these reptile species. However, as retroviral infections can be rather inapparent, this failure to observe these viruses may be due to insufficient examination. Crocodiles are similar to various amphibians with regard to viral induced skin pathology that results from infection with both poxviruses and adenoviruses. For example, Nile crocodiles are known to support infection with an adenovirus, and such infections produce skin lesions.

In contrast, other reptiles, such as some lizards (iguanas) and many turtles and snakes, do appear to harbor both acute and inapparent infections with RNA viruses of various types, including caliciviruses and paramyxoviruses. In reptiles, paramyxovirus infection is primarily an infection of the lungs. The snake species, in particular, appear to present a host in which the lung habitat has been efficiently exploited by these viruses. It is not presently clear if there is some feature of snake lung biology that might contribute to their propensity to support virus infections. In other reptile species, paramyxoviruses can also establish persistent infections. One possible example of such a snake paramyxovirus has been reported with iguanas, which can harbor inapparent infections. These virus infected iguanas also appear to be the source of acute paramyxovirus infection of turtles. Interestingly, turtles generally appear to show a high incidence of acute infection with paramyxovirus, resulting in lung and other pathology. Thus both turtles and snakes seem prone to such RNA viral infections. Although snakes can be persistently infected with some paramyxoviruses, they are also susceptible to acute lung infections by various other paramyxoviruses. Numerous virus mediated die offs in snake farms have been reported, but the relationship between persistent and acute disease causing snake paramyxoviruses has not yet been evaluated. However, the diversity of snake paramyxoviruses is impressive. In fact, 16 reptilian paramyxovirus types have so far been described. Furthermore, phylogenetic analysis suggests that these snake viruses are related to and basal to Sendai virus. This result is most interesting since Sendai virus is also phylogenetically basal to all the vertebrate paramyxoviruses. The implication is that reptiles harbor paramyxoviruses (both persistent and acute) that may represent ancestral paramyxoviruses of mammals and avians. Placental mammals are also known to be host for many paramyxoviruses. However, paramyxoviruses are important pathogenic viruses of their mammalian host but establish mainly acute infections. In very few, if any, natural field studies or studies of domestic mammals do paramyxoviruses result in stable persistent infections. This observation suggests that these mammalian paramyxovirus infections could be unstable on a long evolutionary time scale and dependent on prevailing host population structures. If the snakes and other reptiles, however, harbor related viruses in a stable persistent way, they might provide an evolutionarily and ecologically stable source of virus for adaptation and infection of other mammalian species. That snakes support 16 known types of paramyxovirus supports the idea that this snake-virus relationship may be basal to and ancestral to the mammalian relationship with this same family of viruses.

Snakes support the replication of additional RNA viruses. These include caliciviruses, togaviruses, flaviviruses, reoviruses and retroviruses. For example, rattlesnakes are known to support 16 types of calicivirus. At least one of these (Cro-1 virus) is nonpathogenic in reptiles and frogs. Also, in reptiles reoviruses often show no disease and can be highly prevalent in natural populations (47% positive in some healthy iguanas), indicating a persistent life style in this host. These viruses are very similar to the strictly acute reoviruses of avians and mammals, but their relationships to those viruses have not been evaluated.

Turtles are also known to support bunyaviruses, togaviruses, flaviviruses and retroviruses. Noteworthy is the occurrence of the ‘arboviruses’ (togaviruses and flaviviruses) in turtles. These are virus types that have not yet been reported to infect any bony fish populations. In addition, various field studies have suggested that these reptile derived viruses may be the source of viral infections of other vertebrate species. Consistent with this idea, turtle, snake and lizard blood samples from field isolates are often positive for JEV, St. Louis EV, POW and VEE.

In summary, although basal reptiles, such as crocodiles, appear to have a paucity of RNA viruses, RNA viruses of other reptiles (snakes, lizards and even turtles) are numerous and diverse. These RNA viruses include both paramyxoviruses and caliciviruses. These viruses establish both persistent and acute infections in their specific reptile host. These reptile viruses also appear to be basal to the similar RNA viruses that cause acute disease in mammalian host. It is likely that these reptile viruses represent ancestral versions of these mammalian viruses.

Reptile DNA viruses. With respect to DNA viruses, snakes have been reported to support infection with iridoviruses, herpesviruses, poxviruses, adenoviruses and parvoviruses. Mostly, these viruses have not been observed in crocodiles. However, the Nile crocodile has been reported to be infected with an adenovirus. Turtles also support infection by various DNA viruses including iridoviruses, herpesviruses and polyomaviruses, the latter two being associated with benign tumors. Snake parvoviruses are also known. The parvoviruses of some reptiles (corn snake and iguanas) have shown concomitant infection with other DNA viruses, suggesting that parvoviruses may depend on prevalent infections by other DNA viruses. Many vertebrate parvoviruses are helped by adenovirus superinfection, so this may be a general relationship. These snake parvoviruses exist in two major and distinct clades, and these clades show some geographically restricted distribution as well as some species specificity. In addition, the snake adenoviruses are interesting in a phylogenetic sense, as they are basal to the adenoviruses from ducks and possum, which are themselves basal to the adenovirus of cattle. However, the snake adenovirus may be most closely related to the frog adenovirus described above. Recent sequence evidence suggests that this snake lineage of adenovirus may have jumped species, resulting in a virus adapted to cattle, explaining the existence of a distinct clade of atadenovirus that infects snakes, ducks and cattle.

Herpesviruses are also known for snakes and lizards. BoiHV-1 (boids), EpHV-1 (cobras) and IgHV-1 (iguanas) have all been studied. These viruses are often inapparent in their corresponding snake host, although the level and sites of persistence are not well studied. The iguana herpesvirus, for example, can be isolated from healthy green iguanas in the field. LaHV-1 is another snake herpesvirus that is also non-pathogenic in the ring snake. The EpHV-1 in cobras is not highly pathogenic, although it is associated with inefficient venom production. In contrast to these nonpathogenic snake herpesviruses, in terrestrial tortoises and sea turtles, herpesvirus infections are associated with high mortality and die offs. In terrestrial tortoises, CmHV induces mortality up to 100%. In sea turtles, GTFPHV is highly prevalent and is associated with tumors. It is interesting that this herpesvirus has a DNA polymerase gene with homology to the alpha herpesvirus family (mammalian/avian). Thus, as seen in various other species, the pathogenesis associated with herpesvirus infection is highly species specific.

Reptile skin and virus growth. There appears to be a noticeable shift in the relationship of reptile viruses with their host. In reptiles, these viruses no longer tend to cause growth abnormalities in infected skin as was described in chapter 6. In both vertebrate fish and also many amphibian species, infections with the large DNA viruses were mostly associated with epidermal and other tissue hyperplasia (e.g. herpesvirus and adenovirus of cod and white sturgeon). This relationship also extends to include the smaller DNA viruses of fish, such as polyomaviruses (e.g. in swordfish and in winter flounder epidermal hyperplasia). This tendency to cause benign tumors was also a characteristic of polyomavirus infection of amphibians (leopard frog skin growths and kidney tumors). That DNA virus infection can mediate effects on skin growth in fish and amphibians suggests that the induction of cellular growth and differentiation programs and the inhibition of apoptotic pathways are inherent to the viral replication strategy. It is interesting that not all fish skin has these basal cells. Shark skin, for example, lacks the basal epithelial cells present in bony fish and amphibians. Interestingly, the only DNA virus known for sharks (dogfish herpesvirus) was associated with skin necrosis, not hyperplasia. In reptiles, the skin has basal cells, but the highly differentiated scale producing cells (keratin) are terminal and could be impervious to virus infection or viral mediated growth control. In crocodiles, skin lesions, not hyperplasia, were seen with adenovirus infection. Similar skin lesions and oral lesions (but not growths) are seen with poxvirus infection of crocodiles and captive caimans. In snake species, herpes and other DNA viruses tend to be non-pathogenic, not tumor inducing. In green lizards and Bolivian turtles, polyoma and herpes like virions can be observed in various tissues, but again these infected reptile tissues are not hyperplastic. However, in sea turtles, internal tissues, not skin, were induced to form tumors by DNA viruses. Although this issue has not been systematically studied, the observations are consistent. What then accounts for this shift in the biological relationship between DNA viruses and the host skin, from hyperplasia to skin lesions? Reptiles have highly keratinized, terminally differentiated skin that is periodically shed. As noted, such a skin might be physically impervious to virus release, providing a biological barrier that would prevent using reptile skin for the purposes of subsequent virus transmission. In contrast, it seems clear that the corresponding skin lesions of fish and amphibians are able to transmit subsequent rounds of virus infection. The interference of skin mediated, or keratin inhibited, virus transmission might also provide a virus-host basis for selective pressure that contributed to the evolution of highly keratinized host skin.

Reptilian retroviruses. Both endogenous retroviruses (ERVs) and autonomous retroviruses are found in fish, amphibians, turtles and snakes. However, autonomous retroviruses have yet to be reported for salamanders, crocodiles and lizards. Many, but not all, of these reptile retroviruses are also related to ERVs found within their genome. These reptilian retroviruses, however, are a distinct group and have been given their own designation. This group includes the walleye dermal sarcoma virus (WDSV) of fish (an autonomous virus of skin hyperplasia), and this virus is the basal member of this group according to phylogenetic analysis. This group of retroviruses also appears to be monophyletic. It therefore seems most likely that an aquatic WDSV-like ancestor was the progenitor to all these reptilian viruses. It is most interesting, however, that this same family of autonomous retroviruses also shows significant homology to the human HERVs (discussed below), suggesting some evolutionary linkage to mammals as well.

There are other types of reptilian retroviruses as well. The endogenous retroviruses of pythons, PyERV, show two closely related types. However, both types are unclassifiable with other retrovirus families and are not related to known type B, D, or C retroviruses. Boid inclusion body disease (BIBD) is due to infection with a retrovirus that is closely related to the python PyERVs and may likely have been derived from such an ERV. In most pythons, the PyERVs are not well expressed. Strong expression is seen only in Python curtus, not in 5 other distinct Boid species. This situation is reminiscent of the specificity of the herpesviruses noted above. Low viral expression and inapparent infection is characteristic in its persistent host, but high expression and disease is seen when the same virus infects a related but distinct host species. All reptilian species have genomic and mostly species specific copies of ERVs, most of which are defective. As we have previously argued, these defective copies could also provide a mechanism to achieve stable persistent infections, by suppressing the autonomous virus. If so, defective ERVs could be crucial for the purpose of homologous retrovirus suppression, and thus be selected for their maintenance. Without them, or with a sufficiently different set of ERVs, the host could be susceptible to high level autonomous retrovirus expression. Such a scenario could then explain the relationship between genomic ERV derived viruses and the species specific snake infectious retrovirus disease described above.

In summary, both genomic ERVs and infectious retroviruses are known for snakes. In addition, the infectious retroviruses are similar to and likely derived from these ERVs. It thus seems likely that there has been a long term interaction between ERVs that colonize specific reptilian host, and that these ERVs retain the capacity to infect and induce disease in host not colonized by the same ERV. What is curious to consider then is how a numeric balance of ERV colonization might be achieved. What keeps the numbers of ERVs and their defectives at a relatively low level in reptilian genomes compared to the high numbers seen in all placental mammals? What type of events or selective pressures might lead to large scale shifts in the level of ERV genome colonization?

The Viruses of Birds. Bird species have descended from ancestors that are in common to reptiles and the dinosaurs. They are distinguished from those ancestors by the acquisition of warm blooded control, feathers, the avian egg shell and beaks. In addition, birds have acquired genetically determined sex (via ZW chromosomes in which the female is heterozygous). Thus, like mammals, the entire lineage has become homeothermic while acquiring chromosomally determined sex. As mentioned previously, birds are the most diverse of terrestrial vertebrates (9000 species), and the Galliform birds are the best studied as well as representing a relatively basal taxon. Since bird genomes maintain good sequence similarity to crocodile genomes, we might expect that bird viruses would resemble viruses of reptiles. As DNA viruses were a significant contributor to reptilian virus-host biology, we will consider DNA viruses of birds first.

Unlike crocodiles, avian species are known to support a broad array of virus types. However, our understanding of avian viruses is biased towards viruses that cause diseases in commercial flocks. The situation is thus similar to that with flowering plants in that our scientific literature is highly biased towards disease causing viruses of domesticated species. As with agriculturally important plants, with birds and their viruses we know very little about the natural prevalence of persisting, species specific virus infections. Avian viruses have therefore been mostly studied in the context of chicken and turkey commercial flocks, with a focus on acute pathogenic diseases that affect large commercial populations of genetically homogeneous birds. Yet, even with this focus on these specific disease causing viruses, very few field studies have evaluated the virus-host relationship in the context of the natural prevalence of these same viral infections in natural avian populations.

DNA Poxviruses of birds. Iridoviruses were common large DNA viruses of fish and amphibians and might have been expected to also infect birds. Iridoviruses are also found in insects, as are ascoviruses. However, there are no iridoviruses or ascoviruses of birds (or mammals). With reptiles and birds, we instead see the emergence of poxviruses as a prevalent type of large DNA virus. Poxviruses were also present in insect host, and it appears by phylogenetic sequence analysis that these insect poxviruses may be basal to those of birds and mammals. The poxviruses of birds display essentially all the biological characteristics of the mammalian poxviruses. Avipoxvirus, like mammalian poxvirus, is also associated with skin lesions, such as growths in exposed (unfeathered, less keratinized) skin found on the claws and around the beak. However, in flocks this virus can be highly pathogenic due to obstructive growths in the airway epithelia. The avian poxvirus is one of the largest and most complex of the poxviruses. Like many other poxviruses, it also shows a broad host range, but its natural distribution is not well evaluated. In some field studies, such as one of Swainson’s francolin of S. Africa, 44% of the wild birds were observed to be infected with avian poxvirus and to show the associated but benign skin growths. Avipox DNA has been isolated from dermal squamous cell carcinomas in natural settings. However, in some situations, avipox DNA can be isolated in a latent state from normal skin. Thus it may exist as a persistent life strategy in some specific host. It is also clear that not all bird species or populations are exposed to avipoxvirus, as they were absent from many other field studies. The avipoxviruses that have been evaluated show some variation and species specificity in terms of pathogenesis. For example, the avian poxvirus isolated from Hawaiian crows was significantly less pathogenic than other isolates. Avipoxviruses in commercial flocks thus appear to represent species jumps from poorly characterized natural sources. It seems likely that these viruses are maintained by some reservoir species in benign persisting states, which function as the source of virus to adapt to other (commercial) bird species. If this view is correct, we can also propose that the evolved gene function of many avipoxvirus proteins will be to maintain the persisting benign state of virus production, and not to promote the high level virus replication and associated disease as is observed in commercial flocks.

Herpes viruses of Birds. The best studied of the bird herpesviruses are Marek’s disease virus and infectious laryngotracheitis virus, which have important effects on commercial chicken flocks. Another bird herpesvirus is also associated with disease in commercial duck flocks. These viruses cause proliferative diseases of the lymphatic and other tissue. Resistance to Marek’s disease is associated with the B-F region of the MHC. For the most part, these viruses seem to be typical herpesviruses in terms of genetic organization and virion structure. According to sequence analysis, all these avian herpesviruses appear to be most similar to the alpha herpesvirus family. However, biologically, they do not resemble the mammalian alpha herpesviruses in that they do not establish latent or persistent infections in nervous tissue (i.e. ganglia). Instead, the avian herpesviruses have a biology that is distinct and more closely resembles the mammalian gamma-herpes viruses (such as HHV6/7), which are T-lymphotropic herpesviruses. It is interesting that HHV6/7 have repeated terminal DNA, or TRS sequence motifs, that resemble those of human telomeres. Among the other herpesviruses, only Marek’s disease virus has a striking resemblance to this sequence element. The avian herpesviruses are frequently associated with growth abnormalities, especially malignant lymphomas and atherosclerosis. In this biological aspect, they are clearly more like HHV6/7 or the herpesviruses of fish species, and not like other alpha herpesviruses of mammals. Although avian herpesviruses are also known to persist, the cellular sites and mechanisms of persistence are not known. Nor is it clear if persistence in the natural ecological setting is restricted to particular avian species.

Other Avian DNA viruses. Adenoviruses are also known for avian species. Of interest is the Egg Drop Virus due to its pathogenic effects on chickens and egg production. However, there have been few field studies that have evaluated the natural biology of these adenoviruses, so little can be said about host specificity, species jumping or viral persistence of this virus. Those few studies that have been conducted have not observed a significant prevalence of this virus in wild birds. Thus we cannot now account for the natural source of this virus. As noted before, however, these avian viruses are clearly more related to the virus found in frogs than they are to the adenoviruses of mammals. Thus avian adenoviruses may represent a distinct lineage of adenoviruses that adapted to birds from amphibian or reptile predecessors.

Avians also support papillomaviruses and polyomaviruses (small circular dsDNA viruses). Papillomatous lesions in male green finches have been reported, but these did not affect other birds, suggesting a tight species specificity. The interest in avian polyomaviruses is due to their effects on commercial aviaries, and they are especially associated with diseases of hatchlings, such as those of budgerigars. There are several interesting distinctions between the polyomaviruses of avians and those found in mammals, including human polyomaviruses. Avian polyomaviruses are simpler than their mammalian counterparts and typically have smaller and simpler early genes (T-antigens). Amongst the polyomavirus family, the two most conserved regions of all members are found within the region of T-Ag associated with ATPase activity and also within the capsid coding region. These are both maintained in the avian polyomaviruses. Thus the avian polyomaviruses appear to be clearly related to the mammalian viruses, but their simpler genetic structure has led some to consider them to be more representative of the progenitor polyomavirus. However, there are several reasons to think they are not progenitors. For one, the biologically distinct characteristics that these avian viruses display suggest that they may represent mostly acute replicators. Although both avian and mammalian viruses show preference for respiratory and excretory (kidney) tissue, avian polyomaviruses show much less host species specificity than the mammalian polyomaviruses. Mammalian polyomaviruses are highly species specific, whereas the avian polyomaviruses are able to infect a relatively broad array of host bird species (although infection appears to require hatchlings). Also, all mammalian polyomaviruses appear to establish life-long persistent infections, whereas the avian polyomaviruses appear to cause acute infections that do not result in persistence. However, there may be some exceptions to this situation in that persistence may also be much more species specific in avians. One report indicated that sulfur crested cockatoos in New South Wales had a high prevalence of polyomavirus. Thus it seems more possible that the avian polyomaviruses have adapted a simpler, acute life strategy in many avian host species, but may establish species specific persistent infections in other less studied host. Consistent with this idea, one study of wild sulfur crested Australian cockatoos reported that 64% of the birds were positive for avian polyomavirus, but other abundant wild species (galahs) and nearby domestic flocks were negative for this virus. Curiously, in neither avian nor mammalian polyomaviruses do we see the common association of virus replication with benign tumor growth that was seen in amphibian and fish polyomaviruses. This could also be related to the highly keratinized (feathered) avian skin.

In summary, avians support the full complement of DNA viruses that are found in mammals. However, there seem to be clear biological distinctions between virus-host relationships in avian and mammalian DNA viruses. The avian herpesviruses and poxviruses are generally associated with growth abnormalities in infected avian cells. Examples of neuronal persistence, such as alpha type I and type II herpesvirus or various fish herpesviruses, have not been observed in birds. However, we generally know little about the natural distributions of these viruses or their natural host, so little can be said about such biological issues.

RNA viruses of Birds: The influenza story. One of the most studied of all avian RNA viruses is influenza virus. This virus is of intense interest, not only because of its ability to cause much disease in commercial bird flocks, but also because it can adapt to human host and cause major human epidemics. Avian species are now clearly known to be the original source of influenza virus that can subsequently adapt to infection of various other animals. Influenza virus is a segmented negative strand RNA virus in which the segments exist in various alleles, allowing for segment mixing or re-assortment during mixed virus infection. This provides influenza virus with the ability to form recombinant viruses between distinct parental viral types. Influenza viruses can re-assort their eight sub-genomic segments during mixed infection, thus allowing for a recombinational process to apply to these negative strand RNA viruses. This re-assortment allows a greater degree of virus genetic adaptation and is directly involved in allowing influenza to adapt to new host species, including both other avian species and mammalian species. Within the avians, we see the first prominent occurrence of this type of virus in any host.

In replication strategy, the negative strand influenza virus resembles the rhabdoviruses, as discussed in chapter 6, and some similarity in the RNA replicase can be observed. Rhabdoviruses are also negative strand RNA viruses found in plants and insects and especially fish. However, rhabdoviruses cannot undergo recombination. Curiously, there are very few rhabdoviruses of birds. Yet rhabdoviruses do not seem to be the direct ancestors of influenza virus. A more likely direct ancestor would be the paramyxoviruses, which are also negative strand RNA viruses, but are structurally much more like influenza virus. As mentioned above, the paramyxoviruses are prevalent as lung infections in various reptile species, especially snakes. The influenza virus appears to represent a segmented version of paramyxoviruses which has also acquired the ability to undergo reassortment or recombination. Consistent with this idea, influenza viruses are not found in any lower organisms (including amphibians or crocodiles) but are restricted to avians and mammals.

Water fowl influenza persistence. Although much of our attention was focused on the ability of influenza to cause acute disease in humans and commercial bird flocks, it now appears quite clear that water fowl, such as various species of shore birds and ducks, are the major source of influenza virus for other species. These water birds have a clearly different and persistent relationship with these viruses. Influenza infections of water fowl are long term and non-pathogenic. Also, by far the greatest numbers of alleles of all eight of these influenza genomic segments are found in water fowl species, some of which are specific only to water fowl. In specific species, such as Peking ducks, influenza infection is limited to intestinal tissues and will establish a persistent non-pathogenic infection in which virus is excreted. While in this host, influenza A virus shows a very low rate of genetic mutation and is also phylogenetically congruent with the evolution of its host. A major distinction of the influenza A virus isolated from water fowl compared to virus isolated from other species (including other birds) is that there has been almost no change in the genetic sequence of the virus in ducks for the last 85 years. Dendrograms of these persisting virus isolates appear frozen in evolution, show little diversity and yield dendrogram ‘sticks’, not the usual trees characteristic of influenza quasispecies associated with human infections. It thus appears that the selective pressures that apply to a persistent virus-host relationship are exerting a clonal or purifying selection on the virus and that the coding sequence is maintained essentially unchanged. As it is clear that the influenza RNA dependent RNA polymerase, which lacks proofreading function, has a high error rate, it must be that during persistence, the initial colonizing viral genome is somehow maintained. However, as we know almost nothing about the mechanisms by which influenza establishes persistent infections in duck intestines, we are unable to understand how the viral gene products contribute to this genetic maintenance or how influenza avoids the host adaptive immune response.

Human adapted Influenza from avians. Influenza A segment evolution and human adaptation. The two influenza RNA segments that have received the greatest attention are those encoding the surface proteins associated with human immune protection, the H (hemagglutinin) and N (neuraminidase) segments. It is re-assortment of these segments, called antigenic shift, that is associated with loss of immunity and major human pandemics by influenza A virus. For example, the Spanish influenza pandemic of 1918, which killed more than 20 million people world wide, is associated with the appearance of the H1 segment, while the 1957 Hong Kong pandemic was associated with the H5N1 segments. Phylogenetic analysis now clearly argues that these segments both originated from viruses present in avian species that somehow adapted, possibly through intermediate host, to infect humans. Generally, human adapted influenza viruses have lost the ability to infect water fowl. Also, acute and highly pathogenic infections by influenza A virus are not restricted to mammals. Non-water fowl avian species are also susceptible to pathogenic infections from water fowl derived influenza A virus, and such infections appear to be commonly observed in natural bird flocks. This has been most apparent in commercial flocks of chickens and turkeys, which can be decimated by influenza A virus. In 1997 a major outbreak of influenza A in commercial chicken flocks was due to an H5N1 virus that was later traced to have most likely originated from geese. As with the human pandemics, water fowl such as geese appear to be a reservoir for viruses that cause chicken and turkey epidemics. The water fowl species harboring persisting influenza will vary geographically. In other parts of the world, such as Germany, wild Peking ducks will frequently be infected with H6N1 virus, whereas in Brazil, many wild water fowl support H1N1 or H3N2 virus. Different flyways appear to be associated with different types of influenza virus. Viral persistence in these water fowl species appears very important for the long term ecological stability of influenza virus. However, as mentioned, the viral genetic functions involved
in host persistence, although crucial for their long term maintenance, are not at all understood.

Species jumps and adaptation to acute infection. It has also been established that the genetic requirements for influenza replication in avian cells are distinct from those of mammalian cells. For example, all avian isolates appear able to agglutinate chicken, but not mammalian, red blood cells. Thus the viral gene functions needed for infection or persistence in water fowl do not directly result in genes able to function well in other species. However, domestic birds provide a concentrated homogeneous host population that can allow influenza to adapt to efficient acute replication. As mentioned above, the H5N1/97 virus has adapted to be highly lethal, killing millions of chickens. In addition, due to the threat of possible further human adaptation and disease resulting from this virus, millions of chickens were culled to prevent the possible epidemic spread of virus to humans. Although it is difficult to judge the success of this preemptive culling strategy, a human epidemic from this avian influenza did not result. Curiously, and unlike what is seen in most human influenza epidemics, in commercial avian epidemics antigenic drift is seldom involved in flock epidemic outbreaks; rather, most avian epidemics result from reassortment and acquisition of new segments. Lethality and disease in avian species are not restricted to domesticated birds in large flocks (such as turkeys and chickens). Wild peregrine falcons, which can be predators of water fowl, have been reported to die of H3N2 influenza infection, as have owls and buzzards. The prevalence of influenza mediated killing of birds of prey, however, has not been evaluated. The overall picture we are left with is that influenza viruses, although able to infect a wide variety of mammalian species, are strictly acute replicators in mammalian and non-water fowl avian species, and thus require large populations for stable viral transmission. Once adapted to humans, however, these viruses may lose adaptation to birds and become unable to replicate in duck intestine. The human specific influenza B and C viruses are most likely examples of viruses that have adapted so thoroughly to their acute human host that they have lost all ability to replicate in avian species. Yet phylogenetically, these human specific viruses still appear to have been derived from avian viruses in the recent past (two hundred years). The various strains of human specific influenza A virus are, however, unstable and tend to be either lost or displaced from the population with time.

In summary, the avian/human influenza situation may be our best studied example of the relationship between persistence and emerging viral disease. It seems clear that influenza virus has been maintained on an evolutionary time scale as a persistent infection of specific water birds. However, our focus on human disease has not led to an understanding of this asymptomatic virus-host relationship or why it results in such a remarkable genetic stability for the virus. It also appears that the ability of new versions of these viruses to emerge from persistence and adapt to human and other species is an unending phenomenon, resulting in endless waves of acute influenza epidemics associated with high genetic change.

Paramyxoviruses of birds. Birds also support infections with non-segmented negative strand RNA viruses of the Paramyxovirus group. As with influenza virus, much of the attention given to the study of avian paramyxoviruses is due to the impact that such infections have on commercial bird flocks. In this regard, Newcastle disease virus (NDV) has posed the biggest problem and causes serious economic losses. Overall, it appears that NDV is introduced into commercial flocks from exotic feral birds, such as parrots, pheasants and doves. The natural source of NDV, however, is not completely clear. Some field studies (Germany, South Africa) report a very low prevalence of NDV in wild native birds (less than 1%). However, one study from Australia reported that NDV was highly prevalent in wild Anhinga and that these infections were not pathogenic, but infected birds shed virus via the intestine. Also, one field study examined recent Australian isolates of NDV in sentry chickens and observed that, unlike the NDV that was responsible for acute respiratory disease in flocks, these recent isolates were much less pathogenic and were initially shed via the intestinal tract, not the respiratory tract. The implication of this observation is that NDV can exist as a stable nonpathogenic infection in specific bird species but is likely to adapt to other bird species and cause acute disease. It thus appears that paramyxoviruses may resemble influenza A virus in this persistent-to-acute host biology. Other paramyxoviruses of birds are Sendai virus and avian paramyxovirus (APV), which like NDV are also associated with lung infections. These viruses can also sometimes be found in wild bird populations, but they are also frequently absent in other field studies. The host source of these viral infections remains unknown, as it does not seem possible to maintain acute epidemics in some of these wild bird populations. In terms of phylogenetics, it is interesting to emphasize that Sendai virus is phylogenetically basal to all of the mammalian paramyxoviruses. As noted above, most mammalian paramyxoviruses are known not to establish persisting infections in their host. However, due to the clear similarity of Sendai virus to the more basal paramyxoviruses found in snakes, it seems most likely that an evolutionary trail can now be proposed for the origin of mammalian paramyxoviruses and influenza virus. The paramyxoviruses of reptiles (as currently found in snakes) were the likely progenitors of the avian paramyxoviruses (Sendai) and the avian influenza A viruses, and both these virus families were able to establish evolutionarily stable persistent infections in specific avian species. However, both of these avian virus families have also been able to jump species and adapt to new hosts, infecting various mammalian lineages as acute infections. Thus the paramyxovirus lineage may trace its evolutionary origins from reptiles, through avians to mammals, in which reptiles and avians appear to establish some species specific persistent infections.

There are some other distinctions between RNA viruses of birds and other species. As mentioned in chapter 6, rhabdoviruses of fish are very common. Yet avian rhabdoviruses are very rare. However, two novel and unclassified rhabdoviruses from birds have recently been reported during surveys for encephalitic arboviruses.

Plus stranded RNA viruses of birds: Arboviruses. In evolutionary terms, arboviruses were first observed in fish hosts, but clearly represent a significant infection of birds. In birds, they are mainly mosquito transmitted. Arboviral infections have frequently been associated with significant die-offs of various wild bird populations. Most recently, a crow die-off due to the establishment of West Nile virus in the eastern United States has been seen. Birds (and reptiles as noted above) may be the main host for many of these viruses. Generally, arbovirus infections do not persist in infected birds, so how the viruses are maintained in the ecosystem is not always clear. However, large bird flocks might be big enough to support the chain of transmission of acute arboviral infections. In this case, persistence in the bird host might not be necessary for viral stability, as the virus will migrate with its host. In some cases, it has been suggested that virus might persist in the insect vector as well, but this does not appear to be a general situation. There are also reports that some birds might harbor persistent infections, but these are so far poorly documented. It is clear that arboviruses represent a significant biological parameter that affects the size and structure of bird populations. It has, however, not been well studied how such a relationship might also have affected bird evolution. There are some examples of other bird RNA viruses which appear able to persist in their host. Duck hepatitis virus (a picornavirus) appears to be highly prevalent and persistent in ducks. This virus may also represent the evolutionary progenitor to other hepatitis viruses. Also, some bird coronaviruses, such as avian infectious bronchitis virus, can persist in infected birds. However, the ecology of these viruses and their hosts is poorly studied, so we are unable to comment much on evolutionary issues of these relationships.

The autonomous and endogenous retroviruses of birds. The long-term interest in the retroviruses of birds relates to the observations by Peyton Rous early last century that a transmissible virus was causing infectious sarcomas in chickens. Years later, it was determined that the virus involved was a retrovirus, an avian sarcoma virus. Early on, it was also apparent that there was a genetic component to the occurrence of these bird tumors, which led to an intensive period of research into the genetic basis of avian tumor viruses and eventually included the study of endogenous retroviruses of chickens. The literature has long developed under the notion that avian retroviruses are prevalent infections and genomic agents of all birds. However, with time it has become clear that, compared to the mammals, avian retroviruses are much less prevalent in non-galliform bird species than was originally suspected. With the contribution of genomics and sequence analysis, several evolutionary patterns have become clear. In most natural bird populations, retroviral mediated tumors are rare. Unlike the situation with fish retrovirus (WDSV), few if any field studies of natural bird populations have reported a significant prevalence of retrovirus mediated tumors. It has also become clear that the tumors seen in chickens are often related to the presence of endogenous retroviruses related to the specific breed of chicken. Endogenous retroviruses (and tumor production) can actually be bred out of most chicken lines. In evolutionary terms, the Moloney class of C-type retroviruses (MLV) is the most well conserved lineage. As mentioned in earlier chapters, the Ty3-Gypsy class of RT elements is within the MLV class and has been conserved in most invertebrate and vertebrate animal lineages. The avian ERVs (ASLV) are also related to MLV. However, phylogenetic analysis suggests that the avian retroviruses have been acquired by 19 galliform birds (which carry ASLV gag genes) but appear to have originated from horizontal transmission from a mammalian source, followed by a rapid spread into related avian lineages. REV related sequences (C-type, not ALV-like) have been found in some wild birds, but most birds were healthy and lacked tumors. One report on Attwater’s prairie chickens did observe that REV was present in tumors, although captive flocks remained healthy and did not develop antibody to the virus. The pattern of ERV distribution in birds is distinctly different from that which we will describe below for the mammals.

Galliforms and retroviruses. Galliforms are known to host three families of retroviruses: ASLV (sarcoma/leukosis), REV (reticuloendotheliosis) and LPDV (lymphoproliferative disease virus of turkeys). Avian retroviruses tend to cause lymphoid and hematopoietic proliferative disease in domestic birds. Several classes of endogenous retroviruses have been observed. One class, CH-1, is present at about 10 copies/cell in some chicken lines but can be bred out. RAV-0 is another complete avian ERV that also codes for an env gene and is related to avian myeloblastosis virus. The pattern of ERVs in the red jungle fowl, the non-domestic ancestor of chickens, is not well evaluated. It is known, however, that Art-CH elements (which are non-coding) are found in pheasants. TERV (tetraonine endogenous retrovirus) is currently the only known complete non-chicken endogenous retrovirus and is found in ruffed grouse. This ERV appears to have been acquired into this lineage early in phasianid evolution. Avians also appear to have a distinctly different control over the activity of endogenous and autonomous retroviruses compared to mammals. 25 distinct species of galliforms are known. For the most part, ERVs are not phylogenetically congruent with these species and appear to represent more recent colonizations of germ lines. Another distinction between mammals and avians is that avians do not globally suppress the activity of retroviruses present in an early embryo as do mammals. This applies to both endogenous and autonomous retroviral activity in chickens, in that RAV-0 expression in a chick, or egg infection with ALV, results in high level retrovirus expression in most of the organs of the resulting adult bird.

Avian Genomes and retroviruses. Bird genomes are significantly smaller as well as less variable across bird species relative to the genomes of other tetrapods. For example, a bird genome is about 2.8 pg/cell compared to the 8.0 pg/cell of the mammalian genome. As the total number of genes in mammals and birds does not differ by this amount, much of this difference is due to greater amounts of non-coding DNA in the genomes of mammals. One suggestion for this difference is that birds are under selective pressure to maintain light cells, which could limit acquisition of non-coding DNA. However, similar patterns are seen for crocodile genomes and flightless birds, so it is not clear that the ‘light cell’ hypothesis could apply. Some have suggested that the evolutionary rate in bird genomes is slower than that of mammals and that this has limited the acquisition of non-coding DNA. However, recent measurements indicate that bird genomes have a 3-6 fold higher level of single nucleotide polymorphisms (SNPs) relative to that of the human genome. One issue might be that bird genomes are older, hence have accumulated more SNPs with time. Regardless of this possibility, it is clear that bird genomes can and have changed significantly during evolution, but the pattern of change is distinct from that of mammals. The basic question is why bird genomes are so much less colonized by ERVs than are mammalian genomes. As outlined above, there is direct evidence that avian retroviruses exist and that genomes can be colonized by ERVs, so there seems to be no structural barrier to ERV acquisition. In contrast, all mammalian genomes are colonized by large numbers of lineage specific ERVs and even larger numbers of ERV derived degenerate retroposons (discussed below). There also appears to be some link between ERV colonization and the sex chromosomes, in that sex chromosomes appear especially prone to ERV colonization. Unlike crocodiles and amphibians, in which sex is determined by ambient environmental temperature, birds are warm blooded and have genetically determined sex, involving Z and W chromosomes. Here, the female is heterozygous (ZW) and the male homozygous (ZZ). In chickens, ERV-Z chromosome associations are also known. In broiler chickens (White leghorn), the endogenous EV21 virus is directly associated with the sex-linked large broiler body mass of the bird. In addition to the intact EV21, other complete ERVs are also known to be associated, such as EV3, as well as some defective retroelements. Late feathering and EV21 reside on both the Z chromosome and autosomes. So we see that the general tendency of ERVs to colonize sex chromosomes also applies to chickens. EV21 loss from the Z chromosome has a phenotype and is associated with early feathering. EV21 can also produce infectious virus that can be sexually transmitted to the eggs and hatchlings of females lacking EV21. Thus EV21 can be transmitted both via the germ line and as a virus. What then might be the role of EV21? Can it protect against the known and related autonomous retroviruses? It is interesting that exposure of EV21 harboring birds to ALV infection did not affect the response of chickens to this virus or ALV mediated tumor induction. In fact, the endogenous EV21 may have actually increased ALV induced tumors for hatchling infections. Thus EV21 did not protect chickens against ALV or ALV tumors. Furthermore, avian EV21-like ERVs are not uniformly maintained in all bird lineages. For example, some strains of geese (Chinese, Synthetic, Embden) lack any ERV-related sequences as measured even by low stringency hybridization to EAV pol (AvSLV). Clearly ERV colonization is not always favored in avian lineages.

In summary, we can see that infectious avian retroviruses are well established, especially in domesticated birds and the Galliform species. However, these viruses are not common and are seldom if ever observed in natural avian populations. In addition, ERVs residing in the genomes of avians are also well established, some of which can reside in sex chromosomes and be sexually transmitted as infectious virus. However, bird genomes are much less colonized by ERVs and ERV defective derivatives (LINEs) than are the genomes of mammals. Also, avian ERV colonization is variable, not seen in all lineages and not uniformly associated with the origin of various bird species.

THE VIRUSES OF MAMMALS.
A sub-chapter on ERVs.

In the amphibians, reptiles and birds, we saw that all these lineages had some ERVs within their genomes. But it is in the mammals, especially the placentals, that we see an explosive and lineage dependent increase in ERV colonization. This high level colonization is especially apparent in both the X and Y placental sex chromosomes. We also see that mammals support a broad array of other virus types in general. The early mammals were egg laying, monotreme-like and shrew-like organisms (i.e. multituberculates) that evolved even before the dinosaurs. Tracing their history to about 210 million ybp, these early mammals survived the dinosaur extinction, only to become extinct themselves about 30 million ybp with the radiation of the placental species. The biggest distinctions between these early mammals and current mammals are the reproductive organs and the reproductive strategy. Although existing monotremes remain as egg layers, the marsupials and placentals do not lay eggs and have developed distinct reproductive strategies for embryo development. All mammals have mammary tissue, which developed from modified skin glands that respond to prolactin and proliferate. Although not exhaustively examined, most mammals also appear able to produce various ERVs in association with the development of mammary tissue. The mouse virus that was first recognized with this association was mouse mammary tumor virus (MMTV), which produces milk borne particles during lactation. We now know that MMTV represents a very old and conserved class of endogenous retroviruses that are also defective in all mammals. Monotremes appear able to support infection by autonomous retroviruses as well. However, neither the acute viruses, the endogenous viruses nor the genomes of monotremes are well studied, so we are currently unable to fully classify the viruses and ERVs present in these species. Thus we can conclude little about monotreme retrovirus-host relationships. Monotremes also have a very small sex determining Y chromosome, so small (about 10,000 bp) that it would be incapable of coding for more than one or several intact ERVs. Clearly, it cannot be highly ERV colonized.

Marsupials are better studied with respect to their viruses and genomes, although the literature is still rather thin on this subject. Marsupials are known to support the replication of various types of virus, from herpesviruses to retroviruses to mosquito borne arboviruses. Wallaby herpesvirus (WHV-1) appears to be a member of the alpha group I herpesviruses (which includes the avian MDVHV of turkeys). There is also a Parma wallaby herpesvirus (PWHV). This virus is prevalent in field populations and has been isolated in 23% of wild wallabies. These viruses tend to establish inapparent infections in their native host, although the tissue site of persistence is not known. It is also not clear whether these viruses are able to cause acute disease in other related hosts, as do the mammalian herpesviruses. PWHV replicates in all marsupial cell lines examined to date, but does not replicate in most eutherian lines. Thus these marsupial viruses appear to recognize inherent differences in marsupial and placental host cells. So it is curious that this virus has a broad marsupial species specificity. The possible mechanisms involved in cell type restriction have not been evaluated.

Retroviruses are also known for marsupials. A fat tailed dunnart marsupial cell line is known to produce D-type and A-type env containing retrovirus particles. However, so far, no disease is associated with this virus. In addition, an endogenous retrovirus of koalas (KoRV) has been described that is ubiquitous in that species and resembles gibbon retrovirus. In fact, it has been suggested that this koala virus may be the original source of the gibbon ape leukemia virus due to cross species transmission during mixed captivity. KoRV has not yet been associated with any disease of koalas. Although the examples are relatively few, it is nevertheless clear that marsupials can and do support retroviruses and also have endogenous retroviruses. However, it is also clear that, as a whole, marsupial genomes are significantly less colonized with retroviral elements (including LINEs and SINEs) than are the mammalian genomes, although these elements are clearly more abundant in some marsupial lineages relative to avian genomes (see Jurka et al, 1995). This presents a curious situation since there does not appear to be a global system in marsupials that prevents ERV colonization, thus it might be expected that marsupial genomes should resemble those of mammals in this regard. Along these lines, it has been reported by O’Neill that, in contrast to placental mammals, hybrid offspring between two reproductively isolated species of marsupials will undergo global activation of retroelements (ERVs), and that this genome wide reactivation might limit the success of such species hybrids. Regardless of this hypothesis, however, there appear to be no obvious barriers to ERV colonization in marsupials. Like the monotremes, the marsupials also have a very tiny Y chromosome, which would similarly appear unable to support substantial ERV colonization.

In summary, although monotreme-like mammals are old (predating the dinosaurs), we know little about the viruses or genomes of these early mammalian predecessors. Both existing monotremes and marsupials are known to support DNA viruses (herpes) and retroviruses. However, little disease is associated with either virus. Both these groups also harbor much reduced levels of ERVs and LINEs and have much smaller Y chromosomes relative to placental species.

Eutherians and endogenous virus expression. ERVs are known for all vertebrates. Mostly, they are lineage specific and conserved, with only a few examples of what appear to be species jumps (such as in snake species). With the placental mammals, we begin to see much more evidence of relatively recent species jumping of various ERV sequences. However, it is still the case that most of the older ERVs are conserved within mammalian lineages as well, so there seems to be a general increase in colonization, not ERV loss. Thus, it appears that in all the mammalian lineages, ERV transposition and genome colonization have become more active, but we lack any explanation for this observation. However, this ERV activity seems to have mostly occurred early in placental evolution. For example, the integration of human ERVs (HERVs) occurred after the split with the great ape lineage, yet these ERVs do not generally show polymorphisms. This suggests that most of these ERV integration events are not recent but are instead associated with the origin of a specific mammalian lineage and have been stably maintained since that origin event. Placental species represent the most diverse lineage of the mammals. Within the placentals, the rodents represent the most diverse family of placental mammals. The relationship of placental organisms to their viruses and the organization of their genomes are in general very well studied. Placental species support a very rich array of viruses.

The most variable feature of placental biology is their reproductive biology, along with differences in reproductively associated tissue. It is especially the biology of the uterus that varies between placental species. The placental estrus cycle is not genetically controlled as in the marsupial, but is hormonally regulated via estrogen cycles. However, all placentals share some common biological features in regard to embryo development. The most distinguishing feature between placentals and marsupials is the presence of the placenta itself, which surrounds, nourishes and protects the embryo in the uterus. Unlike the maternally derived shell and shell membrane that surround avian eggs, the placenta develops from embryonic tissues to make the trophectoderm, an endoreduplicated layer of cells that surrounds the fertilized embryo. This is the first layer of cells to differentiate from placental embryos; it provides protection via immune suppression, allows penetration of the uterine wall and mediates the establishment of the fetal-maternal blood exchange. These complex interdependent features were apparently all acquired at the origin of the placental lineage. It is noted that these same features are reminiscent of those needed by parasites to colonize a host. In this case the embryo resembles the parasite and the mother the host. It is thus highly interesting that the placental ova, the trophectoderm and the uterus are all tightly associated with high level production of ERV particles, retroposon (LINE/SINE) transcripts and gene products (env). Following early mouse-based observations from R. Huebner and colleagues, in a series of papers published in the late 1970’s and early 1980’s, Jay Levy and colleagues reported that embryonic tissue (syncytiotrophoblasts) from human, baboon and mouse all expressed large quantities of endogenous retrovirus particles (classified as xenotropic viruses in mice) and suggested that the expression of these particles might be part of the normal developmental program and evolutionary process of the placenta.

As discussed above, all placental genomes are highly colonized by lineage specific ERVs as well as their corresponding LINE derivatives. Placentals also have a much larger Y chromosome (about 60,000,000 bp) than the other mammals, and both the X and Y chromosomes are also highly colonized by lineage specific ERVs and retroposons. Placental ERVs are expressed, and the trophectoderm and placenta have been especially examined with regard to ERV expression. The trophectoderm and the early embryo are in fact globally derepressed for ERV expression. This is a paradoxical situation since it has been argued by many that ERV repression (via DNA methylation) must be a genome defense system, which would be needed in the early embryo for protection against ERV colonization prior to the differentiation of the germ line. Ironically, the opposite is true. The early placental embryo is open to ERV expression, and it is after the germ line differentiates that embryonic DNA undergoes global DNA methylation and ERV suppression. It is for this reason that early studies of retrovirus infection of mouse embryos (endogenous MoMuLV, MOV 3 mice) resulted in mice that carried the retrovirus in the germ line, but were suppressed for virus expression in adult male tissue. Curiously, MOV 3 proviral sequences were subsequently observed to amplify in the progeny of females, not males. More recently, this problem of ERV methylation in mouse embryos has been circumvented by using vectors for infection that have features that prevent cellular DNA methylation. Trophectoderm and early embryo thus have undergone DNA demethylation, which allows a global activation of ERV (and LINE) expression. Also unique is that the trophectoderm is one of the few tissues that will express paternal specific genes via an imprinting process. One such paternally expressed gene is IGF-II, which is a major modulator of placental growth. Also, in the trophectoderm of female (XX) embryos, chromatin undergoes a methylation mediated process of X gene inactivation (curiously mediated by an RNA sequence that resembles LAT of herpes virus type I). However, this derepressed state of ERV embryonic expression is transient. As soon as the totipotent embryonic blast cells undergo commitment, their DNA becomes methylated and ERV expression is globally suppressed.

In summary, all placental genomes are highly colonized by lineage specific ERVs and even more highly colonized by the ERV derived LINEs and SINEs. Placental organisms also have much larger Y chromosomes than other mammals, and these are also highly ERV colonized. The trophectoderm, which differentiates the placental embryo from the marsupial embryo and develops into the placenta, expresses high levels of ERVs, due to demethylated DNA in the early embryo.

The problem of endogenous retrovirus nomenclature. As mentioned, observations that normal human placental tissues are expressing high levels of ERV particles date back to the 1970’s. It was observed that human trophectoderm (syncytiotrophoblasts) was producing large numbers of retroviral-like particles that could be directly purified from the tissue. However, efforts to show that these particles were active as infectious retrovirus were uniformly unsuccessful, thus they appeared to be replication incompetent or defective ERVs. Similar ERV particles are made in mouse trophectoderm and are called intracisternal A-type particles (IAPs), and also in the placenta (RD114 particles). The initial nomenclature that was applied to such viruses referred to ERVs as either ecotropic or xenotropic viruses. Both are very similar endogenous retroviruses. But ecotropic viruses will replicate in the cells of the native host, whereas xenotropic viruses will not replicate in the native cells, but will only replicate in cells of another species. The particles being made by trophectoderm were xenotropic, since they did not make virus in the native cells. This historic nomenclature is confusing and unfortunate. Both virus types are essentially the same ERV (i.e. have the same internal genes), but have acquired a different env protein. For example, all mouse cells of all Mus species harbor xenotropic viruses that will not replicate in these cells. These viruses resemble a lambda lysogenic state in that the persistently xenotropic virus infected host is immune to their replication. However, these viruses can often replicate in other host species. Ecotropic viruses on the other hand are not uniform and are not found in all hosts, not even all lab mouse strains. They are products of selection and genome rearrangements. Basically, an ecotropic virus is a xenotropic virus that has acquired an env gene that will allow it to replicate in native cells. This historical emphasis on tropism has obscured the relationship between exogenous viruses and ERVs. However, an unfortunate nomenclature also applies to other areas of ERV biology. The various forms of human ERVs that were later discovered were given a bewildering series of names (RTVLH, HTDV, MSRV, ERV3, K-T47), and sometimes the same ERV would have multiple names. Later, a better naming scheme became common, in which the letter that designates the amino acid corresponding to the tRNA primer used to reverse transcribe the ERV RNA was used for nomenclature. According to this scheme, human endogenous retroviruses (HERVs) are given letters to designate the primer: K = lys, W = trp, R = arg, L = leu, H = his. More recent classification of human ERVs has identified 22 HERV families in the human genome. This scheme, however, is sometimes confounded by the existence of apparent chimeras of retroviruses which appear to have recombined two lineages of ERVs.

Human endogenous retroviruses K and human evolution. Humans originated in Africa about 1 million ybp, but can be differentiated from the other great apes by ERV acquisition. The human genome has 22 independent ERV families, but 6 of the HERV Ks are new to the human lineage. HERV W, discussed below, is one member of this large family of HERVs. The human genome has 25,000 HERV K related LTRs. HERV K related sequences are also found in the human LINE-1 element. This is a non-LTR poly-A retroposon that retains HERV K RT pol sequences. SINE-R is also HERV K related in that it is a poly-A retroposon with a 5’ LTR that retains some env sequence from HERV K. The L1 LINE elements are present at about 100,000 copies per cell, and these elements are also highly transcribed in embryonic tissues. Curiously, as noted above, HERVs don’t show polymorphisms in human populations at DNA integration sites, which suggests that HERVs were acquired early in human evolution but are stable with respect to their integration site. The most abundant class of HERVs are the HERV Ks. The HERV K family shows sequence similarity to MMTV (and also Jaagsiekte retrovirus). HERV Ks have been grouped into 18 LTR clusters, of which cluster 9 is only found in humans, but cluster 1 is much older and is found in old world and new world monkeys. In the human genome, 10 HERV Ks appear to be recently acquired in evolution, and 9 of these are unique to the human lineage. The human genome has 20-50 copies of full HERV K elements, yet none appear to be replication competent. As noted, there are also 25,000 HERV K related LTRs in the genome. Yet these HERV K sequences have maintained a virus specific nucleotide bias, which suggests they are under some positive selection.

The ‘HERV K(10)’ super family of HERVs is composed of 6 HML groups with about 50 members found in old world primates (except for chimpanzees). HML means a human MMTV-like virus group (a hormone responsive ERV), based on RT and env similarity. HERV K(10) (aka HTDV) was originally observed to be produced in teratocarcinoma cells at a very high level of viral particle production and was associated with the differentiation of these cells.

various normal tissues, including trophectoderm. It is interesting that there is no exogenous virus that resembles HERV K(10). It is also worth noting that SINE-R elements are abundant human specific retroposons that have HERV K env like sequences and are present at about 5,000 copies per cell. One of the highly conserved genes of HERV K10 is the dUTPase gene. HERV K10 is the phylogenetic parent of all HERV Ks and also encodes a protease that will properly cleave the HIV encoded protein.

HERV K (-OLD) is another family member and appears to be the ancestor of the HML-2 group in humans. It has an intact open reading frame that codes for a viral env. It has also conserved the central motif of the dUTPase gene in an active form. The human HML-2.HOM sequence has a central gag sequence with a deletion of 96 a.a. relative to the sequence found in old world primates. This deleted sequence has undergone amplification of HERV R in the human lineage.

HERV W is another member of the large HERV K family that also encodes a complete gag and env and, as with most HERVs, no replication competent virus has been seen. HERV W is specific to catarrhines (Old World monkeys and apes). However, the HERV W env is now known to encode the human syncytin gene, an essential and functional membrane protein involved in the fusion of trophoblasts and highly expressed in the placenta. HERV W resembles a type C/D chimera virus (which is MMTV-like) and is present at 20 copies per cell. Thus HERV W is at least one clear example of an acquired ERV that provides an essential function for the placenta.

Various other HERVs have also been studied. HERV H has about 1000 elements and is thus one of the largest HERV families. However, within this family, it appears that only 3 of the env genes are expressed. HERV K T47D (D-type) has also been reported to be expressed in placenta and mammary carcinoma. This virus is
interesting in that antibodies against its env Terratocarcinomas are embryonic tumors sequence have been observed during pregnancy that are able to differentiate to produce and these antibodies were able to cross react with 221 HIV env. ERV3 is also known as an ERVs of other placental species. As HERV-R, which also encodes a complete mentioned, in addition to the human ERV env gene that is expressed in the placenta associations outlined above, the placentas of (synscytiotrophoblasts) and in differentiated most mammalian species also appear to show embryos. It does appear that associations with ERV production. Baboon syncytiotrophoblasts are able to produce placental tissues will produce an ERV that will various other types of HERVs. For also cross react with antibodies against HIV-1 example, patients with and RT and , as well as cross reacting with SIV lymphomas have been reported to produce p27.B. Thus, although little is known about this Ab that were reactive to syncytiotrophoblast. ERV, baboon placenta at least resembles the In the case of patients with trophoblastic human placenta in ERV production. Rhesus tumors, one of the antibodies being made monkeys also expresses a D-type ERV that is was shown to react to HERV K gag/env 20% similar to mason-pfizer virus (MPV) in proteins. We are thus left with a their placenta. Simian endogenous retrovirus bewildering array of associations between (SERV) is similar to the ERV found in baboons the origins of placental species, the (BaEV). Both of these primate ERVs are also acquisition of larger X and Y chromosomes, related to HERV W, which encodes synsytium and the colonization by ERVS, which are described above. HERV W sequences can be often expressed in and can be functional in found in the genomes of great apes. However, placental tissues. These associations intact BaEV is found only in Baboons, but not in suggests that there may or must be a more great apes or humans. 
A general pattern thus causal association between ERV acquisition starts to emerge in which recently acquired ERV and the origin of placental mammals, such sequences will generally differentiate the various as seen with HERV-W. Therefore it appears primate lineages from one another. plausible that the massive ERV colonization seen in all placental genomes contributed directly to the complex life strategy of these Mouse ERVs. Mice (mus musculus) provide a placental orders. particularly well studied system for the evaluation to ERVs. All known mus species In summary, all placental species have have within their genomes IAPs; the mouse ERV acquired sets of intact endogenous for intercisternal A-type particles. These ERVs retroviruses in significant numbers and have also sometimes been classified as expression of these ERVs in embryonic and xenotropic viruses. Typically, these ERVs are reproductive tissue is common. Although classified according to their LTR sequence confounded by a historically confusing similarities, which are used to cluster the types nomenclature, these ERVs can now best be of IAPs. The mouse genome has about 900 classified according to these intact versions copies/genome of IAP. IAPs were initially that are conserved within each lineage. discovered in embryonal carcinoma cells (EC LINES and SINES are related to these cells), which are embryonic tumors that are able ERVS and are also lineage specific. In to differentiate. The IAP particles are produced humans, the HERV-K family is conserved in high numbers in association with EC and is the most human specific ERV. trophoblasts differentiation and accumulate in HERV-K includes some family members the cytoplasm but do not result in infectious that are unique to the human genome. Some virus. However, some IAP sequences can also of these HERVs (HERV-W) code for encode for an intact env sequence. 
One such proteins that serve essential function in the sequence is IAP E, which is also expressed in placenta. mouse trophectoderm cells. In addition, mouse oocytes are known to express the env sequence of an MMTV-like ERV at fertilization, but this 222 expression will subsequently decline with hypothesis for the origin of live birth’. The embryo development. Defectives of IAPs concept is that ERVs are directly contributing are also highly expressed from the mouse gene function that led to the origin of the genome. The best known of these defective placenta and the ability of the embryo to escape elements is VL30 and each mouse strain immunosurvellence. If this idea can be appears to have a unique and characteristic experimentally tested, it would seem mice would set of VL30 elements. Thus, VL30 provide the best system for this. However, even replication incompetent retroposon are in mice this hypothesis is difficult to evaluate derived from a defective IAP, and is experimentally due to the heterogeneity and expressed in late embryo but is also complexity of ERVs. In humans, that HERV W expressed by many established mouse cell codes for synsytian strongly support a role for lines. Another mouse specific ERV is these ERVs in embryo development or placental MuRVY. This is a Y chromosome specific function. Various experimental approaches ERV that is found in all mus (but not other might work for further evaluation. For example, rodent) species and also codes for an intact it seems possible that stimulators or inhibitors of env sequence. ERV function could help elucidate such roles. However, there are been few if any systematic Hypothesis for mouse ERVs and live evaluations of this possibility. It is known that birth. The possible relationship between drugs that stimulate ovulation have also been mouse ERV activity and embryo reported to increase C-type particle formation, development has not been well evaluated. 
such as in mouse oocytes (expressing an MuLV- Other rodents (, hamster) also have their like env). RT inhibitors are not generally felt to own types of IAPs and these are highly affect pregnancy, but some reports have conserved within these corresponding suggested that early events (implantation) in lineages. For the most part, these other embryo development can be inhibited. rodent IAPs are distinct from those of Mus. For example, probes specific for the One study did attempt to globally repress all Hamster IAP do not cross-react with the mouse IAP production in in vitro produced IAPs of mouse. The experimental blastocysts and observed that these IAP evaluation of possible IAP function is made repressed blastocysts failed to implant. highly complicated by the numerous copies However, no other direct evaluation has been and versions of IAP sequences present in the attempted. There is, however, some indirect mouse genome. This constellation of IAPs evidence on this issue. A correlation between would interfere with most experimental embryo non-rejection and tumor growth has been genetic approaches that could be employed noted. During pregnancy, it has been observed to inactivate them. However, it might be that some tumors are able to grow in pregnant possible to use genetic methods to focus on female, but not in nonpregnent females or male the much smaller number of env encoding mice. There is also some relationship between sequence, but this has yet to be done. An mating, the birth of offspring and the induction interesting issue to evaluate is to determine of endogenous retrovirus in some mice. The if IAPs have any direct role in a normal mating of female Balb/C mice with C57 BL/6 mouse pregnancy, especially with respect to male followed by immunization of paternal the non-rejection of the embryo or other lymphoid cells into offspring results in the ‘parasite-like’ embryo biological activities. 
activation of the EmV1 ERV and the Given the established capacity of retroviral development of a subsequent AIDS-like or acute env genes to suppress immunological leukemia disease. However, this is seen only in reactions, this possibility has been proposed the mixed offspring of multiparas mice, not in in various formats (see recommended virgin females. Nor is this ERV reactivation reading) and is here called the ‘retroviral seen in the F1 of Balb/C male and balb/C female 223 mating. Although it is difficult to interpret population, MMTV has been observed to be these observations, they do suggest a link produced in the milk of about 50% of mice between the non-homologous male Y examined. Yet these MMTV positive mice did chromosome, ERV reactivation, immune not show evidence of disease or breast cancer recognition and pregnancy. from these infections. In contrast to this high incidence of MMTV isolation, MuLV is not Infectious virus from ERVs. It is worth normally isolated from most feral mouse noting that the above mating also resulted in populations. This is in stark contrast to other the development of an ERV into an mouse viruses (such as mouse hepatitis and autonomous retrovirus, able to cause mouse parvovirus), which are highly prevalent in disease. Similar observations had all Mus populations. Yet some specific wild previously been made. For example, it has mouse populations have demonstrated long been known that mouse ERVs (such as pathogenic MuLV infections. This occurred in a AKR) can become active in embryonic abandoned squab farm known as ‘Lake Cassitas’ tissue, leading to the selection of a in Southern California. Like most house-mice in replication competent retrovirus able to the USA, the feral mice of this region are mainly induce disease. In the AKR mouse, initial Mus musculus domesticus, originally derived expression a replication incompetent AKR- from northwest Europe. It now appears that ERV occurs in the early embryo. 
This initial Mus castaneus originally from East Asia was AKR-ERV lacks a functional env. also introduced to this region. Mus castaneus However, this defective virus expression harbors the Fv-4 MuLV-like endogenous does select for the genetic acquisition altered retrovirus along with the Fv-4R locus, which env, which results in a infectious virus and confers resistance to Fv-4. Fv-4R resistance initiates additional selection for a second locus is itself another defective MuLV-like round of virus that will eventually cause the provirus that expresses a gp70 env gene that leukemia. In both these examples, the initial appears able to block the infection by Fv-4, but embryo expressed ERVs were non- not block MuLV. The Fv-4/Fv-4R combination infectious, but did allow for the subsequent can be considered a persistence or addiction selection of infectious variants. This same module according to our previous arguments. process is probably also involved in the When Mus castaneus mates with Mus musculus generation of xenotropic viruses from domesticus, the F1 offspring will sometimes ectotropic viruses described above. acquire the Fv-4 endogenous virus without the corresponding protective Fv-4R locus, thus Hypothesis of Mouse ERVs and losing the stable persistence module for control protection against exogenous retrovirus. of Fv-4. These F1 mice will then start producing Various researchers have suggested that the Fv-4 virus, which results in premature death due ERVs are mainly maintained in genomes to to lymphomas and paralysis. Thus a natural inhibit exogenous retroviruses that can cause outbreak of pathogenic MuLV-like infection was disease. This suggests that in order to observed in this specific feral mouse population. maintain a positive selection for ERV However, as neither the Asian and European race function, an autonomous version of the virus of these mice were native to or evolved in should be prevalent in the host population. 
California, this represented the biological Some experimental support for this idea has meeting of two reproductively isolated lines of been presented. However, other results fail the same species of mouse mediated by human to support this view. Surveys of wild mice activity. Both of these mouse lines had have generally failed to find evidence of previously acquired distinct endogenous ongoing disease by exogenous retroviruses. retrovirus persistent modules. But these two MuLV and MMTV are the best studied in mouse lines had now been introduced into a new this regard. In feral mouse (Mus) inter-breeding habitat. Some of the resulting F1 224 offspring were now incompatible with more often behave like regulators of themselves. respect to ERV control. The resulting Another hypothesis has proposed that mouse progeny were in a sense no longer fully ERVs were involved in the origin of live birth, viable or compatible with respect to providing the embryo with various controlling their corresponding endogenous characteristics needed for vivipary (immune retroviruses. Some have pointed to this evasion, cell fusion, embryo invasion). Some result to argue that it demonstrates the experimental support for this hypothesis exists. emergence of acute retroviral disease from the genome. However, this result does not support the prevailing hypothesis that ERVs ERVS of other mammals. The proposed role are conserved in order to protect against of ERVs in placental biology would be expected autonomous or exogeneous retroviruses. to be general issue that is applicable to all Instead it supports the hypothesis that the placental species. However, other placental ERVs present in the genomes were species are not as well studied as mouse or persistence modules, preventing the human, but some results are relevant. 
In reactivation of the same stable persisting domestic cats, the placental trophectoderm is ERV, not protecting against prevalent observed to express high levels of the FELV exogeneous virus present in the ecosystem. RD114 endogenous retrovirus. No RD114 This mating dependent ERV reactivation related autonomous virus known (such as FLV), situation also resembles the EmV1 story we thus there is no reason to suspect that RD114 is have described above (or the reactivation of providing protection against any existent plant genomic viruses presented in chapter exogenous retrovirus. Interestingly, the receptor 7). Such results can also be used to argue for RD114 has been characterized to be the for the idea that ERVs also have a role in neutral amino acid receptor, which is also reproductively isolating lines of the same expressed in placental tissue. The presence of host species, contributing to the formation of this receptor in placental tissue is interesting species. from an immunological perspective. The receptor is involved in tryptophane transport. Summary of mouse ERVs. The summary However, low tryptophane prevents T-cell of the combined ERV-mouse studies are recognition via IDO production. Low complex. Mice, like all other rodents, have tryptophane results in Th1 type cytokine bias, lineage specific sets of ERVs (IAPs) that are and such a bias is a characteristic of pregnancy. both intact and defective (such as VL30). In contrast, a Th2 biased cytokine response is Many of these ERVs can be observed to be associated with abortion. In fact, IL 4,5 and 10, highly expressed in embryonic and placental are all inflammatory cytokines that can terminate tissue. Such ERVs are typically pregnancy and are also associated with high noninfectious (IAPs). However, in some embryo loss rates (up to 30%). 
These lineages (AKR), these ERVs can acquire a circumstantial observations suggest a link functional env gene resulting in autonomous between FELV RD114 receptor and the and disease causing retroviruses. In some immunological status of the placenta. mouse lines, these viruses can be induced by mating. Although a hypothesis has been Ungulates of all species also have an ERV presented that ERVs are maintained to related to JSRV (Jaagsiekte sheep retrovirus) inhibit exogenous retroviruses, this called enJSRV. enJSRV is highly expressed as hypothesis often fails to explain results. virus or viral gene products in various Instead it appears that specific ERVs reproductive tissues, including the placental constitute persistence modules that will sincythium, the cyto-trophoblast but especially suppress the exogenous replication of the highly expressed in the sheep uterus. Most very same ERV. In other words, ERVs ungulates have about 20 copies per cell of this 225 JSRV-related ERV, but bovia species has HERV W. It is interesting to note that it has only a few ERV copies. The env and capsid recently been established that the extinct Wooly proteins of JSRV are some of the most mammoth also underwent an ERV L highly expressed proteins of the sheep amplification. This ERV L amplification did not uterus. However, its role in uterine biology also occur in modern elephant species. Humans has yet to be elucidated. It is known that the can be distinguished from chimpanzees by the enJSRV virus can adapt to become an acquisition of HERV K10 HML and the exogenous virus causing lung and nasal corresponding SINE R element. Thus ongoing adenocarcinoma in sheep and goats. ERV acquisition is characteristic placental evolution. Thus, overall, we see an evidence of Pigs have 3 known endogenous retroviruses lineage specific ERV colonization in all (PERVs) that have correspondingly distinct vertebrates, well before the evolution of env sequences. There are at least two placental species. 
These early ERV colonizers complete copies PERV copies known for the are maintained, typically in small numbers, but pig genome. However, the expression of additional, much more numerous and specific these PERVs in reproductive tissue has not ERV colonization events are highly correlated yet been evaluated. The main concern in the with the vertebrate placental species. Humans study of these PERVs has been that they can also be distinguished from their primate appear able to replicate in some human cells. relatives by these same ERV colonization This implies that pig tissue to be used for patterns. xenotransplantation might pose a risk of possible infection with the PERV found in all pig tissue. Y-chromosome, heterochromatin and ERVs. In placental species, the Y chromosome is of particular interest due to its large scale (6,000 Overall pattern of placental ERV fold) DNA expansion relative to the Y of acquisition. All placental lineages have marsupial and monotreme mammals. Much of acquired lineage specific versions of ERVs. this expansion is due to the colonization by However, we know that some ERV families lineage specific ERVs. In human Y were acquired into host genomes well before chromosome (the only sequenced Y the evolution of vivipary or the placental chromosome to date), we can find intact and orders. For example, all vertebrate lineages defective HERV K, HERV L, HERV W and have some version of MLV-like Ty1/copia defective derivatives (SINE-R). As they are on elements. Within the mammals, all appear the Y chromosome, these elements lack to have MMTV (class II K) elements. In homologous recombination with other terms of placentals, some ERVs are chromosomes, thus they are the most maintained common to all species, such as low levels of and distinguishing genetic elements of placental HERV L, MSRV/W and related LINE species. This makes the Y chromosome an elements. 
However, all placental lineages outstanding genetic marker for tracing ancestral have also undergone an expanded lineages. Many of the Y ERV elements are colonization by their own lineage specific found within regions of condensed ERVs and associated LINES. This has been heterochromatin. HERVs have typically been well studied in the context of human considered as selfish DNA elements, hence they evolution. All prosimians have HER VIP are expected to be selectively neutral in their and all old world simians have HERV F, host, assuming they don’t interrupt active genes. K10, W, and have also amplified their levels As inactive genes are generally maintained of HERV L. African great apes, like within heterochromatin, ‘selfish’ HERVs might humans, have also amplified the level of be expected to favor colonization of HERV L and have additionally acquired heterochromatin. In autosomes, ERVs are also 226 present in large numbers. Chromosome 21, However, since most retoposons are persisting for example, has 225 genes, 3,000 DNA silent elements, accumulation into elements, 50,000 repeat elements, and 2,000 heterochromatin might be a way to attain genetic retroviral elements. Yet HERVs don’t show silence and persistence. The Y chromosome is polymorphisms at the site of integration especially notable for containing high levels of which would be expected if they retained heterochromatin and retroposons, but these transpositional activity. Human genome has retroposons are also the most distinguishing notable ‘gene deserts’ or area lacking coding elements form closely related lineages (such as sequences that gives it a more uneven human and chimpanzee). distribution of genes then lower animals. In fact, we might consider the entire Y Both the X and Y placental chromosome seem chromosome to be such a gene desert. especially prone to ERV colonization. 
The X Curiously, the distribution pattern of these and Y chromosome share some limited regions gene deserts is also maintained in the of homology, but not enough for recombination placental mouse genome. to repair or maintain some of the Y chromosome. These regions of X-Y homology appear to have How might we understand the forces that led been recently acquired as they occurred after the to the accumulation of human transposons? divergence of the hominids and primates. It is Most human TEs do not appear to be now known that 1/2 of all human Y genes are mobile. The DNA based transposons in the ampliconic (retroposon). These ERV human genome all appear to have been acquisitions must therefore be often, recent and inactivated as well as most LINES. Yet, specific to human evolution. The HERV K LTR humans have twice the amount of LINES occurs in 12 copies on X, 10 on the Y, and relative to chimpanzees, and chimpanzees several copies on autosomes (12q24). SINE-Rs have more LINES then gorilla and are human specific LTR repeats derived from orangutan. Thus, although they are mostly HERV K10 and occur on the Y in high numbers. now inactive, ERVs have recently been Other intact ERVs also accumulate on the Y acquired during the evolution of the human chromosome. The Y chromosome has one of the genome. ALU sequences are retroposon few open env ORF of HERV K, which appears processed 300 bp psudogenes, which to be a complete HERV and is also present in correspond to the nuclear estrogen receptor gorillas and chimps. In addition, an intact ERV superfamily. ALU’s are very highly 3 and an HERV H. are also found on the Y. repeated and found only in higher primates. Clearly there is a strong bias towards sex All this lineage specific retroposon chromosomes ERV and retroposon colonization accumulation is difficult to explain if the and accumulation. It is these very ERV changes bulk of genomic transposons are inactive. 
of the Y chromosome that most distinguish According to the selfish DNA hypothesis, as human from chimpanzee genomes. Thus we transposable elements (TE’s) have no need to consider the possible roles for HERV phenotype, their equilibria within the involvement in the evolution of human specific genome should depend on their genetic attributes. These human attributes would include stability. Also to prevent interference with human cognitive function, human speech, and active genes, they should be overrepresented associative learning, which also involves the in heterochomatin, thus TE’s in formation of social attachments, essential for heterochromatin should be more stable. But human society. However, there are only a few results of genomic analysis indicate that no studies that address the potential involvement of retroposon family is more unstable then sex chromosomes or ERV related sequences in other families. Therefore the accumulation such human attributes. For example, Xq21.3 and of retroposons into heterochromatin is not Yp block have been linked to handedness and associated with instability in euchromatin. psychosis, both are distinctly human 227 characteristics, which are also both related accumulation of such sequences. Yet even with to human language capacity. In addition, such broad autosome colonization, there is a SINE-R.C2 –like transcripts are homologous curious variation between species. For example, to cDNA isolated from schizophrenic brains, human and mouse chromosomes are both of unknown function. These issues are colonized by LINEs and SINEs. However, there discussed further below is an overall difference between the genomes in that these retroposons corresponds to nearly 20% The Y chromosome shows curious of the human but only 8% of mouse genome. In variability in other placental mammals as addition, although the human and mouse LINES well. 
In mouse species, MuRVY ERV is the murine repeat on Y, which is present at 500 copies per cell but only in Mus species. This mouse sequence differs significantly from that of the Syrian hamster, where the entire Y chromosome is heterochromatin and fully 1/2 of the chromosome is occupied by ERVs (IAPs) with little sequence similarity to the mouse IAP. This suggests that in most Syrian hamster tissue, genes are not expressed from the condensed Y; thus this Y is expected to have little if any coding potential. However, since all silent chromatin is derepressed following DNA demethylation in an early embryo, even this otherwise silent Syrian hamster Y chromosome is most likely actively expressed at this time. What then is the importance of so much ERV sequence on the Y? There seems to be little need to express Y genes in most tissues. Do all placental species need to maintain sex chromosome ERVs for embryo-specific expression? If so, what function might be served by such expression? And also, how is it possible for the more recently evolved mole voles to have entirely lost their Y chromosome if these chromosomes maintain some function in the embryo? Do they make embryonic ERVs from other genetic locations? Clearly there remain some major mysteries about placental ERVs and the Y chromosome and embryos.

Our emphasis on the Y chromosome should not distract us from the fact that the ERVs and their related LINEs also colonize the autosomes. In the case of these chromosomes, however, recombination would be expected to eliminate or correct the

are distinct, they are curiously often at the same chromosome positions (especially the SINEs), and this includes their positions on the X and Y as well. Such differences might lead us to think that the human genome has more active retroposons than the mouse chromosome, yet various measurements of DNA mobility suggest the opposite, that the retroposons of the mouse chromosome are much more active than those of the human chromosome. It thus seems clear that the lineage specific ERV colonization events of all placental species were associated with the origins of each of those species and were not due to subsequent transpositional events.

Summary of ERVs and sex chromosomes. What can we summarize with respect to X and Y chromosomes and ERVs? We can start by posing several questions to give us some perspective. Why are the placental sex chromosomes so prone to colonization by these agents? What is so different from the Y chromosome of non-placental mammals that would lead to this big difference? The recent sequencing of the human Y chromosome gives us a picture that resembles a dizzying house of mirrors. The redundant, inverted and complex nature of the repeated sequences made it technically challenging to sequence, but also very difficult to understand their significance. The conundrum is that the Y chromosome has mainly retroposons and so few protein coding sequences (perhaps as few as 20-30) and some Y chromosomes are entirely condensed. Yet Y chromosomes conserve retroposons within one lineage. The patterns of ERV colonization in the sex chromosomes represent the biggest genetic differences between otherwise closely related species. Thus the sex chromosomes represent the most dynamic of all chromosomes, reflecting large changes that occur during speciation. But the Y chromosome is also the most stable and, ironically, it can be used to directly trace lineages for many generations. Can these ERVs and their defectives be the result of colonization by persistence modules? If so, how might such persistence modules affect host evolution? Many of the retroposons of the Y chromosome are highly expressed in many embryonic, sexual and placental tissues, so perhaps herein lie some answers to the questions posed above.

HIV and Retrovirus evolution. HIV 1 represents a newly emerged human retrovirus that was able to cause a worldwide pandemic. Understanding the origins of this major human disease has taken a lot of effort and a long time, but it now appears we can present a most plausible scenario for the origins of this virus. The HIV human disease has an extended character in that it can take years to kill its human host. However, this disease basically represents a long acute viral disease that takes a long time to accumulate lethal damage. Thus, it is important to first understand this ‘extended-acute’ nature of this human disease. During the time HIV takes to kill, disease is mainly caused by the depletion of stimulated human CD8 positive CTLs, ultimately killing its human host due to immunological incompetence. However, this extended duration does not really resemble a typical persistent infection in that it is both an ongoing productive and a lytic infection. This acute infection is highly efficient relative to most viral infections. HIV 1 replication in CD8+ CTLs is so highly productive of progeny virus that each infected cell yields about 10,000 virions. This staggering degree of viral production always results in the lysis of the infected CTL. Typically, there is no true biologically silent or latent state of HIV infection. Anti-retroviral drugs keep the numbers of infected cells at low levels, but the lytic replication in CTLs is still occurring at low rates. Although there have been some reports that inactive or latent HIV 1 genomes are present in resting CTLs, these silent genomes do not contribute to the biology of subsequent HIV replication. These latent HIV ‘stored’ genes do not appear to come up later in typical HIV infection. HIV 1 infections also display the high degree of genetic variability associated with acute infections. This high level of genetic variation is mostly limited to the env sequence and the resulting clades vary by about 10%/year. The variation also follows a punctuated pattern. This seems to result from the deletion of T-cell sets due to virus replication that subsequently leads to evolution of new receptor/env specificity. Thus HIV resembles an ‘extended-acute’ life style in its human host.

A persistent origin for HIV? Unlike the situation with AKR and other mice, no human genomic HERV appears to have been a predecessor to HIV 1. Thus HIV is not a recombinant of or a reactivated form of a human endogenous retrovirus, suggesting that it originated in another species. The closest relative and likely progenitor of HIV 1 is the simian virus, SIVcpz, found in certain populations of chimpanzee. Both the human and chimp lentiviruses have a vpu gene, distinguishing these viruses from all other retrovirus families. Some troops of wild chimpanzee (Pan troglodytes) in certain regions of central Africa are observed to support SIVcpz, but do not develop the AIDS-like disease. Thus in chimpanzee SIVcpz is nonpathogenic and persistent. However, the distribution of this SIVcpz in wild chimps is highly restricted (absent from much of central Africa) and it does not appear that SIVcpz is a long-term prevalent infection of most wild chimpanzee populations. This observation leads to the question of where SIVcpz itself may have originated from. However, SIVcpz is also present in various African monkey species. In all, about 2 dozen different African monkey species have been shown to harbor SIV in wild populations. When African monkeys are SIV or HIV infected, their cells respond sluggishly to the infection so they do not seem able to support the highly productive lytic infections of HIV in human CD8+ CTLs. However, if the African SIV virus is used to infect Asian monkeys, these animals develop AIDS-like disease, suggesting monkey species specificity to virus persistence. Yet, the versions of SIV as found in African monkeys do not appear to have been a direct predecessor to HIV 1.

SIVgsn can be isolated from spot-nosed monkeys in Cameroon and is highly divergent from SIVsyk. However, it has a vpu homologue that was previously thought to be unique to SIVcpz/human lentiviruses. In addition, SIVgsn has a complex genome structure, similar to that of HIV, and it has an env gene that is also related to SIVcpz env. Other types of SIVs can be found in other monkey species, such as Cercopithecus nictitans (SIVmonNGI), which also have a vpu gene. Recent sequence analysis now supports the idea that chimpanzees were indeed the source of the SIV which adapted to be HIV 1 in humans. However, chimpanzee infection with SIV is neither prevalent nor old, so it too appears to have been a recent viral introduction to specific chimpanzee populations. This sequence evidence suggests that SIVcpz is itself a recombinant between two monkey SIV viruses. Both of these are monkey species that are preyed upon (eaten) by chimpanzees. Thus, in monkeys, SIV viruses are ubiquitous, persistent and appear to represent old evolutionary infections. HIV is an acute disease of humans. All human acute viral diseases, even HIV 1, appear to be traceable to persistent virus in a non-human but highly specific host. HIV 2 also appears to trace its origins to persistent infections of monkeys, in this case to the sooty mangabey monkey (SIVsmm) as a natural host. As is typical of most persistent infections, SIVsmm shows little sequence variation in its native host, but is highly variable in humans. It is interesting that the simian endogenous retroviruses (SERV) are members of the Baboon virus complex which resemble lentivirus, but are fixed in the genomes of all old world monkeys. This may define a lentivirus-based persistence module in the genomes of these monkey lineages. Monkeys thus appear to be the natural and original source of lentiviral systems that can undergo adaptation to cause acute human disease. SERVs are absent from the genomes of apes or humans but SERV appear to be ancestral to BaEV and SRV.

Summary of HIV origins. HIV 1 is a newly emerged human virus that appears to have originated in other primate species, but adapted to humans. The human disease, although of extended duration, nevertheless resembles an acute lytic infection. Although a virus that can be found as a persistent infection in some chimpanzees appears to be the direct progenitor of HIV 1, this chimpanzee virus itself appears to have been recently acquired from a mixture of two stable and persistent monkey viruses. Chimpanzees are known to eat these two monkey species.

THE PLACENTAL DNA VIRUSES

Placental DNA viruses: the Herpes viruses. The Herpes viruses appear to be common infections of animals from clams, to fish, to most aquatic and terrestrial vertebrates. Thus herpesvirus hosts span all the way from oysters to wallaby to human. As far as can be determined, all placental species appear to harbor species-specific versions of herpesvirus. Herpesviruses are classified into three major groupings according to biological and sequence similarities: the alpha, beta and gamma herpesviruses. The alpha and beta herpesviruses are for the most part phylogenetically congruent with their host, suggesting long term host co-evolution. The gamma herpes viruses are similar in several respects to the beta herpesviruses, but gamma genomes are more variable, suggesting more recent species jumps have occurred in this viral lineage, from beta herpes ancestors. Thus herpesviruses might be more broadly classified into the alpha herpes viruses and the beta/gamma herpesviruses. All herpesviruses appear to have descended from a common ancestor. It is assumed that this ancestor is best represented by the herpesviruses of either clams or vertebrate fish. Within herpesvirus genomes, the most conserved regions are 5 blocks of genes that code for both structural and enzymatic functions. The biological properties of the virus groups are also well conserved (lymphotropism, latency in neurons, etc.). From the perspective of human biology, it is interesting to consider why humans are host to so many types of herpesviruses, each of which appears to have an old relationship with its human host. Eight types of human specific herpesvirus are known: HSV 1, HSV 2, CMV, VZV, EBV, HHV 6, HHV 7 and HHV 8. All these viruses are ubiquitous and result in lifelong infections. VZV and HSV persist in different types of ganglia (neurons). Alpha herpesviruses (HSV 1, HSV 2, VZV) infect mucosal epithelial cells and are latent in neurons or ganglia. Beta herpesviruses (HHV 7, HHV 6) persist in and are tropic to T-lymphocytes. Beta herpes viruses show some genetic similarity (at TRS) to Marek’s disease, and also to lymphotropic bird virus, suggesting some bridge-type connection in the evolution of these viruses. Gamma herpes viruses are biologically more similar to beta herpesviruses and are represented by EBV, HHV 8, mouse MHV4 and Wildebeest virus.

Human herpesviruses are represented in most primates. For example, ancestors to EBV are present in cercopithecine species and the EBV nuclear antigen is conserved in simian lymphocryptoviruses. However, there are some differences between human and primate herpesvirus evolution. Monkey B-viruses (Cercopithecine herpesvirus 1) are alphaherpesviruses. These are endemic in Asian macaques (old world only) and most similar to HSV. The virus found in marmosets appears basal to these primate herpesviruses. Herpes B virus is acquired at sexual maturity, thus may be transmitted by close contact. In their native monkey host, the viruses are persistent and normally show no disease, with inapparent mucosal reactivation. However, when transmitted to humans, the virus is always symptomatic, causing acute CNS disease, and the resulting death rate can be high, up to 70%. The B virus from rhesus monkeys may be most lethal to humans. Various species have strain specific B virus, which can be found at 80-100% prevalence. Sometimes, oral lesions have been reported in monkeys, but no genital lesions are seen. This is in contrast to the segregated biology of oral HSV 1 and genital HSV 2 in humans. This separation of HSV viral oral and genital habitats seems to have been a human specific development. The monkey B virus is the nearest relative to HSV 1 (the next nearest is VZV). It seems possible that behavioral differences in humans, such as frontal sex, may have led to isolation of oral and mucosal habitat and divergence of the two HSV types. The mammalian alpha herpesviruses are co-speciating with their host. However, the evolutionary pattern of alpha herpesviruses in avian hosts is different and generally not co-speciating. Curiously, no alpha herpesviruses have yet been described for any rodent species.

Beta/Gamma herpesvirus evolution. Although more recently discovered, the Kaposi’s sarcoma associated herpes virus (KSHV or HHV 8 - new nomenclature) may present a good model to understand gamma herpesvirus evolution. KSHV is a gamma 2 virus (rhadinovirus) and is the most recently isolated human gamma virus. It was recovered from an AIDS patient and it appears that HIV immunosuppression led to HHV 8 reactivation. However, the virus is otherwise very inapparent, and within one patient, the viral genome is stable. Four stable HHV 8 genotypes are known worldwide. This family of related viruses is found in new and old world monkeys, in human and chimpanzee. Recently, additional chimpanzee and gorilla specific versions of this virus family have also been described. KSHV is endemic in central Africa. KSHV also clusters according to its human host populations and has been used to trace human migration out of East Africa into the rest of the world (similar to HPV and JCV). KSHV transmission is primarily via familial associations. The virus shows very low rates of recombination. According to phylogenetic analysis, the gamma viruses appear to have descended from Beta viruses (via DNA pol similarities). It appears that EBV evolved in Africa, with ancestral viruses present in cercopithecines, as noted above. Both these viruses most likely descended from alpha herpes predecessors. Broad phylogenetic analysis indicates that, as a group, the gamma herpesviruses are the most complex of all herpesviruses and also the most recent to have undergone large scale genetic changes. Currently, it appears that the gamma 2 viruses are clearly co-speciating (with New/Old world host), but at a much higher rate of change than for alpha and beta herpes viruses. HHV 8 is similar to Murine gamma herpes virus (MHV-68), which is well studied and known to persist in spleens and peritoneal cells. Thus gamma herpes viruses can also be found in rodents, unlike the alpha herpesviruses which have no known rodent viruses.

Herpes viruses of other placental species are also known. The tree shrew herpes virus (THV) is associated with malignant tumors in some situations, but most shrews infected with THV-2 are healthy. In cattle, Bovine herpes virus (BoHV) is also known. As a herpesvirus, BoHV is unusual in that the virus crosses the placenta from the mother to infect the fetus. Bovine herpes virus exists in two types, BoHV-1 and BoHV-4. BoHV-4 appears to have descended from an ancestral virus of African buffalo. In these hosts, the virus is co-speciating. Phylogenetic analysis suggests that domestic cattle got herpes via species jump from an African buffalo-like virus about 700,000 ybp. As the original wild species that led to domestic cattle is unknown or extinct, it is not clear if this herpes jump is associated with the cattle predecessor to domestication. This bovine virus is lethal in sheep. Farmed deer are also susceptible to herpesvirus infections resulting in malignant catarrhal fever. This deer herpes virus is the closest relative to the herpesvirus of bison. Elephants also harbor their own species specific persistent and inapparent versions of herpesviruses, and these viruses can jump elephant species to cause acute fatal disease. This is observed when Asian and African elephants are housed next to each other in zoos, and the herpes viruses of both the Asian and African elephants can cause fatal disease in the other elephant species. There is also a Feline herpesvirus known. This virus has been shown to persist in small cat populations, establishing that its maintenance is not host density dependent.

In summary, the alpha herpesviruses are evolutionarily the oldest herpesvirus lineage and appear to be the ancestors of the beta/gamma herpesviruses. This virus family can be traced back through algal viruses and bacterial phage. Overall, these herpes viruses conserve biological characteristics and core replication genes. Mostly, they have persistent life strategies and establish life-long persistent infections, although they often jump species to cause acute disease in related hosts. Their evolution is mostly congruent with that of their host, in some cases highly host congruent. Curiously, the gamma herpesviruses, but not alpha herpes viruses, are found in rodents, yet the gamma viruses are the most recently evolved. Humans support an unusually large number of herpesviruses.

POXVIRUSES: A MAJOR EXAMPLE OF DNA VIRUS EMERGENCE.

The poxviruses of moles, rodents and domesticated animals. The poxviruses represent the largest and most complex of vertebrate viruses. Interest in this family of viruses stems from the ability of the human specific smallpox virus (Variola) to cause major human epidemics. These epidemics have had devastating effects on human populations, especially in the New World, following the introduction in the 1500s of Smallpox by Spanish conquistadors, resulting in a virgin soil epidemic. In historical terms, smallpox is responsible for more human disease and death than any other virus. Smallpox initiates human infection through the respiratory route, but results in a systemic disease characterized by the infection of sebaceous glands of the skin, resulting in the characteristic skin pox. Transmission is not as highly contagious as that of measles virus, and occurs during the pox phase but does not occur during the prodrome phase. Thus transmission is not inapparent, nor does virus persist in humans or any other host. Smallpox virus is a strictly acute disease agent, dependent on host population structure and density to maintain a chain of transmission. Because of this, smallpox was susceptible to eradication by vaccination. This eradication was accomplished by the World Health Organization in 1978 using vaccinia virus as a live vaccine. The vaccinia virus infection normally results in one small pustule at the site of infection and does not lead to systemic infection. Vaccinia is sufficiently similar to smallpox virus to provide immunological cross protection, but is also similar to cowpox virus in that it is able to replicate in other species such as cows and horses, rodents and chicken eggs.

Until recently, however, the question of the origin and evolution of human smallpox virus was difficult to address. It is clear that smallpox virus could not have been maintained by small pre-agricultural human populations (about 10,000 ybp), suggesting it adapted to people after this time. The earliest reference to the disease can be found in text from ancient India, thus smallpox appears to have originated in the Indian subcontinent several thousand years ago. Radiation into Europe appears to have occurred via North Africa during the Arab expansion. However, it was clearly absent from the New World and Pacific Islands. It also seems highly likely that smallpox virus initially evolved in another host, but was able to jump species and establish itself as a successful acute viral agent that became specific to only humans.

Other vertebrate poxviruses. Poxviruses of numerous other vertebrates are also known, including monkeys, cows, horses, and camels. In all of these hosts, the corresponding poxviral infections are also acute and do not persist. Although these viruses can clearly replicate in their corresponding namesake host, in fact, it now appears rather clear that some of these poxvirus names are really misnomers. Both Cowpox and Monkeypox present good examples of misnomers. Both of these viruses infect their namesake host but also can infect humans. However, in natural ecosystems, both viruses are persistent viruses of specific rodent species. Cowpox can be found in Bank voles (Clethrionomys glareolus), field voles (Microtus agrestis) and woodmice (Apodemus sylvaticus) and shows a high prevalence in these species in northern Europe. Infected voles, in particular, appear able to maintain the virus in the natural habitat. In Bank voles, cowpox is always persistent and inapparent and animals show few signs of infection. Also, cowpox virus transmission is not dependent on vole population density and appears to occur via close social or sexual interactions. Although mice (Mus) and rats (Rattus) are present in the same habitats and can also be cowpox virus infected, these species show an acute disease and cannot maintain virus in the ecosystem. Furthermore, natural transmission between species is rare in field studies, thus interspecies transmission is not typical of the biological maintenance of cowpox in the natural habitat. In terms of human infection with cowpox, humans are most commonly infected by contact with infected cats, which being active hunters will prey on infected rats. However, cats are not highly susceptible to cowpox infection, but if they are also infected and immune suppressed with FELV, cats become five-fold more susceptible to cowpox disease. Human cowpox disease is normally self limiting and localized. However, humans infected with HIV 1 can acquire lethal cowpox virus infections. It thus seems most likely that cowpox has evolved as a virus that can persist in bank voles as its native Northern European host. However, the virus readily jumps species to replicate as an acute agent in other rodents, cows, cats and humans.

An almost identical biological pattern applies to Monkeypox, but with different host species involved. Monkeypox is an acute disease of Central African monkeys and is most prevalent in Zaire. Humans will often acquire infection by the consumption of monkey ‘bushmeat’, which can result in an acute infection, almost as lethal as smallpox virus. However, it is the Giant Gambian rat that is persistently infected with asymptomatic Monkeypox virus. Although field studies have not yet been performed, it is expected that such a rodent specific persistent infection will allow the stable maintenance of monkeypox virus in this habitat. Recently, a Giant Gambian pouched rat, asymptomatically infected with Monkeypox, was imported into the United States. The rat was co-housed with prairie dogs, which became acutely infected with monkeypox, resulting in an outbreak of human infections. Monkeypox virus shows considerable sequence similarity to cowpox virus. Thus Monkeypox might really be considered the persisting Giant Gambian Rat poxvirus.

Poxvirus phylogenetics; rodent cowpox ancestor to smallpox. The sequence relationships amongst the various poxviruses have recently been clarified with the completion of numerous poxviral genomic DNA sequences. Of the mammalian poxviruses, phylogenetic analysis supports the idea that the cowpox virus represents the most basal of this vertebrate family. The various host specific poxviruses, including vaccinia and smallpox virus, appear to have descended from an ancestor best represented by cowpox virus. In addition, it was noted that the cowpox genome is more complex than many other poxviruses and codes for a greater number of immunomodulatory genes than are present in the smallpox genome, for example. These observations along with the above discussion now allow us to propose a scenario for the evolution of smallpox consistent with the known facts, but also consistent with the evolutionary consequences of persisting and acute life strategies as we have outlined throughout this book. Various specific rodent species, especially old world rodents, have a stable and persistent relationship with poxviruses. It is thus likely that the basal poxviruses have co-evolved in a persistent life strategy with these hosts. However, these rodent specific poxviruses have a natural tendency to replicate as acute virus infections in other vertebrate hosts, including other rodent species. In some cases, where supported by appropriate and dense host population structures, these rodent derived poxviruses have adapted to be acute agents in these other hosts. Generally, this has involved loss of genes, presumably genes associated with persistence in the original rodent host. For example, cowpox virus has 39 ORFs that differ from or are absent from vaccinia and smallpox virus. With sufficient selection, however, this adaptation to an acute life strategy in a new host can become irreversible, such as with smallpox, which became a human specific acute virus. This proposal is very similar in strategy to that which we proposed earlier for the evolution of human specific acute Influenza A viruses from persisting influenza A infections in various water fowl species.

Vertebrate poxvirus evolution from entomopoxviruses. We can now consider the broader question of how poxviruses might have initially adapted to vertebrates, since they don’t seem to persist in a large number of other vertebrate hosts. Poxviruses are known for shrews and voles, birds and crocodiles, but not for vertebrate fish or shellfish. When did they first adapt to the vertebrate host? However, besides the chordopoxviruses, which have been mainly discussed above, there are several other classes of poxviruses, including Entomopoxvirus, , and of humans. These viruses differ significantly from each other in sequence and other important ways. However, clearly the most diverse, complex and basal of all these poxviruses are the entomopoxviruses, as they are notably larger and more complex and their gene order is less conserved. This has led to numerous suggestions that the vertebrate poxviruses were adapted originally from the insect or entomopoxviruses. Besides strong phylogenetic support for this idea, there are also some experimental results consistent with this. Vertebrate poxviruses often show curiously good activity in insect cells. For example, although entomopox virus will enter vertebrate host cells, the resulting infection is abortive and cells only express early genes. However, the converse infection of insect cells by animal pox viruses is much more efficient. For example, Vaccinia virus not only successfully enters insect cells, it will express early genes, induce increased viral DNA synthesis and also induce late gene expression. It appears that faults in protein processing in insect cells prevent a fully productive vaccinia virus infection. Thus, vertebrate poxviruses show considerable and unexpectedly good activity in insect cells.

If Entomopoxviruses indeed represent the predecessors to vertebrate poxviruses, it appears that the various lineages of more distant poxviruses (leporipox, suipox, capripox) have maintained some additional similarities with respect to biological properties, such as insect mediated transmission and viral persistence mechanisms. Mostly, these other poxviruses appear to establish transient persistent infections in their specific host by inducing benign fibroma growths in infected cells. Since these growths are slowly cleared by the host cellular immunity, virus can persist during this clearance. In addition, these other poxviruses are not transmitted by respiratory routes, as is smallpox, but mainly by biting insects (mosquitos, ). However, in some hosts, these initial areas of skin hyperplasia can become necrotic pustules and lead to severe systemic infection.

THE CLASSIC STORY OF VIRUS AND RABBITS IN AUSTRALIA.

Leporipox virus - New/old world rabbit and . The best studied example of a persistent/virulent leporipox virus is myxomatosis virus. In natural settings, this virus infects various new world rabbits, such as the California bush rabbit or the South American Sylvilagus brasiliensis. Infection is readily observed in field settings and results in a localized benign cutaneous fibroma that is slowly cleared. The cycle of transmission is maintained by mosquitoes feeding on infected rabbits. Similar situations apply to poxviruses of hare, squirrel and domestic pigs. However, when myxomatosis virus infects the domesticated European rabbit (Oryctolagus cuniculus), a fulminant and highly fatal disease results. Thus, a species jump with this virus results in an acute viral life strategy and the loss of persistence. In 1950, myxomatosis virus was introduced into Australia to help control the European rabbits, which had become a major pest species on that continent. Following a very high mortality (99%) sweep as a virgin soil epidemic, the virus had to adapt to the resulting decreased rabbit population densities. This also led to adaptations in the host as well as changes in the virus. But the virus now needed to be able to infect the less numerous and immunologically naive newborn rabbits, as surviving adults were mostly immune. This altered selective pressure resulted in what appears to be a somewhat slower replicating acute virus variant with only about 60% mortality in rabbits. Many evolutionary biologists have pointed to this result to argue that it establishes that a highly virulent virus is not evolutionarily stable, as the virus kills too many of its hosts, and that the virus is evolving to lose its virulence. It is worth pointing out several things concerning this idea. One, the virus has not evolved an ability to persist in these hosts (as it does in new world rabbits) and the infection continues to be strictly acute. Two, the resulting 60% mortality is still a highly virulent acute virus infection. To put this in a human perspective, consider that this mortality is still higher than the 30 to 50% mortality of smallpox virus for humans.

A collection of early poxviruses including Molluscum contagiosum. Another group of the insect transmitted poxviruses is the Capripox viruses, which are generally specific to hoofed hosts. These viruses were first observed in 1929 in southern Africa infecting domestic cattle. Capripox viruses are also geographically restricted and will induce skin growths, such as lumpy skin disease (LSD), in cattle, sheep and goats. The specific Capripox viruses have mainly adapted to specific domestic animal hosts. Little is known, however, concerning the origin and natural biology of these viruses, although Kenya appears to be the source of multiple outbreaks and may be the homeland. Virus replication is mostly restricted to the skin, but other organs also become involved. The initial infection results in skin hyperplasia, but with the host immune response, these foci develop into papules and lesions producing virus. Thus, the infection of domestic animals is mainly an acute, non-persisting infection. As no lung transmission is observed, this is clearly unlike chordopoxviruses in mammals. In addition, Parapoxviruses are also insect transmitted (as well as being transmitted by skin abrasion) and cause widespread disease in small and large ruminants. Of these, Orf virus is economically the most important infectious agent of sheep and also induces benign cutaneous growths. There is one human specific poxvirus that also fits the pattern of inducing cutaneous growths: Molluscum contagiosum virus (MCV). However, Molluscum contagiosum is mechanically transmitted by contact with infected skin, and is not known to be transmitted by insect bites. Molluscum contagiosum was first reported in 1841 to be a transmissible agent which induces basal skin cell hyperplasia. Although it is classified as a chordopoxvirus, it has several distinctive characteristics, including being specific to its human host, being linked to keratinocyte differentiation, and being the only poxvirus that codes for glutathione peroxidase, which protects infected cells from UV damage, stress and oxidation. This virus also induces benign skin growths associated with persistent virus production. Infection is worldwide, and tends to occur mainly in children. Although most transmission appears to be by mechanical means (skin contact, contagiosum), sexual transmission is also known and common in some areas. Duration of infection is from a few months to years, or even longer in immunocompromised hosts. This virus appears to have an extended but limited persistence life strategy and closely resembles the biology of HPV, described below. Infection is generally localized, benign and self-limiting, although lesions sometimes develop. Acute infections are not observed, which is also similar to HPV.

The broad pattern of Poxvirus evolution: insect to bird to mammal. Molluscum contagiosum (MCV) appears to represent an old infection of humans that has become a human specific virus. This virus is unusual in that no other members of the poxvirus families are very similar to it. Although MCV has clearly discernible similarities to the other chordopoxviruses, it is phylogenetically the most distant from these other chordopoxviruses and thus appears to represent a basal virus that diverged very early from the other poxvirus lineage. In terms of overall genome organization and sequence similarity, MCV also shows clear similarity to the capripoxviruses and the leporipoxviruses described above. Thus it would seem to represent a viral lineage that diverged as early as these two other viral lineages. MCV also shows clear organizational similarity to fowlpoxvirus. As mentioned previously, fowlpoxviruses have the largest genomes of the poxvirus family members. The next largest genomes are those of the entomopoxviruses. However, fowlpoxvirus maintains a gene order that is similar to MCV, whereas the entomopoxviruses have significantly altered gene order, including changes in the conserved gene families. Finally, phylogenetic analysis suggests that fowlpox is basal to MCV, which is basal to the other orthopox members. All these observations together now allow us to propose an evolutionary path for all the poxviruses and how they adapted to mammals. The entomopoxviruses appear to be the earliest members of this family, as noted in chapter 7. In addition, the entomopoxviruses show significant similarity to other large insect specific DNA viruses, such as the ascoviruses, which are not found outside of insects. Thus these large DNA viruses most likely initially evolved in insects. At some point, likely early in vertebrate evolution, this insect viral lineage appears to have undergone a species jump into the reptile and avian lineage, probably via biting insects. These reptile/avian viruses adapted to persist as skin growths in specific avian hosts, but can also induce acute disease lesions in other host species. At some point (likely after the radiation of avians), a fowlpox-like progenitor virus was able to adapt to mammalian hosts, resulting in a virus that resembles MCV. This lineage of virus underwent an adaptive radiation into various other hosts, resulting in capripox, leporipox and swinepox viruses. All these viruses maintained a tendency for insect mediated transmission and the induction of benign skin growths for persistence in their respective hosts, but were capable of sometimes inducing acute lesions in other hosts. However, one poxviral lineage adapted to what is a rodent or rodent-like (shrew) host. This was the orthopox lineage. In their native rodent host, these viruses establish a benign persistent infection. Representatives of these viruses would be cowpoxvirus (bank voles) and monkeypox (Gambian rat). However, this poxviral lineage had also acquired a capacity for acute systemic replication and respiratory transmission in other hosts (hence the ability of cowpox to replicate in rats, cats and humans). Many of these viruses became adapted as acute viruses to these new hosts, resulting in species specific acute viral agents (such as smallpox virus).

Thus, in contrast to the herpesvirus lineage, which has evolved in close congruence with all vertebrate animal lineages, the poxviruses have not, for the most part, been co-evolving with the majority of their hosts. Rather, they appear to trace an evolutionary path originating in insects, adapting to birds, then mammals, sometimes as persistent infections, but often as stable acute infections with high mortalities.

OTHER PERSISTENT AND EMERGENT VIRUSES OF MAMMALS

Smaller DNA viruses. Adenoviruses. Unlike the poxviruses, the adenoviruses appear to have been infecting vertebrate hosts all during the evolution of these hosts and are represented in all these terrestrial vertebrate lineages, from fish to amphibians, to avians, to marsupials. Thus adenoviruses have in general a co-evolutionary pattern with their host. The adenoviruses tend to code for about 50 genes, but only 16 genes are known to be conserved in all genera. Surprisingly, the highly studied E1A regulatory gene from adenovirus type 5 is not conserved in all adenoviral lineages (including mouse adenovirus). The marsupial adenovirus (possum) is a member of the family. With respect to mammals, all examined lineages appear to host species specific adenoviruses. The adenovirus that infects the Tree shrew is the basal member of the , and phylogenetic analysis suggests that it diverged from atadenovirus. This virus appears able to have a persistent relationship with its host, as a MAD-like adenovirus can be isolated from kidneys of healthy tree shrews. Consistent with a persistent life strategy, the ends of this MAD DNA appear to be high in G-C and may undergo DNA methylation, which suggests the transcriptional silencing of the DNA during the viral life cycle. In humans, about 50 types of human specific adenoviruses are known. Most of these, but not all, are also known to establish prevalent persistent infections. In fact, the use of normal tonsils removed from children in 1959 in New York allowed the spontaneous reactivation of adenovirus from cells grown in culture and led to the first isolation of human adenovirus. Within this group, the viruses are divided into respiratory and enteric families. Human adenovirus 40 appears to be the basal member of both groups according to phylogenetic analysis.

persistent infections. The best studied virus in this respect is JC virus. This virus exists in three stable genetic types (ABC) and 4 subtypes that are distributed amongst the human populations. The distribution of these viruses can be used to trace the movement of human populations and has been used to evaluate the origins of the Asian population (China, Korea, Japan). Such population based analyses are similar to those that have also been done using HTLV-1, HPV 16 and HPV 18. In the case of JCV, virus transmission requires close familial contact and does not readily move between mixed human populations.
In natural mammalian populations, A biological characteristic that has been lost polyomaviruses seldom induces tumors in in most mammalian adenoviruses, is the contrast to the polyomaviruses of fish, frogs and tendency to induce host cell hyperplasia that turtles which all induce tumors, either epidermal was seen in amphibians. Mammalian hyperplasia in fish, or kidney tumors in frogs. adenoviruses are also frequently associated with and co-isolated with satellite viruses, The papillomaviruses, like the polyomaviruses such as adenoassociated virus. are small circular ds DNA viruses. Many of these viruses are associated with skin growths Polyomaviruses. Polyomaviruses are (warts) in their host. Virus replication (like known for mice, monkeys and humans. MCV) is linked to the differentiation of infected However, the broader host distribution of karatinocytes but infected cells are stimulated to these viruses has not been well studied. grow. These growths will produce virus and can SV40 was the initial member of this family often persists for months. However, some to be isolated from persistently infected persistent infections are much more silent and normal monkey primary kidney cells being not associated with skin growth and such used to grow vaccine strains of poliovirus. infections may be the most common. So far, all mammalian versions of this virus Papillomavirusers are common to many appear to have a tendency to persistently mammalian species. Humans are known to host infect kidneys, but they probably initially over 100 types of papillomaviruses and they are enter their host via the respiratory tract. classified into two large groups; cutaneous and Although some disease can occur from mucosal. Most populations have a significant polyomavirus infections (such as PML), prevalence of these viruses (up to 40%). The these are rare and normally associated with two best studied of these viruses are human compromised immunity. 
The vast majority papillomaviru type 16 and 18 due to their clear of infections are asymptomatic and association with cervical dysplasia. There are persistent. Furthermore, persistence is also a large number of primate papillomaviruses. generally lifelong with virus shedding into As mentioned above, like the polyomaviruses, the urine increasing with age. Like the other papilomaviruses evolution is phylogenetically mammalian polyomaviruses, human congruent with its host species, thus it has also polyomaviruses (BKV, JCV) show extreme been used to trace the geographical and racial host specificity and there is no evidence that patterns of human populations and their host species jumping ever occurs. These migrations. One curious question is why there viruses are also genetically stable. These are so many types of HPV, given that human viruses display a phylogenetic congruence polyomaviruses exist in only two types. There with their host, characteristic of most may exist competition between HPV types. 238 Also, HPV may not persist as efficiently as initially suspected to be involved in human do the polyomaviruses as infections are not disease, subsequent analysis indicated that it was lifelong and either resolve or are replaced by not contributing to hepatitis or any other know another virus type. Genetic analysis disease. The virus is instead a highly prevalent, suggests that most of the HPV types are old inapparent and a stable persistent infection of and may have originated around the time of humans. Very similar viruses can also be human divergence from other primates. isolated from chimpanzees, especially after Thus the large HPY type diversity also exposure to Hepatitis A and C virus. One appears to be old, on an evolutionary time evaluation of human patients with hepatitis scale. observed that 29/99 patients were TT virus infected. Thus TT viruses often behaves like a Other small DNA viruses. 
Mammals also satellites virus and are frequently associated with support infections with small DNA viruses mixed virus infections. 4 new variants of TT like Parvovirus and TT of humans and have been reported; A, M1, Mz, M3. the M3 primates. Parvovirus show both congruence viruses are the chimpanzee specific variants. with their host but also show rapid evolution Like JC virus, the natural TT viral variants are with species jumps and acute disease. 472 distributed amongst the human populations. TT types of animal parvoviruses are known. type 1 is found in Asia which lacks the TT virus These exist in 3 phylogenetic groups: 1 for type3 that is found in Africa. TT infection rodents, pigs carnivores, 2, bovine and appears to be acquired during childhood as an autonomous human/primate and 3, human URT infection from infected adults. The virus is helper dependent and autonomous avian. persistently shed into the . Thus TT seems Group 2 has members that are host to be an old and stable viral parasite of humans. congruent (host co-evolving), but also has members that appear to have undergone In summary, the majority of small DNA numerous species jumps and are not (adenovirus, papillomavirus, polyomavirus, TT congruent with their host. Curiously, no virus) viruses have persistent life strategies, are invertebrate parvoviruses are known. The host specific and phylogenetically congruent AAV group is most similar to avian viruses with the evolution of their host. Most often, and may represent a species jump between these these viruses are benign, only occasionally avian and mammalian host. A parvovirus associated with acute or proliferative disease. of tree shrew is known, but this virus is The notable exception to this generalization are latent. Latent parvoviruses of mice, such as the parvoviruses. 
Within the parvovoruses are orphan parvovirus, are also known and members that are both persistent and host prevalent in nature as most wild mice have specific, but also members (such as the carnivore antibodies to the virus. Similar latent parvoviruses) that frequently jump species and parvoviruses are known for other wild cause acute disease. rodent species. However, the shrew virus is lytic in guinea pigs and mouse cell lines, suggesting that a species jump by this virus THE EMERGENCE OF RNA VIRUSES IN could result in acute infections. Species MAMMALS jumping and acute disease also seems to be a characteristic of the carnivore parvoviruses, Persisting RNA viruses of placental species. such as virus. In contrast the ubiquity of persisting DNA viruses in humans and other mammals, stable The human TT virus represents another persistent infections with RNA viruses are much family of small DNA viruses and was less common. Given how frequently such agents initially discovered in the blood of human can cause acute human disease, persistence by patients that had hepatitis. Although it was RNA viruses is not as common as many might 239 expect. Furthermore, in those cases were of the entire rhabdovirus group suggest that the RNA viral persistence is prevalent, they tend Mokola virus (MOKV), which is found in to be very restricted to specific host species, African Shrews, appears to be the most basal especially rodents and bats (see below). member. Mokola viruses was isolated from Humans are known to support a few stable insectivores shrews and this virus has the unique persistent RNA infections, such as hepatitis distinction amongst the rhabdoviruses of also C virus. But the blood borne nature of being able to replicate in insects, e.g. Aedes transmission of this virus and phylogenetic aegypti. 
This dual host biology suggest that analysis suggest that this may be a recently rhabdoviruses initially adapted to mammalian introduced virus into the human population host from some insect version of the virus. (possibly from avian sources) and that hep C These viruses are organized into two groups. did not likely evolve along with the human Group I contains the as well as the host. African Mokola virus. Another set within group I is the Lagos bat virus of fruit and insect eating Negative strand viruses. Negative strand bats. The lyssaviral lineage is within group II, viruses, such as the myxoviruses, which includes African, European, and paramyxoviruses and rhabdoviruses are Australian viruses as well as worldwide- known to infect almost all orders of virus. Sequence analysis suggests that placental mammals. Yet, for the most part, emergence of the carnivore worldwide-rabies is a these infections are acute, disease causing relatively recent evolutionary event and may and do not persist in their host. There are have occurred as recently as 900-1500 ybp. The however, some clear exceptions to this larger worldwide-rabies ‘group’ of viruses situation which we will examine. appears to have originated about 4,000 ybp. If so, this would further suggest that human culture Rhabdovirus/ persistence. Our was likely involved in the origin of modern interest in rhabdovirus stems mainly from rabies. For the most part, the virus seems to the ability of to cause highly have evolved mainly by host switching, which is lethal infections in humans and domestic most often associated with glycoprotein changes. animals. Lethality from rabies virus is very The Lyssaviruses also appear to have often near 100%. The virus is shed in the saliva switched host during evolution. Like rabies, and is normally transmitted by biting, often Lyssaviral infections of carnivores are generally by infected (and agitated) carnivores. lethal. 
However lyssaviruses are also known to Infected animals will replicate virus in persist as asymptomatic infections in specific bat peripheral nerves allowing virus to migrate species. This would suggest that bats are the up the nerves to the CNS, where it induces evolutionary stable source of acute lyssaviral aggressive behavior and an inevitably lethal infections for other species. disease. Because of the slow pace of this transmission, post exposure immunization Bats are Chiropters which are placental species against rabies (first established by Pasture) that diverged early from the other placental can be life saving. Rabies virus is a member lineages thus representing a rather basal of the rhabdoviruses, which represents a placental group. Although bats are found large group of viruses that includes throughout the world, in Australia, they represent lyssaviruses. As these are negative strand one of the few native placental species in an viruses, which lack recombination, their otherwise marsupial habitat. Thus it is highly evolution is mainly due to interesting that all common Australian bat process as well as gene duplication and species appear to harbor persistent lyssaviruses. deletion. Thus it is possible to infer long Why might there a link between bats and term evolutionary relationships amongst lyssavirus persistence? Phylogenetic analysis these viruses. Recent phylogenetic analysis indicates that Australian bat lyssavriuses are 240 monophyletic (via G protein) and can be paramyxoviruses have a more uniform genome separated from other lyssaviruses. These bat size. This virus was observed to caused acute viruses appear rather stable genetically and and lethal outbreaks in horses and the humans can be grouped into host species specific that were in contact with infected horses. It now clades. This is unlike the other acute seems that had jumped from other rhabdoviruses, whose clades show much species into domestic animals and humans. 
This greater genetic diversity. Thus the family of virus also shows some similarity to lyssaviruse, although strictly acute and (filoviruses) and may be evolutionary highly lethal in numerous species, especially related. Hendravirus is related to carnivores, appear to maintain an (via highly conserved L protein domain), which evolutionary stable relationship as a is a virus that caused a major epidemic in pigs in persistent infection of bat species. Malaysia. Infected pigs were also able to transmit Nipah infection to humans, resulting in Paramyxovirus persistence. about 100 deaths. The specific source of this Paramyxoviruses are another group of outbreak has not yet been identified. However, a negative stranded viruses that includes many broader epidemic was apparently prevented types of cause acute disease viruses and epidemic by culling millions of pigs. Flying infects almost all placental species. Like the foxes (genus Pteropus) are a natural persistent rhabdoviruses above, these paramyxovirus host for Hendra virus in Australia and harbor the infections tend to be acute and disease closely related Nipha in Malasyia. These two associated and include viruses such as viruses resemble both Ebola and measles in their Measles, , canine distemper, genetic organization, suggesting that all these , RSV and the avian viruses may share a common evolutionary paramyxoviruses. Sequence relationships background. Interesting that like Lyssaviruses, amongst all of these paramyxoviruses also cause many lethal infections viruses can be readily seen, especially in the in carnivores by inducing a distemper-like polymerase gene. The most basal member disease. This suggest that these viruses might of the mammalian infecting members have some involvement in predator prey appears to be Sendai virus. In spite of the relationships of their host. 
broad array of species that can be infected by paramyxoviruses, there are very few In summary, the negative strand viruses include examples of persistent infections that are a large number of viruses that cause serious part of the normal biological strategy for any acute disease in most mammalian species. Most of these viruses. They are almost all acute of these infections are acute, but some persisting infections. However, there is at least one host are known. The rhabdoviruses all appear to placental host which can also be persistently have common ancestors (possibly infecting infected with paramyxoviruses. Tree shrews insects). The paramyxoviruses also appear to are known to be infected with Tupaic have one common ancestor. Most of the acute paramyxovirus (TPMV). This virus infection caused by these viruses appear to have establishes a silent and persistent infection adapted or evolved from persistent infections in in its native host but shows no antigenic specific host. Bats in particular appear prone to cross reaction to the other paramyxoviruses. persistent infections with negative strand viruses. However, TPMV does show clear resemblance to hendravirus. Herdravirus is a paramyxovirus (a ) that was RNA VIRUSES OF RODENTS; AN isolated in Australia. The Hendra virus UNTOLD STORY OF INAPPARENT VIRUS genome has a large non-segmented minus RNA template (18.2 kb). This is the most Rodent evolution, phylogenetics and RNA complex of paramyxovirus genomes as other virus. Outside of humans, rodents are by far the 241 best studied placental species. Rodents have subgenera: Coelomys (southern Asia), Pyromys also been extensively examined with respect (southern Asia), Nannomys (Africa) and Mus to their viruses, especially Mus musculus (Europe, northern Africa, Asia). The sub-genus domesticus (lab mouse) and Rattus rattus Mus includes Mus caroli, Mus cervicolor, Mus (laboratory rat). As mentioned previously, cookii and Mus musculus. 
Mus musculus (the rodents comprise about one half of all house or lab mouse) is grouped into four species placental species. Within the rodents, the including the aboriginal Mus spretus, Mus murid family (Muridae) contains the most spicilegus, Mus macedonicus, all of which live species. Although rodent-like mammals are in natural field settings. The commensal, human very ancient, modern rodents are much more associated (semi-domesticated) species of Mus recently evolved. Most rodents are adapted are Mus domesticus (from western Europe and to seed eating, which also links them to the middle East) and Mus castaneus (southeast evolution of modern grasses. All rodents Asia). The domestic Mus molossinus (found in evolved from shrew-like carnivorous or Japan) appears to be a permanent hybrid cross insectivorous ancestors. Rodents have between Mus musculus and Mus castaneus. diverged into various families. Within the America was colonized by Mus musculus and muridae family there are also several sub- Mus domesticus with the European colonization families, which have diverged from one and thus this immigrant differs significantly from another. The murids are phylogenetically the native Sigmondontinae and Arvicolinae monophyletic. The murinae sub-family mouse species. All three of the domestic Mus represents an early branching (9.8 mybp) species (musculus, domesticus, castaneus) and contains various old world species, appear to have initially evolved in Northern India including Mus musculus and Rattus species sub-continent and diverged about 600,000 ybp. that are the laboratory standards. New The great bulk of scientific study has been done world rodents are represented by the on these domestic Mus species and the current Sigmondontinae sub-family and are more version of lab strains are mainly derived from a recently diverged (5.7 mybp). This sub- very small population of breeding stocks family contains the Peromyscus (Deer originating in Japan. 
However, these strains mouse) species studied in the Americas. were quickly crossed with European mice but The most recently diverged Muridae sub- extant strains now harboring mainly domesticus family (3.6 mybp) are the Arvicolinae, genes, domesticus mt DNA and mixed which are further divided into old and new musculus/castaneus Y chromosome. Little world lineages. The Arvicolinae are also the (1909) was first to established DBA inbred one murid sub-family that has the most strain. The BALB/c mouse was established in diverse species, includes lemmings and 1913 by Columbia graduate student H. J. Bagg voles. A recent radiative expansion seems from an albino line and after F26, the ‘/c’ was to have occurred in this sub-family. This added at Jackson laboratories. BALB/c and latter evolution of rodent species has been DBA were used along with outcrosses to especially rapid and recent when compared establish most mouse lines in lab use today. to the evolution of the primates. Feral Mus species have not often been systematically studied with respect to virus The laboratory mouse- so much from so infections. On major field study in Australia few. The laboratory mouse is within the (where the introduced Mus became a pest Murinae. Mus musculus (lab mouse) itself species), showed high prevalence with the has an extensive lineage history, but also coronavirus, (mouse hepatitis virus), the gamma results from a very small genetic pool. Herpesvirus (Mouse ) and There are 30-40 Mus species that originally mouse parvovirus. However, these infections ranged through Europe, Asia and Africa of were all asymptomatic and most known acute the Old World. Mus are divided into four viral infections were rare. Wild Sigmondontinae 242 (new world) and Arvicolinae (New and old end host for the virus. 
Although Hantaviruses world) species have both been extensively are also known for some shrew species, these studied with respect to hantavirus and have not been well studied, nor is it clear is these . viruses are basal to those hantaviruses of other rodents. Hantaviruses are not known to establish persistent infections outside of rodents. Hantavirus and naturally persist in rodents. Rodent field studies, although limited in number, have led to the Arenaviruses are ambisense (plus and minus) realization that many natural rodent species RNA viruses that are responsible to various tend to harbor asymptomatic persistent types of hemorrhagic fevers in humans. These infections with various RNA viruses. Two human infections are generally disease virus families in particular have been studied associated, and strictly acute. However, in in this regard. Hantaviruses and rodents, arenaviruses are know for their ability to arenaviruses are both, in their native rodent establish asymptomatic persistent infections and host, generally maintained as persistent these persistently infected animals can be the asymptomatic infections. With the source of acute infection of other species. In Hantavirus, many viral types have been Sierra Leon, for example, the consumption of characterized. However, the viruses can be asymptomatically infected rats has been a major broadly classified into new and old world source of human infection with groups in keeping with the classification of virus. The best studied member of this family is their rodent host. In all cases examined, the Lymphocytic Chroiomeningitis Virus (LCMV) specific virus is highly specific to its host, is of Mus musculus. This virus is ubiquitous in genetically stable and is distributed in a natural mouse populations and established life- geographically restricted way. Furthermore long persistent infections. Similar to the the viral and host rodent clades are Hantaviruses, the arenaviruses are classified into congruent. 
In addition, the virus and host two complexes. One, the LCMV-LASV are both evolving with essentially the same complex, which is monophyletic with 3 distinct slow molecular clock, much slower then the lineages. The other, the Tacaribe complex is clock seen for acute RNA viruses. Thus also monophyletic with 3 lineages. These hantaviruses that infect Arviconinae species complexes correspond to their respective new (such as Microtus) are distinct from those and old world rodent host. The arenaviruses in that infect Sigmondontinae species (such as their native rodent host have been reported to Peromyscus). Since hantaviruses are RNA show co-speciation with their host. However, viruses with the established potential for the pathogenic arenaviruses are not very fast RNA variation, it seems that the monophyletic, show multiple independent persistent state is imposing a restriction on origins and high rates of genetic change. Thus, the rate of virus evolution. Northern although we see a New/Old World congruence European Hanatviruses can be classified into between virus and host, there is also evidence two tribes both infecting members of the that some stable species jumps have occurred host microtini tribe. Although this virus and between rodents. However, outside of rodents, its rodent host are generally co-evolving, there is no evidence that arenaviruses have there is also evidence that occasional viral established stable persistent infections and even species jumps have occurred establishing the acute infections in other host appear to be Hantaviruses into new rodent lineages. In unstable, requiring the rodents as reservoirs to humans, hantaviruses tend to cause serious continue the chain of transmission. There is no respiratory and renal disease. However, primate version of an arenavirus known, for human to human transmission has been very example. 
We have no current explanation for the inefficient so that humans tend to be an dead 243 limitations of both the persistent persistent virus in another species. These are not hantaviruses and the arenaviruses to rodents. simply isolated events. The transmission of these agents to new species appears to be an ongoing and never ending process. Even within CURRENT ISSUES IN EMERGENCE, the same species, if two populations become well PERSISTENCE AND HUMAN isolated, they can and often do acquire distinct EVOLUTION. persistent viral agents that will present an opportunity to initiate acute viral infections if Recent examples of human viral these isolated populations reconnect. epidemics resulting from persistent infections in other species. The Thus acute viral infections, especially human epidemiological concept of a reservoir ones, come from persistent infections. As we species has been used for many decades to have presented, with close examination, there help explain the recurrence of viral disease seems to be few if any exceptions to the after it has apparently been eliminated from persistent origin of these acute viral infections. a population or habitat. Although this has Yet, there are, however, some seemingly clear been a ‘practically’ helpful concept, it has examples of acute human viruses for which we also caused confusion in its have no obvious source of virus from some other oversimplification. Because this concept persistent host. The human cold viruses, such as fails to indicate a distinct difference in the rhinoviruses (picornaviruses), appear to be one persistent viral life strategy relative to the such example. There appear to be about 50 types acute life strategy, with a correspondingly of human alone. How could all these distinct viral fitness and a distinct derive from another host? 
However, closer evolutionary relationship between virus and examination of existing sequence data can still host, it has led generations of infectious identify potential persistent sources for the disease students think of viral fitness and origins of these viruses. For example, Theiler’s evolution simply from the context of acute virus is a ubiquitous, generally asymptomatic disease. Thus when a new disease emerges, and naturally persistent enteric picornavirus it is simply assigned to having come form infection of mice. The RNA sequence of this some ‘reservoir animal’ as if that is virus appears to represent the basal member of sufficient to explain the origin a an intact the entire Rhinovirus family, leaving open the viral system. In this book, I have possibility that this mouse virus also represents attempted, by numerous examples, spanning the ancestor that originated human rhinoviruses. all classes of virus and host, to correct this misunderstanding and develop the central We have already examined the major emergent importance of persistent infections to the human viral epidemics of the last century from evolution of virus, its stability in its host and the perspective of viral persistence. Thus we can how this contributes to the emergence of a propose, with good experimental support, the more simple virus/host relationship; acute idea that the influenza pandemic of 1918 derived replication. However, because stable viral from a persistent avian H1 genome, following re- persistence generally imparts some fitness assortment. A similar story can be proposed for consequence to its infected host, persistence the source of the 1957 influenza pandemic, is not evolutionary neutral. It is not simply a (H2N2) and also the Hong Kong influenza selfish replicator. Persistence to outbreak of 1997 (H5N1). In both cases the host survival, often due to the vary same persistent ‘reservoir’ for the genes was water virus. It now appears that we can account fowl (geese). 
Given that water fowl harbor 15 for almost all examples of recent human HA types and 9 NA types whereas humans (and domestic animal) viral epidemics as the harbor only 3 types of each gene, we can expect adaptation of a virus that was and is a stable that persistently infected water fowl will 244 continue to be the major source of influenza assume they have failed to find the right genes that can adapt to infect humans. This conditions of virus growth, and not assume that appears as an almost inevitable consequence persistence or silence is in fact the expected and of humans and water fowl being in close normal outcome of the infection. A process will proximity. Similarly, as outlined above, then be undertaken to select for the version of the both HIV 1 and HIV 2 are now well virus, or host cell, that will coax the virus to supported to have originated from retroviral replicate efficiently and kill host cells, thereby genomes persisting on other (monkey then losing the very gene functions that were chimpanzee) host. Very recently, there was necessary for persistence. The isolation of a an outbreak of monkeypox virus in the USA CMV virus having the UL144 gene (a mini-TNF causing disease in human and prairie dogs. receptor sequence) from clinical, but not lab Even this outbreak could be traced to the strains, of human CMV may be an apropos importation of an asymptomatically and example of just such an inadvertent selection. persistently infected Giant Gambian rat from Such dispensable genes are generally considered Africa. ‘accessory’ as they don’t contribute to acute virus replication, although they can be highly Persistence mis-studied. However, in all conserved in nature. 
This explanation may also the above cases, our attention is so focused apply to explain the hitherto dilemma of why the on the acute pattern of viral replication that poxviruses that persist in rodents (such as we essentially lack any understanding of the cowpox or MCV) have so many additional mechanisms by which these viruses persist immune modulators compared to the strictly in their natural host. Our general operating acute viruses, like smallpox of humans. assumption is that viral genes have for the Persistence is not an ‘accessory’ function. It is a most part evolved to operate in an acute most basic viral life strategy and deserves to be mode, that is, to make more virus and not to studied as such. dampen the virus replication in order to persist. Thus we are often surprised to learn SARS - a most recent example of emergence: that acute viral disease, recently introduced the current future. Prior to 2003, the virologist from ‘reservoir’ host, are often associated that study coronaviruses had a real problem with with an overactive or toxic host immune Federal granting agencies. They needed to response, not with the resulting carefully justify why anyone would fund the from direct virus replication. Hantavirus study a family of viruses that causes little human severe respiratory syndrome appears to be a disease (aside from colds). For the most part, good example of exactly this situation. This they struggled to make this justification. The virus replicates poorly in human lung tissue, best studied member of this family, Mouse but induces extensive cytokine mediated Hepatitis Virus, caused little diseases in natural inflammatory or oxidative damage. In fact Mus musculus populations, but could be coaxed there may be some general mechanistic by virologist with some mutation of the virus principles of disease induction that apply into causing neurological disease. 
All that when a virus, evolved to persist in one host, changed in March 2003 in southern China when finds itself able to replicate acutely in a new a new coronavirus emerged for the first time to host. It may well be that the normal infect humans and cause serious, often lethal dampening mechanisms of persistence (such disease of human respiratory system. SARS was as signal transduction or immune quickly established to be due to a family of modulation) are often species specific, coronaviruses that had not previously been selected to dampen virus replication in the known. Due to rapid and effective natural species but no longer functioning in measurements, a SARS pandemic was the new host species. After all, if a virus prevented. However, the question became where fails to grow well in culture, virologist will did this fully formed and new member of the 245 cornavirus family come from? As all the Factors that favor emergent viruses. The genes appeared to be equally different from capacity of a persistent virus to adapt to become existing coronavirus strains, the possibility an acute viral agent in another host seems to that SARS represented either a recombinant present a never ending situation. As long as or an acquired set of mutations from a there are multiple species that can share viruses, known coronavirus was quickly dismissed. this possibility of virus adaptation appears It seemed to have appeared relatively intact always present. Related species would almost from an unknown source. The search was always seem to present the highest risk of such then initiated for an animal ‘reservoir’ of virus exchange, since the host habitat will be SARS and several species of regional close to what the jumping virus needs. If this carnivores (civet cat, ) were situation is extrapolated, it suggests that viral identified to be asymptomatically infected persistence may well affect competition between with a SARS virus that was very similar to host species. 
It may be that the most successful that infecting humans. It has yet to be species could be the ones colonized with the established if these carnivores are the stable appropriate persisting virus, which can acutely source of SARS infection or if they too infect competing species. This idea brings to acquired the infection from some prey. mind the original observations presented early in However, it is clear that the virus is in the this book concerning the mixing of bacterial ecosystem. Coronaviruses are likely to be strains harboring different prophage, leading to an ancient family of virus. Gill associated the lysogenic death of the uninfected bacterial virus of prawns (GAV), show limited strain. Viral persistence can clearly affect host similarity to the SARS replicase but is species competition. clearly a member of the same general viral family. The virus of shrimp is the simplest Host populations. The successful growth, or member of this family (Nidoviruses). overgrowth, of any one population of host Although GAV is highly pathogenic in appears to increase the probability of acquiring shrimp farms, it is found as a persist acute viral agents. Large host populations create asymptomatic infection in wild shrimp a population dynamic with increased contact populations. It thus seems likely that the rates and predispose this successful population to Nidoviruses are ancient viruses that have the possible adaptation and infection by a infected vertebrates for a long time. persisting virus in another species. The more However, if this evolutionary stability has successful, the more acute viral agents that are been maintained by persistently infected likely to adapt. Thus we expect that the very host, we know little of what those host success of any one population, brings an might be. Coronaviruses do not appear to increased contact with persisting viruses of other infect natural populations of new world species. The current human population and its rodents. 
But this issue s poorly studied. distribution throughout the world at high density Most of the known Nidoviruses and high migration rates makes it is likely to (coronaviruses) do not establish persistent allow a high rate of interspecies contact and infections in there corresponding host acute agent adaptation. Also any situations (human, turkey) but some viruses (e.g. cats) which brings together previously separated do. Thus, it seems almost certain that SARS species or situations which lower biological is an ancient persisting virus (possibly from barriers will increase the likelihood that a a carnivore or its prey) that has adapted as persisting virus of one species can adapt to an acute agent to the human respiratory become an acute infection of another species. tract. Thus the SARS emergence appears to Thus, the placement of African and Asian fit well the pattern of acute adaptation from elephants adjacent to one another in zoos can be a persistent viral life strategy. expected to allow the exchange of their corresponding persistent herpes viruses. 246 Similarly, the mixed farming of related tissue into new host represents a new avenue for shrimp species might also result in virus the transmission of persisting virus and this was transmission. not a situation that existed prior to human intervention. However, even if there were no Biological barriers. Besides changes in additional or outside species to be possible population structure and exchange rates, sources of emergent viral agents, in some cases changes in biological barriers can also the genome itself can provide the virus. We contribute to disease emergence. Factors have mentioned the evolution of leukemogenic that lower immunological status or physical retrovirus mice form an ERV in the AKR mouse biological barriers are expected to increase line. Similarly with the Lake Casitas mice we the probability of the adaptation of a saw from the germ line. persisting virus to become an acute agent. 
For example, the human population now Factors affecting persistent virus colonization. contains a large number of HIV infected The main concern expressed in the above section people. In HIV induced AIDS, HIV lowers is the occurrence of new acute human epidemic cellular immunity, it increases the likelihood viral agents. The attention of virologist has long of reactivation of many persisting human been focused on the emergence of just such agents, and increases the probability that agents. What about the occurrence of new these agents can adapt to become acute persistent agents? What do we know about agents in HIV infected individuals. The conditions or viral phenotypes that favor such increased rate of human monkeypox events? For the most part, we have seen that infection in Zaire Africa (with high HIV persistent viruses tend to evolve along with their rates) may be exactly such a situation. host lineage. Hence these are mainly previously Lowered immunity also allows reactivation existing virus/host relationships and tend not to of persisting agents and such latent virus represent new viral agents. Yet, we have seen reactivation is one of the main clinical that in various orders of organisms, there is a problems faced by AIDS patients. Outside characteristic pattern of harboring a particular of AIDS, the transplant of organs can also type of persistent infection. How then do new lead to persistent virus reactivation, as was persistent viral infections come about? As seen in the reactivation of BKV in persistence often requires specific mechanism transplanted kidneys, or HCV in liver and gene function to avoid immune elimination, transplants. Also, persistent asymptomatic maintain persistence or allow viral reactivation, VZV in skin and thymus can reactivate in it represents a more complex, generally multi- xenografts. Similar kinds of tissue gene phenotype. 
Consistent with this idea is that transplants in mice have also resulted in the persisting DNA viruses often have a large reactivation of silent persistent infections, number of disposable ‘accessory’ genes that can such as the transplant of mouse brain slices be deleted for virus propagation in culture. The resulting in the reactivation of mouse CMV complexity of this situation leads us to expect in 75% of the subjects, due, probably, to that colonization of a host by new persistent viral stem or progenitor cell proliferation. agents will be relatively uncommon, and perhaps Xenotransplantation (across species) can an evolutionary important event. However, also lead to persistence reactivation and genetic evidence suggests that such new acute transmission to other species. For colonization must occasionally happen. In the example, pig kidneys will reactivate CMV section on rodents RNA viruses above, we noted when implanted into baboon. In addition to evidence with both hantaviruses and arenaviruses these exogenous agents, most mammalian have on occasion, jumped into other related tissue will also prose the possibility of the rodent species to establish a new line of virus reactivation of ERVs in the germ line, such and persistently infected host. Can we say as PERVs in pig tissue. Such a placement of anything about the factors that might affect this 247 situation? There are few experimental favored when highly prevalent acute viral agents results that are relevant to this issue so much are infecting a specific host. The transmission of of the consideration must be more from a persistence to uninfected members of the same or theoretical perspective. We did mention the related species has been seen to often involve an stable colonization of wild American ‘addiction-module’ or ‘persistence-module’ like drosophila malanogaster species with process. The drosophila melanogaster, gypsy retroviral agents in chapter 7. 
This process retrovirus story mentioned above in which viral was associated with sexual compatibility transmission was also sex associated is an (hybrid dysgenesis) between two previously example of this situation. However, as isolated species of drosophila and also mentioned, we have no examples of this situation appeared to have elements of an addiction occurring in mammalian species, yet their module (protection if infected, harm if not genomes are highly colonized by ERVs. As we infected). However, in the vertebrate have mentioned, ERV acquisition as a way to animals, we are hard pressed to point to any protect against autonomous versions of the virus experimental observations that shows the has often been invoked to explain the presence of new colonization by a persisting virus. One many ERVs in the genomes of mammals. possibly related example would be the Lake Casitas feral mice and the role of Let us consider this issue as a theoretical endogenous retrovirus in the emergence extrapolation the ongoing human pandemic with autonomous retrovirus in the F1 cross of two HIV 1 to see how such ERV and defective-ERV sub-species. However, this is a loss of acquisition might come about. If we assume no protection from acute disease by the intervention by human culture and science, we corresponding sex mediated loss of the can envision a resulting HIV 1 epidemic that is persistent and protecting ERV, not a gain of not controlled and could even result in increased a new persistent infection. The limited rates of HIV 1 transmission. In this populations of chimpanzees that harbor circumstance, we can easily imagine that SIVcpz in Africa may be one example. essentially the entire communicating human population might become infected with HIV. 
As Although we lack experimental examples of most infected people would be expected to the acquisition of new persisting viruses in succumb to the infection, those few survivors mammals, we can still consider this would be non-progressors that still had the virus. possibility from a theoretical perspective. Thus the resulting human population would be The issues to examine include - what persistently HIV infected, but the infection conditions favor the acquisition of new would be asymptomatic. However, such a persisting agents, what favors the persistently infected mother would still pose a transmission of persistence and what are the to her uninfected offspring. consequences to the colonized host. One This would provide a selective pressure that theoretical issue would be that if versions of would favor germline HIV and HIV defectives acute virus infecting a particular species colonization of her offspring’s genome that were highly prevalent, this would favor the would establish a persistence module in these establishment and survival of a persisting children. In order to stabilize this persistence virus (generally derived from variants or against all the genetic variants of HIV that might defectives of the same acute virus) that prevail, this persistence module would itself would bestow immunity to the persistently need to inhibit a population of HIV, hence many infected host. Since by definition, slight variations of germline HIV defectives persistence requires the regulation of acute might be needed to stabilize this state of viral replication, in this situation persistently persistence. The resulting new human species infected host would be more likely to would be inert to HIV. Thus in the situation of a survive. Thus persistence acquisition is prevalent and lethal viral infection, we can see 248 selective conditions that would favor the to differentiate human from chimpanzee coding survival of a persistent state in the genome. DNA, which are 99% similar. 
Yet humans and Furthermore, this surviving population their primate relatives differ in some rather major would pose a risk of HIV infection were it to and complex ways, especially with respect to contact another human population that had cognition and language use. Can we relate the not undergone the HIV colonization and differences we see in their respective DNAs to selection. Finally, it is worth noting that if corresponding differences in their primate such a human species were to evolve, it phenotypes? What distinguishes human from would now posses various new molecular chimpanzee and do their corresponding ERVs circuits, such as vpu, nef, tat, plus a new env, have any role in this distinction? Which of these all of which would now be available for differences are the most recent changes? additional evolutionary selection and evolution of the human genome. Do such The largest and most distinct differences events really happen? There are some between human and chimpanzee genomes are to reasons to question this scenario. For be found in their corresponding Y chromosomes. example, HIV1 is a lentivirus and all Yet the recently sequenced human Y lentiviruses have distinct RT sequences, in chromosome shows that it codes for a addition to various accessory genes. surprisingly small number of genes (about 70 Although lentiviruses are not highly possilbe ORFs, but perhaps 20 authentic prevalent in most natural populations, other proteins) and half of these are retroposon species-specific versions are known. associated. Thus the biggest difference between However, there do not seem to be any human and chimpanzee is in the Y chromosome examples of endogenous retroviruses that and this differs mainly in the pattern of have the lentivirus RT sequence. Most retroposon or ERV colonization. This ERV endogenous retroviruses have an MLV-like difference also applies to the autosomes in that RT sequences. 
Perhaps germline lentivirus these chromosomes also show evidence of recent integration is precluded for some unknown HERV colonization. Overall, humans genomes reason. Yet we know that lentivirus based have about twice the content of HERV/LINE vectors are very efficient at elements as do chimpanzee genomes. Thus there germ line integration in mouse transgenic must have occurred a major event leading to studies. Can we see examples of ERV human ERV colonization. And it is the HERV K colonization events that occurred at the family of viruses (and their SINE derivatives) origin of the various mammalian lineages that most distinguishes the human from the that resemble this hypothetical scenario? chimpanzee genome.

Human/primate evolution and ERV Human cognition and ERVs. The acquisition acquisition. About a million years ago, the of human language and the associated changes in human and chimpanzee primate lineages human cognition are the most complex diverged from a common ancestor distinctions between human and chimpanzee. It somewhere in Africa. With the completion is thought that brain lateralization is involved in of the human genome, we are now better this processing of language as language is clearly able to consider what types of genetic processed in an asymmetric manner by the changes were associated with this recent . Brain lateralization and language human evolution. However, one of the acquisition is also thought to be related to surprises of the was acquisition of abstraction. Currently, this is a how relatively little difference there was very poorly understood process so we can say between the coding sequences of humans little about its underlying mechanisms or how it compared to chimpanzees. Based on coding might have come about during human evolution. sequences alone, one would be hard pressed However, it also appears that abstraction and 249 brain lateralization are related to a uniquely may have lacked such a high degree of social human disease, schizophrenia (which cooperation as present in modern humans. What appears absent from other primates). Some might be the genetic changes that predispose feel that schizophrenia and psychosis are modern humans to such behaviors? All of these associated with brains that are closer to social situations represent cooperative group symmetry and thus associated with activities or systems that are supportive of, lateralization. All human populations are pleasurable for and/or beneficial to participants prone to schizophrenia at similar rates, so no attached in the group and harmful, painful or environmental agents appear to be involved. punishing to individuals outside the group. 
In Schizophrenia strongly reduces attachment fact, it has recently been reported that social behavior. There is clearly a genetic rejection activates the same region of the brain as component to the disease, but its nature is does physical pain. What is the evolutionary unknown. Lateralization is developmentally force that drives this acquisition of such controlled but how this might be perturbed behavioral complexity? One experimental model with schizophrenia is not clear. Its onset, that seeks to understand the mechanisms of post puberty, also shows a clear sex associative learning (via attachment behavior) is association and can be seen in XO and XXY the study of partner preference formation in male humans. There is, however, a curious link prairie voles. The mating dependent learning in between schizophrenia and human specific these voles results in life-long monogamous HERV K family sequences. Affected bonding between mates. Vasopressin receptor regions of schizophrenic brains have been and dopamine are known to be involved in this shown to express SINE-R.C2 at high levels, learning, possibly affecting both reward and a human specific HERV K10 family agent anxiety circuits. Brain imaging studies indicate found at high copy level on the X and Y that this attachment learning affects the same chromosome. As the role of SINE-R,C2 region of the brain as affected by drug addiction. expression in brain function is not known, Intriguingly, infection of the affected region of we cannot now propose a causal role for the vole brains with a recombinant adenovirus ERVs in human cognition. However, the expressing the vasopressin receptor results in the congruence of these circumstances are same attachment behavior. Humans are not highly intriguing. voles, nor is human sexual behavior so simple. Yet the human genome has in recentl Complex behavior and virus. Other evolutionary times been colonized by various aspects of complex human behavior also viral agents. 
We have already noted situations in pose a major evolutionary puzzle. However, which viruses can clearly colonize and to most readers, the idea that viral genetic manipulate the complex sexual behavior of their parasites might be involved in the evolution host, sometimes via hormonal control of host of such complex human behavior might nervous tissue (e.g. the very successful seem ludicrous. Yet let us consider some of polydnaviruses and other viruses of parasitoid these most humane traits from the wasp). Because persistent viruses can be highly perspective of possible viral involvement. dependent on specific sexual behaviors of their One such puzzle is the acquisition of host, there is an inherent link between virus associative learning, which relates to the survival and host sexual behavior. Persistence is learning and development of social often attained and maintained by attachments and the development of human superimposition of addiction modules. cooperation and society. Such cooperation Retroviruses are especially notable as the one underlies family, tribal, cultural, religious viral family that encodes for very large number and social cohesion as well as altruistic of receptor molecules. It thus seems plausible behavior. It has recently been proposed that that an ancient viruses could also have affected other extinct human lineages (Neanderthals) the evolution of human behavior by 250 superimposing addictive behavioral to contribute to the fabric of life. With genomic elements that were responsive to specific sequences, however, we can no longer deny them environmental stimuli. However, although an important role in the evolution of life. Their human CNS has and expresses many such presence has been well documented. 
However, elements, we have no specific viral agent we still have a long way to go towards that we can currently identify as a candidate understanding this network of life that includes which could be used to experimentally evolutionary history of viruses. In this regard evaluate this idea. viruses confuse us with their multiple identities, mixed lineage, mixed clock rates, episodic or With respect to the general consequences of punctuated emergence and their dizzying viral agents to human survival, human social diversity. We prefer to think of the tree of life as behavior matters greatly as survival of the a coherent and congruent topology. If this viral social group is strongly affected by altruistic character were overlaid onto the tree of life, we group behavior. It is the social or group might be left net or shroud on this tree that behavioral response to HIV and to SARS confuses us rather then clarifies our thinking. epidemic that has mainly been responsible There is clearly a discernable decent of the for limiting human deaths due to these lineage of living species so the Tree of Life agents. In a sense, this behavior is a analogy reflects what we can see in living successful ‘persistence phenotype’ that has organisms. Perhaps that is why their remains a precluded other competing viral agents from big reluctance to include viruses in the tree of succeeding in human host. Thus human life since although their origins and lineage are behavior is a most powerful and adaptive old and their influence on all life major, their phenotype with respect to the unending evolution is complex and not consistent with our threats posed by emergent human viruses. accepted ‘tree’ topology. The inclusion of virus would be force us away from the comfortable Viral role in the tree of life; a never and useful analogy provided by the ‘Tree of ending story of creation. This book has Life’ into new undefined and seemingly examined how viruses influence the incomprehensible paradigms. 
However, we can evolution of life through the perspective of now also begin to see that viruses can be acute and persisting viruses. Form the involved in and contribute to the origin of prebiotic beginnings of replicator molecules species. Viruses, in their competition with and replicating programs, to the first cells; themselves, provide a selective pressure that from the origin of the eukaryotic nucleus to differentiates host lineages and can make multicellularity; from worms to humans, we previously compatible host populations become have surveyed how viruses influence their incompatible. They can also add new layers of host and provide a never ending and complexity and genetic identity onto their host in dynamic environment for the weave of life. episodic events through persistence and genome All life has been touched by viral influences colonization. This process would appear to be and most genomes clearly show the lasting cumulative, resulting in ever-greater host evidence of viral footprints. These complexity. Since their discovery over 100 years footprints represent those viral genomes that ago, viruses have been inherently inscrutable and have persisted and continue to be left on aside from causing disease they remain almost life’s genomes, as our own human DNA can invisible to most biologist. It is time to attest. Yet, most evolutionary biologist acknowledge these unseen creators that can and continue not to think of viruses as an do explore the vastness of sequence space at element of the tree of life. Viruses were not previously unimagined rates. perceived to be living entities. Yet viruses have all the characteristics needed in order An inherently invisible nature of viruses persists to be subjected to the laws of evolution and to this day. Consider the oceans, that vast 251 ancient cauldron that gave birth to all life on (Dimcheff, Drovetski et al. 2000; Dimcheff, this planet. Few realize that the oceans are Krishnan et al. 
2001) also vast cauldrons of virus with unending, (Huder, Boni et al. 2002) hyper-accelerated rates of evolution. (Leblanc, Desset et al. 2000) Measurements indicate that the oceans have about 1031 virions or phage particles in total. ERVs and placenta The great majority of this virus corresponds to large DNA viruses and phage that can (Seman, Levy et al. 1975) acutely and lysogenically infect most (Levy 1975) bacteria and algae. To get a sense of the (Levy 1977) vastness of this number, we can estimate (Levy, Joyner et al. 1980) that most of these virions will have a (Nelson, Levy et al. 1981) diameter of about 100 nm. If laid end to (Levy, Oleszko et al. 1982) end, this virus would span the diameter of (Revoltella and Consiglio nazionale delle the observable universe (about 1024 meters)! ricerche (Italy) 1982) In addition, the great majority of this virus turns over every day. If this pattern has ERVs and the live birth hypothesis. persisted for the last three billion years, which appears likely, then there have been (Villarreal and Villareal 1997) about 1043 generations of individual virus (Espinosa and Villarreal 2000) during this period. Given the high rates of (Harris 1998) recombination and genetic variation inherent (Nakagawa and Harrison 1996) in DNA virus replication, we can see that (Hohenadl, Leib-Mosch et al. 1996) there has been a vast exploration the (Mi, Lee et al. 2000) sequence space by this viral process and (Bromham 2002) successful genomes that have colonized host (Nilsson, Jin et al. 1999) cells might be expected to persist in the (Larsson and Andersson 1998) ecosystem and contribute this vast creativity (de Parseval, Casella et al. 2001) to the tree of life. Perhaps from such a (Stoye and Coffin 2000) perspective, we can better appreciate why (Mi, Lee et al. 2000) our very own human genomes appear to represent an ocean of ancient retroviral Mouse evolution and virus elements. (Gottlieb 2001) Suggested reading. (Hook, Jude et al. 
2002) (Singleton, Smith et al. 1993) Mammalian evolution (Hart and Bennett 1999) (Monroe, Morzunov et al. 1999) (Murphy, Eizirik et al. 2001; Murphy, (Nemirov, Henttonen et al. 2002) Eizirik et al. 2001) (Charrel, Feldmann et al. 2002) (Conroy Chris and Cook Joseph 1999) (Hughes Austin and Friedman 2000)

Non-placental ERVs Negative strand RNA viruses

(Tristem, Herniou et al. 1996) (Tidona, Kurz et al. 1999) (Hanger, Bromham et al. 2000) (Wang, Yu et al. 2000) (Smith and Fadly 1994; Iraqi and Smith (Wang, Harcourt et al. 2001) 1995; Fadly and Smith 1997) (Guyatt, Twin et al. 2003)

252 (Badrane, Bahloul et al. 2001; Badrane and (Barbulescu, Turner et al. 1999; Barbulescu, Tordo 2001; Le Mercier, Jacob et al. 2002) Turner et al. 2001; Turner, Barbulescu et al. (Davis, Zajac et al. 2002) 2001) (Tonjes, Czauderna et al. 1999) The Influenza A story. Emergent disease Human language, schizophrenia and ERVs (Gammelin, Altmuller et al. 1990) (Laver, Bischofberger et al. 2000) (Berlim, Mattevi et al. 2003) (Schafer, Kawaoka et al. 1993) (Crow Timothy 1993) (Brownlee and Fodor 2001) (Crow 1997) (Fanning, Slemons et al. 2002) (Kim, Wadekar Rekha et al. 1999) (Makarova, Wulf Yu et al. 1998) (Crow 2000) (Highley, McDonald et al. 1999) Small DNA viruses (Kim, Takenaka et al. 1999) (Yolken, Karlsson et al. 2000; Karlsson, (Bahr, Schondorf et al. 2003) Bachmann et al. 2001) (Sugimoto, Kitamura et al. 1997; Ikegaya, (Crow 1999) Iwase et al. 2002) (Antonsson, Forslund et al. 2000) References.

Herpesviruses

(Bahr and Darai 2001)
(Gentry, Lowe et al. 1988)
(Huff and Barry 2003)
(Lacoste, Mauclere et al. 2000)
(Darai, Koch et al. 1982)
(McGeoch, Dolan et al. 2000)
(Zong, Ciufo et al. 2002)

Poxviruses

(Chantrey, Meyer et al. 1999; Hazel, Bennett et al. 2000)
(Afonso, Tulman et al. 2002)
(Sandvik, Tryland et al. 1998)
(Begon, Hazel et al. 1999)
(Feore, Bennett et al. 1997)
(Senkevich, Koonin et al. 1997)

X, Y and ERVs

(Jones 2003)
(Zsiros, Jebbink et al. 1998)
(Zsiros, Jebbink et al. 1999)
(Dimitri and Junakovic 1999)

Human ERVs.

Afonso, C. L., E. R. Tulman, et al. (2002). "The genome of virus." Virology 295(1): 1-9.
Antonsson, A., O. Forslund, et al. (2000). "The ubiquity and impressive genomic diversity of human skin papillomaviruses suggest a commensalic nature of these viruses." J Virol 74(24): 11636-41.
Badrane, H., C. Bahloul, et al. (2001). "Evidence of two Lyssavirus phylogroups with distinct pathogenicity and immunogenicity." J Virol 75(7): 3268-76.
Badrane, H. and N. Tordo (2001). "Host switching in Lyssavirus history from the Chiroptera to the Carnivora orders." J Virol 75(17): 8096-104.
Bahr, U. and G. Darai (2001). "Analysis and characterization of the complete genome of (tree shrew) herpesvirus." J Virol 75(10): 4854-70.
Bahr, U., E. Schondorf, et al. (2003). "Molecular Anatomy of Tupaia (Tree Shrew) ; Evolution of Viral Genes and Viral Phylogeny." Virus Genes 27(1): 29-48.
Barbulescu, M., G. Turner, et al. (1999). "Many human endogenous retrovirus K (HERV-K) proviruses are unique to humans." Curr Biol 9(16): 861-8.
Barbulescu, M., G. Turner, et al. (2001). "A HERV-K provirus in chimpanzees, bonobos and gorillas, but not humans." Curr Biol 11(10): 779-83.
Begon, M., S. M. Hazel, et al. (1999). "Transmission dynamics of a zoonotic pathogen within and between wildlife host species." Proc R Soc Lond B Biol Sci 266(1432): 1939-45.
Berlim, M. T., B. S. Mattevi, et al. (2003). "The etiology of schizophrenia and the origin of language: overview of a theory." Compr Psychiatry 44(1): 7-14.
Bromham, L. (2002). "The human zoo: Endogenous retroviruses in the human genome." Trends in Ecology & Evolution 17(2): 91-97.
Brownlee, G. G. and E. Fodor (2001). "The predicted antigenicity of the haemagglutinin of the 1918 Spanish influenza pandemic suggests an avian origin." Philos Trans R Soc Lond B Biol Sci 356(1416): 1871-6.
Chantrey, J., H. Meyer, et al. (1999). "Cowpox: reservoir hosts and geographic range." Epidemiol Infect 122(3): 455-60.
Charrel, R. N., H. Feldmann, et al. (2002). "Phylogeny of New World arenaviruses based on the complete coding sequences of the small genomic segment identified an evolutionary lineage produced by intrasegmental recombination." Biochem Biophys Res Commun 296(5): 1118-24.
Conroy Chris, J. and A. Cook Joseph (1999). "MtDNA evidence for repeated pulses of speciation within arvicoline and murid rodents." Journal of Mammalian Evolution 6(3): 221-245.
Crow Timothy, J. (1993). "Origins of psychosis and the evolution of human language and communication." International Academy for Biomedical & Drug Research. Brunello, N.; Mendlewicz, J.; Racagni, G.: Eds. International Academy for Biomedical and Drug Research; New generation of antipsychotic drugs: Novel mechanisms of action 4: 39-61.
Crow, T. J. (1997). "Aetiology of schizophrenia: An echo of the speciation event." International Review of 9(4): 321-330.
Crow, T. J. (1999). "Commentary on Annett, Yeo et al., Klar, Saugstad and Orr: cerebral asymmetry, language and psychosis--the case for a Homo sapiens-specific sex-linked gene for brain growth." Schizophr Res 39(3): 219-31.
Crow, T. J. (2000). "Schizophrenia as the price that Homo sapiens pays for language: A resolution of the central paradox in the origin of the species." Brain Research Reviews 31(2-3): 118-129.
Darai, G., H. G. Koch, et al. (1982). "Tree shrew (Tupaia) herpesviruses." Dev Biol Stand 52: 39-51.
Davis, I. C., A. J. Zajac, et al. (2002). "Elevated generation of reactive oxygen/nitrogen species in hantavirus cardiopulmonary syndrome." J Virol 76(16): 8347-59.
de Parseval, N., J. Casella, et al. (2001). "Characterization of the three HERV-H proviruses with an open envelope reading frame encompassing the immunosuppressive domain and evolutionary history in primates." Virology 279(2): 558-69.
Dimcheff, D. E., S. V. Drovetski, et al. (2000). " and horizontal transmission of avian sarcoma and leukosis virus gag genes in galliform birds." J Virol 74(9): 3984-95.
Dimcheff, D. E., M. Krishnan, et al. (2001). "Evolution and characterization of tetraonine endogenous retrovirus: a new virus related to avian sarcoma and leukosis viruses." J Virol 75(4): 2002-9.
Dimitri, P. and N. Junakovic (1999). "Revising the selfish DNA hypothesis: new evidence on accumulation of transposable elements in heterochromatin." Trends Genet 15(4): 123-4.
Espinosa, A. and L. P. Villarreal (2000). "T-Ag inhibits implantation by EC cell derived embryoid bodies." Virus Genes 20(3): 195-200.
Fadly, A. M. and E. J. Smith (1997). "Role of contact and genetic transmission of endogenous virus-21 in the susceptibility of chickens to avian leukosis virus infection and tumors." Poult Sci 76(7): 968-73.
Fanning, T. G., R. D. Slemons, et al. (2002). "1917 avian influenza virus sequences suggest that the 1918 pandemic virus did not acquire its directly from birds." J Virol 76(15): 7860-2.
Feore, S. M., M. Bennett, et al. (1997). "The effect of cowpox virus infection on fecundity in bank voles and wood mice." Proc R Soc Lond B Biol Sci 264(1387): 1457-61.
Gammelin, M., A. Altmuller, et al. (1990). "Phylogenetic analysis of nucleoproteins suggests that human influenza A viruses emerged from a 19th-century avian ancestor." Mol Biol Evol 7(2): 194-200.
Gentry, G. A., M. Lowe, et al. (1988). "Sequence analyses of herpesviral enzymes suggest an ancient origin for human sexual behavior." Proc Natl Acad Sci U S A 85(8): 2658-61.
Gottlieb, K. A. (2001). Polyomavirus replication in the lungs of mice: link to host cell differentiation and the role of the early proteins. Molecular Biology and biochemistry. Irvine, University of California, Irvine: 217.
Guyatt, K. J., J. Twin, et al. (2003). "A molecular epidemiological study of Australian bat lyssavirus." J Gen Virol 84(Pt 2): 485-96.
Hanger, J. J., L. D. Bromham, et al. (2000). "The nucleotide sequence of koala (Phascolarctos cinereus) retrovirus: a novel type C endogenous virus related to Gibbon ape leukemia virus." J Virol 74(9): 4264-72.
Harris, J. R. (1998). "Placental endogenous retrovirus (ERV): structural, functional, and evolutionary significance." Bioessays 20(4): 307-16.
Hart, C. A. and M. Bennett (1999). "Hantavirus infections: epidemiology and pathogenesis." Microbes Infect 1(14): 1229-37.
Hazel, S. M., M. Bennett, et al. (2000). "A longitudinal study of an endemic disease in its wildlife reservoir: cowpox and wild rodents." Epidemiol Infect 124(3): 551-62.
Highley, J. R., B. McDonald, et al. (1999). "Schizophrenia and temporal lobe asymmetry. A post-mortem stereological study of tissue volume." Br J Psychiatry 175: 127-34.
Hohenadl, C., C. Leib-Mosch, et al. (1996). "Biological significance of human endogenous retroviral sequences." J Acquir Immune Defic Syndr Hum Retrovirol 13 Suppl 1: S268-73.
Hook, L. M., B. A. Jude, et al. (2002). "Characterization of a novel murine retrovirus mixture that facilitates hematopoiesis." J Virol 76(23): 12112-22.
Huder, J. B., J. Boni, et al. (2002). "Identification and characterization of two closely related unclassifiable endogenous retroviruses in pythons (Python molurus and Python curtus)." J Virol 76(15): 7607-15.
Huff, J. L. and P. A. Barry (2003). "B-virus (Cercopithecine herpesvirus 1) infection in humans and macaques: potential for zoonotic disease." Emerg Infect Dis 9(2): 246-50.
Hughes Austin, L. and R. Friedman (2000). "Evolutionary diversification of protein-coding genes of hantaviruses." Molecular Biology & Evolution 17(10): 1558-1568.
Ikegaya, H., H. Iwase, et al. (2002). "JC virus offers a new means of tracing the origins of unidentified cadavers." Int J Legal Med 116(4): 242-5.
Iraqi, F. and E. J. Smith (1995). "Organization of the sex-linked late-feathering haplotype in chickens." Anim Genet 26(3): 141-6.
Jones, S. (2003). Y: the descent of men. Boston, Houghton Mifflin.
Karlsson, H., S. Bachmann, et al. (2001). "Retroviral RNA identified in the cerebrospinal fluids and brains of individuals with schizophrenia." Proc Natl Acad Sci U S A 98(8): 4634-9.
Kim, H.-S., V. Wadekar Rekha, et al. (1999). "SINE-R C2 (a Homo sapiens specific retroposon) is homologous to cDNA from postmortem brain in schizophrenia and to two loci in the Xq21 3/Yp block linked to handedness and psychosis." American Journal of Medical Genetics 88(5): 560-566.
Kim, H. S., O. Takenaka, et al. (1999). "Isolation and phylogeny of endogenous retrovirus sequences belonging to the HERV-W family in primates." J Gen Virol 80(Pt 10): 2613-9.
Lacoste, V., P. Mauclere, et al. (2000). "KSHV-like herpesviruses in chimps and gorillas." Nature 407(6801): 151-2.
Larsson, E. and G. Andersson (1998). "Beneficial role of human endogenous retroviruses: facts and hypotheses." Scand J Immunol 48(4): 329-38.
Laver, W. G., N. Bischofberger, et al. (2000). "The origin and control of pandemic influenza." Perspect Biol Med 43(2): 173-92.
Le Mercier, P., Y. Jacob, et al. (2002). "A novel expression cassette of lyssavirus shows that the distantly related Mokola virus can rescue a defective rabies virus genome." J Virol 76(4): 2024-7.
Leblanc, P., S. Desset, et al. (2000). "Life cycle of an endogenous retrovirus, ZAM, in Drosophila melanogaster." J Virol 74(22): 10658-69.
Levy, J. A. (1975). "Host range of murine xenotropic virus: replication in avian cells." Nature 253(5487): 140-2.
Levy, J. A. (1977). "Endogenous C-type viruses in normal and "abnormal" cell development." Cancer Res 37(8 Pt 2): 2957-68.
Levy, J. A., J. Joyner, et al. (1980). "Mouse sperm can horizontally transmit type C viruses." J Gen Virol 51(Pt 2): 439-43.
Levy, J. A., O. Oleszko, et al. (1982). "Murine xenotropic type C viruses. IV. Replication and pathogenesis of ducks." J Gen Virol 61(Pt 1): 65-74.
Makarova, K. S., I. Wulf Yu, et al. (1998). "[Different patterns of molecular evolution of influenza A viruses in avian and human population]." Genetika 34(7): 890-6.
McGeoch, D. J., A. Dolan, et al. (2000). "Toward a comprehensive phylogeny for mammalian and avian herpesviruses." J Virol 74(22): 10401-6.
Mi, S., X. Lee, et al. (2000). "Syncytin is a captive retroviral envelope protein involved in human placental morphogenesis." Nature 403(6771): 785-9.
Monroe, M. C., S. P. Morzunov, et al. (1999). "Genetic diversity and distribution of Peromyscus-borne hantaviruses in North America." Emerg Infect Dis 5(1): 75-86.
Murphy, W. J., E. Eizirik, et al. (2001). "Molecular phylogenetics and the origins of placental mammals." Nature 409(6820): 614-8.
Murphy, W. J., E. Eizirik, et al. (2001). "Resolution of the early placental mammal radiation using Bayesian phylogenetics." Science 294(5550): 2348-51.
Nakagawa, K. and L. C. Harrison (1996). "The potential roles of endogenous retroviruses in autoimmunity." Immunol Rev 152: 193-236.
Nelson, J. A., J. A. Levy, et al. (1981). "Human placentas contain a specific inhibitor of RNA-directed DNA polymerase." Proc Natl Acad Sci U S A 78(3): 1670-4.
Nemirov, K., H. Henttonen, et al. (2002). "Phylogenetic evidence for host switching in the evolution of hantaviruses carried by Apodemus mice." Virus Res 90(1-2): 207-15.
Nilsson, B. O., M. Jin, et al. (1999). "Expression of envelope proteins of endogeneous C-type retrovirus on the surface of mouse and human oocytes at fertilization." Virus Genes 18(2): 115-20.
Revoltella, R. P. and Consiglio nazionale delle ricerche (Italy) (1982). Expression of differentiated functions in cancer cells. New York, Raven Press.
Sandvik, T., M. Tryland, et al. (1998). "Naturally occurring : potential for recombination with vaccine vectors." J Clin Microbiol 36(9): 2542-7.
Schafer, J. R., Y. Kawaoka, et al. (1993). "Origin of the pandemic 1957 H2 influenza A virus and the persistence of its possible progenitors in the avian reservoir." Virology 194(2): 781-8.
Seman, G., B. M. Levy, et al. (1975). "Type-C virus particles in placenta of the cottontop marmoset (Saguinus oedipus)." J Natl Cancer Inst 54(1): 251-2.
Senkevich, T. G., E. V. Koonin, et al. (1997). "The genome of molluscum contagiosum virus: analysis and comparison with other poxviruses." Virology 233(1): 19-42.
Singleton, G. R., A. L. Smith, et al. (1993). "Prevalence of viral antibodies and helminths in field populations of house mice (Mus domesticus) in southeastern Australia." Epidemiol Infect 110(2): 399-417.
Smith, E. J. and A. M. Fadly (1994). "Male-mediated venereal transmission of endogenous avian leukosis virus." Poult Sci 73(4): 488-94.
Stoye, J. P. and J. M. Coffin (2000). "A provirus put to work." Nature 403(6771): 715, 717.
Sugimoto, C., T. Kitamura, et al. (1997). "Typing of urinary JC virus DNA offers a novel means of tracing human migrations." Proc Natl Acad Sci U S A 94(17): 9191-6.
Tidona, C. A., H. W. Kurz, et al. (1999). "Isolation and molecular characterization of a novel cytopathogenic paramyxovirus from tree shrews." Virology 258(2): 425-34.
Tonjes, R. R., F. Czauderna, et al. (1999). "Genome-wide screening, cloning, chromosomal assignment, and expression of full-length human endogenous retrovirus type K." J Virol 73(11): 9187-95.
Tristem, M., E. Herniou, et al. (1996). "Three retroviral sequences in amphibians are distinct from those in mammals and birds." J Virol 70(7): 4864-70.
Turner, G., M. Barbulescu, et al. (2001). "Insertional polymorphisms of full-length endogenous retroviruses in humans." Curr Biol 11(19): 1531-5.
Villarreal, L. P. and L. P. Villareal (1997). "On viruses, sex, and motherhood." J Virol 71(2): 859-65.
Wang, L., B. H. Harcourt, et al. (2001). "Molecular biology of Hendra and Nipah viruses." Microbes Infect 3(4): 279-87.
Wang, L. F., M. Yu, et al. (2000). "The exceptionally large genome of Hendra virus: support for creation of a new genus within the family ." J Virol 74(21): 9972-9.
Yolken, R. H., H. Karlsson, et al. (2000). "Endogenous retroviruses and schizophrenia." Brain Res Brain Res Rev 31(2-3): 193-9.
Zong, J., D. M. Ciufo, et al. (2002). "Genotypic analysis at multiple loci across Kaposi's sarcoma herpesvirus (KSHV) DNA molecules: clustering patterns, novel variants and chimerism." J Clin Virol 23(3): 119-48.
Zsiros, J., M. F. Jebbink, et al. (1998). "Evolutionary relationships within a subgroup of HERV-K-related human endogenous retroviruses." J Gen Virol 79(Pt 1): 61-70.
Zsiros, J., M. F. Jebbink, et al. (1999). "Biased nucleotide composition of the genome of HERV-K related endogenous retroviruses and its evolutionary implications." J Mol Evol 48(1): 102-11.

Possible figures:

8-1. Figure of ERVs, LINEs, SINEs
8-2. Mammalian evolution timeline
8-3. ERV classification
8-4. Intact HERVs
8-5. SINE vertebrate relationship (needs permission, data from Jarka)
8-6. ERV colonization during mammalian evolution
8-7. Mouse blastocyst
8-8. Placental HERVs electron micrograph (needs permission from Levy)
8-9. Implanted mouse EC blastocyst with ERV (my work; Virus Genes)
8-10. Expressed ERV env genes (table)
8-11. Y evolution in humans (needs permission from Nature)
8-12. Viral emergence table
8-13. Hantavirus human emergence examples (table)
8-14. Mouse evolution dendrogram (needs permission)
8-15. NIDO/SARS virus evolution
8-16. Poxvirus dendrogram for evolution (needs permission or re-rendering)
8-17. Rhabdovirus dendrogram for evolution (needs permission)
8-18. The origin of HIV (needs permission from Nature, not essential)
8-19. Human evolution and SINEs (needs permission or rendering)
8-20. Human Y