EVOLUTION

Gorillas have been infected with the HERV-K (HML-2) endogenous retrovirus much more recently than humans and chimpanzees

Joseph R. Holloway (a,b,1), Zachary H. Williams (a,b,1), Michael M. Freeman (a,b), Uriel Bulow (a,b), and John M. Coffin (a,b,2)

(a) Department of Molecular Biology and Microbiology, Tufts University, Boston, MA 02111; and (b) Sackler School of Graduate Biomedical Sciences, Tufts University, Boston, MA 02111

Contributed by John M. Coffin, November 25, 2018 (sent for review August 17, 2018; reviewed by Robert J. Gifford, Jack Lenz, and Jonathan P. Stoye)

Human endogenous retrovirus-K (HERV-K) human mouse mammary tumor virus-like 2 (HML-2) is the most recently active endogenous retrovirus group in humans, and the only group with human-specific proviruses. HML-2 expression is associated with cancer and other diseases, but extensive searches have failed to reveal any replication-competent proviruses in humans. However, HML-2 proviruses are found throughout the catarrhine primates, and it is possible that they continue to infect some species today. To investigate this possibility, we searched for gorilla-specific HML-2 elements using both in silico data mining and targeted deep-sequencing approaches. We identified 150 gorilla-specific integrations, including 31 2-LTR proviruses. Many of these proviruses have identical LTRs, and are insertionally polymorphic, consistent with very recent integration. One identified provirus has full-length ORFs for all genes, and thus could potentially be replication-competent. We suggest that gorillas may still harbor infectious HML-2 virus and could serve as a model for understanding retrovirus evolution and pathogenesis in humans.

endogenous retroviruses | host–virus evolution | genome mining

Significance

Human endogenous retrovirus-K (HERV-K) human mouse mammary tumor virus-like 2 (HML-2) is the most recently active endogenous retrovirus group in humans. Their proviruses are also found within the genomes of all apes and Old World monkeys; however, no HML-2 provirus is known to be naturally infectious. Although these proviruses seem to be functionally extinct in both humans and chimpanzees, less is known about the profile and activity of HML-2 proviruses in gorillas. Our work here has identified gorilla-specific HML-2 elements that have characteristics consistent with very recent activity, and raises the possibility that gorillas may still contain infectious HML-2 virus. Thus, gorillas could serve as a model for how HML-2 functioned as a virus in humans, as well as shed light on its role in pathogenesis and host–virus evolution.

Author contributions: J.R.H., Z.H.W., and J.M.C. designed research; J.R.H., Z.H.W., M.M.F., and U.B. performed research; J.R.H., Z.H.W., M.M.F., and J.M.C. analyzed data; and J.R.H., Z.H.W., and J.M.C. wrote the paper.

Reviewers: R.J.G., MRC-University of Glasgow Centre for Virus Research; J.L., Albert Einstein College of Medicine; and J.P.S., Francis Crick Institute.

Conflict of interest statement: J.M.C., J.P.S., and R.J.G. are coauthors on a 2018 subcommittee report. This did not involve any active collaboration.

Published under the PNAS license.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. MH678754–MH678803 and MH684412–MH684461).

(1) J.R.H. and Z.H.W. contributed equally to this work.

(2) To whom correspondence should be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1814203116/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1814203116

Endogenous retroviruses (ERVs) are sequences found in the genomes of all vertebrates that were originally derived from exogenous retroviruses (1–3). These sequences are the result of retroviral infection and integration of the provirus into the genome of germ-line cells, and provide a record of past retroviral infections. Once integrated, such proviruses are permanent residents of the host and will be present in all cells of progeny derived from the infected germ-line cell (4, 5). Most ERV sequences have numerous mutations that render them noninfectious. Additionally, homologous recombination can occur between the 5′ and 3′ LTRs of a provirus after integration, leading to the loss of internal coding sequence and producing a solo LTR. About 90% of ERV integrations have been reduced to solo LTRs. However, replication-competent ERVs have been found in a number of species, and recombination between defective ERVs can also lead to the production of infectious virus (6, 7).

Human endogenous retroviruses (HERVs) constitute ∼8% of the genome (8), with ∼30 HERV groups represented (5, 9). The groups are currently named for the specific tRNA used for priming reverse transcription (10, 11), with the HERV-K group further divided into 11 subtypes that reflect their similarity to the infectious mouse mammary tumor virus (MMTV) (4, 12, 13). The human MMTV-like 2 (HML-2) subtype (hereafter called “HML-2”) is particularly interesting for a variety of reasons. In addition to having members that possess a number of full-length ORFs, it contains the youngest known HERV sequences, and is the only one known to have human-specific integrations (14). These proviruses are further classified into three subtypes based on LTR phylogeny: LTR5A, LTR5B, and LTR5Hs. LTR5B contains the oldest insertions, and the LTR5A and LTR5Hs clades branch separately out of this group (14). LTR5Hs includes the most recently integrated sequences and is the only group that has human-specific integrations, whereas LTR5A and 5B appear to have ceased activity in the hominoid lineage before the split of humans from gorillas. The youngest known HML-2 provirus may have integrated in humans as recently as 100,000 y ago, suggesting that this group was still active after the evolution of anatomically modern humans (15–17). Additionally, more than half of the known human-specific HML-2 proviruses are insertionally polymorphic, with some insertions present in fewer than 5% of individuals (12, 18–20).

Despite evidence of evolutionarily recent activity (as well as occasional reports to the contrary), attempts by multiple laboratories have failed to find any unambiguous evidence for ongoing HML-2 replication in humans. We previously reported our attempts to identify rare, recent HML-2 integrations in short-read sequence data from over 2,500 individuals in the 1000 Genomes Project (21). Although we were able to identify and characterize rare HML-2 integrations, including one with full-length ORFs for all genes, none of the proviruses appeared to be derived from recent activity (22). Although no infectious provirus is known, the high numbers of relatively intact, insertionally polymorphic HML-2 proviruses in humans have led researchers to investigate this group for links to disease (8, 23). Like most ERVs, HML-2 proviruses are usually transcriptionally silenced in healthy tissues, but transcription of specific proviruses has been observed in a number of disease states (24). Despite the abundance of research, no causative role has been proven for any HML-2 provirus in any disease. Attempts to prove such a role are hampered by insufficient knowledge of basic HML-2 biology, and of how HML-2 functions as a virus. Consensus HML-2 proviruses have been shown to be weakly infectious in vitro; however, it is unclear how well these experiments recapitulate how HML-2 is replicated in vivo (25, 26).

Although many HML-2 insertions are human-specific, there are a number of older HML-2 integrations at identical sites in all apes and Old World monkeys, dating their integration to as much as 35 million y ago, soon after the split with New World monkeys (14, 17, 27). Although these viruses may have become functionally extinct in humans, it is possible that active forms could currently exist in other primates, perhaps even in our closest relatives, chimpanzees and gorillas. Given the presence of shared HML-2 proviruses between these species, we thought it worthwhile to examine the possibility that chimpanzees and/or gorillas might have been subject to more recent, and possibly ongoing, infection and reintegration with HML-2 virus. Although chimpanzee-specific proviruses have been reported (28), the chimpanzee genome contains relatively low numbers of chimpanzee-specific HML-2 integrations in comparison with

(human) HML-2 5′ or 3′ LTR edges, filtering out any reads with junctions that matched known HML-2 proviruses in humans or gorillas and any reads with <10 bp of sequence flanking the insertion. From the 30 gorilla genomes screened, we identified 2,057 putative nonreference insertion sites, of which 184 had reads corresponding to both the 5′ and 3′ junctions (SI Appendix, Table S1). We focused our downstream analyses on this group of high-confidence, two-sided hits, though it is likely that many of the hits with reads corresponding to only 5′ or 3′ flanks are genuine. Of the two-sided 184 hits, 130 were found in a single subspecies, with 117 unique to Western lowland gorillas and 13 unique to the Eastern lowland gorillas. Thirty-two hits were found in the single Cross River sample; however, all of them were shared within the Western lowland subspecies. Of the 54 proviruses shared between two or more subspecies, 22 were found only in the Western and Eastern lowland gorillas, 18 only in Western and Cross River, and 14 sites were found in all 3 subspecies. As we show below, this extent of polymorphism far exceeds that seen in humans for the same provirus group.

In addition to identifying nonreference HML-2 insertions, we took advantage of the recently released long-read gorilla genome assembly (gorGor5) to identify further gorilla-specific insertions (32).
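The subspecies breakdown of the 184 two-sided hits reported above can be cross-checked with a short tally. The following is a minimal sketch, not the authors' analysis code: it assumes each hit is recorded by the set of subspecies in which it was found, and the dictionary layout, subspecies labels, and the function name `summarize` are illustrative choices. Only the counts are taken from the text.

```python
from collections import Counter

# Counts of two-sided hits as stated in the text, keyed by the set of
# subspecies carrying the insertion (labels are illustrative).
reported = {
    frozenset({"western"}): 117,
    frozenset({"eastern"}): 13,
    frozenset({"western", "eastern"}): 22,
    frozenset({"western", "cross_river"}): 18,
    frozenset({"western", "eastern", "cross_river"}): 14,
}


def summarize(counts):
    """Return (total, single-subspecies, shared, per-subspecies) tallies."""
    total = sum(counts.values())
    # Hits whose carrier set contains exactly one subspecies.
    single = sum(n for subs, n in counts.items() if len(subs) == 1)
    shared = total - single
    # A hit counts once toward every subspecies that carries it.
    per_subspecies = Counter()
    for subs, n in counts.items():
        for sub in subs:
            per_subspecies[sub] += n
    return total, single, shared, dict(per_subspecies)


total, single, shared, per_subspecies = summarize(reported)
print("two-sided hits:", total)
print("single-subspecies hits:", single)
print("shared hits:", shared)
print("hits seen in Cross River:", per_subspecies["cross_river"])
```

Under these assumptions the tally gives 184 hits in total, 130 confined to one subspecies, 54 shared, and 32 present in the Cross River sample, matching the figures quoted in the text. Every Cross River hit is also a Western lowland hit, because no category in the table contains Cross River without Western.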