
Molecular biology and structure of a novel penaeid shrimp densovirus elucidate convergent parvoviral host evolution

Judit J. Pénzes (a,b), Hanh T. Pham (a), Paul Chipman (b), Nilakshee Bhattacharya (c), Robert McKenna (b), Mavis Agbandje-McKenna (b,1,2), and Peter Tijssen (a,1,2)

(a) Institut Armand-Frappier, Institut national de la recherche scientifique-Institut Armand-Frappier, Laval, QC H7V 1B7, Canada; (b) The McKnight Brain Institute, University of Florida, Gainesville, FL 32610; and (c) Institute of Molecular Biophysics, Florida State University, Tallahassee, FL 32306

Edited by Kenneth I. Berns, University of Florida College of Medicine, Gainesville, FL, and approved July 8, 2020 (received for review April 27, 2020)

The giant tiger prawn (Penaeus monodon) is a decapod crustacean widely reared for human consumption. Currently, viruses of two distinct lineages of parvoviruses (PVs, family Parvoviridae; subfamily Hamaparvovirinae) infect penaeid shrimp. Here, a PV was isolated and cloned from Vietnamese P. monodon specimens and designated Penaeus monodon metallodensovirus (PmMDV). This is the first member of a third divergent lineage shown to infect penaeid decapods. PmMDV has a transcription strategy unique among invertebrate PVs, using extensive alternative splicing and incorporating transcription elements characteristic of vertebrate-infecting PVs. The PmMDV proteins have no significant sequence similarity with other PVs, except for an SF3 helicase domain in its nonstructural protein. Its capsid structure, determined by cryoelectron microscopy to 3-Å resolution, has a similar surface morphology to Penaeus stylirostris densovirus, despite the lack of significant capsid viral protein (VP) sequence similarity. Unlike other PVs, PmMDV folds its VP without incorporating a βA strand and displays unique multimer interactions, including the incorporation of a Ca2+ cation, attaching the N termini under the icosahedral fivefold symmetry axis, and forming a basket-like pentamer helix bundle. While the PmMDV VP sequence lacks a canonical phospholipase A2 domain, the structure of an EDTA-treated capsid, determined to 2.8-Å resolution, suggests an alternative membrane-penetrating cation-dependent mechanism in its N-terminal region. PmMDV is an observed example of convergent evolution among invertebrate PVs with respect to host-driven capsid structure, and is unique as a PV showing a cation-sensitive/dependent basket structure for an alternative endosomal egress.

Crustacea | capsid structure | Parvoviridae | densovirus | convergent evolution

Significance

Parvoviruses (PVs) are ssDNA viruses, with T = 1 icosahedral symmetry, infecting both deuterostomes and protostomes. Most PVs have a highly conserved phospholipase A2 domain (PLA2) in the N-terminal region of their minor capsid protein. Under acidic pH, during endosomal/lysosomal egress, the PLA2 domain is activated to disrupt vesicle membranes. However, certain PVs lack the PLA2 and thus must use a different escape mechanism. Our study offers insight into this enigma, showing how a recently discovered PV of a marine crustacean has evolved a cation-dependent mechanism to accomplish this task. We also show how host-driven convergent evolution pushed two PVs, infecting the same host species, to adopt strikingly similar surface morphologies, despite distinct multimer interactions and lack of sequence similarity.

Author contributions: J.J.P., M.A.-M., and P.T. designed research; J.J.P. and H.T.P. performed research; H.T.P., P.C., and N.B. contributed new reagents/analytic tools; J.J.P., R.M., M.A.-M., and P.T. analyzed data; and J.J.P., M.A.-M., and P.T. wrote the paper.
The authors declare no competing interest.
This article is a PNAS Direct Submission.
Published under the PNAS license.
(1) M.A.-M. and P.T. contributed equally to this work.
(2) To whom correspondence may be addressed. Email: [email protected] or peter.tijssen@iaf.inrs.ca.
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2008191117/-/DCSupplemental.

Densoviruses (DVs) are autonomous parvoviruses (PVs) of the family Parvoviridae infecting invertebrates. Until recently, all known PVs infecting invertebrate hosts were members of Densovirinae; however, they have recently been divided into two separate subfamilies: Densovirinae, composed of exclusively invertebrate-infecting PVs, and Hamaparvovirinae, which infect both invertebrates and vertebrates (1). The third Parvoviridae subfamily, Parvovirinae, contains exclusively vertebrate-infecting PVs. All PVs are nonenveloped, single-stranded DNA (ssDNA) viruses, with an approximate capsid diameter of 21.5 to 25 nm (2). They package relatively small genomes of 3.9 to 6.3 kb, flanked by two inverted terminal repeat (ITR)-containing palindromic sequences forming various hairpin-shaped secondary structures. The genome organization is conserved and includes two major expression cassettes. Conventionally these are referred to as rep, which encodes the nonstructural (NS) proteins, and cap, which encodes the capsid viral proteins (VPs), which may have different N-terminal extensions (2). Most Parvoviridae contain a phospholipase A2 (PLA2) domain in the N-terminal region of their VP1, which breaches the endosomal membrane during cellular trafficking (3, 4). DVs are pathogenic for their hosts (5) and were isolated from both proto- and deuterostome invertebrates, mostly insects and other arthropods (6–18).

To date, the parvoviruses of crustaceans comprise three distinct lineages with divergent genome organizations and transcription patterns. Cherax quadricarinatus DV, isolated from the freshwater red-clawed crayfish (Cherax quadricarinatus), has an ambisense genome with a PLA2 domain and is now an assigned member of genus Aquambidensovirus of the Densovirinae (19). Genera Hepanhamaparvovirus and Penstylhamaparvovirus are members of subfamily Hamaparvovirinae, each with one species that lacks a PLA2 domain. They both infect penaeid shrimps, including Penaeus monodon and Litopenaeus stylirostris (20–26). Hepanhamaparvoviruses possess a larger genome of ∼6.3 kb with 220-nt-long ITRs that form hairpins. The Penstylhamaparvovirus genome is only 3.9 kb long and, instead of hairpins, harbors direct terminal repeats (27), an exception among PVs.

In contrast to the ∼100 capsid structures determined for members of the Parvovirinae, there are only four high-resolution structures for invertebrate-infecting PVs (28). These include three members of the Densovirinae, Galleria mellonella DV (GmDV) of genus Protoambidensovirus at a resolution of 3.7 Å (29), Acheta domestica DV (AdDV) of genus Scindoambidensovirus at 3.5-Å resolution (30), and Bombyx mori DV 1 (BmDV1) of genus Iteradensovirus

at 3.1-Å resolution (31), and one member of the Hamaparvovirinae, Penaeus stylirostris DV (PstDV) of genus Penstylhamaparvovirus, at a resolution of 2.5 Å (32). All PV structures are T = 1 icosahedral (point group operator 5.3.2), consisting of 60 VP subunits. The VP core is structurally conserved with a jellyroll fold (33) flanked by loops inserted between the β-strands of the jellyroll, and strands and helices, forming the surface morphology. In the case of the Parvoviridae, the BIDG sheet of the jellyroll is complemented with an additional β-strand, strand A (28). The PV fivefold axis assembly forms a channel-like opening reported to aid genome packaging and uncoating, and PLA2 domain externalization when required (34, 35).

This study reports the complete genome sequence, expression strategy, and near-atomic 3D structure of a DV, designated P. monodon metallodensovirus (PmMDV), isolated from P. monodon shrimp. The virus was further characterized by phylogenetic inference of its relationship to other PVs, transcription mapping, and expression analysis. The PmMDV and PstDV capsids have convergently evolved similar morphologies. However, PmMDV incorporates unique strategies to stabilize its capsid, and it has possibly evolved an alternative membrane-penetrating mechanism, in the absence of the PLA2, that is dependent on divalent cations. Furthermore, PmMDV, as the third distinct lineage of PVs to infect penaeid shrimps, cannot be assigned to any of the current subfamilies and provides insights into PV capsid evolution.

Results

Detection and Cloning. Deceased P. monodon specimens showing clinical signs of a red telson, uropodia, and pleopods were acquired from a farm in southern Vietnam. Discoloration of the cephalothorax was observed, suggesting an underlying viral infection. Negative-staining electron microscopy (EM) of homogenized tissue revealed uniform ∼21-nm icosahedral particles (Fig. 1A). Consequently, extracted DNA was blunt-ended, cloned, and sequenced; it contained a previously unknown crustacean DV, designated PmMDV.

Complete Genome Characterization of the Crustacean DV. The complete genome sequence of PmMDV was deposited into GenBank under the accession number MK028683 (Fig. 1B). Its length was 4,374 nt, flanked by ITRs of 416 nt, of which 161 nt fold into a regular, T-shaped hairpin (Fig. 1B) (36). There was a single nucleotide insertion in the stem of the right ITR, which was present in all three clones sequenced. The overall GC content of the genome was 45.6%, with 76.4% at the termini, presumably stabilizing the secondary structure. In silico analysis revealed three ORFs with a length greater than 100 nt, as well as a fourth without a canonical start codon (summarized in SI Appendix, Table S1). The leftmost ORF, ORF1 (516 aa), displayed sequence identity with the major NS proteins of various PVs and bidensoviruses (family Bidnaviridae) with no greater coverage than 34% (B. mori 3; National Center for Biotechnology Information [NCBI] protein ID: ALJ76088)

www.pnas.org/cgi/doi/10.1073/pnas.2008191117 | PNAS Latest Articles

Fig. 1. Discovery, genome organization, and transcription analysis of PmMDV. (A) Homogenized tissue of infected P. monodon specimens diluted and examined by negative-staining EM, showing icosahedral particles. (B) The complete genome and transcription pattern of PmMDV. The ITRs are shown by the thinner lines compared to the thick line indicating the coding region of the genome. The noncoding regions of the transcripts are marked as horizontal lines. ORFs, as well as the putative proteins expressed by them, are shown as arrows of various colors. The pattern of the ORF corresponds with the pattern of the protein-coding region of mRNA transcripts. Transcripts of the upstream promoter p9 (nonstructural) are highlighted in green, whereas transcripts of the downstream promoter p47 (structural) are in blue. The panel below displays the secondary structure of the terminal hairpins of PmMDV. (C) Agarose gel of the PCR products from rep transcripts using a forward primer at position 1,100 and a reverse primer at 2,564. RNA was extracted 2, 4, and 6 d posttransfection.
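The promoter names and primer coordinates in Fig. 1 lend themselves to a small worked calculation. The sketch below assumes the usual parvovirus convention that one map unit equals 1% of the genome length (the text reports a 4,374-nt genome and places the first promoter at map unit nine), and it treats the primer positions 1,100 and 2,564 as the outer ends of the amplicon. The function names and the intron coordinates in the last call are hypothetical illustrations; the actual intron positions are listed in SI Appendix, Table S2 and are not reproduced here.

```python
GENOME_LENGTH_NT = 4374  # PmMDV genome length reported in the text


def map_unit_to_nt(map_unit, genome_length=GENOME_LENGTH_NT):
    """Approximate nucleotide position of a map unit.

    Assumption: 1 map unit = 1% of the genome length.
    """
    return round(genome_length * map_unit / 100)


def amplicon_length(forward_pos, reverse_pos, introns=()):
    """Length of an RT-PCR product between two primer positions (inclusive).

    `introns` holds (start, end) genome coordinates, inclusive, that are
    spliced out of the transcript. Only introns lying entirely between
    the two primers are subtracted.
    """
    length = reverse_pos - forward_pos + 1
    for start, end in introns:
        if forward_pos <= start and end <= reverse_pos:
            length -= end - start + 1
    return length


print(map_unit_to_nt(9))             # approximate position of p9
print(map_unit_to_nt(47))            # approximate position of p47
print(amplicon_length(1100, 2564))   # product from an unspliced template
# Hypothetical 300-nt intron, for illustration only (not from Table S2):
print(amplicon_length(1100, 2564, introns=[(1500, 1799)]))
```

Under these assumptions, a spliced transcript yields a shorter product than the unspliced one, which is the kind of size difference an agarose gel such as the one in panel C can resolve.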

and identity less than 36% (Simian parvo-like virus 3; NCBI protein ID: APC23168), hence judged to encode the NS proteins. ORF2 and ORF3 (143 and 133 aa, respectively) also possessed no detectable homology. ORF4 (369 aa) did not have any amino acid similarity with proteins submitted to GenBank to date; however, due to its rightmost location, a major VP role was assumed. Screening the NCBI Whole-Genome Shotgun (WGS) and Transcriptome Shotgun Assembly (TSA) databases, using the tBLASTn algorithm, resulted in a significant hit for NS and ORF4, respectively. An endogenous sequence overlapping 74% of the PmMDV ORF1 with 42% identity was present in the draft genome sequence of an amphipod crustacean, Trinorchestia longiramus (WGS ID: VCRD01000839). A TSA transcript was derived from the transcriptome of a decapod, the marine Japanese blue crab (Portunus trituberculatus) (TSA ID: GFFJ01053568), encoding a C-terminal truncated protein of 30% amino acid identity with ORF4.

Transcription Strategy. The transcriptome of PmMDV revealed six transcripts under the control of two promoters and coterminating at the same polyadenylation site of the genome (nucleotide 3829). The first promoter was identified at map unit nine, hence referred to as p9. Four of the six transcripts were transcribed from this proximal promoter, of which three underwent splicing (Fig. 1B). The electropherograms showing the intron–exon boundaries of these spliced transcripts are shown in SI Appendix, Fig. S1. The exact nucleotide position of each intron and exon is presented in SI Appendix, Table S2. RNA was isolated at days 2, 4, and 6 posttransfection. Transcript 4 was first detectable in the day 4 samples and remained expressed until day 6 posttransfection (Fig. 1C). The unspliced transcript 1 could express NS1 (ORF1) in its entire length. Transcript 2 was spliced twice and can express a truncated NS1, as a stop codon is incorporated at position 271, creating the NS2A protein. The complex splicing pattern of this transcript also enabled the possible expression of the complete ORF3 (Fig. 1B). Transcript 3 was spliced once and can encode a 303-aa protein, including 270 aa of ORF1 and the C-terminal region of ORF3, designated NS3. Transcript 4 is also spliced once; however, this donor site adds a 2-aa extension to the truncated NS, forming a 2-aa longer version of NS2A, designated NS2B. The downstream p47 was responsible for transcribing two transcripts, assessed to encode the capsid VP. Transcript 5 was unspliced and could express ORF4 as well as ORF3. Transcript 6 was subjected to splicing, adding a short fragment at the C-terminal region of ORF1, translated from Met462 to Pro494, with ORF2, using the same donor site as transcript 2 for its second intron, possibly expressing ORF2′ (Fig. 1A). This spliced transcript could also express the complete ORF4.

Proteins. All amino acid sequences were subjected to in silico analysis. NS1 contained an ATPase motif with an E-value of 1.4 (residues 285 to 426) and a Parvo_NS1 motif (Pfam ID: PF01057) with an E-value of 2e-11, encompassing the tripartite helicase domain conserved throughout the Parvoviridae, corroborating its nonstructural function (SI Appendix, Fig. S2A). ORF3, encoded by one of the small auxiliary ORFs, demonstrated a conserved domain, with an E-value of 0.01, homologous to the catalytic domain of the precorrin-6x reductase enzyme, CbiJ. CbiJ, together with CobK, are enzymes of the cobalamin-synthesis pathway, catalyzing the NADP-dependent reduction of precorrin-6x (37). ORF2′ had a Pfam domain with an E-value of 0.08, corresponding with the tetratricopeptide fold region of mitochondrial fission protein Fis1, responsible for scaffolding activity during mitochondrial fission

Fig. 2. The PmMDV capsid structure. (A) The structure at low resolution (8.45 Å), showing surface (Left) and cross-section (Right) views radially colored from the center according to the color key. (B) The PmMDV capsid structure at high resolution (2.96 Å), showing surface (Left) and cross-section (Right), radially colored similar to A. (C) Example of map density, shown as gray mesh, with the PmMDV model (residues labeled): Carbon in yellow, oxygen in red, nitrogen in blue. (D) Ribbon diagram of PmMDV VP monomer. The conserved jellyroll core and the αA helix, present in all parvoviral structures to date, are shown in black. The loops linking the β-strands and the N- and C-terminal regions are in multiple colors and labeled according to the β-strands that they connect.
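As a rough plausibility check on the single ∼41-kDa SDS-PAGE band reported for the purified particles, the mass of the 369-aa ORF4 product can be estimated from its length. The sketch below assumes an average residue mass of 110 Da, a common rule of thumb that does not come from the paper, and uses the 60 subunits of a T = 1 capsid stated in the text. The function name is hypothetical, and the result is an approximation rather than a sequence-based molecular weight.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption
SUBUNITS_PER_T1_CAPSID = 60      # T = 1 icosahedral capsid, as in the text


def estimate_mass_kda(n_residues, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Approximate protein mass in kDa from its residue count."""
    return n_residues * residue_mass_da / 1000.0


vp_kda = estimate_mass_kda(369)  # ORF4 is 369 aa long
shell_mda = vp_kda * SUBUNITS_PER_T1_CAPSID / 1000.0

print(f"VP monomer: ~{vp_kda:.1f} kDa")
print(f"Protein shell (60 copies): ~{shell_mda:.2f} MDa")
```

With these assumptions the monomer estimate falls close to the observed band, which is consistent with the full-length ORF4 product being the capsid protein.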

(38) (SI Appendix, Fig. S2A). ORF4, similarly to other penaeid shrimp DVs, did not contain a PLA2 motif. Nevertheless, with an E-value of 0.02, an ADAM-TS metalloproteinase spacer 1 domain was identified, hence the name PmMDV (SI Appendix, Fig. S2A). Structural similarity was detected between ORF4 and the VP2 of a birnavirus, infectious pancreatic necrosis virus (family Birnaviridae, genus Aquabirnavirus) (P = 0.009) (PDB ID code 3IDE), while searching for potential homology modeling targets. To confirm its structural function and to determine the number of proteins comprising the capsid, expression studies were carried out with the Bac-to-Bac expression system. Briefly, Sf9 cultures were transfected with three recombinant bacmid constructs named PmMDV-Bac-complete, PmMDV-Bac-p47, and PmMDV-Bac-ORF4, containing the complete PmMDV genome, the p47 promoter and downstream regions, and exclusively ORF4, respectively. All three supernatants contained icosahedral particles of ∼21 nm (SI Appendix, Fig. S2B). All purified samples showed a single protein band at ∼41 kDa when analyzed by SDS-PAGE, suggesting the presence of only one VP, or two with the same C-terminal core, from transcripts 5 and 6, assembling the PmMDV capsid (SI Appendix, Fig. S2C).

Structural Studies. The PmMDV structure was determined using cryo-EM and image reconstruction. Particles purified from PmMDV-Bac-ORF4 expression were subjected to data collection by cryo-EM followed by single-particle image reconstruction (39). We obtained the capsid structure at a high resolution of 3 Å (PDB ID code 6WH3) (40). As disordered and mobile protein regions, such as the parvoviral VP N terminus, are frequently absent from high-resolution protein structures but may be possible to visualize at a lower resolution (41), we determined the PmMDV capsid structure at the low resolution of 8.4 Å as well, from an independent data collection. Data collection parameters and refinement statistics are given in SI Appendix, Table S3. Both structures are similar and have T = 1 icosahedral symmetry (Fig. 2 A and B). PmMDV is the smallest parvoviral capsid isolated to date, ranging from 20 to 25 nm in diameter, with the smallest lumen size as shown in SI Appendix, Table S4. Overall, the surface morphology of PmMDV contains prominent protrusions surrounding the fivefold axes in two concentric rim-like circles, as well as small protruding threefold axes (Fig. 2 A and B). The interior of the capsid contains a "basket-like" structure at the base of each fivefold channel, attached to the wall by five "stalk"-like features (Fig. 2 A and B).

The VP model could be built into the cryo-EM density map from Glu36 to the C-terminal residue, His369, for the high-resolution structure (Fig. 2 C and D). Each VP monomer contains an α-helix (labeled αA) and an eight-stranded β-barrel jellyroll core (βB–βI), but lacks the analogous parvoviral βA strand. Unlike all DV structures determined so far, but similarly to PstDV, the VP contains a 12-aa-long insertion forming the CD1 subloop within the loop linking βC and βD (Fig. 2D). The same insertion is 10 aa long in PstDV. Like all DV structures, the region located N-terminally from the βB was situated in a domain-swapped conformation. However, without the βA, no hydrogen bonds were formed with the βB of the twofold-related monomer, an interaction otherwise shared by all DV structures to date (Fig. 3A). Instead, the N terminus formed a loop, within which carboxyl groups and the side chain of Asp215, from a threefold-related VP monomer, coordinate unmodeled diffuse electron density (Fig. 3 A, Left Inset). Because of its coordination, the density was proposed to be a Ca2+ ion. Interestingly, in the absence of a βA, the βB strands of twofold-related VP monomers formed polar interactions between residues Tyr60 and Thr62 (Fig. 3 A, Right Inset). Unlike any PV structure to date, the C termini of the PmMDV VP monomers are located on the surface at the threefold axes and create a β-annulus structure (Fig. 3B). Associated with this annulus, toward the capsid surface, is an additional β-strand, comprised of the C-terminal region of a predicted ADAM-TS spacer domain (Fig. 3B).

Both of the C-terminal–most residues were histidines, His368 and His369, of which His369 adopts a dual conformation, facing either toward or away from the annulus center (Fig. 3C). Interestingly, this annulus channel is wider than the fivefold pore (∼17 vs. ∼10 Å), and is filled by unassigned diffuse density, coordinated by positively charged His and Arg sidechains (Fig. 3C). In contrast, the two luminal layers were occupied by hydrophobic sidechains, such as Phe and Met (Fig. 3C). At the fivefold symmetry axes, the canonical pore is lined by the sidechains of bulky hydrophobic residues (Fig. 3D). The density of the basket below this pore could not be modeled at the sidechain level. However, the low-resolution structure and secondary structure predictions suggested a helical propensity, with five α-helices forming a pentamer helix bundle (Fig. 3E and SI Appendix, Fig. S3). We could model 20 more N-terminal residues in an α-helical arrangement into the basket density of the low-resolution structure. A stretch of basic residues (RKRRRH) is present in each α-helix.

The surface morphology of PmMDV and its size were most similar to that of PstDV, when compared with available capsid structures for six Parvoviridae genera: Iteradensovirus, Protoambidensovirus, Scindoambidensovirus, Penstylhamaparvovirus, Protoparvovirus, and Dependoparvovirus (Fig. 4A). This observation was supported by a DALI search (z-score of 22.5) indicating that the PmMDV capsid is structurally most like PstDV, despite low sequence identity (13%). The two VP structures could be superposed with an RMSD of 2.4 Å for the Cα positions of 258 residues (Fig. 4B). With the exception of the threefold region, all loops superposed, although their conformation differed substantially. Interestingly, a homology model built from the P. trituberculatus transcript indicates that, if this transcript is of viral origin, the VP topology will be very similar to that of PstDV and PmMDV, with variations in the surface loops (Fig. 4C).

The Effect of EDTA Treatment. The potential contribution of divalent cations to additional unmodeled density in the PmMDV map and to the structural stability of the capsid was assessed by pretreating purified virus-like particles (VLPs) with 100 nmol EDTA, a divalent metal ion chelator, and determining the structure. As with the untreated capsid, the EDTA-treated capsid structure was first determined at low resolution, 10.37 Å, then at a high resolution of 2.82 Å (Fig. 5 A and B) (PDB ID code 6WH7) (42). The removal of the divalent cation (probably a Ca2+ ion) from the N-terminal–threefold interaction resulted in the complete absence (disordering) of the fivefold basket and stalk, as well as in the disorder of the loop involved in the intra-VP cation coordination, making Ser45 the first ordered N-terminal residue rather than Glu36 (Fig. 5 C and D). The apex of the DE loop, from residues Leu138 to Thr147, forming the channel at the fivefold axes, was disordered. The hydrophobic ring forming the interior of the channel, including Phe151, was altered due to a change in the orientation of the sidechains (Fig. 5E). The structural change on the inside and outside of the capsid at the fivefold axes resulted in the opening of the channel in the EDTA-treated structure, with the pore expanding to 16.8 Å in diameter (Fig. 5E). The unmodeled density covering the threefold annulus was still present, albeit less coordinated, as the orientation of His369 changes to favor the conformation facing away from the density (Fig. 5F).

Phylogenetic Reconstructions. The only region of the deduced protein sequences of each ORF that could be aligned reliably with other PV and DV sequences retrieved from GenBank was the SF3 helicase domain of ORF1 (NS1). Fig. 6 presents the results of the Bayesian phylogeny inference. According to these criteria, the Parvoviridae is composed of four distantly related clades, where two correspond with subfamilies Parvovirinae and Densovirinae. As the fourth lineage, PmMDV pulled out genus


Fig. 3. Multimer interactions of the PmMDV capsid. (A) Dimer interaction of two PmMDV subunits, shown as ribbon diagrams, compared to those of a vertebrate (canine parvovirus, CPV) and an invertebrate (GmDV) parvovirus. The hydrogen bonds linking the βB strands together are shown in the zoom to the top right of the PmMDV dimer. The interaction of the PmMDV subunit N terminus with the threefold neighboring subunit is shown to the left of the PmMDV dimer, indicating the extra density corresponding with the presence of a bivalent cation, predicted to be Ca2+. The twofold axes are indicated by the ellipsoids, threefold axes by the triangles, and fivefold axes by the pentagons. The N termini are indicated by green arrows. (B) Ribbon diagrams of trimer interactions of PmMDV, hamaparvovirus-PstDV, densovirus-GmDV, and vertebrate protoparvovirus, CPV. The location of the ADAM-TS metalloproteinase spacer domain is indicated by the lighter ribbon color compared to the colors of the subunit ribbons themselves. The locations of the icosahedral symmetry axes are as in A. (C) Density and model at the threefold axis. The PmMDV threefold structure harbors a double β-annulus–like structure. His369 has a double conformation. The pore opening of the annulus contains a piece of unassigned density. (D) Ribbon diagram and density map of the PmMDV fivefold channel, viewed from the top, as it appears from the capsid surface. The sidechains of the hydrophobic residues lining the fivefold channel are shown in blue. Example amino acid residues are labeled (in black) in C and D. (E) Ribbon diagram of the model built into the low-resolution PmMDV density map at 8.45 Å. The basket-like density under the fivefold channel is modeled with the backbone of 20 additional N-terminal amino acids, arranged in an α-helical conformation, shown from a lateral view (Upper) and from the luminal surface of the capsid (Lower).

Hepanhamaparvovirus and two hitherto unclassified DVs isolated from […], from the Hamaparvovirinae. The clustering of PmMDV with hepanhamaparvoviruses was not significantly supported, unlike the clade itself. Based on the results of a BlastP search, capsid VP shares the same origin within the Parvovirinae and Densovirinae, as well as within genera Chaphamaparvovirus and Ichthamaparvovirus of Hamaparvovirinae (Fig. 6).

Discussion

PmMDV is a newly discovered PV of the penaeid shrimp, P. monodon, with a significant amino acid sequence identity with other members of Parvoviridae only within its SF3 helicase domain. Its short ssDNA genome, flanked by the T-shaped secondary structures, and its T = 1 icosahedral capsid symmetry are all characteristics of members of the family Parvoviridae. Although members of two PV genera had previously been isolated from the same host, neither penstyl- nor hepanhamaparvoviruses were closely related to PmMDV. Consequently, PmMDV represents a virus of a third parvoviral lineage to infect penaeid shrimps, to be assigned to a novel genus.

The subfamily Hamaparvovirinae was created to include all PVs that could not be assigned to either Densovirinae or Parvovirinae reliably (43). While the clade of genera Penstyl-, Brevi-, Chap-, and Ichthamaparvovirus appears to be monophyletic, the introduction of PmMDV splits Hamaparvovirinae into two clades. This suggests that PmMDV should not be assigned to the established subfamilies and that its classification requires a new subfamily to be proposed to the International Committee on Taxonomy of Viruses Parvovirus Study Group (1). Although all members of the Parvovirinae and Densovirinae subfamilies possess homologous structural proteins, the high number of lineages lacking VP sequence similarity in Hamaparvovirinae and in the fourth clade could suggest the existence of several, only distantly related subfamily-level lineages. Since most of these PVs are of aquatic origin, this environment might facilitate the acquisition of VP-encoding genes from other viral sources. Chimerism between CRESS DNA viruses and ssRNA(+) viruses has been observed multiple times (44, 45). The finding of a homologous transcript of the PmMDV VP in another decapod crustacean, P. trituberculatus, and an endogenous element in an

Fig. 4. Comparison of the PmMDV capsid structure with selected PVs. (A) The PmMDV capsid surface morphology is compared to those of vertebrate PVs of Parvovirinae: That is, CPV of genus Protoparvovirus and Adeno-associated virus serotype 2 (AAV2) of Dependoparvovirus; and invertebrate-infecting PVs of Densovirinae, GmDV of Protoambidensovirus, BmDV1 of Iteradensovirus, AdDV of Scindoambidensovirus, and PstDV of Penstylhamaparvovirus, subfamily Hamaparvovirinae. The capsids are radially colored, according to the scale to the right and are orientated according to the schematic icosahedron diagram. (B) The PmMDV VP monomer structure superimposed on the VP monomer structure of PstDV, shown as coil diagrams. (C) Coil diagram of the PmMDV VP monomer structure superimposed on the model constructed from the derived protein sequence of a transcript derived from the Japanese blue crab, P. trituberculatus (TSA ID: GFFJ01053568). In both B and C the RMSD values between the Cαs are shown.
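The RMSD values quoted for the superposed VP monomers are root-mean-square deviations over matched Cα atoms. The sketch below shows only that final metric, assuming the two structures have already been superposed and that the coordinate lists are in residue-to-residue correspondence. It does not reproduce the alignment step performed by a tool such as DALI, and the function name and toy coordinates are invented for illustration.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD (in the input units) between two matched lists of (x, y, z) points.

    Assumes the structures are already superposed and that entry i of
    each list refers to the same aligned residue.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy Cα coordinates in Å, invented for illustration (not taken from 6WH3).
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 2.0, 0.0), (3.8, 2.0, 0.0), (7.6, 2.0, 0.0)]

print(ca_rmsd(model_a, model_b))  # every atom is displaced by exactly 2 Å
```

Because the value depends on which residues are matched, an RMSD is usually reported together with the number of aligned Cα positions, as the text does with 258 residues.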

amphipod, a stemgroup of Malacostraca, suggested that the lineage, of which PmMDV is the first representative, might hold a wide host spectrum, involving the most diverse crustacean class, Malacostraca.

Homology modeling target selection suggested structural similarity between the VP2 of infectious pancreatic necrosis virus, a double-stranded RNA (dsRNA) virus, and ORF4 of PmMDV. Earlier structural studies revealed another birnavirus, infectious bursal disease virus, to provide a missing evolutionary link between the structure of the mature dsRNA virus capsid (families […] and Birnaviridae) and certain families of icosahedral positive-stranded ssRNA viruses, such as […] and Tetraviridae (46). The VP2 of these families, which contains a jellyroll fold as its core element, self-assembles into an immature procapsid of T = 1 icosahedral symmetry, while the N-terminal helices come together into a pentamer helix bundle at the fivefold axis. In the case of family Tetraviridae, the bundle has been shown to undergo externalization and disrupt the plasma membrane during […] (47). Considering its fold-related position, the PmMDV basket, which also appears to be a pentamer helix bundle, might hold a similar function. Additionally, helical membrane-penetrating peptides often include arginine-rich basic motifs similar to that of the PmMDV VP N terminus (48, 49). It was shown that divalent cation removal triggered the externalization of the PmMDV fivefold basket from each threefold neighboring subunit. As PVs traffic through the endosomal–lysosomal network, following endocytosis, their environment becomes increasingly acidified, reaching a pH as low as 4.0 (50, 51). This could trigger the dissociation of a cation, suggesting that the PmMDV capsid might undergo the observed conformational changes during its natural trafficking pathway. Structural reorganization at the fivefold axis, manifesting as the expansion of the fivefold pore and loss of both the DE loop and the basket, could be potentially associated with the externalization of the N-terminal helical bundle. This mechanism is analogous to that used by members of the Tetraviridae to breach the endolysosome (47) and not the phospholipase activity, after PLA2 externalization, exploited by almost all PVs (3). However, basket-like structures have been reported for Parvovirinae genus […], although the Gly-rich flexible extension linking the VP2 N terminus to the VP1 PLA2 domain is likely involved in externalization and a role for membrane-penetrating helices has not been deduced (52, 53).

Despite the homology modeling target selection results, the PmMDV capsid has a remarkably similar surface morphology to another shrimp-infecting DV, PstDV, yet their VPs are low in sequence identity. The structural similarity, however, is limited because the PmMDV capsid possesses multimer interactions that differ from all other PVs. The N termini of VPs of invertebrate-infecting PVs display a domain-swapped conformation; that is, the βA interacts with the βB of the twofold-neighboring subunit and not with its own βB, as in the case of the Parvovirinae (29–32). In PmMDV the VP N terminus interacts with the threefold neighboring subunit through a similar swapped-like conformation. Due to the lack of the βA strand, the PmMDV capsid luminal surface is assembled by the four-stranded BIDG sheet, stabilized by βB–βB interactions instead of the canonical βB–βA observed for all Parvoviridae with known structures (28). As a result, the capsid lumen is smaller, yet packages an average-sized PV genome of 4.3 kb. Finally, PmMDV and PstDV are only distantly related based on NS1 phylogeny, and possess different genome organizations and transcription strategies (27, 54). Consequently, the structural similarity between these two PVs is likely due to convergent evolution, attributed to infecting the same host and perhaps sharing similar

6of12 | www.pnas.org/cgi/doi/10.1073/pnas.2008191117 Pénzes et al. Downloaded by guest on September 27, 2021 MICROBIOLOGY

Fig. 5. The EDTA-treated PmMDV capsid structure. (A) Surface morphology and cross-section of the high-resolution EDTA-treated structure at 2.78-Å resolution. (B) Ribbon diagram superposition of the native PmMDV and EDTA-treated VP monomer structures. The N termini are indicated by the first ordered residue in each structure. (C) Superposition of the low-resolution native and EDTA-treated PmMDV maps. The absence of the basket-like structure in the EDTA-treated structure is evident. (D) Superimposed density maps and ribbon diagrams of the native (yellow ribbon and mesh) and EDTA-treated (blue mesh and ribbon) N-terminal regions of the PmMDV high-resolution structures located under the fivefold symmetry axis and viewed from the capsid lumen. The arrows indicate the first ordered residue in both structures. Density is contoured at 2σ. (E) EDTA treatment affects the DE loop structure and expands the fivefold pore diameter to 16.8 Å. The model is shown in blue ribbon inside the mesh-like density of the map. The hydrophobic residues, previously facing toward the lumen of the channel, are highlighted in blue. (F) The threefold symmetry axis of the EDTA-treated structure. The central unassigned density is still present in the annulus center. The terminal histidine, His369, now adopts a single conformation (rather than the dual in the native structure) facing away from the annulus center.
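As an illustration of the 2σ contour level quoted for panel D, the sketch below shows one common way such a level can be turned into an absolute density threshold: the mean of all voxel values plus two standard deviations. This convention, the function names, and the toy voxel values are assumptions made for illustration only; they are not taken from the visualization software or maps used in this study.

```python
from statistics import fmean, pstdev


def sigma_contour_level(voxels, n_sigma=2.0):
    """Return the absolute density value of an n-sigma contour.

    Assumes the convention level = mean + n_sigma * standard deviation,
    computed over all voxel values of the map (an assumption, see lead-in).
    """
    values = list(voxels)
    return fmean(values) + n_sigma * pstdev(values)


def inside_contour(voxels, n_sigma=2.0):
    """Flag the voxels that would lie inside the n-sigma isosurface."""
    values = list(voxels)
    level = sigma_contour_level(values, n_sigma)
    return [v >= level for v in values]


# Made-up, flattened toy "map": mostly background with a few strong voxels.
toy_map = [0.0] * 90 + [1.0] * 8 + [5.0] * 2
level = sigma_contour_level(toy_map, 2.0)
mask = inside_contour(toy_map, 2.0)
print(f"2-sigma level: {level:.3f}, voxels inside contour: {sum(mask)}")
```

In this toy map only the two strongest voxels survive the threshold, which is the intended effect of contouring at a higher σ level: weak, noisy density is hidden and only well-defined features remain.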

aspects of receptor attachment and trafficking to the nucleus for packaged genome replication.

All four available DV structures possess a β-annulus–like opening at their threefold axes, similarly to T = 3 ssRNA viruses, such as . This is a portal-like opening in GmDV and PmMDV, with a diameter of ∼18 and ∼17 Å, respectively (29–32, 55). The GmDV β-annulus is lined with large, positively charged sidechains similar to the upper layer of the PmMDV double β-annulus, and

Fig. 6. The Bayesian phylogenetic inference based on the NS1 Pfam domain of family Parvoviridae (1U0J). Alignment was constructed incorporating structural data, retrieved automatically from the RCSB Protein Data Bank. The substitution model LG+I+G proved to be the most suitable based on both Akaike and Bayesian information criteria. Posterior probability values are indicated on the nodes if significant (0.7<). The area of node shapes is proportional with the posterior probability value of each node. Crustacean PVs are shown underlined and in larger font, whereas penaeid shrimp infecting viruses are in bold and underlined. The PmMDV is highlighted in red. The branches of lineages not including the PLA2 in their VP proteins are presented in blue. Capsid protein homology is indicated as circles of the same color.
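The model-selection criteria named in the Fig. 6 legend can be sketched as follows. This is a minimal illustration of the standard definitions AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L, where k is the number of free parameters, L the maximized likelihood, and n the alignment length. It is not the ProtTest implementation; the candidate models, log-likelihoods, parameter counts, and alignment length below are hypothetical values chosen only to show the comparison.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better)."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


def best_model(candidates, n_sites):
    """Return the model names preferred by AIC and by BIC.

    candidates maps a model name to (log-likelihood, free parameters).
    """
    by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
    by_bic = min(candidates, key=lambda m: bic(*candidates[m], n_sites))
    return by_aic, by_bic


# Hypothetical fits of three substitution models to one alignment.
candidates = {
    "LG": (-10250.0, 99),
    "WAG+G": (-10180.0, 100),
    "LG+I+G": (-10100.0, 101),
}
print(best_model(candidates, n_sites=300))
```

Both criteria penalize extra parameters, BIC more strongly for long alignments, so agreement between them, as reported in the legend, is a reasonable sign that the chosen model is not simply overfitted.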

was speculated to be a flexible channel for packaging or uncoating the ssDNA genome (29). The belief is that the PmMDV fivefold axes portal may be the location of membrane penetration while the PmMDV threefold axes portal performs genome-associated functions, such as genome release or entry into the empty particle during uncoating or packaging, respectively. The shape of the unmodeled diffuse density occupying the annulus resembles a metal ion; however, it was not chelated with EDTA. This suggests that it is well coordinated by the three histidines, similarly to the zinc ions in a variety of metalloproteins (56). If indeed this is a zinc ion, it may have a role in catalyzing the formation of trimers during assembly (57). The role for the predicted ADAM-TS metalloproteinase spacer-like domain, surrounding the annulus and surface-exposed at the outer rim surrounding the fivefold axes, is unknown. These domains are widespread in eukaryotic proteins and have been associated with extracellular matrix attachment and signaling (58–60). Therefore, this region might play a role in virus–host interactions of PmMDV and may possibly be

associated with attachment. This remains to be tested and confirmed.

The PmMDV genome, despite its 4.3-kb size, contains four predicted ORFs. Among PVs the presence of multiple, small ORFs, which frequently overlap or are encompassed by the NS or the VP cassette, is well known (10, 13, 61–67). ORF2 and ORF3 were not related to any other PV ORF, suggesting their independent acquisition by horizontal gene transfer. This was supported by the presence of the Fis1 Pfam domain and the CobJ motif in the amino acid sequence of ORF2′ and ORF3, respectively, implying host or prokaryotic origin. The PmMDV genome encoded two major groups of NS proteins from an upstream promoter. The significance of expressing shorter C-terminal NS protein variants along a full-length NS1 is unclear but is common among PVs. Transcript 4, expressing NS2B, showed up later in the transcriptome, suggesting that each variant harbors different roles during replication. This has also been described in dependoparvoviruses, where each Rep possesses two slightly different C-terminal–extended versions (68). Transcript 2 is spliced twice and may express ORF3 in addition to NS2B. ORF3 could potentially also be expressed from the unspliced p47 transcript. As this putative protein might possess enzymatic activity, the complex splicing pattern may have evolved to regulate its quantity as needed. Regulation of protein quantity via splicing has been observed in case of structural protein expression of several Parvovirinae genera, including Proto-, Amdo-, and Dependoparvovirus (2). In Densovirinae, leaky scanning acts as the main expression regulatory mechanism, although alternative splicing has evolved sporadically in various genera of Densovirinae, such as Proto-, Scindo-, and Pefuambidensovirus (5, 7, 10, 13, 54). The alternative splicing in PmMDV, however, probably evolved independently from these. ORF2′, expressed by a spliced transcript of p47, is a chimera of two ORFs acquiring its N terminus from the nonstructural protein gene, ORF1. As the tetratricopeptide region of the Fis1 protein has a scaffolding role during mitochondrial fission, it cannot be excluded that ORF2′ provides a similar function during PmMDV assembly. However, as the VLPs expressed by the PmMDV-Bac-ORF4 were capable of self-assembly, without such a function, it does not appear to be essential for capsid assembly.

Materials and Methods

Virus Detection, DNA Isolation, and Cloning. Virus particles were purified by cesium chloride gradient to obtain viral DNA for cloning (detailed later) from the pooled, ∼500-g P. monodon-infected tissue, obtained from Vietnam. After homogenization in PBS and initial extraction with chloroform/butanol (1:1 volume), a clear supernatant containing viral particles was obtained by low-speed centrifugation. Virus stock was concentrated from the supernatant by ultracentrifugation at 40,000 rpm in a type 60Ti rotor for 2 h at 4 °C. Pellets were resuspended in a small volume of PBS for DNA extraction or EM analysis. Viral DNA was extracted by the High Pure Viral Nucleic Acid Kit (Roche) and eluted in 40 μL of distilled water. The isolated DNA was blunt-ended utilizing T4 DNA polymerase and Large Klenow fragment of DNA polymerase I (New England Biolabs) in the presence of 33 μM of each dNTP and cloned into the EcoRV restriction site of a pBluescript KS+ vector. Starting with the M13 primer sites of the vector, the sequence of the PmMDV genome could be determined by primer walking. To obtain the sequences of the termini, several GC-rich cutter restriction enzymes were used to release secondary structures and the fragments subcloned and sequenced. Two GC cutters proved to be sufficient, namely ApaI and HaeIII, enabling cloning into the ApaI and EcoRV sites of the pBluescript KS− and KS+ vectors, respectively. We used the Sure Escherichia coli strain (Stratagene), and incubation at 30 °C, for both infectious clone construction and for subcloning of viral ITRs.

Cell Lines, Transfection, and Culturing Conditions. Sf9 (ATCC CRL-1711), C6/36 (ATCC CRL-1660), and Schneider’s Drosophila Line 2 (ATCC CRL-1963) were tested for PmMDV susceptibility; however, none of these could sustain PmMDV replication. To perform transcription and expression studies, the Bac-to-Bac expression system was used (Invitrogen, Thermo Fisher Scientific), involving the transfection of Sf9 cells. Sf9 cultures were maintained in SF900 II medium (Gibco) in a serum-free system at 28 °C. Cellfectin II Reagent (Invitrogen) was used for DNA transfection with 8 × 10⁵ cells per well, previously seeded on a six-well culturing dish and incubated overnight at 28 °C. The culturing medium was aspirated and replaced by seeding medium of Grace’s complete insect medium supplemented with 5% FBS (Gibco) and Grace’s unsupplemented insect medium, mixed at a ratio of 1:6, respectively. After adding the transfection reagent–DNA mixture to the wells, cells were incubated for 5 h. The aspirated transfection medium was replaced with SF900 II medium supplemented with 2% FBS. Cells were checked daily for signs of cytopathic effects (CPE) and the whole culture was collected when 70% of the cells detached from the dish or showed granulation. This was followed by three cycles of freeze–thaws on dry ice and 200 μL of this passage 1 (P1) stock was transferred to 25 mL of fresh Sf9 suspended cell culture in polycarbonate Erlenmeyer flasks (Corning) at the density of 2.5 × 10⁶ cells/mL, to create the P2 stock, cultured in serum-free SF900 II medium without antibiotics.

Transcription Studies. To study the transcriptome of PmMDV, we cloned the entire viral genome, including the ITRs, into a pFB dual vector, where both the p10 and the PH (polyhedrin) promoters had been knocked out previously; hence, transcription of the whole PmMDV genome could be analyzed in the recombinant Autographa californica nuclear polyhedrosis virus (NPV) genome without interference of the NPV promoters. This construct was designated PmMDV-Bac-complete. The yield of DNA, however, of bacmid minipreps from the DH10Bac (Invitrogen) cells was below 500 μg/mL; hence, the large ITRs were removed. This yielded over 2 mg/mL bacmid DNA of each preparation. Total RNA was extracted using the Direct-zol RNA MiniPrep Kit (Zymo Research), where the denaturation step was executed by adding TRIzol Reagent (Thermo Fisher Scientific). RNA was treated by digestion with the TURBO DNA-free Kit (Ambion) to get rid of residual DNA contamination, as well as subjected to a control PCR for the remaining DNA fragments. Reverse transcription was performed only on entirely DNA-negative preparations using the SuperScript IV or the SuperScript III enzymes (Thermo Fisher Scientific), supplemented with random nonamers (Sigma-Aldrich). To avoid false detection of splicing, isolated and DNase-treated RNA was dephosphorylated by adding Antarctic phosphatase (New England Biolabs) and incubated for 30 min at 37 °C. Primers were designed at the following positions of the PmMDV genome: 1,100 nt (reverse and forward), 2,300 nt (reverse and forward), 2,564 nt (reverse and forward), 3,038 nt (reverse), and 3,511 nt (reverse and forward).

Anchored oligo(dT) primers were used together with the 2,300 forward and 3,511 forward primers for 3′ RACE (rapid amplification of cDNA ends). To perform 5′ RACE to map transcription start sites, we designed adaptors with the sequence of ATCCACAACAACTCTCCTCCTC-3′. DNase-treated RNA was subjected to dephosphorylation by alkaline calf intestinal phosphatase (New England Biolabs) and, after phenol-chloroform extraction, to dephosphorization by tobacco acid pyrophosphatase (Ambion) to remove 5′ RNA caps. After the ligation of adaptors using T4 RNA ligase (New England Biolabs), reverse transcription was executed as described above. PCR was performed with the readaptor primers together with oligos 1,100 reverse and 2,564 reverse. All PCRs were performed using Phusion Hot Start Flex DNA Polymerase (New England Biolabs) in a 25-μL final reaction volume, including 2 μL of purified cDNA target, 0.5 μL of both primers in 50-pmol concentration, 0.5 μL dNTP mix with 8 μmol of each nucleotide, 0.75 μL of 50 mM MgCl2 solution, and 0.25 μL of enzyme. PCR reactions were executed under a program of 5-min denaturation at 95 °C followed by 35 cycles of 30-s denaturation at 95 °C, 30-s annealing at 48 °C, and 1 or 2 min of elongation at 72 °C. The final elongation step was 8-min long at 72 °C. In case of the 5′ RACE reactions, 0.5 μL of enzyme was used and the number of cycles was reduced to 25. For the 3′ RACE, the reaction was supplemented with 1 μL of 50 mM MgCl2 and the annealing step was left out.

Protein Expression and Purification of VLPs. The plasmids PmMDV-Bac-complete and PmMDV-Bac-p47 were constructed by using a pFB dual vector (Invitrogen), from which both the polyhedrin (PH) and the p10 promoters had been removed, while PmMDV-Bac-ORF4 was of pFB1 backbone, driven by the PH promoter (Invitrogen). For the expression studies, the P2 baculovirus stock was used in the case of all three constructs detailed above. The P2 stocks were incubated for at least 7 d and monitored for CPE every third day. When at least 70% of the cells showed signs of CPE, the culture was collected, centrifuged at 3,000 × g, and the pelleted cells disrupted by three cycles of freeze–thaws on dry ice. This lysed cell pellet was then resuspended in 1 mL of 1× TNTM pH8 (50 mM Tris pH8, 100 mM NaCl, 0.2% Triton X-100, 2 mM MgCl2) and centrifuged again. Supernatant was mixed back together with the cell culture supernatant and was subjected to treatment with 250 units of Benzonase Nuclease (Sigma-Aldrich) per every 10 mL. The liquid was mixed with 1× TNET pH8 (50 mM Tris pH8, 100 mM NaCl, 0.2% Triton X-100, 1 mM

EDTA) in a 1:1 ratio and concentrated on a cushion of 20% sucrose in TNET, using a type 60 Ti rotor for 3 h at 4 °C at 45,000 rpm on a Beckman Coulter S class ultracentrifuge. The pellet was resuspended in 1 mL of 1× TNTM pH8 and after overnight incubation purified on a 5 to 40% sucrose step gradient for 3 h at 4 °C at 35,000 rpm on the same instrument in a SW 41 Ti swinging bucket preparative ultracentrifuge rotor. The visible single band that formed at the 15 to 20% sucrose interface was then collected by needle puncture and a 10-mL volume syringe. In the case of constructs PmMDV-Bac-complete and PmMDV-Bac-p47, both expressed by the own promoters of PmMDV, sucrose gradient purification did not result in a visible band; hence, the cushion-concentrated pellet after TNTM resuspension was purified in cesium chloride instead, dissolved in TNTM at a density of 1.38 g/cm³. After 24-h ultracentrifugation at 40,000 rpm at 16 °C in a SW 55 Ti rotor, the VLPs could be aspirated using a needle and a 1-mL syringe. The aspirate was dialyzed into 1× HCB buffer (50 mM Hepes, 4.3 mM MgCl2·6H2O, 0.15 M NaCl) at pH 7.4 to remove the sucrose or the cesium chloride. As PmMDV demonstrated better stability at the higher pH of its natural extracellular environment, purified particles were dialyzed into 1× universal buffer (20 mM Hepes, 20 mM MES, 20 mM sodium acetate, 0.15 M NaCl, 3.7 mM CaCl2) at pH 8.2, which is equivalent to the pH of tropical marine water.

In Silico Analyses. After obtaining the sequence of the ITRs and genome clones, the complete genome sequence of PmMDV was assembled using Staden package v4.11.2 (69). The assembled genome was annotated, as well as the transcripts assembled and splice sites investigated in Artemis Genome Browser by the Sanger Institute (70). Homology searching at amino acid levels was carried out two ways: To determine sequential similarity, the Blast algorithms were applied with different expectation value levels (71), whereas structural similarity was predicted by the genThreader, pDomThreader, and pGenThreader algorithms of the PSIpred web server (http://bioinf.cs.ucl.ac.uk/psipred/) (72). To investigate the conserved motifs and domains with known homologs in the derived aa sequences, the DomPred algorithm of the PSIpred server as well as the SMART web application was used (73). Structural similarity of the resolved capsid structures with those available in the RCSB Protein Data Bank (PDB) was investigated using the DALI server (74). The tBLASTn algorithm was utilized to screen the RefSeq, Whole Genome Shotgun Contigs, High-Throughput Genomic Sequences, and Transcript Shotgun Assembly databases, targeting 5,000 hits with an expectation value of 10 (71).

For phylogenetic inference, alignments, incorporating the outputs of pairwise, multiple, and structural aligners, were constructed using the Expresso algorithm of T-Coffee (75). Structural data were obtained using the PDB mode of this algorithm. The constructed alignment was further edited using Unipro Ugene (76). Model selection was executed by ProtTest v2.4, suggesting the LG + I + G + F substitution model based on both the Bayesian and Akaike information criteria (77). The distance matrix for the starting trees was constructed using the Protdist program of Phylip v3.695 with the Jones–Taylor–Thornton method and the starting tree was constructed from this using the Fitch–Margoliash method of the Fitch program with global rearrangements (78). Bayesian inference was executed by the BEAST v1.10.4 package, incorporating the predicted LG + I + G + F model, using a log-normal relaxed clock with a Yule speciation prior, throughout 50,000,000 generations (79). Convergence diagnostics were carried out using Tracer v1.7.1, which indicated the Markov-chain Monte Carlo runs to have converged (80). Phylograms were edited and displayed in the FigTree 1.4.1 program of the Beast package.

To investigate possible homologous VPs throughout the family, we used the BLASTP and PSI-BLAST NCBI algorithms with an expect threshold of 1,000 targeting the maximum number of 5,000 sequences (81). As query, one VP-derived amino acid sequence was submitted for each recognized parvovirus species as well as one for each complete, unclassified entry. In the case of the PLA2-containing VPs, the N-terminal sequence, containing this conserved domain, was removed to avoid false hits. Screening was performed using the substitution matrices Blosum62 and PAM250. Two sequences were marked as homologs in case the search resulted in a hit. The search was limited to family Parvoviridae in order to filter out false positives.

To construct the model of the eight-stranded β-barrel jellyroll core, which could be docked into the PmMDV high-resolution structure density, the I-TASSER Standalone Package v5.1 was used (82). As a template search failed to detect structural similarity with any PV capsid structure, threading was restricted to the PstDV VP monomer (PDB ID code 3N7X) and incorporated secondary structure predictions, obtained by the PSIpred server (72). To achieve the best possible fit of the core, the N- and C-terminal regions up to and following the first and last β-sheets were removed from the amino acid sequence.

Structural Studies. The statistics for each reconstruction and refinement are given in SI Appendix, Table S3. Three-microliter aliquots of the PmMDV VLPs without/with EDTA (∼1 mg/mL) were applied to glow-discharged C-flat holey carbon grids with a thin layer of carbon (Protochips) and vitrified using a Vitrobot Mark IV (FEI) at 95% humidity and 4 °C. The quality and suitability of the grids for cryo-EM data collection was determined by screening with a 16-megapixel charge-coupled device camera (Gatan) in a Tecnai G2 F20-TWIN transmission electron microscope operated at 200 kV under low-dose exposure (∼20 e⁻/Å²) prior to data collection. For collecting the low-resolution native and EDTA-treated PmMDV datasets, the same microscope was used at 50 frames per 10 s using a K2 direct electron detector (DED) at the University of Florida Interdisciplinary Center for Biotechnology Research electron microscopy core.

High-resolution data collection was carried out at two locations: the Florida State University (FSU) for the EDTA-treated PmMDV dataset, and the University of California, Los Angeles (UCLA) for the native dataset. At both locations, a Titan Krios electron microscope (FEI) was used, operating at 300 kV, equipped with a DE64 DED (Direct Electron Detector) at FSU and Gatan K2 DED at UCLA. At UCLA, the scope also contained a Gatan postcolumn imaging filter (Gatan) and a free-path slit width of 20 eV. Movie frames were recorded using the Leginon semiautomated application at both sites (83). At FSU, the frame rate was 50 per 10 s with ∼60 e⁻/Å² electron dosage. At UCLA, images were collected at 50 frames per 10 s with an ∼75 e⁻/Å² electron dosage. Movie frames collected at both locations, as well as for the low-resolution data set collected at the University of Florida, were aligned using the MotionCor2 application with dose weighting (84).

To reconstruct the 3D structure from the micrographs, particles were extracted using EMAN2 interactive boxing (85) and the AUTO3DEM software suite (86). Individual particle image normalization and apodization was performed using the AUTOPP subroutine of AUTO3DEM, with options F and O. Estimations of the defocus values for the micrographs used the ctffind4 subroutine (87) in AUTO3DEM (option 3X) to enable correction of the microscope-related contrast transfer functions. Initial models at low resolution (30 Å) were generated from the images of 100 particles by an ab initio random-model method, applying icosahedral symmetry. Orientations and origins of each particle were determined based on the initial model and the final map was obtained after a number of refinement iterations (15–31) in AUTO3DEM, until the resolution could not be further improved. The resolution of each cryoreconstructed map was calculated based on a Fourier shell correlation (FSC) of 0.143. The number of particles used to compute the final density maps are in SI Appendix, Table S3. A noise-suppression factor was applied in AUTO3DEM to avoid amplification of noise in the density maps. The homology model of the VP core structure, used to generate a 60-mer, was fitted into the density map using the University of California, San Francisco Chimera visualization system (88). Each map was resized to the voxel size determined in Chimera (by maximizing correlation coefficient) using the e2proc3d.py subroutine in EMAN2 and then converted to the CCP4 format using the program MAPMAN (89). Once the βB strand could be confidently built, using the Coot application (90), the entire ill-fitting docked model was deleted and the density remodeled residue by residue. The EDTA-treated virion structure was built by docking the native structure model into the density, followed by refinement and rebuilding where necessary. Then, each model was refined against the map utilizing the rigid body, real space, and B-factor refinement subroutines in Phenix (91).

Data Availability. The complete genome sequence of PmMDV, accompanied by its annotation, has been deposited in the GenBank database (accession no. MK028683). The structure of the PmMDV capsid has been deposited to the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank, https://www.rcsb.org/ (PDB ID code 6WH3 for the native PmMDV capsid; PDB ID code 6WH7 for the EDTA-treated PmMDV capsid).

ACKNOWLEDGMENTS. The authors thank Dr. Hanh T. Van and Chien P. Le from The Institute of Tropical Biology in Hochiminh City for helping with the collection of shrimp specimens; Micheline Letarte at Institut national de la recherche scientifique-Institut Armand-Frappier microscopy and the University of Florida Interdisciplinary Center for Biotechnology Research electron microscopy core for access to electron microscopes utilized for negative-stain electron microscopy and cryoelectron micrograph screening. The Spirit and TF20 cryoelectron microscopes were provided by the University of Florida College Of Medicine and Division of Sponsored Programs. We thank Dr. Hong Zhou (University of California, Los Angeles) and the NIH “West/Midwest Consortium for High-Resolution Cryo Electron Microscopy” project for access to the Electron Imaging Center for Nanomachines’s Titan Krios and K2 Direct Electron Detector utilized for high-resolution data collection (Multi-PI: Hong Zhou, M.A.-M., and others). Data collection at Florida State University

was made possible by NIH Grants S10 OD018142-01 for purchase of a direct electron camera for the Titan-Krios at Florida State University (Principal Investigator Taylor), S10 RR025080-01 for purchase of a Field Electron and Ion (company) Titan Krios for three-dimensional electron microscope (Principal Investigator Taylor), and U24 GM116788 The Southeastern Consortium for Microscopy of MacroMolecular Machines (Principal Investigator Taylor). The University of Florida College of Medicine and NIH Grants R01 GM109524 and GM082946 (to M.A.-M. and R.M.) provided funds for the research efforts at the University of Florida. H.T.P. and J.J.P. acknowledge the financial support during their doctoral and postdoctoral studies, respectively. Part of this work was supported by a Natural Sciences and Engineering Research Council of Canada Discovery grant (to P.T.).
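The FSC = 0.143 resolution criterion mentioned under Structural Studies can be illustrated with a short sketch. It assumes the usual reading of the criterion: the reported resolution is the reciprocal of the spatial frequency at which the Fourier shell correlation curve first falls below 0.143, found here by linear interpolation between neighboring shells. The function name and the toy FSC curve are hypothetical and are not taken from AUTO3DEM or from the reconstructions in this study.

```python
def resolution_at_fsc(spatial_freqs, fsc_values, threshold=0.143):
    """Return the resolution (in Å) where the FSC first drops below threshold.

    spatial_freqs: shell frequencies in 1/Å, in increasing order.
    fsc_values:    FSC value for each shell.
    Returns None if the curve never falls below the threshold.
    """
    shells = list(zip(spatial_freqs, fsc_values))
    for i, (freq, fsc) in enumerate(shells):
        if fsc < threshold:
            if i == 0:
                # Already below threshold in the first shell.
                return 1.0 / freq
            prev_freq, prev_fsc = shells[i - 1]
            # Linear interpolation between the two shells bracketing the crossing.
            frac = (prev_fsc - threshold) / (prev_fsc - fsc)
            crossing = prev_freq + frac * (freq - prev_freq)
            return 1.0 / crossing
    return None


# Made-up FSC curve sampled at four shells (frequencies in 1/Å).
toy_freqs = [0.10, 0.20, 0.30, 0.40]
toy_fsc = [0.99, 0.80, 0.30, 0.05]
resolution = resolution_at_fsc(toy_freqs, toy_fsc)
print(f"Estimated resolution: {resolution:.2f} Å")
```

Interpolating between shells avoids snapping the estimate to the shell spacing, which matters when the curve is sampled coarsely near the crossing.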

1. J. J. Pénzes et al., Reorganizing the family Parvoviridae: Proposal for a revised taxonomy independent from the canonical approach based on host affiliation. Arch. Virol., 10.1007/s00705-020-04632-4 (2020).
2. S. F. Cotmore et al.; ICTV Report Consortium, ICTV virus taxonomy profile: Parvoviridae. J. Gen. Virol. 100, 367–368 (2019).
3. Z. Zádori et al., A viral phospholipase A2 is required for parvovirus infectivity. Dev. Cell 1, 291–302 (2001).
4. G. A. Farr, L. G. Zhang, P. Tattersall, Parvoviral virions deploy a capsid-tethered lipolytic enzyme to breach the endosomal membrane during cell entry. Proc. Natl. Acad. Sci. U.S.A. 102, 17148–17153 (2005).
5. P. Tijssen, M. Bergoin, “Densoviruses: A highly diverse group of arthropod parvoviruses” in Insect Virology, S. Asgari, K. N. Johnson, Eds. (Caister Academic Press, Norwich, UK, 2010), pp. 57–90.
6. Y. Li et al., Genome organization of the densovirus from Bombyx mori (BmDNV-1) and enzyme activity of its capsid. J. Gen. Virol. 82, 2821–2825 (2001).
7. P. Tijssen et al., Organization and expression strategy of the ambisense genome of densonucleosis virus of Galleria mellonella. J. Virol. 77, 10357–10365 (2003).
8. H. Guo, J. Zhang, Y. Hu, Complete sequence and organization of Periplaneta fuliginosa densovirus genome. Acta Virol. 44, 315–322 (2000).
9. T. V. Kapelinskaya, E. U. Martynova, C. Schal, D. V. Mukha, Expression strategy of densonucleosis virus from the German cockroach, Blattella germanica. J. Virol. 85, 11855–11870 (2011).
10. K. Liu et al., The Acheta domesticus densovirus, isolated from the European house cricket, has evolved an expression strategy unique among parvoviruses. J. Virol. 85, 10069–10078 (2011).
11. H. T. Pham, Q. Yu, M. Bergoin, P. Tijssen, A novel ambisense densovirus, Acheta domesticus mini ambidensovirus, from crickets. Genome Announc. 1, e00914-13 (2013).
12. Y. Boublik, F. X. Jousset, M. Bergoin, Complete nucleotide sequence and genomic organization of the Aedes albopictus parvovirus (AaPV) pathogenic for Aedes aegypti larvae. Virology 200, 752–763 (1994).
13. E. Baquerizo-Audiot et al., Structure and expression strategy of the genome of Culex pipiens densovirus, a mosquito densovirus with an ambisense organization. J. Virol. 83, 6863–6873 (2009).
14. A. Sivaram et al., Isolation and characterization of densonucleosis virus from Aedes aegypti mosquitoes and its distribution in India. Intervirology 52, 1–7 (2009).
15. M. L. Thao, S. Wineriter, G. Buckingham, P. Baumann, Genetic characterization of a putative Densovirus from the mealybug Planococcus citri. Curr. Microbiol. 43, 457–458 (2001).
16. S. François et al., A new prevalent densovirus discovered in Acari. Insight from metagenomics in viral communities associated with two-spotted mite (Tetranychus urticae) populations. Viruses 11, 233 (2019).
17. B. M. Gudenkauf, J. B. Eaglesham, W. M. Aragundi, I. Hewson, Discovery of urchin-associated densoviruses (family Parvoviridae) in coastal waters of the Big Island, Hawaii. J. Gen. Virol. 95, 652–658 (2014).
18. I. Hewson et al., Densovirus associated with sea-star wasting disease and mass mortality. Proc. Natl. Acad. Sci. U.S.A. 111, 17278–17283 (2014).
19. S. Bochow, K. Condon, J. Elliman, L. Owens, First complete genome of an Ambidensovirus; Cherax quadricarinatus densovirus, from freshwater crayfish Cherax quadricarinatus. Mar. Genomics 24, 305–312 (2015).
20. W. Sukhumsirichart, P. Attasart, V. Boonsaeng, S. Panyim, Complete nucleotide sequence and genomic organization of hepatopancreatic parvovirus (HPV) of Penaeus monodon. Virology 346, 266–277 (2006).
21. K. F. Tang, C. R. Pantoja, D. V. Lightner, Nucleotide sequence of a Madagascar hepatopancreatic parvovirus (HPV) and comparison of genetic variation among geographic isolates. Dis. Aquat. Organ. 80, 105–112 (2008).
22. K. A. La Fauce, J. Elliman, L. Owens, Molecular characterisation of hepatopancreatic parvovirus (PmergDNV) from Australian Penaeus merguiensis. Virology 362, 397–403 (2007).
23. J. R. Bonami, J. Mari, B. T. Poulos, D. V. Lightner, Characterization of hepatopancreatic parvo-like virus, a second unusual parvovirus pathogenic for penaeid shrimps. J. Gen. Virol. 76, 813–817 (1995).
24. S. Jeeva et al., Complete nucleotide sequence analysis of a Korean strain of hepatopancreatic parvovirus (HPV) from Fenneropenaeus chinensis. Virus Genes 44, 89–97 (2012).
25. H. Shike et al., Infectious hypodermal and hematopoietic necrosis virus of shrimp is related to mosquito brevidensoviruses. Virology 277, 167–177 (2000).
26. K. F. Tang et al., Geographic variations among infectious hypodermal and hematopoietic necrosis virus (IHHNV) isolates and characteristics of their infection. Dis. Aquat. Organ. 53, 91–99 (2003).
27. H. T. Pham, Molecular Biology of Single-Stranded DNA Viruses in Shrimps and Crickets. PhD thesis, Université du Québec, Quebec City, QC, Canada (2015).
28. M. Mietzsch, J. J. Pénzes, M. Agbandje-McKenna, Twenty-five years of structural parvovirology. Viruses 11, 362 (2019).
29. A. A. Simpson, P. R. Chipman, T. S. Baker, P. Tijssen, M. G. Rossmann, The structure of an insect parvovirus (Galleria mellonella densovirus) at 3.7 Å resolution. Structure 6, 1355–1367 (1998).
30. G. Meng et al., The structure and host entry of an invertebrate parvovirus. J. Virol. 87, 12523–12530 (2013).
31. B. Kaufmann et al., Structure of Bombyx mori densovirus 1, a silkworm pathogen. J. Virol. 85, 4691–4697 (2011).
32. B. Kaufmann et al., Structure of Penaeus stylirostris densovirus, a shrimp pathogen. J. Virol. 84, 11289–11296 (2010).
33. M. G. Rossmann et al., Structural comparisons of some small spherical plant viruses. J. Mol. Biol. 165, 711–736 (1983).
34. P. Plevka et al., Structure of a packaging-defective mutant of minute virus of mice indicates that the genome is packaged via a pore at a 5-fold axis. J. Virol. 85, 4822–4827 (2011).
35. B. Venkatakrishnan et al., Structure and dynamics of adeno-associated virus serotype 1 VP1-unique N-terminal domain and its role in capsid trafficking. J. Virol. 87, 4974–4984 (2013).
36. J. J. Penzes, H. T. Pham, M. Agbandje-McKenna, P. Tijssen, Penaeus monodon metallodensovirus strain 2014CDN, complete genome. NCBI GenBank. https://www.ncbi.nlm.nih.gov/nuccore/1784324175. Deposited 9 October 2018.
37. F. Blanche et al., Precorrin-6x reductase from Pseudomonas denitrificans: Purification and characterization of the enzyme and identification of the structural gene. J. Bacteriol. 174, 1036–1042 (1992).
38. J. A. Dohm, S. J. Lee, J. M. Hardwick, R. B. Hill, A. G. Gittis, Cytosolic domain of the human mitochondrial fission protein fis1 adopts a TPR fold. Proteins 54, 153–156 (2004).
39. F. Guo, W. Jiang, Single particle cryo-electron microscopy and 3-D reconstruction of viruses. Methods Mol. Biol. 1117, 401–443 (2014).
40. J. J. Penzes et al., Capsid structure of Penaeus monodon metallodensovirus at pH 8.2. RCSB Protein Data Bank. https://www.rcsb.org/structure/6wh3. Deposited 7 April 2020.
41. B. Kaufmann, P. R. Chipman, V. A. Kostyuchenko, S. Modrow, M. G. Rossmann, Visualization of the externalized VP2 N termini of infectious human parvovirus B19. J. Virol. 82, 7306–7312 (2008).
42. J. J. Penzes et al., Capsid structure of Penaeus monodon metallodensovirus following EDTA treatment. RCSB Protein Data Bank. https://www.rcsb.org/structure/6WH7. Deposited 7 April 2020.
43. J. J. Pénzes et al., “Re-organize the family Parvoviridae” in Taxonomy Proposal, EC Approved, (International Committee on Taxonomy of Viruses, 2019).
44. G. S. Diemer, K. M. Stedman, A novel virus genome discovered in an extreme environment suggests recombination between unrelated groups of RNA and DNA viruses. Biol. Direct 7, 13 (2012).
45. M. Krupovic et al., Multiple layers of chimerism in a single-stranded DNA virus discovered by deep sequencing. Genome Biol. Evol. 7, 993–1001 (2015).
46. F. Coulibaly et al., The birnavirus crystal structure reveals structural relationships among icosahedral viruses. Cell 120, 761–772 (2005).
47. T. Domitrovic, T. Matsui, J. E. Johnson, Dissecting quasi-equivalence in nonenveloped viruses: Membrane disruption is promoted by lytic peptides released from subunit pentamers, not hexamers. J. Virol. 86, 9976–9982 (2012).
48. N. Kämper et al., A membrane-destabilizing peptide in capsid protein L2 is required for egress of papillomavirus genomes from endosomes. J. Virol. 80, 759–768 (2006).
49. P. Zhang, G. Monteiro da Silva, C. Deatherage, C. Burd, D. DiMaio, Cell-penetrating peptide mediates intracellular membrane passage of human papillomavirus L2 Protein to trigger retrograde trafficking. Cell 174, 1465–1476.e13 (2018).
50. B. Mani et al., Low pH-dependent endosomal processing of the incoming parvovirus minute virus of mice virion leads to externalization of the VP1 N-terminal sequence (N-VP1), N-VP2 cleavage, and uncoating of the full-length genome. J. Virol. 80, 1015–1024 (2006).
51. H. J. Nam et al., Structural studies of adeno-associated virus serotype 8 capsid transitions associated with endosomal trafficking. J. Virol. 85, 11791–11799 (2011).
52. S. Kailasan et al., Structure of an enteric pathogen, bovine parvovirus. J. Virol. 89, 2603–2614 (2015).
53. M. Mietzsch et al., Structural insights into human bocaparvoviruses. J. Virol. 91, e00261-17 (2017).
54. P. Tijssen, J. J. Pénzes, Q. Yu, H. T. Pham, M. Bergoin, Diversity of small, single-stranded DNA viruses of invertebrates and their chaotic evolutionary past. J. Invertebr. Pathol. 140, 83–96 (2016).
55. M. B. Sherman et al., Near atomic resolution cryo-electron microscopy structures of cucumber leaf spot virus and red clover necrotic mosaic virus; evolutionary divergence at the icosahedral three-fold axes. J. Virol. 94, e01439-19 (2019).
56. L. Zhou et al., Interaction between histidine and Zn(II) metal ions over a wide pH as revealed by solid-state NMR spectroscopy and DFT calculations. J. Phys. Chem. B 117, 8954–8965 (2013).
57. M. Laitaoja, J. Valjakka, J. Jänis, Zinc coordination spheres in protein structures. Inorg. Chem. 52, 10983–10991 (2013).
58. B. M. Luken et al., The spacer domain of ADAMTS13 contains a major binding site for antibodies in patients with thrombotic thrombocytopenic purpura. Thromb. Haemost. 93, 267–274 (2005).

59. M. Tortorella et al., The thrombospondin motif of aggrecanase-1 (ADAMTS-4) is critical for aggrecan substrate recognition and cleavage. J. Biol. Chem. 275, 25791–25797 (2000).
60. G. Gao et al., Activation of the proteolytic activity of ADAMTS4 (aggrecanase-1) by C-terminal truncation. J. Biol. Chem. 277, 11034–11041 (2002).
61. G. Fédière, Y. Li, Z. Zádori, J. Szelei, P. Tijssen, Genome organization of Casphalia extranea densovirus, a new iteravirus. Virology 292, 299–308 (2002).
62. H. T. Pham et al., Expression strategy of Aedes albopictus densovirus. J. Virol. 87, 9928–9932 (2013).
63. J. J. Pénzes, W. M. de Souza, M. Agbandje-McKenna, R. J. Gifford, An ancient lineage of highly divergent parvoviruses infects both vertebrate and invertebrate hosts. Viruses 11, 525 (2019).
64. J. M. Day, L. Zsak, Determination and analysis of the full-length chicken parvovirus genome. Virology 399, 59–64 (2010).
65. Z. Zhang et al., NP1 inhibits IFN-β production by blocking association of IFN regulatory factor 3 with IFNB promoter. J. Immunol. 189, 1144–1153 (2012).
66. H. Tse et al., Discovery and genomic characterization of a novel ovine partetravirus and a new genotype of bovine partetravirus. PLoS One 6, e25619 (2011).
67. Z. Zádori, J. Szelei, P. Tijssen, SAT: A late NS protein of porcine parvovirus. J. Virol. 79, 13129–13138 (2005).
68. D. S. Im, N. Muzyczka, Partial purification of adeno-associated virus Rep78, Rep52, and Rep40 and their biochemical characterization. J. Virol. 66, 1119–1128 (1992).
69. R. Staden, K. F. Beal, J. K. Bonfield, The Staden package, 1998. Methods Mol. Biol. 132, 115–130 (2000).
70. T. Carver, S. R. Harris, M. Berriman, J. Parkhill, J. A. McQuillan, Artemis: An integrated platform for visualization and analysis of high-throughput sequence-based experimental data. Bioinformatics 28, 464–469 (2012).
71. S. F. Altschul, W. Gish, W. Miller, E. W. Myers, D. J. Lipman, Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).
72. A. Lobley, M. I. Sadowski, D. T. Jones, pGenTHREADER and pDomTHREADER: New methods for improved protein fold recognition and superfamily discrimination. Bioinformatics 25, 1761–1767 (2009).
73. I. Letunic, P. Bork, 20 years of the SMART protein domain annotation resource. Nucleic Acids Res. 46, D493–D496 (2018).
74. L. Holm, DALI and the persistence of protein shape. Protein Sci. 29, 128–140 (2019).
75. F. Armougom et al., Expresso: Automatic incorporation of structural information in multiple sequence alignments using 3D-Coffee. Nucleic Acids Res. 34, W604–W608 (2006).
76. K. Okonechnikov, O. Golosova, M. Fursov; UGENE team, Unipro UGENE: A unified bioinformatics toolkit. Bioinformatics 28, 1166–1167 (2012).
77. D. Darriba, G. L. Taboada, R. Doallo, D. Posada, ProtTest 3: Fast selection of best-fit models of protein evolution. Bioinformatics 27, 1164–1165 (2011).
78. J. Felsenstein, PHYLIP (Phylogeny Inference Package) (version 3.6, Department of Genome Sciences, University of Washington, Seattle).
79. M. A. Suchard et al., Bayesian phylogenetic and phylodynamic data integration using BEAST 1.10. Virus Evol. 4, vey016 (2018).
80. A. Rambaut, A. J. Drummond, D. Xie, G. Baele, M. A. Suchard, Posterior summarization in Bayesian phylogenetics using Tracer 1.7. Syst. Biol. 67, 901–904 (2018).
81. S. F. Altschul et al., Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).
82. J. Yang et al., The I-TASSER Suite: Protein structure and function prediction. Nat. Methods 12, 7–8 (2015).
83. C. Suloway et al., Automated molecular microscopy: The new Leginon system. J. Struct. Biol. 151, 41–60 (2005).
84. S. Q. Zheng et al., MotionCor2: Anisotropic correction of beam-induced motion for improved cryo-electron microscopy. Nat. Methods 14, 331–332 (2017).
85. G. Tang et al., EMAN2: An extensible image processing suite for electron microscopy. J. Struct. Biol. 157, 38–46 (2007).
86. J. M. Spear et al., The influence of frame alignment with dose compensation on the quality of single particle reconstructions. J. Struct. Biol. 192, 196–203 (2015).
87. A. Rohou, N. Grigorieff, CTFFIND4: Fast and accurate defocus estimation from electron micrographs. J. Struct. Biol. 192, 216–221 (2015).
88. E. F. Pettersen et al., UCSF Chimera—A visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).
89. G. J. Kleywegt, T. A. Jones, xdlMAPMAN and xdlDATAMAN—Programs for reformatting, analysis and manipulation of biomacromolecular electron-density maps and reflection data sets. Acta Crystallogr. D Biol. Crystallogr. 52, 826–828 (1996).
90. P. Emsley, B. Lohkamp, W. G. Scott, K. Cowtan, Features and development of Coot. Acta Crystallogr. D Biol. Crystallogr. 66, 486–501 (2010).
91. P. D. Adams et al., PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. 66, 213–221 (2010).

www.pnas.org/cgi/doi/10.1073/pnas.2008191117