Structurally conserved erythrocyte-binding domain in PNAS PLUS Plasmodium provides a versatile scaffold for alternate engagement

Jakub Gruszczyka, Nicholas T. Y. Lima, Alicia Arnotta, Wen-Qiang Hea,b, Wang Nguitragoolc, Wanlapa Roobsoongd, Yee-Foong Moke, James M. Murphya,b, Katherine R. Smitha, Stuart Leea, Melanie Bahloa,b, Ivo Muellera,b, Alyssa E. Barrya,b, and Wai-Hong Thama,b,1

aThe Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia; bDepartment of Medical Biology, The University of Melbourne, Melbourne, VIC 3010, Australia; cDepartment of Molecular Tropical Medicine and Genetics, Faculty of Tropical Medicine, Mahidol University, Bangkok 10400, Thailand; dMahidol Vivax Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok 10400, Thailand; and eDepartment of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Melbourne, VIC 3010, Australia

Edited by Louis H. Miller, NIH, Rockville, MD, and approved November 30, 2015 (received for review August 19, 2015)

Understanding how malaria parasites gain entry into human red in P. vivax that binds to Duffy antigen receptor for chemokines blood cells is essential for developing strategies to stop blood (DARC) (6, 7). De novo assembly of a Cambodian field isolate stage infection. Plasmodium vivax preferentially invades reticulo- genome identified a potential new member called P. vivax cytes, which are immature red blood cells. The organism has two erythrocyte-binding protein (PvEBP), which shares a Duffy erythrocyte-binding protein families: namely, the Duffy-binding binding-like domain and the C terminus cysteine-rich domain protein (PvDBP) and the reticulocyte-binding protein (PvRBP) fami- present in PvDBP (8). DARC is strongly implicated in malaria invasion because Duffy-negative individuals are resistant to lies. Several members of the PvRBP family bind reticulocytes, specif- – ically suggesting a role in mediating host cell selectivity of P. vivax. P. vivax infection and antibodies that interfere with PvDBP DARC Here, we present, to our knowledge, the first high-resolution crystal interaction inhibit P. vivax invasion (9). For many decades, the leading paradigm was that P. vivax invaded only Duffy-positive structure of an erythrocyte-binding domain from PvRBP2a, solved at

erythrocytes, leading to the hypothesis of the existence of only one MICROBIOLOGY 2.12 Å resolution. The monomeric molecule consists of 10 α-helices β functional P. vivax invasion pathway, in contrast to the multiple and one short -hairpin, and, although the structural fold is similar pathways involved in Plasmodium knowlesi and Plasmodium — Plasmodium to that of PfRh5 the essential invasion ligand in falciparum invasion. This concept has been overturned by mounting falciparum— its surface properties are distinct and provide a pos- evidence that P. vivax infection is observed in individuals that sible mechanism for recognition of alternate receptors. Sequence were genetically and phenotypically Duffy-negative in diverse alignments of the crystallized fragment of PvRBP2a with other geographical regions (reviewed in ref. 10). PvRBPs highlight the conserved placement of disulfide bonds. Upon sequencing of the P. vivax genome from the Salvador 1 PvRBP2a binds mature red blood cells through recognition of an strain in 2008, 10 proteins were identified as belonging to the erythrocyte receptor that is neuraminidase- and chymotrypsin- PvRBP family of proteins, comprising three partial genes and resistant but trypsin-sensitive. By examining the patterns of se- quence diversity within field isolates, we have identified and Significance mapped polymorphic residues to the PvRBP2a structure. Using mu- tagenesis, we have also defined the critical residues required for Plasmodium vivax is responsible for the most widely distrib- erythrocyte binding. Characterization of the structural features uted recurring human malaria infections whereas Plasmodium that govern functional erythrocyte binding for the PvRBP family falciparum inflicts the most mortality and morbidity in human provides a framework for generating new tools that block P. vivax populations. Malaria parasites enter our blood cells by making blood stage infection. proteins that recognize and bind to their cognate receptors on the surface. Our research describes, to our parasite invasion | X-ray crystallography | SAXS | knowledge, the first crystal structure of PvRBP2a, an erythro- reticulocyte binding protein | malaria cyte-binding protein from P. vivax, which revealed a structural scaffold similar to that of PfRh5, the essential erythrocyte- he most widely distributed recurring malaria infections binding protein in P. falciparum. Structural comparisons be- Tglobally are caused by Plasmodium vivax, which accounts for tween PvRBP2a and PfRh5 provide an important foundation 80–100 million malaria infections per year (1). The majority of toward understanding how P. vivax and P. falciparum parasites clinical symptoms associated with malaria are due to blood stage use a homologous erythrocyte-binding protein family to en- infection (2). The merozoite forms of malaria parasites invade gage alternate erythrocyte receptors and ultimately govern human erythrocytes through a multistep process that involves host cell specificity. initial contact with the red blood cell, apical reorientation of the merozoite, and the formation of a tight junction that moves Author contributions: J.G., A.A., W.N., A.E.B., and W.-H.T. designed research; J.G., N.T.Y.L., A.A., W.-Q.H., W.N., Y.-F.M., J.M.M., K.R.S., S.L., A.E.B., and W.-H.T. performed research; progressively toward the posterior end of the parasite until host J.G., W.R., and I.M. contributed new reagents/analytic tools; J.G., N.T.Y.L., A.A., W.-Q.H., cell membrane fusion is completed. These steps in invasion are W.N., Y.-F.M., J.M.M., K.R.S., S.L., M.B., I.M., A.E.B., and W.-H.T. analyzed data; and J.G., dependent on specific interactions between parasite adhesins W.N., J.M.M., S.L., M.B., I.M., A.E.B., and W.-H.T. wrote the paper. and their cognate erythrocyte receptors (reviewed in ref. 3). The authors declare no conflict of interest. P. vivax preferentially invades reticulocytes: i.e., immature red This article is a PNAS Direct Submission. blood cells (4). The basis of host cell selectivity by merozoites Freely available online through the PNAS open access option. from Plasmodium spp. seems to be mediated primarily by fami- Data deposition: The atomic coordinates and structure factors have been deposited in the lies of adhesin proteins. The two erythrocyte-binding protein Protein Data Bank, www.pdb.org (PDB ID code 4Z8N). families of P. vivax are called the Duffy-binding protein (PvDBP) 1To whom correspondence should be addressed. Email: [email protected]. and reticulocyte-binding protein (PvRBP) families (5). In labo- This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. ratory-adapted P. vivax strains, there is only one PvDBP protein 1073/pnas.1516512113/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1516512113 PNAS Early Edition | 1of10 Downloaded by guest on October 1, 2021 seven full-length genes, of which two are predicted pseudogenes Exon 1 Exon 2 Pf Pv A BC55 bp 7409 bp (5). The partial genes are PvRBP1-P, PvRBP2-P1, and PvRBP2- Intron 250 179 bp cDNA P2; the five full-length coding genes are PvRBP1a, PvRBP1b, gDNA -RT ATG TAA 150 (bp) PvRBP2a, PvRBP2b, and PvRBP2c; and the two pseudogenes ATGGAAAATAAGGTGCTCTGGGCAGTTTTTTACAATC 100 300 TAGTGCTGTTCCTTCTGGgtaggccaaaaagctggcacacata 75 200 are PvRBP2d and PvRBP3. Sequencing of a field isolate genome tgcacacatatgcatatatatacatatgtatatgcttacacttgtagcgaagcg also identified PvRBP2e, which seems to be absent from Salva- 100 ctctttcgcgaggggcttcttatatcctcatcagcccggcttcctttaagctgccc 50 cactcaagcgcgctcgaacttttaatgttacacccccttgcagCATCAAG 37 dor 1 but present in most P. vivax strains (8). Because DARC is CAAAGAAAGCAATCGAATTAAAGCATATAAACTTA present on both reticulocytes and normocytes, the binding of PvDBP to DARC is not sufficient for reticulocyte specificity. Fig. 1. Gene structure and expression of PvRBP2a. (A) Amplification of Restricted host cell selectivity of P. vivax is thought to be gov- pvrbp2a confirms splicing near the start codon. Primers flanking the putative erned by the PvRBP protein family, of which it has been pre- intron were used to amplify gDNA and cDNA, yielding amplicons consistent with the 356- and 177-base pair product lengths predicted by the gene viously shown that PvRBP1a and PvRBP2c bind specifically to − reticulocytes (11). The molecular function of the rest of the structure in B. The no-reverse-transcriptase ( RT) control detected no gDNA contamination in cDNA. The molecular weight for the DNA standard is PvRBP family in P. vivax invasion is currently unknown. Fur- highlighted in base pair (bp). (B) Schematic of the gene structure for thermore, no known erythrocyte receptor has been identified pvrbp2a. The pvrbp2a gene consists of two exons and one intron, with the that interacts with any PvRBP family members. coding sequence and the noncoding intron shown in capital and small ita- Through gene structure and sequence similarity, the homologs licized letters, respectively. ATG highlights the start methionine in the 5′ of PvRBPs were identified in P. falciparum and Plasmodium region whereas TAA represents the stop codon on the 3′ end. (C) Western yoelii as the PfRh family and Py235 family, respectively (12, 13). blots of P. falciparum and P. vivax protein extracts probed with anti-PvRBP2a PfRh5, a member of the PfRh family of P. falciparum, has been a and anti-EBA175 antibodies. Molecular mass marker is shown on the left focus of intense research as a leading blood stage vaccine can- hand side in kDa. didate due to its essential function in parasite invasion and low levels of sequence polymorphisms (reviewed in ref. 14). PfRh5 binds erythrocytes through the recognition of the Ok blood The Crystal Structure of PvRBP2a. Full-length PvRBP2a includes a group antigen, (15). This receptor-adhesin pair mediates signal peptide at the N terminus and a single transmembrane do- an essential molecular event during parasite entry, and sequence main at the C terminus, with no other predicted structural domains. polymorphisms within PfRh5 and/or basigin have been shown to Most secondary structure predictions highlighted a disordered re- modulate host tropism (16, 17). The addition of antibodies gion within the first 160 amino acids of PvRBPs. Therefore, for our initial structural studies, we used a construct of PvRBP2a encom- against PfRh5, basigin, or PfRh5-interacting partners (PfRipr – and PfCyRPA) in P. falciparum growth assays results in the passing amino acid residues 160 1135 (PvRBP2a160–1135). The protein was expressed in Escherichia coli and purified using strong inhibition of parasite growth across multiple strains (15, 2+ 18–21). Further promising results show that Aotus nancymaae Ni -affinity chromatography, followed by tag removal using monkeys immunized with the anti-PfRh5 vaccine are protected tobacco etch virus (TEV) protease and size exclusion chroma- against severe infection (22). The crystal structure of PfRh5 tography. Protein crystals were obtained by in situ proteolysis alone or with its receptor basigin provides, to our knowledge, the with porcine elastase. The structure of PvRBP2a was determined first structural description of how PfRh proteins engage with through the single isomorphous replacement with anomalous scattering (SIRAS) method using iodine-derivatized crystals, their receptors (23, 24). PfRh5 adopts a fold comprising an α-helical with subsequent refinement against the native diffraction data to scaffold that provides binding sites at the tips of helices for basigin a resolution of 2.12 Å. Crystallographic data collection and re- and some inhibitory monoclonal antibodies (24). finement statistics are presented in Table 1. To date, no experimentally determined molecular structure The crystal structure of PvRBP2a reveals two molecules in the for any PvRBPs has been reported. In this paper, we present, to asymmetric unit (Fig. S1A). Both molecules are almost identical, our knowledge, the first crystal structure of an erythrocyte- with rmsd of atomic positions between two molecules 0.6 Å over binding domain within PvRBP2a that shows structural similarity 281 aligned atoms Cα (Fig. S1B). As such, the following dis- to the erythrocyte-binding domain of the essential P. falciparum cussion will focus solely on molecule A, which we will refer to as invasion ligand, PfRh5. Our analyses of its erythrocyte-binding PvRBP2a160–455. The structure of PvRBP2a160–455 consists of 10 characteristics and sequence diversity patterns from field isolates α-helices surrounding one short β-hairpin (Fig. 2A). We denote provide, to our knowledge, the first structure-function analyses these helices as α1, α2a, α2b, α2c, α3a, α3b, α4, α5, α6, and α7, of this important family of P. vivax erythrocyte-binding proteins. where numerical order indicates progression of the polypeptide chain from N to C terminus. Helices α2 and α3 are divided into Results smaller fragments due to the presence of the breaks into the Gene Structure and Expression of PvRBP2a. Bioinformatic analyses canonical (i, i + 4) hydrogen bonding pattern. Within α2, helices suggest that full-length pvrbp genes contain two exons that are α2a and α2b are divided, with a bend introduced by residue separated by an intron located immediately downstream of the P229, whereas a short helix α2c is separated from the rest at the signal peptide-coding sequence (11). To verify this putative gene C-terminal extremity. Helix α3 is divided into two equal parts by structure for PvRBP2a, we prepared cDNA from a Thai patient a β-turn formed by amino acids Q273 to M276. infected with P. vivax and used it as the PCR template. Splicing The molecule adopts a flat, ellipsoidal shape, with the helices was detected when primers flanking the putative intron were arranged parallel to the long axis. It consists of two subdomains used (Fig. 1A). Sequencing of the gene fragments indicated of similar length, the N- and C-terminal, that are related to each splicing between nucleotides position 55–235 of pvrbp2a (Fig. other by a pseudo-twofold rotation symmetry (Fig. S2 A and B). 1B). No additional intron was found within the first 4,000 base The N-terminal subdomain includes helices α1, α2a, α2b, α2c, pairs of the gene by cDNA amplification and sequencing. α3a, and α3b whereas the C-terminal subdomain is composed of Full-length PvRBP2a is predicted to consist of 2,487 amino helices α4, α5, α6, and α7. The N-terminal subdomain begins acid residues with a molecular weight of 286 kDa. To determine with a very short helix 310 forming one full turn, followed by a whether PvRBP2a is expressed in P. vivax, we generated specific small β-hairpin, helix α1, and two long helices α2 and α3. The anti-PvRBP2a polyclonal antibodies in rabbits. Using Western β-hairpin localized in the center of the molecule consists of two blotting, we probed both P. falciparum and P. vivax protein ex- short anti-parallel β-strands (β1 and β2) that are connected by a tracts with anti-PvRBP2a antibodies. We detected a doublet beta bulge (Fig. 2A). The main axis of the β-hairpin is oriented running above 250 kDa only in the P. vivax extracts (Fig. 1C). perpendicularly to the plane formed by the long helices and Anti-EBA175 antibodies were used as a positive loading control protrudes slightly out on both sides. It may contribute to the for P. falciparum lysate. overall shape of the molecule either by keeping helices α2 and α3

2of10 | www.pnas.org/cgi/doi/10.1073/pnas.1516512113 Gruszczyk et al. Downloaded by guest on October 1, 2021 Table 1. Crystallographic data collection and refinement consistent with the theoretical molecular mass of 98.4 kDa. The PNAS PLUS statistics frictional coefficient obtained from AUC data analyses for all Protein PvRBP2a three protein concentrations was 2.1, which is characteristic for protein molecules with an elongated shape. Data collection Although we were able to crystallize the N-terminal domain of

Space group P212121 PvRBP2a, the overall topology of the larger fragment remained Cell parameters of broad interest, especially considering that the C-terminal a, b, c, Å 58.79, 93.45, 126.72 domains in the 160–1000 fragment have a presently unknown Resolution, Å 49.76–2.12 (2.19–2.12) structure and are nonhomologous to known domain structures. R 0.134 (1.916) To this end, we collected small angle scattering (SAXS) data for merge – 〈I/σI〉 13.3 (1.8) PvRBP2a160–1000 at the Australian Synchrotron (Fig. 3 B E and Completeness, % 100.00 (100.00) Table 2). Analysis of the Guinier region indicated that the pro- Multiplicity 14.3 (14.1) tein was monodisperse and that data were not affected by ag- gregation or interparticle repulsion (Fig. 3 B, Inset). The real CC1/2* 0.999 (0.538) Refinement statistics space interatomic distance distribution [P(r)] plot was consistent – – with PvRBP2a existing in a highly elongated form (Fig. 3C) Resolution, Å 43.84 2.12 (2.24 2.12) = Observed reflections 1,249,569 (Dmax 300 Å). The analysis of the Kratky plot confirmed that the protein molecule is fairly rigid (Fig. 3D). We performed ab Unique reflections 40,139 initio bead modeling with DAMMIF, revealing PvRBP2a to exist R 0.203 (0.286) work in an elongated, boomerang-like shape. The crystal structure of Rfree 0.229 (0.347) PvRBP2a160–455 was docked into the obtained SAXS envelope by No. of atoms 4,945 rigid body modeling to illustrate the relative size and position of Protein 4,739 the erythrocyte-binding domain in the context of the longer Ligand 11 fragment of PvRBP2a (Fig. 3E). Solvent 195 2 Average B-factors, Å 56.8 Structure Comparison Between PvRBP2a and PfRh5. The crystal Protein 57.0 structure of PfRh5, a member of the homologous PfRh family, Ligand 77.1 has been recently solved alone as well as in complex with its

Solvent 51.1 cognate receptor basigin or with inhibitory monoclonal anti- MICROBIOLOGY rmsd bodies (23, 24). Crystal structures of PfRh5 and PvRBP2a160–455 Bond lengths, Å 0.004 can be superimposed with rmsd of atomic positions 3.4 Å over 217 Bond angles, ° 0.709 aligned atoms Cα (Fig. 4A). However, the general architecture of Ramachandran plot, % the PvRBP2a160–455 molecule seemed similar to that of PfRh5. Most favored 97.6 Structural alignment of both proteins shows that the two disulfide Allowed 2.4 bridges superimpose very well (Fig. 4A). The bonded pair of cys- Outlier 0 teine residues C227 and C271 in PvRBP2a160–455 corresponds to the bond present between C224 and C317 in PfRh5 whereas the Statistics for the highest-resolution shell are shown in parentheses. Data second connection formed between C299 and C303 overlaps with a were collected on a single crystal. bond between C345 and C351. *CC1/2, Pearson correlation coefficient between independently merged halves of the dataset. A striking difference between PfRh5 and PvRBP2a160–455 concerns the distribution of the charged amino acid residues exposed on the surface (Fig. 4B), particularly within the region in apart from the C-terminal subdomain due to the stacking interac- PfRh5 involved in binding basigin (Fig. 4C). The corresponding tions or by forming an extensive network of hydrogen bonds, with area within PvRBP2a has a predominantly negative electrostatic the rest of the molecule thus holding two subdomains together (Fig. potential due to several acidic amino acid residues localized in S2C). The connection between strand β2 and helix α1includesa helix α4 and the preceding loop: e.g., E304, D306, E309, E313, γ-turn, with F181 in its center (Fig. 2B). The C-terminal subdomain exhibits simpler organization and consists of four relatively long α-helices forming a coiled-coil domain (Fig. 2A). There are two disulfide bridges present within the PvRBP2a160–455 A structure (Fig. 2A). Cysteine residues 227 and 271 form a co- valent bond, connecting helices α2 and α3, that is localized in the middle of the N-terminal subdomain in the close proximity of the kinks present in the long helices. The second disulfide bridge formed between cysteine residues 299 and 303 and is located on B the tip of the molecule in the loop connecting helices α3b and α4 at the border of the N- and C-terminal subdomains. By holding different helices or subdomains together, the covalent bonds between those residues may play an important structural role in stabilizing an overall fold of the protein. The asymmetric unit of the crystal lattice contains two mole- cules of PvRBP2a in the extensive contact with the buried surface area between them of 1,340 Å2 (Fig. S1A). To address whether Fig. 2. Crystal structure of the PvRBP2a erythrocyte-binding domain. (A)Three orthogonal views of the molecule. The overall fold of the protein is formed by this dimeric form was representative of the biological unit, we 10 antiparallel α-helices and one short β-hairpin.Moleculeisshowninribbon analyzed the oligomeric state of PvRBP2a using analytical ultra- representation colored in rainbow from blue at the N terminus until red at the centrifugation (AUC). We expressed a construct that encom- C terminus. The cysteine residues forming disulphide bridges are shown as – passed residues 160 1000 (PvRBP2a160–1000). Sedimentation yellow sticks. (B) Sample of the electron density around the residue F181

velocity experiments indicated that the protein is exclusively mo- forming a γ-turn. The 2Fobs − Fcalc map is contoured at 2σ andshownasblue nomeric, with the sedimentation coefficient of 3.5 S corresponding mesh. Protein is shown in ball and stick representation with carbon shown in to an apparent molecular mass of 93.6 kDa (Fig. 3A), which is green, nitrogen in blue, oxygen in red, and sulfur in yellow.

Gruszczyk et al. PNAS Early Edition | 3of10 Downloaded by guest on October 1, 2021 (23). The corresponding sequence in PvRBP2a forms only a very short helix α2c and consists of residues P244 to H247 (Fig. S3C).

Sequence Alignment of the PvRBPs Family. Sequence alignment of PvRBP2a160–455 with the six other PvRBP members highlighted a common domain that features conserved placement of one or two disulfide bridges (Fig. 5, Left) and is localized close to the first 500 amino acids of the N terminus of each molecule. The majority of the conserved amino acid residues within this aligned region are hydrophobic in nature and may play a structural role in maintaining the proper architecture of the domain (Fig. 5, Right, pink residues). On the other hand, the residues on the solvent-exposed surface are highly variable. The region of high- est divergence between PvRBP subgroup 1 (PvRBP1a and 1b) and subgroup 2 (PvRBP2a, -2b, and -2c, PvRBP-P1, and PvRBP- P2) is localized at the N terminus surrounding the β-hairpin, where the sequence alignment contained several gaps. The position of cysteine residues is conserved for almost all members of subgroup 2, with the exception of PvRBP2c, which is missing the second pair. In contrast, for subgroup 1, the first pair of disulfide bonds is not present. The most variable region between all PvRBP proteins includes the loop connecting helices α3b and α4, which, in PvRBP2a, includes one of the disulfide bridges formed between C299 and C303. This localization of the loop in PvRBP2a overlaps with the basigin-binding site in PfRh5 (Fig. 4C).

Sequence Diversity and Evolution of the Genes Encoding PvRBP2a. High quality consensus sequences for the entire 7,464-bp pvrbp2a coding locus were obtained from 22 P. vivax clinical isolates from Papua New Guinea (Dataset S1). These sequences were aligned together with published data for five reference strains (25) and four field isolates from Thailand (26). A total of 105 single nu- cleotide polymorphisms were identified that encoded 81 poly- morphic amino acids. All 31 P. vivax isolates surveyed had Fig. 3. AUC and SAXS data analysis for PvRBP2a. (A) Sedimentation velocity unique amino acid haplotypes (Fig. S4), demonstrating very high analytical ultracentrifugation for PvRBP2a160–1000 at 0.4, 1.0, and 1.7 mg/mL diversity of PvRBP2a. The highest nucleotide diversity was con- (green, red, and black, respectively) is consistent with a monomeric protein. centratedinthefirst2.2kb(Fig.6A). This region also contained The measured sedimentation coefficient 3.5 S corresponds to an apparent evidence of significant balancing selection as measured by the molecular mass of 93.6 kDa with the theoretical calculated molecular mass as Tajima’s D statistic, consistent with this region being an important 98.4 kDa. No formation of higher order oligomers was detected. (B)Experi- target of host immune responses (Fig. 6B). Of the 31 polymorphic mental scattering profile (black squares) overlaid with the calculated scattering sites with minor allele frequencies more than 10%, 10 mapped to pattern (red line) of a representative ab initio model generated using the the corresponding region of the crystal structure at residues N186S, program DAMMIF. The data are presented as the natural logarithm of the intensity vs. q. (Inset) The Guinier plot was linear, which is consistent with an E277K, K285N/I, K289E, E304K, D306V, E351Q, P399S, K421M, absence of detectable aggregates in the sample. (C) The interatomic distance and G438E (Fig. 6C). Of these polymorphisms, three polymorphic residues (E304K, D306V, and P399S) are within the negatively distribution function, P(r), of PvRBP2a160–1000. The curve was calculated by in- direct Fourier transform using the program GNOM. (D)Kratkyplotanalysisof charged region that may function as a putative erythrocyte-binding the SAXS data, suggesting that the protein molecule is fairly rigid. (E)Two region. Comparing these sites to polymorphism data publically

orthogonal views of the crystal structure of PvRBP2a160–455 docked into an available in PlasmoDB, 7 of the 10 residues were also found com- averaged and filtered ab initio SAXS envelope of PvRBP2a160–1000 derived from monly among isolates from other malaria endemic areas (N186S, 20 independent DAMMIF calculations. The PvRBP2a160–1000 construct adopts an E277K, K285N/I, K289E, E304K, D306V, and P399S). elongated boomerang-like shape. The crystal structure was fitted into one of its extremities by rigid-body docking. The fragment of PvRBP2a not present in Erythrocyte-Binding Characteristics of PvRBP2a. We used a flow the crystal structure forms a long tail after the C terminus. cytometry-based erythrocyte-binding assay to determine the binding characteristics of PvRBP2a (27). This flow cytometry method entailed incubating recombinant PvRBP2a160–1000 (or its E317, and E321 (Fig. 4 B and C). This negatively charged area variants) with red blood cells, and binding was detected using an forms a distinctive patch on the surface of the protein that may anti-PvRBP2a rabbit polyclonal antibody, followed by an anti- be a putative receptor-binding site. rabbit secondary antibody conjugated to a fluorophore. We Detailed comparisons of PfRh5 and PvRBP2a160–455 revealed initially performed the binding studies using an enriched pop- several differences in the placement of the secondary structure ulation of reticulocytes that were stained by Thiazole Orange + elements. The orientation of the β-hairpin within the N terminus (TO ). CD71, the transferrin receptor, is present on immature is rotated by 180 degrees between both proteins (Fig. S3A). The reticulocytes and is progressively depleted as reticulocytes ma- different arrangement of this fragment of the PvRBP2a molecule ture into normocytes. It was used as an additional surface marker is related to the presence of an additional short loop, including a for different stages of reticulocytes (Fig. 7 A and D). We ob- served that PvRBP2a – bound equally to the mature γ-turn that connects strand β2 with helix α1. Moreover, helices − 160 1000 + α α erythrocytes (TO ) and reticulocyte populations (TO ) (Fig. 7 B 2a and 3b in PvRBP2a160–455 are shifted by one helix breadth and D). This binding property was similar to PfRh4 binding compared with equivalent helices in PfRh5 (Fig. S3B). Another profile which cognate receptor is present structural difference concerns the presence of a long loop in on both mature erythrocytes and reticulocytes (Fig. 7 C and D). PfRh5, including residues S257 to D294. This region of the Because PvRBP2a160–1000 binding was not reticulocyte-specific, molecule is disordered and not visible in the electron density we performed all subsequent binding assays using mature

4of10 | www.pnas.org/cgi/doi/10.1073/pnas.1516512113 Gruszczyk et al. Downloaded by guest on October 1, 2021 Table 2. SAXS data collection and analysis statistics above to lysine (positive charge) and expressed the protein PNAS PLUS – Protein PvRBP2a 160–1000 PvRBP2aE/DmutK (Fig. 8 A C). This protein was not able to bind erythrocytes, suggesting that the negative charge patch on the Data collection parameters PvRBP2a surface is important for modulating its ability to bind Instrument Australian Synchrotron SAXS/WAXS erythrocytes (Fig. 8D). beamline Structural alignment of PfRh5 and PvRBP2a proteins shows Beam geometry 120-μm point source that both disulfide bridges superimpose very well. Reduced and Wavelength, Å 1.033 alkylated PfRh5 results in a threefold reduction in binding to q range, Å-1* 0.00369–0.22821 basigin (23). In PvRBP2a160–455, there are two disulfide bonds at Exposure time 2-s exposures C227 and C271 and between C299 and C303. We proceeded to Protein concentration 8.0 mg/mL protein via inline gel mutate C277 and C271 or C299 and C303 as pairs into serine residues to prevent disulfide bond formation. Mutation of C277S filtration chromatography and C271S resulted in an expressed protein that eluted in the Temperature, K 289 void volume of size exclusion chromatography, indicative of Structural parameters misfolded proteins. We were successful, however, in generating a −1 ± I(0), cm [from P(r)] 0.089 0.001 double mutant of C299S and C303S, which we will refer to as Rg, Å [from P(r)] 91.06 ± 0.91 PvRBP2aC299S, C303S (Fig. 8 A and C). This mutant did not bind Dmax,Å 300 − erythrocytes, highlighting the importance of the disulfide loop I(0), cm 1 (from Guinier) 0.087 ± 0.001 connecting helices α3b and α4 in PvRBP2a function (Fig. 8D). ± Rg, Å (from Guinier) 83.70 1.53 We also expressed variant forms of PvRBP2a160–1000 with Software used mutated clusters of existing polymorphisms identified in field Primary data reduction Scatterbrain (Australian Synchrotron) isolates (Fig. 6C). Although mutation of residues within the first Data processing PRIMUS, GNOM cluster of polymorphisms PvRBP2aN186S, K421M seemed to have a Rigid-body modeling COLORES detrimental effect on PvRBP2a ability to bind erythrocytes, we Computation of CRYSOL did not see a reduction in erythrocyte binding for the other model intensities variants: PvRBP2aE277K, K285N, K289E, PvRBP2aE277K, K285I, K289E, Porod–Debye analysis ScÅtter PvRBP2aE304K, D306V, P399S, PvRBP2aE304K, D306V, PvRBP2aE351Q, 3D graphics representations Sculptor and PvRBP2aG438E (Fig. 8D). We observed a slight increase in

*q is the magnitude of the scattering vector, which is related to the scatter- MICROBIOLOGY ing angle (2θ) and the wavelength (λ) as follows: q = (4π/λ)sinθ. AB erythrocytes. This protein fragment bound to erythrocytes treated with neuraminidase and chymotrypsin, but not when treated with trypsin (Fig. 7E). This binding property repre- sents an enzyme profile different from PfRh4 binding, which is neuraminidase-resistant but trypsin- and chymotrypsin-sensitive (28). Due to the striking structural similarity with PfRh5, we hy- pothesized that the putative erythrocyte-binding domain within PvRBP2a resides within PvRBP2a160–455. Unfortunately, we were unsuccessful in purifying PvRBP2a160–455 as a recombinant fragment in E. coli using both native and refolding methods. C Alternatively, we expressed a construct that contained a deletion of amino acids 160–460 within PvRBP2a160–1000, which we will refer to as PvRBP2aΔ160–460 (Fig. 8 A–C). PvRBP2aΔ160–460 did not bind red blood cells (Fig. 8D), showing that amino acid 160– 460 is the erythrocyte-binding domain of PvRBP2a. In this study, we sought to identify amino acid residues within PvRBP2a that are important for erythrocyte binding. We designed and expressed a variety of individual or clusters of mutations within PvRBP2a160–1000 based on the structural comparisons with PfRh5 and the identified field polymorphisms (Fig. 8 A–C). In most cases, these mutations were named after their respective positions within PvRBP2a, preceded by the original amino acid Fig. 4. Structural comparison between PvRBP2a and PfRh5. (A)Superimpo- residue followed by the mutant amino acid residue: e.g., in sition of PvRBP2a (magenta) and PfRh5 (blue) structures in ribbon represen- PvRBP2aE351Q, PvRBP2a was mutated in residue 351 from glu- tation. The molecules can be aligned with the calculated rmsd of atomic tamic acid to glutamine. To ensure that these protein variants positions 3.4 Å. The cysteine residues forming disulfide bridges are shown as were well-folded, we performed circular dichroism spectroscopy yellow sticks. The PDB ID code for PfRh5 is 4WAT. (B)Comparisonofthe and observed that only PvRBP2aP180A, F181A, Y182A (which has electrostatic potential surfaces of two orthogonal views of PfRh5 (Upper)and three mutations within the γ-turn) showed different secondary PvRBP2a (Lower). The negatively charged area is clearly visible in the upper part of the PvRBP2a molecule. Electrostatic surface potentials were calculated structure features from the WT protein PvRBP2a160–1000 (Fig. 8A). using the program APBS with the nonlinear Poisson–Boltzmann equation and As expected, PvRBP2aP180A, F181A, Y182A did not bind erythrocytes ± (Fig. 8D). contoured at 5kT/e. Negatively and positively charged surface areas are Comparison of the PvRBP2a and PfRh5 structures highlighted colored red and blue, respectively. (C) Basigin-binding site in PfRh5 is super- – imposed with the corresponding region in PvRBP2a. Residues forming the differences in surface properties within the PfRh5-basigin negatively charged area in PvRBP2a are localized in helix α4 and include E304, binding region (24). The corresponding area within PvRBP2a D306, E309, E313, E317, and E321. This glutamate-rich region overlaps with the has predominantly a negative electrostatic potential due to the basigin-binding site in PfRh5 (shown as a white surface representation, with presence of several acidic amino acid residues localized in helix the residues interacting with basigin highlighted in teal). The PvRBP2a mole- α4 and the preceding loop: e.g., E304, D306, E309, E313, E317, cule is shown as ribbons colored in magenta, and amino acid residues bearing and E321 (Fig. 4C). We mutated the six residues mentioned negative charge are shown as yellow/red sticks.

Gruszczyk et al. PNAS Early Edition | 5of10 Downloaded by guest on October 1, 2021 Fig. 5. Sequence alignment of PvRBP family. (Left) Alignment of the sequence corresponding to the crystallized fragment of PvRBP2a with sequences of six other members of the PvRBP family highlights a conserved domain. The secondary structure elements of PvRBP2a are shown above the alignment. The bonded pairs of cysteine residues are highlighted in yellow and marked with green numbers. (Right) Two orthogonal views of the PvRBP2a structure, with conserved amino acid residues highlighted as magenta sticks. The conserved residues have a mostly hydrophobic character and are involved in packing of the helical core.

erythrocyte binding in proteins with mutations within the negative abolished upon trypsin treatment but is resistant to neuramini- patch (PvRBP2aE304K, D306V, P399S and PvRBP2aE304K, D306V), but dase and chymotrypsin treatment. Unfortunately, none of the this increase was not statistically significant. known erythrocyte receptors in P. vivax and P. falciparum in- vasion share a similar enzyme profile. We propose that, whereas Discussion PvRBP2a does not govern entry into reticulocytes specifically, it Here we show, to our knowledge, the first high-resolution crystal may function in the subsequent steps in parasite invasion that are structure of the erythrocyte-binding domain of a member of the universal between P. vivax and P. falciparum. These molecular PvRBP family of proteins. This erythrocyte-binding domain from events may include modulation of the erythrocyte cytoskeleton, PvRBP2a shares a structural fold similar to PfRh5, an essential the assembly of the tight junction, or signaling within the parasite invasion ligand in P. falciparum and one of the current leading for rhoptry secretion. However, the elucidation of PvRBP2a vaccine candidates against blood stage infection (22–24). Align- function remains challenging due to the lack of a long-term in ment of PvRBP2a erythrocyte-binding domain and other coding vitro culture and of genetic manipulation techniques for P. vivax. PvRBPs shows a conserved domain that is characterized by the The crystal structure of PfRh5 with basigin identified critical placement of disulfide bonds. Sequence diversity analyses using residues within the binding interface (24). Overlay of PfRh5′s field and reference isolates show that the PvRBP2a erythrocyte- binding interface upon PvRBP2a160–455 shows that the surface binding domain lies within a region of significant polymorphism properties within this region are quite different. PvRBP2a160–455 and balancing selection. We propose that, whereas the PfRhs and shows a distinct patch of negative charge residues mostly con- PvRBPs family of proteins share a similar fold, the surface prop- tributed by E304, D306, E309, E313, E317, and E321 within helix erties for each member are distinct and reflect the specific nature α4. Mutation of these residues to lysine resulted in a marked re- of individual ligand–receptor interaction. duction in the ability of PvRBP2a to bind erythrocytes. Sequence Our binding assays show that PvRBP2a recognizes an eryth- analyses of field isolates show E304 and D306 as lysines in their rocyte receptor that is present on both reticulocytes and mature respective positions, and we also observed a slight increase in erythrocytes. Its interaction with the receptor is completely erythrocyte-binding in proteins with mutations within the negative

6of10 | www.pnas.org/cgi/doi/10.1073/pnas.1516512113 Gruszczyk et al. Downloaded by guest on October 1, 2021 A present within the family, as observed for DBL and EBA families PNAS PLUS 0.025 (29, 30). Cocrystallization studies between PfRh5 and inhibitory mono- 0.020 clonal antibodies have identified regions important for receptor– ligand interaction . Monoclonal antibody QA1 raised against 0.015 PfRh5 blocks PfRh5–basigin interaction and inhibits parasite growth in vitro (18). QA1 binds to the loop at the tip of PfRh5 0.010 that is held by a disulfide bond between C345 and C351 (24). Residues within this loop are important for receptor specificity 0.005 because SNP N347 has been linked with the ability of P. falci- parum to invade Aotus monkey erythrocytes (16). In PvRBP2a, the cysteine bond between C299 and C303 overlaps with the B 0.000 bond between C345 and C351 in PfRh5. Mutation of C299 and C303 to serines resulted in a loss of PvRBP2a’s capability 2.0 to bind erythrocytes, showing that the presence of the bond and the loop is required for PvRBP2a function. From our alignment 1.0 of the PvRBP family, we observed that the most variable region between all PvRBP proteins includes the loop connecting 0.0 helices α3b and α4, which includes the disulfide bridge formed between C299 and C303 on PvRBP2a. Future experiments should Tajima’s D Tajima’s -1.0 explore the intriguing possibility of targeting both the PfRh and PvRBP family of proteins by generating monoclonal anti- -2.0 bodies that recognize specific conformational epitopes within the different loops. The two-stranded β-sheet within the N terminus of PfRh5 and 0 10002000 3000 4000 50006000 7000 8000 PvRBP2a is rotated by 180° between both proteins, with the Nucleotide position in pvrbp2a coding region orientation of the β-hairpin in PvRBP2a a result of a γ-turn with C β E304 F181 in its center. Residues within the -hairpin may contribute MICROBIOLOGY P399 to stacking interactions or extensive hydrogen-bonding interac- D306 K289 tions that assist in forming the overall shape of PvRBP2a by α α α α K285 keeping helices 2 and 3 apart. In particular, 2a and 3b in N186 PvRBP2a are shifted by a whole diameter compared with equivalent helices in PfRh5, and the difference in positioning K421 E277 may confer receptor specificity. Helix α3b within PfRh5 is an important interacting surface for inhibitory monoclonal antibody 9AD4, which shows the strongest effect on the reduction of G438 P. falciparum invasion (18). We hypothesize that the orientation of the β-hairpin within the erythrocyte-binding domain in the PfRh E351 and PvRBP family of proteins will confer some level of receptor specificity by modulating the 3D architecture of helix α2andα3. Fig. 6. Polymorphism and natural selection on the genes encoding PvRBP2a. Sequence diversity analyses using field and reference isolates (A) Sliding window analysis showing nucleotide diversity (π)valuesinpvrbp2a show that the erythrocyte-binding domain of PvRBP2a is localized for the 31 sequences analyzed. A window size of 100 bp and a step size of 3 bp within a region of significant balancing selection, suggesting were used. (B) Sliding window calculation of Tajima’s D statistic was performed = functional importance. From our erythrocyte-binding assays, it was for the PNG population (n 22). A window size of 100 and a step size of 3 were observed that PvRBP2a did not bind very well to red used. The gray box in A and B refers to a highly polymorphic region within N186S, K421M pvrbp2a that is under balancing selection. (C) Three views of the surface blood cells from Australian donors but may reflect an adaptation representation of the crystallized fragment of PvRBP2a, with polymorphic to its cognate receptor in malaria endemic populations. Sequence residues highlighted in green. The polymorphic residues cluster around polymorphisms in members of the P. falciparum erythrocyte- three main sites. The first one is localized on the tip of the molecule in the binding antigen (PfEBA) protein family result in changes either in region overlapping with the basigin-binding region in PfRh5 and includes receptor affinity or alternate receptor recognition, thus allowing residues E304K, D306V, and P399S. The second region is localized on the side functional flexibility in the use of erythrocyte receptors that are of the molecule within helix α3b and includes E277K, K285N/I, and K289E. under selection in malaria-endemic regions. The identification of The third region is localized in the cleft between helices α1 and α7 and in- the erythrocyte receptor for PvRBP2a will be important to help cludes N186S and K421M. There are two isolated single polymorphic sites, elucidate the functional relationship between its sequence poly- including E351Q and G438E. morphisms and receptor interaction. PvRBP2a is highly polymorphic, with 81 polymorphic amino acids among the 31 isolates analyzed. In contrast, for PfRh5, 50 patch (PvRBP2aE304K, D306V, P399S and PvRBP2aE304K, D306V). nonsynonymous polymorphisms were found among 204 isolates Collectively, these results highlight that the negative patch on (Plasmodium falciparum Sequencing Project, Broad Institute of PvRBP2a is an important surface for the interaction with eryth- Harvard and MIT) (31, 32). The higher diversity of PvRBP2a is rocyte proteins. The structural mechanism of how these poly- consistent with higher levels of diversity genome-wide for P. vivax morphisms will affect interaction with the erythrocyte receptor will compared with P. falciparum and is at least partly the result of a be important questions for the future. more stable and ancient parasite population (25). However, the It remains a matter of outstanding interest how PvRBP1a and high density of polymorphisms found in the elucidated binding PvRBP2c mediate specific reticulocyte binding (11). It is notable region of PvRBP2a and evidence of balancing selection suggest that the site implicated in erythrocyte binding in PvRBP2a aligns antigenic diversity that allows parasites to escape host immune with the basigin-binding region in PfRh5 (24). However, the responses. However, some polymorphisms may have roles other preponderance of negatively charged residues in this region in than antigenic diversity, such as adaptation to a polymorphic PvRBP2a is not conserved among other PvRBP family members, host receptor. The differential binding of mutants to red blood suggesting that divergent interfaces for receptor binding are cells is also consistent with this scenario.

Gruszczyk et al. PNAS Early Edition | 7of10 Downloaded by guest on October 1, 2021 A B C D 15 10 30 4 4 10 10 104 0.85 11 4.21 5.43 23.2 17.7 12 8 24 3 3 10 10 103 9 6 18 TO - 2 2 10 10 102 6 4 12 TO + 1 1 10 10 101 % positive 3 2 6 0 72.5 15.7 0 70.1 20.3 50.3 8.87 10 10 100 0 1 2 3 4 0 1 2 3 4 10 10 10 10 10 10 10 10 10 10 100 101 102 103 104 0 0 0 4 104 10 104 CD71 PvRBP2a PfRh4 11.8 0.51 0.21 1.21 0.11 0.07 E 3 20 103 10 103 10 8 15 2 102 10 102 6 10 1 4 101 10 101 2 5 0 87.5 0.26 0 73.6 24.9 0 73.5 26.3 10 10 10 % RBC binding % RBC binding 0 1 2 3 4 0 100 101 102 103 104 10 10 10 10 10 100 101 102 103 104 0 Rh4 - 647 PvRBP2a - 647

CD71-PECy5 Un Nm LT HT CHY Un Nm LT HT CHY Thiazole Orange Thiazole Orange Thiazole Orange PvRBP2a PfRh4

Fig. 7. PvRBP2a erythrocyte-binding characteristics. (A) Dot plots showing the enriched reticulocyte purification stained with both TO and CD71-PECy5 (Upper) or CD71-PECy5 alone (Lower). (B) Dot plots showing the binding of PvRBP2a to red blood cells (Upper). Binding was detected using an anti-PvRBP2a rabbit IgG antibody, followed by a secondary anti-rabbit Alexa 647 antibody. (Lower) A binding control where no protein was added before incubation with primary and secondary antibodies. (C) Dot plots showing the binding of PfRh4 to red blood cells (Upper). Binding was detected using an anti-PfRh4 rabbit IgG antibody, followed by a secondary anti-rabbit Alexa 647 antibody. (Lower) A binding control where no protein was added before incubation with primary − and secondary antibodies. (D) Bar charts showing the percentage of binding of CD71-PECy5, PvRBP2a, and PfRh4 to mature erythrocyte (TO ) vs. reticulocyte + (TO ) populations. Error bars represent SEM of three independent repeats. (E) PvRBP2a binding to erythrocytes is neuraminidase- and chymotrypsin-resistant but trypsin-sensitive whereas PfRh4 binding is unperturbed by neuraminidase treatment. A flow cytometry-based red blood cell-binding assay was performed using untreated erythrocytes (Un), neuraminidase (Nm), low trypsin (LT), high trypsin (HT), and chymotrypsin-treated erythrocytes (CHY). Low trypsin and high trypsin refer to treatments with 0.1 and 1.5 mg/mL enzyme, respectively.

SAXS data show that the longer fragment of PvRBP, including DNA-free Kit (Life Technologies) was used to remove the trace amount of residues 160–1000, forms an elongated shape in solution with the genomic DNA from purified RNA. Preparation of cDNA from total RNA with erythrocyte-binding domain localized at one end of the molecule. oligo dT followed the standard protocol of the SuperScript III First Strand Synthesis Kit (Life Technologies). cDNA was used as the template for PCR Furthermore, ultracentrifugation analyses of PvRBP2a160–1000 show that this elongated form exists solely as a monomer in so- amplification of several overlapping gene fragments covering the first 4,000 lution. The erythrocyte-binding domain of PvDBP is monomeric base pairs of the gene. Each amplicon was sent to Macrogen for dye ter- in the absence of its receptor DARC but dimerizes upon receptor minator sequencing. An intron was identified by sequencing and sub- sequently confirmed by PCR with primers 5′-AGG TGC TCT GGG CAG TTT T-3′ engagement to promote receptor specificity, potentially allowing and 5′-TTG GGG GAT TTT CCT TCC-3′ (Fig. 1A). invasion-activated signaling processes to occur. Whether or not the PvRBP family undergoes receptor-induced dimerization will Antibody Production. Antibody production was performed at the Walter and be an important structural aspect to examine in the future. Some Eliza Hall Institute Monoclonal Antibody Facility and approved by the Walter and attempts have been also undertaken to structurally and function- Eliza Hall Institute Animal Ethics Committee (2014.009). Rabbits were immunized

ally characterize different fragments of Py01365, a member of five times with 150 μg of PvRBP2a160–1135. The first immunization was adminis- Py235 family (reviewed in ref. 33). This protein is thought to tered with complete Freund’s adjuvant, and the rest with incomplete Freund’s function as an ATP/ADP sensor with its nucleotide-binding adjuvant. Rabbit IgG were purified from serum using Protein G Sepharose. properties residing within a 94-kDa domain, called NBD94 (34). The N-terminal portion of NBD94, so-called EBD1–194, has been Cloning of pvrbp2a and Variants into Expression Vectors. E. coli codon-opti- proposed to be involved in erythrocyte binding (35). However, mized DNA of PvRBP2a (www.plasmodb.org/plasmo/) (PVX_121920) was SAXS data indicate that NBD94444–547 and NBD94674–793 do not purchased from Life Technologies. The fragment encompassing residues have a similar architectural arrangement to PvRBP2a (36). 160–1135 was cloned into the pPROEX HTb vector (Life Technologies). To An outstanding question in P. vivax biology is the function of the generate mutations within PvRBP2a, we used a construct encompassing PvRBP family in parasite invasion of red blood cells. PvRBP1a and residues 160–1000 in the same vector. Restriction-free cloning was used to PvRBP2c bind specifically to reticulocytes and thus allow P. vivax to introduce the specific amino acid substitutions following a protocol de- enter this particular cell type (11), although the reticulocyte-specific scribed before (37). The list of all primer sequences is presented in Table S1. receptors are yet to be identified. Mounting evidence shows that All positive clones were sequenced to verify the presence of each mutation. P. vivax is capable of invading Duffy-negative red blood cells, and one possibility is that PvRBPs may provide additional parasite Expression and Purification of Recombinant PvRBP2a. Protein expression was performed using the SHuffle T7 E. coli strain (New England Biolabs). Soluble adhesins for engaging with other erythrocyte receptors as entry protein was captured using a HisTrap column (GE Healthcare), and eluted pro- portals. Expression of several PvRBPs at the apical tip may form a tein was dialyzed overnight in the presence of TEV protease. The protein sample local concentration of adhesins to facilitate binding and affinity of was passed once again through a HisTrap column, and the collected flow- the merozoite to erythrocyte. Future examination of the red blood through was subjected to size exclusion chromatography. Expression and puri-

cell-binding profiles of each individual PvRBP and identification of fication of PvRBP2a160–1000 and variants were performed in a similar fashion. their cognate receptors will be important to further our un- derstanding of P. vivax invasion and host cell selection. Crystallization of PvRBP2a, X-Ray Data Collection, and Structure Determination.

A PvRBP2a160–1135 construct was used for crystallization trials. Crystal Materials and Methods screenings were performed at the CSIRO Collaborative Crystallization Center cDNA Sequencing of PvRBP2a. Fifty microliters of P. vivax patient blood was (Parkville, Australia). The initial crystal hit was optimized manually in the collected in 250 μL of TRIzol (Life Technologies) for RNA extraction. A TURBO presence of porcine elastase (MP Biomedicals). Diffraction data were

8of10 | www.pnas.org/cgi/doi/10.1073/pnas.1516512113 Gruszczyk et al. Downloaded by guest on October 1, 2021 Circular Dichroism Data Collection, Processing, and Analysis. Circular dichroism PNAS PLUS A C (CD) data were collected using CD spectrometer Model 410 (Aviv Biomedical). 50 The spectra were processed using the DichroWeb server (43, 44). The content )

-1 40 WT E/DmutK of secondary structures was predicted using the CDSSTR algorithm (45) and 30 C299S C303S SMP180 reference set (46) for 190–240 nm. dmol 2 20 P180A F181A Y182A P180A F181A Y182A E/D mut K C299S C303S 10 kDa N186S K421M E277K K285N K289E E277K K285I K289E E304K D306V E304K D306V P399S E351Q G438E 250 Sequence Alignment of PvRBP Family. Sequence alignment was performed 0 (deg cm 150

-3 using ClustalW2 (47), corrected manually and visualized using program -10 100 75 ESPript3.0 (48). All protein sequences were obtained from the PlasmoDB

] x 10 -20 50 Database under the following accession numbers: PvRBP1a PVX_098585, -30 37 PvRBP1b PVX_098582, PvRBP2a PVX_121920, PvRBP2b PVX_094255, PvRBP2c 200 210 220 230 240 25 20 PVX_090325, PvRBP2-P1 PVX_090330, and PvRBP2-P2 PVX_101590. Wavelength (nm) 15 B 50 D WT 8 Population Genetics Analyses. To investigate naturally occurring diversity )

-1 40 E277K K285N K289E E304K D306V among PvRBP2a gene sequences, the full-length sequence was analyzed 30 E304K D306V P399S 6 (nucleotides 1–7464), and nine previously published PvRBP2a gene se- dmol G438E 2 20 E351Q quences were available for analysis, including four sequences from Thailand N186S K421M 10 E277K K285I K289E 4 (nucleotides 1303–7464 only) (26) and five P. vivax reference isolates under

(deg cm 0 the following PlasmoDB database accession numbers: Sal-1 (PVX_121920), -3 2 Brazil I (PVBG_04239), India VII (PVIIG_00453), Mauritania (PVMG_03227), -10 % RBC Binding x 10 -20 and North Korea (PVNG_00022). In addition, 22 high quality PvRBP2a gene 0 -30 sequences were extracted from available genome sequence data from field 200 210 220 230 240 WT isolates collected in Madang (n = 18), East Sepik (n = 3), and Milne Bay (n = 1) G438E Wavelength (nm) E351Q Provinces, in Papua New Guinea where P. vivax malaria is hyperendemic. E/D mut K Data were generated by the MalariaGEN Plasmodium vivax Community E304K D306V C299S C303S N186S K421M Project (www.malariagen.net/projects/parasite/pv) and by the Plasmodium vivax Sequencing Project at the Broad Institute of Harvard and MIT (www. E277K K285I K289E P180A F181A Y182A E304K D306V P399S E277K K285N K289E broadinstitute.org/) using different Illumina short read sequencing plat- Fig. 8. Identification of amino acid residues in PvRBP2a important for forms. Briefly, Illumina short reads were aligned to the reference genome Salvador 1 using bwa-short (49). Additional read trimming was performed if erythrocyte binding. (A) Far UV CD spectrum for the WT PvRBP2a160–1000 (WT) compared with the spectra of various structural mutants, including the base quality score was below 15, and reads were aligned in either single- MICROBIOLOGY end or paired-end mode depending on their read-group. PCR and optical PvRBP2aE/DmutK, PvRBP2aC299S, C303S, PvRBP2aΔ160–460, and PvRBP2aP180A, duplicates were marked using PicardTools (broadinstitute.github.io/picard/). F181A, Y182A, as labeled. Only the last triple-alanine mutant exhibits a dif- ferent CD spectrum compared with the WT protein. (B) Far UV CD spec- Variant calls in each isolate were obtained using samtools mpileup (50) for reads mapping to the PvRBP2a region, and then consensus sequences were trum for the WT PvRBP2a160–1000 (WT) and various SNP mutants, including extracted using vcf-consensus from the vcftools package (51). All 31 con- PvRBP2aE277K, K285N, K289E, PvRBP2aE304K, D306V, PvRBP2aE304K, D306V, P399S, sensus pvrbp2a sequences are available in Dataset S1. PvRBP2aG438E, PvRBP2aE351Q, PvRBP2aN186S, K421M, and PvRBP2aE277K, K285I, K289E, as labeled. All presented spectra superimpose very well, suggesting Multiple alignments of the consensus sequences were performed using the that the introduced mutations did not alter the proper folding of the pro- MUSCLE algorithm implemented in MEGA version 5.0 software (52). Genetic tein. (C) SDS/PAGE gel of purified PvRBP2a constructs loaded, as labeled. polymorphism and diversity were estimated using DnaSP version 5.0 (53) by Two micrograms of each construct were loaded onto a NuPAGE gradient gel calculating the total number of polymorphic sites, synonymous and non- (4–12%) under reducing conditions and stained with Coomassie Brilliant synonymous polymorphisms, and number of haplotypes (i.e., different Blue. Molecular mass marker indicated in kDa. (D) Deletion and mutational combinations of polymorphisms). Average pairwise nucleotide diversity (π) analyses of the erythrocyte-binding domain within PvRBP2a. Recombinant was calculated across the length of the gene using a sliding window ap- PvRBP2a proteins were tested for their capability to bind erythrocytes in a proach, with a window size of 100 and a step size of 3. flow cytometry-based assay. % RBC binding, the percentage of erythrocytes To identify regions under selection, the Tajima’s D test statistic was cal- with bound PvRBP2a protein determined by normalizing the number of culated using DnaSP version 5.0 (53), using a sliding window approach with a erythrocytes exhibiting a positive Alexa Fluor 488 signal that is above the window size of 100 and a step size of 3. Positive values indicate an excess of background (which is the Alexa Fluor 488 signal of erythrocytes without alleles at intermediate frequencies, indicative of a recent population bot- protein added) on the total number of erythrocytes. tleneck or balancing (immune) selection (54). Negative values of the Tajima’s D test statistic indicate an excess of rare alleles, consistent with directional selection or recent population expansion. collected at the MX2 microfocus beamline at the Australian Synchrotron Facility in Clayton. Molecular replacement was attempted with the PfRh5 structure Flow Cytometry-Based Erythrocyte-Binding Assay. The flow cytometry-based as a search model (PDB ID code 4WAT) but was unsuccessful. The single erythrocyte-binding assay was performed as described with the following isomorphous replacement with anomalous scattering (SIRAS) approach was modifications (27). Five micrograms of recombinant PvRBP2a was incubated μ attempted, combining native and derivative datasets and using AutoRickshaw with 100 L of erythrocyte suspension for 1 h at room temperature. After binding, erythrocytes were washed, and PvRBP2a binding was detected (38). The obtained initial model was rebuilt automatically using the program using anti-PvRBP2a rabbit IgG (1 mg/mL, 1:100, followed by Alexa Fluor AutoBuild (39), followed by manual improvement using the program Coot 647 chicken anti-rabbit secondary antibody; Life Technologies). Erythro- (40). The structure was refined using the program Phenix Refine (39), and the cytes were washed twice and resuspended in 600 μLofPBS.Intheex- refinement statistics are given in Table 1. The atomic coordinates and structure periments where purified reticulocytes were used, 100 μLofthiazole factors have been deposited in the Protein Data Bank with accession number orange (BD Bioscience) was added and incubated at room temperature for 4Z8N. The PDB Structure Validation Report is provided in Dataset S2. 30 min before final resuspension in PBS. A total of 50,000 erythrocytes were read on the FACSCalibur flow cytometer (BD Bioscience), and the Small-Angle X-Ray Scattering Data Collection, Processing, and Analysis. SAXS results were analyzed using FlowJo software (Three Star). The percentage

data were collected on PvRBP2a160–1000 (8 mg/mL) eluted in Dulbecco’sPhos- of erythrocytes with bound recombinant PvRBP2a was determined by phate-Buffered Saline (DBP) by inline size exclusion chromatography (SEC) via normalizing the number of erythrocytes exhibiting a positive Alexa Fluor 647 (or 488) signal that is above the background [which is the Alexa Fluor a capillary in the path of the X-ray beam. Data reduction was performed using 647 (or 488) signal of erythrocytes without recombinant protein added] on Scatterbrain, and processing is detailed in Table 2, as before (41). the total number of erythrocytes.

Analytical Ultracentrifugation Data Collection, Processing, and Analysis. Samples ACKNOWLEDGMENTS. We thank Janet Newman and Shane Seabrook from were analyzed using an XL-I analytical ultracentrifuge (Beckman Coulter) equipped the CSIRO Collaborative Crystallization Centre (C3) for help with setting up with an AnTi-60 rotor. The data were processed using the program SEDFIT (42). the crystallization screens, the Walter and Eliza Hall Institute’s Monoclonal

Gruszczyk et al. PNAS Early Edition | 9of10 Downloaded by guest on October 1, 2021 Antibody Facility for production of antibodies, and the MX and SAXS beam- was supported by Grant U19AI089676 (National Institute of Allergy and In- line staff at the Australian Synchrotron for assistance during data collection. fectious Diseases International Centers of Excellence for Malaria Research). We thank Prof. Mike Lawrence at the Walter and Eliza Hall Institute for We acknowledge the NIAID-funded Broad Institute Genome Sequencing critical comments on the manuscript. This work was supported by an Australian Center for Infectious Disease and the International Centers for Excellence Research Council Future Fellowship (to W.-H.T.) and an NHMRC Senior Research in Malaria Research for providing genome sequence data prior to publica- Fellowship and NHMRC Program Grant (to M.B.). This publication uses data tion. We acknowledge the Victorian State Government Operational Infra- from the MalariaGEN Plasmodium vivax Community Project (www.malariagen. structure Support and Australian Government National Health and net/projects/parasite/pv) and from the Plasmodium vivax Sequencing Project Medical Research Council Independent Research Institute Infrastructure at the Broad Institute of Harvard and MIT (www.broadinstitute.org/) and Support Scheme.

1. Gething PW, et al. (2012) A long neglected world malaria map: Plasmodium vivax 30. Batchelor JD, Zahm JA, Tolia NH (2011) Dimerization of Plasmodium vivax DBP is endemicity in 2010. PLoS Negl Trop Dis 6(9):e1814. induced upon receptor binding and drives recognition of DARC. Nat Struct Mol Biol 2. Miller LH, Baruch DI, Marsh K, Doumbo OK (2002) The pathogenic basis of malaria. 18(8):908–914. Nature 415(6872):673–679. 31. Amambua-Ngwa A, et al. (2012) Population genomic scan for candidate signatures of 3. Cowman AF, Berry D, Baum J (2012) The cellular and molecular basis for malaria balancing selection to guide antigen characterization in malaria parasites. PLoS parasite invasion of the human red blood cell. J Cell Biol 198(6):961–971. Genet 8(11):e1002992. 4. Kitchen SF (1938) The infection of reticulocytes by Plasmodium vivax. Am J Trop Med 32. Chang H-H, et al. (2013) Malaria life cycle intensifies both natural selection and Hyg s1-18(4):347–359. random genetic drift. Proc Natl Acad Sci USA 110(50):20129–20134. 5. Carlton JM, et al. (2008) Comparative genomics of the neglected human malaria 33. Grüber A, Manimekalai MSS, Preiser PR, Grüber G (2012) Structural architecture and parasite Plasmodium vivax. Nature 455(7214):757–763. interplay of the nucleotide- and erythrocyte binding domain of the reticulocyte 6. Miller LH, Mason SJ, Clyde DF, McGinniss MH (1976) The resistance factor to Plas- binding protein Py235 from Plasmodium yoelii. Int J Parasitol 42(12):1083–1089. modium vivax in blacks: The Duffy-blood-group genotype, FyFy. N Engl J Med 295(6): 34. Ramalingam JK, Hunke C, Gao X, Grüber G, Preiser PR (2008) ATP/ADP binding to a 302–304. novel nucleotide binding domain of the reticulocyte-binding protein Py235 of Plas- 7. Barnwell JW, Nichols ME, Rubinstein P (1989) In vitro evaluation of the role of the modium yoelii. J Biol Chem 283(52):36386–36396. Duffy blood group in erythrocyte invasion by Plasmodium vivax. J Exp Med 169(5): 35. Grüber A, et al. (2011) Structural characterization of the erythrocyte binding domain – 1795 1802. of the reticulocyte binding protein homologue family of Plasmodium yoelii. Infect 8. Hester J, et al. (2013) De novo assembly of a field isolate genome reveals novel Immun 79(7):2880–2888. Plasmodium vivax erythrocyte invasion genes. PLoS Negl Trop Dis 7(12):e2569. 36. Grüber A, et al. (2010) Structural determination of functional units of the nucleotide 9. Grimberg BT, et al. (2007) Plasmodium vivax invasion of human erythrocytes inhibited binding domain (NBD94) of the reticulocyte binding protein Py235 of Plasmodium by antibodies directed against the Duffy binding protein. PLoS Med 4(12):e337. yoelii. PLoS One 5(2):e9146. 10. Zimmerman PA, Ferreira MU, Howes RE, Mercereau-Puijalon O (2013) Red blood cell 37. van den Ent F, Löwe J (2006) RF cloning: A restriction-free method for inserting target – polymorphism and susceptibility to Plasmodium vivax. Adv Parasitol 81:27 76. genes into plasmids. J Biochem Biophys Methods 67(1):67–74. 11. Galinski MR, Medina CC, Ingravallo P, Barnwell JW (1992) A reticulocyte-binding 38. Panjikar S, Parthasarathy V, Lamzin VS, Weiss MS, Tucker PA (2005) Auto-rickshaw: An – protein complex of Plasmodium vivax merozoites. Cell 69(7):1213 1226. automated crystal structure determination platform as an efficient tool for the vali- 12. Rayner JC, Galinski MR, Ingravallo P, Barnwell JW (2000) Two Plasmodium falciparum dation of an X-ray diffraction experiment. Acta Crystallogr D Biol Crystallogr 61(Pt 4): genes express merozoite proteins that are related to Plasmodium vivax and Plas- 449–457. modium yoelii adhesive proteins involved in host cell selection and invasion. Proc Natl 39. Adams PD, et al. (2010) PHENIX: A comprehensive Python-based system for macro- Acad Sci USA 97(17):9648–9653. molecular structure solution. Acta Crystallogr D Biol Crystallogr 66(Pt 2):213–221. 13. Triglia T, et al. (2001) Identification of proteins from Plasmodium falciparum that are 40. Emsley P, Lohkamp B, Scott WG, Cowtan K (2010) Features and development of Coot. homologous to reticulocyte binding proteins in Plasmodium vivax. Infect Immun Acta Crystallogr D Biol Crystallogr 66(Pt 4):486–501. 69(2):1084–1092. 41. Murphy JM, et al. (2013) The pseudokinase MLKL mediates necroptosis via a molec- 14. Paul AS, Egan ES, Duraisingh MT (2015) Host-parasite interactions that guide red ular switch mechanism. Immunity 39(3):443–453. blood cell invasion by malaria parasites. Curr Opin Hematol 22(3):220–226. 42. Schuck P, Rossmanith P (2000) Determination of the sedimentation coefficient dis- 15. Crosnier C, et al. (2011) Basigin is a receptor essential for erythrocyte invasion by tribution by least-squares boundary modeling. Biopolymers 54(5):328–341. Plasmodium falciparum. Nature 480(7378):534–537. 43. Whitmore L, Wallace BA (2008) Protein secondary structure analyses from circular di- 16. Hayton K, et al. (2008) Erythrocyte binding protein PfRH5 polymorphisms determine chroism spectroscopy: Methods and reference databases. Biopolymers 89(5):392–400. species-specific pathways of Plasmodium falciparum invasion. Cell Host Microbe 4(1): 44. Whitmore L, Wallace BA (2004) DICHROWEB, an online server for protein secondary 40–51. structure analyses from circular dichroism spectroscopic data. Nucleic Acids Res 32 17. Wanaguru M, Liu W, Hahn BH, Rayner JC, Wright GJ (2013) RH5-Basigin interaction (Web Server issue):W668–W673. plays a major role in the host tropism of Plasmodium falciparum. Proc Natl Acad Sci 45. Sreerama N, Woody RW (2000) Estimation of protein secondary structure from cir- USA 110(51):20735–20740. cular dichroism spectra: Comparison of CONTIN, SELCON, and CDSSTR methods with 18. Douglas AD, et al. (2014) Neutralization of Plasmodium falciparum merozoites by an expanded reference set. Anal Biochem 287(2):252–260. antibodies against PfRH5. J Immunol 192(1):245–258. 46. Abdul-Gader A, Miles AJ, Wallace BA (2011) A reference dataset for the analyses of 19. Chen L, et al. (2011) An EGF-like protein forms a complex with PfRh5 and is required for invasion of human erythrocytes by Plasmodium falciparum. PLoS Pathog 7(9): membrane protein secondary structures and transmembrane residues using circular – e1002199. dichroism spectroscopy. Bioinformatics 27(12):1630 1636. 20. Reddy KS, et al. (2015) Multiprotein complex between the GPI-anchored CyRPA with 47. Goujon M, et al. (2010) A new bioinformatics analysis tools framework at EMBL-EBI. – PfRH5 and PfRipr is crucial for Plasmodium falciparum erythrocyte invasion. Proc Natl Nucleic Acids Res 38(Web Server issue):W695 W699. Acad Sci USA 112(4):1179–1184. 48. Robert X, Gouet P (2014) Deciphering key features in protein structures with the new – 21. Bustamante LY, et al. (2013) A full-length recombinant Plasmodium falciparum PfRH5 ENDscript server. Nucleic Acids Res 42(Web Server issue):W320 W324. protein induces inhibitory antibodies that are effective across common PfRH5 genetic 49. Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler – variants. Vaccine 31(2):373–379. transform. Bioinformatics 25(14):1754 1760. 22. Douglas AD, et al. (2015) A PfRH5-based vaccine is efficacious against heterologous 50. Li H, et al.; 1000 Genome Project Data Processing Subgroup (2009) The Sequence – strain blood-stage Plasmodium falciparum infection in aotus monkeys. Cell Host Alignment/Map format and SAMtools. Bioinformatics 25(16):2078 2079. Microbe 17(1):130–139. 51. Danecek P, et al.; 1000 Genomes Project Analysis Group (2011) The variant call format – 23. Chen L, et al. (2014) Crystal structure of PfRh5, an essential P. falciparum ligand for and VCFtools. Bioinformatics 27(15):2156 2158. invasion of human erythrocytes. eLife 3:e04187. 52. Tamura K, et al. (2011) MEGA5: Molecular evolutionary genetics analysis using 24. Wright KE, et al. (2014) Structure of malaria invasion protein RH5 with erythrocyte maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol basigin and blocking antibodies. Nature 515(7527):427–430. Biol Evol 28(10):2731–2739. 25. Neafsey DE, et al. (2012) The malaria parasite Plasmodium vivax exhibits greater 53. Librado P, Rozas J (2009) DnaSP v5: A software for comprehensive analysis of DNA genetic diversity than Plasmodium falciparum. Nat Genet 44(9):1046–1050. polymorphism data. Bioinformatics 25(11):1451–1452. 26. Kosaisavee V, Hastings I, Craig A, Lek-Uthai U (2012) The geographical distribution of 54. Tajima F (1989) Statistical method for testing the neutral mutation hypothesis by DNA allele polymorphisms of Plasmodium vivax in different regions of Thailand. J Med polymorphism. Genetics 123(3):585–595. Assoc Thai 95(Suppl 6):S135–S140. 55. Battye TG, Kontogiannis L, Johnson O, Powell HR, Leslie AG (2011) iMOSFLM: A new 27. Tham W-H, et al. (2010) Complement receptor 1 is the host erythrocyte receptor for graphical interface for diffraction-image processing with MOSFLM. Acta Crystallogr D Plasmodium falciparum PfRh4 invasion ligand. Proc Natl Acad Sci USA 107(40): Biol Crystallogr 67(Pt 4):271–281. 17327–17332. 56. Winn MD, et al. (2011) Overview of the CCP4 suite and current developments. Acta 28. Tham W-H, et al. (2009) Antibodies to reticulocyte binding protein-like homologue 4 Crystallogr D Biol Crystallogr 67(Pt 4):235–242. inhibit invasion of Plasmodium falciparum into human erythrocytes. Infect Immun 57. Kantardjieff KA, Rupp B (2003) Matthews coefficient probabilities: Improved esti- 77(6):2427–2435. mates for unit cell contents of proteins, DNA, and protein-nucleic acid complex 29. Tolia NH, Enemark EJ, Sim BKL, Joshua-Tor L (2005) Structural basis for the EBA-175 crystals. Protein Sci 12(9):1865–1871. erythrocyte invasion pathway of the malaria parasite Plasmodium falciparum. Cell 58. Krissinel E, Henrick K (2007) Inference of macromolecular assemblies from crystalline 122(2):183–193. state. J Mol Biol 372(3):774–797.

10 of 10 | www.pnas.org/cgi/doi/10.1073/pnas.1516512113 Gruszczyk et al. Downloaded by guest on October 1, 2021