Phenomics, Genomics and Genetics in Plasmodium vinckei

Dissertation by

Abhinay Ramaprasad

In Partial Fulfillment of the Requirements

For the Degree of

Doctor of Philosophy

King Abdullah University of Science and Technology

Thuwal, Kingdom of Saudi Arabia

October, 2017

EXAMINATION COMMITTEE PAGE

The dissertation of Abhinay Ramaprasad is approved by the examination committee

Committee Chairperson: Prof. Arnab Pain

Co-Supervisor: Prof. Richard Culleton

Committee Members: Prof. Richard Carter, Prof. Takashi Gojobori, Prof. Xin Gao

©October, 2017

Abhinay Ramaprasad

All Rights Reserved

ABSTRACT

Phenomics, Genomics and Genetics in Plasmodium vinckei

Abhinay Ramaprasad

Rodent malaria parasites (RMPs) serve as tractable models for experimental genetics, and as valuable tools to study malaria parasite biology and host-parasite-vector interactions. Plasmodium vinckei, one of four RMPs adapted to laboratory mice, is the most geographically widespread species and displays considerable phenotypic and genotypic diversity amongst its subspecies and strains. The phenotypes and genotypes of P. vinckei isolates have been less well characterized than those of other RMPs, hampering its use as an experimental model for malaria. Here, we have studied the phenotypes and sequenced the genomes and transcriptomes of ten P. vinckei isolates including representatives of all five subspecies, all of which were collected from wild thicket rats (Thamnomys rutilans) in sub-Saharan Central Africa between the late 1940s and mid 1960s. We have generated a comprehensive resource for P. vinckei comprising five high-quality reference genomes, growth profiles and genotypes of P. vinckei isolates, and expression profiles of genes across the intra-erythrocytic developmental stages of the parasite. We observe significant phenotypic and genotypic diversity among P. vinckei isolates, making them particularly suitable for classical genetics and genomics-driven studies on malaria parasite biology. As part of a proof-of-concept study, we have shown that experimental genetic crosses can be performed between P. vinckei parasites to potentially identify genotype-phenotype relationships. We have also shown that they are amenable to genetic manipulation in the laboratory.

ACKNOWLEDGEMENTS

My deepest gratitude to my advisors, Professor Arnab Pain and Professor Richard Culleton. I thank Arnab for his unwavering support and guidance throughout my Masters and Doctoral studies and for always motivating me to achieve more. I thank him for the valuable research training he provided through several projects, his sage advice regarding my career decisions and for hosting awesome barbeques. I thank Richard for guiding me through the world of rodent malaria, personally training me in the wet lab work and for the many fruitful discussions we have had on diverse topics from malaria’s history to Bob Dylan.

I would like to thank Hussein Abkallo for teaching me reverse genetics techniques and for all the laughs. Thanks to Severina Klaus for the outstanding work and for sharing the pain and glory of generating genetic crosses in P. vinckei. Also, for the limitless supply of German chocolate.

I want to thank all members of the Pathogen Genomics group at KAUST and the Malaria Unit at NEKKEN for their support. I would also like to thank the Protozoology lab at NEKKEN, KAUST Bioscience Core Lab and CBRC Dragon cluster for their support. Special thanks to Sarina, Fathia, Raeece, Jojo and Dr. Katsuhiko Mineta for their timely help in matters big and small.

I would like to thank King Abdullah University of Science and Technology for their financial support through the KAUST fellowship award.

Last but not least, I would like to thank those constants in the sea of variables - my parents, Chandrika and Prasad, for trading their dreams for mine, and my friends, Anu, Ice and Siva, for putting up with my hibernations, proofreading my write-ups and being the awesome friends that they are.

STATEMENT OF CONTRIBUTION

The work presented in this thesis is a product of joint efforts made by several individuals apart from me. With profound gratitude towards them, I would like to mention here their contributions along with mine during the course of my doctoral study.

The dissertation work was carried out at the Pathogen Genomics Lab, King Abdullah University of Science and Technology (KAUST) under the supervision of Prof. Arnab Pain (AP) and at the Malaria Unit, Institute of Tropical Medicine (NEKKEN), Nagasaki University under the supervision of Prof. Richard Culleton (RC). The project was jointly conceived by AP and RC.

• Chapter 2 - Phenotyping (At the Malaria Unit) All animal work was carried out by Abhinay Ramaprasad (AR) under the supervision of RC. AR also carried out the parasitaemia counting and final presentation of the growth profiles. Transmission experiments were carried out by both AR and RC, where RC performed the mosquito dissections.

• Chapter 3 - Genomes and Transcriptomes (At Malaria Unit and KAUST) All animal work, DNA and RNA isolations were carried out by AR. PacBio sequencing and PacBio genome assembly were obtained commercially from Macrogen Inc. Illumina PCR-free library preparation was performed by AR. All subsequent bioinformatic analyses were performed by AR under the supervision of AP. The simplified protocol for RNAseq was formulated by AR.

• Chapter 4 - Genetics (At Malaria Unit and KAUST) The four genetic cross experiments and LGS were performed jointly by AR, Ms. Severina Klaus (SK) (University of Heidelberg) and RC. AR and SK took care of animal work and transmission to mosquitoes. RC performed mosquito dissection. qPCR was performed by SK. DNA isolation, library prep and sequencing for LGS were performed by AR. Bioinformatic analysis for LGS was carried out by Chris Illingworth (CI) (University of Cambridge), AR and Axel Martinelli (AM) (Hokkaido University). AR performed read mapping and SNP calling using scripts generated by AM. CI performed the mathematical modelling steps. Genetic manipulation experiments, including plasmid design and construction and parasite transfection, were done by AR. Cloning of transformed parasites was done by RC.

Apart from this, I would like to acknowledge here Dr. Hussein Abkallo (University of Edinburgh) and Dr. Kittisak Thawnashom (Naresuan University) who were generous in providing me with advice and training for the genetic manipulation experiments.

In addition to the work presented in this dissertation, I have contributed towards other projects (some of which have resulted in publications listed below) during my doctoral study at the Pathogen Genomics Laboratory, KAUST and Malaria Unit, NEKKEN.

1. Abkallo HM, Martinelli A, Inoue M, Ramaprasad A, Xangsayarath P, Gitaka J, Tang J, Yahata K, Zoungrana A, Mitaka H, Acharjee A, Datta PP, Hunt P, Carter R, Kaneko O, Mustonen V, Illingworth CJR, Pain A, Culleton R. Rapid identification of genes controlling virulence and immunity in malaria parasites. PLoS Pathogens. 2017;13(7): e1006447. doi: 10.1371/journal.ppat.1006447 Contribution Transcriptome sequencing and analysis

2. Lu F, Culleton R, Zhang M, Ramaprasad A, Seidlein L, Zhou H, Zhu G, Tang J, Liu Y, Wang W, Cao Y, Xu S, Gu Y, Li J, Zhang C, Gao Q, Menard D, Pain A, Yang H, Zhang Q, Cao J. Emergence of Indigenous Artemisinin-Resistant Plasmodium falciparum in Africa. N Engl J Med. 2017;0028-4793. doi: 10.1056/NEJMc1612765. Contribution Determined geographical origin of ART resistant parasite by DNA sequencing and SNP analysis.

3. Ansari HR, Templeton TJ, Subudhi AK, Ramaprasad A, Tang J, Lu F, Naeem R, Hashish Y, Oguike MC, Benavente ED, Clark TG, Sutherland CJ, Barnwell JW, Culleton R, Cao J, Pain A. Genome-scale comparison of expanded gene families in Plasmodium ovale wallikeri and Plasmodium ovale curtisi with Plasmodium malariae and with other Plasmodium species. Int J Parasitol. 2016;S0020-7519(16)30135-7. doi: 10.1016/j.ijpara.2016.05.009. Contribution DNA sequencing, Genome assembly and annotation.

4. Moon RW, Sharaf H, Hastings CH, Ho YS, Nair MB, Rchiad Z, Knuepfer E, Ramaprasad A, Mohring F, Amir A, Yusuf NA, Hall J, Almond N, Lau YL, Pain A, Blackman MJ, Holder AA. Normocyte-binding protein required for human erythrocyte invasion by the zoonotic malaria parasite Plasmodium knowlesi. Proc Natl Acad Sci U S A. 2016;113(26): 7231-7236. doi: 10.1073/pnas.1522469113. Contribution Manual curation of gene models.

5. Roques M*, Wall RJ*, Douglass AP*, Ramaprasad A*, Ferguson DJ, Kaindama ML, Brusini L, Joshi N, Rchiad Z, Brady D, Guttery DS, Wheatley SP, Yamano H, Holder AA, Pain A, Wickstead B, Tewari R. Plasmodium P-Type Cyclin CYC3 Modulates Endomitotic Growth during Oocyst Development in Mosquitoes. PLoS Pathogens. 2015;11(11):e1005273. doi: 10.1371/journal.ppat.1005273. (* equal contribution) Contribution Transcriptome sequencing and differential expression analysis.

6. Woo YH, Ansari H, Otto TD, Klinger CM, Kolisko M, Michalek J, Saxena A, Shanmugam D, Tayyrov A, Veluchamy A, Ali S, Bernal A, Del Campo J, Cihlar J, Flegontov P, Gornik SG, Hajduskova E, Horak A, Janouskovec J, Katris NJ, Mast FD, Miranda-Saavedra D, Mourier T, Naeem R, Nair M, Panigrahi AK, Rawlings ND, Padron-Regalado E, Ramaprasad A, Samad N, Tomcala A, Wilkes J, Neafsey DE, Doerig C, Bowler C, Keeling PJ, Roos DS, Dacks JB, Templeton TJ, Waller RF, Lukes J, Obornik M, Pain A. Chromerid genomes reveal the evolutionary path from photosynthetic algae to obligate intracellular parasites. eLife. 2015;4. doi: 10.7554/eLife.06974. Contribution Manual curation of gene models.

7. Gornik SG, Febrimarsa, Cassin AM, MacRae JI, Ramaprasad A, Rchiad Z, McConville MJ, Bacic A, McFadden GI, Pain A, Waller RF. Endosymbiosis undone by stepwise elimination of the plastid in a parasitic dinoflagellate. Proc Natl Acad Sci U S A. 2015;112(18):5767-72. doi: 10.1073/pnas.1423400112. PubMed PMID: 25902514. Contribution DNA and RNA sequencing

8. Guttery DS, Poulin B, Ramaprasad A, Wall RJ, Ferguson DJP, Brady D, Patzewitz E-M, Whipple S, Straschil U, Wright MH, Mohamed AMAH, Radhakrishnan A, Arold ST, Tate EW, Holder AA, Wickstead B, Pain A, Tewari R. Genome-wide Functional Analysis of Plasmodium Protein Phosphatases Reveals Key Regulators of Parasite Development and Differentiation. Cell Host & Microbe. 2014;16(1):128-40. doi: 10.1016/j.chom.2014.05.020. Contribution Transcriptome sequencing and differential expression analysis.

Manuscripts under preparation

1. A quick and cost-efficient protocol for time-series transcriptomics in rodent malaria parasites.

2. A phenotypic and genotypic resource for performing genetic linkage studies in the rodent malaria parasite, Plasmodium vinckei.

TABLE OF CONTENTS

Examination Committee Page
Copyright
Abstract
Acknowledgements
Statement of Contribution
List of Abbreviations
List of Figures
List of Tables

1 Introduction
  1.1 The malaria parasite
  1.2 Rodent malaria parasites
  1.3 Plasmodium vinckei clade
  1.4 Objectives

2 Phenotypic diversity within Plasmodium vinckei
  2.1 Review of Literature - Phenotypes of RMPs
  2.2 Results and Discussion
    2.2.1 Differences in growth phenotypes of P. vinckei isolates
    2.2.2 Optimal transmission temperature for P. v. subsp. isolates
  2.3 Methods
    2.3.1 Ethics statement
    2.3.2 Mice and mosquitoes
    2.3.3 Thawing of deep frozen rodent malaria parasites
    2.3.4 Blood collection from mice
    2.3.5 Preparation of frozen stabilates of parasites for long term storage
    2.3.6 Parasite Cloning
    2.3.7 Growth phenotyping
    2.3.8 Transmission of P. v. subsp. isolates

3 Genomes and Transcriptomes of Plasmodium vinckei
  3.1 Review of Literature
    3.1.1 Genome of the malaria parasite
    3.1.2 Genomes of rodent malaria parasites
    3.1.3 Single nucleotide polymorphisms in RMPs
    3.1.4 Functional genomics in the malaria parasite
  3.2 Results and Discussion
    3.2.1 Genome assembly and organization of P. vinckei genomes
    3.2.2 Annotation of P. vinckei genomes
    3.2.3 Sub-telomeric multigene families in P. vinckei
    3.2.4 Genotypic diversity within P. vinckei isolates
    3.2.5 Transcriptome analysis of P. vinckei blood stages
  3.3 Methods
    3.3.1 DNA extraction and sequencing
    3.3.2 Genome assembly
    3.3.3 Comparative genomics
    3.3.4 Genome annotation
    3.3.5 Whole genome phylogeny construction
    3.3.6 Genotyping
    3.3.7 Sample collection for P. vinckei vinckei CY blood stage transcriptome
    3.3.8 RNA extraction and sequencing
    3.3.9 Transcriptome analysis

4 Genetics in Plasmodium vinckei
  4.1 Review of Literature
    4.1.1 Classical genetics in the malaria parasite
    4.1.2 Genetic crossing in Plasmodium
    4.1.3 Molecular markers and genetic linkage maps
    4.1.4 Classical linkage analysis in Plasmodium
    4.1.5 Linkage Group Selection (LGS)
    4.1.6 Reverse genetics in RMPs
  4.2 Results and Discussion
    4.2.1 Genetics in P. vinckei
    4.2.2 Genetic manipulation in P. vinckei
  4.3 Methods
    4.3.1 Genetic cross
    4.3.2 Linkage Group Selection
    4.3.3 Plasmid construction of pPvvCY-∆p230p-gfpLuc
    4.3.4 Transfection

5 Concluding Remarks

References

Appendices

LIST OF ABBREVIATIONS

PvbDA    Plasmodium vinckei brucechwatti DA
PvbDB    Plasmodium vinckei brucechwatti DB
PvlDE    Plasmodium vinckei lentum DE
PvlDS    Plasmodium vinckei lentum DS
PvpBS    Plasmodium vinckei petteri BS
PvpCR    Plasmodium vinckei petteri CR
PvsEE    Plasmodium vinckei subsp. EE
PvsEH    Plasmodium vinckei subsp. EH
PvsEL    Plasmodium vinckei subsp. EL
PvvCY    Plasmodium vinckei vinckei CY
ema1     erythrocyte membrane antigen 1
msp1     merozoite surface protein 1
pir      Plasmodium interspersed repeat
AFLP     amplified fragment length polymorphism
ETRAMP   early transcribed membrane protein
FPKM     Fragments Per Kilobase of transcript per Million mapped reads
GFP      green fluorescent protein
GFPLuc   green fluorescent protein - firefly luciferase
IDC      intra-erythrocytic developmental cycle
LGS      Linkage Group Selection
MS       microsatellite
PEXEL    Plasmodium EXport ELement
RBC      red blood cell
RBP      reticulocyte binding protein
RFLP     restriction fragment length polymorphism
RMP      rodent malaria parasite
SNP      single nucleotide polymorphism

LIST OF FIGURES

1.1 The complete life cycle of malaria parasite.
1.2 Rodent malaria parasites and their geographical origins.
1.3 Plasmodium vinckei isolates used in this study and their geographical origins.

2.1 Infection profiles of Plasmodium vinckei isolates.
2.2 Mosquito stages of P. vinckei subsp. EL.

3.1 Genome rearrangements within RMP genomes.
3.2 Whole genome-based phylogeny of Plasmodium vinckei subspecies.
3.3 RMP multigene families.
3.4 Inter- and Intra-specific comparisons of Plasmodium vinckei genotypes.
3.5 Blood stage transcriptome of P. vinckei vinckei CY.

4.1 Basics of genetics in malaria and Linkage Group Selection method.
4.2 Brief workflow of performing a genetic cross and Linkage Group Selection with P. vinckei isolates, PvsEH and PvsEL.
4.3 Allele distribution of P. vinckei isolates, PvsEH and PvsEL, after a genetic cross.
4.4 Genetic manipulation of P. vinckei vinckei CY to create PvGFP-Luccon, a PvvCY line constitutively expressing GFP-Luc fusion protein.

LIST OF TABLES

1.1 RMP species, subspecies and isolates.
1.2 List of Plasmodium vinckei isolates used in this study.

2.1 Optimal transmission temperature for P. v. subsp.

3.1 Characteristics of Plasmodium vinckei genomes.
3.2 Genotypic diversity within Plasmodium vinckei isolates.

4.1 Details of the transmission experiments undertaken.

To him that returns on the morrow, to him that returns for two (successive) days, to the takman that returns on the third day, homage shall be - Charm against takman (fever). Atharva Veda I, 25:4 (c. 1200 - 1000 BC)

It's risky business sharing your body with strangers- uninvited multiplicities hijacking what you have because to them you are what you have. - Cameron Conaway. Silence, Anopheles. Malaria, Poems. (2014 AD)

Chapter 1

Introduction

Malaria is one of the oldest recorded diseases affecting mankind. Egyptian mummies dating back to 3200 BC have been found to contain malaria antigen [Miller et al., 1994]. Ancient scripts from Vedic (Indian), Chinese, Mesopotamian and Greek civilizations have described the disease and its symptoms, and throughout history, malaria has affected human migration and population growth [Institute of Medicine (US) Committee on the Economics of Antimalarial Drugs, 2004]. Malaria parasite DNA has been found in 2000-year-old human remains in Rome [Sallares and Gomzi, 2001, Marciniak et al., 2016], supporting an old speculation that a malaria epidemic contributed to the fall of Roman civilization.

Malaria continues to exert a heavy toll on public health, especially in endemic sub-Saharan Africa where 92% of malaria deaths occur [World Health Organization., 2016]. Globally, there were as many as 212 million new cases of malaria reported and around half a million deaths in 2015. It is also a major killer of children under five years of age, with an estimated 303,000 children dying of malaria in 2015. The pathology of malaria is caused by the growth and replication in the blood of protozoan parasites of the genus Plasmodium. Disease symptoms are caused through two main mechanisms - the synchronized rupturing of red blood cells (RBCs) during parasite egress, and the sequestration of infected RBCs in the microvasculature of major organs. A person gets infected by the causal parasite through a mosquito bite and 10-15 days later begins to show characteristic symptoms of periodic high fevers, chills, headache and nausea. Severe manifestations include severe anaemia, seizures, coma, difficulty in breathing and death [Claire et al., 2004, Idro et al., 2005, Maitland and Newton, 2005]. Malaria can also cause complications during pregnancy, causing maternal anaemia leading to abortion, stillbirth and low birth weight [Desai et al., 2007].

Malaria is a preventable and treatable disease and considerable international efforts are being taken to reduce malaria burden. In 2015 alone, a total of $2.9 billion was spent globally to combat malaria [World Health Organization., 2016]. Insecticides and mosquito nets are being widely used to reduce malaria transmission. Rapid diagnostic tools are used to swiftly detect malaria infections for timely treatment with antimalarial drugs. There are several antimalarial drugs that are used to treat malaria, of which chloroquine and artemisinin-based compounds are most widely used. As of now, the recommended antimalarial treatment in Africa is a combination of drugs with different mechanisms of action, commonly known as artemisinin-based combination therapy (ACT) because they all include the antimalarial drug, artemisinin [Roll Back Malaria, 2005, World Health Organization., 2015].

Frustratingly, the malaria parasite is notorious for developing resistance to antimalarial drugs. Parasites able to resist widely-used chloroquine began to emerge in Thailand in 1957, spread through South-east Asia and reached Africa in the 1970s [Harinasuta et al., 1962, Wernsdorfer and Payne, 1991]. Chloroquine’s effectiveness has since declined and from 2005, artemisinin-based drugs have been first-line treatments in Africa [Roll Back Malaria, 2005]. The trend of drug resistance then repeated, this time against artemisinin, with artemisinin-resistant (ART) malaria parasites emerging in Cambodia [Noedl et al., 2008, Dondorp et al., 2009] and spreading through six other countries in South-east Asia [Phyo et al., 2012, Amaratunga et al., 2012, Hien et al., 2012, Kyaw et al., 2013, Ashley et al., 2014]. In 2017, ART parasites were detected for the first time in Africa in travellers returning to the UK [Sutherland et al., 2017] (a worrying sign that artemisinin resistance might have developed in Africa too) and to China [Lu et al., 2017].

A better understanding of the parasite’s biology is needed to address the above challenges in combating malaria by identifying new antimalarial drug and vaccine targets. Over the past few decades, researchers have been using various approaches to study the cellular and molecular mechanisms in key areas like cell invasion (reviewed in [Cowman et al., 2012, Satchwell, 2016]), immune evasion (reviewed in [Reeder and Brown, 1996, Scherf et al., 2008]), transmission (reviewed in [Guttery et al., 2012, 2015]) and drug resistance (reviewed in [White, 2004, Petersen et al., 2011a]).

In 2002, a team of scientists sequenced the whole genome of Plasmodium falciparum, the most severe of human malaria parasites, offering information on the parasite’s complete gene repertoire [Gardner et al., 2002]. The study revealed that the functions of about 60% of the parasite’s proteins could not be inferred, as their sequences did not show similarity to any proteins of other organisms. Identifying the functional roles of these proteins is itself a daunting task that requires systematic studies to link genes with unknown function to particular phenotypes. In order to achieve this, genetic approaches, mainly classical genetics (studying entities with phenotypic differences and linking gene(s) to these variations) (reviewed in [Culleton and Abkallo, 2014]) and genetic manipulation (perturbing a gene’s expression and observing the change in phenotype) (reviewed in [de Koning-Ward et al., 2015]), are being employed in studies of malaria parasites.

However, carrying out genetic experiments in human malaria parasites can be quite challenging. Malaria parasites follow a complex life cycle that involves sexual and asexual stages being propagated within mosquito vectors and vertebrate hosts (the latter comprising liver and blood stages). In order to cross malaria parasites, they must progress through their complete life cycle, requiring the propagation of human malaria parasites in non-human primates such as chimpanzees (reviewed in [Ranford-Cartwright and Mwangi, 2012]), making these experiments difficult to carry out and bound by ethical constraints. On the other hand, genetic manipulation of P. falciparum has been relatively successful with various genetic systems developed (reviewed in [de Koning-Ward et al., 2015]). However, only the blood stages can be studied using P. falciparum cultures, while vector and liver stages remain difficult to access. Genetic modification has also been demonstrated in another important human malaria parasite, Plasmodium vivax [Sanchez et al., 2013, Moraes Barros et al., 2015] but a robust, continuous in vitro culture system is still unavailable for this species. Due to these challenges, genetic studies are instead carried out using malaria species that infect rodents and non-human primates as experimental models.

Apart from the host specificity, rodent and primate malaria parasites share several biological and physiological characteristics with human malaria parasites [Waters, 2002, Langhorne et al., 2011, Kreier, 2012], thus making them ideal tractable models for malaria research. Of these, rodent malaria parasites (RMPs) are the most commonly used owing to the difficult nature of work and ethical constraints involved in dealing with primate models. RMPs are relatively easy and safe to use, and more importantly allow us to study vector (reviewed in [Guttery et al., 2012, 2015]) and liver stages (reviewed in [Prudencio et al., 2011]) which would otherwise not be possible with human malaria parasites.

This dissertation focuses on the rodent malaria parasite, Plasmodium vinckei. Several isolates of P. vinckei were collected along with other rodent malaria parasite species from sub-Saharan Africa in the 1960-70s from five different countries, making P. vinckei the most diverse of rodent malaria species [Killick-Kendrick and Peters, 1978]. However, to date, they have not been effectively employed as malaria models owing to a lack of information concerning their varying phenotypic and genotypic traits. We have attempted here to create a comprehensive resource describing the phenotypic and genotypic variations within P. vinckei in the hope that it will help future genetic studies in this model to identify gene-phenotype relationships.

This thesis is divided into five chapters. Chapter 1 provides a general background on malaria and rodent malaria parasites, and defines the objectives of the work. Chapter 2 describes the variations in the growth profiles of ten P. vinckei isolates spanning five subspecies and transmission characteristics of one of the subspecies. Chapter 3 provides high quality reference genomes for all five P. vinckei subspecies, describes diversity of their sub-telomeric multigene families, genotypic variations within P. vinckei and presents an evolutionary perspective of all RMP isolates sequenced to date. It also describes gene expression during P. vinckei’s intra-erythrocytic developmental cycle. Chapter 4 demonstrates performing a genetic cross between two P. vinckei isolates, Linkage Group Selection (LGS) experiments to identify genes linked to virulence and reverse genetic approaches in P. vinckei. Finally, we present our conclusions and list the limitations of the study in Chapter 5.

1.1 The malaria parasite

The history of the discovery of malaria parasites is a fascinating tale and there are many engaging reviews [Carter and Mendis, 2002, Schlagenhauf, 2004, Cox, 2010] that tell the story in detail, but the following are the main milestones: In the year 1880, the malaria parasite was discovered by Charles Louis Alphonse Laveran, a French army officer working in Algeria. He examined fresh blood from 200 patients and associated the disease with the presence of crescent shaped, pigmented bodies in the blood, which he identified as a protozoan parasite (naming it Oscillaria malariae). The malaria parasite thus became the first protozoan parasite inhabiting a human red blood cell to be discovered.

This was followed by another important discovery in 1897 by Ronald Ross, an army surgeon working in India, of malaria parasite forms in the guts of mosquitoes fed on the blood of a malaria patient [Ross, 1898, 1897, Manson, 1898]. He went on to demonstrate that avian malaria parasites were transmitted by culicine mosquitoes and elucidated the parasite’s developmental stages within the mosquito [Ross, 1899a].

Following this, the Italian malariologists Bignami and Grassi (and Ross independently) proved that human malaria was transmitted by female Anopheles mosquitoes and went on to describe the parasite’s blood and mosquito stages in detail [Ross, 1899b, Grassi et al., 1898]. The last piece of the puzzle, namely where the parasites multiply between entering the host via mosquitoes and appearing in the blood circulation, was solved by Shortt, Garnham and colleagues in 1947, who showed that the parasites go through a division phase in the liver before proceeding to blood stages [Shortt and Garnham, 1948].

The parasite’s life cycle, therefore, can be briefly described thus: malaria infection begins when an infected female anopheline mosquito takes a blood meal from a human and injects malaria sporozoites into the bloodstream. The sporozoites travel to the liver and infect hepatocytes, marking the beginning of the liver stages of the parasite’s life cycle. The sporozoites grow and multiply asexually (exoerythrocytic schizogony) inside liver cells producing 2,000 to 40,000 merozoites, dependent on parasite species. These merozoites, packaged within the host cell membrane into structures called merosomes, then travel and accumulate in the lungs where they are eventually released into the bloodstream via the lung capillaries [Baer et al., 2007]. Merozoites then invade red blood cells (RBCs) where they start multiplying again asexually (erythrocytic schizogony), eventually causing rupture of infected RBCs and releasing 8 to 36 merozoites that in turn invade new red blood cells. This multiplication cycle continues indefinitely and constitutes the erythrocytic or blood stages. The blood stages cause the onset of disease and its clinical symptoms including the characteristic periodical waves of fever that accompany each round of blood stage multiplication.

As the infection progresses, some of the blood stage parasites develop into male and female gametocytes (sexual forms) that circulate in the bloodstream until they are taken up by a mosquito during a blood meal. In the mosquito gut, the gametocytes mature into male and female gametes that fuse to form diploid zygotes and undergo meiotic recombination. They become motile ookinetes that invade the mosquito midgut wall and form oocysts, within which they multiply to produce 2,000 to 8,000 haploid sporozoites per oocyst (a process called sporogony; [Rosenberg and Rungsiwongse, 1991, Sinden and Matuschewski, 2005]). These sporozoites invade the mosquito’s salivary glands and are deposited into another human when the mosquito feeds again (see Figure 1.1). Apart from the brief period as a zygote in the mosquito gut, the parasites exist as haploid forms throughout their life cycle.

Figure 1.1: The complete life cycle of malaria parasite. Credit: National Institute of Allergy and Infectious Diseases, National Institutes of Health

Malaria parasites are classified under the genus Plasmodium of phylum Apicomplexa. Apicomplexans are a large group of parasitic alveolates (so named for the presence of cortical alveoli, flattened vesicles near the outer cell membrane), characterized by an apical complex structure that helps in penetration into the host cell. There are five Plasmodium species that infect humans - Plasmodium falciparum (malignant form of tertian malaria with 48 hour blood cycle), Plasmodium vivax, Plasmodium ovale (both species causing benign tertian malaria), Plasmodium malariae (quartan malaria with 72 hour blood cycle) and Plasmodium knowlesi (quotidian malaria with 24 hour blood cycle). It has also been argued that the two forms of P. ovale, P. ovale curtisi and P. ovale wallikeri, be considered two separate human malaria species [Oguike and Sutherland, 2015, Ansari et al., 2016]. Despite P. knowlesi having a simian host preference, it is considered a human malaria parasite due to its high prevalence in humans in Asia and it being a major cause of malaria in humans in Malaysia [Cox-Singh et al., 2008]. Cases of natural zoonotic infections in humans by the monkey malaria parasites, Plasmodium cynomolgi and Plasmodium simium, have also been reported [Deane et al., 1966, Ta et al., 2014, Brasil et al., 2017]. Apart from these, there are around 200 Plasmodium species described to date that affect a wide host range of terrestrial vertebrates including reptiles, birds and mammals.

1.2 Rodent malaria parasites

In 1943, the Belgian malariologist Ignace Vincke, while on a fishing holiday in the Katanga province (at present in the Democratic Republic of Congo, Africa), discovered rodent malaria parasites in the midguts of a new mosquito species he was studying, Anopheles dureni millecampsi. Of course, at that time, he was unaware of the parasite’s origin, and only over the next few years did he and his colleagues find that A. dureni millecampsi did not feed on humans (therefore, they could not be human malaria parasites) and that the ingested blood was not from common livestock, but in fact from rodents. Suspecting that the source might be rodents, they collected over 360 specimens of wild rats from the Katanga forests and examined them for the presence of malaria parasites [Killick-Kendrick and Peters, 1978].

At last, in 1948, Vincke and his colleague Lips discovered blood parasites in a wild thicket rat specimen (Grammomys surdaster) and were able to pass on the infection to white mice through blood inoculation [Vincke and Lips, 1948]. This marked the discovery of one of the most important research tools for the study of malaria, and the parasite isolated, Plasmodium berghei, is an extensively used malaria model to this day. Plasmodium berghei could also be transmitted to laboratory-bred mosquitoes such as Anopheles stephensi, thereby enabling the maintenance in the laboratory of the complete malaria life cycle [Yoeli and Most, 1965]. Following this, between 1948 and 1974, the rodent malaria parasite collection grew with isolates taken from wild thicket rats and infected mosquitoes in Cameroon, Central African Republic, Congo Brazzaville, Democratic Republic of Congo and Nigeria (see Figure 1.2).

Figure 1.2: The complete life cycle of malaria parasite. The four RMP species - P. berghei, P. yoelii, P. chabaudi and P. vinckei- and their subspecies were isolated from different locations in five countries within sub-Saharan Africa between 1948-1974. Image reproduced from [Carlton et al., 2001].

Researchers at the University of Edinburgh (David Walliker and Richard Carter) preserved these RMP isolates as stabilates and cloned and characterized some of them as single parasite lines, thus establishing the Edinburgh rodent malaria parasite collection (www.malariaresearch.eu/content/rodent-malaria-parasites). Initially the RMPs were classified into two species, berghei and vinckei, the main difference being their host cell preference, with berghei parasites showing a predilection for immature erythrocytes. Isolates from different regions were given subspecies status under these two groups based on morphological and developmental characteristics [Landau and Chabaud, 1965, Landau and Killick-Kendrick, 1966, Adam et al., 1966, Landau et al., 1968, Killick-Kendrick, 1968, Landau et al., 1970]. However, subsequent studies with isoenzymes [Carter, 1970, 1973, 1978] made further distinctions within the two classical species, resulting in a four species classification of RMPs.

Further studies identified different forms of enzymes based on their migration patterns following starch gel electrophoresis and distinct groups within RMP strains became apparent. Three berghei subspecies were grouped into a new species - Plasmodium yoelii (previously described in [Landau and Killick-Kendrick, 1966, Landau et al., 1968, Killick-Kendrick, 1973a]) as their isoenzymes were different from P. berghei berghei that was classified as another species - Plasmodium berghei (previously described in [Vincke and Lips, 1948]). Similarly, the vinckei group was divided into two distinct species - Plasmodium vinckei (previously described in [Rodhain, 1952, Landau et al., 1970, Killick-Kendrick, 1975, Carter and Walliker, 1975]) and Plasmodium chabaudi (previously described in [Landau and Chabaud, 1965, Carter and Walliker, 1975, 1976]). Later studies with DNA/RNA based techniques [Perkins et al., 2007, Ramiro et al., 2012] confirmed these groupings (see Table 1.1). The parasite collection also contained a parasite designated as Plasmodium atheruri [van Den Berghe et al., 1958], originally isolated from porcupines.

RMPs allow for tractable in vivo models that are easier and less expensive to maintain than primate malaria models. The basic biology and biochemistry of RMPs are very similar to their human malaria parasite counterparts [Kreier, 2012] and gene content and genome organization are also highly conserved between them [Carlton et al., 2002]. Importantly, they allow access to exoerythrocytic stages of the parasite’s life cycle that are difficult to study in human infections. They can serve as in vivo screens [Fidock et al., 2004] for new anti-malarial drug trials and in genetic studies on drug resistance as stable drug-resistant RMP lines can be established [Carlton et al., 2001].

Table 1.1: RMP species, subspecies and isolates. The four rodent malaria parasite species are listed with their subspecies and the number of isolates available for each of them. All RMPs were isolated from regions within five African countries and the first reports identifying and describing the isolates are listed. Depending upon the specific advantages that their phenotypes offer, with the exception of P. vinckei, RMPs have been used as models to study different aspects of malaria biology and the publication of their genomes has further aided in their use as experimental models. The characteristics in bold were determined within this thesis work. (DRC- Democratic Republic of Congo; CAR- Central African Republic)

[Table 1.1 columns: Species; Subspecies; Number of isolates; Geographical origin; Reported first in; Genome available; Erythrocyte preference; Synchrony in blood; Liver stage (h); Sporogony temperature (°C); Mean sporozoite length (µm); Commonly used as parasite model for]

Genetic modifications can be efficiently made to establish genetic mutant lines (RMgmDB- Rodent Malaria genetically modified Parasites, http://www.pberghei.eu/) and to carry out high-throughput genetic screens [Gomes et al., 2015, Bushell et al., 2017], making RMPs well suited for functional genetics. Thus the continuing contribution of RMPs to our present knowledge about malaria is significant, with them being used in several fields of malaria research such as parasite cell biology, malaria transmission, drug resistance, immune evasion and host-parasite interactions.

1.3 Plasmodium vinckei clade

The P. vinckei clade of parasites are morphologically indistinguishable from each other and have abundant golden pigment in their asexual stages when stained with Giemsa stain and observed under a microscope, a feature that distinguishes the clade from P. chabaudi and other RMPs [Kreier, 2012]. The first representative parasite of P. vinckei was isolated from mosquitoes in Katanga, Democratic Republic of Congo in 1952, four years after P. berghei was first isolated from the same region [Rodhain, 1952]. In the following years, many other P. vinckei parasites were discovered in Nigeria in 1954 [Bruce-Chwatt and Gibson, 1955], Congo Brazzaville in 1966 [Adam et al., 1966], Central African Republic in 1964 [Landau and Chabaud, 1965] and Cameroon in 1977 [Bafort, 1977], making the vinckei clade geographically quite diverse.

Subspecies classifications were made for 18 P. vinckei isolates in total based on parasite characteristics and geographical origin, giving rise to four subspecies - P. v. vinckei (Democratic Republic of Congo), P. v. petteri (Central African Republic), P. v. lentum (Congo Brazzaville) and P. v. brucechwatti (Nigeria). These four subspecies were characterized in various laboratories [Bafort, 1971b, Carter and Walliker, 1975, 1976, Killick-Kendrick, 1975, Landau et al., 1970]. The fifth P. vinckei subspecies isolated from Cameroon (P. v. subsp.) remains uncharacterized.

Enzyme variation studies, undertaken by Richard Carter between 1970 and 1978 [Carter, 1973, 1978], showed that the P. vinckei subspecies shared only one enzyme form across all isolates. This implied that these subspecies had emerged from a single ancestor and had diverged under independent evolutionary pressures in the respective regions where they thrived. It can, therefore, be postulated that there are significant phenotypic and genotypic variations among P. vinckei isolates. This is an important question that we have tried to address in our study.

The lack of a detailed record of phenotypes and genotypes within P. vinckei has hindered its use as a malaria model. To date, only a handful of studies have used P. vinckei as a model to study parasite recrudescence [LaCrue et al., 2011], chronobiology [Gautret et al., 1994] and artemisinin resistance [Chandra et al., 2011]. Only two strains, P. v. vinckei v52 and P. v. petteri CR, have been used in these experiments and they are also the only isolates for which a draft genome assembly and annotation are available as part of the Broad Institute Plasmodium 100 Genomes initiative (https://www.broadinstitute.org/projects/1000-genomes).

With a more comprehensive analysis of the diverse subspecies and isolates of Plasmodium vinckei, this rodent malaria parasite can potentially be developed into an experimental model system for studying genetic variation and for linking genotypes to phenotypes using high-throughput omics-driven technologies.

1.4 Objectives

The overall aim of this thesis is to create a comprehensive biological and genetic resource for Plasmodium vinckei that would aid the use of this parasite as an experimental model to study Plasmodium biology. It can further be broken down into the following objectives:

• To describe the phenotypic diversity within P. vinckei isolates. This involves following the infection profiles of several P. vinckei isolates and identifying differences in their phenotypic traits.

• To analyse reference genomes for the five P. vinckei subspecies. This involves sequencing and assembling high-quality genome sequences, accurately annotating their gene models and describing the genomic features of P. vinckei in relation to other RMPs.

• To describe the genotypic diversity within P. vinckei isolates. This involves making genome-wide sequence comparisons within several P. vinckei isolates and quantifying the level of genetic variation between them.

• To study blood-stage gene expression in P. vinckei. This involves generating and analyzing strand-specific transcriptome data for each subspecies and for individual life stages of the parasite within the red blood cell. This also involves evaluating a new, simplified protocol to profile gene expression in RMPs during their blood stages.

• To evaluate genetic tractability of P. vinckei. This involves performing a test genetic cross between two P. vinckei isolates and applying the Linkage Group Selection (LGS) method [Culleton et al., 2005, Abkallo et al., 2017] to identify genes linked to a growth-rate phenotype, and applying a reverse genetics approach to introduce a green fluorescent protein (GFP) marker into a P. vinckei parasite.

We used a total of ten P. vinckei isolates, spanning the five P. vinckei subspecies, for our study. They are listed in Table 1.2 and their geographical origins are shown in Figure 1.3. The abbreviated notations in Table 1.2 will be used to denote the P. vinckei isolates throughout this thesis.

| Species | Geographical origin | Isolate name | Abbreviation | Original designation |
|---|---|---|---|---|
| Plasmodium vinckei vinckei | DRC | CY | PvvCY | v67 |
| Plasmodium vinckei brucechwatti | Nigeria | DA | PvbDA | 1/69 |
| Plasmodium vinckei brucechwatti | Nigeria | DB | PvbDB | N48 |
| Plasmodium vinckei lentum | Congo Brazzaville | DE | PvlDE | 170L |
| Plasmodium vinckei lentum | Congo Brazzaville | DS | PvlDS | 408XZ |
| Plasmodium vinckei petteri | CAR | BS | PvpBS | BS |
| Plasmodium vinckei petteri | CAR | CR | PvpCR | CR |
| Plasmodium vinckei subspecies | Cameroon | EE | PvsEE | biboto 18 |
| Plasmodium vinckei subspecies | Cameroon | EH | PvsEH | Esekam 2 |
| Plasmodium vinckei subspecies | Cameroon | EL | PvsEL | Esekam 4 |

Table 1.2: List of Plasmodium vinckei isolates used in this study. We used at least two isolates per subspecies except for P. vinckei vinckei for which only one isolate is available. The abbreviations are used to denote the isolate throughout this thesis. (DRC- Democratic Republic of Congo; CAR- Central African Republic)

Figure 1.3: Plasmodium vinckei isolates used in this study and their geographical origins. Adapted from [Carlton et al., 2001].

Chapter 2

Phenotypic diversity within Plasmodium vinckei

2.1 Review of Literature - Phenotypes of RMPs

Characterizing the phenotype of an RMP describes how the parasite behaves during the course of infection and identifies phenotypic similarities and dissimilarities between isolates. This knowledge is necessary to use an isolate as an experimental model, and such observations lay the foundation upon which future studies can be carried out. Parasite characterization generally includes, but is not limited to, describing: i) the morphology of the parasite in the sexual and asexual blood stages; ii) growth profile and effect on host during the course of blood infection; iii) durations for liver stage, erythrocytic cycle and sporogony (sporozoite development inside the mosquito); iv) erythrocytic stage characteristics such as number of merozoites produced per schizont, host cell preference and synchronicity; v) optimal sporogony temperature; and vi) drug resistance.

Depending upon the technical and scientific advantages that their phenotypes offer, each RMP species has been exploited to study particular aspects of Plasmodium biology.

Plasmodium chabaudi is naturally sensitive to antimalarial drugs including chloroquine, and drug-resistant lines can be selected in the laboratory, making it ideal for studying drug resistance. It also exhibits endothelial sequestration, a phenomenon also seen in human cerebral malaria, and has the lowest number of variant surface protein genes [Stephens et al., 2012]. This has made it suitable to study sequestration and antigenic variation in malaria [Spence et al., 2013, Brugat et al., 2014, Yam et al., 2016, Brugat et al., 2017].

Plasmodium yoelii induces less liver inflammation and more liver stage parasites during its pre-erythrocytic infection than P. chabaudi and P. berghei [Tarun et al., 2006]. Hence, P. yoelii is extensively used to study the elusive liver stages which are very difficult to study in human malaria infections in vivo [Meis et al., 1983, Tarun et al., 2006, Meister et al., 2011].

Plasmodium berghei can be adapted to in vitro cultivation and around 5-25% of the parasites are committed to develop into gametocytes. P. berghei is therefore used as a tractable model to study sexual stages and malaria transmission. Efficient transfection techniques [Spence et al., 2011, Pfander et al., 2011] have been established in both P. berghei and P. yoelii and they are widely used as in vivo model systems for functional studies [Zuzarte-Luis et al., 2014, Matz and Kooij, 2015, Bushell et al., 2017].

Within the vinckei clade, owing to the geographical isolation of its subspecies, each subspecies seems to have evolved independently, resulting in considerable genetic diversity. Characterization of four P. vinckei subspecies has already demonstrated many phenotypic differences (see Table 1.1). For example, P. v. vinckei, isolated from highland areas in Katanga, has a lower optimal transmission temperature of 20-21°C compared to the rest of the subspecies, which were isolated from low-lying areas [Killick-Kendrick, 1973a].

P. v. lentum has a slower development of schizonts during its liver stages, delaying the onset of its blood stage infection [Killick-Kendrick and Peters, 1978].

The subspecies also differ in their lethality, disease manifestations and transmission efficiency [Killick-Kendrick and Peters, 1978]. They have many traits in common too, such as host cell preference (mature erythrocytes), synchronicity of infection (synchronous), and morphology (though there are minor differences, P. vinckei subspecies are largely indistinguishable from each other, with a characteristic rich amount of haemozoin crystals visible in trophozoites and gametocytes).

The complete lifecycle (that is, blood, exo-erythrocytic and sporogonic stages of the parasite) has been characterized for four of the 18 P. vinckei isolates available: Plasmodium vinckei vinckei strain 67 (designated as strain CY in this thesis) [Bafort, 1969, Killick-Kendrick, 1973a], Plasmodium vinckei petteri subsp. nov. strain CE (except for exo-erythrocytic stages) [Carter and Walliker, 1975], Plasmodium vinckei lentum subsp. nov. strain ZZ [Landau et al., 1970, Carter and Walliker, 1976] and Plasmodium vinckei brucechwatti subsp. nov. strain 1/69 (designated as strain DA in this thesis) [Killick-Kendrick, 1975, Bafort, 1971a]. However, knowledge about the phenotypic diversity within P. vinckei, especially within isolates of the same subspecies, could be improved by identifying differences in their blood stage growth rates and in the pathophysiological changes in the host during the course of their infection.

Identifying phenotypic differences is important for applying forward genetic approaches in RMP models. Characterizing inter- and intra-species cross immunity in P. chabaudi enabled the use of genetic approaches to identify genetic loci linked to strain-specific immunity [Martinelli et al., 2005a]. Differences in growth phenotypes within P. yoelii strains helped in identifying a single point mutation in the gene encoding the erythrocyte-binding ligand (ebl) as the sole determinant of host-cell preference in P. yoelii [Pattaradilokrat et al., 2009, Otsuki et al., 2009]. As a larger number of well-characterized isolates become available, large scale genetic studies can be carried out to investigate a variety of phenotypes.

2.2 Results and Discussion

2.2.1 Differences in growth phenotypes of P. vinckei isolates

We followed the infection profiles of 10 P. vinckei isolates in CBA mice to characterize their growth phenotype and virulence. Parasitaemia was determined daily to measure an isolate’s growth rate, and the host RBC count and weight were measured as an indication of virulence (see Figure 2.1). PvvCY is virulent and highly synchronous in its growth [Gautret et al., 1994], reaching a high parasitaemia of ∼90% on day 6 and causing host mortality. Its synchronicity is comparable with that observed in P. chabaudi but PvvCY schizonts do not seem to sequester (unlike P. chabaudi [Gilks et al., 1990]).

Both strains of P. v. brucechwatti, PvbDA and PvbDB, have fast growth rates, reaching a peak parasitaemia of around 70% on day 7-8 and killing the host. However, PvbDA shows a faster growth rate from the beginning, reaching ∼15% parasitaemia at day 3 in contrast with all other P. vinckei parasites. Similar differences in growth rates are found between PvlDS and PvlDE, resulting in their peak parasitaemia occurring at day 6 and day 9 respectively.

Figure 2.1: Infection profiles of Plasmodium vinckei isolates. Five female CBA mice were infected with 1 × 10^6 parasites. Parasitaemia, haematocrit and weight readings were taken every 24 hours until day 20 post infection. Error bars show standard deviation of the readings within five biological replicates. † denotes host mortality.

The P. v. lentum parasites are not lethal and are eventually cleared by the host immune system, but their clearance rates also differ, with clearance of PvlDS more prolonged than that of PvlDE. PvpCR and PvpBS reach peak parasitaemia on similar timelines (day 6 or day 7), but PvpCR is more virulent and can sometimes kill the host, while PvpBS maintains a mild infection. Within the three isolates of P. vinckei subsp., PvsEL and PvsEE are similar in their growth profiles and their perceived effect on the host, while in contrast, PvsEH is highly virulent and causes host mortality at day 5, the earliest among all P. vinckei parasites.

These differences in growth fitness and infection profiles within P. vinckei isolates are macroscopic and could be brought about by a multitude of factors, ranging from antigenic variation and parasite sequestration to altered expression of multigene families after mosquito transmission. These mechanisms involve several genes whose exact functions can be better understood by identifying these phenotypes and correlating them with the differences in their genotypes.

2.2.2 Optimal transmission temperature for P. v. subsp. isolates

Optimal transmission temperature and vector stages were characterized for P. v. subsp. EE, EH and EL. Each isolate was inoculated into three CBA mice and on day 3 post infection, gametocytes were visible on Giemsa smears. Around 100 female A. stephensi mosquitoes were allowed to engorge on each mouse and each feed was carried out at 21°C, 23°C and 26°C. The fed mosquitoes were maintained at the feed temperatures and at 70% humidity. On day 8 and day 15 post-feed, 20-25 mosquitoes from each batch were dissected for midguts and on day 21, 20-25 mosquitoes were dissected for salivary glands.

Gametocyte carrier/host: mice, CBA strain, female, 6-8w old. Vector: Anopheles stephensi (reared at 24°C).

| Parasite | Temperature (°C) | No. of oocysts on 8th day P.F | No. of oocysts on 15th day P.F | No. of sporozoites on 21st day P.F |
|---|---|---|---|---|
| PvsEL | 21 | 2-3 | 2-3 (not mature) | none |
| PvsEL | 23 | 4-5 | ∼140 (mature) | 1-2 |
| PvsEL | 26 | ∼30 | ∼50 (mature) | none |
| PvsEH | 21 | 1-2 | none | none |
| PvsEH | 23 | ∼5 | ∼50 (mature) | 4 |
| PvsEH | 26 | ∼10 | ∼60 (mature) | none |
| PvsEE | 21 | none | none | none |
| PvsEE | 23 | ∼2 | ∼120 (mature) | none |
| PvsEE | 26 | ∼4 | ∼55 (mature) | none |

Table 2.1: Optimal transmission temperature for P. v. subsp. At temperatures of 23°C and 26°C, all three P. v. subsp. isolates are able to produce mature oocysts, though the oocyst numbers are higher at 23°C and sporozoites were observed at least for PvsEL and PvsEH. At 21°C, the oocysts either do not mature or do not appear at all. Each oocyst number is the average from 20-25 mosquitoes. P.F - post feed.
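The comparison summarized in Table 2.1 can be sketched as a small lookup. This is only an illustration: the dictionary transcribes the approximate day-15 mature oocyst counts from the table (taking the "∼" values at face value and recording immature or absent oocysts as 0), and the names `mature_oocysts_day15`, `permissive_temperatures` and `peak_temperature` are hypothetical.

```python
# Approximate mature-oocyst counts on day 15 post-feed, transcribed from Table 2.1.
# Assumption: "∼" values are used as given; immature or absent oocysts are recorded as 0.
mature_oocysts_day15 = {
    "PvsEL": {21: 0, 23: 140, 26: 50},
    "PvsEH": {21: 0, 23: 50, 26: 60},
    "PvsEE": {21: 0, 23: 120, 26: 55},
}


def permissive_temperatures(counts_by_temp):
    """Temperatures (°C) at which any mature oocysts were recorded."""
    return sorted(t for t, n in counts_by_temp.items() if n > 0)


def peak_temperature(counts_by_temp):
    """Temperature (°C) with the highest recorded mature-oocyst count."""
    return max(counts_by_temp, key=counts_by_temp.get)


for isolate, counts in mature_oocysts_day15.items():
    print(isolate, permissive_temperatures(counts), peak_temperature(counts))
```

Keeping the counts per isolate and per temperature makes it straightforward to add further feed temperatures or replicate feeds if such data become available.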


Figure 2.2: Mosquito stages of P. vinckei subsp. EL. A) 4-5 oocysts of 12.5-17.5 µm diameter were observed at day 8 in the mosquito midgut; and B) around a hundred mature oocysts of 50 µm diameter could be observed at day 15 post-feed; C) Some of these mature oocysts had progressed into sporoblasts with sporozoites of 20-25 µm length.

All three P. v. subsp. isolates were able to transmit into mosquitoes at 23°C and 26°C, producing at least 50 mature oocysts on day 15, but failed to transmit at 21°C (see Table 2.1). 4-5 oocysts of 12.5-17.5 µm diameter were observed at day 8 in the mosquito midgut and around a hundred mature oocysts of 50 µm diameter could be observed at day 15 post-feed. Some of these mature oocysts had progressed into sporoblasts with sporozoites of 20-25 µm length (see Figure 2.2) but only a few appeared upon disrupting salivary glands.

2.3 Methods

2.3.1 Ethics statement

Laboratory animal experimentation was performed in strict accordance with the Japanese Humane Treatment and Management of Animals Law (Law No. 105 dated 19 October 1973 modified on 2 June 2006), and the Regulation on Animal Experimentation at Nagasaki University, Japan. The protocol was approved by the Institutional Animal Research Committee of Nagasaki University (permit: 12072610052) and was granted an exemption by the Institutional BioEthical Committee (IBEC) at KAUST.

2.3.2 Mice and mosquitoes

Six- to eight-week-old female ICR or CBA mice were used in all the experiments. The mice were housed at 23°C and maintained on a diet of mouse feed (CLEA Rodent Diet CE-2 from CLEA Japan, Inc.) and water. Mice infected with malaria parasites were given 0.05% para-aminobenzoic acid (PABA)-supplemented water to assist parasite growth. Anopheles stephensi mosquitoes were housed in a temperature- and humidity-controlled insectary at 24°C and 70% humidity. Mosquito larvae were fed with a mixture of mouse feed and yeast, and adult mosquitoes were maintained on 10% glucose solution supplemented with 0.05% PABA.

2.3.3 Thawing of deep frozen rodent malaria parasites

Frozen parasite stabilates were taken out of liquid nitrogen storage, the ampoule caps were immediately loosened and the stabilates were thawed on ice. The parasitized blood was transferred from the ampoule to a 15 mL Corning tube and 0.2 mL of 12% NaCl in H2O was added drop by drop for every 1 mL of blood with intermittent gentle mixing. Then, 10 mL of 1.6% NaCl was added in the same way, followed by 10 mL of 0.2% dextrose/0.9% NaCl. The solution was centrifuged at 2000 rpm for 5 min. The supernatant was removed and the RBC pellet was resuspended in an equal volume of 0.9% NaCl to give 50% haematocrit. The solution was inoculated intravenously into mice.
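As a quick check of the volumes in the stepwise salt dilution above, the following sketch lists what would be added for a given volume of thawed blood. The function name `thawing_additions_ml` and the example volume of 0.4 mL are illustrative assumptions; only the ratios and fixed volumes come from the protocol text.

```python
def thawing_additions_ml(blood_ml):
    """Solutions added, in order, when thawing a stabilate of the given blood volume (mL)."""
    if blood_ml <= 0:
        raise ValueError("blood volume must be positive")
    return [
        ("12% NaCl", 0.2 * blood_ml),          # 0.2 mL per 1 mL of blood, added dropwise
        ("1.6% NaCl", 10.0),                   # fixed volume, added dropwise
        ("0.2% dextrose / 0.9% NaCl", 10.0),   # fixed volume
    ]


# Example with a hypothetical 0.4 mL ampoule of parasitized blood.
for solution, volume in thawing_additions_ml(0.4):
    print(f"{solution}: {volume:.2f} mL")
```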

2.3.4 Blood collection from mice

Infected mice were anaesthetized with an intraperitoneal injection of 0.2 mL of 10% sodium pentobarbital in phosphate buffered saline (PBS). Once the mouse was completely sedated, a vertical incision was made from the bottom of the rib cage to the right shoulder, forming a cavity. The brachial artery was cut and blood was collected with a sterile Pasteur pipette into 3 mL of citrate saline solution (8.5 g of NaCl, 15 g trisodium citrate in 1 L of distilled H2O, pH 7.2) kept on ice. Mice were euthanized by cervical dislocation.

2.3.5 Preparation of frozen stabilates of parasites for long term storage

Parasitized blood was collected as described in Section 2.3.4. Blood pellet was obtained by centrifuging at 2000 rpm for 5 min. Supernatant was removed and one pellet volume of deep freeze solution (28% glycerol, 3% sorbitol, 0.65% mannitol) was added slowly drop by drop with intermittent gentle mixing. The mixture was divided into 0.2-0.4 mL volumes in ampoules and stored in liquid nitrogen for long term storage.

2.3.6 Parasite Cloning

Four out of the ten P. vinckei isolates were preserved as uncloned stabilates - PvvCY, PvbDA, PvlDE and PvsEE. These were revived and cloned parasite lines were obtained as follows. The parasites were diluted to a limiting concentration of 0.3 parasites/100 µL with 50:50 fetal calf serum: Ringer's solution. 100 µL of the inoculum was administered intravenously to 10 ICR mice each. Parasites arising from a single haploid parasite appeared in two to five mice at 10-12 days post-infection, thus giving 2 PvvCY, 2 PvbDA, 5 PvlDE and 3 PvsEE clones. One clone of each isolate was chosen randomly for phenotyping and genotyping.
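The rationale for diluting to 0.3 parasites per inoculum can be illustrated with a short calculation. This is a sketch under two assumptions not stated in the text: that the number of parasites per 100 µL inoculum follows a Poisson distribution, and that every injected parasite establishes an infection. The function name `limiting_dilution_stats` is hypothetical.

```python
import math


def limiting_dilution_stats(mean_parasites, n_mice):
    """Poisson expectations for a limiting-dilution cloning experiment."""
    p_zero = math.exp(-mean_parasites)      # inoculum contains no parasite
    p_one = mean_parasites * p_zero         # inoculum contains exactly one parasite
    p_infected = 1.0 - p_zero               # inoculum contains at least one parasite
    return {
        "p_infected": p_infected,
        "expected_infected_mice": n_mice * p_infected,
        "p_clonal_given_infected": p_one / p_infected,
    }


stats = limiting_dilution_stats(mean_parasites=0.3, n_mice=10)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions roughly a quarter of the mice are expected to become infected, which is in line with the two to five positive mice per isolate reported above, and most of those infections would be expected to derive from a single parasite.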

2.3.7 Growth phenotyping

Parasites were initially propagated in a donor ICR mouse. When 10-15% parasitaemia was achieved, an inoculum containing 1 × 10^6 parasites/100 µL was prepared using 50:50 fetal calf serum: Ringer's solution. 100 µL of inoculum was injected intravenously into each of 5 CBA mice. Blood smears, haematocrit readings and body weight readings were taken daily for 20 days or until the mice died of infection. Blood smears were fixed with 100% methanol and stained with Giemsa’s solution. Parasite and total RBC counts were taken at three independent microscopic fields and the average parasitaemia was calculated.
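The parasitaemia calculation and the error bars of Figure 2.1 can be sketched as follows. The counts are invented for illustration, and the sketch assumes that the average is taken over the per-field percentages (pooling the counts across fields would be an equally plausible reading of the text). The function name `smear_parasitaemia` is hypothetical.

```python
from statistics import mean, stdev


def smear_parasitaemia(fields):
    """Average parasitaemia (%) over (parasitized RBCs, total RBCs) counts per field."""
    return mean(100.0 * parasitized / total for parasitized, total in fields)


# Hypothetical counts for five mice on a single day, three fields per smear.
day_counts = [
    [(12, 210), (15, 198), (10, 205)],
    [(18, 220), (14, 190), (16, 201)],
    [(9, 215), (11, 207), (13, 199)],
    [(20, 203), (17, 211), (19, 195)],
    [(14, 208), (12, 200), (15, 212)],
]

per_mouse = [smear_parasitaemia(fields) for fields in day_counts]
print(f"mean parasitaemia: {mean(per_mouse):.2f}%")
print(f"standard deviation across mice: {stdev(per_mouse):.2f}")
```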

2.3.8 Transmission of P. v. subsp. isolates

CBA mice infected with P. v. subsp. parasites were anaesthetized and ∼100 mosquitoes were allowed to take a blood meal for 30 min without interruption on day 3 post-inoculation, after confirming the presence of gametocytes by microscopy. Three batches of ∼100 mosquitoes were fed and maintained at three different temperatures - 21°C, 23°C and 26°C - and at 70% humidity with a 12 hr light-dark cycle in a multi-chamber light-dark cycle controlled incubator (model LP-80CCFL-6CTAR from Nihonika, Japan). To check for the presence of oocysts or sporozoites, mosquitoes were dissected and their midguts or salivary glands were suspended in a drop of PBS solution atop a glass slide, covered with a coverslip and examined under a microscope.

Chapter 3

Genomes and Transcriptomes of Plasmodium vinckei

3.1 Review of Literature

3.1.1 Genome of the malaria parasite

The complete genome sequence of Plasmodium falciparum was published in 2002 [Gardner et al., 2002] through an international effort that was launched in 1996, marking a landmark in malaria research. The same year also saw publication of the genome sequence of Anopheles gambiae (the major African malaria vector) [Holt et al., 2002], and two years later the Human Genome Project (whose draft sequence was published in 2001 [Lander et al., 2001]) was also completed [Schmutz et al., 2004], thus ushering the host-parasite-vector triad into the genomic era.

The malaria parasite genome was sequenced by a whole chromosome shotgun sequencing approach that entailed separating the chromosomes by pulsed-field gel electrophoresis, randomly shearing the chromosomal DNA, cloning the fragments into cloning vectors to make shotgun libraries and sequencing the inserts through automated Sanger sequencing.

The P. falciparum nuclear genome consists of 22.8 mega-bases (Mb) distributed across 14 chromosomes (now 23.5 Mb after genome improvement - http://www.genedb.org/Homepage/Pfalciparum), is AT-rich (A+T content of ∼80%), and comprises around 5,300 protein-coding genes [Gardner et al., 2002]. Around 60% of the predicted proteins have no similarity with proteins of known function from other organisms in protein databases and are therefore of unknown function. The sub-telomeric regions of the parasite’s chromosomes are large, up to 120 kilobasepairs (kb), and are recombination hotspots, resulting in extensive polymorphisms in terms of size and composition. These regions house multigene families that code for surface antigens that mediate several key immune evasion mechanisms of the malaria parasite (cell adherence and antigenic variation) and are therefore potential vaccine targets. Thus the frequent recombination events in the sub-telomeric regions contribute to shuffling of these families, which in turn promotes antigenic diversity [Freitas-Junior et al., 2000]. The P. falciparum genome contains several highly variable multigene families in its subtelomeric regions: var (codes for P. falciparum Erythrocyte Membrane Protein 1- PfEMP1), rifin (P. falciparum-encoded repetitive interspersed families of polypeptides), stevor (subtelomeric variable open reading frame), surf (surface-associated interspersed protein), phist (Plasmodium helical interspersed subtelomeric), clag (cytoadherence-linked asexual protein), etramp (early transcribed membrane protein), fikk (FIKK kinase) and pfmc-2tm (P. falciparum Maurer’s clefts two transmembrane), and more have been proposed [Wahlgren et al., 2017]. The malaria parasite also has a circular apicoplast genome of around 35 kb and a linear mitochondrial sequence of 6 kb.
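The A+T content quoted above is simply the fraction of A and T bases among the unambiguous bases of a sequence. A minimal sketch, assuming the sequence is available as a plain string (the function name `at_content` and the toy sequence are illustrative, not taken from any genome release):

```python
def at_content(sequence):
    """Fraction of A and T among the unambiguous (A, C, G, T) bases of a sequence."""
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    # Ambiguous bases such as N are ignored rather than counted against A+T.
    return (counts["A"] + counts["T"]) / total


# Toy example: 8 of the 10 unambiguous bases are A or T.
print(f"{at_content('ATATTAGCAATN'):.2f}")
```

Applied to whole chromosomes, the same calculation would normally be run over sequences parsed from a FASTA file; the parsing step is left out here to keep the example self-contained.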

The wealth of information provided by the malaria parasite genome remains indispensable in malaria research [Kooij et al., 2005], from single gene-knockout experiments in order to study important biological mechanisms to genome-wide variation profiling in drug resistance and epidemiological studies. With this in mind, the quality of the P. falciparum genome is being continually improved (http://www.genedb.org/Homepage/Pfalciparum) and genomes of the other five human malaria species have also been sequenced [Carlton et al., 2008, Pain et al., 2008, Ansari et al., 2016, Rutledge et al., 2017].

3.1.2 Genomes of rodent malaria parasites

Efforts to sequence rodent malaria parasites began alongside the Plasmodium falciparum Genome Sequencing Project and a partial genome of Plasmodium yoelii yoelii was published in 2002 alongside the P. falciparum genome [Carlton et al., 2002]. Draft genome and transcriptome sequences of P. berghei and P. chabaudi followed in 2005 [Hall et al., 2005] and quite recently the quality of these genomes was significantly improved using next-generation sequencing [Brugat et al., 2017, Fougere et al., 2016, Otto et al., 2014].

RMP genomes are smaller than human and primate malaria genomes, ranging from ∼18 Mb (P. berghei) to ∼21 Mb (P. yoelii). All three RMP genomes show marked conservation in gene content and organization in comparison with P. falciparum, having orthologs for around 80% of P. falciparum genes in highly syntenic (i.e. conserved gene order) central regions [Carlton et al., 2005, Kooij et al., 2005]. This degree of synteny generally allows findings from studies using RMP models to be readily interpreted in the context of human malaria.

The subtelomeric regions, on the other hand, are highly variable and consist of several multigene families [Fischer et al., 2003], some of which are homologous to families in human/primate Plasmodium species and some unique to RMPs [Otto et al., 2014]. The largest is the pir (Plasmodium interspersed repeat genes) gene family, which is homologous to the vir family found in P. vivax and is particularly expanded in P. yoelii, with almost 800 member genes (yir). Other gene families that have orthologs outside RMPs are the 235 kDa rhoptry proteins, etramp (early transcribed membrane proteins), halo-acid dehalogenase-like hydrolases, lysophospholipases and “fam-a”. It is worth mentioning that the fam-a gene family is the result of an RMP-specific expansion of a gene present as a single copy in P. falciparum. RMP-specific multigene families comprise fam-b, fam-c, fam-d and erythrocyte membrane antigen 1 (ema1) [Otto et al., 2014].

The erythrocyte membrane antigen 1 (ema1) family has particularly expanded in P. chabaudi, with 13 members as opposed to just one gene in P. berghei and P. yoelii. Most of the multigene families encode proteins that are exported and expressed on the RBC membrane, as evidenced by transmembrane domains, signal peptides or PEXEL-motifs (Plasmodium EXport ELement). There are also other sub-telomeric genes that are RMP-specific and are exported to the RBC membrane but lack a PEXEL-motif, indicating alternative export mechanisms [Otto et al., 2014].

Genome resources have greatly facilitated the use of these three RMP species as experimental models, especially for reverse and forward genetics approaches. As an example, they have facilitated the generation of modification vector plasmids targeting more than 2,000 proteins in P. berghei [Gomes et al., 2015, Schwach et al., 2015, Bushell et al., 2017] and the interrogation of key enzymes [Tewari et al., 2010, Guttery et al., 2014], thereby enabling large-scale reverse genetics approaches. They have also improved the resolution of existing forward genetic approaches such as Linkage Group Selection to base-pair level by allowing single nucleotide polymorphisms to be used as selection markers [Abkallo et al., 2017].

High-quality RMP genomes provide good resolution of the repetitive sub-telomeric regions and enable proper interrogation of the important multigene families [Brugat et al., 2017], many of which have homologous counterparts in the human malaria parasite and are therefore more likely to have similar functions. However, such genome resources do not yet extend to the P. vinckei isolate collection, for which fragmented genomes of only two isolates are available. A comprehensive description of the P. vinckei genomes would make the isolates more accessible as malaria models and would give a better evolutionary perspective on rodent malaria parasites in general.

3.1.3 Single nucleotide polymorphisms in RMPs

Whole genome sequences of different species causing malaria, despite being valuable resources to study malaria biology, provide little insight into genetic variation that occurs within parasite populations. Genetic variations or polymorphisms in the coding gene sequences enable the malaria parasite to evolve in response to the environment and are key drivers of drug resistance.

Several large-scale surveys have collected and genotyped clinical isolates of P. falciparum [Manske et al., 2012, Miotto et al., 2015, Amato et al., 2015, Menard et al., 2016] and P. vivax [Pearson et al., 2016] in order to study the occurrence of genetic variations worldwide and to identify polymorphisms enriched among drug-resistant populations of these two important malaria parasites. The P. falciparum community project under the MalariaGen network (https://www.malariagen.net/) currently holds genotype data for 3,488 isolates from 23 countries.

While genotyping of human malaria parasites is an important tool for epidemiological studies and disease control, genotyping of RMPs serves a different purpose, namely to assist in genetic linkage studies aiming to link genotypes to phenotypes. Association mapping and linkage analyses use allelic variations in a given locus as molecular markers whose patterns of segregation during meiotic recombination can determine how they are linked to each other (otherwise called genetic linkage maps) and to the studied phenotype [Vignal et al., 2002, Schlotterer, 2004, Davey et al., 2011] (see Section 4.1.3 - Molecular markers and genetic linkage maps).

With the advent of high-throughput sequencing, Single Nucleotide Polymorphisms (SNPs) are increasingly becoming the molecular marker of choice for genetic linkage studies. SNPs are stably inherited and those occurring in coding regions can be directly correlated with protein function. A collection of high-confidence SNPs across the entire genome that distinguishes different RMP isolates can be obtained quickly and easily through shotgun DNA sequencing. SNPs detected by whole genome sequencing have been used to resolve mutations associated with sulphadoxine [Martinelli et al., 2011], chloroquine [Kinga Modrzynska et al., 2012] and artemisinin resistance [Hunt et al., 2010], and growth rate [Abkallo et al., 2017] in RMP models.

SNP markers are single nucleotide changes and are less complex than microsatellites, which are multi-allelic. Since most of the genome is invariant, a good and uniform SNP density is required for downstream analysis to identify genetically linked loci. Therefore, a priori knowledge about the level of genetic diversity among multiple parental isolates (along with the phenotypic differences) can help us decide which ones would be suitable for a genetic cross and subsequent linkage analysis.
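The requirement for a good and uniform SNP density can be illustrated with a minimal sketch that bins SNP positions into fixed-size windows and flags windows without any marker. The function names, the 10 kb window size and the toy coordinates are assumptions made for illustration only; they are not taken from the analyses in this thesis.

```python
from collections import Counter

def snp_density(positions, chrom_length, window=10_000):
    """Count SNPs per fixed-size window along one chromosome.

    positions: 1-based SNP coordinates (hypothetical input).
    window: window size in bp (an arbitrary choice for this sketch).
    """
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = Counter((pos - 1) // window for pos in positions)
    return [counts.get(i, 0) for i in range(n_windows)]

def empty_windows(densities):
    """Indices of windows that contain no SNP marker at all."""
    return [i for i, n in enumerate(densities) if n == 0]

# Toy example: a 50 kb chromosome with six made-up SNP positions.
toy_snps = [120, 4_500, 9_999, 10_001, 31_000, 49_999]
toy_density = snp_density(toy_snps, chrom_length=50_000)
toy_gaps = empty_windows(toy_density)
```

Windows reported by `empty_windows` are regions where a pair of parental isolates would offer no markers, which is one simple way to compare candidate isolate combinations before a cross.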

Genotyping of 13 RMP isolates (6 P. berghei, 2 P. yoelii and 5 P. chabaudi isolates) based on whole genome sequencing (WGS) revealed low levels of polymorphism among the P. berghei isolates (ranging from 95 to 2,759 SNPs, thus rendering them unsuitable for linkage analysis) and high levels among P. chabaudi isolates (144,148 - 274,877 SNPs) [Otto et al., 2014]. Similarly, given the diversity in growth rates we see among the P. vinckei isolates, it would be informative to identify SNPs genome-wide and characterize the genotypic diversity within the P. vinckei clade. Such a resource would help us choose the isolate combinations for a genetic cross and would provide molecular markers for genetic linkage studies using P. vinckei in the future.

3.1.4 Functional genomics in the malaria parasite

With the completion of the P. falciparum genome, it became possible to measure gene expression in the parasite in a high-throughput manner with tools such as microarrays and RNA sequencing (RNAseq). A series of studies [Le Roch et al., 2002, 2003, Bozdech et al., 2003a,b] used microarrays to describe the transcript-level changes occurring in the parasite as it progresses through its 48 hr asexual growth cycle or the intraerythrocytic developmental cycle (IDC) inside the RBC.

It was found that the parasite exerts a remarkable level of transcriptional regulation, with over 75% of the genes expressed in the asexual life stages being activated only once during the IDC, and with their expression timed to occur just when required to perform their function. For example, many genes related to basic metabolic and cellular processes were found to be expressed in the fast-growing rings and trophozoites, and those involved in host-parasite interactions were expressed during the merozoite stages. This strict transcriptional regulation was consistently observed in different P. falciparum laboratory strains [Llinas et al., 2006] and field isolates [Mackinnon et al., 2009], in P. vivax [Bozdech et al., 2008] and most recently in the RMPs [Otto et al., 2014, Hoo et al., 2016].

Gene expression of parasite life stages in the vector has been profiled in P. falciparum [Le Roch et al., 2003, Lasonder et al., 2016, 2008], P. berghei [Otto et al., 2014, Hall et al., 2005, Srinivasan et al., 2004] and P. yoelii (only sporozoites) [Kaiser et al., 2004, Kappe et al., 2001] using transcriptomic or proteomic approaches. However, for oocyst and sporozoite stages, these studies have managed to describe only the abundantly expressed genes, and a more exhaustive analysis of the vector stage transcriptome remains elusive due to the lack of a proper technique to separate the parasites from the mosquito tissue and so reduce vector RNA contamination. Finally, Plasmodium liver stage gene expression has also been profiled using P. yoelii as a model [Tarun et al., 2008, Sacci et al., 2005, Wang et al., 2004].

With the advent of next-generation sequencing, the microarray-based studies of the IDC in P. falciparum were recently validated using RNA sequencing [Otto et al., 2010b], showing high correlation between both methods. Since RNA sequencing, unlike microarrays, does not require the gene sequence information beforehand, it allows for an unbiased, global measurement of mRNA abundances and discovery of novel transcripts. RNAseq reads offer base-level resolution of the gene models and can be used to infer splice junctions and untranslated regions in the genome. They can be used to correct errors in existing gene models and to detect novel genes that gene prediction software had missed, and thus raise the annotation quality in general.

3.2 Results and Discussion

In order to obtain high quality P. vinckei reference genomes and genotype data for all ten P. vinckei isolates, the following were considered as goals for this Chapter:

• To use single molecule real-time (SMRT) sequencing and associated genome assembly pipeline to produce highly contiguous P. vinckei genome assemblies.

• To use strand-specific RNA sequencing and associated gene prediction and annotation pipeline to produce high-quality gene models for P. vinckei.

• To describe genome organization in P. vinckei by comparing genomes of P. vinckei subspecies with each other and with genomes of other RMPs.

• To describe the sub-telomeric multigene families in P. vinckei, a key feature of Plasmodium genomes.

• To draw a pan-RMP whole genome phylogeny to infer their evolutionary relationships.

3.2.1 Genome assembly and organization of P. vinckei genomes

We used a combination of second- and third-generation sequencing technologies (using the Illumina and Pacific Biosciences sequencing platforms) to sequence and assemble reference genomes for five P. vinckei isolates, one from each subspecies: P. v. vinckei CY, P. v. brucechwatti DA, P. v. lentum DE, P. v. petteri CR and P. v. subsp. EL.

Long single-molecule sequencing reads (PacBio reads) of 10-20 kb length, with a high median coverage of >155X across the genome, enabled assembly of each of the 14 chromosomes as a single unitig (high-confidence contig). Base-call errors in the assemblies, due to the high error rate of PacBio reads, were corrected using high-quality PCR-free Illumina reads from 350 bp and 550 bp insert libraries. We corrected up to around 500 single base-call errors and 900 insertion/deletion (indel) errors in each assembly. A few gaps (no more than 11) remain in the assemblies, but these are mainly confined to the apicoplast genomes and to the PvsEL and PvlDE genomes, which we assembled from 10 kb-long PacBio reads instead of 20 kb.

Genomic features         P. vinckei    P. vinckei        P. vinckei    P. vinckei    P. vinckei
                         vinckei CY    brucechwatti DA   lentum DE     petteri CR    subsp. EL

Nuclear genome
  Genome size (Mb)       18.34         19.17             19.31         19.39         19.50
  G+C content (%)        22.94         23.17             23.33         23.19         23.16
  Gaps within assembly   0             0                 7             0             11
  Genes                  5,073         5,265             5,276         5,317         5,319

Mitochondrial genome
  Genome size (bp)       6,003         5,998             6,006         6,002         5,999
  G+C content (%)        31.28         30.94             30.77         30.94         31.07
  Gaps within assembly   0             0                 0             0             0
  Genes                  3             3                 3             3             3

Apicoplast genome
  Genome size (bp)       29,543        29,329            28,571        12,928        12,754
  G+C content (%)        13.72         13.78             14.25         19.73         13.67
  Gaps within assembly   1             1                 3             5             0
  Genes                  28            24                24            4             9

Table 3.1: Characteristics of Plasmodium vinckei genomes.
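Summary statistics of the kind listed in Table 3.1 (assembly size, G+C content and number of gaps) can be derived directly from an assembly in FASTA format. The following is a minimal sketch, not the pipeline used in this work; it assumes that gaps are represented as runs of N characters and that G+C content is computed over unambiguous bases only. The function names and the toy sequences are hypothetical.

```python
import re

def read_fasta(lines):
    """Parse FASTA-formatted lines into {sequence_name: sequence}."""
    seqs, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep only the identifier
            seqs[name] = []
        elif name is not None:
            seqs[name].append(line.upper())
    return {k: "".join(v) for k, v in seqs.items()}

def assembly_stats(seqs):
    """Total length, G+C percentage (over A/C/G/T only) and gap count."""
    total = sum(len(s) for s in seqs.values())
    gc = sum(s.count("G") + s.count("C") for s in seqs.values())
    at = sum(s.count("A") + s.count("T") for s in seqs.values())
    # Assumption: one gap = one maximal run of N characters.
    gaps = sum(len(re.findall(r"N+", s)) for s in seqs.values())
    gc_percent = round(100 * gc / (gc + at), 2) if (gc + at) else 0.0
    return {"size_bp": total, "gc_percent": gc_percent, "gaps": gaps}

# Toy input: two made-up sequences, one containing a single gap.
toy_fasta = [">chr1 toy", "ATGCAT", "NNNNAT", ">chr2", "GGCCAATT"]
toy_stats = assembly_stats(read_fasta(toy_fasta))
```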

The PvpCR and PvvCY assemblies, with each chromosome in one piece, are a significant improvement over their existing fragmented genome assemblies (available through PlasmoDB v.30). P. vinckei genome sizes range from 19.2 to 19.5 Mb except for PvvCY, which has a smaller genome size of 18.3 Mb, similar to P. berghei (both isolates are from the same Katanga region). While we were not able to resolve the telomeric repeats at the ends of some of the chromosomes, all the resolved telomeric repeats had the RMP-specific sub-telomeric repeat sequences - CCCTA(G)AA. The mitochondrial and apicoplast genomes were ∼6 kb and ∼30 kb long respectively, except for the apicoplast genomes of PvpCR and PvsEL, for which we were able to get only partial assemblies due to low read coverage (see Table 3.1).

Comparative analysis of P. vinckei and other RMP genomes showed that P. vinckei genomes exhibit the same high level of conservation seen within RMP genomes, with only a few chromosomal rearrangements. Chromosomal rearrangements are major mutations in an organism’s chromosomal structure that are usually products of errors during double-strand DNA breakage and recombination during meiosis or mitosis. These events can be identified by breaks in synteny (synteny breakpoints, SBPs) observed upon aligning and comparing genome sequences. We aligned P. vinckei and other RMP genomes to identify synteny blocks between their chromosomes, using the MUMmer tool [Kurtz et al., 2004]. Similar to previous findings in RMP genomes [Otto et al., 2014, Kooij et al., 2005], we observed large-scale exchanges of chromosomal segments between non-homologous chromosomes (see Figure 3.1) and micro-rearrangements in the highly variable sub-telomeric regions.
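The idea of reading synteny breakpoints off a set of aligned blocks can be sketched as follows. This is an illustrative simplification rather than the procedure used here: it assumes the aligner output has already been reduced to a list of blocks (the `Block` record is hypothetical and is not a MUMmer format), and it simply flags a change of reference chromosome between neighbouring blocks as a candidate translocation and a change of strand as a candidate inversion. All coordinates are made up.

```python
from typing import NamedTuple

class Block(NamedTuple):
    """One synteny block (hypothetical, simplified record)."""
    query_chrom: str
    query_start: int
    query_end: int
    ref_chrom: str
    strand: str  # "+" or "-"

def synteny_breakpoints(blocks):
    """Report candidate breakpoints between neighbouring blocks."""
    events = []
    ordered = sorted(blocks, key=lambda b: (b.query_chrom, b.query_start))
    for prev, cur in zip(ordered, ordered[1:]):
        if prev.query_chrom != cur.query_chrom:
            continue  # blocks lie on different query chromosomes
        if prev.ref_chrom != cur.ref_chrom:
            kind = "translocation"
        elif prev.strand != cur.strand:
            kind = "inversion"
        else:
            continue  # synteny is preserved across this junction
        events.append((cur.query_chrom, prev.query_end, cur.query_start, kind))
    return events

# Toy blocks with made-up coordinates.
toy_blocks = [
    Block("chrA", 1, 600, "refA", "+"),
    Block("chrA", 601, 1000, "refB", "+"),
    Block("chrB", 1, 500, "refB", "+"),
    Block("chrB", 501, 600, "refB", "-"),
    Block("chrB", 601, 900, "refB", "+"),
]
toy_events = synteny_breakpoints(toy_blocks)
```

An inverted segment produces two strand changes, one at each end, so the toy inversion on chrB is reported as a pair of breakpoints.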

3.2.1.1 Reciprocal translocations

Comparing the two Katanga isolates, P. vinckei vinckei CY and P. berghei, we observed reciprocal translocation of ∼0.6 Mb (with 134 genes) and ∼0.4 Mb (with 99 genes) long regions between chromosomes 8 and 10 (see Figure 3.1). There is also an inversion of a ∼100 kb region in chromosome 14. While the translocation remained conserved in all P. vinckei subspecies, the inversion was found to be PvvCY-specific.

Within the P. vinckei subspecies, two reciprocal translocations were observed between PvpCR and PvlDE: one pair of exchanges (∼1 Mb and ∼0.55 Mb) between chromosomes 5 and 13, and another smaller pair (∼150 kb and ∼70 kb) between chromosomes 5 and 6. These events have left the 5th chromosomes of the subspecies pair with only a ∼0.15 Mb region of synteny between them, consisting of 48 genes, while their remaining 304 genes have been rearranged with chromosomes 6 and 13. All the synteny breakpoints (SBPs) were found to be in intergenic regions, thus causing no observable change in any of the gene structures. The PvlDE-PvpCR SBPs in chromosomes 5 and 6 were near rRNA units, loci previously described as hotspots for such rearrangement events [Carlton et al., 2002, Liu and Sanderson, 1995]. No major rearrangement events were observed between PvpCR and PvsEL genomes.

Figure 3.1: Genome rearrangements within RMP genomes. Genome rearrangements have occurred during the speciation of the four RMP species (left) and subspecies of Plasmodium vinckei (right). The 14 chromosomes of different RMP genomes are arranged as a circos plot and the ribbons (light grey) between them denote regions of synteny. A translocation is marked in red and an inversion is marked in blue. The four RMP reference genomes compared are P. berghei ANKA, P. yoelii yoelii 17X, P. chabaudi chabaudi AS and P. vinckei vinckei CY.

Large-scale genome rearrangements are infrequent in Plasmodium, with only 15 recombination events between human and rodent malaria parasite genomes [Kooij et al., 2005]. Within the RMPs, genome rearrangements have accompanied the speciation of P. chabaudi (between chromosomes 7 and 9) and P. vinckei (between chromosomes 8 and 10; inversion within chromosome 14) from their most recent common evolutionary ancestor. Evidence for some of these events has been shown previously [Kooij et al., 2005], but we additionally observe here similar rearrangements (between chromosomes 5, 6 and 13) accompanying the evolution of the RMP subspecies P. vinckei petteri and P. vinckei subsp.

The biological or phenotypic significance, if any, of such alterations is poorly understood, but they might provide clues as to the evolution of RMPs. While the divergences of P. v. vinckei, P. v. brucechwatti and P. v. lentum were driven by accumulating genotypic and phenotypic changes, chromosomal rearrangements could have been an additional factor in the evolution of P. v. petteri and P. v. subsp. It can be speculated that these rearrangements caused a recombination barrier that separated these two subspecies from their most recent common ancestor, and that subsequent geographical isolation aided the propagation of these small “variant” populations that would have died out otherwise.

These recombination barriers might still exist between the P. v. petteri-P. v. subsp. group and the rest of the vinckei isolates. Therefore, in practice, the vinckei clade could be further divided into two subclades and these distinctions should be kept in mind when performing an experimental genetic cross of the parasites in linkage studies.

3.2.1.2 Micro-synteny breaks

Micro-synteny breaks, which include small inversions, duplications and deletions, were numerous in the sub-telomeric regions, making their lengths and gene content highly variable across all P. vinckei subspecies. Apart from these, we also observed such events within the central conserved regions. A gene encoding a “conserved rodent malaria protein” (PVPCR 0100190) in chromosome 1 has been inverted in the PvpCR and PvsEL genomes compared to other P. vinckei genomes. A PvpCR gene encoding a “reticulocyte-binding protein” (PVPCR 0600200) in chromosome 6 has expanded into three consecutive copies in PvsEL. Both the fam-a and fam-d expansions, in chromosomes 13 and 9 respectively, have occurred in P. vinckei at the same loci as in P. chabaudi and P. yoelii.

The gene encoding dipeptidyl aminopeptidase 1 (DPAP1), a single-copy gene in all Plasmodium species, has been duplicated in PvvCY alone (PVVCY 0901490 and PVVCY 0901500 in chromosome 9). DPAP1 is an essential proteolytic enzyme that is used by the parasite to digest host haemoglobin in its food vacuole to fuel its intra-erythrocytic growth [Wang et al., 2011, Klemba et al., 2004]. Gene expression data from RNAseq (see Section 3.2.5) show that both copies of the enzyme are consistently expressed throughout the blood stages. Gene duplications are well-known adaptive mechanisms to changing environments [Innan and Kondrashov, 2010] and the DPAP1 gene duplication could be in response to higher haemoglobin content at high altitude regions [Beall et al., 1998]. However, this adaptation seems to have happened only in PvvCY and not in P. berghei, both of which were isolated from high-altitude regions. Hence, these assumptions are currently purely speculative.

3.2.2 Annotation of P. vinckei genomes

Gene models were predicted by combining multiple lines of evidence to improve the quality of the predictions. These include publicly available P. chabaudi gene models, de novo predicted gene models from AUGUSTUS [Stanke et al., 2004] and transcript models from strand-specific RNAseq data of different blood life-stages. Consensus gene models were obtained through MAKER [Cantarel et al., 2008] and further manually corrected to an extent through comparative genomics and visualization of mapped RNAseq reads [Carver et al., 2005, 2012]. As a result, we predicted between 5,073 and 5,319 protein-coding genes, 57-67 tRNA genes and 40-48 rRNA genes in each P. vinckei genome. Protein functions were transferred from well-curated P. chabaudi gene models in GeneDB (version 3), based on top protein BLAST scores.

3.2.2.1 Ortholog analysis

We identified putative orthologs for the total predicted P. vinckei proteome (26,250 proteins from the five subspecies) by comparing it with the predicted proteomes of the other three RMPs - P. berghei, P. chabaudi and P. yoelii (16,203 proteins in total), and three primate malaria species - P. falciparum, P. vivax and P. knowlesi (17,298 proteins in total), using OrthoMCL [Li et al., 2003]. Around 85.5% of all RMP proteins (42,453 proteins) had orthologs in at least one of the primate malaria species. A total of 6,119 RMP proteins were predicted to be RMP-specific, of which only 1,152 P. vinckei proteins (4.3% of the P. vinckei proteome) did not have orthologs predicted in other RMPs. Almost all of these P. vinckei proteins were members of sub-telomeric multigene families, half of them belonging to the pir gene family. Thus, the gene content of P. vinckei genomes is highly conserved among themselves and with other RMP and primate malaria species.

3.2.2.2 Functional domain analysis

Functional analysis of the proteome was performed to detect the presence of known protein family domains, transmembrane domains, signal peptide cleavage sites and PEXEL/VTS (Plasmodium export element/vacuolar transport signal) motifs [Hiller et al., 2004, Marti et al., 2004]. A signal peptide and a PEXEL/VTS motif are required for the export of parasite proteins through the endoplasmic reticulum and the parasitophorous vacuole respectively to reach the host erythrocyte. However, there is also evidence of alternate export machineries, as several exported proteins do not have a PEXEL motif (PNEP, PEXEL-negative exported proteins) [Heiber et al., 2013]. Of the 26,250 P. vinckei proteins, 70.6% (18,534 proteins) had known protein domain signatures (identified by InterPro accessions) [Jones et al., 2014] and 63.2% had Pfam domains [Bateman et al., 2004].
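As a rough illustration of what a PEXEL/VTS scan looks for, the motif is commonly summarized as the pentamer RxLxE/Q/D (x being any residue) located near the N-terminus. The sketch below searches for that simplified consensus with a regular expression. It is an assumption-laden stand-in for illustration only: dedicated predictors use trained models rather than a bare pattern, and the 100-residue cutoff, the function name and the toy protein are all hypothetical.

```python
import re

# Simplified PEXEL consensus R-x-L-x-[E/Q/D]; the lookahead allows
# overlapping candidates to be reported.
PEXEL_PATTERN = re.compile(r"(?=R.L.[EQD])")

def find_pexel(protein, max_start=100):
    """Return 0-based start positions of PEXEL-like pentamers.

    max_start: only motifs starting within this many residues of the
    N-terminus are kept (an arbitrary cutoff for this sketch).
    """
    return [m.start() for m in PEXEL_PATTERN.finditer(protein)
            if m.start() < max_start]

# Toy protein (made up): one matching pentamer (RILAE) and one
# near-miss (RSLGG, wrong residue at the fifth position).
toy_protein = "M" + "K" * 24 + "RILAE" + "N" * 10 + "RSLGG"
toy_hits = find_pexel(toy_protein)
```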

We identified between 1,300 and 1,400 proteins in each genome as putative membrane proteins due to the presence of one or more transmembrane domains predicted by TMHMMv2.0 [Krogh et al., 2001]. Approximately 700 proteins in each genome had signal peptide cleavage sites and are therefore putative secreted proteins, except in PvvCY, which had only 548 proteins with a signal peptide. We also identified parasite proteins that are exported to the host erythrocytes by detecting the presence of a PEXEL/VTS motif using ExportPredv4.0 [Sargeant et al., 2006]. The number of proteins with a PEXEL motif ranged from 120 to 146 in the P. vinckei genomes. In conclusion, we found that the functional domain and motif content of P. vinckei genomes was comparable to the rest of the RMPs [Otto et al., 2014] (see Appendix Table A).

3.2.2.3 Whole-genome based phylogeny

As our next step, we constructed a genome-wide phylogeny comparing the five P. vinckei subspecies and other RMPs to understand the evolutionary relationships amongst them. We derived 3,920 one-to-one orthologous groups from our ortholog analysis on ten taxa and their protein sequences were concatenated. The concatemers were aligned using MUSCLE [Edgar, 2004], the alignments were trimmed with trimAL [Capella-Gutierrez et al., 2009] and the phylogenetic tree was inferred from the 2,278,855 amino acid-long alignment in MrBayes [Ronquist et al., 2012] with both empirical (JTT) and non-empirical (Poisson) fixed-rate models, with the model for rate variation across sites set to “invgamma”. From both methods, we obtained trees with identical topologies and 100% support for all nodes after running 200,000 bootstrap iterations (see Figure 3.2).
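The selection and concatenation step described above can be sketched as follows: keep only the groups that contain exactly one protein from every taxon, then join each taxon's proteins in a fixed group order so that every concatemer lists the same orthologs in the same sequence. The data layout (group to taxon to protein IDs), the function names and the toy sequences are assumptions for illustration and do not reproduce the OrthoMCL output format.

```python
def one_to_one_groups(groups, taxa):
    """Keep groups with exactly one protein from every taxon."""
    wanted = set(taxa)
    return {
        gid: members
        for gid, members in groups.items()
        if set(members) == wanted
        and all(len(ids) == 1 for ids in members.values())
    }

def concatenate(groups, sequences, taxa):
    """Join each taxon's proteins in a fixed (sorted) group order."""
    order = sorted(groups)
    return {
        taxon: "".join(sequences[groups[gid][taxon][0]] for gid in order)
        for taxon in taxa
    }

# Toy data (made up): OG2 has a paralog and OG4 lacks one taxon,
# so only OG1 and OG3 qualify as one-to-one groups.
toy_taxa = ["sp1", "sp2"]
toy_groups = {
    "OG1": {"sp1": ["a1"], "sp2": ["b1"]},
    "OG2": {"sp1": ["a2", "a3"], "sp2": ["b2"]},
    "OG3": {"sp1": ["a4"], "sp2": ["b4"]},
    "OG4": {"sp1": ["a5"]},
}
toy_sequences = {"a1": "MK", "b1": "MR", "a4": "LL", "b4": "LV"}
toy_kept = one_to_one_groups(toy_groups, toy_taxa)
toy_concat = concatenate(toy_kept, toy_sequences, toy_taxa)
```

The resulting per-taxon strings would then be passed to a multiple sequence aligner and a trimming step before tree inference.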

Within the four RMP species, both the berghei and vinckei clades have diverged from their putative common ancestor at equal rates (∼0.036 average amino acid substitutions per site), but within the clades, P. vinckei and P. chabaudi are closer to each other than P. berghei and P. yoelii are to each other. The P. vinckei subspecies show varied levels of divergence from their common ancestor, the most diverged being P. v. vinckei and the least being P. v. petteri. P. v. vinckei has undergone significant divergence from the common vinckei ancestor and, though classified as a subspecies, has almost the same divergence rate as P. chabaudi, an RMP species. While our phylogeny is consistent with previous predictions based on isoenzyme variation [Carter, 1978] and gene sequences of multiple housekeeping loci [Perkins et al., 2007, Ramiro et al., 2012], we believe our version to be the most resolved and robust, with branch lengths inferred from a concatenated dataset of 3,920 proteins.

Figure 3.2: Whole genome-based phylogeny of Plasmodium vinckei subspecies. Phylogeny showing the divergence of P. vinckei clade, other RMP species and three primate species (P. falciparum treated as outgroup), as inferred from 2,278,855 amino acids from 3,920 one-to-one orthologs after 200,000 bootstrap iterations using MrBayes.

3.2.3 Sub-telomeric multigene families in P. vinckei

Plasmodium multigene families play key roles in immune evasion via antigenic variation, RBC sequestration and virulence. These families are clustered in the sub-telomeric regions where recombination occurs at higher rates [Freitas-Junior et al., 2000], thus enabling them to vary their copy number and also modulate their expression through epigenetic regulation (gene expression can be activated or silenced depending upon its juxtaposition with heterochromatic regions). The copy number variation and shuffling of the gene locations in these multigene families ultimately lead to phenotypic plasticity in Plasmodium. There are in total ten multigene families in rodent malaria parasites, whose exact functions and modes of action are under active investigation.

3.2.3.1 Erythrocyte membrane antigen 1 (EMA1)

This family was first identified and described in P. chabaudi, is found only in RMPs and is associated with the host RBC membrane [Favaloro and Kemp, 1994]. These genes encode a ∼800 aa long protein and consist of two exons: a short first exon carrying a signal peptide and a long second exon carrying a PcEMA1 protein family domain (Pfam ID PF07418). The gene encoding ema1 is present only as single or double copies in P. yoelii and P. berghei respectively, but has expanded to 14 copies in P. chabaudi. We see similar gene expansions of between 7 and 21 gene copies in P. vinckei (see Figure 3.3). However, almost half of these genes in each subspecies are pseudogenes, with the exception of PvvCY, where all 7 genes are protein-coding copies. Each pseudogene had a SNP (C>A) at base position 14 that introduces a “TAA” stop codon (S5X) within the signal peptide region, followed by a few more stop codons in the rest of the gene (see Appendix Figure E.1).
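The arithmetic behind the S5X annotation can be checked with a short sketch: base position 14 is the middle base of codon 5 (positions 13-15), and among the serine codons only TCA becomes the stop codon TAA after a C>A change at that position. The coding sequence below is a made-up toy sequence chosen to satisfy that constraint; it is not the actual ema1 sequence, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate complete codons of a coding sequence ('*' = stop)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))

def apply_snp(cds, pos, ref, alt):
    """Substitute one base at a 1-based position, checking the reference allele."""
    if cds[pos - 1] != ref:
        raise ValueError("reference allele does not match the sequence")
    return cds[:pos - 1] + alt + cds[pos:]

# Hypothetical 18 bp coding sequence whose fifth codon is TCA (serine).
toy_cds = "ATGAAATTATTTTCAATT"
toy_mutant = apply_snp(toy_cds, pos=14, ref="C", alt="A")
wild_type_protein = translate(toy_cds)
mutant_protein = translate(toy_mutant)
```

In the toy example the wild-type translation carries a serine at residue 5, while the mutant carries a stop at the same position, which is what the S5X notation expresses.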

Apart from one or two cases, the S5X mutation is found in all pseudogenes belonging to the ema1 family and is vinckei-specific (it is not present in the single P. c. chabaudi pseudogene). Thus, despite gene expansions in this family, all P. vinckei subspecies have a core repertoire of 7-10 ema1-encoding genes, all of which contain a signal peptide and lack a PEXEL motif. Pseudogenes, though translationally silent, are thought to be functional, with increasing evidence pointing to regulatory roles [Balakirev and Ayala, 2003]. The P. vinckei ema1 pseudogenes could exist simply as a reservoir for promoting antigenic variation within this multigene family. Pseudogenes can be transcriptionally active or silent, and our transcriptome data on the five P. vinckei isolates showed that ema1 pseudogenes were actively transcribed. Therefore, they could also have regulatory roles by competing with their protein-coding counterparts during translation.

3.2.3.2 Early transcribed membrane protein (ETRAMP)

This family of exported proteins is exclusively localized to the parasitophorous vacuole, an important structure for the parasite’s survival [MacKellar et al., 2011, Spielmann et al., 2003]. P. vinckei genomes contain 11 to 13 ETRAMP-encoding genes, similar to the P. chabaudi and P. falciparum genomes. All the genes have transmembrane domains and a signal peptide, and are PEXEL-negative.

Figure 3.3: RMP multigene families. Sizes of multigene families across all P. vinckei subspecies and other RMPs and the proportions of complete, pseudogenized and incomplete gene models are shown as piecharts. (ema1 - erythrocyte membrane antigen 1; etramp - early transcribed membrane protein; hdh - haloacid dehalogenase-like hydrolases; lpl - lysophospholipases; p235 - reticulocyte-binding proteins; pir - Plasmodium interspersed repeat proteins.)

3.2.3.3 RMP-fam-a, fam-b, fam-c and fam-d proteins

The fam-a gene family is the second largest in RMPs, formed by an RMP-specific expansion of a single-copy ortholog in primate malaria species [Otto et al., 2014]. fam-a proteins are predicted to have roles in lipid binding and transport, aiding in cholesterol salvage in the parasite [Frech and Chen, 2013] and transport of phosphatidylcholine, thus mediating membrane synthesis during liver stage development [Fougere et al., 2016]. They are multi-exonic and encode exported proteins that are expressed in the RBC membrane. P. vinckei fam-a family sizes range from 87 to 207 gene copies and are comparable to those of the other RMPs. PvvCY has a similar number of fam-a genes to P. berghei, but while over one-third of P. berghei fam-a genes are pseudogenes, none are present in PvvCY. Almost 90% of P. vinckei fam-a proteins contain a signal peptide, 6.5% of them have transmembrane domains and most of them (∼98%) lack a PEXEL-motif.

The number of genes encoding fam-b proteins is between 28 and 44 in P. vinckei. fam-b proteins are also exported to the RBC membrane and are characterized by the presence of a pyst-b domain (Plasmodium yoelii subtelomeric family PYST-B). These proteins are PEXEL-positive and up to 70% of the P. vinckei fam-b family have a PEXEL-motif. Around half of them have a transmembrane domain and one third of them contain a signal peptide.

The fam-c proteins are also exported proteins, characterized by pyst-c1 and pyst-c2 domains (Plasmodium yoelii subtelomeric family PYST-C1 and PYST-C2). There is a considerable expansion of this family in PvbDA, PvlDE, PvpCR and PvsEL, with around 60 genes, double the number in PvvCY and other RMPs. Most of the genes contained a transmembrane domain (75.5%) and a signal peptide (88.9%), but nearly all lacked a PEXEL-motif (the motif was detected in only 4% of the genes, compared to 24% in other RMPs).

The fam-d gene family is present in P. chabaudi and P. yoelii as a single gene cluster in chromosome 9 expanded from one gene in P. berghei, and they form a similar gene cluster in P. vinckei. The copy number varies from 6 to 27 with a particularly large expansion in PvlDE (27 copies). Almost all of them have a signal peptide (94.2%) but many lack a transmembrane domain and PEXEL-motif.

3.2.3.4 Haloacid dehalogenase (HAD)-like hydrolases

HAD-like hydrolases are a large superfamily of phosphohydrolases present in all organisms [Kuznetsova et al., 2015, Koonin and Tatusov, 1994]. Considerable sequence divergence among the members of this superfamily has made it difficult to characterize their biological functions [Koonin and Tatusov, 1994]. P. falciparum has four genes coding for HAD-like hydrolases, of which one was recently implicated as a regulator of the methylerythritol phosphate pathway in the parasite [Guggisberg et al., 2014]. P. berghei, P. yoelii and P. chabaudi also have these four genes located in internal syntenic regions of chromosomes 5, 9 and 14 (doublet in chromosome 14), but the family has expanded in P. chabaudi with an additional eight copies in subtelomeric regions. We see a similar family expansion in P. vinckei (7 to 14 copies), and apart from the four internally placed copies, the location and number of the sub-telomeric copies vary widely among the five P. vinckei subspecies. However, one sub-telomeric copy in chromosome 4 is present across all P. vinckei genomes. There is also a gene duplication event producing three consecutive copies in chromosome 10 in PvpCR and PvsEL. 50% of the family are positive for a PEXEL-motif but do not contain transmembrane domains or signal peptides.

3.2.3.5 Lysophospholipases

This family is present across all Plasmodium species and its members have a characteristic pst-a domain [Fischer et al., 2003]. Apart from the two gene copies found in the internal syntenic regions across Plasmodium species, the family has undergone expansion in both P. chabaudi and P. vinckei with 20 to 26 additional copies in sub-telomeric regions. About 10% of them have a transmembrane domain but lack both signal peptide and PEXEL motif.

3.2.3.6 Reticulocyte binding proteins (RBPs)

This family of large proteins was first described as 235-kDa (p235) rhoptry proteins in P. yoelii [Keen et al., 1990] and was shown to bear some homology to the reticulocyte binding proteins of P. vivax [Galinski et al., 1992] and RH (Reticulocyte-binding protein Homologue) proteins in P. falciparum [Khan et al., 2001]. These proteins are involved in RBC invasion and are known for their remarkable form of clonal antigenic variation, in which individual merozoites from the same schizont express a different member of the family [Iyer et al., 2007]. With the exception of PvvCY, P. vinckei and P. chabaudi have a similar copy number of 8 genes encoding reticulocyte binding proteins. In the case of PvsEL, there are three consecutive copies of the gene in chromosome 6. Only two copies in each subspecies contain a signal peptide, some of them have transmembrane domains and all of them are PEXEL-negative.

3.2.3.7 Plasmodium interspersed repeat (pir) proteins

The pir genes comprise the largest multigene family in all Plasmodium species and include rif and stevor in P. falciparum and vir, kir, cyir, bir, yir, and cir in P. vivax, P. knowlesi, P. cynomolgi, P. berghei, P. yoelii and P. chabaudi respectively [Cunningham et al., 2010, Janssen et al., 2004]. To date, their exact functions are unclear and it is suspected that different members perform different tasks [Bernabeu et al., 2012]. PIR proteins are exported to the RBC membrane [Pasini et al., 2013] and they bring about antigenic variation, thus influencing parasite virulence and its response to host immunity [Brugat et al., 2017, Spence et al., 2013, Cunningham et al., 2005]. The size of the vinkir (vinckei-interspersed repeat) gene repertoire ranges from 178 to 272 genes, with most of them containing a transmembrane domain and lacking signal peptide or PEXEL-motif.

3.2.4 Genotypic diversity within P. vinckei isolates

To assess the genotypic diversity within P. vinckei isolates, we produced PCR-free Illumina sequencing reads for five additional isolates: P. v. brucechwatti DB, P. v. lentum DS, P. v. petteri BS, P. v. subsp. EH and P. v. subsp. EE. Genotypic diversity was quantified at the inter-subspecies level (between P. vinckei subspecies) and the intra-subspecies level (between isolates of a P. vinckei subspecies) by mapping sequencing reads against a suitable reference genome. Two types of sequence polymorphisms were called, single nucleotide polymorphisms (SNPs) and small insertions/deletions (indels), using GATK HaplotypeCaller with recommended quality filtering. SNP and indel calls within highly variable sub-telomeric gene families and repetitive regions were subsequently removed, as these predictions can be unreliable.
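The quality filtering mentioned above can be pictured as a simple pass/fail test per variant. The sketch below is only an illustration in Python, not the GATK implementation: it uses the thresholds listed in the Genotyping methods (Section 3.3.6), while the function name, the example annotation values and the choice to skip missing annotations are assumptions made for this example.

```python
# Thresholds quoted in the Genotyping methods (Section 3.3.6).
# A variant is rejected if ANY annotation it carries crosses its threshold.
FAIL_RULES = {
    "QD": lambda v: v < 2.0,
    "FS": lambda v: v > 60.0,
    "SOR": lambda v: v > 4.0,
    "MQ": lambda v: v < 40.0,
    "MQRankSum": lambda v: v < -12.5,
    "ReadPosRankSum": lambda v: v < -8.0,
}


def passes_hard_filter(annotations):
    """Return True if no annotation present in `annotations` fails its rule.

    Annotations that are absent are skipped; this is an assumption of the
    sketch, not a statement about how GATK treats missing values.
    """
    for name, fails in FAIL_RULES.items():
        value = annotations.get(name)
        if value is not None and fails(value):
            return False
    return True


# Hypothetical annotation values, for illustration only.
good_call = {"QD": 25.1, "FS": 1.2, "SOR": 0.8, "MQ": 60.0}
poor_call = {"QD": 1.5, "FS": 1.2, "SOR": 0.8, "MQ": 60.0}
print(passes_hard_filter(good_call), passes_hard_filter(poor_call))
```

Keeping the rules in one dictionary makes it easy to see at a glance which annotation caused a call to be discarded.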

3.2.4.1 Inter-subspecies diversity

Pairwise comparisons were made for the isolates PvvCY, PvbDA, PvlDE, PvpCR and PvsEL, representing each of the five subspecies. A whole-genome phylogeny constructed earlier from protein sequences of these isolates (Figure 3.2) showed significant divergence among the subspecies, but SNP information helps us describe the diversity in terms of both quantity and quality. A high number of SNPs was found between the different subspecies, with almost equal numbers in the coding and non-coding regions. The number of SNPs in the coding genes ranged from 73,554 (between P. v. petteri and P. v. subsp.) to 292,360 (between P. v. vinckei and P. v. brucechwatti) (see Table 3.2). These SNPs were widely distributed across all the genes, with 94-99% of the coding genes (excluding the telomeric genes that were removed from the analysis) having at least one SNP within them (see Appendix Table B). We then classified the SNPs into synonymous (not causing a change in the amino acid) and non-synonymous (causing a change in the amino acid) substitutions. The number of non-synonymous SNPs was always slightly lower than the number of synonymous SNPs, especially between P. v. lentum, P. v. petteri and P. v. subsp., with N/S (non-synonymous to synonymous) ratios of around 0.69:1. Similarly, we also identified indels within the coding regions and found between 997 and 9,385 indels, most of them being single-base insertions or deletions (see Table 3.2).

Table 3.2: Genotypic diversity within Plasmodium vinckei isolates.

Isolate | Compared with | SNPs | Indels | Genes with SNPs | Synonymous SNPs | Non-synonymous SNPs
Inter-subspecies comparisons
P. v. vinckei CY | P. v. brucechwatti DA | 292,360 | 8,955 | 4,614 | 153,287 | 139,069
P. v. vinckei CY | P. v. lentum DE | 280,126 | 9,190 | 4,611 | 145,488 | 134,636
P. v. vinckei CY | P. v. petteri CR | 280,507 | 9,385 | 4,608 | 145,530 | 134,976
P. v. vinckei CY | P. v. subsp. EL | 280,924 | 9,254 | 4,613 | 145,868 | 135,055
P. v. brucechwatti DA | P. v. lentum DE | 207,545 | 3,554 | 4,613 | 114,588 | 92,957
P. v. brucechwatti DA | P. v. petteri CR | 206,297 | 3,458 | 4,604 | 116,350 | 89,947
P. v. brucechwatti DA | P. v. subsp. EL | 208,709 | 3,521 | 4,604 | 117,869 | 90,840
P. v. lentum DE | P. v. petteri CR | 138,744 | 2,214 | 4,421 | 70,257 | 50,487
P. v. lentum DE | P. v. subsp. EL | 145,264 | 2,045 | 4,547 | 84,946 | 60,318
P. v. petteri CR | P. v. subsp. EL | 73,554 | 997 | 4,439 | 43,597 | 29,957
Intra-subspecies comparisons
P. v. brucechwatti DA | P. v. brucechwatti DB | 8,189 | 176 | 2,160 | 4,221 | 3,968
P. v. lentum DE | P. v. lentum DS | 102,590 | 1,296 | 4,495 | 61,166 | 41,424
P. v. petteri CR | P. v. petteri BS | 19,712 | 284 | 3,173 | 11,232 | 8,480
P. v. subsp. EL | P. v. subsp. EH | 28,363 | 410 | 3,295 | 16,857 | 11,506
P. v. subsp. EL | P. v. subsp. EE | 58,305 | 753 | 4,297 | 35,033 | 23,272
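As a quick arithmetic check of the N/S ratios quoted above, the ratio can be recomputed from the synonymous and non-synonymous counts in Table 3.2. This is a minimal Python sketch; the short isolate labels used as dictionary keys and the function name are illustrative choices, and only four of the pairwise comparisons are included.

```python
# (synonymous, non-synonymous) coding SNP counts copied from Table 3.2.
COUNTS = {
    ("PvvCY", "PvbDA"): (153_287, 139_069),
    ("PvlDE", "PvpCR"): (70_257, 50_487),
    ("PvlDE", "PvsEL"): (84_946, 60_318),
    ("PvpCR", "PvsEL"): (43_597, 29_957),
}


def ns_ratio(synonymous, non_synonymous):
    """Ratio of non-synonymous to synonymous SNP counts."""
    return non_synonymous / synonymous


for pair, (syn, nonsyn) in COUNTS.items():
    print(pair, round(ns_ratio(syn, nonsyn), 2))
```

The comparisons among P. v. lentum, P. v. petteri and P. v. subsp. give ratios close to 0.7, while the comparison with P. v. vinckei is closer to 0.9.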

Next, we wanted to estimate how many SNP positions were common across all the subspecies and how many were specific to a particular subspecies (see Figure 3.4A). Comparing the other isolates with PvvCY alone, we found that a total of 166,509 common base positions showed polymorphisms in all five subspecies. Depending upon how diverged a subspecies was from PvvCY, the number of SNP positions specific to only one subspecies ranged from 23,291 (PvpCR-specific) to 91,614 (PvbDA-specific). These, along with SNP positions common between two or more isolates, came to a total of 478,505 SNP positions that could be used to distinguish different P. vinckei isolates and thus make an excellent set of molecular markers for genetic linkage studies in P. vinckei.
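The shared and subspecies-specific counts described above amount to set operations on SNP positions. The following Python sketch shows the idea on toy data; the coordinates are invented for illustration and the function name is hypothetical, so it should be read as one possible way to derive such counts rather than the procedure actually used.

```python
def classify_positions(snps_by_isolate):
    """Split SNP positions into shared-by-all, isolate-specific and the union.

    `snps_by_isolate` maps an isolate name to a set of (chromosome, position)
    tuples called against the same reference genome.
    """
    position_sets = [set(p) for p in snps_by_isolate.values()]
    union = set().union(*position_sets)
    shared = set.intersection(*position_sets)
    specific = {}
    for name, positions in snps_by_isolate.items():
        others = set().union(
            *(set(p) for other, p in snps_by_isolate.items() if other != name)
        )
        specific[name] = set(positions) - others
    return shared, specific, union


# Toy coordinates, for illustration only.
toy_calls = {
    "PvbDA": {("chr1", 10), ("chr1", 25), ("chr2", 7)},
    "PvlDE": {("chr1", 10), ("chr2", 7), ("chr2", 90)},
    "PvpCR": {("chr1", 10), ("chr2", 7)},
    "PvsEL": {("chr1", 10), ("chr2", 7), ("chr2", 90)},
}
shared, specific, union = classify_positions(toy_calls)
print(len(shared), {name: len(pos) for name, pos in specific.items()}, len(union))
```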

3.2.4.2 Intra-subspecies diversity

Comparing pairs of isolates within each subspecies, we found varied levels of divergence, with the lowest number of SNPs (8,189 SNPs in 2,160 genes) within P. v. brucechwatti and the highest number (102,590 SNPs in 4,495 genes) within P. v. lentum (see Table 3.2). SNP density within the genes differed in each subspecies accordingly, and Figure 3.4B shows the genes plotted based on the number of SNPs within them (the grey area denotes the number of genes that have a particular SNP density). Most P. v. brucechwatti genes were found to have only one SNP, while P. v. lentum and P. v. subsp. had an average of around ten SNPs per gene (see Figure 3.4B). We then evaluated the evolutionary pressure on each subspecies by calculating the Ka/Ks (non-synonymous substitution rate/synonymous substitution rate) ratio for each protein-coding gene in each pairwise comparison of isolates.

Each gene has been plotted in Figure 3.4C based on its Ka/Ks ratio and the grey area denotes the number of genes having a particular Ka/Ks ratio. Empirically, genes having a Ka/Ks ratio of less than 1 are considered to be under purifying selection (offspring having deleterious mutations in these genes are being removed from the population) and those with a Ka/Ks ratio greater than 1 are considered to be under positive selection (offspring having advantageous mutations in these genes are favoured to propagate into future generations). Most of the genes in each subspecies were found to be under purifying selection with Ka/Ks ratio < 1, but the number of genes under positive selection varied within each subspecies, pointing towards distinct evolutionary pressures. Some of these genes were those encoding merozoite surface proteins, rhoptry proteins, ookinete surface protein, GPI-anchored antigens, helicases and ATPases (see Appendix Figure E.2).

Figure 3.4: Inter- and Intra-specific comparisons of Plasmodium vinckei genotypes. A) Venn diagram showing how the 478,505 SNP positions inferred by mapping four P. vinckei subspecies onto the PvvCY reference genome are shared among them. A total of 166,509 common base positions showed polymorphisms in all five subspecies. B) Violin plot showing genes plotted based on their SNP densities (Number of SNPs per gene, y-axis) within each intra-subspecies comparison (x-axis). The grey violin plot shows the number of genes having a particular SNP density and the box plot within it shows the average SNP density. C) Violin plot showing genes plotted based on their Ka/Ks ratios (y-axis) within each intra-subspecies comparison (x-axis). The grey violin plot shows the number of genes having a particular Ka/Ks ratio and the box plot within it shows the average Ka/Ks ratio. Most of the genes are under purifying selection (Ka/Ks < 1).
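The Ka/Ks interpretation used here can be written down as a small classification rule. The Python sketch below assumes that per-gene Ka and Ks values have already been estimated by some other tool; the gene names and rate values are invented, and the handling of Ks = 0 and of a ratio exactly equal to 1 are assumptions of this example, since the text only discusses ratios below and above 1.

```python
from collections import Counter


def selection_class(ka, ks):
    """Classify a gene by its Ka/Ks ratio, using the thresholds in the text."""
    if ks == 0:
        return "undefined"  # ratio cannot be computed without synonymous changes
    ratio = ka / ks
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"  # assumption: a ratio of exactly 1 is treated as neutral


# Hypothetical (Ka, Ks) estimates, for illustration only.
toy_rates = {
    "gene_a": (0.010, 0.050),
    "gene_b": (0.080, 0.020),
    "gene_c": (0.000, 0.000),
}
tally = Counter(selection_class(ka, ks) for ka, ks in toy_rates.values())
print(dict(tally))
```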

The level of diversity within P. vinckei surpasses the low diversity recorded within P. berghei and is comparable to the diversity shown in P. chabaudi [Otto et al., 2014]. While only five isolates from two P. chabaudi subspecies have been genotyped, we have shown the same level of diversity among ten P. vinckei isolates. This makes them suitable for studying genotype-phenotype relationships at least in terms of availability of a highly dense set of molecular markers.

3.2.5 Transcriptome analysis of P. vinckei blood stages

We performed strand-specific RNA sequencing of poly-adenylated RNA transcripts from five P. vinckei isolates during their blood stages in order to:

• Profile the changes in global gene expression during different stages of the intra-erythrocytic life cycle of P. vinckei.

• Validate an efficient protocol for transcriptome sequencing of rodent malaria blood stage parasites from low quantities of blood.

• Assist in annotating the P. vinckei gene models by providing splice-site information at base-level resolution.

We sequenced the transcriptome of Plasmodium vinckei vinckei CY at four time points of its 24 hr asexual life cycle: 6, 12, 18 and 24 hrs. Owing to this parasite's highly synchronous nature, these time points corresponded to a predominant population (> 95%) of each of its four progressive life stages: rings, early trophozoites, late trophozoites and schizonts respectively. RNA for these time points was extracted from 20 µL of infected mouse blood collected via tail snip over the course of infection, instead of collecting whole blood through terminal techniques. We performed differential expression analysis to profile the gene expression changes across these time points and to test whether our RNA extraction method is able to reproduce the transcriptional regulation previously shown in Plasmodium [Bozdech et al., 2003a]. Additionally, RNA from mixed blood stage parasites of PvpCR, PvlDS, PvbDA and PvsEL was sequenced, and all transcriptome data were used to improve genome annotation.

3.2.5.1 Stage-specific transcriptome of P. v. vinckei CY

Of the 5,073 genes in PvvCY, 85.31% (4,328 transcripts) were differentially expressed in at least one of the four time points, consistent with the highly stage-specific transcriptional regulation in Plasmodium [Bozdech et al., 2008, 2003a], where almost every gene is targeted to a specific developmental stage of the parasite. In order to evaluate the temporal cascade of gene expression, we first identified the time point at which each gene's expression peaked by fitting the expression patterns to a harmonic regression model using ARSER [Yang and Su, 2010] and inferring “phase” values for each gene. Upon ordering the differentially expressed genes based on their phase values, we observed distinct gene clusters upregulated during each time point, showing the tightly regulated transcriptional cascade in PvvCY (see Figure 3.5A, left hand panel). Next, we ordered 2,598 PvvCY transcripts according to the expression pattern [Bozdech et al., 2003a] of their one-to-one orthologs in P. falciparum during its 48 hour IDC and found that our expression data were also able to reproduce the temporal cascade observed in P. falciparum (see Figure 3.5A, right hand panel). Thus, as shown in other RMPs, the transcriptional changes during the IDC of P. vinckei reflect the morphological and functional distinctions of its developmental stages and are highly comparable to those of the human malaria parasite.
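To make the idea of a “phase” value concrete, the sketch below fits a single 24-hour cosine to four evenly spaced measurements and reports the time at which the fitted curve peaks. It is a deliberately simplified Python illustration and not ARSER, which does more than this. The closed-form projection used here assumes evenly spaced samples spanning exactly one period, and the expression values are invented.

```python
import math


def peak_time(times_h, expression, period_h=24.0):
    """Time (hours) at which a fitted cosine of fixed period peaks.

    Fits y = m + a*cos(w*t) + b*sin(w*t). With samples evenly spaced over one
    full period, the least-squares a and b reduce to the projections below.
    """
    w = 2.0 * math.pi / period_h
    n = len(times_h)
    a = 2.0 / n * sum(y * math.cos(w * t) for t, y in zip(times_h, expression))
    b = 2.0 / n * sum(y * math.sin(w * t) for t, y in zip(times_h, expression))
    # a*cos(wt) + b*sin(wt) = R*cos(wt - phi), which peaks where w*t = phi.
    phi = math.atan2(b, a)
    return (phi / w) % period_h


TIMES = [6, 12, 18, 24]
# Hypothetical expression values for three genes, for illustration only.
toy_expression = {
    "late_gene": [2.0, 5.0, 8.0, 5.0],
    "early_gene": [8.0, 5.0, 2.0, 5.0],
    "mid_gene": [5.0, 8.0, 5.0, 2.0],
}
ordered = sorted(toy_expression, key=lambda g: peak_time(TIMES, toy_expression[g]))
print(ordered)
```

Sorting genes by this value is what produces the diagonal band of expression seen in a phaseogram.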

Figure 3.5: Blood stage transcriptome of P. vinckei vinckei CY. A. Heat maps showing gene expression in PvvCY at 6h time-points during the 24 hour asexual cycle. All significantly regulated P. vinckei genes were ordered according to their phase of expression (left). PvvCY genes with one-to-one orthologs in P. falciparum were ordered based on P. falciparum gene expression pattern shown in [Bozdech et al., 2003a] (right). B. Gene expression of multigene families in the four blood stages. (R - rings (6h); ET - early trophozoites (12h); LT - late trophozoites (18h); S - schizonts (24h); ema1 - erythrocyte membrane antigen 1; etramp - early transcribed membrane protein; hdh - Haloacid dehalogenase-like hydrolases; lpl - lysophospholipases; p235 - reticulocyte-binding proteins; pir - Plasmodium interspersed repeat proteins.)

3.2.5.2 Gene expression of multigene families

It has been previously shown that different members of the pir gene family show stage-specific expression during the IDC [Otto et al., 2014, Lawton et al., 2012]. Of the 158 vinkir genes in PvvCY, we detected expression for 150 genes in at least one of the time points, and 51 genes showed significant regulation in at least one pair-wise comparison between the time points. A majority of the genes were upregulated in late trophozoites (32 genes) and schizonts (17 genes) and only 2 genes were upregulated in ring stages (see Figure 3.5B). Of all the vinkir genes, two genes were consistently expressed at high levels (Fragments Per Kilobase of transcript per Million mapped reads, FPKM > 300) in all stages, with significant upregulation in schizonts. fam-a, fam-b, fam-c, fam-d and lysophospholipase genes all formed stage-specific clusters, but most of their members were upregulated during the schizont-ring stage transition. Different HAD-like hydrolases were upregulated during ring, early trophozoite and schizont stages, which could mean stage-specific functions for this family. etramp genes were specifically upregulated during the ring stages, consistent with previous findings [Spielmann et al., 2003], with the exception of two genes that were upregulated in schizonts.

Members of both ema1 and p235 families show similar patterns of expression, with an increase in expression during late trophozoite stages and a significant upregulation during schizont stages, which then subsides during ring stages. This is consistent with the expression patterns of merozoite invasion related genes in P. vivax [Bozdech et al., 2008] and P. falciparum [Bozdech et al., 2003a] and it could therefore be hypothesized that these families play a role in host invasion. In addition to this, the ema1 family of antigens are known to be exported to the host RBC membrane [Favaloro and Kemp, 1994], indicating possible roles in cytoadherence and immune evasion by remodelling the host RBC membrane. Thus, ema1 proteins might play multiple roles in malaria pathogenesis, as shown for stevor proteins in P. falciparum [Niang et al., 2014] and cir proteins in P. chabaudi [Yam et al., 2016]. These are only speculations at this point and further investigation into the ema1 proteins is required to elucidate the role of this RMP-specific multigene family.

3.2.5.3 Validation of a new RNAseq protocol for RMPs

Transcriptome studies continue to play a key role in understanding Plasmodium biology, as strict transcriptional regulation is observed throughout the lifecycle of the parasite. Experimental protocols for such studies on blood stages have generally involved obtaining RNA from in vitro cultures (in the case of P. berghei) or from parasitized blood from in vivo infections in mice (P. chabaudi and P. yoelii) [Otto et al., 2014, Hall et al., 2005]. If life stage-specific transcriptomes are required, synchronized in vitro cultures are used, or whole blood is terminally bled from mice and then enriched for particular life stages by Nycodenz gradient separation.

The methods also involve a step of leukocyte depletion using Plasmodipur filters (EuroProxima CAT. 8011Filter25u) or cellulose CF11 columns [Mons et al., 1988, Richards and Williams, 1973] and gentle saponin lysis of infected RBCs to remove host RNA. RNA is then extracted from the blood using (most commonly) TRIzol reagent (Thermo Fisher Scientific Cat#15596026) or other commercially available RNA extraction kits. The abundance of individual transcripts is then measured either by microarray or RNA sequencing workflows. Thus, the protocols start with a considerable volume of blood (1-1.5 ml of whole blood can be obtained from a single mouse) that goes through multiple filtration steps in order to obtain an adequate amount of pure parasite RNA for cDNA preparation (microarrays) or RNAseq library preparation. In the case of time series experiments, where the transcriptomic changes are to be studied at regular intervals over the course of infection, the workflows are started with a cohort of infected mice so that for each time point, 1-2 mice can be euthanized for whole blood collection [Hoo et al., 2016, Lawton et al., 2012]. Thus, in such experiments, the existing methods can be costly in terms of sample size (number of animals) and processing time, and can limit the number of time points that can be profiled.

We, therefore, sought to adopt and evaluate a simplified and quick protocol for RNA extraction from a low quantity of parasitized blood as starting material. We collected 20 µL of parasitized blood via tail snip from three mice infected with PvvCY parasites at four 6 hr time points. The blood was washed with PBS and spun down, and the blood pellet was immediately added to 0.5 mL of TRIzol and stored at 4°C. Once all the time point samples were collected, total RNA was extracted through the standard TRIzol extraction protocol. We obtained 1 to 10 µg of good quality total RNA (host plus parasite RNA) from each sample and we used as little as 500 ng input for constructing strand-specific RNAseq libraries using the TruSeq stranded total RNA library preparation kit (Illumina). The 12 libraries (3 replicates × 4 time points) were pooled in equimolar ratios and sequenced on one lane of Illumina HiSeq2000 with 100 bp paired end chemistry, yielding 13 to 18 million paired reads per sample.

As the samples were not leukocyte depleted or RBC lysed prior to TRIzol treatment, up to 50% of the reads were “contaminant” reads of mouse origin. The raw reads were initially mapped against the Mus musculus genome (GRCm38 assembly) and the mapped host reads were removed from further analysis. The remaining reads of parasite origin yielded a suitable read coverage for downstream analyses, with a median FPKM coverage of 32 and more than 90% of the genes having FPKM greater than 8. Since P. v. vinckei is highly synchronous and our time points corresponded to peak abundances of rings, early trophozoites, late trophozoites and schizonts (confirmed by microscopy), we expected our data to show stage-specific gene expression.

Constructing a phaseogram with our data showed a pattern of transcriptional regulation that has been consistently observed in several Plasmodium species (see Figure 3.5A) [Bozdech et al., 2003a, Llinas et al., 2006, Bozdech et al., 2008, Otto et al., 2014, Hoo et al., 2016]. Moreover, one-to-one PvvCY orthologs of P. falciparum genes were able to reproduce the expression pattern seen in P. falciparum, demonstrating that our simplified protocol can efficiently capture the parasite's transcript-level information from non-terminally collected blood samples of volumes as low as 20 µL.

Our protocol provides an ideal study design for in vivo time-series gene expression studies with RMPs during their blood stage infections in rodents. It would be possible to profile the temporal gene expression dynamics within the same host instead of using different hosts for each time point, thus reducing inter-individual variability. This will also drastically reduce the number of mice required for a particular study, thus providing ethical and cost benefits that can in turn allow for more time points and/or replicates, resulting in better statistical power. Quick processing time will allow us to create “snapshots” of the parasite transcriptome: the blood can be collected, washed and resuspended in TRIzol within minutes.

In the case of low parasitaemia, there are two possible limitations in the method: low RNA yield and high host contamination. While low RNA yield can be overcome by using new RNA library prep kits [Adiconis et al., 2013] that can handle picogram levels of RNA input, host contamination remains a major challenge as existing host RNA depletion techniques cannot be applied to low blood volumes. High host RNA levels have to be compensated for with more sequencing depth, but with falling sequencing costs, we believe this to be a reasonable trade-off for reduced animal use and processing times.

3.3 Methods

3.3.1 DNA extraction and sequencing

3.3.1.1 DNA extraction

Parasitized blood was collected as described in Section 2.3.4. A blood pellet was obtained by centrifuging at 2000 rpm for 5 min. The pellet was washed once with 10 mL PBS (Phosphate buffered saline) solution to remove blood serum. The RBC pellet was obtained again by centrifuging at 2000 rpm for 5 min and resuspended in 10 mL PBS solution. CF11 cellulose columns were prepared and equilibrated with PBS solution, and the blood solution was passed through them to partially deplete the suspension of mouse leukocytes. The RBCs were then gently lysed with 0.15% saponin solution and centrifuged at 3000 rpm for 5 min, and ghost RBCs were carefully removed, leaving behind the parasite pellet. The pellet was immediately resuspended in DNAzol reagent (Invitrogen CAT # 10503027) and mixed vigorously for 2-3 min.

DNA extraction was performed from DNAzol as per the manufacturer's instructions. Briefly, the homogenate was sedimented by centrifugation at 10,000g for 10 min at 4°C to remove cell debris and RNA (the lysing solution in DNAzol enables clumping of proteins, lipids and partially hydrolysed RNA, which are then removed by this centrifugation step). DNA was precipitated from the viscous supernatant by adding 100% ethanol (0.5 mL for every 1 mL of DNAzol), mixing by inversion and storing at room temperature for 5 min. The DNA pellet was washed twice with 0.8 mL 75% ethanol, air-dried at room temperature for 10 min and resuspended in 8 mM NaOH solution. DNA was quantified by Qubit fluorimeter and DNA quality was assessed by 0.5% agarose gel electrophoresis.

3.3.1.2 Single molecule Real-time (SMRT) Sequencing

5-10 µg of DNA was sheared using a Covaris g-TUBE shearing device to obtain target sizes of 20 kb (for samples PvvCY, PvbDA and PvpCR) and 10 kb (for samples PvlDE and PvsEL). Sheared DNA was concentrated using AMPure magnetic beads and SMRTbell template libraries were generated as per Pacific Biosciences issued instructions. Libraries were sequenced using P6 polymerase and chemistry version 4 (P6C4) on 3-6 SMRT cells. Reads were filtered using SMRT portal v2.2 with default parameters. Read yields were 352,693, 356,960, 765,596, 386,746 and 675,879 reads for PvvCY, PvbDA, PvlDE, PvpCR and PvsEL respectively, totalling around 2.7 to 4.7 Gb per sample. Mean subread lengths ranged from 6.15 to 9.1 kb. N50 values of 11.7 kb and 19.2 kb were obtained for the 10 kb and 20 kb libraries respectively. Sample preparation and sequencing were done commercially by Macrogen Inc.
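For readers unfamiliar with the N50 statistic quoted above, it is the length L such that reads of length L or longer contain at least half of all sequenced bases. A minimal Python sketch of that definition follows; the function name and the toy lengths are illustrative and are not taken from the sequencing runs.

```python
def n50(lengths):
    """Smallest length L such that items of length >= L hold at least half the total."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50 is undefined for an empty list")


# Toy read lengths in bp, for illustration only.
toy_lengths = [2_000, 3_000, 4_000, 5_000, 6_000]
print(n50(toy_lengths))
```

Because long reads contribute more bases, N50 is typically larger than the mean read length, which is consistent with the values reported for these libraries.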

3.3.1.3 PCR-free Illumina sequencing

1-2 µg of genomic DNA was sheared using Covaris to obtain fragment sizes of 350 and 550 bp. 350 bp and 550 bp PCR-free libraries were prepared using the TruSeq PCR-free DNA library preparation kit according to the manufacturer's instructions. Libraries were sequenced on the Illumina HiSeq2000 platform with 2 × 100 bp paired-end read chemistry. Read yields ranged from 8 to 22 million reads for each library.

3.3.2 Genome assembly

Genome assembly from long single molecule sequencing reads was done using FALCON v0.2.1 (https://github.com/PacificBiosciences/FALCON) with the length cutoff for seed reads used for initial mapping set to 2,000 bp and for pre-assembly set to 12,000 bp. The falcon sense options were set as “--min_idt 0.70 --min_cov 4 --local_match_count_threshold 2 --max_n_read 200” and the overlap filtering settings were set as “--max_diff 240 --max_cov 360 --min_cov 5 --bestn 10”. 28-40 unitigs were obtained and smaller unitigs were discarded as they were exact copies of regions already present in the larger unitigs. PCR-free reads were trimmed using Trimmomatic v0.32 [Bolger et al., 2014] (default parameters) and contaminant reads from the mouse host were removed by mapping onto the Mus musculus reference genome using BWA v0.7.5 [Li and Durbin, 2009] and retaining only unmapped reads using bedtools v2.17 bam2fastq [Quinlan and Hall, 2010].

PCR-free reads thus obtained were used to correct base call errors in the unitigs using ICORN2 [Otto et al., 2010a], run with default parameters for 15 iterations, at the end of which no more single base or indel errors were detected and corrected. The unitigs were classified as chromosomes based on their homology with P. chabaudi chromosomes (GeneDB version 3). In PvlDE and PvsEL samples, some of the chromosomes were made of two to three unitigs with overlapping ends, which were then fused and the gaps were removed manually. Apicoplast and mitochondrial genomes were assembled from PCR-free reads alone using Velvet [Zerbino and Birney, 2008] with VelvetOptimiser (https://github.com/tseemann/VelvetOptimiser).

3.3.3 Comparative genomics

Syntenic regions between genome sequences were identified using MUMmer v3.2 [Kurtz et al., 2004]. Synteny breakpoints were identified manually and were confirmed not to be misassemblies by verifying that they had continuous read coverage from PacBio and Illumina reads. Artemis Comparison tool [Carver et al., 2005] and Integrative Genomics Viewer [Robinson et al., 2011] were used for this purpose. The structural variations were illustrated using Circos [Krzywinski et al., 2009].

3.3.4 Genome annotation

De novo gene predictions were made using AUGUSTUS [Stanke et al., 2004]. An AUGUSTUS config file was created by training the tool on P. chabaudi gene models. RNAseq reads were mapped onto the reference genome using TopHat2 v2.0.138 [Kim et al., 2013] to infer splice junctions. Consensus gene models were inferred using MAKER v2.31.8 [Campbell et al., 2014] from three lines of evidence: AUGUSTUS predictions made within MAKER, transcript models from TopHat2 and protein homology of P. chabaudi protein sequences. Default parameters were used in the MAKER run except for “correct_est_fusion=1”. Gene models were then manually curated based on RNAseq evidence in Artemis viewer and Artemis Comparison tool [Carver et al., 2005, Rutherford et al., 2000]. Ribosomal RNA (rRNA) and transfer RNA (tRNA) were annotated using RNAmmer v1.2 [Lagesen et al., 2007]. Orthologous genes were identified between the five P. vinckei genomes, three RMP genomes, and the P. falciparum, P. knowlesi and P. vivax genomes using OrthoMCL v2.0.9 [Li et al., 2003] with the inflation parameter set to 1.5, the BLAST hit e-value cutoff to 1e-5 and the percentage match cutoff to 50%. Functional domain annotations were inferred from the InterPro database using InterProScan v5.17 [Finn et al., 2017, Jones et al., 2014]. Transmembrane domains were predicted by TMHMM v2.0 [Krogh et al., 2001], signal peptide cleavage sites by SignalP v4.0 [Petersen et al., 2011b], and the presence of a PEXEL/VTS motif was detected using ExportPred v4.0 [Sargeant et al., 2006] (with a PEXEL score cutoff of 4.3).

3.3.5 Whole genome phylogeny construction

Proteins of 3,920 one-to-one orthologous groups were concatenated and aligned using MUSCLE v3.8 [Edgar, 2004] and the alignment was trimmed with trimAl v1.2 [Capella-Gutierrez et al., 2009] with the “-gappyout” option. The phylogenetic tree was inferred using MrBayes v3.2.4 [Ronquist et al., 2012] with JTT and Poisson fixed-rate models and with the model for rate variation across sites set to “invgamma”. Two independent chains were run for 200,000 generations with burninfrac set to 0.25. All clades in the tree were supported with posterior probability 1.00 and 100% bootstraps.

3.3.6 Genotyping

All sequencing reads were mapped onto relevant reference genomes using BWA v0.7.5 [Li and Durbin, 2009] and SNPs were inferred using the GATK tool [McKenna et al., 2010, DePristo et al., 2011, Van der Auwera et al., 2013]. Duplicate reads were removed using Picard tools v1.131 (https://broadinstitute.github.io/picard/). High quality SNPs were inferred in chromosome 14 using GATK HaplotypeCaller and filtered for quality using GATK VariantFiltration commands. These SNPs were used for base recalibration of the mapped reads using GATK Recalibrator, and SNPs and indels were called using GATK HaplotypeCaller with settings “-ploidy 1 -pcrModel NONE”. SNPs/indels were filtered using GATK recommended settings “QD<2, FS>60, SOR>4, MQ<40, MQRankSum<-12.5, ReadPosRankSum<-8.0”. Repetitive regions in the genome were identified using DustMasker [Morgulis et al., 2006] and SNPs in repetitive regions and subtelomeric genes were discarded using vcftools [Danecek et al., 2011].

3.3.7 Sample collection for P. vinckei vinckei CY blood stage transcriptome

100 µL of PvvCY parasites was inoculated intravenously into three ICR mice. For blood collection via tail snip, the mouse tail was snipped and 20 µL of blood was collected by pipette into 500 µL of PBS solution. This was done at 4 time points on day 4 post-inoculation: 06:00 hrs, 12:00 hrs, 18:00 hrs and 24:00 hrs. The blood sample was centrifuged at 2000 rpm for 5 min and the supernatant removed, after which 0.75 mL of TRIzol was added and the samples were stored at 4°C for a few hours and then at -80°C for long-term storage.

3.3.8 RNA extraction and sequencing

For RNA extraction from terminally bled samples (PvbDA, PvpCR, PvlDS and PvsEL), blood collection was done as described in 2.3.4 and leukocyte depletion and saponin lysis were performed as described in 3.3.1.1. The parasite pellet was treated immediately with TRIzol (Invitrogen).

RNA isolation from TRIzol was done according to the manufacturer's protocol. RNA was resuspended in nuclease-free water, its quantity measured by Qubit fluorimeter and its integrity measured on an Agilent Bioanalyzer chip.

Strand-specific mRNA sequencing was performed from total RNA using the TruSeq Stranded mRNA Sample Prep Kit LT (Illumina) according to the manufacturer's instructions. Briefly, polyA+ mRNA was purified from total RNA using oligo-dT dynabead selection. First strand cDNA was synthesised using randomly primed oligos, followed by second strand synthesis where dUTPs were incorporated to achieve strand-specificity. The cDNA was adapter-ligated and the libraries amplified by PCR. Libraries were sequenced on Illumina HiSeq2000 with paired-end 100 bp read chemistry.

3.3.9 Transcriptome analysis

Strand-specific RNAseq paired-end reads were mapped onto the reference genomes using TopHat2 version 2.0.138 [Kim et al., 2013] with the options “--library-type=fr-firststrand” and “--no-novel-juncs”. Differential expression analysis was carried out using cuffdiff2 v.2.2.1 [Trapnell et al., 2009] with the “-u -b” parameters. To create the phaseograms, the phase of gene expression was calculated using the ARSER package [Yang and Su, 2010] and the genes were ordered according to their phase. Heatmaps were created using the heatmap.2 function in R.
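As a rough illustration of how genes can be ordered by phase for a phaseogram, the sketch below estimates a peak time for each gene by projecting its expression profile onto a 24-hour sine and cosine. This is a simple harmonic fit and not the ARSER algorithm used in this work. The function names, the 24-hour period and the toy expression values are assumptions made for illustration, and the projection is only valid for evenly spaced timepoints spanning one full period (such as the four 6-hourly samples described in 3.3.7).

```python
import math


def peak_time(times_h, expression, period_h=24.0):
    """Estimate the time of peak expression (in hours) by harmonic projection.

    Assumes evenly spaced timepoints covering exactly one period, so that the
    sine and cosine terms are orthogonal and a simple projection suffices.
    """
    if len(times_h) != len(expression) or len(times_h) < 3:
        raise ValueError("need at least three matched timepoints")
    omega = 2 * math.pi / period_h
    mean = sum(expression) / len(expression)
    # Project the mean-centred profile onto the cosine and sine components.
    a = sum((y - mean) * math.cos(omega * t) for t, y in zip(times_h, expression))
    b = sum((y - mean) * math.sin(omega * t) for t, y in zip(times_h, expression))
    # a*cos(wt) + b*sin(wt) is maximal where wt = atan2(b, a).
    angle = math.atan2(b, a) % (2 * math.pi)
    return angle / omega


def order_by_phase(times_h, profiles):
    """Return gene names sorted by estimated peak time (rows of a phaseogram)."""
    return sorted(profiles, key=lambda gene: peak_time(times_h, profiles[gene]))


# Toy profiles (hypothetical genes and values) at the four sampling times.
times = [6, 12, 18, 24]
profiles = {
    "gene_late": [4.0, 5.0, 6.0, 5.0],   # highest at 18:00
    "gene_early": [6.0, 5.0, 4.0, 5.0],  # highest at 06:00
    "gene_mid": [5.0, 6.0, 5.0, 4.0],    # highest at 12:00
}
print(order_by_phase(times, profiles))
```

With only four timepoints the estimate is coarse; a dedicated rhythm-detection method also tests whether a profile is periodic at all, which this sketch does not attempt.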

Chapter 4

Genetics in Plasmodium vinckei

4.1 Review of Literature

4.1.1 Classical genetics in the malaria parasite

Genetic linkage analysis is a powerful tool that is used to map a phenotypic trait to a genomic location by showing co-segregation of the trait with genetic markers in the progeny of a genetic cross. The technique is based on three concepts:

i) inheritance of alleles or gene variants from genetically distinct parents to their progeny is accompanied by intrachromosomal crossing over and chromosomal segregation during meiosis, resulting in genetically diverse progeny.

ii) phenotypic traits are also assorted similarly in the progeny and a phenotypic trait and the gene(s) controlling that trait co-segregate during meiosis.

iii) genetic loci that are in close physical proximity to each other on a chromosome, with low chances of crossover events between them, are said to be linked and are likely to be inherited together. Thus, a map of polymorphic genetic markers can be constructed and their pattern of inheritance can be used to pinpoint the location of a gene linked to a particular phenotype.
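The three concepts above can be illustrated with a small sketch that scores co-segregation in haploid progeny clones. The recombination fraction between two loci is taken here as the proportion of clones in which the loci differ in parental origin, and the marker that co-segregates most closely with a trait is the one with the lowest recombination fraction. The function names and the eight-clone dataset are hypothetical and serve only as an illustration.

```python
def recombination_fraction(locus_1, locus_2):
    """Fraction of haploid progeny clones in which two loci differ in parental origin."""
    if len(locus_1) != len(locus_2) or not locus_1:
        raise ValueError("loci must be scored on the same non-empty set of clones")
    recombinants = sum(1 for x, y in zip(locus_1, locus_2) if x != y)
    return recombinants / len(locus_1)


def closest_marker(trait, markers):
    """Name of the marker with the lowest recombination fraction against the trait."""
    return min(markers, key=lambda name: recombination_fraction(trait, markers[name]))


# Hypothetical data: parental origin ("A" or "B") in eight progeny clones.
trait = "AABBABBA"  # phenotype scored as resembling parent A or parent B
markers = {
    "m1": "ABABABAB",  # unlinked: differs from the trait in half of the clones
    "m2": "AABBABBB",  # tightly linked: differs in one clone only
    "m3": "BBAABABA",
}
for name, calls in markers.items():
    print(name, recombination_fraction(trait, calls))
print("closest:", closest_marker(trait, markers))
```

A recombination fraction near 0.5 indicates independent assortment, while values approaching 0 indicate linkage between the marker and the gene controlling the trait.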

Classical genetic analysis offers the advantage of being an unbiased approach towards identifying phenotype-genotype relationships, and has been successfully applied to study several phenotypes in the malaria parasite (see Appendix Table C). A typical classical genetics experiment involves:

• Achieving a genetic cross in Plasmodium and producing a recombinant progeny.

• Developing genetic markers and genetic linkage maps to trace inheritance of genomic loci in progeny clones.

• Characterizing a measurable phenotype in the progeny clones and linking it to a genotype.

4.1.2 Genetic crossing in Plasmodium

The first requirement for undertaking genetic analysis in the malaria parasite is to be able to perform, under laboratory settings, a genetic cross between two parasites that differ in their genotypes yet are reproductively compatible. The first ever laboratory cross between two malaria parasites was reported in 1954, between two P. gallinaceum strains [Greenberg and Trembley, 1954]. It was shown that simultaneous passage of both strains through mosquitoes resulted in a drug-sensitive, avirulent strain acquiring pyrimethamine resistance from a resistance-induced, virulent strain. However, the study could not provide conclusive proof that exchange of genetic material via recombination had occurred in the parasite.

Much of our present knowledge about malaria genetics stems from subsequent studies done with rodent malaria parasites. Apart from being easily tractable systems, RMP isolates are genetically diverse, thus providing genetically distinct parent lines for crossing and also providing unambiguous molecular markers for the differentiation of genetically distinct entities in the progeny.

The first RMP cross was performed in P. yoelii [Walliker et al., 1971, 1973]. Mosquitoes were fed on a mouse infected with equal amounts of a pyrimethamine resistant line of P. yoelii 17X and a pyrimethamine sensitive line of P. yoelii 33X. Sporozoites, isolated from the mosquitoes’ salivary glands, were injected into a thicket rat and the resulting blood stage parasites were treated with pyrimethamine. Electrophoretic variants of the enzyme glucose phosphate isomerase (GPI) were used as a marker to distinguish the two parasite lines, in addition to the drug resistance phenotype. The GPI variant of the pyrimethamine sensitive P. yoelii 33X line was detected in the resistant population that survived drug treatment, thus confirming that genetic recombination had taken place between the two P. yoelii lines.

Subsequent studies showed that cross-immunity [Oxbrow, 1973] (whereby a mouse previously infected with one parasite line gains immune protection against a subsequent challenge with another parasite line), virulence [Walliker et al., 1976] and chloroquine resistance [Rosario, 1976] are genetically determined, inherited in a simple Mendelian fashion and undergo recombination independently along with genetic markers.

Another cross experiment in P. chabaudi used two enzyme markers to investigate the recombinants and established that markers were inherited independently of each other [Walliker et al., 1975]. A more extensive study in P. yoelii [Knowles et al., 1981], using five crosses, three enzyme markers and 105 progeny clones, showed that the ratio of parental and recombinant clones obtained was as predicted by Mendelian genetics, confirming equal preference for self- and cross-fertilization among the parental gametes. The study (along with [Oxbrow, 1973]) also showed that two RMP subspecies from different geographical regions can interbreed in a similar fashion to two strains from the same geographical location. The genetic studies discussed above, along with electron microscopy studies [Sinden et al., 1985, Sinden and Hartley, 1985], helped establish the fundamentals of malaria genetics (see Section 4.1.2.1, Fundamentals of malaria genetics, below).

With genetic experiments in RMPs as a basis, the first genetic cross with the human malaria parasite, P. falciparum, was successfully made between two isolates: pyrimethamine sensitive 3D7 and resistant HB3 [Walliker et al., 1987]. This experiment involved two main alterations to the protocol: i) gametocytes from each strain were cultured in vitro, mixed together in equal proportions and fed to Anopheles freeborni mosquitoes through membrane-feeding, and ii) infective sporozoites were injected into splenectomized chimpanzees. The genetic cross was confirmed by detection of recombination between three markers: pyrimethamine resistance, the isoenzyme adenosine deaminase, and polymorphic antigens.

As this protocol requires propagation of the pre-erythrocytic stages in non-human primates such as chimpanzees, such experiments are bound by ethical, technical and cost constraints. Due to this, genetic crosses in P. falciparum have been rare, with only two more crosses performed over a span of three decades: a HB3 X Dd2 cross [Wellems et al., 1990] to study chloroquine resistance and a 7G8 X GB4 cross [Hayton et al., 2008] to study host receptor recognition. Recently, a P. falciparum genetic cross was successfully performed in humanized mice (chimeric mice that carry human hepatocytes) [Vaughan et al., 2015] and this could be a promising alternative in the future.

4.1.2.1 Fundamentals of malaria genetics

A series of genetic and cytological studies established the basic principles of malaria genetics (reviewed in [Walliker, 1989, Fenton and Walliker, 1992, Walliker, 1994, Baton and Ranford-Cartwright, 2005, Gerald et al., 2011, Culleton and Abkallo, 2014]; see Figure 4.1A). All blood stage parasites, including the sexual stages (male and female gametocytes), have haploid genomes (1n) [Janse et al., 1986a, 1988]. During a blood meal, gametocytes enter the lumen of the mosquito’s midgut, and within 12 min, the male gametocyte undergoes three rounds of replication to produce eight male gametes (1n) [Janse et al., 1986b], while the female gametocyte produces a single female gamete (1n). Within the lumen of the mosquito’s midgut, a male gamete fuses with the female gamete to produce a diploid zygote (2n).

The zygote is the only stage in the parasite’s entire life cycle that has a diploid genome and is short-lived. Within a few hours of fertilization, the zygote undergoes one round of genome duplication (4n) followed by meiosis [Sinden and Hartley, 1985, Janse et al., 1986b]. Chromosomes from each parent are arranged in pairs and crossing-over between chromosomes occurs, followed by separation and independent assortment of the chromosomes into four haploid meiotic products (1n) [Sinden et al., 1985, Sinden and Hartley, 1985]. The extranuclear genomes (mitochondrial and apicoplast) are maternally inherited, i.e., exclusively from the female gamete. The zygote develops into a motile ookinete that traverses the midgut epithelium where it forms an oocyst containing the four haploid meiotic products [Beier, 1998]. Each meiotic haploid complement of chromosomes undergoes 10-11 rounds of replication, yielding around 2000-8000 sporozoites (1n) per oocyst that then migrate to the salivary glands following rupture of the oocyst [Rosenberg and Rungsiwongse, 1991, Sinden and Matuschewski, 2005]. Following deposition into the skin of the host, the sporozoites eventually invade and develop within hepatocytes. Inside the host liver, each sporozoite undergoes 13-14 rounds of replication and nuclear division, releasing tens of thousands of daughter merozoites (1n) into the bloodstream to invade red blood cells.

[Figure 4.1, panel A: genetic cross between a resistant and a sensitive parent, showing gametes, zygotes (self AA and BB, hybrids AB and BA), meiosis with crossing over and segregation, and sporozoites (recombinant and self progeny). Panel B: selection pressure applied to the recombinant progeny and genetic markers measured; allele ratio plotted against genome length, with parental regions flanking a selection valley of recombinants at the gene of interest.]

Figure 4.1: Basics of genetics in malaria and Linkage Group Selection method. (A) The fundamentals of malaria genetics are as follows: A male gamete fuses with the female gamete to produce a diploid zygote within the mosquito following a blood meal. When two parasite clones (denoted here as blue and orange) are propagated, the gametes can self-fertilize, producing homozygous zygotes, and cross-fertilize, producing two types of heterozygous or hybrid zygotes. Shortly after fertilization, meiosis ensues, during which crossing over between homologous chromosomes occurs and then the chromosomes segregate independently into four daughter cells. Each zygote matures into an oocyst and produces thousands of sporozoites. The homozygous zygotes produce sporozoites that are genetically identical to their parent, while the heterozygous zygotes produce recombinant sporozoites of four distinct genotypes. Adapted from [Culleton and Abkallo, 2014]. (B) Linkage Group Selection involves placing the uncloned recombinant progeny of a genetic cross between a “resistant” (orange) and a “sensitive” (blue) parent under a selection pressure which represents a phenotype of interest. Under the selection pressure, the “sensitive” parasites in the population get eliminated and the “resistant” parasites get enriched. The surviving progeny are then genotyped as a whole to quantify genetic markers of both parental strains across the genome. Markers from the resistant parent that are linked to the gene(s) determining the phenotype (black) will be enriched in the selected progeny. At each marker locus unaffected by the selection pressure, both parental alleles would be present in constant ratios, while the loci around the determinant gene would have the resistant alleles over-represented. A “selection valley” appears around the selected locus with the base of the valley pinpointing it. Adapted from [Culleton et al., 2005].
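The self- and cross-fertilization outcomes shown in Figure 4.1A can be put into numbers with a short sketch. It assumes that gametes of two clones, A and B, fuse at random and that every zygote is equally likely to yield infective sporozoites; the function name and the example frequencies are illustrative assumptions.

```python
from fractions import Fraction


def zygote_proportions(freq_a):
    """Expected zygote classes when gametes of clones A and B fuse at random.

    freq_a is the fraction of gametes (male and female alike) from clone A.
    """
    p = Fraction(freq_a)
    if not 0 <= p <= 1:
        raise ValueError("freq_a must lie between 0 and 1")
    q = 1 - p
    # AA and BB are self-fertilizations; AB and BA together form the hybrids.
    return {"AA": p * p, "BB": q * q, "hybrid": 2 * p * q}


# Equal parental proportions: one AA and one BB for every two hybrid zygotes.
print(zygote_proportions(Fraction(1, 2)))

# A skewed mixture (three quarters clone A) lowers the share of hybrid zygotes.
print(zygote_proportions(Fraction(3, 4)))
```

Under these assumptions the hybrid share peaks at one half when the two clones are equally represented, which is one reason for mixing parental lines in equal proportions.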

When only one parasite clone is propagated through these genetic events, fertilization occurs between male and female gametes with identical genomes and the resulting sporozoites are all genetically identical to the parental clone. When two genetically distinct parasite clones, for example “A” and “B”, are propagated, the gametes fertilize to produce four different combinations: the self-fertilizations AA and BB, and the cross-fertilizations AB (male gamete A with female gamete B) and BA (male gamete B with female gamete A). Assuming that both parental gametocytes A and B were in equal proportions and that the rates of self- and cross-fertilization events were equal, one would expect one homozygous zygote each of types AA and BB for every two heterozygous zygotes of type AB/BA. The homozygous zygotes would produce sporozoites of parental type A or B, while hybrid zygotes would yield cross progeny of recombinant sporozoites, with as many as four distinct genotypes from each zygote. Assuming that the parental types and the recombinants are equally successful in becoming infective sporozoites, half the sporozoites would be recombinants while the rest would be parental types A and B (one-fourth each) [Fenton and Walliker, 1992, Walliker, 1994]. Each sporozoite would then yield thousands of genetically identical merozoites after liver stage development and, assuming there are no selection pressures present, the genotype proportions in the sporozoite population would be maintained in the blood stages too.

4.1.3 Molecular markers and genetic linkage maps

The second requirement for a genetic linkage study is a suitable genetic marker system that can distinguish between genotypes and provide information about allelic variation at a particular locus. As discussed earlier, phenotypic markers like pyrimethamine resistance and strain-specific immunity have partially satisfied this purpose, but genetic markers, derived using molecular biology techniques, have the advantage that large numbers can be generated for particular parental strain combinations.

Molecular markers can track genetic loci following genetic recombination and over the years, advances in molecular biology have given us new marker types that offer better precision and resolution in assaying genetic variation [Schlotterer, 2004]. The first molecular markers employed in malaria parasites were allozymes, which are protein variants of enzymes that can be distinguished by native gel electrophoresis based on their size and charge. Allozymes were used in studies that marked the very beginning of genetic linkage analysis in RMPs [Walliker et al., 1971, 1973, 1975, Oxbrow, 1973, Rosario, 1976]. Variations in enzymes like glucose phosphate isomerase, 6-phosphogluconate dehydrogenase, lactate dehydrogenase and glutamate dehydrogenase across RMP isolates were characterized [Carter, 1973, 1978] and segregation of these enzyme variants in the cross experiments was used to confirm genetic recombination.

This was followed by the use of more precise DNA-based markers: fragment length polymorphisms (FLPs), microsatellites (MS) and single nucleotide polymorphisms (SNPs). DNA-based markers can have genome-wide distributions and their relative genetic distances (measures of linkage between two markers) are plotted as genetic linkage maps. These maps can be used to track movement of genetic loci from the parents to their cross progeny at the genome level. Restriction Fragment Length Polymorphism (RFLP) analysis involves fragmenting the DNA with a restriction enzyme, separating the DNA fragments by size on a gel and profiling the fragments (through Southern blots) for the presence of a particular DNA sequence. Sequence differences between parasite strains change the distribution of restriction sites across their genomes, and these changes can be detected as variations in the length of fragments.

FLPs can also be detected by PCR through the addition of selective adapters to the fragments and amplifying a subset of the fragments. Markers generated in this manner are called Amplified Fragment Length Polymorphisms (AFLPs) [Vos et al., 1995].

RFLPs were first employed in the P. falciparum HB3 X Dd2 cross [Wellems et al., 1990] and 85-90 RFLP markers were used to construct the first genetic linkage map for P. falciparum and identify loci linked to chloroquine resistance [Wellems et al., 1991, Walker-Jonah et al., 1992]. Within RMPs, RFLP markers (46 markers) were designed only for P. chabaudi and used to study different drug resistance phenotypes [Carlton et al., 1998, Hayton et al., 2002, Cravo et al., 2003]. Following this, AFLPs [Grech et al., 2002] became the markers of choice for studies employing Linkage Group Selection, and around 92-275 AFLPs were put to use to study a variety of phenotypes in P. chabaudi [Culleton et al., 2005, Martinelli et al., 2005a, Pattaradilokrat et al., 2007] and P. yoelii [Pattaradilokrat et al., 2009]. An AFLP-based genetic map, comprising 672 markers, was also constructed for P. chabaudi [Martinelli et al., 2005b].

The increasing availability of reference genomes for malaria parasites subsequently made it possible to use microsatellites as genetic markers. Microsatellites are di-, tri- and tetra-nucleotide tandem repeats distributed across the genome. They have high mutation rates and therefore differ greatly in their location and length between even closely related genotypes. Since they are widely distributed across the genome and are multi-allelic, high density genetic maps can be constructed based on their polymorphisms [Li et al., 2009b].

Genetic maps based on hundreds of microsatellites were constructed for P. falciparum [Su and Wellems, 1996, Su et al., 1997, 1999b, Hayton et al., 2008] and P. yoelii [Li et al., 2011] and subsequently used for linkage analysis. However, analysis using FLPs and microsatellites requires tedious methods involving multiple PCR amplification steps and the running of high resolution polyacrylamide or agarose gels. These marker systems can also be unreliable [Schlotterer, 2004]. SNPs, on the other hand, are stably inherited, have a denser, more consistent distribution in the genome and can be identified via easy and low-cost next generation sequencing technologies. At present, SNPs are the genetic markers of choice, offering high resolution in linkage analysis, and have been used for developing new statistical models [Abkallo et al., 2017] and high density linkage maps [Miles et al., 2016] to discover mutations driving a specific phenotype (for a more detailed review of SNPs, see Section 3.1.3 - Single nucleotide polymorphisms in RMPs).

Genetic linkage maps have also provided a better understanding of meiotic recombination in Plasmodium. The first rudimentary linkage map for P. falciparum based on RFLP markers showed that extensive recombination occurs during meiosis in the malaria parasite at a high rate of 15-30 kb/cM (meaning that genetic markers lying 15-30 kilo base-pairs apart undergo one crossover in every 100 meioses) [Walker-Jonah et al., 1992]. Later, high density linkage maps constructed using microsatellites and SNPs were able to provide more accurate recombination rates [Su et al., 1999a, Jiang et al., 2011, Miles et al., 2016]. The average recombination rate in P. falciparum is estimated to be 12.7-14.3 kb/cM, with lower rates near centromeres and subtelomeres [Miles et al., 2016].
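To make the kb/cM unit concrete, the following sketch converts a physical interval into a genetic distance and into an approximate number of crossovers expected over a given number of meioses. The function names are hypothetical, and the second function assumes a short interval over which one centimorgan corresponds to roughly one crossover per 100 meioses.

```python
def genetic_distance_cm(physical_kb, kb_per_cm):
    """Genetic distance (cM) spanned by a physical interval at a given kb/cM rate."""
    if kb_per_cm <= 0:
        raise ValueError("kb_per_cm must be positive")
    return physical_kb / kb_per_cm


def expected_crossovers(physical_kb, kb_per_cm, n_meioses):
    """Approximate number of meioses with a crossover inside the interval.

    Assumes a short interval, where 1 cM is about one crossover per 100 meioses.
    """
    return genetic_distance_cm(physical_kb, kb_per_cm) * n_meioses / 100


# Markers 15 kb apart at 15 kb/cM: about one crossover per 100 meioses.
print(expected_crossovers(15, 15, 100))

# The same 15 kb interval at the two ends of the 12.7-14.3 kb/cM range.
print(genetic_distance_cm(15, 12.7), genetic_distance_cm(15, 14.3))
```

A smaller kb/cM value therefore means more recombination per unit of physical distance.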

4.1.4 Classical linkage analysis in Plasmodium

With genetic crosses and genetic marker systems established in the malaria parasite, it became possible to follow the co-inheritance of a particular phenotype and the genetic loci controlling it in a cross progeny.

The first proper linkage study was carried out in P. falciparum to identify genetic loci linked to chloroquine resistance [Wellems et al., 1990, 1991]. A chloroquine sensitive HB3 strain and a resistant Dd2 strain were crossed and individual clones were obtained from the progeny by limiting dilution. Using RFLPs that differ between HB3 and Dd2, 16 such clones were identified as recombinants carrying unique genotypes, and upon treatment with chloroquine, half of them had a resistant phenotype. There was no accumulation of RFLP markers against the pfmdr1 and pfmdr2 genes in the drug resistant clones, implying that these genes were not linked to chloroquine resistance. A 400 kb region in chromosome 7 was instead implicated, since all chloroquine sensitive clones carried markers of the sensitive parent HB3 and all resistant clones carried markers of the resistant parent Dd2 within this region [Wellems et al., 1991].

This region was further narrowed down to a smaller, 36 kb-long determinant locus by screening for crossover events in a large number of progeny clones (1,120 clones) using microsatellite markers [Su et al., 1997]. It was subsequently found that mutations in a gene at this locus, pfcrt, encoding a vacuolar transmembrane protein, played a role in chloroquine resistance [Fidock et al., 2000]. Much later, two P. falciparum strains, 7G8 and GB4, which differ in their ability to infect Aotus monkeys, were crossed and, upon profiling 200 progeny clones with microsatellites, a putative erythrocyte binding protein, PfRH5, was found to be associated with the host recognition phenotype [Hayton et al., 2008]. It was clear from these studies that the resolution of a linkage analysis could be improved by analysing a large number of unique recombinant clones with more densely distributed molecular markers.

Gene expression can itself be considered a phenotypic trait and, by profiling the transcripts of genes in a cross progeny, the genetic basis of transcriptional regulation can be identified. This type of analysis, called expression quantitative trait loci (eQTL) mapping, was undertaken for 34 progeny clones from the P. falciparum HB3 X Dd2 cross, using microarrays to identify expression level polymorphisms among the recombinants [Gonzales et al., 2008]. The study was able to identify 14 prominent regulatory hotspots and structural variations like copy number variations influencing the transcription network of the parasite.

Linkage studies in RMPs started with those in P. chabaudi to identify genetic determinants of chloroquine [Carlton et al., 1998], sulphadoxine-pyrimethamine [Hayton et al., 2002] and mefloquine resistance [Cravo et al., 2003]. In each case, a strain selected for drug resistance was crossed with a sensitive strain and individual progeny clones were phenotyped for their susceptibility to the drug and genotyped using RFLP markers. Linkage analyses associated loci in chromosome 11 with chloroquine resistance (by profiling 20 hybrid clones) [Carlton et al., 1998], mutations in the dhfr gene with sulphadoxine/pyrimethamine resistance (24 hybrid clones) [Hayton et al., 2002] and pcmdr1 gene duplication with mefloquine resistance (16 hybrid clones) [Cravo et al., 2003]. Similarly, by typing 38 recombinant clones of a genetic cross between P. y. yoelii 17XNL (slow growth rate) and P. y. nigeriensis N67 (fast growth rate) using microsatellite markers, loci in chromosomes 13, 10 and 7 were found to be linked with variable growth rates [Li et al., 2011].

Genetic loci identified by linkage studies in RMPs were weakly resolved due to the relatively low number of clones and genetic markers used. As demonstrated by the chloroquine resistance linkage studies in P. falciparum [Wellems et al., 1991, Su et al., 1997], there is an inverse relationship between the measurable size of the genetic locus and the number of recombinant clones. Obtaining a large number of clones from P. falciparum in vitro cultures is straightforward but in RMPs, this step is long, laborious and expensive. A novel method, called Linkage Group Selection, was thus developed in order to overcome these limitations and identify genes linked to selectable phenotypes with high resolution and speed [Culleton et al., 2005]. The density of molecular markers in these studies was also improved by using thousands of genome-wide SNPs between the parents [Kinga Modrzynska et al., 2012, Abkallo et al., 2017].
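The inverse relationship between locus size and clone number can be sketched with a simple probability argument. If a crossover falls within an interval in a fraction r of independent recombinant clones, the chance that at least one of N clones carries such a crossover is 1 - (1 - r)^N. The function names, the independence of clones and the 95% target below are assumptions made for illustration.

```python
import math


def prob_crossover_observed(recomb_fraction, n_clones):
    """Chance that at least one of n independent clones has a crossover in the interval."""
    return 1 - (1 - recomb_fraction) ** n_clones


def clones_needed(recomb_fraction, target_prob=0.95):
    """Smallest number of clones giving at least target_prob of seeing a crossover."""
    if not 0 < recomb_fraction < 1 or not 0 < target_prob < 1:
        raise ValueError("both arguments must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - target_prob) / math.log(1 - recomb_fraction))


# Narrower intervals (smaller recombination fractions) need many more clones.
for r in (0.10, 0.01, 0.001):
    print(r, clones_needed(r))
```

In practice, progeny clones are not always unique or independent, so these numbers are best read as lower bounds.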

4.1.5 Linkage Group Selection (LGS)

In an LGS analysis, recombinant clones are not obtained from the cross progeny and each clone’s phenotype and genotype are not characterised individually. The uncloned progeny is placed under a selection pressure which represents a phenotype of interest (see Figure 4.1B). Under the selection pressure (for example, drug treatment), the “sensitive” parasites in the population get eliminated and the “resistant” parasites get enriched. The surviving progeny is then genotyped as a whole to assess the quantitative representation of genetic markers of both parents throughout the genome.

When the uncloned recombinant progeny is genotyped and markers are plotted onto the genome, each marker locus would show an equal presence of both parental alleles as a result of random and frequent recombination events in the recombinants. When a selection pressure is applied, those recombinants that have inherited the sensitive allele of the determinant gene would be depleted in the progeny. During recombination, markers close to each other are linked and are likely to be inherited together. Thus, markers around the determinant gene would also show a marked decrease in sensitive allele frequency, forming a linkage group. Within the linkage group, markers in close proximity to the selected locus are depleted more and those farther away are depleted less. This kind of distribution forms a “selection valley” around the selected locus with the base of the valley pinpointing it.
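The shape of a selection valley can be sketched under strong simplifying assumptions: only recombinant progeny from hybrid zygotes are considered, selection removes every parasite carrying the sensitive allele at the determinant locus, one round of meiosis has occurred, and crossovers follow Haldane's mapping function (no interference). Under these assumptions the sensitive allele frequency at a marker equals its recombination fraction with the selected locus. The function names and map positions are hypothetical.

```python
import math


def haldane_recombination_fraction(distance_cm):
    """Recombination fraction for a map distance in cM (Haldane's mapping function)."""
    return 0.5 * (1 - math.exp(-2 * distance_cm / 100))


def sensitive_allele_frequency(marker_cm, selected_cm):
    """Expected sensitive-allele frequency at a marker after complete selection.

    Survivors all carry the resistant allele at the selected locus, so a marker
    carries the sensitive allele only if it has recombined away from that locus.
    """
    return haldane_recombination_fraction(abs(marker_cm - selected_cm))


# Markers every 20 cM along a hypothetical 200 cM chromosome, locus at 100 cM.
positions = list(range(0, 201, 20))
valley = [sensitive_allele_frequency(x, 100) for x in positions]
for x, freq in zip(positions, valley):
    print(f"{x:>3} cM  {freq:.3f}")
```

The frequency is zero at the selected locus and rises towards 0.5 with distance, which gives the valley its characteristic shape; weaker selection or parental-type survivors would make the valley shallower.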

The size and shape of the linkage group depends on the number of recombination events (or the recombination rate) at a given locus and is intrinsic to the organism involved. On the other hand, the number of recombinants and genetic markers will help to accurately define the shape of the selection valley. The larger the number of recombinants, the greater the contrast between allele frequency changes closest to the driving locus and those farther away. A high density of markers would afford enough datapoints around the locus to accurately assay the allele frequency changes. Taken together, we require a good recombination rate, a high number of recombinants and a high density of genetic markers in order to achieve a selection valley of good resolution in an LGS experiment.

In the first study that sought to validate LGS, progeny from a genetic cross between pyrimethamine sensitive and resistant P. chabaudi strains were selected for drug-resistant recombinants by treatment with pyrimethamine, and subsequent genotyping using AFLP markers formed a selection valley with the dhfr gene at its base (as expected) [Culleton et al., 2005].

Following this, LGS was employed to study growth rate [Gadsby, 2008] and strain-specific immunity in P. chabaudi [Martinelli et al., 2005a, Pattaradilokrat et al., 2007] and growth rate differences in P. yoelii [Pattaradilokrat et al., 2009, Abkallo et al., 2017]. In [Martinelli et al., 2005a], a genetic cross was performed between two P. chabaudi isolates, AS and CB. Strain-specific immune selection, applied by growing the cross progeny in immunized mice, caused a change in msp1 allele proportions, thus identifying the msp1 gene as a controller of strain-specific immunity. In [Gadsby, 2008], P. chabaudi adami strains DS and DK were crossed and LGS analysis showed loci in chromosomes 6, 7 and 9 to be linked to the difference in their growth rates. Similarly, in [Pattaradilokrat et al., 2009], P. yoelii 17XYM and 33XC isolates were crossed and growth rate selection was applied by allowing fast growers in the progeny to outgrow slow growers. Linkage mapping revealed a region containing the pyebl gene, encoding an erythrocyte binding ligand, to be associated with host preference and growth rate differences between the isolates.

The power and accuracy of LGS were significantly improved through the use of quantitative SNP markers in place of AFLP markers [Kinga Modrzynska et al., 2012, Abkallo et al., 2017]. SNP discovery through whole genome re-sequencing has made the modified method significantly faster, and it also immediately reveals the underlying mutations that possibly determine the phenotype.

In [Kinga Modrzynska et al., 2012], LGS was used to identify loci conferring chloroquine resistance through the analysis of over 100,000 SNP markers that differentiated between the drug resistant AS-30CQ and sensitive AJ parental strains of P. chabaudi. Three selection valleys on chromosomes 11, 3 and 2 revealed mutations (an A173E substitution in the aat1 (amino acid transporter, putative) gene and a V2728F substitution in the ubp1 (deubiquitinating enzyme) gene) conferring chloroquine resistance in an additive manner.

A more sophisticated mathematical approach to identify putative loci was adopted in [Abkallo et al., 2017], leading to improved inference of selection valleys. Two loci, one of them containing the msp1 gene, were identified as linked to strain-specific immunity and a C351Y substitution in the pyebl gene was shown to change host cell preference and increase growth rate in P. yoelii.

An obvious limitation of the LGS method is that it can be applied only to study phenotypes for which a selection pressure can be applied. For example, two P. yoelii strains, N67 and YM, have distinct disease patterns: YM kills the host at around day 7, while N67 has two growth peaks, eventually killing its host at around day 15. However, since both strains are lethal, a selection pressure cannot be applied on their genetic cross progeny. A quantitative trait loci (QTL) analysis is more suitable here and was used to link a HECT-like E3 ubiquitin ligase to the observed growth dynamics of the cross progeny [Nair et al., 2017]. Also, being a genetics-based approach, LGS cannot be used to study inherited phenotypes that are driven by epigenetics.

In conclusion, the genetic analysis of malaria parasites has improved greatly over the years and has yielded important information regarding the parasite’s biology (for a brief overview of all the genetic studies and their findings, please see Appendix Table C). The underlying genetics of several key parasite phenotypes such as drug resistance, growth rate, host cell preference and interactions with host immunity have been identified. With the advent of advanced molecular biology techniques and the development of novel analysis methods, it is now possible to achieve base-level resolution in identifying phenotype-genotype relationships in these studies.

4.1.6 Reverse genetics in RMPs

Reverse genetics in Plasmodium is another field that has become an indispensable tool in malaria research and has been central to efforts aimed at understanding the molecular basis of parasite biology. Reverse genetics methods work in the opposite direction to classical genetics; that is, given a target gene, they seek to remove or alter its sequence and infer its function by studying any resulting change in phenotype. This kind of genetic manipulation of the malaria parasite includes knocking out gene expression, mutating sequences of genes and functional domains, introducing and expressing new genes, and tagging proteins with reporter proteins like green fluorescent protein (GFP) to track their movement and localization within the cell.

As Plasmodium genomes are haploid (except for a brief period in their lifecycle), all these strategies involve targeting a single gene. Foreign DNA material can be transfected into the malaria parasite by electroporation of blood stage parasites since they can be easily accessed and purified. Gene constructs within this foreign DNA can be expressed within the parasite, either transiently, where transgenes are readily expressed from the introduced plasmid DNA episomally, or stably, where the exogenous DNA gets integrated into the malaria genome with the help of homologous targeting sequences by single or double cross-over recombination.

The first demonstration of foreign DNA expression in a malaria parasite was in 1993, when a firefly luciferase gene was introduced and expressed as plasmid DNA via electroporation in P. gallinaceum [Goonewardene et al., 1993]. Transient expression of a chloramphenicol acetyltransferase gene construct in P. falciparum marked the first successful transfection of a human malaria parasite [Wu et al., 1995].

Stable gene expression was then achieved through integrating the dihydrofolate reductase-thymidylate synthase (dhfr-ts) gene into the parasite genome and transforming the parasites into a pyrimethamine resistant line [Crabb and Cowman, 1996, Wu and Kirkman, 1996]. A year later, a landmark study achieved the first gene knockout in P. falciparum by disrupting a gene encoding the knob-associated histidine-rich protein (KAHRP) [Crabb et al., 1997]. This gene knockout resulted in the absence of knob formation on the surface of infected erythrocytes and reduced cell adherence, thus suggesting that KAHRP is essential for knob formation in parasitized RBCs.

Since then, technologies for genetic manipulation have been continually developed in malaria parasites and have revolutionised the field in a short span of time [de Koning-Ward et al., 2000, 2015, Matz and Kooij, 2015]. Recently, more direct genome editing methods mediated by CRISPR-Cas9 [Ghorbal et al., 2014, Wagner et al., 2014] and zinc-finger nucleases [Straimer et al., 2012] have been utilised in P. falciparum. Conditional gene knockout systems [Collins et al., 2013, Knuepfer et al., 2017] to study functions of essential genes and a transposon mutagenesis system [Balu et al., 2009] for high-throughput functional screens have also been developed. The genetic manipulation techniques developed through these studies have been put to great use in the study of important biological phenomena in P. falciparum such as cell invasion [Baum et al., 2005, Stubbs et al., 2005, Maier et al., 2008, Lopaticki et al., 2011], cytoadherence [Crabb et al., 1997, Waterkeyn et al., 2000] and drug resistance [Fidock et al., 2000, Reed et al., 2000].

The vector and pre-erythrocytic stages of parasite development are inaccessible in P. falciparum for studying phenotypic changes through gene disruption. Moreover, the effect(s) of gene knockouts in P. falciparum can be studied only in vitro and therefore, its usefulness for studying host-parasite interactions is limited. Genetic manipulation in RMP models can overcome these hurdles and P. berghei has emerged as a very valuable model for this purpose. It has better transfection efficiency (10^-3 to 10^-2) [Janse et al., 2006b] than P. falciparum and several in vivo and in vitro assays are available to study the mosquito and liver stages in P. berghei [Limenitakis and Soldati-Favre, 2011, Matz and Kooij, 2015].

The P. berghei transfection system was developed in parallel with P. falciparum [van Dijk et al., 1995, 1996] and several improvements to the technique have been made to increase its efficiency [Janse et al., 2006b,a]. High throughput functional screens capable of studying multiple gene knockouts in parallel have been developed based on the efficient transfection system in P. berghei [Gomes et al., 2015, Bushell et al., 2017]. Stable transfections have also been demonstrated in P. yoelii [Mota et al., 2001, Jongco et al., 2006] and P. chabaudi [Spence et al., 2011], but not in P. vinckei. Together, these genetically modified RMPs (listed in RMgmDB - Rodent Malaria genetically modified DataBase [Janse et al., 2011, Khan et al., 2013]) have contributed immensely to the study of in vivo phenomena in malaria parasite biology such as parasite sequestration [Franke-Fayard et al., 2010], cerebral malaria [Engwerda et al., 2005, Hansen, 2012], host-immune responses [Hafalla et al., 2011], mosquito transmission [Tewari et al., 2010, Guttery et al., 2012, 2014, 2015, Bechtsi and Waters, 2017] and liver stage development [Prudencio et al., 2011].

However, as phenotypes and transfection efficiencies differ among the RMP species, suitable revisions to commonly used transfection protocols are required. For example, P. berghei-infected reticulocytes do not undergo schizogony when maintained in in vitro culture and these arrested schizonts can be purified in high numbers for electroporation. In contrast, P. chabaudi has a slower growth rate, and its schizonts rupture in vitro and sequester in host tissues in vivo, resulting in low numbers for electroporation. As a result, several modifications have been introduced in the transfection protocol for P. chabaudi, including the use of immuno-deficient mice as hosts [Spence et al., 2011]. Thus, assessing the feasibility of genetic manipulation in P. vinckei and setting up a suitable transfection protocol would be beneficial for future investigations using reverse genetics in P. vinckei.

4.2 Results and Discussion

4.2.1 Genetics in P. vinckei

Genetic analyses using RMP models have been carried out primarily in P. yoelii and P. chabaudi, made possible by the availability of reference genomes to develop genetic markers and genetically diverse strains that can cross-fertilize with relatively high frequency. In this work, we aim to fulfil both these requirements for P. vinckei; reference genomes for its five subspecies have been sequenced and genetic diversity among the 10 P. vinckei isolates has been established in the form of high-quality SNP markers (Chapter 3). What remains to be assessed is the ability of P. vinckei isolates to cross-fertilize and successfully produce recombinant progeny. We have also seen that growth rates of P. vinckei isolates differ, but, unlike in P. yoelii [Pattaradilokrat et al., 2009], this does not appear to be due to differences in host cell preference (all the isolates prefer normocytes) (Chapter 2). Therefore, two of this Chapter’s objectives are:

• To gauge the feasibility of a P. vinckei genetic cross and

• To identify genes controlling growth rate differences between strains of P. vinckei.

4.2.1.1 A P. vinckei genetic cross

Two isolates of the P. vinckei subspecies, PvsEH and PvsEL, that differ in their growth rates were chosen for performing a genetic cross (see Figure 4.2). PvsEH is the faster grower, reaching a peak parasitaemia of ∼70% at day 5 and becoming lethal to the host, while PvsEL is the slower grower, reaching a peak parasitaemia of ∼40% at day 5. The optimal transmission temperature for P. vinckei subsp. was characterized as 23-26°C in our study. A mixed inoculum containing equal proportions of PvsEH and PvsEL parasites was injected into CBA mice and the presence of gametocytes was confirmed by microscopy on day 3 post-infection. Since it was unclear how the gametocytaemia and parasitaemia on the day of mosquito feed might affect transmission, mosquito feeds were performed on both day 3 and day 4 post-infection to increase the chances of a successful transmission. These were treated as individual biological replicates (R1 and R2). For each replicate, around 160 female A. stephensi mosquitoes were allowed to take a blood meal from two anaesthetized mice at 24°C for 40 minutes without interruption.

Between 5 and 10 mosquito midguts were inspected on day 9 post-feed for the presence of oocysts and 100% infection was observed (all midguts inspected contained oocysts) for both day 3 and day 4 feeds. Around 25-100 oocysts were found per midgut in day 3 fed mosquitoes and 5-40 oocysts per midgut in day 4 fed mosquitoes (see Table 4.1). On day 12 post-feed, mature oocysts and also a high number of sporozoites were found in the midguts, but upon disrupting the salivary glands on day 20 post-feed, only a few sporozoites were found in the suspension.

Figure 4.2: Brief workflow of performing a genetic cross and Linkage Group Selection with P. vinckei isolates, PvsEH and PvsEL. A mixed inoculum containing equal proportions of PvsEH and PvsEL parasites was injected into CBA mice and mosquito feeds were performed on both day 3 (replicate 1) and day 4 (replicate 2) post-infection. For each replicate, around 160 female A. stephensi mosquitoes were allowed to take a blood meal from two anaesthetized mice at 24°C for 40 minutes without interruption. On day 9 post-feed, mosquito midguts contained oocysts and on day 20, around 60 mosquitoes in each replicate were dissected and salivary glands were crushed to release sporozoites. Sporozoites from each replicate were injected into one ICR mouse each and 5 days later, both replicates became positive for blood stage parasites. Four clones were obtained from replicate 2 by limiting dilution (inoculum of 0.6 parasites per 100 µL was injected into 10 ICR mice in total) and found to have both PvsEH and PvsEL alleles within the chromosomes, thus confirming that genetic crossing had taken place. The cross progeny from both replicates were passaged into 7 female CBA mice per replicate and parasites were harvested on day 3 post-infection (from 4 CBA mice) before growth selection and on day 5 (from 3 CBA mice) after growth selection had taken place. This was followed by DNA extraction, high-throughput sequencing, mapping of sequence reads onto the PvsEL reference genome and calling SNPs. A mathematical model was then applied to the allele frequencies to detect selective sweeps in the data caused by the growth selection pressure. (P.I - post-infection; P.F - post-feed; IV - intravenous injection.)

For each replicate, sporozoites obtained from ∼60 mosquitoes were injected into one ICR mouse and 5 days later, both replicates became positive for blood stage parasites. PCR amplification of a region within the polymorphic msp1 gene with isolate-specific primers confirmed the presence of both PvsEH and PvsEL msp1 alleles in the cross progeny. The relative proportions of PvsEH and PvsEL msp1 alleles were monitored by real-time quantitative PCR (qPCR) (see Appendix Figure E.3) on four occasions: i) on the day of the feed, ii) on day 7 post-feed in the oocysts, iii) when mice became positive for cross progeny blood stages and iv) before and after growth selection.

We observed a PvsEL : PvsEH msp1 allele ratio of around 5:1 in the mice on the day of the feed and in the oocysts, 2:1 in the cross progeny and around 1:1 before and after growth selection. qPCR showed that the isolates were not in equal proportions during the feed as desired, and the higher proportions of PvsEL continued through in the mosquito stages as well. This was probably due to handling error during inoculum preparation and since the blood samples were collected on the day of the feed and qPCR performed after the feed, we were unable to rectify the error in time. Nevertheless, the imbalance was reduced in the cross progeny and almost equal proportions were observed prior to applying the growth selection.

In order to confirm that a genetic cross had taken place, individual clones were obtained from one of the replicates, R2, by limiting dilution. Ten ICR mice were infected with an inoculum of 0.6 parasites/100 µL and 8 days post-infection, four mice were positive for parasites. These four clones were screened for the presence of both PvsEH and PvsEL alleles within the chromosomes.
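The limiting dilution outcome can be compared against a simple Poisson expectation. The sketch below assumes that the number of parasites delivered per inoculum is Poisson-distributed with a mean of 0.6; this assumption and the function names are illustrative only and are not part of the protocol described here.

```python
import math

def fraction_positive(mean_parasites):
    """Expected fraction of mice receiving at least one parasite (Poisson assumption)."""
    return 1 - math.exp(-mean_parasites)

def probability_clonal(mean_parasites):
    """Probability that a positive mouse received exactly one parasite (Poisson assumption)."""
    p_exactly_one = mean_parasites * math.exp(-mean_parasites)
    return p_exactly_one / fraction_positive(mean_parasites)

if __name__ == "__main__":
    mean = 0.6  # parasites per inoculum, as used for the PvsEH x PvsEL progeny
    print(f"expected fraction of positive mice: {fraction_positive(mean):.2f}")
    print(f"probability a positive mouse is clonal: {probability_clonal(mean):.2f}")
```

Under this assumption roughly 45% of the mice are expected to become positive, which is in line with the four of ten observed, and about 73% of positive mice would carry a single clone.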

Based on our genotype data, we chose one polymorphic gene on both ends of the 14 chromosomes (a total of 28 genes) and designed primers to amplify 600 to 1000 bp regions that contained isolate-specific SNPs (see Appendix Table D). Of the 28 SNP markers screened by Sanger sequencing, 11 were PvsEL-specific while the rest were of PvsEH origin, thus confirming that recombination had taken place (see Figure 4.3B). However, all four clones had the same pattern of recombination, which suggests that the diversity of recombinants in the cross progeny was low and that a single recombinant parasite may have undergone significant clonal expansion.

4.2.1.2 Low infectivity of P. vinckei petteri and P. vinckei subsp. sporozoites within the vector

Prior to the above experiment, three attempts were made to achieve a genetic cross between P. vinckei isolates (see Table 4.1). A transmission temperature of 25.5°C was tested for PvsEH and PvsEL isolates. The midguts became positive for good numbers of oocysts at day 12 post-feed but only 1-2 sporozoites were visible upon disrupting the salivary glands (of 60-80 mosquitoes per replicate) on day 18. Only one replicate (day 3 mosquito feed) turned positive for blood stage parasites on the 5th day after being injected with sporozoites and qPCR showed presence of only the PvsEH msp1 allele. This implied that either parental PvsEH or a recombinant carrying the PvsEH msp1 allele had been successfully transmitted, while parental PvsEL and recombinants with the PvsEL msp1 allele had failed to transmit. Since this clearly indicated a bias in the genetic composition of the progeny, the experiment was aborted. This was followed by the second attempt that was successful in transmitting both PvsEH and PvsEL isolates. As described earlier, this transmission was however suboptimal and only a few recombinants appear to have been present in the cross progeny.

Attempts 1 and 2: PvpCR + PvpBS; proportion in inoculum 1:2; sporozoites in salivary glands: not visible; infective: no.

Attempt | Day P.I of blood feed | Temperature (°C) | Day P.F midguts checked for oocysts | Number of oocysts per midgut | Day P.F of salivary glands dissected | No. of mosquitoes dissected
1 | 3 | 24 | 9 | ∼2 | 16 | 110
1 | 4 | 24 | 9 | ∼5 | 16 | 110
2 | 3 | 24 | 12 | 0 | 16 | 50-60
2 | 3 | 25.5 | 12 | 10 to 15 | 16 | 50-60
2 | 3 | 27.5 | 12 | 10 to 15 | 16 | 50-60
2 | 3 | 24-27, 12h cycle | 12 | 2 to 10 | 16 | 50-60

Attempts 3 and 4: PvsEL + PvsEH; proportion in inoculum 1:1; no. of mosquitoes dissected: 60-80; sporozoites in salivary glands: 2-7.

Attempt | Day P.I of blood feed | Temperature (°C) | Day P.F midguts checked for oocysts | Number of oocysts per midgut | Day P.F of salivary glands dissected | Infective?
3 | 3 | 25.5 | 12 | ∼50, 3/8 midguts | 18 | yes (only PvsEH)
3 | 4 | 25.5 | 12 | ∼100, 4/8 midguts | 18 | no
4 | 3 | 24 | 9 | 25-100, 5/5 midguts | 20 | yes
4 | 4 | 24 | 9 | 5-40, 10/10 midguts | 20 | yes

Table 4.1: Experimental details of the four attempts made to transmit and generate a genetic cross within two pairs of Plasmodium vinckei isolates. In all attempts, the gametocyte carriers/hosts were mice (CBA strain, female, 6-8 weeks old) and the vector was Anopheles stephensi (reared at 24°C).

Two attempts were made with P. v. petteri CR and BS isolates as they too showed different growth profiles, but they proved to be even harder to transmit. In the first attempt, mosquito feed was performed at 24°C on mice infected with a parasite mixture of PvpCR and PvpBS in the ratio 1:2. Only 2-5 oocysts were found in the midguts up to day 16 post-feed, no sporozoites were visible upon disruption of salivary glands and blood stage parasites failed to appear in the mice. We hypothesized that the temperature might be a factor for the failed transmission and tried four different transmission temperatures - 24°C, 25.5°C, 27.5°C and an alternating cycle of 24°C and 27°C every 12 hrs. However, all the experiments produced low numbers of oocysts, were negative for sporozoites after salivary gland disruption and produced no progeny within mice. Additionally, we also tried to transmit the isolates individually, but were unsuccessful (data not shown).

Thus, we encountered problems in transmitting four P. vinckei isolates through mosquitoes. Parasite development within the mosquito involves transition through well differentiated parasite forms that perform complex processes such as cell gliding and invading multiple cell types to surpass several physical barriers, any of which could be a limiting factor.

Zygotes develop into ookinetes that traverse the mosquito midgut and form oocysts. Oocysts mature and release sporozoites that then travel through the haemocoel and into the salivary glands by penetrating the salivary gland lumen.

Normal oocyst development was observed in the experiments though the oocyst numbers were very low for P. v. petteri. The oocysts also proceeded to mature to contain sporoblasts, which released thousands of sporozoites upon disruption of dissected midguts. However, upon disruption of the salivary glands, no sporozoites were visible in the P. v. petteri experiments and only one or two sporozoites in the P. v. subsp. experiments. Usually, gland disruption releases large numbers (thousands) of sporozoites, readily visible as thin hair-like structures under the microscope. Therefore, it may be hypothesised that optimal transmission was limited mainly by the inability of sporozoites to efficiently invade the salivary glands, following their maturation within oocysts.

These observations are similar to those in P. berghei where, while midgut infections occurred normally at a variety of temperatures, salivary gland invasion was optimal only at a particular temperature. In our case, however, we failed to achieve optimal transmission for P. v. petteri, both in terms of midgut and salivary gland infections, despite carrying out the experiments at four different temperature conditions within the previously identified sporogonic temperature range (24°-26°C) [Carter and Walliker, 1975].

Our study is the first ever attempt (to the best of our knowledge) to transmit P. vinckei subsp. isolates from Cameroon. Despite the low sporozoite yield, we have at least demonstrated that these sporozoites are infective and take 5 days to initiate microscopically detectable blood stage infections in mice. Still, it is unclear why we were not able to obtain a higher number of sporozoites in both P. v. subsp. and P. v. petteri transmission experiments.

P. v. petteri sporozoites have previously been shown to appear in the salivary glands on day 9 post-feed at temperatures between 24°-26°C but the number of sporozoites has not been reported [Carter and Walliker, 1975]. There is also no evidence to date of successful infection of vertebrate hosts with P. v. petteri sporozoites. Therefore, it is possible that 24°-26°C may not be the optimal temperature range for sporogony, and a more systematic effort is required to optimize P. v. petteri’s transmission.

Admittedly, our efforts here were not extensive enough in this direction, since our main objective was to study the genetic basis of growth rate differences in P. vinckei. The A. stephensi strain used in this study was able to efficiently transmit P. yoelii (data not shown) during the same period, suggesting that there were no defects in the mosquitoes used. Future efforts could include testing out other temperature ranges for sporogony and different vertebrate hosts that could probably boost gametocyte load and increase oocyst numbers. Alternatively, P. v. vinckei [Bafort, 1969, Killick-Kendrick, 1973b], P. v. lentum [Landau et al., 1970] and P. v. brucechwatti [Killick-Kendrick, 1975] have been successfully transmitted previously and might be more suitable for crossing experiments.

Figure 4.3: Allele distribution of P. vinckei isolates, PvsEH and PvsEL, after a genetic cross. A) The genome-wide PvsEL allele frequencies are shown in two independent crosses (R1 and R2) before (D3) and after (D5) growth selection. Growth selection resulted in abrupt, crude jumps in allele frequencies and we were unable to infer a clear selection valley. Different colours demarcate SNP positions in different chromosomes. B) Confirmation of the genetic cross by screening clones for presence of both PvsEH and PvsEL alleles using Sanger sequencing. Each column denotes each chromosome and the two rows represent the 2 markers on either end of each chromosome. Of the total 28 markers, 11 were of PvsEL origin (white) and 17 were of PvsEH origin (grey). The same profile was observed in all four clones.

4.2.1.3 Linkage Group Selection

The cross progeny from both replicates of the PvsEH X PvsEL genetic cross were passaged into seven female CBA mice per replicate and parasites were harvested on day 3 post-infection, when PvsEH and PvsEL theoretically have almost equal parasitaemia (samples before growth selection - R1D3 and R2D3), and on day 5, when PvsEH parasitaemia is almost double that of PvsEL (samples after growth selection - R1D5 and R2D5). DNA was extracted from all four samples (R1D3, R1D5, R2D3 and R2D5) and whole genome resequencing was performed. Around 29 to 65 million paired NGS reads were obtained from each sample in order to achieve a good sequencing depth for accurate SNP calling. Initially, NGS reads from parental PvsEH (∼37 million read pairs) were mapped against the PvsEL reference genome and SNPs were called with the following conditions: i) mapping quality greater than 30 (1 in 1000 reads would be wrongly aligned), ii) base quality greater than 20 (probability that the base is wrong is 0.01), iii) minimum number of supporting reads greater than 10 and iv) only homozygous calls, yielding a total of 58,274 high confidence SNP positions that distinguish PvsEH and PvsEL throughout the genome (coding and non-coding regions). Next, sequencing reads from the four samples were mapped onto the PvsEL reference genome, base calling was performed at the 58,274 SNP positions and the allele frequency at each position was calculated based on the number of reads supporting each allele.
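The quality thresholds and the allele frequency calculation described above can be illustrated with a short sketch. It uses the standard Phred relationship between a quality score Q and an error probability (10 to the power of -Q/10); the function names and the example read counts are hypothetical and are not taken from the pipeline used in this study.

```python
def phred_to_error_probability(quality):
    """Convert a Phred-scaled quality score into an error probability."""
    return 10 ** (-quality / 10)

def pvsel_allele_frequency(pvsel_reads, pvseh_reads):
    """Fraction of reads at a SNP position that support the PvsEL allele.

    Returns None when no reads cover the position.
    """
    depth = pvsel_reads + pvseh_reads
    if depth == 0:
        return None
    return pvsel_reads / depth

if __name__ == "__main__":
    # Mapping quality 30 and base quality 20, the thresholds used for SNP calling
    print(phred_to_error_probability(30))
    print(phred_to_error_probability(20))
    # Hypothetical position covered by 30 PvsEL-type and 10 PvsEH-type reads
    print(pvsel_allele_frequency(30, 10))
```

A mapping quality of 30 corresponds to an error probability of 0.001 and a base quality of 20 to 0.01, matching the figures quoted in the text.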

This was followed by a likelihood-ratio-based filtering step to remove erroneous alleles mapped to wrong locations in the genome. Unlike in the previous LGS dataset in P. yoelii [Abkallo et al., 2017], we observed a high degree of fluctuation in our allele frequencies within close intervals, resulting in a distinct “shadow” band (data not shown) in addition to the main allele frequency distribution. Therefore, the filtering step was modified from a learnt beta-binomial model to a more stringent binomial model that was able to remove most of the noise in the data.

Next, a jump diffusion method [Abkallo et al., 2017] was employed to identify jumps in the allele frequencies and the genome was split into regions demarcated by the inferred jump points. These regions were then compared against a constant allele frequency model to identify sites where a smooth allele frequency change has taken place, thus forming a selection valley. A total of 17 sites were inferred as selection valleys (7 sites in samples before selection and 10 sites in samples after selection), but on closer inspection, all the segments were sudden, crude jumps in allele frequencies and did not exhibit a smooth change characteristic of a selection valley (see Figure 4.3A).
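The idea of splitting a chromosome into regions at abrupt allele frequency changes can be illustrated with a deliberately simple threshold rule. This is not the jump diffusion model of Abkallo et al. [2017]; the function name, the threshold value and the example frequencies are hypothetical and serve only to show what "regions demarcated by jump points" means.

```python
def split_at_jumps(frequencies, threshold=0.2):
    """Split consecutive allele frequencies into segments at abrupt changes.

    A jump is declared wherever two adjacent frequencies differ by more than
    `threshold`. Segments are returned as (start, end) index pairs, end exclusive.
    """
    if not frequencies:
        return []
    segments = []
    start = 0
    for i in range(1, len(frequencies)):
        if abs(frequencies[i] - frequencies[i - 1]) > threshold:
            segments.append((start, i))
            start = i
    segments.append((start, len(frequencies)))
    return segments

if __name__ == "__main__":
    # Hypothetical PvsEL allele frequencies along one chromosome
    example = [0.50, 0.52, 0.51, 0.90, 0.88, 0.20]
    print(split_at_jumps(example))
```

A selection valley would instead appear as a gradual change within a segment, which a rule of this kind cannot detect; that is the role of the model comparison described above.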

The false positives caused by sudden jumps in allele frequencies required us to set a more conservative threshold (likelihood ratio of 1 per site instead of 0.1) to infer selection valleys and it is possible that genuine selection valleys would have been missed with this conservative threshold. Thus, in conclusion, we were unable to fit our data with the current mathematical model for LGS [Abkallo et al., 2017] and our analysis was not able to detect any selection valley caused by the growth selection pressure.

Resolution of a selection valley depends on the accumulation of a good number of recombination events in the region to ensure robust distribution of allele frequencies. It is, therefore, directly proportional to the number of recombinants in the cross progeny. Based on our observed oocyst numbers (the average being 50 oocysts per midgut), we could expect, on average, 6,000 unique recombinant genotypes (4 recombinants X 25 hybrid oocysts X 60 mosquitoes X 100% of the mosquitoes infected) in the cross progeny. However, cloning of the progeny yielded only one genotype, suggesting that the number of recombinants could be significantly lower than the theoretical expectation and that this particular genotype has undergone significant clonal expansion in the population.
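The theoretical expectation quoted above is a simple product, sketched here for clarity. The hybrid fraction of 0.5 is inferred from the text (25 hybrid oocysts out of an average of 50 per midgut); the function and parameter names are illustrative only.

```python
def expected_recombinant_genotypes(oocysts_per_midgut, hybrid_fraction,
                                   mosquitoes, infected_fraction,
                                   recombinants_per_oocyst=4):
    """Upper-bound estimate of unique recombinant genotypes in a cross."""
    hybrid_oocysts = oocysts_per_midgut * hybrid_fraction
    return (recombinants_per_oocyst * hybrid_oocysts
            * mosquitoes * infected_fraction)

if __name__ == "__main__":
    # 4 recombinants x 25 hybrid oocysts x 60 mosquitoes x 100% infected
    print(expected_recombinant_genotypes(50, 0.5, 60, 1.0))
```

With the values from the text this gives 6,000, the figure against which the single genotype recovered by cloning is compared.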

The presence of sudden jumps in allele frequencies in our LGS analysis without any discernible selection valley (see Figure 4.3A) also supports this hypothesis. The crossover events in chromosomes 2, 4, 10, 12 and 13 identified by Sanger sequencing in the recombinant clone are reflected as jumps in the allele frequency distribution in those chromosomes. Additionally, there are jumps in chromosomes 7 and 8, suggesting that there are at least one or two more recombinants present in the population, bringing the estimated number of recombinants to just three. Suboptimal transmission resulting in only a few sporozoites within the salivary glands could be the main reason for this sparse diversity.

Establishing genetics in Plasmodium vinckei could open up new possibilities to investigate genotype-phenotype relationships in Plasmodium. Of the present RMP species, only P. chabaudi and P. yoelii isolates have been used for genetic studies since P. berghei isolates are genetically very similar [Otto et al., 2014], rendering them unusable for linkage analysis. Phenotypic and genotypic characterization of P. vinckei carried out in this study provides ten additional isolates that could be used for this purpose. This would allow for existing points of enquiry like growth rate and strain-specific immunity to be further explored and also new phenotypes to be investigated.

One important phenotype that could be investigated is thermoregulation of sporogony in Plasmodium [Fang and McCutchan, 2002]. Parasite development into infective sporozoites within the mosquito is highly dependent on the ambient temperature and each RMP species has evolved to prefer a narrow temperature range: P. berghei, P. yoelii and P. chabaudi undergo optimal sporogony at 19-21°C, 22-24°C and 24-26°C respectively [Killick-Kendrick and Peters, 1978]. In P. vinckei alone, the optimal sporogony temperature differs between P. v. vinckei (20-21°C) and the other subspecies (24-26°C) [Killick-Kendrick and Peters, 1978], providing a unique opportunity to study this phenotype through a genetic cross between two P. vinckei subspecies.

Differences in the host immune response to parasites with different genotypes (strain-specific immunity) can be exploited as a selection pressure followed by linkage analysis to identify parasite antigens that interact with the host. With the availability of several genetically diverse isolates within the same species, a panel of such experiments could be carried out across multiple strains of the parasite and host, yielding many candidate genes with higher confidence. Artemisinin resistance could also potentially be studied, given the availability of a P. v. petteri line showing stable resistance to arteether [Puri and Chandra, 2006].

Naturally occurring growth rate differences among RMP isolates are among the most successfully studied phenotypes using linkage analysis. Growth rate has been directly linked to host cell preference, wherein a P. yoelii isolate that can invade both immature and mature RBCs grows faster than the isolate that can invade only immature RBCs [Pattaradilokrat et al., 2009, Abkallo et al., 2017]. A Duffy binding protein, PyEBL, has been shown to control this phenotype, with a single amino acid substitution within the protein altering the parasite’s host cell preference and growth rate. We observe similar differences in growth rates within P. vinckei isolates but all of them prefer to invade mature RBCs. This growth rate difference could be the result of a more subtle difference in host cell preference not readily identifiable by microscopy. Alternatively, there could be other unknown genetic determinants of growth rate for P. vinckei.

In order to answer this last question, we set out to produce a genetic cross between two Plasmodium vinckei isolates, PvsEH and PvsEL, and apply LGS to identify genetic factors associated with growth rate. Our effort was met with only partial success. We were able to show that P. vinckei isolates can cross-fertilize and produce a progeny of recombinant parasites and that, by selecting for the fast-growing recombinants, fluctuations could be induced in the allele frequencies that could potentially form selection valleys pinpointing the controlling genes.

A good degree of polymorphism was also found between the parental isolates, providing us with a high density of genetic markers - around 58,000 high-quality SNPs (almost 3 SNP sites for every 1000 basepairs), which is almost double the number of SNPs used in the latest LGS study in P. yoelii. However, we faced a severe reduction in the number of recombinants, possibly due to suboptimal transmission, which left us with only crude jumps in allele frequencies that did not form selection valleys. As stated in the previous section, future directions towards genetic analysis in P. vinckei include tests to assess if P. v. vinckei, P. v. lentum or P. v. brucechwatti are more efficient at transmission and a more comprehensive transmission experiment in P. v. petteri and P. v. subsp. to optimize their sporogony conditions.

4.2.2 Genetic manipulation in P. vinckei

Targeted mutation, knockout or introduction of particular genes continue to be powerful and often used techniques to assess gene function in the malaria parasite. The tractability of P. vinckei parasites for genetic manipulation is unknown and an assessment of the same can help future studies apply existing genetic modification techniques to P. vinckei. In order to demonstrate transfection and genetic modification in P. vinckei, we aimed to create a P. vinckei line that constitutively expresses a GFPLuc (green fluorescent protein-firefly luciferase) fusion protein, similar to previous studies done in P. berghei and P. yoelii [Miller et al., 2013, Franke-Fayard et al., 2004]. A recombination plasmid, pPvvCY-∆p230p-gfpLuc, was constructed to target and replace the dispensable wildtype P230p locus in P. v. vinckei CY (PVVCY 0300700) with a gene cassette encoding GFPLuc and a hdhfr (human dihydrofolate reductase) selectable marker cassette. Transfection of purified PvvCY schizonts with 20 µg of linearized pPvvCY-∆p230p-gfpLuc plasmid by electroporation, followed by marker selection using pyrimethamine, yielded pyrimethamine-resistant transfectant parasites (PvGFP-Luccon) on day 6 after drug treatment.

Figure 4.4: Genetic manipulation of P. vinckei vinckei CY to create PvGFP-Luccon, a PvvCY line constitutively expressing GFP-Luc fusion protein. A) The dispensable p230p locus of PvvCY (PVVCY 0300700) was targeted by a recombination plasmid - pPvvCY-∆p230p-gfpLuc to replace the gene with the gfpLuc cassette by double crossover recombination. B) Fluorescent live cell imaging shows expression of gfpLuc by blood-stage PvGFP-Luccon parasites. C) Fluorescent imaging shows expression of gfpLuc by PvGFP-Luccon oocysts in the mosquito midgut (magnification - 100X). Inset shows higher magnification view of a single oocyst present in the midgut, positive for gfp.

Stable transfectants were cloned by limiting dilution (0.3 parasites per mouse inoculum into 10 ICR mice) and plasmid integration in these clones was confirmed by PCR. PCR amplification of the 5' end yielded an expected band of ∼5.2 kb size confirming 5' integration. Similarly, an expected band of ∼1.6 kb size was amplified at the 3' end confirming 3' integration. Constitutive expression of GFPLuc in PvGFP-Luccon blood stage parasites was confirmed by fluorescence live cell imaging. GFPLuc expression in PvGFP-Luccon oocysts was confirmed by fluorescence imaging of mosquito midguts 7 days after blood meal from mice carrying PvGFP-Luccon parasites.

In conclusion, we have successfully created a genetically modified P. vinckei parasite line, demonstrating the feasibility of genetic manipulation in P. vinckei. This is advantageous for genetic studies in P. vinckei as gene candidates identified to be linked to a particular phenotype by linkage studies could then be knocked out or mutated to confirm the hypothesis.

4.3 Methods

4.3.1 Genetic cross

Frozen stabilates of P. vinckei subsp. EH and EL clones were thawed following the protocol described in Chapter I. The thawed clones were grown separately in donor ICR mice. Parasites were harvested from the donor mice and mixed to achieve a 1:1 ratio of PvsEH and PvsEL parasites, and 1 × 10⁶ parasites were inoculated intravenously into 4 female CBA mice. Three days after inoculation, the presence of gametocytes was confirmed microscopically and two infected CBA mice were anaesthetized and placed on two mosquito cages, each containing around 80 female Anopheles stephensi mosquitoes 7 to 12 days post emergence. Mosquitoes were allowed to feed on the mice without interruption for 40 minutes at 24°C. A second feed was performed on the 4th day post-inoculation with the other two CBA mice and two fresh cages of mosquitoes. The day 3 feed was treated as biological replicate 1 (R1) and the day 4 feed as biological replicate 2 (R2). Five to ten female mosquitoes were dissected on the 9th and 12th days after the blood meal to check for the presence of oocysts in the mosquito midguts. Twenty days after the blood meal, the mosquitoes were dissected and the salivary glands were removed, placed in 0.5-0.7 ml PBS solution and gently disrupted to release sporozoites. The suspensions from the R1 and R2 cages were injected intravenously into one and two ICR mice, respectively.

Once blood stage parasites were confirmed by microscopy in the ICR mice, the cross progeny were inoculated intravenously into groups of CBA mice at 1 × 10⁶ parasites per mouse for LGS experiments.
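The arithmetic behind preparing a 1:1 mixed inoculum of a fixed parasite dose can be sketched as follows. This is a minimal illustration rather than the protocol used in this study: the function names, the example parasitaemias and the default red blood cell density of 9 × 10⁶ cells per µL of mouse blood are all assumptions introduced here, and in practice the cell density would be measured for each donor.

```python
def blood_volume_ul(target_parasites: float, parasitaemia: float,
                    rbc_per_ul: float = 9.0e6) -> float:
    """Volume of donor blood (in µL) that contains `target_parasites`.

    `parasitaemia` is the fraction of red blood cells that are infected.
    `rbc_per_ul` is an ASSUMED red cell density, not a value from the text.
    """
    if not 0.0 < parasitaemia <= 1.0:
        raise ValueError("parasitaemia must be a fraction in (0, 1]")
    parasites_per_ul = parasitaemia * rbc_per_ul
    return target_parasites / parasites_per_ul


def one_to_one_mix(total_parasites: float, parasitaemia_a: float,
                   parasitaemia_b: float, rbc_per_ul: float = 9.0e6):
    """Blood volumes (µL) from donors A and B giving equal parasite numbers."""
    half = total_parasites / 2.0
    return (blood_volume_ul(half, parasitaemia_a, rbc_per_ul),
            blood_volume_ul(half, parasitaemia_b, rbc_per_ul))


if __name__ == "__main__":
    # Hypothetical donors: clone A at 10% and clone B at 5% parasitaemia.
    vol_a, vol_b = one_to_one_mix(1e6, 0.10, 0.05)
    print(f"clone A: {vol_a:.2f} uL, clone B: {vol_b:.2f} uL")
```

The donor with the lower parasitaemia contributes proportionally more blood, so the two clones enter the mixture in equal numbers regardless of how well each grew in its donor mouse.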

4.3.2 Linkage Group Selection

The cross progeny were subjected to growth rate selection pressure by allowing the parasites to grow for 5 days, by which time the parasitaemia of PvsEH is almost double that of PvsEL. Parasites were harvested on day 3 and day 5 post-inoculation to obtain the progeny population pre- and post-selection, respectively. DNA extraction and sequencing were performed for both populations as described in Chapter 3. Reads from the four progeny samples (R1D3, R1D5, R2D3 and R2D5) and from the PvsEH parent were mapped onto the PvsEL reference genome using BWA v0.7.5 [Li and Durbin, 2009] and SNPs were called using samtools mpileup [Li et al., 2009a]. SNPs were called under the following conditions: i) mapping quality greater than 30, ii) base quality greater than 20, iii) more than 10 supporting reads and iv) only homozygous calls. SNPs were filtered using a binomial model and a jump diffusion method was employed to identify jumps in the allele frequencies, as described in [Abkallo et al., 2017].
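The four SNP calling conditions listed above can be expressed as a simple filter, sketched here in Python. The record type, field names and functions are hypothetical and introduced only for illustration; they do not correspond to the samtools output format or to the scripts used in this study. The binomial filtering and jump diffusion analysis of Abkallo et al. [2017] are not reproduced; the final function only shows the kind of pre- versus post-selection allele frequency comparison that those methods operate on.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SnpCall:
    """Hypothetical summary of one candidate SNP (not a samtools format)."""
    chrom: str
    pos: int
    mapping_quality: float
    base_quality: float
    supporting_reads: int
    is_homozygous: bool


def passes_call_filters(call: SnpCall) -> bool:
    """Apply conditions i) to iv) from the text; all thresholds are strict."""
    return (call.mapping_quality > 30
            and call.base_quality > 20
            and call.supporting_reads > 10
            and call.is_homozygous)


def filter_calls(calls: Iterable[SnpCall]) -> List[SnpCall]:
    """Keep only the calls that satisfy every condition."""
    return [c for c in calls if passes_call_filters(c)]


def allele_frequency_shift(pre_alt: int, pre_total: int,
                           post_alt: int, post_total: int) -> float:
    """Change in parental allele frequency from pre- to post-selection."""
    if pre_total <= 0 or post_total <= 0:
        raise ValueError("read depth must be positive")
    return post_alt / post_total - pre_alt / pre_total


if __name__ == "__main__":
    calls = [
        SnpCall("chr01", 1200, 60, 35, 25, True),   # passes all conditions
        SnpCall("chr01", 4800, 30, 35, 25, True),   # mapping quality not > 30
        SnpCall("chr02", 950, 60, 35, 10, True),    # supporting reads not > 10
        SnpCall("chr03", 300, 60, 35, 40, False),   # heterozygous call
    ]
    kept = filter_calls(calls)
    print([(c.chrom, c.pos) for c in kept])
    print(allele_frequency_shift(30, 60, 45, 60))
```

Because every threshold is written as "greater than", a call sitting exactly on a threshold (for example, 10 supporting reads) is rejected in this sketch.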

4.3.3 Plasmid construction of pPvvCY-∆p230p-gfpLuc

The pPvvCY-∆p230p-gfpLuc plasmid was constructed using the MultiSite Gateway cloning system (Invitrogen). attB-flanked 5' and 3' homology arms were obtained by amplifying 800 bp regions upstream and downstream of PVVCY 0300700. The attB12-Pv-5homarm and attB41-3homarm fragments were subjected to independent BP recombination with pDONRP4-P1R (Invitrogen) to generate entry plasmids pENT12-5U and pENT41-3U, respectively. Similarly, the gfpLuc cassette from pL1063 was amplified and subjected to LR reaction to obtain pENT23-gfpLuc. The BP reaction was performed using the BP Clonase II enzyme mix (Invitrogen) according to the manufacturer's instructions.

4.3.4 Transfection

A P. vinckei vinckei CY schizont-enriched fraction was collected by differential centrifugation on 50% Nycodenz in incomplete RPMI1640 medium, and 20 µg of ApaI- and StuI-double digested linearized transfection constructs were electroporated into ∼1 × 10⁷ enriched schizonts using a Nucleofector device (Amaxa) with human T-cell solution under program U-33. Transfected parasites were injected intravenously into 7-week-old female ICR mice, which were treated with pyrimethamine in the drinking water (0.07 mg/mL) starting 24 hours later, for a period of 4-7 days. Drug-resistant parasites were cloned by limiting dilution: an inoculum of 0.3 parasites/100 µL was injected into 10 female ICR mice. Two clones were obtained and integration of the transfection constructs was confirmed by PCR amplification with a unique set of primers for the modified p230p gene locus.
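The rationale for a mean inoculum of 0.3 parasites per mouse can be illustrated with a Poisson model, sketched below. This is an idealised calculation rather than an analysis performed in this study: it assumes that the number of parasites in each inoculum is Poisson distributed and that every inoculated parasite establishes an infection, and the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k parasites in an inoculum with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def limiting_dilution_summary(mean_parasites: float, n_mice: int) -> dict:
    """Expected outcome of cloning by limiting dilution (idealised model)."""
    p_none = poisson_pmf(0, mean_parasites)
    p_single = poisson_pmf(1, mean_parasites)
    p_infected = 1.0 - p_none
    return {
        # Chance that a given mouse receives at least one parasite.
        "p_infected": p_infected,
        # Chance that an infected mouse received exactly one parasite.
        "p_clonal_given_infected": p_single / p_infected,
        # Expected number of infected mice in the group.
        "expected_infected_mice": n_mice * p_infected,
    }


if __name__ == "__main__":
    summary = limiting_dilution_summary(mean_parasites=0.3, n_mice=10)
    for name, value in summary.items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, about 26% of mice receive at least one parasite, so roughly two to three infections are expected among 10 mice, and about 86% of those infections start from a single parasite. Lowering the mean inoculum raises the probability that an infection is clonal, at the cost of fewer infected mice.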

Chapter 5

Concluding Remarks

Since 1948, when a wild rodent malaria parasite from the forests of Katanga in sub-Saharan Africa was propagated in laboratory mice for the first time [Vincke and Lips, 1948], a considerable amount of effort has gone into establishing RMPs as experimental models to study malaria. In a span of thirty years, several different RMP isolates had been collected, classified into four Plasmodium species and their complete life cycles characterized in detail [Killick-Kendrick and Peters, 1978]. Researchers also succeeded in performing genetic crosses between RMP isolates in laboratory settings, thereby establishing RMPs as tractable genetic models [Walliker et al., 1971]. Another breakthrough was made in 1995 with the successful stable transfection of P. berghei [van Dijk et al., 1995], opening up the enticing possibility of genetically modifying RMPs.

With the advent of genome sequencing, complete RMP genomes were published, making them more accessible as experimental models and suitable for undertaking high-throughput functional studies. All these efforts have paid off. RMPs are now indispensable malaria models regularly put to use in functional genetics to study, among other key areas, Plasmodium cell biology, malaria transmission, immune evasion and host-parasite interactions, and a significant share of our present knowledge about the malaria parasite can be attributed to studies using RMP models.

Given the success of already well-established RMP models, viz. P. berghei, P. chabaudi and P. yoelii, the collective research focus in this field has gradually shifted (as it should) from establishing RMPs as malaria models to putting them to use to study different aspects of malaria biology. Hence, interest in characterizing additional RMP isolates and making them available as models has waned, which is possibly the reason why, despite the collection of several Plasmodium vinckei isolates, this RMP species remains largely uncharacterized.

Genetic studies in RMPs have resulted in the identification of genes linked to drug resistance, growth rate and strain-specific immunity, but these are the cumulative result of just two P. c. chabaudi, two P. c. adami and three P. yoelii strains (apart from induced drug-resistant P. chabaudi lines) [Culleton et al., 2005, Pattaradilokrat et al., 2007, Gadsby, 2008, Pattaradilokrat et al., 2009, Abkallo et al., 2017]. More could be achieved with a collection of RMP isolates within which phenotypic differences exist and are identified, genotypic diversity is sufficient to provide molecular markers, and the isolates can be crossed with each other. P. chabaudi meets these requirements to a certain extent, with two known subspecies and significant genotypic diversity within their isolates [Otto et al., 2014]. The four P. vinckei subspecies, consisting of several isolates, along with additional P. vinckei isolates from Cameroon, with known genotypic diversity could be a more promising platform for genetic studies. With this as our working hypothesis, we set out to characterize the phenotypes and genotypes of ten P. vinckei isolates and demonstrate their tractability for experimental genetics.

The key findings of this dissertation are as follows: P. vinckei isolates differ significantly in their growth rates and could potentially be used to identify genes linked to parasite virulence. The growth profiles characterized in this study would also be a resource for those intending to use these isolates as models in their studies.

By implementing state-of-the-art sequencing technologies such as single molecule real-time sequencing and strand-specific RNA sequencing, we have generated five high-quality reference genomes for P. vinckei, produced genotype data for all ten isolates and profiled gene expression in one of the subspecies, P. v. vinckei, over the course of its intra-erythrocytic development cycle. The unfragmented nature of our genome assemblies allowed us to characterize genome re-arrangements within P. vinckei, this being the first time that large scale genome re-arrangements are reported between isolates of the same malaria species. These resources also lay the foundation for future omics-driven functional biology work in P. vinckei.

Gene annotations are the bedrock of omics-driven studies and errors in the annotation will affect subsequent downstream analyses. Using multiple lines of evidence to predict the gene models and having manually curated them, we have produced high quality P. vinckei gene models. With the telomeric regions well-resolved, we identified copy number variations in sub-telomeric multigene families, including an interesting expansion of pseudogenes within the erythrocyte membrane antigen 1 family.

By generating high quality SNP datasets, we have shown P. vinckei isolates to be genetically very diverse and have made available a resource of molecular markers for future linkage studies with these isolates.

Transcriptome data show stage-specific expression of genes, including that of multi-gene families, reminiscent of other Plasmodium species.

Taken together, we have thus made available a comprehensive genome resource for P. vinckei that will aid future research with this model. Apart from this, we also formulated a simplified and quick RNAseq protocol for RMPs that allows a larger number of replicates and timepoints in gene expression profiling at reduced cost, time and effort.

However, one of our prime objectives was to identify genes linked to growth rate differences in P. vinckei by applying Linkage Group Selection. We were hindered in this pursuit when repeated attempts to efficiently transmit P. vinckei petteri and P. vinckei subsp. parasites through the mosquito vector failed. Despite achieving a genetic cross, the inability of P. vinckei subsp. sporozoites to effectively invade the salivary glands under the stated experimental conditions reduced the number of recombinants in the cross progeny and resulted in a low resolution dataset within which we were unable to discern selection valleys.

This failure to obtain infective sporozoites upon disrupting salivary glands also prevented us from visualizing GFP-positive sporozoites in the salivary glands and luciferase enzyme-based bioluminescence in the liver stages of our modified P. v. vinckei CY line expressing GFPLuc. We consider these as the main limitations of this study and a more detailed transmission study with P. vinckei isolates is required to improve the transmission capability of this model. Nevertheless, we have shown that a genetic cross can be achieved between P. vinckei isolates and that the parasite is also amenable to transfection and genetic manipulation.

If the present hurdles in transmission are overcome and genetic linkage studies can be successfully applied to P. vinckei isolates, several exciting research questions can be pursued. Virulence-related genes could be identified, since P. vinckei offers three independent pairs of isolates differing in their growth rates. A genetic cross between isolates of contrasting optimal sporogony temperatures, followed by temperature selection of the cross progeny, could reveal genes linked to this phenotype. Since several isolates are at our disposal, strain-specific host immunity could be studied from both the parasite and the host perspectives by setting up infection panels across different mice and parasite strains.

To conclude, this study has built a comprehensive resource of Plasmodium vinckei isolates, their phenotypes and genotypes, and it is our hope that this work will provide the right tools to effectively use Plasmodium vinckei as a rodent malaria model for the study of Plasmodium biology.

REFERENCES

H. M. Abkallo, A. Martinelli, M. Inoue, A. Ramaprasad, P. Xangsayarath, J. Gitaka, J. Tang, K. Yahata, A. Zoungrana, H. Mitaka, A. Acharjee, P. P. Datta, P. Hunt, R. Carter, O. Kaneko, V. Mustonen, C. J. R. Illingworth, A. Pain, and R. Culleton. Rapid identification of genes controlling virulence and immunity in malaria parasites. PLOS Pathogens, 13(7):1–24, 07 2017. doi: 10.1371/journal.ppat.1006447. URL https://doi.org/10.1371/journal.ppat.1006447.

J. P. Adam, I. Landau, and A. Chabaud. Decouverte dans la region de brazzaville de rongeurs infectes par des plasmodium. Compte Rendu Hebdomadaire des Seances de l’Academie des Sciences, 263:140–141, 1966.

X. Adiconis, D. Borges-Rivera, R. Satija, D. S. DeLuca, M. A. Busby, A. M. Berlin, A. Sivachenko, D. A. Thompson, A. Wysoker, T. Fennell, A. Gnirke, N. Pochet, A. Regev, and J. Z. Levin. Comparative analysis of rna sequencing methods for degraded or low-input samples. Nat Methods, 10(7):623–9, 2013. ISSN 1548-7105 (Electronic) 1548-7091 (Linking). doi: 10.1038/nmeth.2483. URL https://www.ncbi.nlm.nih.gov/pubmed/23685885.

C. Amaratunga, S. Sreng, S. Suon, E. S. Phelps, K. Stepniewska, P. Lim, C. Zhou, S. Mao, J. M. Anderson, N. Lindegardh, H. Jiang, J. Song, X. Z. Su, N. J. White, A. M. Dondorp, T. J. Anderson, M. P. Fay, J. Mu, S. Duong, and R. M. Fairhurst. Artemisinin-resistant plasmodium falciparum in pursat province, western cambodia: a parasite clearance rate study. Lancet Infect Dis, 12(11):851–8, 2012. ISSN 1474-4457 (Electronic) 1473-3099 (Linking). doi: 10.1016/S1473-3099(12)70181-0. URL https://www.ncbi.nlm.nih.gov/pubmed/22940027.

R. Amato, O. Miotto, C. Woodrow, J. Almagro-Garcia, I. Sinha, S. Campino, D. Mead, E. Drury, M. Kekre, M. Sanders, A. Amambua-Ngwa, C. Amaratunga, L. Amenga-Etego, T.J.C Anderson, V. Andrianaranjaka, T. Apinjoh, E. Ashley, S. Auburn, G. A. Awandare, V. Baraka, A. Barry, M. F. Boni, S. Borrmann, T. Bousema, O. Branch, P. C. Bull, K. Chotivanich, D. J. Conway, A. Craig, N. P. Day, A. Djimdé, C. Dolecek, A. M. Dondorp, C. Drakeley, P. Duffy, D. F. Echeverri-Garcia, T. G. Egwang, R. M. Fairhurst, Md. A. Faiz, C. I. Fanello, T. T. Hien, A. Hodgson, M. Imwong, D. Ishengoma, P. Lim, C. Lon, J. Marfurt, K. Marsh, M. Mayxay, V. Mobegi, O. Mokuolu, J. Montgomery, I. Mueller, M. P. Kyaw, P. N Newton, F. Nosten, R. Noviyanti, A. Nzila, H. Ocholla, A. Oduro, M. Onyamboko, J. Ouedraogo, A. P. Phyo, C. V. Plowe, R. N. Price, S. Pukrittayakamee, M. Randrianarivelojosia, P. Ringwald, L. Ruiz, D. Saunders, A. Shayo, P. Siba, S. Takala-Harrison, T. N. Thanh, V. Thathy, F. Verra, N. J. White, Y. Htut, V. J. Cornelius, R. Giacomantonio, D. Muddyman, C. Henrichs, C. Malangone, D. Jyothi, R. D. Pearson, J. C. Rayner, G. McVean, K. Rockett, A. Miles, P. Vauterin, B. Jeffery, M. Manske, J. Stalker, B. MacInnis, D. P. Kwiatkowski, and KARMA Consortium. Genomic epidemiology of the current wave of artemisinin resistant malaria. bioRxiv, 2015. doi: 10.1101/019737. URL http://www.biorxiv.org/content/early/2015/05/24/019737.

H. R. Ansari, T. J. Templeton, A. K. Subudhi, A. Ramaprasad, J. Tang, F. Lu, R. Naeem, Y. Hashish, M. C. Oguike, E. D. Benavente, T. G. Clark, C. J. Sutherland, J. W. Barnwell, R. Culleton, J. Cao, and A. Pain. Genome-scale comparison of expanded gene families in plasmodium ovale wallikeri and plasmodium ovale curtisi with plasmodium malariae and with other plasmodium species. Int J Parasitol, 46(11):685–96, 2016. ISSN 1879-0135 (Electronic) 0020-7519 (Linking). doi: 10.1016/j.ijpara.2016.05.009. URL https://www.ncbi.nlm.nih.gov/pubmed/27392654.

E. A. Ashley, M. Dhorda, R. M. Fairhurst, C. Amaratunga, P. Lim, S. Suon, S. Sreng, J. M. Anderson, S. Mao, B. Sam, C. Sopha, C. M. Chuor, C. Nguon, S. Sovannaroth, S. Pukrittayakamee, P. Jittamala, K. Chotivanich, K. Chutasmit, C. Suchatsoonthorn, R. Runcharoen, T. T. Hien, N. T. Thuy-Nhien, N. V. Thanh, N. H. Phu, Y. Htut, K. T. Han, K. H. Aye, O. A. Mokuolu, R. R. Olaosebikan, O. O. Folaranmi, M. Mayxay, M. Khanthavong, B. Hongvanthong, P. N. Newton, M. A. Onyamboko, C. I. Fanello, A. K. Tshefu, N. Mishra, N. Valecha, A. P. Phyo, F. Nosten, P. Yi, R. Tripura, S. Borrmann, M. Bashraheil, J. Peshu, M. A. Faiz, A. Ghose, M. A. Hossain, R. Samad, M. R. Rahman, M. M. Hasan, A. Islam, O. Miotto, R. Amato, B. MacInnis, J. Stalker, D. P. Kwiatkowski, Z. Bozdech, A. Jeeyapant, P. Y. Cheah, T. Sakulthaew, J. Chalk, B. Intharabut, K. Silamut, S. J. Lee, B. Vihokhern, C. Kunasol, M. Imwong, J. Tarning, W. J. Taylor, S. Yeung, C. J. Woodrow, J. A. Flegg, D. Das, J. Smith, M. Venkatesan, C. V. Plowe, K. Stepniewska, P. J. Guerin, A. M. Dondorp, N. P. Day, and N. J. White. Spread of artemisinin resistance in plasmodium falciparum malaria. N Engl J Med, 371(5):411–23, 2014. ISSN 1533-4406 (Electronic) 0028-4793 (Linking). doi: 10.1056/NEJMoa1314981. URL https://www.ncbi.nlm.nih.gov/pubmed/25075834.

K. Baer, C. Klotz, S. H. Kappe, T. Schnieder, and U. Frevert. Release of hepatic plasmodium yoelii merozoites into the pulmonary microvasculature. PLoS Pathog, 3(11):e171, 2007. ISSN 1553-7374 (Electronic) 1553-7366 (Linking). doi: 10.1371/journal.ppat.0030171. URL https://www.ncbi.nlm.nih.gov/pubmed/17997605.

J Bafort. Le cycle biologique du plasmodium vinckei de nigeria. Multicolloque Europeen de Parasitologie, ler., Rennes, pages 235–237, 1971a.

J. M. Bafort. [biological cycle of plasmodium v. vinckei rodhain 1952]. Ann Soc Belges Med Trop Parasitol Mycol, 49(6):533–628, 1969. ISSN 0037-9638 (Print) 0037-9638 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/5403133.

J. M. Bafort. The biology of rodent malaria with particular reference to plasmodium vinckei vinckei rodhain 1952. Ann Soc Belges Med Trop Parasitol Mycol, 51(1):5–203, 1971b. ISSN 0037-9638 (Print) 0037-9638 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/4398609.

J. M. Bafort. New isolations of murine malaria in Africa; Cameroon. In 5th International Congress of Protozoology, page 343, 1977.

E. S. Balakirev and F. J. Ayala. Pseudogenes: are they "junk" or functional dna? Annu Rev Genet, 37:123–51, 2003. ISSN 0066-4197 (Print) 0066-4197 (Linking). doi: 10.1146/annurev.genet.37.040103.103949. URL https://www.ncbi.nlm.nih.gov/pubmed/14616058.

B. Balu, C. Chauhan, S. P. Maher, D. A. Shoue, J. C. Kissinger, M. J. Fraser, and J. H. Adams. piggyBac is an effective tool for functional analysis of the Plasmodium falciparum genome. BMC Microbiology, 9:83, may 2009. ISSN 1471-2180. doi: 10.1186/1471-2180-9-83. URL http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2686711/.

A. Bateman, L. Coin, R. Durbin, R. D. Finn, V. Hollich, S. Griffiths-Jones, A. Khanna, M. Marshall, S. Moxon, E. L. Sonnhammer, D. J. Studholme, C. Yeats, and S. R. Eddy. The pfam protein families database. Nucleic Acids Res, 32(Database issue):D138–41, 2004. ISSN 1362-4962 (Electronic) 0305-1048 (Linking). doi: 10.1093/nar/gkh121. URL http://www.ncbi.nlm.nih.gov/pubmed/14681378.

L. A. Baton and L. C. Ranford-Cartwright. Spreading the seeds of million-murdering death: metamorphoses of malaria in the mosquito. Trends in Parasitology, 21(12):573–580, 2005. ISSN 1471-4922. doi: http://dx.doi.org/10.1016/j.pt.2005.09.012. URL http://www.sciencedirect.com/science/article/pii/S1471492205002874.

J. Baum, A. G. Maier, R. T. Good, K. M. Simpson, and A. F. Cowman. Invasion by P. falciparum merozoites suggests a hierarchy of molecular interactions. PLoS Pathogens, 1(4):0299–0309, 2005. ISSN 15537366. doi: 10.1371/journal.ppat.0010037.

C. M. Beall, G. M. Brittenham, K. P. Strohl, J. Blangero, S. Williams-Blangero, M. C. Goldstein, M. J. Decker, E. Vargas, M. Villena, R. Soria, A. M. Alarcon, and C. Gonzales. Hemoglobin concentration of high-altitude tibetans and bolivian aymara. Am J Phys Anthropol, 106(3):385–400, 1998. ISSN 0002-9483 (Print) 0002-9483 (Linking). doi: 10.1002/(SICI)1096-8644(199807)106:3<385::AID-AJPA10>3.0.CO;2-X. URL https://www.ncbi.nlm.nih.gov/pubmed/9696153.

D. P. Bechtsi and A. P. Waters. Genomics and epigenetics of sexual commitment in plasmodium. Int J Parasitol, 2017. ISSN 1879-0135 (Electronic) 0020-7519 (Linking). doi: 10.1016/j.ijpara.2017.03.002. URL https://www.ncbi.nlm.nih.gov/pubmed/28455236.

J. C. Beier. Malaria parasite development in mosquitoes. Annu Rev Entomol, 43:519–43, 1998. ISSN 0066-4170 (Print) 0066-4170 (Linking). doi: 10.1146/annurev.ento.43.1.519. URL https://www.ncbi.nlm.nih.gov/pubmed/9444756.

M. Bernabeu, F. J. Lopez, M. Ferrer, L. Martin-Jaular, A. Razaname, G. Corradin, A. G. Maier, H. A. Del Portillo, and C. Fernandez-Becerra. Functional analysis of plasmodium vivax vir proteins reveals different subcellular localizations and cytoadherence to the icam-1 endothelial receptor. Cell Microbiol, 14(3):386–400, 2012. ISSN 1462-5822 (Electronic) 1462-5814 (Linking). doi: 10.1111/j.1462-5822.2011.01726.x. URL https://www.ncbi.nlm.nih.gov/pubmed/22103402.

A. M. Bolger, M. Lohse, and B. Usadel. Trimmomatic: a flexible trimmer for illumina sequence data. Bioinformatics, 30(15):2114, 2014. doi: 10.1093/bioinformatics/btu170. URL http://dx.doi.org/10.1093/bioinformatics/btu170.

Z. Bozdech, M. Llinas, B. L. Pulliam, E. D. Wong, J. Zhu, and J. L. DeRisi. The transcriptome of the intraerythrocytic developmental cycle of plasmodium falciparum. PLoS biology, 1(1):E5, 2003a. ISSN 1545-7885 (Electronic) 1544-9173 (Linking). doi: 10.1371/journal.pbio.0000005. URL http://www.ncbi.nlm.nih.gov/pubmed/12929205.

Z. Bozdech, J. Zhu, M. P. Joachimiak, F. E. Cohen, B. Pulliam, and J. L. DeRisi. Expression profiling of the schizont and trophozoite stages of plasmodium falciparum with a long-oligonucleotide microarray. Genome Biol, 4(2):R9, 2003b. ISSN 1474-760X (Electronic) 1474-7596 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/12620119.

Z. Bozdech, S. Mok, G. Hu, M. Imwong, A. Jaidee, B. Russell, H. Ginsburg, F. Nosten, N. P. Day, N. J. White, J. M. Carlton, and P. R. Preiser. The transcriptome of plasmodium vivax reveals divergence and diversity of transcriptional regulation in malaria parasites. Proc Natl Acad Sci U S A, 105(42):16290–5, 2008. ISSN 1091-6490 (Electronic) 0027-8424 (Linking). doi: 10.1073/pnas.0807404105. URL https://www.ncbi.nlm.nih.gov/pubmed/18852452.

P. Brasil, M. G. Zalis, A. de Pina-Costa, A. M. Siqueira, C. B. Júnior, S. Silva, A. L. L. Areas, M. Pelajo-Machado, D. A. M. de Alvarenga, A. C. F. da Silva Santelli, H. G. Albuquerque, P. Cravo, F. V. Santos de Abreu, C. L. Peterka, G. M. Zanini, M. C. Suárez Mutis, A. Pissinatti, R. Lourenço-de Oliveira, Cristiana F. A. de Brito, M. de Fátima Ferreira-da-Cruz, R. Culleton, and C. T. Daniel-Ribeiro. Outbreak of human malaria caused by Plasmodium simium in the Atlantic Forest in Rio de Janeiro: a molecular epidemiological investigation. The Lancet Global Health, sep 2017. ISSN 2214-109X. doi: 10.1016/S2214-109X(17)30333-9. URL http://dx.doi.org/10.1016/S2214-109X(17)30333-9.

LJ Bruce-Chwatt and FD Gibson. A plasmodium from a nigerian rodent. Trans R Soc Trop Med Hyg, 49:9, 1955.

T. Brugat, D. Cunningham, J. Sodenkamp, S. Coomes, M. Wilson, P. J. Spence, W. Jarra, J. Thompson, C. Scudamore, and J. Langhorne. Sequestration and histopathology in plasmodium chabaudi malaria are influenced by the immune response in an organ-specific manner. Cellular Microbiology, 16(5):687–700, 2014. ISSN 1462-5822. doi: 10.1111/cmi.12212. URL http://dx.doi.org/10.1111/cmi.12212.

T. Brugat, A. J. Reid, J. W. Lin, D. Cunningham, I. Tumwine, G. Kushinga, S. McLaughlin, P. Spence, U. Bohme, M. Sanders, S. Conteh, E. Bushell, T. Metcalf, O. Billker, P. E. Duffy, C. Newbold, M. Berriman, and J. Langhorne. Antibody-independent mechanisms regulate the establishment of chronic plasmodium infection. Nat Microbiol, 2:16276, 2017. ISSN 2058-5276 (Electronic) 2058-5276 (Linking). doi: 10.1038/nmicrobiol.2016.276. URL https://www.ncbi.nlm.nih.gov/pubmed/28165471.

E. Bushell, A. R. Gomes, T. Sanderson, B. Anar, G. Girling, C. Herd, T. Metcalf, K. Modrzynska, F. Schwach, R. E. Martin, M. W. Mather, G. I. McFadden, L. Parts, G. G. Rutledge, A. B. Vaidya, K. Wengelnik, J. C. Rayner, and O. Billker. Functional Profiling of a Plasmodium Genome Reveals an Abundance of Essential Genes. Cell, 170(2):260–272.e8, sep 2017. ISSN 0092-8674. doi: 10.1016/j.cell.2017.06.030. URL http://dx.doi.org/10.1016/j.cell.2017.06.030.

M. S. Campbell, C. Holt, B. Moore, and M. Yandell. Genome annotation and curation using maker and maker-p. Curr Protoc Bioinformatics, 48:4.11.1–4.11.39, 2014. ISSN 1934-340X (Electronic) 1934-3396 (Linking). doi: 10.1002/0471250953.bi0411s48. URL http://www.ncbi.nlm.nih.gov/pubmed/25501943.

B. L. Cantarel, I. Korf, S. M. Robb, G. Parra, E. Ross, B. Moore, C. Holt, A. Sanchez Alvarado, and M. Yandell. Maker: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res, 18(1):188–96, 2008. ISSN 1088-9051 (Print) 1088-9051 (Linking). doi: 10.1101/gr.6743907. URL https://www.ncbi.nlm.nih.gov/pubmed/18025269.

S. Capella-Gutierrez, J. M. Silla-Martinez, and T. Gabaldon. trimal: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics, 25(15):1972–3, 2009. ISSN 1367-4811 (Electronic) 1367-4803 (Linking). doi: 10.1093/bioinformatics/btp348. URL https://www.ncbi.nlm.nih.gov/pubmed/19505945.

J. Carlton, M. Mackinnon, and D. Walliker. A chloroquine resistance locus in the rodent malaria parasite plasmodium chabaudi. Mol Biochem Parasitol, 93(1):57–72, 1998. ISSN 0166-6851 (Print) 0166-6851 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/9662028.

J. Carlton, S. Angiuoli, B. Suh, T. Kooij, M. Pertea, J. Silva, M. Ermolaeva, J. Allen, J. Selengut, H. Koo, J. Peterson, M. Pop, D. Kosack, M. Shumway, S. Bidwell, S. Shallom, S. van Aken, S. Riedmuller, T. Feldblyum, J. Cho, J. Quackenbush, M. Sedegah, A. Shoaibi, L. Cummings, L. Florens, J. Yates, J. Raine, R. Sinden, M. Harris, D. Cunningham, P. Preiser, L. Bergman, A. Vaidya, L. van Lin, C. Janse, A. Waters, H. Smith, O. White, S. Salzberg, J. Venter, C. Fraser, S. Hoffman, M. Gardner, and D. Carucci. Genome sequence and comparative analysis of the model rodent malaria parasite plasmodium yoelii yoelii. Nature, 419(6906):512–519, 2002. ISSN 0028-0836. doi: 10.1038/nature01099. URL http://dx.doi.org/10.1038/nature01099.

J. Carlton, J. Silva, and N. Hall. The genome of model malaria parasites, and comparative genomics. Current Issues in Molecular Biology, 7(1):23–38, 2005. ISSN 14673037 (ISSN). URL http://www.scopus.com/inward/record.url?eid=2-s2.0-9444263051&partnerID=40&md5=c002e1fed01b32d227cbd8c78975be72.

J. Carlton, J. Adams, J. Silva, S. Bidwell, H. Lorenzi, E. Caler, J. Crabtree, S. Angiuoli, E. Merino, P. Amedeo, Q. Cheng, R. Coulson, B. Crabb, H. Del Portillo, K. Essien, T. Feldblyum, C. Fernandez-Becerra, P. Gilson, A. Gueye, X. Guo, S. Kang’a, T. Kooij, M. Korsinczky, E. Meyer, V. Nene, I. Paulsen, O. White, S. Ralph, Q. Ren, T. Sargeant, S. Salzberg, C. Stoeckert, S. Sullivan, M. Yamamoto, S. Hoffman, J. Wortman, M. Gardner, M. Galinski, J. Barnwell, and C. Fraser-Liggett. Comparative genomics of the neglected human malaria parasite plasmodium vivax. Nature, 455(7214):757–763, 2008. ISSN 0028-0836. doi: 10.1038/nature07327. URL http://dx.doi.org/10.1038/nature07327.

J. M. Carlton, K. Hayton, P. V. Cravo, and D. Walliker. Of mice and malaria mutants: unravelling the genetics of drug resistance using rodent malaria models. Trends Parasitol, 17(5):236–42, 2001. ISSN 1471-4922 (Print) 1471-4922 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/11323308.

R. Carter. Enzyme variation in plasmodium berghei. Transactions of the Royal Society of Tropical Medicine and Hygiene, 64(3):401–406, 1970. ISSN 0035-9203. doi: http://dx.doi.org/10.1016/0035-9203(70)90176-8. URL http://www.sciencedirect.com/science/article/pii/0035920370901768.

R. Carter. Enzyme variation in plasmodium berghei and plasmodium vinckei. Parasitology, 66(2):297–307, 1973. ISSN 0031-1820 (Print) 0031-1820 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/4595114.

R. Carter. Studies on enzyme variation in the murine malaria parasites plasmodium berghei, p. yoelii, p. vinckei and p. chabaudi by starch gel electrophoresis. Parasitology, 76(3):241–67, 1978. ISSN 0031-1820 (Print) 0031-1820 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/351525.

R. Carter and K. N. Mendis. Evolutionary and historical aspects of the burden of malaria. Clin Microbiol Rev, 15(4):564–94, 2002. ISSN 0893-8512 (Print) 0893-8512 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/12364370.

R. Carter and D. Walliker. New observations on the malaria parasites of rodents of the central african republic - plasmodium vinckei petteri subsp. nov. and plasmodium chabaudi landau, 1965. Ann Trop Med Parasitol, 69(2):187–96, 1975. ISSN 0003-4983 (Print) 0003-4983 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/1155987.

R. Carter and D. Walliker. Malaria parasites of rodents of the congo (brazzaville): Plasmodium chabaudi adami subsp. nov. and plasmodium vinckei lentum landau, michel, adam and boulard, 1970. Ann Parasitol Hum Comp, 51(6):637–46, 1976. ISSN 0003-4150 (Print) 0003-4150 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/800328.

T. Carver, S. R. Harris, M. Berriman, J. Parkhill, and J. A. McQuillan. Artemis: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data. Bioinformatics, 28(4):464–9, 2012. ISSN 1367-4811 (Electronic) 1367-4803 (Linking). doi: 10.1093/bioinformatics/btr703. URL http://www.ncbi.nlm.nih.gov/pubmed/22199388.

T. J. Carver, K. M. Rutherford, M. Berriman, M. A. Rajandream, B. G. Barrell, and J. Parkhill. Act: the artemis comparison tool. Bioinformatics, 21(16):3422–3, 2005. ISSN 1367-4803 (Print) 1367-4803 (Linking). doi: 10.1093/bioinformatics/bti553. URL http://www.ncbi.nlm.nih.gov/pubmed/15976072.

R. Chandra, S. Kumar, and S. K. Puri. Plasmodium vinckei: infectivity of arteether-sensitive and arteether-resistant parasites in different strains of mice. Parasitol Res, 109(4):1143–9, 2011. ISSN 1432-1955 (Electronic) 0932-0113 (Linking). doi: 10.1007/s00436-011-2358-8. URL http://www.ncbi.nlm.nih.gov/pubmed/21479576.

L. M. Claire, G. B. James, and M. Kevin. Clinical features and pathogenesis of severe malaria. Trends in Parasitology, 20(12):597–603, 2004. ISSN 1471-4922. doi: http://dx.doi.org/10.1016/j.pt.2004.09.006. URL http://www.sciencedirect.com/science/article/pii/S147149220400251X.

C. R. Collins, S. Das, E. H. Wong, N. Andenmatten, R. Stallmach, F. Hackett, J. Herman, S. Müller, M. Meissner, and M. J. Blackman. Robust inducible cre recombinase activity in the human malaria parasite plasmodium falciparum enables efficient gene deletion within a single asexual erythrocytic growth cycle. Molecular Microbiology, 88(4):687–701, 2013. ISSN 1365-2958. doi: 10.1111/mmi.12206. URL http://dx.doi.org/10.1111/mmi.12206.

A. F. Cowman, D. Berry, and J. Baum. The cellular and molecular basis for malaria parasite invasion of the human red blood cell. J Cell Biol, 198(6):961–71, 2012. ISSN 1540-8140 (Electronic) 0021-9525 (Linking). doi: 10.1083/jcb.201206112. URL https://www.ncbi.nlm.nih.gov/pubmed/22986493.

F. E. Cox. History of the discovery of the malaria parasites and their vectors. Parasit Vectors, 3(1):5, 2010. ISSN 1756-3305 (Electronic) 1756-3305 (Linking). doi: 10.1186/1756-3305-3-5. URL http://www.ncbi.nlm.nih.gov/pubmed/20205846 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2825508/pdf/1756-3305-3-5.pdf.

J. Cox-Singh, T. M. E. Davis, K. Lee, S. S. G. Shamsul, A. Matusop, S. Ratnam, H. A. Rahman, D. J. Conway, and B. Singh. Plasmodium knowlesi malaria in humans is widely distributed and potentially life threatening. Clinical Infectious Diseases, 46(2):165, 2008. doi: 10.1086/524888. URL http://dx.doi.org/10.1086/524888.

B. S. Crabb and A. F. Cowman. Characterization of promoters and stable transfection by homologous and nonhomologous recombination in Plasmodium falciparum. Proceedings of the National Academy of Sciences of the United States of America, 93(July):7289–7294, 1996. ISSN 00278424. doi: 10.1073/pnas.93.14.7289.

B. S. Crabb, B. M. Cooke, J. C. Reeder, R. F. Waller, S. R. Caruana, K. M. Davern, M. E. Wickham, G. V. Brown, R. L. Coppel, and A. F. Cowman. Targeted Gene Disruption Shows That Knobs Enable Malaria-Infected Red Cells to Cytoadhere under Physiological Shear Stress. Cell, 89(2):287–296, 1997. ISSN 00928674. doi: 10.1016/S0092-8674(00)80207-X. URL http://www.sciencedirect.com/science/article/pii/S009286740080207X.

P. V. Cravo, J. M. Carlton, P. Hunt, L. Bisoni, R. A. Padua, and D. Walliker. Genetics of mefloquine resistance in the rodent malaria parasite plasmodium chabaudi. Antimicrob Agents Chemother, 47(2):709–18, 2003. ISSN 0066-4804 (Print) 0066-4804 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/12543682.

R. Culleton, A. Martinelli, P. Hunt, and R. Carter. Linkage group selection: rapid gene discovery in malaria parasites. Genome Res, 15(1):92–7, 2005. ISSN 1088-9051 (Print) 1088-9051 (Linking). doi: 10.1101/gr.2866205. URL https://www.ncbi.nlm.nih.gov/pubmed/15632093.

R. L. Culleton and H. M. Abkallo. Malaria parasite genetics: doing something useful. Parasitol Int, 2014. ISSN 1873-0329 (Electronic) 1383-5769 (Linking). doi: 10.1016/j.parint.2014.07.006. URL http://www.ncbi.nlm.nih.gov/pubmed/25073068.

D. Cunningham, J. Lawton, W. Jarra, P. Preiser, and J. Langhorne. The pir multigene family of plasmodium: antigenic variation and beyond. Mol Biochem Parasitol, 170(2):65–73, 2010. ISSN 1872-9428 (Electronic) 0166-6851 (Linking). doi: 10.1016/j.molbiopara.2009.12.010. URL https://www.ncbi.nlm.nih.gov/pubmed/20045030.

D. A. Cunningham, W. Jarra, S. Koernig, J. Fonager, D. Fernandez-Reyes, J. E. Blythe, C. Waller, P. R. Preiser, and J. Langhorne. Host immunity modulates transcriptional changes in a multigene family (yir) of rodent malaria. Mol Microbiol, 58(3):636–47, 2005. ISSN 0950-382X (Print) 0950-382X (Linking). doi: 10.1111/j.1365-2958.2005.04840.x. URL https://www.ncbi.nlm.nih.gov/pubmed/16238615.

P. Danecek, A. Auton, G. Abecasis, C. A. Albers, E. Banks, M. A. DePristo, R. E. Handsaker, G. Lunter, G. T. Marth, S. T. Sherry, G. McVean, R. Durbin, and 1000 Genomes Project Analysis Group. The variant call format and vcftools. Bioinformatics, 27(15):2156, 2011. doi: 10.1093/bioinformatics/btr330. URL http://dx.doi.org/10.1093/bioinformatics/btr330.

J. W. Davey, P. A. Hohenlohe, P. D. Etter, J. Q. Boone, J. M. Catchen, and M. L. Blaxter. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nat Rev Genet, 12(7):499–510, jul 2011. ISSN 1471-0056. URL http://dx.doi.org/10.1038/nrg3012.

T. F. de Koning-Ward, C. J. Janse, and A. P. Waters. The development of genetic tools for dissecting the biology of malaria parasites. Annual review of microbiology, 54: 157–85, 2000. ISSN 0066-4227. doi: 10.1146/annurev.micro.54.1.157. URL http://www.ncbi.nlm.nih.gov/pubmed/11018127.

T. F. de Koning-Ward, P. R. Gilson, and B. S. Crabb. Advances in molecular genetic systems in malaria. Nat Rev Microbiol, 13(6):373–87, 2015. ISSN 1740-1534 (Electronic) 1740-1526 (Linking). doi: 10.1038/nrmicro3450. URL https://www.ncbi.nlm.nih.gov/pubmed/25978707.

L. M. Deane, M. P. Deane, and J. F. Neto. Studies on transmission of simian malaria and on a natural infection of man with plasmodium simium in brazil. Bulletin of the World Health Organization, 35(5):805, 1966.

M. A. DePristo, E. Banks, R. Poplin, K. V. Garimella, J. R. Maguire, C. Hartl, A. A. Philippakis, G. del Angel, M. A. Rivas, M. Hanna, A. McKenna, T. J. Fennell, A. M. Kernytsky, A. Y. Sivachenko, K. Cibulskis, S. B. Gabriel, D. Altshuler, and M. J. Daly. A framework for variation discovery and genotyping using next-generation dna sequencing data. Nat Genet, 43(5):491–8, 2011. ISSN 1546-1718 (Electronic) 1061-4036 (Linking). doi: 10.1038/ng.806. URL https://www.ncbi.nlm.nih.gov/pubmed/21478889.

M. Desai, F. O. Kuile, F. Nosten, R. McGready, K. Asamoa, B. Brabin, and R. D. Newman. Epidemiology and burden of malaria in pregnancy. The Lancet Infectious Diseases, 7(2):93–104, 2007. ISSN 1473-3099. doi: http://dx.doi.org/10.1016/S1473-3099(07)70021-X. URL http://www.sciencedirect.com/science/article/pii/S147330990770021X.

A. M. Dondorp, F. Nosten, P. Yi, D. Das, A. P. Phyo, J. Tarning, K. M. Lwin, F. Ariey, W. Hanpithakpong, S. J. Lee, P. Ringwald, K. Silamut, M. Imwong, K. Chotivanich, P. Lim, T. Herdman, S. S. An, S. Yeung, P. Singhasivanon, N. P. Day, N. Lindegardh, D. Socheat, and N. J. White. Artemisinin resistance in plasmodium falciparum malaria. N Engl J Med, 361(5):455–67, 2009. ISSN 1533-4406 (Electronic) 0028-4793 (Linking). doi: 10.1056/NEJMoa0808859. URL https://www.ncbi.nlm.nih.gov/pubmed/19641202.

R. C. Edgar. Muscle: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res, 32(5):1792–7, 2004. ISSN 1362-4962 (Electronic) 0305-1048 (Linking). doi: 10.1093/nar/gkh340. URL https://www.ncbi.nlm.nih.gov/pubmed/15034147.

C. Engwerda, E. Belnoue, A. C. Gruner, and L. Renia. Experimental models of cerebral malaria. Curr Top Microbiol Immunol, 297:103–43, 2005. ISSN 0070-217X (Print) 0070-217X (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/16265904.

J. Fang and T. F. McCutchan. Malaria: Thermoregulation in a parasite’s life cycle. Nature, 418(6899):742, aug 2002. ISSN 0028-0836. URL http://dx.doi.org/10.1038/418742a.

J. M. Favaloro and D. J. Kemp. Sequence diversity of the erythrocyte membrane antigen 1 in various strains of plasmodium chabaudi. Mol Biochem Parasitol, 66(1):39–47, 1994. ISSN 0166-6851 (Print) 0166-6851 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/7984187.

B. Fenton and D. Walliker. Genetic Analysis of Malaria Parasites, pages 307–331. Springer US, Boston, MA, 1992. ISBN 978-1-4899-1651-8. doi: 10.1007/978-1-4899-1651-8_9. URL https://doi.org/10.1007/978-1-4899-1651-8_9.

D. A. Fidock, T. Nomura, A. K. Talley, R. A. Cooper, S. M. Dzekunov, M. T. Ferdig, L. M. Ursos, A. B. Sidhu, B. Naude, K. W. Deitsch, X. Z. Su, J. C. Wootton, P. D. Roepe, and T. E. Wellems. Mutations in the p. falciparum digestive vacuole transmembrane protein pfcrt and evidence for their role in chloroquine resistance. Mol Cell, 6(4):861–71, 2000. ISSN 1097-2765 (Print) 1097-2765 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/11090624.

D. A. Fidock, P. J. Rosenthal, S. L. Croft, R. Brun, and S. Nwaka. Antimalarial drug discovery: efficacy models for compound screening. Nat Rev Drug Discov, 3(6):509–20, 2004. ISSN 1474-1776 (Print) 1474-1776 (Linking). doi: 10.1038/nrd1416. URL http://www.ncbi.nlm.nih.gov/pubmed/15173840.

R. D. Finn, T. K. Attwood, P. C. Babbitt, A. Bateman, P. Bork, A. J. Bridge, H. Y. Chang, Z. Dosztanyi, S. El-Gebali, M. Fraser, J. Gough, D. Haft, G. L. Holliday, H. Huang, X. Huang, I. Letunic, R. Lopez, S. Lu, A. Marchler-Bauer, H. Mi, J. Mistry, D. A. Natale, M. Necci, G. Nuka, C. A. Orengo, Y. Park, S. Pesseat, D. Piovesan, S. C. Potter, N. D. Rawlings, N. Redaschi, L. Richardson, C. Rivoire, A. Sangrador-Vegas, C. Sigrist, I. Sillitoe, B. Smithers, S. Squizzato, G. Sutton, N. Thanki, P. D. Thomas, S. C. Tosatto, C. H. Wu, I. Xenarios, L. S. Yeh, S. Y. Young, and A. L. Mitchell. Interpro in 2017-beyond protein family and domain annotations. Nucleic Acids Res, 45(D1):D190–D199, 2017. ISSN 1362-4962 (Electronic) 0305-1048 (Linking). doi: 10.1093/nar/gkw1107. URL https://www.ncbi.nlm.nih.gov/pubmed/27899635.

K. Fischer, M. Chavchich, R. Huestis, D. W. Wilson, D. J. Kemp, and A. Saul. Ten families of variant genes encoded in subtelomeric regions of multiple chromosomes of plasmodium chabaudi, a malaria species that undergoes antigenic variation in the laboratory mouse. Mol Microbiol, 48(5):1209–23, 2003. ISSN 0950-382X (Print) 0950-382X (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/12787350.

A. Fougere, A. P. Jackson, D. P. Bechtsi, J. A. Braks, T. Annoura, J. Fonager, R. Spaccapelo, J. Ramesar, S. Chevalley-Maurel, O. Klop, A. M. van der Laan, H. J. Tanke, C. H. Kocken, E. M. Pasini, S. M. Khan, U. Bohme, C. van Ooij, T. D. Otto, C. J. Janse, and B. Franke-Fayard. Variant exported blood-stage proteins encoded by plasmodium multigene families are expressed in liver stages where they are exported into the parasitophorous vacuole. PLoS Pathog, 12(11):e1005917, 2016. ISSN 1553-7374 (Electronic) 1553-7366 (Linking). doi: 10.1371/journal.ppat.1005917. URL https://www.ncbi.nlm.nih.gov/pubmed/27851824.

B. Franke-Fayard, H. Trueman, J. Ramesar, J. Mendoza, M. van der Keur, R. van der Linden, R. E. Sinden, A. P. Waters, and C. J. Janse. A plasmodium berghei reference line that constitutively expresses gfp at a high level throughout the complete life cycle. Mol Biochem Parasitol, 137(1):23–33, 2004. ISSN 0166-6851 (Print) 0166-6851 (Linking). doi: 10.1016/j.molbiopara.2004.04.007. URL https://www.ncbi.nlm.nih.gov/pubmed/15279948.

B. Franke-Fayard, J. Fonager, A. Braks, S. M. Khan, and C. J. Janse. Sequestration and tissue accumulation of human malaria parasites: can we learn anything from rodent models of malaria? PLoS Pathog, 6(9):e1001032, 2010. ISSN 1553-7374 (Electronic) 1553-7366 (Linking). doi: 10.1371/journal.ppat.1001032. URL https://www.ncbi.nlm.nih.gov/pubmed/20941396.

C. Frech and N. Chen. Variant surface antigens of malaria parasites: functional and evolutionary insights from comparative gene family classification and analysis. BMC Genomics, 14:427, 2013. ISSN 1471-2164 (Electronic) 1471-2164 (Linking). doi: 10.1186/1471-2164-14-427. URL https://www.ncbi.nlm.nih.gov/pubmed/23805789.

L. H. Freitas-Junior, E. Bottius, L. A. Pirrit, K. W. Deitsch, C. Scheidig, F. Guinet, U. Nehrbass, T. E. Wellems, and A. Scherf. Frequent ectopic recombination of virulence factor genes in telomeric chromosome clusters of P. falciparum. Nature, 407(6807):1018–1022, oct 2000. ISSN 0028-0836. URL http://dx.doi.org/10.1038/35039531.

N. Gadsby. A Genetic Analysis of Two Strains of Plasmodium chabaudi adami that Differ in Growth and Pathogenicity. Thesis, 2008.

M. R. Galinski, C. C. Medina, P. Ingravallo, and J. W. Barnwell. A reticulocyte-binding protein complex of plasmodium vivax merozoites. Cell, 69(7):1213–26, 1992. ISSN 0092-8674 (Print) 0092-8674 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/1617731.

M. J. Gardner, N. Hall, E. Fung, O. White, M. Berriman, R. W. Hyman, J. M. Carlton, A. Pain, K. E. Nelson, S. Bowman, I. T. Paulsen, K. James, J. A. Eisen, K. Rutherford, S. L. Salzberg, A. Craig, S. Kyes, M. S. Chan, V. Nene, S. J. Shallom, B. Suh, J. Peterson, S. Angiuoli, M. Pertea, J. Allen, J. Selengut, D. Haft, M. W. Mather, A. B. Vaidya, D. M. Martin, A. H. Fairlamb, M. J. Fraunholz, D. S. Roos, S. A. Ralph, G. I. McFadden, L. M. Cummings, G. M. Subramanian, C. Mungall, J. C. Venter, D. J. Carucci, S. L. Hoffman, C. Newbold, R. W. Davis, C. M. Fraser, and B. Barrell. Genome sequence of the human malaria parasite plasmodium falciparum. Nature, 419(6906):498–511, 2002. ISSN 0028-0836 (Print) 0028-0836 (Linking). doi: 10.1038/nature01097. URL http://www.ncbi.nlm.nih.gov/pubmed/12368864 http://www.nature.com/nature/journal/v419/n6906/pdf/nature01097.pdf.

P. Gautret, E. Deharo, A. G. Chabaud, H. Ginsburg, and I. Landau. Plasmodium vinckei vinckei, p. v. lentum and p. yoelii yoelii: chronobiology of the asexual cycle in the blood. Parasite, 1(3):235–9, 1994. ISSN 1252-607X (Print) 1252-607X (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/9140490 http://www.parasite-journal.org/articles/parasite/pdf/1994/04/parasite1994013p235.pdf.

N. Gerald, B. Mahajan, and S. Kumar. Mitosis in the Human Malaria Parasite Plasmodium falciparum, apr 2011. ISSN 1535-9778 (Print).

M. Ghorbal, M. Gorman, C. R. Macpherson, R. M. Martins, A. Scherf, and J. J. Lopez-Rubio. Genome editing in the human malaria parasite plasmodium falciparum using the crispr-cas9 system. Nat Biotechnol, 32(8):819–21, 2014. ISSN 1546-1696 (Electronic) 1087-0156 (Linking). doi: 10.1038/nbt.2925. URL http://www.ncbi.nlm.nih.gov/pubmed/24880488 http://www.nature.com/nbt/journal/v32/n8/pdf/nbt.2925.pdf.

C. F. Gilks, D. Walliker, and C. I. Newbold. Relationships between sequestration, antigenic variation and chronic parasitism in plasmodium chabaudi chabaudi a rodent malaria model. Parasite Immunology, 12(1):45–64, 1990. ISSN 1365-3024. doi: 10.1111/j.1365-3024.1990.tb00935.x. URL http://dx.doi.org/10.1111/j.1365-3024.1990.tb00935.x.

A. R. Gomes, E. Bushell, F. Schwach, G. Girling, B. Anar, M. A. Quail, C. Herd, C. Pfander, K. Modrzynska, J. C. Rayner, and O. Billker. A genome-scale vector resource enables high-throughput reverse genetic screening in a malaria parasite. Cell Host Microbe, 2015. ISSN 1934-6069 (Electronic) 1931-3128 (Linking). doi: 10.1016/j.chom.2015.01.014. URL http://www.ncbi.nlm.nih.gov/pubmed/25732065.

J. M. Gonzales, J. J. Patel, N. Ponmee, L. Jiang, A. Tan, S. P. Maher, S. Wuchty, P. K. Rathod, and M. T. Ferdig. Regulatory hotspots in the malaria parasite genome dictate transcriptional variation. PLOS Biology, 6(9):1–12, 09 2008. doi: 10.1371/journal.pbio.0060238. URL https://doi.org/10.1371/journal.pbio.0060238.

R. Goonewardene, J. Daily, D. Kaslow, T. J. Sullivan, P. Duffy, R. Carter, K. Mendis, and D. Wirth. Transfection of the malaria parasite and expression of firefly luciferase. Proc Natl Acad Sci U S A, 90(11):5234–6, 1993. ISSN 0027-8424 (Print) 0027-8424 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/8506371.

Battista Grassi, Amico Bignami, and Giuseppe Bastianelli. Ulteriori ricerche sul ciclo dei parassiti malarici umani nel corpo del zanzarone. tip. della R. Accademia dei Lincei, 1898.

K. Grech, A. Martinelli, S. Pathirana, D. Walliker, P. Hunt, and R. Carter. Numerous, robust genetic markers for plasmodium chabaudi by the method of amplified fragment length polymorphism. Mol Biochem Parasitol, 123(2):95–104, 2002. ISSN 0166-6851 (Print) 0166-6851 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/12270625.

J. Greenberg and H. L. Trembley. Infections produced by mixed strains of plasmodium gallinaceum in chicks. The Journal of Parasitology, 40(3):336–340, 1954. ISSN 00223395, 19372345. URL http://www.jstor.org/stable/3273747.

A. M. Guggisberg, J. Park, R. L. Edwards, M. L. Kelly, D. M. Hodge, N. H. Tolia, and A. R. Odom. A sugar phosphatase regulates the methylerythritol phosphate (MEP) pathway in malaria parasites. Nature communications, 5:4467, 2014. ISSN 2041-1723. doi: 10.1038/ncomms5467. URL http://www.ncbi.nlm.nih.gov/pubmed/25058848.

D. S. Guttery, A. A. Holder, and R. Tewari. Sexual development in plasmodium: lessons from functional analyses. PLoS Pathog, 8(1):e1002404, 2012. ISSN 1553-7374 (Electronic) 1553-7366 (Linking). doi: 10.1371/journal.ppat.1002404. URL https://www.ncbi.nlm.nih.gov/pubmed/22275863.

D. S. Guttery, B. Poulin, A. Ramaprasad, R. J. Wall, D. J. P. Ferguson, D. Brady, E. Patzewitz, S. Whipple, U. Straschil, M. H. Wright, A. M. A. H. Mohamed, A. Radhakrishnan, S. T. Arold, E. W. Tate, A. A. Holder, B. Wickstead, A. Pain, and R. Tewari. Genome-wide functional analysis of plasmodium protein phosphatases reveals key regulators of parasite development and differentiation. Cell Host Microbe, 16(1):128–140, 2014. ISSN 1931-3128;1934-6069. doi: 10.1016/j.chom.2014.05.020. URL http://www.cell.com/cell-host-microbe/pdf/S1931-3128(14)00219-4.pdf.

D. S. Guttery, M. Roques, A. A. Holder, and R. Tewari. Commit and transmit: Molecular players in plasmodium sexual development and zygote differentiation. Trends Parasitol, 31(12):676–85, 2015. ISSN 1471-5007 (Electronic) 1471-4922 (Linking). doi: 10.1016/j.pt.2015.08.002. URL https://www.ncbi.nlm.nih.gov/pubmed/26440790.

J. C. Hafalla, O. Silvie, and K. Matuschewski. Cell biology and immunology of malaria. Immunol Rev, 240(1):297–316, 2011. ISSN 1600-065X (Electronic) 0105-2896 (Linking). doi: 10.1111/j.1600-065X.2010.00988.x. URL https://www.ncbi.nlm.nih.gov/pubmed/21349101.

N. Hall, M. Karras, J. Raine, J. Carlton, T. Kooij, M. Berriman, L. Florens, C. Janssen, A. Pain, G. Christophides, K. James, K. Rutherford, B. Harris, D. Harris, C. Churcher, M. Quail, D. Ormond, J. Doggett, H. Trueman, J. Mendoza, S. Bidwell, M. Rajandream, D. Carucci, J. Yates, F. Kafatos, C. Janse, B. Barrell, C. Turner, A. Waters, and R. Sinden. A comprehensive survey of the plasmodium life cycle by genomic, transcriptomic, and proteomic analyses. Science (New York, N.Y.), 307(5706):82–86, 2005. ISSN 0036-8075. doi: 10.1126/science.1103717. URL http://dx.doi.org/10.1126/science.1103717 http://www.sciencemag.org/content/307/5706/82.full.pdf.

D. S. Hansen. Inflammatory responses associated with the induction of cerebral malaria: lessons from experimental murine models. PLoS Pathog, 8(12):e1003045, 2012. ISSN 1553-7374 (Electronic) 1553-7366 (Linking). doi: 10.1371/journal.ppat.1003045. URL https://www.ncbi.nlm.nih.gov/pubmed/23300435.

T. Harinasuta, S. Migasen, and D. Bunnag. Chloroquine resistance in plasmodium falciparum in thailand. In 1st UNESCO Regional Symposium on Scientific Knowledge of Tropical Parasites. United Nations Educational, Scientific, and Cultural Organization, Paris, France, pages 148–153, 1962.

K. Hayton, L. C. Ranford-Cartwright, and D. Walliker. Sulfadoxine-pyrimethamine resistance in the rodent malaria parasite plasmodium chabaudi. Antimicrob Agents Chemother, 46(8):2482–9, 2002. ISSN 0066-4804 (Print) 0066-4804 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/12121922.

K. Hayton, D. Gaur, A. Liu, J. Takahashi, B. Henschen, S. Singh, L. Lambert, T. Furuya, R. Bouttenot, M. Doll, F. Nawaz, J. Mu, L. Jiang, L. H. Miller, and T. E. Wellems. Erythrocyte binding protein pfrh5 polymorphisms determine species-specific pathways of plasmodium falciparum invasion. Cell Host Microbe, 4(1):40–51, 2008. ISSN 1934-6069 (Electronic) 1931-3128 (Linking). doi: 10.1016/j.chom.2008.06.001. URL https://www.ncbi.nlm.nih.gov/pubmed/18621009.

A. Heiber, F. Kruse, C. Pick, C. Gruring, S. Flemming, A. Oberli, H. Schoeler, S. Retzlaff, P. Mesen-Ramirez, J. A. Hiss, M. Kadekoppala, L. Hecht, A. A. Holder, T. W. Gilberger, and T. Spielmann. Identification of new pneps indicates a substantial non-pexel exportome and underpins common features in plasmodium falciparum protein export. PLoS Pathog, 9(8):e1003546, 2013. ISSN 1553-7374 (Electronic) 1553-7366 (Linking). doi: 10.1371/journal.ppat.1003546. URL https://www.ncbi.nlm.nih.gov/pubmed/23950716.

T. T. Hien, N. T. Thuy-Nhien, N. H. Phu, M. F. Boni, N. V. Thanh, N. T. Nha-Ca, H. Thai le, C. Q. Thai, P. V. Toi, P. D. Thuan, T. Long le, T. Dong le, L. Merson, C. Dolecek, K. Stepniewska, P. Ringwald, N. J. White, J. Farrar, and M. Wolbers. In vivo susceptibility of plasmodium falciparum to artesunate in binh phuoc province, vietnam. Malar J, 11:355, 2012. ISSN 1475-2875 (Electronic) 1475-2875 (Linking). doi: 10.1186/1475-2875-11-355. URL https://www.ncbi.nlm.nih.gov/pubmed/23101492.

N. L. Hiller, S. Bhattacharjee, C. van Ooij, K. Liolios, T. Harrison, C. Lopez-Estrano, and K. Haldar. A host-targeting signal in virulence proteins reveals a secretome in malarial infection. Science, 306(5703):1934–7, 2004. ISSN 1095-9203 (Electronic) 0036-8075 (Linking). doi: 10.1126/science.1102737. URL https://www.ncbi.nlm.nih.gov/pubmed/15591203.

R. A. Holt, G. M. Subramanian, A. Halpern, G. G. Sutton, R. Charlab, D. R. Nusskern, P. Wincker, A. G. Clark, J. M. Ribeiro, R. Wides, S. L. Salzberg, B. Loftus, M. Yandell, W. H. Majoros, D. B. Rusch, Z. Lai, C. L. Kraft, J. F. Abril, V. Anthouard, P. Arensburger, P. W. Atkinson, H. Baden, V. de Berardinis, D. Baldwin, V. Benes, J. Biedler, C. Blass, R. Bolanos, D. Boscus, M. Barnstead, S. Cai, A. Center, K. Chaturverdi, G. K. Christophides, M. A. Chrystal, M. Clamp, A. Cravchik, V. Curwen, A. Dana, A. Delcher, I. Dew, C. A. Evans, M. Flanigan, A. Grundschober-Freimoser, L. Friedli, Z. Gu, P. Guan, R. Guigo, M. E. Hillenmeyer, S. L. Hladun, J. R. Hogan, Y. S. Hong, J. Hoover, O. Jaillon, Z. Ke, C. Kodira, E. Kokoza, A. Koutsos, I. Letunic, A. Levitsky, Y. Liang, J. J. Lin, N. F. Lobo, J. R. Lopez, J. A. Malek, T. C. McIntosh, S. Meister, J. Miller, C. Mobarry, E. Mongin, S. D. Murphy, D. A. O’Brochta, C. Pfannkoch, R. Qi, M. A. Regier, K. Remington, H. Shao, M. V. Sharakhova, C. D. Sitter, J. Shetty, T. J. Smith, R. Strong, J. Sun, D. Thomasova, L. Q. Ton, P. Topalis, Z. Tu, M. F. Unger, B. Walenz, A. Wang, J. Wang, M. Wang, X. Wang, K. J. Woodford, J. R. Wortman, M. Wu, A. Yao, E. M. Zdobnov, H. Zhang, Q. Zhao, et al. The genome sequence of the malaria mosquito anopheles gambiae. Science, 298(5591):129–49, 2002. ISSN 1095-9203 (Electronic) 0036-8075 (Linking). doi: 10.1126/science.1076181. URL https://www.ncbi.nlm.nih.gov/pubmed/12364791.

R. Hoo, L. Zhu, A. Amaladoss, S. Mok, O. Natalang, S. A. Lapp, G. Hu, K. Liew, M. R. Galinski, Z. Bozdech, and P. R. Preiser. Integrated analysis of the plasmodium species transcriptome. EBioMedicine, 7:255–66, 2016. ISSN 2352-3964 (Electronic) 2352-3964 (Linking). doi: 10.1016/j.ebiom.2016.04.011. URL https://www.ncbi.nlm.nih.gov/pubmed/27322479.

P. Hunt, A. Martinelli, K. Modrzynska, S. Borges, A. Creasey, L. Rodrigues, D. Beraldi, L. Loewe, R. Fawcett, S. Kumar, M. Thomson, U. Trivedi, T. D. Otto, A. Pain, M. Blaxter, and P. Cravo. Experimental evolution, genetic analysis and genome re-sequencing reveal the mutation conferring artemisinin resistance in an isogenic lineage of malaria parasites. BMC Genomics, 11(1):499, 2010. ISSN 1471-2164. doi: 10.1186/1471-2164-11-499. URL http://www.biomedcentral.com/1471-2164/11/499.

R. Idro, N. E. Jenkins, and C. J. R. C. Newton. Pathogenesis, clinical features, and neurological outcome of cerebral malaria. The Lancet Neurology, 4(12):827–840, 2005. ISSN 1474-4422. doi: http://dx.doi.org/10.1016/S1474-4422(05)70247-7. URL http://www.sciencedirect.com/science/article/pii/S1474442205702477.

H. Innan and F. Kondrashov. The evolution of gene duplications: classifying and distinguishing between models. Nat Rev Genet, 11(2):97–108, 2010. ISSN 1471-0064 (Electronic) 1471-0056 (Linking). doi: 10.1038/nrg2689. URL https://www.ncbi.nlm.nih.gov/pubmed/20051986.

Institute of Medicine (US) Committee on the Economics of Antimalarial Drugs. Washington (DC), 2004. ISBN 0309092183. doi: 10.17226/11017. URL https://www.ncbi.nlm.nih.gov/pubmed/25009879.

J. K. Iyer, A. Amaladoss, S. Genesan, and P. R. Preiser. Variable expression of the 235 kda rhoptry protein of plasmodium yoelii mediate host cell adaptation and immune evasion. Mol Microbiol, 65(2):333–46, 2007. ISSN 0950-382X (Print) 0950-382X (Linking). doi: 10.1111/j.1365-2958.2007.05786.x. URL https://www.ncbi.nlm.nih.gov/pubmed/17590237.

C. J. Janse, P. F. J. van der Klooster, H. J. van der Kaay, M. van der Ploeg, and J. Prosper Overdulve. DNA synthesis in Plasmodium berghei during asexual and sexual development. Molecular and Biochemical Parasitology, 20(2):173–182, aug 1986a. ISSN 01666851. doi: 10.1016/0166-6851(86)90029-0. URL http://linkinghub.elsevier.com/retrieve/pii/0166685186900290.

C. J. Janse, T. Ponnudurai, A. H. W. Lensen, J. H. E. Th. Meuwissen, J. Ramesar, M. Van Der Ploeg, and J. P. Overdulve. DNA synthesis in gametocytes of Plasmodium falciparum. Parasitology, 96(01):1, feb 1988. ISSN 0031-1820. doi: 10.1017/S0031182000081609. URL http://www.journals.cambridge.org/abstract_S0031182000081609.

C. J. Janse, B. Franke-Fayard, G. R. Mair, J. Ramesar, C. Thiel, S. Engelmann, K. Matuschewski, G. J. V. Gemert, R. W. Sauerwein, and A. P. Waters. High efficiency transfection of Plasmodium berghei facilitates novel selection procedures. Molecular and Biochemical Parasitology, 145(1):60–70, 2006a. ISSN 01666851. doi: 10.1016/j.molbiopara.2005.09.007.

C. J. Janse, J. Ramesar, and A. P. Waters. High-efficiency transfection and drug selection of genetically transformed blood stages of the rodent malaria parasite plasmodium berghei. Nat Protoc, 1(1):346–56, 2006b. ISSN 1750-2799 (Electronic) 1750-2799 (Linking). doi: 10.1038/nprot.2006.53. URL https://www.ncbi.nlm.nih.gov/pubmed/17406255.

C. J. Janse, H. Kroeze, A. van Wigcheren, S. Mededovic, J. Fonager, B. Franke-Fayard, A. P. Waters, and S. M. Khan. A genotype and phenotype database of genetically modified malaria-parasites. Trends Parasitol, 27(1):31–9, 2011. ISSN 1471-5007 (Electronic) 1471-4922 (Linking). doi: 10.1016/j.pt.2010.06.016. URL https://www.ncbi.nlm.nih.gov/pubmed/20663715.

C. J. Janse, P. F. J. Van der Klooster, H. J. Van der Kaay, M. Van der Ploeg, and J. P. Overdulve. Rapid repeated DNA replication during microgametogenesis and DNA synthesis in young zygotes of Plasmodium berghei. Transactions of the Royal Society of Tropical Medicine and Hygiene, 80(1):154–157, jan 1986b. ISSN 00359203. doi: 10.1016/0035-9203(86)90219-1. URL https://academic.oup.com/trstmh/article-lookup/doi/10.1016/0035-9203(86)90219-1.

C. S. Janssen, R. S. Phillips, C. M. Turner, and M. P. Barrett. Plasmodium interspersed repeats: the major multigene superfamily of malaria parasites. Nucleic Acids Res, 32(19):5712–20, 2004. ISSN 1362-4962 (Electronic) 0305-1048 (Linking). doi: 10.1093/nar/gkh907. URL https://www.ncbi.nlm.nih.gov/pubmed/15507685.

H. Jiang, N. Li, V. Gopalan, M. M. Zilversmit, S. Varma, V. Nagarajan, J. Li, J. Mu, K. Hayton, B. Henschen, M. Yi, R. Stephens, G. McVean, P. Awadalla, T. E. Wellems, and X. Z. Su. High recombination rates and hotspots in a plasmodium falciparum genetic cross. Genome Biol, 12(4):R33, 2011. ISSN 1474-760X (Electronic) 1474-7596 (Linking). doi: 10.1186/gb-2011-12-4-r33. URL https://www.ncbi.nlm.nih.gov/pubmed/21463505.

P. Jones, D. Binns, H. Y. Chang, M. Fraser, W. Li, C. McAnulla, H. McWilliam, J. Maslen, A. Mitchell, G. Nuka, S. Pesseat, A. F. Quinn, A. Sangrador-Vegas, M. Scheremetjew, S. Y. Yong, R. Lopez, and S. Hunter. Interproscan 5: genome-scale protein function classification. Bioinformatics, 30(9):1236–40, 2014. ISSN 1367-4811 (Electronic) 1367-4803 (Linking). doi: 10.1093/bioinformatics/btu031. URL https://www.ncbi.nlm.nih.gov/pubmed/24451626.

A. M. Jongco, L. M. Ting, V. Thathy, M. M. Mota, and K. Kim. Improved transfection and new selectable markers for the rodent malaria parasite plasmodium yoelii. Mol Biochem Parasitol, 146(2):242–50, 2006. ISSN 0166-6851 (Print) 0166-6851 (Linking). doi: 10.1016/j.molbiopara.2006.01.001. URL https://www.ncbi.nlm.nih.gov/pubmed/16458371.

K. Kaiser, K. Matuschewski, N. Camargo, J. Ross, and S. H. Kappe. Differential transcriptome profiling identifies plasmodium genes encoding pre-erythrocytic stage-specific proteins. Mol Microbiol, 51(5):1221–32, 2004. ISSN 0950-382X (Print) 0950-382X (Linking). doi: 10.1046/j.1365-2958.2003.03909.x. URL https://www.ncbi.nlm.nih.gov/pubmed/14982620.

S. H. Kappe, M. J. Gardner, S. M. Brown, J. Ross, K. Matuschewski, J. M. Ribeiro, J. H. Adams, J. Quackenbush, J. Cho, D. J. Carucci, S. L. Hoffman, and V. Nussenzweig. Exploring the transcriptome of the malaria sporozoite stage. Proc Natl Acad Sci U S A, 98(17):9895–900, 2001. ISSN 0027-8424 (Print) 0027-8424 (Linking). doi: 10.1073/pnas.171185198. URL https://www.ncbi.nlm.nih.gov/pubmed/11493695.

J. Keen, A. Holder, J. Playfair, M. Lockyer, and A. Lewis. Identification of the gene for a plasmodium yoelii rhoptry protein. multiple copies in the parasite genome. Mol Biochem Parasitol, 42(2):241–6, 1990. ISSN 0166-6851 (Print) 0166-6851 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/2270106.

S. M. Khan, W. Jarra, and P. R. Preiser. The 235 kda rhoptry protein of plasmodium (yoelii) yoelii: function at the junction. Mol Biochem Parasitol, 117(1):1–10, 2001. ISSN 0166-6851 (Print) 0166-6851 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/11551627.

S. M. Khan, H. Kroeze, B. Franke-Fayard, and C. J. Janse. Standardization in generating and reporting genetically modified rodent malaria parasites: the rmgmdb database. Methods Mol Biol, 923:139–50, 2013. ISSN 1940-6029 (Electronic) 1064-3745 (Linking). doi: 10.1007/978-1-62703-026-7_9. URL http://www.ncbi.nlm.nih.gov/pubmed/22990775.

R Killick-Kendrick. Malaria parasites of Thamnomys rutilans (Rodentia, Muridae) in Nigeria. Bulletin of the World Health Organization, 38(5):822–824, 1968. ISSN 0042-9686. URL http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2554675/.

R. Killick-Kendrick. Parasitic protozoa of the blood of rodents. i. the life-cycle and zoogeography of plasmodium berghei nigeriensis subsp. nov. Ann Trop Med Parasitol, 67(3):261–77, 1973a. ISSN 0003-4983 (Print) 0003-4983 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/4586820.

R. Killick-Kendrick. Parasitic protozoa of the blood of rodents. Annals of Tropical Medicine & Parasitology, 67(3):261–277, 1973b. doi: 10.1080/00034983.1973.11686887. URL http://dx.doi.org/10.1080/00034983.1973.11686887. PMID: 4586820.

R. Killick-Kendrick. Parasitic protozoa of the blood of rodents. v. plasmodium vinckei brucechwatti subsp. nov. a malaria parasite of the thicket rat, thamnomys rutilans, in nigeria. Ann Parasitol Hum Comp, 50(3):251–64, 1975. ISSN 0003-4150 (Print) 0003-4150 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/1211764.

R. Killick-Kendrick and W. Peters. Rodent malaria. Academic Press, London, 1978.

D. Kim, G. Pertea, C. Trapnell, H. Pimentel, R. Kelley, and S. L. Salzberg. Tophat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol, 14(4):R36, 2013. ISSN 1465-6914 (Electronic) 1465-6906 (Linking). doi: 10.1186/gb-2013-14-4-r36. URL http://www.ncbi.nlm.nih.gov/pubmed/23618408.

K. Kinga Modrzynska, A. Creasey, L. Loewe, T. Cezard, S. Trindade Borges, A. Martinelli, L. Rodrigues, P. Cravo, M. Blaxter, R. Carter, and P. Hunt. Quantitative genome re-sequencing defines multiple mutations conferring chloroquine resistance in rodent malaria. BMC Genomics, 13(1):106, Mar 2012. doi: 10.1186/1471-2164-13-106. URL http://dx.doi.org/10.1186/1471-2164-13-106.

M. Klemba, I. Gluzman, and D. E. Goldberg. A plasmodium falciparum dipeptidyl aminopeptidase i participates in vacuolar hemoglobin degradation. J Biol Chem, 279(41):43000–7, 2004. ISSN 0021-9258 (Print) 0021-9258 (Linking). doi: 10.1074/jbc.M408123200. URL https://www.ncbi.nlm.nih.gov/pubmed/15304495.

G. Knowles, A. Sanderson, and D. Walliker. Plasmodium yoelii: Genetic analysis of crosses between two rodent malaria subspecies. Experimental Parasitology, 52(2):243–247, oct 1981. ISSN 00144894. doi: 10.1016/0014-4894(81)90079-5. URL http://linkinghub.elsevier.com/retrieve/pii/0014489481900795.

E. Knuepfer, M. Napiorkowska, C. van Ooij, and A. A. Holder. Generating conditional gene knockouts in Plasmodium – a toolkit to produce stable DiCre recombinase-expressing parasite lines using CRISPR/Cas9. Scientific Reports, 7(1):3881, 2017. ISSN 2045-2322. doi: 10.1038/s41598-017-03984-3. URL https://doi.org/10.1038/s41598-017-03984-3.

T. W. Kooij, J. M. Carlton, S. L. Bidwell, N. Hall, J. Ramesar, C. J. Janse, and A. P. Waters. A plasmodium whole-genome synteny map: indels and synteny breakpoints as foci for species-specific genes. PLoS Pathog, 1(4):e44, 2005. ISSN 1553-7366 (Print) 1553-7366 (Linking). doi: 10.1371/journal.ppat.0010044. URL http://www.ncbi.nlm.nih.gov/pubmed/16389297 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1317653/pdf/ppat.0010044.pdf.

E. V. Koonin and R. L. Tatusov. Computer analysis of bacterial haloacid dehalogenases defines a large superfamily of hydrolases with diverse specificity. application of an iterative approach to database search. J Mol Biol, 244(1):125–32, 1994. ISSN 0022-2836 (Print) 0022-2836 (Linking). doi: 10.1006/jmbi.1994.1711. URL https://www.ncbi.nlm.nih.gov/pubmed/7966317.

J.P. Kreier. Parasitic Protozoa: and Plasmodia. Elsevier Science, 2012. ISBN 9780323139199. URL https://books.google.com.sa/books?id=qduC6o08fGAC.

A. Krogh, B. Larsson, G. von Heijne, and E. L. Sonnhammer. Predicting transmembrane protein topology with a hidden markov model: application to complete genomes. J Mol Biol, 305(3):567–80, 2001. ISSN 0022-2836 (Print) 0022-2836 (Linking). doi: 10.1006/jmbi.2000.4315. URL https://www.ncbi.nlm.nih.gov/pubmed/11152613.

M. Krzywinski, J. Schein, I. Birol, J. Connors, R. Gascoyne, D. Horsman, S. J. Jones, and M. A. Marra. Circos: an information aesthetic for comparative genomics. Genome Res, 19(9):1639–45, 2009. ISSN 1549-5469 (Electronic) 1088-9051 (Linking). doi: 10.1101/gr.092759.109. URL https://www.ncbi.nlm.nih.gov/pubmed/19541911.

S. Kurtz, A. Phillippy, A. L. Delcher, M. Smoot, M. Shumway, C. Antonescu, and S. L. Salzberg. Versatile and open software for comparing large genomes. Genome Biol, 5(2):R12, 2004. ISSN 1474-760X (Electronic) 1474-7596 (Linking). doi: 10.1186/gb-2004-5-2-r12. URL https://www.ncbi.nlm.nih.gov/pubmed/14759262.

E. Kuznetsova, B. Nocek, G. Brown, K. S. Makarova, R. Flick, Y. I. Wolf, A. Khusnutdinova, E. Evdokimova, K. Jin, K. Tan, A. D. Hanson, G. Hasnain, R. Zallot, V. de Crecy-Lagard, M. Babu, A. Savchenko, A. Joachimiak, A. M. Edwards, E. V. Koonin, and A. F. Yakunin. Functional diversity of haloacid dehalogenase superfamily phosphatases from saccharomyces cerevisiae: Biochemical, structural, and evolutionary insights. J Biol Chem, 290(30):18678–98, 2015. ISSN 1083-351X (Electronic) 0021-9258 (Linking). doi: 10.1074/jbc.M115.657916. URL https://www.ncbi.nlm.nih.gov/pubmed/26071590.

M. P. Kyaw, M. H. Nyunt, K. Chit, M. M. Aye, K. H. Aye, M. M. Aye, N. Lindegardh, J. Tarning, M. Imwong, C. G. Jacob, C. Rasmussen, J. Perin, P. Ringwald, and M. M. Nyunt. Reduced susceptibility of plasmodium falciparum to artesunate in southern myanmar. PLoS One, 8(3):e57689, 2013. ISSN 1932-6203 (Electronic) 1932-6203 (Linking). doi: 10.1371/journal.pone.0057689. URL https://www.ncbi.nlm.nih.gov/pubmed/23520478.

A. N. LaCrue, M. Scheel, K. Kennedy, N. Kumar, and D. E. Kyle. Effects of artesunate on parasite recrudescence and dormancy in the rodent malaria model plasmodium vinckei. PLoS One, 6(10):e26689, 2011. ISSN 1932-6203 (Electronic) 1932-6203 (Linking). doi: 10.1371/journal.pone.0026689. URL http://www.ncbi.nlm.nih.gov/pubmed/22039533 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3200358/pdf/pone.0026689.pdf.

K. Lagesen, P. Hallin, E. A. Rodland, H. H. Staerfeldt, T. Rognes, and D. W. Ussery. Rnammer: consistent and rapid annotation of ribosomal rna genes. Nucleic Acids Res, 35(9):3100–8, 2007. ISSN 1362-4962 (Electronic) 0305-1048 (Linking). doi: 10.1093/nar/gkm160. URL https://www.ncbi.nlm.nih.gov/pubmed/17452365.

I. Landau and A. Chabaud. Infection naturelle par deux plasmodium du rongeur thamnomys rutilans en republique centrafricaine. Compte Rendu Hebdomadaire des Seances de l’Academie des Sciences, 260(D):230–232, 1965.

I. Landau and R. Killick-Kendrick. Rodent plasmodia of the république centrafricaine: The sporogony and tissue stages of plasmodium chabaudi and p. berghei yoelii. Transactions of the Royal Society of Tropical Medicine and Hygiene, 60(5):633–649, 1966. ISSN 0035-9203. doi: 10.1016/0035-9203(66)90010-1. URL http://www.sciencedirect.com/science/article/pii/0035920366900101.

I. Landau, J. C. Michel, and J. P. Adam. [biologic cycle in the laboratory of plasmodium berghei killicki n. sub. sp]. Ann Parasitol Hum Comp, 43(5):545–9, 1968. ISSN 0003-4150 (Print) 0003-4150 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/4888201.

I. Landau, J. C. Michel, J. P. Adam, and Y. Boulard. The life cycle of plasmodium vinckei lentum subsp. nov. in the laboratory; comments on the nomenclature of the murine malaria parasites. Ann Trop Med Parasitol, 64(3):315–23, 1970. ISSN 0003-4983 (Print) 0003-4983 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/5500105.

E. S. Lander, L. M. Linton, B. Birren, C. Nusbaum, M. C. Zody, J. Baldwin, K. Devon, K. Dewar, M. Doyle, W. FitzHugh, R. Funke, D. Gage, K. Harris, A. Heaford, J. Howland, L. Kann, J. Lehoczky, R. LeVine, P. McEwan, K. McKernan, J. Meldrim, J. P. Mesirov, C. Miranda, W. Morris, J. Naylor, C. Raymond, M. Rosetti, R. Santos, A. Sheridan, C. Sougnez, Y. Stange-Thomann, N. Stojanovic, A. Subramanian, D. Wyman, J. Rogers, J. Sulston, R. Ainscough, S. Beck, D. Bentley, J. Burton, C. Clee, N. Carter, A. Coulson, R. Deadman, P. Deloukas, A. Dunham, I. Dunham, R. Durbin, L. French, D. Grafham, S. Gregory, T. Hubbard, S. Humphray, A. Hunt, M. Jones, C. Lloyd, A. McMurray, L. Matthews, S. Mercer, S. Milne, J. C. Mullikin, A. Mungall, R. Plumb, M. Ross, R. Shownkeen, S. Sims, R. H. Waterston, R. K. Wilson, L. W. Hillier, J. D. McPherson, M. A. Marra, E. R. Mardis, L. A. Fulton, A. T. Chinwalla, K. H. Pepin, W. R. Gish, S. L. Chissoe, M. C. Wendl, K. D. Delehaunty, T. L. Miner, A. Delehaunty, J. B. Kramer, L. L. Cook, R. S. Fulton, D. L. Johnson, P. J. Minx, S. W. Clifton, T. Hawkins, E. Branscomb, P. Predki, P. Richardson, S. Wenning, T. Slezak, N. Doggett, J. F. Cheng, A. Olsen, S. Lucas, C. Elkin, E. Uberbacher, M. Frazier, et al. Initial sequencing and analysis of the human genome. Nature, 409(6822):860–921, 2001. ISSN 0028-0836 (Print) 0028-0836 (Linking). doi: 10.1038/35057062. URL https://www.ncbi.nlm.nih.gov/pubmed/11237011.

J. Langhorne, P. Buffet, M. Galinski, M. Good, J. Harty, D. Leroy, M. M. Mota, E. Pasini, L. Renia, E. Riley, M. Stins, and P. Duffy. The relevance of non-human primate and rodent malaria models for humans. Malar J, 10:23, 2011. ISSN 1475-2875 (Electronic) 1475-2875 (Linking). doi: 10.1186/1475-2875-10-23. URL http://www.ncbi.nlm.nih.gov/pubmed/21288352 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041720/pdf/1475-2875-10-23.pdf.

E. Lasonder, C. J. Janse, G. J. van Gemert, G. R. Mair, A. M. Vermunt, B. G. Douradinha, V. van Noort, M. A. Huynen, A. J. Luty, H. Kroeze, S. M. Khan, R. W. Sauerwein, A. P. Waters, M. Mann, and H. G. Stunnenberg. Proteomic profiling of plasmodium sporozoite maturation identifies new proteins essential for parasite development and infectivity. PLoS Pathog, 4(10):e1000195, 2008. ISSN 1553-7374 (Electronic) 1553-7366 (Linking). doi: 10.1371/journal.ppat.1000195. URL https://www.ncbi.nlm.nih.gov/pubmed/18974882.

E. Lasonder, S. R. Rijpma, B. C. van Schaijk, W. A. Hoeijmakers, P. R. Kensche, M. S. Gresnigt, A. Italiaander, M. W. Vos, R. Woestenenk, T. Bousema, G. R. Mair, S. M. Khan, C. J. Janse, R. Bartfai, and R. W. Sauerwein. Integrated transcriptomic and proteomic analyses of p. falciparum gametocytes: molecular insight into sex-specific processes and translational repression. Nucleic Acids Res, 44(13):6087–101, 2016. ISSN 1362-4962 (Electronic) 0305-1048 (Linking). doi: 10.1093/nar/gkw536. URL https://www.ncbi.nlm.nih.gov/pubmed/27298255.

J. Lawton, T. Brugat, Y. X. Yan, A. J. Reid, U. Bohme, T. D. Otto, A. Pain, A. Jackson, M. Berriman, D. Cunningham, P. Preiser, and J. Langhorne. Characterization and gene expression analysis of the cir multi-gene family of plasmodium chabaudi chabaudi (as). BMC Genomics, 13:125, 2012. ISSN 1471-2164 (Electronic) 1471-2164 (Linking). doi: 10.1186/1471-2164-13-125. URL https://www.ncbi.nlm.nih.gov/pubmed/22458863.

K. G. Le Roch, Y. Zhou, S. Batalov, and E. A. Winzeler. Monitoring the chromosome 2 intraerythrocytic transcriptome of plasmodium falciparum using oligonucleotide arrays. Am J Trop Med Hyg, 67(3):233–43, 2002. ISSN 0002-9637 (Print) 0002-9637 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/12408661.

K. G. Le Roch, Y. Zhou, P. L. Blair, M. Grainger, J. K. Moch, J. D. Haynes, P. De La Vega, A. A. Holder, S. Batalov, D. J. Carucci, and E. A. Winzeler. Discovery of gene function by expression profiling of the malaria parasite life cycle. Science, 301(5639):1503–8, 2003. ISSN 1095-9203 (Electronic) 0036-8075 (Linking). doi: 10.1126/science.1087025. URL http://www.ncbi.nlm.nih.gov/pubmed/12893887 http://www.sciencemag.org/content/301/5639/1503 http://www.sciencemag.org/content/301/5639/1503.full.pdf.

H. Li and R. Durbin. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25(14):1754, 2009. doi: 10.1093/bioinformatics/btp324. URL http://dx.doi.org/10.1093/bioinformatics/btp324.

H. Li, B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin, and 1000 Genome Project Data Processing Subgroup. The sequence alignment/map format and samtools. Bioinformatics, 25(16):2078, 2009a. doi: 10.1093/bioinformatics/btp352. URL http://dx.doi.org/10.1093/bioinformatics/btp352.

J. Li, Y. Zhang, S. Liu, L. Hong, M. Sullivan, T. F. McCutchan, J. M. Carlton, and X. Z. Su. Hundreds of microsatellites for genotyping plasmodium yoelii parasites. Mol Biochem Parasitol, 166(2):153–8, 2009b. ISSN 1872-9428 (Electronic) 0166-6851 (Linking). doi: 10.1016/j.molbiopara.2009.03.011. URL https://www.ncbi.nlm.nih.gov/pubmed/19450732.

J. Li, S. Pattaradilokrat, F. Zhu, H. Jiang, S. Liu, L. Hong, Y. Fu, L. Koo, W. Xu, W. Pan, J. M. Carlton, O. Kaneko, R. Carter, J. C. Wootton, and X. Z. Su. Linkage maps from multiple genetic crosses and loci linked to growth-related virulent phenotype in plasmodium yoelii. Proc Natl Acad Sci U S A, 108(31):E374–82, 2011. ISSN 1091-6490 (Electronic) 0027-8424 (Linking). doi: 10.1073/pnas.1102261108. URL https://www.ncbi.nlm.nih.gov/pubmed/21690382.

L. Li, C. J. Stoeckert Jr., and D. S. Roos. Orthomcl: identification of ortholog groups for eukaryotic genomes. Genome Res, 13(9):2178–89, 2003. ISSN 1088-9051 (Print) 1088-9051 (Linking). doi: 10.1101/gr.1224503. URL https://www.ncbi.nlm.nih.gov/pubmed/12952885.

J. Limenitakis and D. Soldati-Favre. Functional genetics in : Potentials and limits. 585(11):1579–1588, 2011. ISSN 00145793. doi: 10.1016/j.febslet.2011.05.002.

S. L. Liu and K. E. Sanderson. Rearrangements in the genome of the bacterium salmonella typhi. Proc Natl Acad Sci U S A, 92(4):1018–22, 1995. ISSN 0027-8424 (Print) 0027-8424 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/7862625.

M. Llinas, Z. Bozdech, E. D. Wong, A. T. Adai, and J. L. DeRisi. Comparative whole genome transcriptome analysis of three plasmodium falciparum strains. Nucleic Acids Res, 34(4):1166–73, 2006. ISSN 1362-4962 (Electronic) 0305-1048 (Linking). doi: 10.1093/nar/gkj517. URL https://www.ncbi.nlm.nih.gov/pubmed/16493140.

S. Lopaticki, A. G. Maier, J. Thompson, D. W. Wilson, W. H. Tham, T. Triglia, A. Gout, T. P. Speed, J. G. Beeson, J. Healer, and A. F. Cowman. Reticulocyte and erythrocyte binding-like proteins function cooperatively in invasion of human erythrocytes by malaria parasites. Infection and Immunity, 79(3):1107–1117, 2011. ISSN 00199567. doi: 10.1128/IAI.01021-10.

F. Lu, R. Culleton, M. Zhang, A. Ramaprasad, L. von Seidlein, H. Zhou, G. Zhu, J. Tang, Y. Liu, W. Wang, Y. Cao, S. Xu, Y. Gu, J. Li, C. Zhang, Q. Gao, D. Menard, A. Pain, H. Yang, Q. Zhang, and J. Cao. Emergence of indigenous artemisinin-resistant plasmodium falciparum in africa. N Engl J Med, 376(10):991–3, 2017. ISSN 1533-4406 (Electronic) 0028-4793 (Linking). doi: 10.1056/NEJMc1612765. URL https://www.ncbi.nlm.nih.gov/pubmed/28225668 http://www.nejm.org/doi/pdf/10.1056/NEJMc1612765.

D. C. MacKellar, A. M. Vaughan, A. S. Aly, S. DeLeon, and S. H. Kappe. A systematic analysis of the early transcribed membrane protein family throughout the life cycle of plasmodium yoelii. Cell Microbiol, 13(11):1755–67, 2011. ISSN 1462-5822 (Electronic) 1462-5814 (Linking). doi: 10.1111/j.1462-5822.2011.01656.x. URL https://www.ncbi.nlm.nih.gov/pubmed/21819513.

M. J. Mackinnon, J. Li, S. Mok, M. M. Kortok, K. Marsh, P. R. Preiser, and Z. Bozdech. Comparative transcriptional and genomic analysis of plasmodium falciparum field isolates. PLoS Pathog, 5(10):e1000644, 2009. ISSN 1553-7374 (Electronic) 1553-7366 (Linking). doi: 10.1371/journal.ppat.1000644. URL https://www.ncbi.nlm.nih.gov/pubmed/19898609.

A. G. Maier, M. Rug, M. T. O’Neill, M. Brown, S. Chakravorty, T. Szestak, J. Chesson, Y. Wu, K. Hughes, R. L. Coppel, C. Newbold, J. G. Beeson, A. Craig, B. S. Crabb, and A. F. Cowman. Exported Proteins Required for Virulence and Rigidity of Plasmodium falciparum-Infected Human Erythrocytes. Cell, 134(1):48–61, sep 2008. ISSN 0092-8674. doi: 10.1016/j.cell.2008.04.051. URL http://dx.doi.org/10.1016/j.cell.2008.04.051.

K. Maitland and C. R. J. C. Newton. Acidosis of severe falciparum malaria: heading for a shock? Trends in Parasitology, 21(1):11–16, sep 2005. ISSN 1471-4922. doi: 10.1016/j.pt.2004.10.010. URL http://dx.doi.org/10.1016/j.pt.2004.10.010.

M. Manske, O. Miotto, S. Campino, S. Auburn, J. Almagro-Garcia, G. Maslen, J. O’Brien, A. Djimde, O. Doumbo, I. Zongo, J. B. Ouedraogo, P. Michon, I. Mueller, P. Siba, A. Nzila, S. Borrmann, S. M. Kiara, K. Marsh, H. Jiang, X. Z. Su, C. Amaratunga, R. Fairhurst, D. Socheat, F. Nosten, M. Imwong, N. J. White, M. Sanders, E. Anastasi, D. Alcock, E. Drury, S. Oyola, M. A. Quail, D. J. Turner, V. Ruano-Rubio, D. Jyothi, L. Amenga-Etego, C. Hubbart, A. Jeffreys, K. Rowlands, C. Sutherland, C. Roper, V. Mangano, D. Modiano, J. C. Tan, M. T. Ferdig, A. Amambua-Ngwa, D. J. Conway, S. Takala-Harrison, C. V. Plowe, J. C. Rayner, K. A. Rockett, T. G. Clark, C. I. Newbold, M. Berriman, B. MacInnis, and D. P. Kwiatkowski. Analysis of plasmodium falciparum diversity in natural infections by deep sequencing. Nature, 487(7407):375–9, 2012. ISSN 1476-4687 (Electronic) 0028-0836 (Linking). doi: 10.1038/nature11174. URL http://www.ncbi.nlm.nih.gov/pubmed/22722859.

P. Manson. Surgeon-major ronald ross’s recent investigations on the mosquito-malaria theory. Br Med J, 1(1955):1575–7, 1898. ISSN 0007-1447 (Print) 0007-1447 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/20757898.

S. Marciniak, T. L. Prowse, D. A. Herring, J. Klunk, M. Kuch, A. T. Duggan, L. Bondioli, E. C. Holmes, and H. N. Poinar. Plasmodium falciparum malaria in 1st-2nd century ce southern italy. Curr Biol, 26(23):R1220–R1222, 2016. ISSN 1879-0445 (Electronic) 0960-9822 (Linking). doi: 10.1016/j.cub.2016.10.016. URL https://www.ncbi.nlm.nih.gov/pubmed/27923126.

M. Marti, R. T. Good, M. Rug, E. Knuepfer, and A. F. Cowman. Targeting malaria virulence and remodeling proteins to the host erythrocyte. Science, 306(5703):1930–3, 2004. ISSN 1095-9203 (Electronic) 0036-8075 (Linking). doi: 10.1126/science.1102452. URL https://www.ncbi.nlm.nih.gov/pubmed/15591202.

A. Martinelli, S. Cheesman, P. Hunt, R. Culleton, A. Raza, M. Mackinnon, and R. Carter. A genetic approach to the de novo identification of targets of strain-specific immunity in malaria parasites. Proc Natl Acad Sci U S A, 102(3):814–9, 2005a. ISSN 0027-8424 (Print) 0027-8424 (Linking). doi: 10.1073/pnas.0405097102. URL https://www.ncbi.nlm.nih.gov/pubmed/15640359.

A. Martinelli, P. Hunt, R. Fawcett, P. V. Cravo, D. Walliker, and R. Carter. An aflp-based genetic linkage map of plasmodium chabaudi chabaudi. Malar J, 4:11, 2005b. ISSN 1475-2875 (Electronic) 1475-2875 (Linking). doi: 10.1186/1475-2875-4-11. URL https://www.ncbi.nlm.nih.gov/pubmed/15707493.

A. Martinelli, G. Henriques, P. Cravo, and P. Hunt. Whole genome re-sequencing identifies a mutation in an abc transporter (mdr2) in a plasmodium chabaudi clone with altered susceptibility to antifolate drugs. Int J Parasitol, 41(2):165–71, 2011. ISSN 1879-0135 (Electronic) 0020-7519 (Linking). doi: 10.1016/j.ijpara.2010.08.008. URL https://www.ncbi.nlm.nih.gov/pubmed/20858498.

J. M. Matz and T. W. Kooij. Towards genome-wide experimental genetics in the in vivo malaria model parasite plasmodium berghei. Pathog Glob Health, 109(2):46–60, 2015. ISSN 2047-7732 (Electronic) 2047-7724 (Linking). doi: 10.1179/2047773215Y.0000000006. URL http://www.ncbi.nlm.nih.gov/pubmed/25789828.

A. McKenna, M. Hanna, E. Banks, A. Sivachenko, K. Cibulskis, A. Kernytsky, K. Garimella, D. Altshuler, S. Gabriel, M. Daly, and M. A. DePristo. The genome analysis toolkit: a mapreduce framework for analyzing next-generation dna sequencing data. Genome Res, 20(9):1297–303, 2010. ISSN 1549-5469 (Electronic) 1088-9051 (Linking). doi: 10.1101/gr.107524.110. URL https://www.ncbi.nlm.nih.gov/pubmed/20644199.

J. F. Meis, J. P. Verhave, P. H. Jap, R. E. Sinden, and J. H. Meuwissen. Malaria parasites–discovery of the early liver form. Nature, 302(5907):424–6, 1983. ISSN 0028-0836 (Print) 0028-0836 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/6339945.

S. Meister, D. M. Plouffe, K. L. Kuhen, G. M. Bonamy, T. Wu, S. W. Barnes, S. E. Bopp, R. Borboa, A. T. Bright, J. Che, S. Cohen, N. V. Dharia, K. Gagaring, M. Gettayacamin, P. Gordon, T. Groessl, N. Kato, M. C. Lee, C. W. McNamara, D. A. Fidock, A. Nagle, T. G. Nam, W. Richmond, J. Roland, M. Rottmann, B. Zhou, P. Froissard, R. J. Glynne, D. Mazier, J. Sattabongkot, P. G. Schultz, T. Tuntland, J. R. Walker, Y. Zhou, A. Chatterjee, T. T. Diagana, and E. A. Winzeler. Imaging of plasmodium liver stages to drive next-generation antimalarial drug discovery. Science, 334(6061):1372–7, 2011. ISSN 1095-9203 (Electronic) 0036-8075 (Linking). doi: 10.1126/science.1211936. URL https://www.ncbi.nlm.nih.gov/pubmed/22096101.

D. Menard, N. Khim, J. Beghain, A. A. Adegnika, M. Shafiul-Alam, O. Amodu, G. Rahim-Awab, C. Barnadas, A. Berry, Y. Boum, M. D. Bustos, J. Cao, J. Chen, L. Collet, L. Cui, G. Thakur, A. Dieye, D. Djall, M. A. Dorkenoo, C. E. Eboumbou-Moukoko, F. J. Espino, T. Fandeur, M. Ferreira-da Cruz, A. A. Fola, H. Fuehrer, A. M. Hassan, S. Herrera, B. Hongvanthong, S. Houzé, M. L. Ibrahim, M. Jahirul-Karim, L. Jiang, S. Kano, W. Ali-Khan, M. Khanthavong, P. G. Kremsner, M. Lacerda, R. Leang, M. Leelawong, M. Li, K. Lin, J. Mazarati, S. Ménard, I. Morlais, H. Muhindo-Mavoko, L. Musset, K. Na-Bangchang, M. Nambozi, K. Niaré, H. Noedl, J. Ouédraogo, D. R. Pillai, B. Pradines, B. Quang-Phuc, M. Ramharter, M. Randrianarivelojosia, J. Sattabongkot, A. Sheikh-Omar, K. D. Silué, S. B. Sirima, C. Sutherland, D. Syafruddin, R. Tahar, L. Tang, O. A. Touré, P. Tshibangu-wa Tshibangu, I. Vigan-Womas, M. Warsame, L. Wini, S. Zakeri, S. Kim, R. Eam, L. Berne, C. Khean, S. Chy, M. Ken, K. Loch, L. Canier, V. Duru, E. Legrand, J. Barale, B. Stokes, J. Straimer, B. Witkowski, D. A. Fidock, C. Rogier, P. Ringwald, F. Ariey, and O. Mercereau-Puijalon. A worldwide map of plasmodium falciparum k13-propeller polymorphisms. New England Journal of Medicine, 374(25):2453–2464, 2016. doi: 10.1056/NEJMoa1513137. URL http://dx.doi.org/10.1056/NEJMoa1513137. PMID: 27332904.

A. Miles, Z. Iqbal, P. Vauterin, R. Pearson, S. Campino, M. Theron, K. Gould, D. Mead, E. Drury, J. O’Brien, V. Ruano Rubio, B. MacInnis, J. Mwangi, U. Samarakoon, L. Ranford-Cartwright, M. Ferdig, K. Hayton, X. Z. Su, T. Wellems, J. Rayner, G. McVean, and D. Kwiatkowski. Indels, structural variation, and recombination drive genomic diversity in plasmodium falciparum. Genome Res, 26(9):1288–99, 2016. ISSN 1549-5469 (Electronic) 1088-9051 (Linking). doi: 10.1101/gr.203711.115. URL https://www.ncbi.nlm.nih.gov/pubmed/27531718.

J. L. Miller, S. Murray, A. M. Vaughan, A. Harupa, B. Sack, M. Baldwin, I. N. Crispe, and S. H. Kappe. Quantitative bioluminescent imaging of pre-erythrocytic malaria parasite infection using luciferase-expressing plasmodium yoelii. PLoS One, 8(4):e60820, 2013. ISSN 1932-6203 (Electronic) 1932-6203 (Linking). doi: 10.1371/journal.pone.0060820. URL https://www.ncbi.nlm.nih.gov/pubmed/23593316.

R. L. Miller, S. Ikram, G. J. Armelagos, R. Walker, W. B. Harer, C. J. Shiff, D. Baggett, M. Carrigan, and S. M. Maret. Diagnosis of plasmodium falciparum infections in mummies using the rapid manual parasight-f test. Trans R Soc Trop Med Hyg, 88(1):31–2, 1994. ISSN 0035-9203 (Print) 0035-9203 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/8153990.

O. Miotto, R. Amato, E. A. Ashley, B. MacInnis, J. Almagro-Garcia, C. Amaratunga, P. Lim, D. Mead, S. O. Oyola, M. Dhorda, M. Imwong, C. Woodrow, M. Manske, J. Stalker, E. Drury, S. Campino, L. Amenga-Etego, T. N. Thanh, H. T. Tran, P. Ringwald, D. Bethell, F. Nosten, A. P. Phyo, S. Pukrittayakamee, K. Chotivanich, C. M. Chuor, C. Nguon, S. Suon, S. Sreng, P. N. Newton, M. Mayxay, M. Khanthavong, B. Hongvanthong, Y. Htut, K. T. Han, M. P. Kyaw, M. A. Faiz, C. I. Fanello, M. Onyamboko, O. A. Mokuolu, C. G. Jacob, S. Takala-Harrison, C. V. Plowe, N. P. Day, A. M. Dondorp, C. C. Spencer, G. McVean, R. M. Fairhurst, N. J. White, and D. P. Kwiatkowski. Genetic architecture of artemisinin-resistant plasmodium falciparum. Nat Genet, 47(3):226–34, 2015. ISSN 1546-1718 (Electronic) 1061-4036 (Linking). doi: 10.1038/ng.3189. URL https://www.ncbi.nlm.nih.gov/pubmed/25599401.

B. Mons, E. G. Boorsma, J. Ramesar, and C. J. Janse. Removal of leucocytes from malaria-infected blood using commercially available filters. Ann Trop Med Parasitol, 82(6):621–3, 1988. ISSN 0003-4983 (Print) 0003-4983 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/3076749.

R. R. Moraes Barros, J. Straimer, J. M. Sa, R. E. Salzman, V. A. Melendez-Muniz, J. Mu, D. A. Fidock, and T. E. Wellems. Editing the plasmodium vivax genome, using zinc-finger nucleases. J Infect Dis, 211(1):125–9, 2015. ISSN 1537-6613 (Electronic) 0022-1899 (Linking). doi: 10.1093/infdis/jiu423. URL https://www.ncbi.nlm.nih.gov/pubmed/25081932.

A. Morgulis, E. M. Gertz, A. A. Schaffer, and R. Agarwala. A fast and symmetric dust implementation to mask low-complexity dna sequences. J Comput Biol, 13(5):1028–40, 2006. ISSN 1066-5277 (Print) 1066-5277 (Linking). doi: 10.1089/cmb.2006.13.1028. URL https://www.ncbi.nlm.nih.gov/pubmed/16796549.

M. M. Mota, G. Pradel, J. P. Vanderberg, J. C. Hafalla, U. Frevert, R. S. Nussenzweig, V. Nussenzweig, and A. Rodriguez. Migration of plasmodium sporozoites through cells before infection. Science, 291(5501):141–4, 2001. ISSN 0036-8075 (Print) 0036-8075 (Linking). doi: 10.1126/science.291.5501.141. URL https://www.ncbi.nlm.nih.gov/pubmed/11141568.

S. C. Nair, R. Xu, S. Pattaradilokrat, J. Wu, Y. Qi, M. Zilversmit, S. Ganesan, V. Nagarajan, R. T. Eastman, M. S. Orandle, J. C. Tan, T. G. Myers, S. Liu, C. A. Long, J. Li, and X. Su. A Plasmodium yoelii HECT-like E3 ubiquitin ligase regulates parasite growth and virulence. Nature Communications, 8(1):223, 2017. ISSN 2041-1723. doi: 10.1038/s41467-017-00267-3. URL https://doi.org/10.1038/s41467-017-00267-3.

M. Niang, A. Bei, K. Madnani, S. Pelly, S. Dankwa, U. Kanjee, K. Gunalan, A. Amaladoss, K. Yeo, N. Bob, B. Malleret, M.T. Duraisingh, and P.R. Preiser. STEVOR Is a Plasmodium falciparum Erythrocyte Binding Protein that Mediates Merozoite Invasion and Rosetting. Cell Host & Microbe, 16(1):81–93, jul 2014. ISSN 1931-3128. doi: 10.1016/j.chom.2014.06.004. URL http://dx.doi.org/10.1016/j.chom.2014.06.004.

H. Noedl, Y. Se, K. Schaecher, B. L. Smith, D. Socheat, M. M. Fukuda, and Artemisinin Resistance in Cambodia 1 Study Consortium. Evidence of artemisinin-resistant malaria in western cambodia. N Engl J Med, 359(24):2619–20, 2008. ISSN 1533-4406 (Electronic) 0028-4793 (Linking). doi: 10.1056/NEJMc0805011. URL https://www.ncbi.nlm.nih.gov/pubmed/19064625.

M. C. Oguike and C. J. Sutherland. Dimorphism in genes encoding sexual-stage proteins of Plasmodium ovale curtisi and Plasmodium ovale wallikeri. International Journal for Parasitology, 45(7):449–454, 2015. ISSN 18790135. doi: 10.1016/j.ijpara.2015.02.004.

H. Otsuki, O. Kaneko, A. Thongkukiatkul, M. Tachibana, H. Iriko, S. Takeo, T. Tsuboi, and M. Torii. Single amino acid substitution in plasmodium yoelii erythrocyte ligand determines its localization and controls parasite virulence. Proc Natl Acad Sci U S A, 106(17):7167–72, 2009. ISSN 1091-6490 (Electronic) 0027-8424 (Linking). doi: 10.1073/pnas.0811313106. URL https://www.ncbi.nlm.nih.gov/pubmed/19346470.

T. D. Otto, M. Sanders, M. Berriman, and C. Newbold. Iterative correction of reference nucleotides (icorn) using second generation sequencing technology. Bioinformatics (Oxford, England), 26:1704–7, 2010a. doi: 10.1093/bioinformatics/btq269. URL http://bioinformatics.oxfordjournals.org/content/26/14/1704.full.pdf.

T. D. Otto, D. Wilinski, S. Assefa, T. M. Keane, L. R. Sarry, U. Bohme, J. Lemieux, B. Barrell, A. Pain, M. Berriman, C. Newbold, and M. Llinas. New insights into the blood-stage transcriptome of plasmodium falciparum using rna-seq. Molecular microbiology, 76(1):12–24, 2010b. ISSN 1365-2958 (Electronic) 0950-382X (Linking). doi: 10.1111/j.1365-2958.2009.07026.x. URL http://www.ncbi.nlm.nih.gov/pubmed/20141604 http://onlinelibrary.wiley.com/store/10.1111/j.1365-2958.2009.07026.x/asset/j.1365-2958.2009.07026.x.pdf?v=1&t=gr2j55sf&s=1ed95e7fd4cba0a26c26fcd8b9e43d424e86d6b9 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2859250/pdf/mmi0076-0012.pdf.

T. D. Otto, U. Bohme, A. P. Jackson, M. Hunt, B. Franke-Fayard, W. A. Hoeijmakers, A. A. Religa, L. Robertson, M. Sanders, S. A. Ogun, D. Cunningham, A. Erhart, O. Billker, S. M. Khan, H. G. Stunnenberg, J. Langhorne, A. A. Holder, A. P. Waters, C. I. Newbold, A. Pain, M. Berriman, and C. J. Janse. A comprehensive evaluation of rodent malaria parasite genomes and gene expression. BMC Biol, 12(1):86, 2014. ISSN 1741-7007 (Electronic) 1741-7007 (Linking). doi: 10.1186/PREACCEPT-1233682211145405. URL http://www.ncbi.nlm.nih.gov/pubmed/25359557 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4242472/pdf/12915_2014_Article_86.pdf.

A. I. Oxbrow. Strain specific immunity to plasmodium berghei: a new genetic marker. Parasitology, 67(1):17–27, 1973. doi: 10.1017/S0031182000046254.

A. Pain, U. Böhme, A. Berry, K. Mungall, R. Finn, A. Jackson, T. Mourier, J. Mistry, E. Pasini, M. Aslett, S. Balasubrammaniam, K. Borgwardt, K. Brooks, C. Carret, T. Carver, I. Cherevach, T. Chillingworth, T. Clark, M. Galinski, N. Hall, D. Harper, D. Harris, H. Hauser, A. Ivens, C. Janssen, T. Keane, N. Larke, S. Lapp, M. Marti, S. Moule, I. Meyer, D. Ormond, N. Peters, M. Sanders, S. Sanders, T. Sargeant, M. Simmonds, F. Smith, R. Squares, S. Thurston, A. Tivey, D. Walker, B. White, E. Zuiderwijk, C. Churcher, M. Quail, A. Cowman, C. Turner, M. Rajandream, C. Kocken, A. Thomas, C. Newbold, B. Barrell, and M. Berriman. The genome of the simian and human malaria parasite plasmodium knowlesi. Nature, 455(7214):799–803, 2008. ISSN 0028-0836. doi: 10.1038/nature07306. URL http://dx.doi.org/10.1038/nature07306 http://www.nature.com/nature/journal/v455/n7214/pdf/nature07306.pdf.

E. M. Pasini, J. A. Braks, J. Fonager, O. Klop, E. Aime, R. Spaccapelo, T. D. Otto, M. Berriman, J. A. Hiss, A. W. Thomas, M. Mann, C. J. Janse, C. H. Kocken, and B. Franke-Fayard. Proteomic and genetic analyses demonstrate that plasmodium berghei blood stages export a large and diverse repertoire of proteins. Mol Cell Proteomics, 12(2):426–48, 2013. ISSN 1535-9484 (Electronic) 1535-9476 (Linking). doi: 10.1074/mcp.M112.021238. URL https://www.ncbi.nlm.nih.gov/pubmed/23197789.

S. Pattaradilokrat, S. J. Cheesman, and R. Carter. Linkage group selection: towards identifying genes controlling strain specific protective immunity in malaria. PLoS One, 2(9):e857, 2007. ISSN 1932-6203 (Electronic) 1932-6203 (Linking). doi: 10.1371/journal.pone.0000857. URL https://www.ncbi.nlm.nih.gov/pubmed/17848988.

S. Pattaradilokrat, R. L. Culleton, S. J. Cheesman, and R. Carter. Gene encoding erythrocyte binding ligand linked to blood stage multiplication rate phenotype in plasmodium yoelii yoelii. Proc Natl Acad Sci U S A, 106(17):7161–6, 2009. ISSN 1091-6490 (Electronic) 0027-8424 (Linking). doi: 10.1073/pnas.0811430106. URL http://www.ncbi.nlm.nih.gov/pubmed/19359470 http://www.pnas.org/content/106/17/7161.full.pdf.

R. D. Pearson, R. Amato, S. Auburn, O. Miotto, J. Almagro-Garcia, C. Amaratunga, S. Suon, S. Mao, R. Noviyanti, H. Trimarsanto, J. Marfurt, N. M. Anstey, T. William, M. F. Boni, C. Dolecek, H. T. Tran, N. J. White, P. Michon, P. Siba, L. Tavul, G. Harrison, A. Barry, I. Mueller, M. U. Ferreira, N. Karunaweera, M. Randrianarivelojosia, Q. Gao, C. Hubbart, L. Hart, B. Jeffery, E. Drury, D. Mead, M. Kekre, S. Campino, M. Manske, V. J. Cornelius, B. MacInnis, K. A. Rockett, A. Miles, J. C. Rayner, R. M. Fairhurst, F. Nosten, R. N. Price, and D. P. Kwiatkowski. Genomic analysis of local variation and recent evolution in plasmodium vivax. Nat Genet, 48(8):959–64, 2016. ISSN 1546-1718 (Electronic) 1061-4036 (Linking). doi: 10.1038/ng.3599. URL https://www.ncbi.nlm.nih.gov/pubmed/27348299.

S. L. Perkins, I. N. Sarkar, and R. Carter. The phylogeny of rodent malaria parasites: simultaneous analysis across three genomes. Infect Genet Evol, 7(1):74–83, 2007. ISSN 1567-1348 (Print) 1567-1348 (Linking). doi: 10.1016/j.meegid.2006.04.005. URL http://www.ncbi.nlm.nih.gov/pubmed/16765106 http://ac.els-cdn.com/S1567134806000621/1-s2.0-S1567134806000621-main.pdf?_tid=131f6a90-ddf0-11e4-a318-00000aab0f02&acdnat=1428498571_52bbb1199730d73fe53d2ee554e40a17.

I. Petersen, R. Eastman, and M. Lanzer. Drug-resistant malaria: molecular mechanisms and implications for public health. FEBS Lett, 585(11):1551–62, 2011a. ISSN 1873-3468 (Electronic) 0014-5793 (Linking). doi: 10.1016/j.febslet.2011.04.042. URL https://www.ncbi.nlm.nih.gov/pubmed/21530510.

T. N. Petersen, S. Brunak, G. von Heijne, and H. Nielsen. Signalp 4.0: discriminating signal peptides from transmembrane regions. Nat Methods, 8(10):785–6, 2011b. ISSN 1548-7105 (Electronic) 1548-7091 (Linking). doi: 10.1038/nmeth.1701. URL https://www.ncbi.nlm.nih.gov/pubmed/21959131.

C. Pfander, B. Anar, F. Schwach, T. D. Otto, M. Brochet, K. Volkmann, M. A. Quail, A. Pain, B. Rosen, W. Skarnes, J. C. Rayner, and O. Billker. A scalable pipeline for highly effective genetic modification of a malaria parasite. Nat Methods, 8(12):1078–82, 2011. ISSN 1548-7105 (Electronic) 1548-7091 (Linking). doi: 10.1038/nmeth.1742. URL https://www.ncbi.nlm.nih.gov/pubmed/22020067.

A. P. Phyo, S. Nkhoma, K. Stepniewska, E. A. Ashley, S. Nair, R. McGready, C. ler Moo, S. Al-Saai, A. M. Dondorp, K. M. Lwin, P. Singhasivanon, N. P. Day, N. J. White, T. J. Anderson, and F. Nosten. Emergence of artemisinin-resistant malaria on the western border of thailand: a longitudinal study. Lancet, 379(9830):1960–6, 2012. ISSN 1474-547X (Electronic) 0140-6736 (Linking). doi: 10.1016/S0140-6736(12)60484-X. URL https://www.ncbi.nlm.nih.gov/pubmed/22484134.

M. Prudencio, M. M. Mota, and A. M. Mendes. A toolbox to study liver stage malaria. Trends Parasitol, 27(12):565–74, 2011. ISSN 1471-5007 (Electronic) 1471-4922 (Linking). doi: 10.1016/j.pt.2011.09.004. URL https://www.ncbi.nlm.nih.gov/pubmed/22015112.

S. K. Puri and R. Chandra. Plasmodium vinckei: selection of a strain exhibiting stable resistance to arteether. Exp Parasitol, 114(2):129–32, 2006. ISSN 0014-4894 (Print) 0014-4894 (Linking). doi: 10.1016/j.exppara.2006.02.017. URL http://www.ncbi.nlm.nih.gov/pubmed/16624307.

A. R. Quinlan and I. M. Hall. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 26(6):841, 2010. doi: 10.1093/bioinformatics/btq033. URL http://dx.doi.org/10.1093/bioinformatics/btq033.

R. S. Ramiro, S. E. Reece, and D. J. Obbard. Molecular evolution and phylogenetics of rodent malaria parasites. BMC Evol Biol, 12:219, 2012. ISSN 1471-2148 (Electronic) 1471-2148 (Linking). doi: 10.1186/1471-2148-12-219. URL http://www.ncbi.nlm.nih.gov/pubmed/23151308.

L. C. Ranford-Cartwright and J. M. Mwangi. Analysis of malaria parasite phenotypes using experimental genetic crosses of Plasmodium falciparum. Int J Parasitol, 42(6):529–34, 2012. ISSN 1879-0135 (Electronic) 0020-7519 (Linking). doi: 10.1016/j.ijpara.2012.03.004. URL https://www.ncbi.nlm.nih.gov/pubmed/22475816.

M. B. Reed, K. J. Saliba, S. R. Caruana, K. Kirk, and A. F. Cowman. Pgh1 modulates sensitivity and resistance to multiple antimalarials in Plasmodium falciparum. Nature, 403(6772):906–909, 2000. ISSN 0028-0836. doi: 10.1038/35002615.

J. C. Reeder and G. V. Brown. Antigenic variation and immune evasion in Plasmodium falciparum malaria. Immunol Cell Biol, 74(6):546–54, 1996. ISSN 0818-9641 (Print) 0818-9641 (Linking). doi: 10.1038/icb.1996.88. URL https://www.ncbi.nlm.nih.gov/pubmed/8989593.

W. H. Richards and S. G. Williams. The removal of leucocytes from malaria infected blood. Ann Trop Med Parasitol, 67(2):249–50, 1973. ISSN 0003-4983 (Print) 0003-4983 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/4578939.

J. T. Robinson, H. Thorvaldsdottir, W. Winckler, M. Guttman, E. S. Lander, G. Getz, and J. P. Mesirov. Integrative genomics viewer. Nat Biotechnol, 29(1):24–6, 2011. ISSN 1546-1696 (Electronic) 1087-0156 (Linking). doi: 10.1038/nbt.1754. URL https://www.ncbi.nlm.nih.gov/pubmed/21221095.

J. Rodhain. [Plasmodium vinckei n. sp.; second Plasmodium parasite of wild rodents at Katange]. Ann Soc Belg Med Trop (1920), 32(3):275–9, 1952. ISSN 0365-6527 (Print) 0365-6527 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/12976900.

Roll Back Malaria. World malaria report 2005. Report, World Health Organization and UNICEF, 2005.

F. Ronquist, M. Teslenko, P. van der Mark, D. L. Ayres, A. Darling, S. Hohna, B. Larget, L. Liu, M. A. Suchard, and J. P. Huelsenbeck. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol, 61(3):539–42, 2012. ISSN 1076-836X (Electronic) 1063-5157 (Linking). doi: 10.1093/sysbio/sys029. URL https://www.ncbi.nlm.nih.gov/pubmed/22357727.

V. E. Rosario. Genetics of chloroquine resistance in malaria parasites. Nature, 261(5561):585–6, 1976. ISSN 0028-0836 (Print) 0028-0836 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/934297.

R. Rosenberg and J. Rungsiwongse. The Number of Sporozoites Produced by Individual Malaria Oocysts. The American Journal of Tropical Medicine and Hygiene, 45(5), 1991.

R. Ross. On some peculiar pigmented cells found in two mosquitos fed on malarial blood. BMJ, 2(1929):1786–1788, 1897. ISSN 0007-1447. doi: 10.1136/bmj.2.1929.1786. URL http://www.bmj.com/content/2/1929/1786.

R. Ross. Pigmented cells in mosquitos. Br Med J, 1(1939):550–1, 1898. ISSN 0007-1447 (Print) 0007-1447 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/20757668.

Ronald Ross. Infection of birds with proteosoma by the bites of mosquitos. Indian Med. Gaz, 34:1–3, 1899a.

Sir Ronald Ross. Du rôle des moustiques dans le paludisme. publisher not identified, 1899b.

K. Rutherford, J. Parkhill, J. Crook, T. Horsnell, P. Rice, M. A. Rajandream, and B. Barrell. Artemis: sequence visualization and annotation. Bioinformatics, 16(10):944–5, 2000. ISSN 1367-4803 (Print) 1367-4803 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/11120685.

G. G. Rutledge, U. Bohme, M. Sanders, A. J. Reid, J. A. Cotton, O. Maiga-Ascofare, A. A. Djimde, T. O. Apinjoh, L. Amenga-Etego, M. Manske, J. W. Barnwell, F. Renaud, B. Ollomo, F. Prugnolle, N. M. Anstey, S. Auburn, R. N. Price, J. S. McCarthy, D. P. Kwiatkowski, C. I. Newbold, M. Berriman, and T. D. Otto. Plasmodium malariae and P. ovale genomes provide insights into malaria parasite evolution. Nature, 542(7639):101–104, 2017. ISSN 1476-4687 (Electronic) 0028-0836 (Linking). doi: 10.1038/nature21038. URL https://www.ncbi.nlm.nih.gov/pubmed/28117441.

J. B. Sacci Jr., J. M. Ribeiro, F. Huang, U. Alam, J. A. Russell, P. L. Blair, A. Witney, D. J. Carucci, A. F. Azad, and J. C. Aguiar. Transcriptional analysis of in vivo Plasmodium yoelii liver stage gene expression. Mol Biochem Parasitol, 142(2):177–83, 2005. ISSN 0166-6851 (Print) 0166-6851 (Linking). doi: 10.1016/j.molbiopara.2005.03.018. URL https://www.ncbi.nlm.nih.gov/pubmed/15876462.

R. Sallares and S. Gomzi. Biomolecular archaeology of malaria, 2001. ISSN 13586122. URL http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:Biomolecular+archaeology+of+malaria#0.

C. P. Sanchez, J. Pfahler, H. A. Del Portillo, and M. Lanzer. Transient transfection of Plasmodium vivax blood-stage parasites. Methods Mol Biol, 923:151–9, 2013. ISSN 1940-6029 (Electronic) 1064-3745 (Linking). doi: 10.1007/978-1-62703-026-7_10. URL https://www.ncbi.nlm.nih.gov/pubmed/22990776.

T. J. Sargeant, M. Marti, E. Caler, J. M. Carlton, K. Simpson, T. P. Speed, and A. F. Cowman. Lineage-specific expansion of proteins exported to erythrocytes in malaria parasites. Genome Biol, 7(2):R12, 2006. ISSN 1474-760X (Electronic) 1474-7596 (Linking). doi: 10.1186/gb-2006-7-2-r12. URL https://www.ncbi.nlm.nih.gov/pubmed/16507167.

T. J. Satchwell. Erythrocyte invasion receptors for Plasmodium falciparum: new and old. Transfus Med, 26(2):77–88, 2016. ISSN 1365-3148 (Electronic) 0958-7578 (Linking). doi: 10.1111/tme.12280. URL https://www.ncbi.nlm.nih.gov/pubmed/26862042.

A. Scherf, J. J. Lopez-Rubio, and L. Riviere. Antigenic variation in Plasmodium falciparum. Annu Rev Microbiol, 62:445–70, 2008. ISSN 0066-4227 (Print) 0066-4227 (Linking). doi: 10.1146/annurev.micro.61.080706.093134. URL https://www.ncbi.nlm.nih.gov/pubmed/18785843.

P. Schlagenhauf. Malaria: from prehistory to present. Infect Dis Clin North Am, 18(2):189–205, table of contents, 2004. ISSN 0891-5520 (Print) 0891-5520 (Linking). doi: 10.1016/j.idc.2004.01.002. URL http://www.ncbi.nlm.nih.gov/pubmed/15145375.

C. Schlotterer. The evolution of molecular markers–just a matter of fashion? Nature reviews. Genetics, 5(1):63–69, jan 2004. ISSN 1471-0056 (Print). doi: 10.1038/nrg1249.

J. Schmutz, J. Wheeler, J. Grimwood, M. Dickson, J. Yang, C. Caoile, E. Bajorek, S. Black, Y. M. Chan, M. Denys, J. Escobar, D. Flowers, D. Fotopulos, C. Garcia, M. Gomez, E. Gonzales, L. Haydu, F. Lopez, L. Ramirez, J. Retterer, A. Rodriguez, S. Rogers, A. Salazar, M. Tsai, and R. M. Myers. Quality assessment of the human genome sequence. Nature, 429(6990):365–8, 2004. ISSN 1476-4687 (Electronic) 0028-0836 (Linking). doi: 10.1038/nature02390. URL https://www.ncbi.nlm.nih.gov/pubmed/15164052.

F. Schwach, E. Bushell, A. R. Gomes, B. Anar, G. Girling, C. Herd, J. C. Rayner, and O. Billker. PlasmoGEM, a database supporting a community resource for large-scale experimental genetics in malaria parasites. Nucleic Acids Res, 43(Database issue):D1176–82, 2015. ISSN 1362-4962 (Electronic) 0305-1048 (Linking). doi: 10.1093/nar/gku1143. URL http://www.ncbi.nlm.nih.gov/pubmed/25593348.

H. E. Shortt and P. C. Garnham. Demonstration of a persisting exo-erythrocytic cycle in Plasmodium cynomolgi and its bearing on the production of relapses. Br Med J, 1(4564):1225–8, 1948. ISSN 0007-1447 (Print) 0007-1447 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/18865981.

R. E. Sinden and R. H. Hartley. Identification of the meiotic division of malarial parasites. The Journal of Protozoology, 32(4):742–744, 1985. ISSN 1550-7408. doi: 10.1111/j.1550-7408.1985.tb03113.x. URL http://dx.doi.org/10.1111/j.1550-7408.1985.tb03113.x.

R. E. Sinden and K. Matuschewski. The Sporozoite. In Molecular Approaches to Malaria, pages 169–190. American Society of Microbiology, 2005. URL http://www.asmscience.org/content/book/10.1128/9781555817558.chap9.

R. E. Sinden, R. H. Hartley, and L. Winger. The development of Plasmodium ookinetes in vitro: an ultrastructural study including a description of meiotic division. Parasitology, 91(02):227, oct 1985. ISSN 0031-1820. doi: 10.1017/S0031182000057334. URL http://www.journals.cambridge.org/abstract_S0031182000057334.

P. J. Spence, D. Cunningham, W. Jarra, J. Lawton, J. Langhorne, and J. Thompson. Transformation of the rodent malaria parasite Plasmodium chabaudi. Nat Protoc, 6(4):553–61, 2011. ISSN 1750-2799 (Electronic) 1750-2799 (Linking). doi: 10.1038/nprot.2011.313. URL https://www.ncbi.nlm.nih.gov/pubmed/21455190.

P. J. Spence, W. Jarra, P. Levy, A. J. Reid, L. Chappell, T. Brugat, M. Sanders, M. Berriman, and J. Langhorne. Vector transmission regulates immune control of Plasmodium virulence. Nature, 498(7453):228–31, 2013. ISSN 1476-4687 (Electronic) 0028-0836 (Linking). doi: 10.1038/nature12231. URL https://www.ncbi.nlm.nih.gov/pubmed/23719378.

T. Spielmann, D. J. Fergusen, and H. P. Beck. ETRAMPs, a new Plasmodium falciparum gene family coding for developmentally regulated and highly charged membrane proteins located at the parasite-host cell interface. Mol Biol Cell, 14(4):1529–44, 2003. ISSN 1059-1524 (Print) 1059-1524 (Linking). doi: 10.1091/mbc.E02-04-0240. URL https://www.ncbi.nlm.nih.gov/pubmed/12686607.

P. Srinivasan, E. G. Abraham, A. K. Ghosh, J. Valenzuela, J. M. Ribeiro, G. Dimopoulos, F. C. Kafatos, J. H. Adams, H. Fujioka, and M. Jacobs-Lorena. Analysis of the Plasmodium and Anopheles transcriptomes during oocyst differentiation. J Biol Chem, 279(7):5581–7, 2004. ISSN 0021-9258 (Print) 0021-9258 (Linking). doi: 10.1074/jbc.M307587200. URL https://www.ncbi.nlm.nih.gov/pubmed/14627711.

M. Stanke, R. Steinkamp, S. Waack, and B. Morgenstern. AUGUSTUS: a web server for gene finding in eukaryotes. Nucleic Acids Res, 32(Web Server issue):W309–12, 2004. ISSN 1362-4962 (Electronic) 0305-1048 (Linking). doi: 10.1093/nar/gkh379. URL https://www.ncbi.nlm.nih.gov/pubmed/15215400.

R. Stephens, R. L. Culleton, and T. J. Lamb. The contribution of Plasmodium chabaudi to our understanding of malaria. Trends Parasitol, 28(2):73–82, 2012. ISSN 1471-5007 (Electronic) 1471-4922 (Linking). doi: 10.1016/j.pt.2011.10.006. URL http://www.ncbi.nlm.nih.gov/pubmed/22100995.

J. Straimer, M. C. Lee, A. H. Lee, B. Zeitler, A. E. Williams, J. R. Pearl, L. Zhang, E. J. Rebar, P. D. Gregory, M. Llinas, F. D. Urnov, and D. A. Fidock. Site-specific genome editing in Plasmodium falciparum using engineered zinc-finger nucleases. Nat Methods, 9(10):993–8, 2012. ISSN 1548-7105 (Electronic) 1548-7091 (Linking). doi: 10.1038/nmeth.2143. URL http://www.ncbi.nlm.nih.gov/pubmed/22922501.

J. Stubbs, K. M. Simpson, T. Triglia, D. Plouffe, C. J. Tonkin, M. T. Duraisingh, A. G. Maier, E. A. Winzeler, and A. F. Cowman. Molecular mechanism for switching of P. falciparum invasion pathways into human erythrocytes. Science (New York, N.Y.), 309:1384–1387, 2005. ISSN 0036-8075. doi: 10.1126/science.1115257.

X. Su and T. E. Wellems. Toward a High-Resolution Plasmodium falciparum Linkage Map: Polymorphic Markers from Hundreds of Simple Sequence Repeats. Genomics, 33(3):430–444, 1996. ISSN 0888-7543. doi: 10.1006/geno.1996.0218. URL http://www.sciencedirect.com/science/article/pii/S0888754396902189.

X. Su, L. A. Kirkman, H. Fujioka, and T. E. Wellems. Complex Polymorphisms in an ~330 kDa Protein Are Linked to Chloroquine-Resistant P. falciparum in Southeast Asia and Africa. Cell, 91(5):593–603, 1997. ISSN 0092-8674. doi: 10.1016/S0092-8674(00)80447-X. URL http://www.sciencedirect.com/science/article/pii/S009286740080447X.

X. Su, M. T. Ferdig, Y. Huang, C. Q. Huynh, A. Liu, J. You, J. C. Wootton, and T. E. Wellems. A genetic map and recombination parameters of the human malaria parasite Plasmodium falciparum. Science, 286(5443):1351–3, 1999a. ISSN 0036-8075 (Print) 0036-8075 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/10558988.

X. Su, M. T. Ferdig, Y. Huang, C. Q. Huynh, A. Liu, J. You, J. C. Wootton, and T. E. Wellems. A Genetic Map and Recombination Parameters of the Human Malaria Parasite Plasmodium falciparum. Science, 286(5443):1351–1353, 1999b. ISSN 00368075, 10959203. URL http://www.jstor.org/stable/2899932.

C. J. Sutherland, P. Lansdell, M. Sanders, J. Muwanguzi, D. A. van Schalkwyk, H. Kaur, D. Nolder, J. Tucker, H. M. Bennett, T. D. Otto, M. Berriman, T. A. Patel, R. Lynn, E. Gkrania-Klotsas, and P. L. Chiodini. pfk13-independent treatment failure in four imported cases of Plasmodium falciparum malaria treated with artemether-lumefantrine in the United Kingdom. Antimicrob Agents Chemother, 61(3), 2017. ISSN 1098-6596 (Electronic) 0066-4804 (Linking). doi: 10.1128/AAC.02382-16. URL https://www.ncbi.nlm.nih.gov/pubmed/28137810.

T. H. Ta, S. Hisam, M. Lanza, A. I. Jiram, N. Ismail, and J. M. Rubio. First case of a naturally acquired human infection with Plasmodium cynomolgi. Malar J, 13:68, 2014. ISSN 1475-2875 (Electronic) 1475-2875 (Linking). doi: 10.1186/1475-2875-13-68. URL https://www.ncbi.nlm.nih.gov/pubmed/24564912.

A. S. Tarun, K. Baer, R. F. Dumpit, S. Gray, N. Lejarcegui, U. Frevert, and S. H. Kappe. Quantitative isolation and in vivo imaging of malaria parasite liver stages. Int J Parasitol, 36(12):1283–93, 2006. ISSN 0020-7519 (Print) 0020-7519 (Linking). doi: 10.1016/j.ijpara.2006.06.009. URL http://www.ncbi.nlm.nih.gov/pubmed/16890231.

A. S. Tarun, X. Peng, R. F. Dumpit, Y. Ogata, H. Silva-Rivera, N. Camargo, T. M. Daly, L. W. Bergman, and S. H. Kappe. A combined transcriptome and proteome survey of malaria parasite liver stages. Proc Natl Acad Sci U S A, 105(1):305–10, 2008. ISSN 1091-6490 (Electronic) 0027-8424 (Linking). doi: 10.1073/pnas.0710780104. URL https://www.ncbi.nlm.nih.gov/pubmed/18172196.

R. Tewari, U. Straschil, A. Bateman, U. Bohme, I. Cherevach, P. Gong, A. Pain, and O. Billker. The systematic functional analysis of Plasmodium protein kinases identifies essential regulators of mosquito transmission. Cell Host Microbe, 8(4):377–87, 2010. ISSN 1934-6069 (Electronic) 1931-3128 (Linking). doi: 10.1016/j.chom.2010.09.006. URL http://www.ncbi.nlm.nih.gov/pubmed/20951971.

C. Trapnell, L. Pachter, and S. L. Salzberg. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics, 25(9):1105–11, 2009. ISSN 1367-4811 (Electronic) 1367-4803 (Linking). doi: 10.1093/bioinformatics/btp120. URL http://www.ncbi.nlm.nih.gov/pubmed/19289445.

L. van Den Berghe, E. Peel, M. Chardome, and F. L. Lambrecht. Asexual cycle of Plasmodium atheruri n. sp. of porcupine Atherurus africanus centralis in Belgium Congo. Annales de la Societe belge de medecine tropicale, 38(5):971–6, 1958. URL https://www.ncbi.nlm.nih.gov/pubmed/13627659.

G. A. Van der Auwera, M. O. Carneiro, C. Hartl, R. Poplin, G. Del Angel, A. Levy-Moonshine, T. Jordan, K. Shakir, D. Roazen, J. Thibault, E. Banks, K. V. Garimella, D. Altshuler, S. Gabriel, and M. A. DePristo. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr Protoc Bioinformatics, 43:11.10.1–33, 2013. ISSN 1934-340X (Electronic) 1934-3396 (Linking). doi: 10.1002/0471250953.bi1110s43. URL https://www.ncbi.nlm.nih.gov/pubmed/25431634.

M. R. van Dijk, A. P. Waters, and C. J. Janse. Stable transfection of malaria parasite blood stages. Science, 268(5215):1358–62, 1995. ISSN 0036-8075 (Print) 0036-8075 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/7761856.

M. R. van Dijk, C. J. Janse, and A. P. Waters. Expression of a Plasmodium gene introduced into subtelomeric regions of Plasmodium berghei chromosomes. Science, 271(5249):662–665, 1996. ISSN 0036-8075. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=8571132.

A. M. Vaughan, R. S. Pinapati, I. H. Cheeseman, N. Camargo, M. Fishbaugher, L. A. Checkley, S. Nair, C. A. Hutyra, F. H. Nosten, T. J. Anderson, M. T. Ferdig, and S. H. Kappe. Plasmodium falciparum genetic crosses in a humanized mouse model. Nat Methods, 12(7):631–3, 2015. ISSN 1548-7105 (Electronic) 1548-7091 (Linking). doi: 10.1038/nmeth.3432. URL https://www.ncbi.nlm.nih.gov/pubmed/26030447.

A. Vignal, D. Milan, M. SanCristobal, and A. Eggen. A review on SNP and other types of molecular markers and their use in animal genetics. Genetics Selection Evolution, 34(3):275, May 2002. ISSN 1297-9686. doi: 10.1186/1297-9686-34-3-275. URL http://dx.doi.org/10.1186/1297-9686-34-3-275.

I. H. Vincke and M. Lips. Un nouveau Plasmodium d’un rongeur sauvage du Congo, Plasmodium berghei n.sp. Ann Soc Belg Med Trop (1920), 28(1):97–104, 1948. ISSN 0365-6527 (Print) 0365-6527 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/18874862.

P. Vos, R. Hogers, M. Bleeker, M. Reijans, T. V. D. Lee, M. Hornes, A. Friters, J. Pot, J. Paleman, M. Kuiper, and M. Zabeau. AFLP: A new technique for DNA fingerprinting. Nucleic Acids Research, 23(21):4407–4414, 1995. ISSN 03051048. doi: 10.1093/nar/23.21.4407.

J. C. Wagner, R. J. Platt, S. J. Goldfless, F. Zhang, and J. C. Niles. Efficient CRISPR-Cas9-mediated genome editing in Plasmodium falciparum. Nat Methods, 11(9):915–8, 2014. ISSN 1548-7105 (Electronic) 1548-7091 (Linking). doi: 10.1038/nmeth.3063. URL http://www.ncbi.nlm.nih.gov/pubmed/25108687.

M. Wahlgren, S. Goel, and R. R. Akhouri. Variant surface antigens of Plasmodium falciparum and their roles in severe malaria. Nat Rev Micro, 15(8):479–491, aug 2017. ISSN 1740-1526. URL http://dx.doi.org/10.1038/nrmicro.2017.47.

A. Walker-Jonah, S. A. Dolan, R. W. Gwadz, L. J. Panton, and T. E. Wellems. An RFLP map of the Plasmodium falciparum genome, recombination rates and favored linkage groups in a genetic cross. Molecular and Biochemical Parasitology, 51(2):313–320, 1992. ISSN 0166-6851. doi: 10.1016/0166-6851(92)90081-T. URL http://www.sciencedirect.com/science/article/pii/016668519290081T.

D. Walliker. Genetic recombination in malaria parasites. Experimental Parasitology, 69(3):303–309, 1989. ISSN 0014-4894. doi: 10.1016/0014-4894(89)90078-7. URL http://www.sciencedirect.com/science/article/pii/0014489489900787.

D. Walliker. The role of molecular genetics in field studies on malaria parasites. International Journal for Parasitology, 24(6):799–808, sep 1994. ISSN 00207519. doi: 10.1016/0020-7519(94)90006-X. URL http://linkinghub.elsevier.com/retrieve/pii/002075199490006X.

D. Walliker, R. Carter, and S. Morgan. Genetic recombination in malaria parasites. Nature, 232(5312):561–2, 1971. ISSN 0028-0836 (Print) 0028-0836 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/4937500.

D. Walliker, R. Carter, and S. Morgan. Genetic recombination in Plasmodium berghei. Parasitology, 66(2):309–320, 1973. doi: 10.1017/S0031182000045248.

D. Walliker, R. Carter, and A. Sanderson. Genetic studies on Plasmodium chabaudi: recombination between enzyme markers. Parasitology, 70(1):19–24, 1975. ISSN 0031-1820 (Print) 0031-1820 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/1118185.

D. Walliker, A. Sanderson, M. Yoeli, and B. J. Hargreaves. A genetic investigation of virulence in a rodent malaria parasite. Parasitology, 72(2):183–194, 1976. doi: 10.1017/S0031182000048484.

D. Walliker, I. A. Quakyi, T. E. Wellems, T. F. McCutchan, A. Szarfman, W. T. London, L. M. Corcoran, T. R. Burkot, and R. Carter. Genetic analysis of the human malaria parasite Plasmodium falciparum. Science, 236(4809):1661–6, 1987. ISSN 0036-8075 (Print) 0036-8075 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/3299700.

F. Wang, P. Krai, E. Deu, B. Bibb, C. Lauritzen, J. Pedersen, M. Bogyo, and M. Klemba. Biochemical characterization of Plasmodium falciparum dipeptidyl aminopeptidase 1. Mol Biochem Parasitol, 175(1):10–20, 2011. ISSN 1872-9428 (Electronic) 0166-6851 (Linking). doi: 10.1016/j.molbiopara.2010.08.004. URL https://www.ncbi.nlm.nih.gov/pubmed/20833209.

Q. Wang, S. Brown, D. S. Roos, V. Nussenzweig, and P. Bhanot. Transcriptome of axenic liver stages of Plasmodium yoelii. Mol Biochem Parasitol, 137(1):161–8, 2004. ISSN 0166-6851 (Print) 0166-6851 (Linking). doi: 10.1016/j.molbiopara.2004.06.001. URL https://www.ncbi.nlm.nih.gov/pubmed/15279962.

J. G. Waterkeyn, M. E. Wickham, K. M. Davern, B. M. Cooke, R. L. Coppel, J. C. Reeder, J. G. Culvenor, R. F. Waller, and A. F. Cowman. Targeted mutagenesis of Plasmodium falciparum erythrocyte membrane protein 3 (PfEMP3) disrupts cytoadherence of malaria-infected red blood cells. The EMBO journal, 19(12):2813–23, 2000. ISSN 0261-4189. doi: 10.1093/emboj/19.12.2813. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=203347&tool=pmcentrez&rendertype=abstract.

A. P. Waters. Orthology between the genomes of Plasmodium falciparum and rodent malaria parasites: possible practical applications, volume 357. 2002. doi: 10.1098/rstb.2001.1011. URL http://rstb.royalsocietypublishing.org/content/royptb/357/1417/55.full.pdf.

T. E. Wellems, L. J. Panton, I. Y. Gluzman, V. E. do Rosario, R. W. Gwadz, A. Walker-Jonah, and D. J. Krogstad. Chloroquine resistance not linked to mdr-like genes in a Plasmodium falciparum cross. Nature, 345(6272):253–5, 1990. ISSN 0028-0836 (Print) 0028-0836 (Linking). doi: 10.1038/345253a0. URL https://www.ncbi.nlm.nih.gov/pubmed/1970614.

T. E. Wellems, A. Walker-Jonah, and L. J. Panton. Genetic mapping of the chloroquine-resistance locus on Plasmodium falciparum chromosome 7. Proc Natl Acad Sci U S A, 88(8):3382–6, 1991. ISSN 0027-8424 (Print) 0027-8424 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/1673031.

W. H. Wernsdorfer and D. Payne. The dynamics of drug resistance in Plasmodium falciparum. Pharmacol Ther, 50(1):95–121, 1991. ISSN 0163-7258 (Print) 0163-7258 (Linking). URL https://www.ncbi.nlm.nih.gov/pubmed/1891480.

N. J. White. Antimalarial drug resistance. J Clin Invest, 113(8):1084–92, 2004. ISSN 0021-9738 (Print) 0021-9738 (Linking). doi: 10.1172/JCI21682. URL https://www.ncbi.nlm.nih.gov/pubmed/15085184.

World Health Organization. Guidelines for the treatment of malaria. Third edition. World Health Organization, Geneva, 2015. ISBN 9789241549127.

World Health Organization. World malaria report 2016. World Health Organization, Geneva, 2016. ISBN 9789241511711.

Y. Wu, L. A. Kirkman, and T. E. Wellems. Transformation of Plasmodium falciparum malaria parasites by homologous integration of plasmids that confer resistance to pyrimethamine. Proceedings of the National Academy of Sciences of the United States of America, 93(3):1130–4, 1996. ISSN 0027-8424. doi: 10.1073/pnas.93.3.1130. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=40043&tool=pmcentrez&rendertype=abstract.

Y. Wu, C. D. Sifri, H. H. Lei, X. Z. Su, and T. E. Wellems. Transfection of Plasmodium falciparum within human red blood cells. Proceedings of the National Academy of Sciences of the United States of America, 92(4):973–7, 1995. ISSN 0027-8424. doi: 10.1073/pnas.92.4.973. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=42619&tool=pmcentrez&rendertype=abstract.

X. Y. Yam, T. Brugat, A. Siau, J. Lawton, D. S. Wong, A. Farah, J. S. Twang, X. Gao, J. Langhorne, and P. R. Preiser. Characterization of the Plasmodium Interspersed Repeats (PIR) proteins of Plasmodium chabaudi indicates functional diversity. Scientific Reports, 6:23449, mar 2016. URL http://dx.doi.org/10.1038/srep23449.

R. Yang and Z. Su. Analyzing circadian expression data by harmonic regression based on autoregressive spectral estimation. Bioinformatics, 26(12):i168–74, 2010. ISSN 1367-4811 (Electronic) 1367-4803 (Linking). doi: 10.1093/bioinformatics/btq189. URL https://www.ncbi.nlm.nih.gov/pubmed/20529902.

M. Yoeli and H. Most. Pre-erythrocytic development of Plasmodium berghei. Nature, 205:715–6, 1965. ISSN 0028-0836 (Print) 0028-0836 (Linking). URL http://www.ncbi.nlm.nih.gov/pubmed/14287427.

D. R. Zerbino and E. Birney. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res, 18(5):821–9, 2008. ISSN 1088-9051 (Print) 1088-9051 (Linking). doi: 10.1101/gr.074492.107. URL http://www.ncbi.nlm.nih.gov/pubmed/18349386.

V. Zuzarte-Luis, M. M. Mota, and A. M. Vigario. Malaria infections: what and how can mice teach us. J Immunol Methods, 410:113–22, 2014. ISSN 1872-7905 (Electronic) 0022-1759 (Linking). doi: 10.1016/j.jim.2014.05.001. URL http://www.ncbi.nlm.nih.gov/pubmed/24837740.

APPENDICES

A Functional annotation of Plasmodium vinckei genes. P. vinckei gene IDs are shown with their product calls, number of transmembrane domains (TMs), PEXEL motif score (a score >4.3 is considered PEXEL positive) and presence/absence of a signal peptide sequence.

Gene ID Number of TMs PEXEL score Signal sequence Product PVBDA 0100010 NA 6.1608 NA haloacid dehalogenase-like hydrolase, putative, fragment PVBDA 0100020 1 6.11789 NA fam-b protein, fragment PVBDA 0100030 1 NA YES fam-c protein PVBDA 0100040 1 NA NA PIR protein CIR protein PVBDA 0100050 1 NA NA CIR protein PIR protein PVBDA 0100060 2 NA NA PIR protein CIR protein PVBDA 0100080 NA NA YES fam-a protein PVBDA 0100090 1 NA YES fam-c protein PVBDA 0100100 1 NA NA CIR protein PIR protein PVBDA 0100110 1 NA NA PIR protein CIR protein PVBDA 0100120 2 NA NA PIR protein CIR protein PVBDA 0100130 NA NA YES fam-a protein PVBDA 0100140 NA NA YES fam-a protein PVBDA 0100150 NA NA YES fam-a protein, pseudogene PVBDA 0100170 NA 3.01732 NA fam-b protein PVBDA 0100180 2 NA NA conserved rodent malaria protein, unknown function PVBDA 0100190 1 NA YES fam-c protein PVBDA 0100210 1 NA NA fam-a protein PVBDA 0100220 3 9.33248 NA Plasmodium exported protein, unknown function PVBDA 0100250 1 NA NA PIR protein CIR protein PVBDA 0100260 1 7.18772 NA schizont membrane associated cytoadherence protein, putative PVBDA 0100270 1 11.9513 NA Plasmodium exported protein, unknown function PVBDA 0100320 2 NA NA elongation factor G, putative 176

Gene ID Number of TMs PEXEL score Signal sequence Product PVBDA 0100400 1 NA NA mitochondrial chaperone BCS1, putative PVBDA 0100410 1 NA NA dihydroorotate dehydrogenase, putative PVBDA 0100430 10 NA NA cation H+ antiporter, putative PVBDA 0100440 2 NA NA conserved Plasmodium protein, unknown function PVBDA 0100460 1 NA NA centrosomal protein CEP76, putative PVBDA 0100560 1 NA NA conserved Plasmodium protein, unknown function PVBDA 0100660 7 NA NA long chain polyunsaturated fatty acid elongation enzyme, putative PVBDA 0100770 4 0.946001 NA RING zinc finger protein, putative PVBDA 0100780 2 NA NA uroporphyrinogen III decarboxylase, putative PVBDA 0100790 11 NA NA G-protein associated signal transduction protein, putative PVBDA 0100800 6 NA NA para-hydroxybenzoate–polyprenyltransferase, putative PVBDA 0100900 1 NA NA conserved Plasmodium protein, unknown function PVBDA 0100970 8 NA NA zinc transporter ZIP1, putative PVBDA 0101000 1 NA NA mitochondrial cardiolipin synthase, putative PVBDA 0101010 1 NA NA conserved Plasmodium protein, unknown function PVBDA 0101020 1 NA NA conserved Plasmodium protein, unknown function PVBDA 0101030 4 NA NA palmitoyltransferase DHHC2, putative PVBDA 0101120 6 4.56874 NA SNARE associated Golgi protein, putative PVBDA 0101130 2 NA NA succinate dehydrogenase subunit 3, putative PVBDA 0101180 4 NA NA conserved Plasmodium protein, unknown function PVBDA 0101220 1 NA NA conserved Plasmodium protein, unknown function PVBDA 0101230 2 NA NA eukaryotic translation initiation factor 3 subunit L, putative PVBDA 0101250 4 NA NA conserved protein, unknown function PVBDA 0101285 1 NA YES 6-cysteine protein PVBDA 0101320 2 NA NA conserved Plasmodium protein, unknown function PVBDA 0101330 1 NA NA conserved Plasmodium protein, unknown function PVBDA 0101340 13 NA NA rhoptry protein ROP14, putative PVBDA 0101370 1 NA NA conserved Plasmodium protein, unknown function PVBDA 0101430 12 NA NA major facilitator superfamily-related transporter, putative PVBDA 
0101440 1 NA NA fam-b protein PVBDA 0101460 1 NA NA fam-b protein PVBDA 0101470 1 6.45839 YES fam-a protein PVBDA 0101480 NA NA YES fam-a protein PVBDA 0101490 NA NA YES fam-a protein PVBDA 0101510 NA NA YES erythrocyte membrane antigen 1 PVBDA 0101520 2 NA NA CIR protein PIR protein PVBDA 0101540 NA NA YES fam-a protein PVBDA 0101550 NA NA YES fam-a protein PVBDA 0101560 2 NA NA PIR protein CIR protein PVBDA 0101570 1 NA NA PIR protein CIR protein 177

Gene ID Number of TMs PEXEL score Signal sequence Product PVBDA 0101580 NA NA YES fam-c protein PVBDA 0101590 NA NA YES fam-a protein PVBDA 0101600 NA NA YES fam-a protein PVBDA 0101610 2 NA NA CIR protein PIR protein PVBDA 0101620 1 NA NA CIR protein PIR protein PVBDA 0101630 1 NA NA PIR protein CIR protein PVBDA 0101640 2 NA NA CIR protein PIR protein PVBDA 0101650 1 NA NA CIR protein PIR protein PVBDA 0101660 1 NA NA CIR protein PIR protein PVBDA 0101670 NA 0.460666 YES fam-a protein PVBDA 0200010 NA NA YES fam-a protein, fragment PVBDA 0200020 NA NA YES fam-a protein PVBDA 0200030 1 NA YES fam-c protein PVBDA 0200040 1 NA NA CIR protein PIR protein PVBDA 0200050 1 NA NA CIR protein PIR protein PVBDA 0200060 2 NA NA PIR protein CIR protein PVBDA 0200070 1 NA NA CIR protein PIR protein PVBDA 0200080 1 NA NA PIR protein CIR protein PVBDA 0200090 NA NA YES fam-a protein, fragment PVBDA 0200100 1 NA YES fam-c protein PVBDA 0200110 1 NA NA PIR protein CIR protein PVBDA 0200120 2 NA NA CIR protein PIR protein PVBDA 0200130 2 NA NA PIR protein CIR protein PVBDA 0200140 1 NA YES fam-a protein, fragment PVBDA 0200160 1 NA NA PIR protein CIR protein PVBDA 0200170 1 NA NA PIR protein CIR protein PVBDA 0200180 NA NA YES fam-a protein PVBDA 0200190 NA NA YES fam-a protein PVBDA 0200200 NA 1.093 YES fam-a protein PVBDA 0200220 1 NA YES fam-c protein PVBDA 0200230 2 NA NA PIR protein CIR protein PVBDA 0200240 NA NA YES fam-a protein PVBDA 0200260 2 NA YES early transcribed membrane protein PVBDA 0200270 1 4.26342 NA conserved Plasmodium protein, unknown function PVBDA 0200330 1 NA NA UMP-CMP kinase, putative PVBDA 0200470 2 NA NA conserved Plasmodium protein, unknown function PVBDA 0200480 8 NA NA phosphatidate cytidylyltransferase, putative PVBDA 0200560 NA 6.34843 YES LCCL domain-containing protein PVBDA 0200590 2 NA YES secreted ookinete protein, putative PVBDA 0200600 1 NA NA conserved Plasmodium protein, unknown function 178

PVBDA 0200620  3  NA  NA  mitochondrial carrier protein, putative
PVBDA 0200690  6  NA  NA  serine threonine protein kinase, putative
PVBDA 0200700  10  NA  NA  lipid sterol:H+ symporter, putative
PVBDA 0200710  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0200810  8  NA  NA  calcium-transporting ATPase, putative
PVBDA 0200900  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0200940  11  NA  NA  novel putative transporter 1, putative
PVBDA 0200950  12  NA  NA  major facilitator superfamily-related transporter, putative
PVBDA 0200970  1  NA  NA  conserved protein, unknown function
PVBDA 0201000  NA  3.45082  NA  StAR-related lipid transfer protein
PVBDA 0201010  5  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0201020  NA  4.53071  NA  thrombospondin related sporozoite protein, putative
PVBDA 0201030  2  NA  NA  parasite-infected erythrocyte surface protein
PVBDA 0201050  6  NA  NA  L-seryl-tRNA(Sec) kinase, putative
PVBDA 0201080  12  NA  NA  nucleoside transporter 4, putative
PVBDA 0201120  4  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0201140  4  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0201175  3  NA  NA  dolichyl-diphosphooligosaccharide–protein glycosyltransferase subunit DAD1, putative
PVBDA 0201180  1  NA  NA  mitochondrial import inner membrane translocase subunit TIM50, putative
PVBDA 0201240  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0201270  9  NA  NA  cation transporting ATPase, putative
PVBDA 0201300  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0201410  NA  NA  YES  1-cys peroxiredoxin, putative
PVBDA 0201490  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0201560  1  2.56315  NA  Plasmodium exported protein, unknown function
PVBDA 0201570  3  6.05129  NA  Plasmodium exported protein, unknown function
PVBDA 0201580  2  7.59654  NA  Plasmodium exported protein, unknown function
PVBDA 0201590  1  5.98034  NA  heat shock protein, putative
PVBDA 0201600  NA  NA  YES  fam-a protein
PVBDA 0201610  NA  NA  YES  fam-a protein
PVBDA 0201620  NA  NA  YES  fam-a protein
PVBDA 0201630  NA  NA  YES  fam-a protein
PVBDA 0201640  1  NA  NA  PIR protein CIR protein
PVBDA 0201650  1  NA  NA  PIR protein CIR protein
PVBDA 0201660  1  NA  YES  fam-c protein
PVBDA 0201670  NA  NA  YES  fam-a protein
PVBDA 0201690  1  4.71965  YES  fam-a protein
PVBDA 0201700  1  NA  NA  PIR protein CIR protein, pseudogene
PVBDA 0201710  1  NA  NA  CIR protein PIR protein
PVBDA 0201720  2  NA  NA  PIR protein CIR protein

PVBDA 0201730  2  NA  NA  PIR protein CIR protein
PVBDA 0201740  2  NA  NA  CIR protein PIR protein
PVBDA 0201750  1  NA  NA  PIR protein CIR protein
PVBDA 0201760  1  NA  NA  PIR protein CIR protein
PVBDA 0201770  2  NA  NA  PIR protein CIR protein
PVBDA 0201780  1  NA  NA  PIR protein CIR protein
PVBDA 0201790  1  NA  NA  CIR protein PIR protein, pseudogene
PVBDA 0201810  1  NA  NA  CIR protein PIR protein
PVBDA 0201820  2  NA  NA  PIR protein CIR protein
PVBDA 0201830  1  NA  NA  CIR protein PIR protein
PVBDA 0300020  2  NA  NA  CIR protein PIR protein
PVBDA 0300030  2  NA  NA  CIR protein PIR protein
PVBDA 0300040  1  NA  NA  CIR protein PIR protein
PVBDA 0300045  1  NA  NA  PIR protein CIR protein
PVBDA 0300050  2  NA  NA  PIR protein CIR protein
PVBDA 0300055  1  NA  NA  CIR protein PIR protein
PVBDA 0300060  2  NA  NA  PIR protein CIR protein
PVBDA 0300070  1  NA  NA  PIR protein CIR protein
PVBDA 0300080  2  NA  NA  CIR protein PIR protein
PVBDA 0300090  1  NA  NA  PIR protein CIR protein
PVBDA 0300100  1  NA  NA  PIR protein CIR protein
PVBDA 0300110  NA  NA  YES  fam-a protein, fragment
PVBDA 0300120  NA  NA  YES  fam-c protein
PVBDA 0300130  1  NA  NA  CIR protein PIR protein
PVBDA 0300140  1  NA  NA  PIR protein CIR protein
PVBDA 0300150  2  NA  NA  PIR protein CIR protein
PVBDA 0300160  1  2.415  NA  fam-b protein
PVBDA 0300170  2  NA  NA  PIR protein CIR protein, pseudogene
PVBDA 0300190  2  9.62271  NA  fam-b protein
PVBDA 0300200  2  NA  NA  CIR protein PIR protein
PVBDA 0300210  1  8.70287  YES  fam-b protein
PVBDA 0300220  1  8.09161  YES  fam-b protein
PVBDA 0300230  NA  5.80289  YES  fam-a protein
PVBDA 0300240  NA  NA  YES  fam-a protein
PVBDA 0300250  1  NA  NA  CIR protein PIR protein
PVBDA 0300260  1  3.72832  NA  Plasmodium exported protein, unknown function
PVBDA 0300340  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0300370  NA  NA  YES  5’-3’ exonuclease, putative
PVBDA 0300390  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0300400  1  NA  NA  conserved Plasmodium protein, unknown function

PVBDA 0300440  2  NA  NA  5’-3’ exonuclease, putative
PVBDA 0300450  12  NA  NA  hexose transporter, putative
PVBDA 0300530  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0300580  1  NA  NA  cysteine desulfuration protein SufE, putative
PVBDA 0300590  10  NA  NA  pantothenate transporter, putative
PVBDA 0300640  1  3.31808  YES  merozoite surface protein 4/5, putative
PVBDA 0300670  NA  NA  YES  serine repeat antigen 5, putative
PVBDA 0300680  NA  NA  YES  serine repeat antigen 4, putative
PVBDA 0300690  NA  NA  YES  serine repeat antigen 4, putative
PVBDA 0300700  NA  NA  YES  serine repeat antigen 2, putative
PVBDA 0300710  NA  NA  YES  serine repeat antigen 2, putative
PVBDA 0300740  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0300750  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0300760  NA  NA  YES  acyl carrier protein, putative
PVBDA 0300770  1  NA  NA  ribosome-recycling factor, putative
PVBDA 0300810  1  NA  YES  6-cysteine protein
PVBDA 0300840  1  NA  NA  2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase, putative
PVBDA 0300870  13  NA  NA  transporter, putative
PVBDA 0300890  1  NA  NA  Sec61-gamma subunit of protein translocation complex, putative
PVBDA 0300920  12  NA  NA  monocarboxylate transporter, putative
PVBDA 0300950  NA  1.68545  NA  conserved Plasmodium protein, unknown function
PVBDA 0300970  1  NA  NA  syntaxin, putative
PVBDA 0301030  1  NA  NA  apicoplast beta-ketoacyl-acyl carrier protein synthase III precursor, putative
PVBDA 0301040  3  NA  NA  GAF domain-related protein, putative
PVBDA 0301050  3  NA  NA  UDP-N-acetylglucosamine transferase subunit ALG14, putative
PVBDA 0301090  5  NA  NA  GDP-fructose:GMP antiporter, putative
PVBDA 0301130  5  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0301170  10  NA  NA  multidrug efflux pump, putative
PVBDA 0301190  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0301210  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0301240  1  NA  NA  tetratricopeptide repeat protein, putative
PVBDA 0301270  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0301360  12  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0301370  NA  NA  YES  rhoptry neck protein 6, putative
PVBDA 0301380  NA  NA  YES  acyl-CoA synthetase, putative
PVBDA 0301430  3  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0301460  4  NA  NA  palmitoyltransferase DHHC11, putative
PVBDA 0301530  6  NA  NA  MtN3-like protein
PVBDA 0301550  11  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0301570  12  NA  NA  conserved Plasmodium membrane protein, unknown function

PVBDA 0301610  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0301630  3  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0301680  5  NA  NA  conserved Plasmodium membrane protein, unknown function
PVBDA 0301730  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0301770  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0301780  1  NA  NA  ribosome associated membrane protein RAMP4, putative
PVBDA 0301790  1  NA  NA  pseudouridine synthase, putative
PVBDA 0301810  3  NA  NA  Plasmodium exported protein, unknown function
PVBDA 0301820  2  2.01523  NA  Plasmodium exported protein, unknown function
PVBDA 0301830  2  6.09429  NA  fam-b protein
PVBDA 0301840  1  NA  YES  reticulocyte binding protein, putative
PVBDA 0301860  NA  NA  YES  fam-a protein
PVBDA 0301870  NA  NA  YES  fam-b protein
PVBDA 0301880  2  NA  NA  PIR protein CIR protein
PVBDA 0301890  1  NA  YES  fam-c protein
PVBDA 0301900  NA  NA  YES  fam-a protein
PVBDA 0301920  NA  9.15166  NA  haloacid dehalogenase-like hydrolase, putative
PVBDA 0301930  1  NA  NA  CIR protein PIR protein
PVBDA 0301940  NA  NA  YES  fam-a protein
PVBDA 0301950  NA  NA  YES  fam-a protein, fragment
PVBDA 0400010  1  NA  YES  fam-c protein
PVBDA 0400020  1  NA  NA  CIR protein PIR protein, pseudogene
PVBDA 0400030  1  NA  NA  PIR protein CIR protein
PVBDA 0400035  NA  8.90559  YES  fam-b protein
PVBDA 0400040  1  NA  NA  CIR protein PIR protein, pseudogene
PVBDA 0400050  2  NA  NA  CIR protein PIR protein
PVBDA 0400060  NA  NA  YES  fam-a protein
PVBDA 0400070  1  NA  YES  fam-c protein
PVBDA 0400080  1  NA  NA  CIR protein PIR protein
PVBDA 0400090  1  NA  NA  PIR protein CIR protein
PVBDA 0400110  1  6.2513  NA  fam-b protein
PVBDA 0400140  NA  NA  YES  fam-a protein
PVBDA 0400150  NA  NA  YES  fam-a protein
PVBDA 0400160  2  NA  NA  PIR protein CIR protein
PVBDA 0400180  1  NA  YES  fam-c protein, fragment
PVBDA 0400190  1  NA  NA  CIR protein PIR protein
PVBDA 0400200  1  NA  NA  CIR protein PIR protein, fragment
PVBDA 0400207  1  NA  NA  CIR protein PIR protein, fragment
PVBDA 0400210  2  NA  NA  CIR protein PIR protein
PVBDA 0400220  1  NA  YES  fam-a protein

PVBDA 0400230  1  8.77838  YES  fam-b protein
PVBDA 0400240  2  NA  NA  PIR protein CIR protein
PVBDA 0400250  NA  NA  YES  fam-a protein
PVBDA 0400270  NA  NA  YES  fam-a protein
PVBDA 0400280  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0400310  3  NA  NA  ABC transporter B family member 4, putative
PVBDA 0400390  4  NA  NA  palmitoyltransferase DHHC1, putative
PVBDA 0400440  1  NA  NA  phosphatidylethanolamine-binding protein, putative
PVBDA 0400510  1  2.58635  YES  circumsporozoite (CS) protein, putative
PVBDA 0400540  NA  NA  YES  elongation factor (EF-TS), putative
PVBDA 0400550  4  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0400570  11  NA  NA  conserved Plasmodium membrane protein, unknown function
PVBDA 0400590  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0400630  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0400670  1  NA  NA  FAD-dependent glycerol-3-phosphate dehydrogenase, putative
PVBDA 0400680  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0400700  2  NA  NA  membrane magnesium transporter, putative
PVBDA 0401020  1  NA  NA  phd finger protein, putative
PVBDA 0401070  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0401110  1  NA  YES  valine–tRNA ligase, putative
PVBDA 0401140  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0401150  1  NA  NA  dolichyl-diphosphooligosaccharide–protein glycosyltransferase subunit 1, putative
PVBDA 0401160  1  NA  NA  plasmepsin VI, putative
PVBDA 0401200  6  NA  NA  E3 ubiquitin-protein ligase, putative
PVBDA 0401240  8  NA  NA  major facilitator superfamily-related transporter, putative
PVBDA 0401300  1  2.76187  YES  ubiquitin-protein ligase, putative
PVBDA 0401320  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0401370  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0401390  NA  NA  YES  co-chaperone p23, putative
PVBDA 0401400  1  NA  NA  vesicle transport v-SNARE protein, putative
PVBDA 0401410  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0401430  3  NA  YES  DER1-like protein, putative
PVBDA 0401520  2  NA  NA  circumsporozoite- and TRAP-related protein, putative
PVBDA 0401530  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0401570  7  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0401580  6  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0401630  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0401640  1  NA  YES  inorganic pyrophosphatase, putative
PVBDA 0401680  5  NA  NA  formate-nitrite transporter, putative
PVBDA 0401690  2  NA  NA  HVA22 TB2 DP1 family protein, putative

PVBDA 0401780  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0401830  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0401870  2  NA  NA  nucleoporin NUP100/NSP100, putative
PVBDA 0401890  6  NA  NA  copper-transporting ATPase, putative
PVBDA 0401910  NA  NA  YES  bacterial histone-like protein, putative
PVBDA 0401940  1  NA  NA  signal peptidase complex subunit 3, putative
PVBDA 0401960  11  NA  NA  conserved protein, unknown function
PVBDA 0401970  1  3.47685  YES  PH domain-containing protein, putative
PVBDA 0402010  NA  NA  YES  LCCL domain-containing protein
PVBDA 0402080  3  NA  NA  protein RER1, putative
PVBDA 0402090  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0402100  2  9.46614  NA  fam-b protein
PVBDA 0402110  NA  6.19427  YES  fam-b protein
PVBDA 0402120  NA  NA  YES  fam-a protein
PVBDA 0402140  1  NA  NA  PIR protein CIR protein
PVBDA 0402150  1  NA  NA  CIR protein PIR protein
PVBDA 0402160  1  NA  YES  fam-c protein
PVBDA 0402180  NA  7.04157  NA  haloacid dehalogenase-like hydrolase, putative
PVBDA 0402190  1  NA  YES  fam-c protein
PVBDA 0402200  2  NA  NA  PIR protein CIR protein
PVBDA 0402210  NA  NA  YES  fam-c protein
PVBDA 0402220  NA  NA  YES  fam-a protein
PVBDA 0402230  1  NA  NA  PIR protein CIR protein
PVBDA 0402250  1  NA  YES  fam-c protein
PVBDA 0402260  NA  NA  YES  fam-a protein
PVBDA 0402270  1  NA  NA  PIR protein CIR protein
PVBDA 0500010  NA  NA  YES  fam-a protein
PVBDA 0500020  NA  NA  YES  fam-c protein
PVBDA 0500030  1  NA  NA  CIR protein PIR protein
PVBDA 0500040  2  NA  NA  PIR protein CIR protein
PVBDA 0500050  NA  NA  YES  fam-a protein
PVBDA 0500055  1  NA  YES  fam-c protein
PVBDA 0500060  1  NA  NA  CIR protein PIR protein
PVBDA 0500070  2  NA  NA  CIR protein PIR protein
PVBDA 0500090  1  NA  NA  CIR protein PIR protein
PVBDA 0500100  2  NA  NA  fam-a protein
PVBDA 0500110  1  7.38471  NA  fam-b protein
PVBDA 0500120  NA  NA  YES  fam-a protein
PVBDA 0500140  3  NA  NA  PIR protein CIR protein
PVBDA 0500150  1  NA  YES  reticulocyte binding protein, putative

PVBDA 0500170  2  NA  YES  early transcribed membrane protein
PVBDA 0500180  2  NA  NA  up-regulated in infective sporozoites early transcribed membrane protein
PVBDA 0500290  1  NA  NA  tRNA pseudouridine synthase, putative
PVBDA 0500320  2  4.32182  NA  conserved Plasmodium protein, unknown function
PVBDA 0500360  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0500370  1  NA  NA  conserved protein, unknown function
PVBDA 0500440  10  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0500470  NA  NA  YES  tRNA methyltransferase, putative
PVBDA 0500540  2  NA  NA  hypothetical protein
PVBDA 0500620  7  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0500630  5  NA  YES  endomembrane protein 70, putative
PVBDA 0500660  7  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0500710  10  NA  NA  conserved Plasmodium membrane protein, unknown function
PVBDA 0500720  8  NA  NA  ZIP domain-containing protein, putative
PVBDA 0500780  1  NA  NA  CDGSH iron-sulfur domain-containing protein, putative
PVBDA 0500790  5  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0500820  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0500840  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0500860  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0500870  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0500950  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0500980  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0501040  10  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0501080  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0501270  2  NA  NA  methyltransferase, putative
PVBDA 0501280  2  NA  NA  protoporphyrinogen oxidase, putative
PVBDA 0501290  4  NA  NA  zinc finger, C3HC4 type, putative
PVBDA 0501340  1  NA  YES  merozoite TRAP-like protein, putative
PVBDA 0501350  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0501360  1  NA  NA  inner membrane complex protein 1m, putative
PVBDA 0501400  2  NA  NA  LEM3 CDC50 family protein, putative
PVBDA 0501430  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0501480  4  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0501550  1  NA  YES  28 kDa ookinete surface protein, putative
PVBDA 0501560  2  NA  NA  25 kDa ookinete surface antigen precursor, putative
PVBDA 0501610  NA  NA  YES  protein phosphatase inhibitor 3, putative
PVBDA 0501690  5  NA  NA  DER1-like protein, putative
PVBDA 0501700  1  NA  NA  conserved protein, unknown function
PVBDA 0501740  NA  1.29047  NA  conserved Plasmodium protein, unknown function
PVBDA 0501760  1  NA  NA  early transcribed membrane protein

PVBDA 0501820  1  NA  YES  plasmepsin VII, putative
PVBDA 0501850  NA  3.90665  NA  conserved Plasmodium protein, unknown function
PVBDA 0501860  NA  2.54444  NA  apicoplast ribosomal protein L27 precursor, putative
PVBDA 0501890  4  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0501950  1  NA  NA  conserved Plasmodium membrane protein, unknown function
PVBDA 0501960  NA  NA  YES  S-antigen, putative
PVBDA 0501990  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0502020  NA  0.565112  NA  conserved Plasmodium protein, unknown function
PVBDA 0502040  11  NA  NA  acetyl-CoA transporter, putative
PVBDA 0502060  NA  NA  YES  pyruvate kinase 2, putative
PVBDA 0502080  3  NA  NA  ADP ATP transporter on adenylate translocase, putative
PVBDA 0502150  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0502160  2  NA  NA  antigen UB05, putative
PVBDA 0502300  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0502310  2  NA  NA  steroid dehydrogenase, putative
PVBDA 0502320  2  NA  NA  transmembrane emp24 domain-containing protein, putative
PVBDA 0502390  7  NA  YES  serpentine receptor, putative
PVBDA 0502440  NA  NA  YES  asparagine rich protein, putative
PVBDA 0502450  6  NA  NA  glideosome associated protein with multiple membrane spans 2, putative
PVBDA 0502470  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0502480  2  NA  YES  early transcribed membrane protein
PVBDA 0502500  1  NA  NA  tryptophan-rich protein tryptophan-rich antigen
PVBDA 0502510  1  NA  NA  conserved rodent malaria protein, unknown function
PVBDA 0502520  1  NA  NA  PIR protein CIR protein
PVBDA 0502530  2  2.29518  NA  Plasmodium exported protein, unknown function
PVBDA 0502540  2  NA  YES  early transcribed membrane protein
PVBDA 0502550  NA  NA  YES  fam-a protein
PVBDA 0502570  NA  NA  YES  fam-a protein
PVBDA 0502580  1  NA  YES  fam-a protein
PVBDA 0502600  NA  NA  YES  6-cysteine protein
PVBDA 0502620  1  NA  YES  phosphatidylinositol 4-kinase, putative
PVBDA 0502630  2  NA  NA  PIR protein CIR protein
PVBDA 0502640  1  NA  NA  CIR protein PIR protein
PVBDA 0502650  NA  NA  YES  fam-c protein
PVBDA 0502660  NA  NA  YES  fam-a protein
PVBDA 0502680  1  NA  NA  CIR protein PIR protein
PVBDA 0600010  1  NA  NA  reticulocyte binding protein, putative, pseudogene
PVBDA 0600040  NA  NA  YES  fam-a protein
PVBDA 0600050  NA  NA  YES  fam-a protein
PVBDA 0600060  6  NA  NA  reticulocyte binding protein, putative, pseudogene

PVBDA 0600080  NA  NA  YES  fam-a protein
PVBDA 0600090  1  NA  NA  XPA binding protein 1, putative
PVBDA 0600110  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0600180  2  NA  NA  trimethylguanosine synthase, putative
PVBDA 0600270  6  NA  NA  major facilitator superfamily domain-containing protein, putative
PVBDA 0600360  7  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0600400  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0600410  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0600500  2  NA  NA  conserved protein, unknown function
PVBDA 0600640  NA  NA  YES  RNA-binding protein, putative
PVBDA 0600690  1  NA  NA  50S ribosomal protein L29, putative
PVBDA 0600710  8  NA  NA  cysteine repeat modular protein 3, putative
PVBDA 0600730  10  NA  NA  amino acid transporter, putative
PVBDA 0600770  6  NA  NA  zinc finger protein, putative
PVBDA 0600830  7  NA  NA  3’,5’-cyclic nucleotide phosphodiesterase, putative
PVBDA 0600840  NA  NA  YES  porphobilinogen deaminase, putative
PVBDA 0600850  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0600870  5  NA  NA  ABC transporter B family member 7, putative
PVBDA 0600960  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0600970  6  NA  NA  GPI mannosyltransferase 1, putative
PVBDA 0601000  4  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0601060  5  NA  NA  non-SERCA-type Ca2+-transporting P-ATPase, putative
PVBDA 0601070  1  NA  NA  glutathione peroxidase-like thioredoxin peroxidase, putative
PVBDA 0601150  6  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0601190  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0601200  2  NA  NA  N-acetyltransferase, putative
PVBDA 0601280  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0601350  4  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0601370  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0601470  1  NA  YES  cysteine desulfurase, putative
PVBDA 0601500  7  NA  NA  drug metabolite transporter, putative
PVBDA 0601510  7  NA  NA  conserved Plasmodium membrane protein, unknown function
PVBDA 0601520  4  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0601630  10  NA  NA  cysteine repeat modular protein 2, putative
PVBDA 0601670  5  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0601680  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0601700  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0601740  2  NA  NA  LEM3 CDC50 family protein, putative
PVBDA 0601770  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0601780  2  NA  NA  conserved Plasmodium protein, unknown function

PVBDA 0601840  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0601860  4  NA  NA  phosphoinositide-binding protein, putative
PVBDA 0601890  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0601940  5  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0601960  NA  NA  YES  secreted ookinete protein, putative
PVBDA 0601980  2  NA  NA  V-type ATPase V0 subunit e, putative
PVBDA 0602060  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0602070  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0602100  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0602150  NA  1.15438  NA  metallo-hydrolase oxidoreductase, putative
PVBDA 0602230  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0602290  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0602320  NA  6.86109  NA  conserved Plasmodium protein, unknown function
PVBDA 0602370  1  NA  NA  tryptophan-rich antigen tryptophan-rich protein
PVBDA 0602380  1  NA  NA  conserved rodent malaria protein, unknown function
PVBDA 0602400  1  NA  NA  tryptophan-rich protein tryptophan-rich antigen
PVBDA 0602410  NA  NA  YES  fam-a protein
PVBDA 0602420  NA  NA  YES  fam-a protein
PVBDA 0602440  NA  NA  YES  fam-a protein
PVBDA 0602470  2  NA  NA  PIR protein CIR protein, pseudogene
PVBDA 0602490  2  2.76851  NA  fam-b protein
PVBDA 0602500  NA  NA  YES  fam-a protein
PVBDA 0602510  1  NA  NA  PIR protein CIR protein
PVBDA 0602520  1  NA  NA  CIR protein PIR protein
PVBDA 0602530  1  NA  YES  fam-c protein
PVBDA 0602540  NA  NA  YES  fam-a protein
PVBDA 0602550  NA  NA  YES  fam-a protein
PVBDA 0602570  2  NA  NA  PIR protein CIR protein
PVBDA 0700010  2  NA  NA  CIR protein PIR protein
PVBDA 0700030  NA  NA  YES  fam-a protein
PVBDA 0700040  NA  4.82508  YES  fam-a protein
PVBDA 0700050  NA  NA  YES  fam-a protein
PVBDA 0700070  2  NA  NA  early transcribed membrane protein
PVBDA 0700090  NA  NA  YES  erythrocyte membrane antigen 1
PVBDA 0700100  2  12.1902  NA  Plasmodium exported protein, unknown function
PVBDA 0700110  NA  12.7238  NA  Plasmodium exported protein, unknown function
PVBDA 0700170  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0700230  4  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0700250  1  NA  YES  GPI-anchored micronemal antigen, putative
PVBDA 0700270  12  NA  NA  folate transporter 1, putative

PVBDA 0700290  3  NA  NA  mitochondrial inner membrane protein OXA1, putative
PVBDA 0700330  6  NA  NA  rhomboid protease ROM3, putative
PVBDA 0700340  NA  NA  YES  protein disulfide isomerase
PVBDA 0700360  9  NA  YES  magnesium transporter, putative
PVBDA 0700470  1  NA  NA  SNARE protein, putative
PVBDA 0700510  2  NA  NA  alpha beta hydrolase, putative
PVBDA 0700520  2  NA  NA  E3 ubiquitin-protein ligase, putative
PVBDA 0700550  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0700590  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0700600  1  7.51725  NA  conserved protein, unknown function
PVBDA 0700610  1  NA  NA  translation initiation factor IF-3, putative
PVBDA 0700650  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0700660  6  NA  NA  lipase maturation factor, putative
PVBDA 0700690  10  NA  NA  nucleoside transporter 2, putative
PVBDA 0700750  1  NA  NA  DnaJ protein, putative
PVBDA 0700760  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0700850  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0700900  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0700950  1  NA  NA  protein transport protein SEC61 subunit beta, putative
PVBDA 0701030  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0701040  3  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0701070  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0701080  6  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0701100  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0701160  3  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0701180  1  NA  NA  conserved protein, unknown function
PVBDA 0701210  1  NA  YES  perforin-like protein 5, putative
PVBDA 0701270  1  NA  NA  BEM46-like protein, putative
PVBDA 0701360  NA  NA  YES  rhoptry neck protein 5, putative
PVBDA 0701410  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0701470  NA  NA  YES  chaperone protein ClpB1, putative
PVBDA 0701500  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0701510  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0701570  2  NA  NA  ubiquitin, putative
PVBDA 0701590  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0701610  1  NA  NA  phosphoglucomutase-2, putative
PVBDA 0701630  1  NA  NA  lysine decarboxylase-like protein, putative
PVBDA 0701640  1  NA  NA  50S ribosomal protein L10, putative
PVBDA 0701670  3  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0701690  2  NA  NA  Rab5-interacting protein, putative

PVBDA 0701730  NA  NA  YES  GTP-binding protein, putative
PVBDA 0701870  1  NA  NA  glutamyl-tRNA(Gln) amidotransferase subunit A, putative
PVBDA 0701920  2  NA  NA  prohibitin-like protein, putative
PVBDA 0702060  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0702140  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0702180  3  NA  NA  protein transport protein GOT1, putative
PVBDA 0702200  2  1.89227  NA  conserved Plasmodium protein, unknown function
PVBDA 0702230  2  NA  YES  apical merozoite protein, putative
PVBDA 0702250  13  NA  NA  phosphatidylinositol 4-kinase, putative
PVBDA 0702270  NA  8.59306  YES  fam-a protein
PVBDA 0702290  1  NA  NA  CIR protein PIR protein, fragment
PVBDA 0702300  1  NA  YES  fam-c protein
PVBDA 0702310  1  NA  NA  lysophospholipase, putative
PVBDA 0702330  1  NA  NA  PIR protein CIR protein
PVBDA 0702340  2  NA  NA  CIR protein PIR protein
PVBDA 0702345  1  NA  NA  CIR protein PIR protein
PVBDA 0702350  2  NA  NA  CIR protein PIR protein
PVBDA 0702360  1  NA  NA  CIR protein PIR protein
PVBDA 0702370  NA  NA  YES  fam-c protein
PVBDA 0702380  NA  NA  YES  fam-a protein
PVBDA 0702390  1  NA  NA  PIR protein CIR protein
PVBDA 0702410  2  NA  NA  CIR protein PIR protein
PVBDA 0702420  1  NA  NA  PIR protein CIR protein
PVBDA 0702430  1  NA  YES  fam-c protein
PVBDA 0800020  NA  NA  YES  fam-a protein
PVBDA 0800030  NA  NA  YES  fam-c protein
PVBDA 0800040  1  NA  NA  PIR protein CIR protein
PVBDA 0800050  2  NA  NA  CIR protein PIR protein
PVBDA 0800060  1  NA  YES  fam-a protein, fragment
PVBDA 0800080  1  NA  NA  CIR protein PIR protein, pseudogene
PVBDA 0800090  NA  NA  YES  fam-a protein
PVBDA 0800100  NA  NA  YES  fam-a protein, fragment
PVBDA 0800110  1  NA  YES  fam-c protein
PVBDA 0800120  1  NA  NA  PIR protein CIR protein
PVBDA 0800150  NA  NA  YES  fam-a protein
PVBDA 0800170  2  6.96516  YES  fam-b protein
PVBDA 0800180  2  NA  NA  CIR protein PIR protein, pseudogene
PVBDA 0800200  NA  NA  YES  fam-a protein
PVBDA 0800210  1  9.2012  YES  fam-b protein
PVBDA 0800280  NA  NA  YES  erythrocyte membrane antigen 1

PVBDA 0800300  NA  0.957159  NA  regulator of chromosome condensation, putative
PVBDA 0800330  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0800350  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0800460  2  NA  YES  6-cysteine protein
PVBDA 0800470  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0800480  NA  NA  YES  dipeptidyl aminopeptidase 3, putative
PVBDA 0800500  NA  NA  YES  6-cysteine protein
PVBDA 0800570  4  NA  NA  conserved Plasmodium membrane protein, unknown function
PVBDA 0800600  1  NA  NA  apical sushi protein, putative
PVBDA 0800630  2  NA  NA  conserved rodent malaria protein, unknown function
PVBDA 0800710  1  NA  NA  methyltransferase, putative
PVBDA 0800720  1  NA  NA  peptidyl-tRNA hydrolase 2, putative
PVBDA 0800800  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0800880  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0800890  1  NA  NA  tRNA N6-adenosine threonylcarbamoyltransferase, putative
PVBDA 0800950  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0800970  1  NA  NA  peptide chain release factor 2, putative
PVBDA 0801000  1  5.92396  NA  erythrocyte vesicle protein 1, putative
PVBDA 0801110  NA  NA  YES  translocon component PTEX150, putative
PVBDA 0801130  5  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0801210  2  NA  NA  ER membrane protein complex subunit 4, putative
PVBDA 0801250  8  NA  NA  cytochrome c oxidase assembly protein COX15, putative
PVBDA 0801280  4  NA  NA  mitochondrial import inner membrane translocase subunit TIM17, putative
PVBDA 0801410  9  NA  YES  protease, putative
PVBDA 0801450  NA  NA  YES  arginine–tRNA ligase, putative
PVBDA 0801460  NA  NA  YES  pseudouridylate synthase, putative
PVBDA 0801480  3  NA  NA  phospholipid or glycerol acyltransferase, putative
PVBDA 0801530  12  NA  NA  major facilitator superfamily-related transporter, putative
PVBDA 0801660  11  NA  NA  major facilitator superfamily domain-containing protein, putative
PVBDA 0801670  1  NA  NA  thioredoxin 3, putative
PVBDA 0801680  1  1.00872  NA  conserved Plasmodium protein, unknown function
PVBDA 0801690  4  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0801700  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0801780  2  NA  NA  conserved Plasmodium membrane protein, unknown function
PVBDA 0801790  4  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0801850  NA  NA  YES  heat shock protein 70, putative
PVBDA 0801860  2  NA  NA  glideosome-associated protein 50, putative
PVBDA 0801870  1  NA  NA  cytochrome b5, putative
PVBDA 0801900  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0801960  6  NA  NA  DNAJ-like molecular chaperone protein, putative

PVBDA 0801990  NA  NA  YES  protein disulfide isomerase, putative
PVBDA 0802000  12  NA  NA  major facilitator superfamily domain-containing protein, putative
PVBDA 0802010  1  NA  NA  dolichyl-diphosphooligosaccharide–protein glycosyltransferase, putative
PVBDA 0802050  9  NA  NA  long chain fatty acid elongation enzyme, putative
PVBDA 0802070  NA  NA  YES  CS domain protein, putative
PVBDA 0802090  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0802200  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0802275  NA  1.41913  NA  conserved Plasmodium protein, unknown function
PVBDA 0802330  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0802380  NA  5.53776  NA  perforin-like protein 3
PVBDA 0802450  1  NA  NA  patatin-like phospholipase, putative
PVBDA 0802500  13  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0802550  3  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0802660  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0802680  12  NA  NA  monocarboxylate transporter, putative
PVBDA 0802690  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0802700  11  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0802830  1  3.51717  NA  phosphatidylserine decarboxylase, putative
PVBDA 0802880  4  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0802900  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0802910  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0803020  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0803030  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0803040  2  NA  NA  procollagen lysine 5-dioxygenase, putative
PVBDA 0803070  1  NA  YES  merozoite surface protein 1
PVBDA 0803090  2  NA  NA  diacylglycerol kinase, putative
PVBDA 0803120  8  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0803185  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0803230  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0803240  3  NA  NA  protein MAM3, putative
PVBDA 0803270  4  NA  NA  peptide chain release factor 1, putative
PVBDA 0803280  NA  NA  YES  apicoplast ribosomal protein S6, putative
PVBDA 0803330  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0803350  1  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0803360  2  NA  NA  conserved Plasmodium protein, unknown function
PVBDA 0803530  2  NA  NA  protein kish, putative
PVBDA 0803550  2  NA  NA  phosphatidylinositol N-acetylglucosaminyltransferase subunit P, putative
PVBDA 0803570  2  NA  NA  cytoadherence linked asexual protein 9, putative
PVBDA 0803580  2  12.6965  NA  Plasmodium exported protein, unknown function
PVBDA 0803600  1  NA  NA  PIR protein CIR protein

PVBDA 0803620 NA NA YES fam-a protein
PVBDA 0803630 NA NA YES fam-a protein
PVBDA 0803640 NA NA YES fam-c protein
PVBDA 0803650 NA NA YES fam-a protein
PVBDA 0803680 2 9.57588 YES fam-b protein
PVBDA 0803690 NA NA YES fam-a protein
PVBDA 0803700 2 NA NA PIR protein CIR protein
PVBDA 0803710 1 NA NA CIR protein PIR protein
PVBDA 0803720 2 0.727046 NA Plasmodium exported protein, unknown function
PVBDA 0803730 2 9.09297 NA Plasmodium exported protein, unknown function
PVBDA 0900010 1 NA YES fam-a protein
PVBDA 0900020 1 NA NA CIR protein PIR protein
PVBDA 0900030 NA NA YES fam-a protein, fragment
PVBDA 0900040 1 NA NA PIR protein CIR protein
PVBDA 0900050 1 NA YES fam-a protein
PVBDA 0900060 1 NA NA CIR protein PIR protein
PVBDA 0900070 2 NA NA PIR protein CIR protein
PVBDA 0900080 NA NA YES erythrocyte membrane antigen 1
PVBDA 0900090 NA NA YES fam-a protein
PVBDA 0900100 NA NA YES fam-c protein
PVBDA 0900120 1 NA NA PIR protein CIR protein
PVBDA 0900130 NA NA YES fam-a protein
PVBDA 0900140 1 NA YES fam-c protein
PVBDA 0900150 1 NA NA CIR protein PIR protein
PVBDA 0900160 1 NA NA CIR protein PIR protein
PVBDA 0900170 NA 3.18452 YES fam-a protein
PVBDA 0900180 NA NA YES erythrocyte membrane antigen 1
PVBDA 0900190 NA NA YES fam-a protein
PVBDA 0900200 1 7.87284 NA fam-b protein
PVBDA 0900210 2 NA NA CIR protein PIR protein
PVBDA 0900220 2 NA YES early transcribed membrane protein
PVBDA 0900230 1 NA YES fam-a protein
PVBDA 0900240 1 NA NA fam-b protein
PVBDA 0900290 NA NA YES CPW-WPC family protein
PVBDA 0900340 1 NA NA syntaxin, putative
PVBDA 0900370 NA NA YES thioredoxin, putative
PVBDA 0900410 9 NA NA metabolite drug transporter, putative
PVBDA 0900450 1 NA NA WD repeat-containing protein WRAP73, putative
PVBDA 0900460 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0900490 NA NA YES translocon component PTEX88, putative

PVBDA 0900510 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0900520 4 NA NA conserved Plasmodium protein, unknown function
PVBDA 0900560 1 NA NA exonuclease, putative
PVBDA 0900720 5 NA NA mechanosensitive ion channel protein, putative
PVBDA 0900790 1 5.59539 NA heat shock protein, putative
PVBDA 0900860 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0900870 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 0900990 1 NA NA syntaxin, putative
PVBDA 0901040 1 NA NA peptidyl-prolyl cis-trans isomerase, putative
PVBDA 0901050 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 0901180 8 NA NA UDP-galactose transporter, putative
PVBDA 0901230 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0901260 6 NA NA rhomboid protease ROM1, putative
PVBDA 0901320 1 NA NA glycerol-3-phosphate dehydrogenase, putative
PVBDA 0901350 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0901370 1 NA NA chabaupain 2
PVBDA 0901400 4 NA NA palmitoyltransferase DHHC9, putative
PVBDA 0901410 NA NA YES rhoptry neck protein 4, putative
PVBDA 0901450 1 NA NA guanine nucleotide-exchange factor SEC12, putative
PVBDA 0901460 11 NA NA folate transporter 2, putative
PVBDA 0901470 12 NA NA dolichyl-diphosphooligosaccharide–protein glycosyltransferase subunit STT3, putative
PVBDA 0901480 1 NA YES dipeptidyl aminopeptidase 1, putative
PVBDA 0901500 1 NA YES heat shock protein 101, putative
PVBDA 0901520 10 NA NA conserved Plasmodium membrane protein, unknown function
PVBDA 0901550 3 NA NA conserved Plasmodium protein, unknown function
PVBDA 0901700 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0901770 1 NA NA ATP-dependent zinc metalloprotease FTSH, putative
PVBDA 0901820 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 0901830 2 NA NA CorA-like Mg2+ transporter protein, putative
PVBDA 0901840 NA NA YES tRNA nucleotidyltransferase, putative
PVBDA 0901850 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0901870 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 0901890 2 NA NA palmitoyltransferase DHHC3, putative
PVBDA 0901910 1 NA NA tyrosine kinase-like protein, putative
PVBDA 0901930 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0901940 2 NA NA circumsporozoite-related antigen exported protein 1, putative
PVBDA 0901960 NA NA YES peptidase, M16 family, putative
PVBDA 0901990 1 NA NA GPI transamidase component GPI16, putative
PVBDA 0902010 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 0902050 3 NA NA conserved Plasmodium protein, unknown function

PVBDA 0902110 1 NA NA RING zinc finger protein, putative
PVBDA 0902140 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0902190 1 NA NA endoplasmic reticulum oxidoreductin, putative
PVBDA 0902300 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0902480 1 NA NA protein phosphatase, putative
PVBDA 0902520 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0902530 1 NA YES protein disulfide-isomerase, putative
PVBDA 0902550 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0902670 2 NA NA GPI-anchor transamidase, putative
PVBDA 0902690 1 NA NA spermidine synthase, putative
PVBDA 0902700 NA NA YES fam-d protein
PVBDA 0902710 NA NA YES fam-d protein, pseudogene
PVBDA 0902720 NA NA YES fam-d protein
PVBDA 0902730 NA NA YES fam-d protein
PVBDA 0902740 NA NA YES fam-d protein
PVBDA 0902750 NA NA YES fam-d protein
PVBDA 0902760 NA NA YES fam-d protein
PVBDA 0902770 NA NA YES fam-d protein
PVBDA 0902780 NA NA YES fam-d protein, pseudo
PVBDA 0902790 NA NA YES fam-d protein
PVBDA 0902800 NA NA YES fam-d protein
PVBDA 0902810 NA NA YES fam-d protein
PVBDA 0902820 NA NA YES fam-d protein
PVBDA 0902910 11 NA NA major facilitator superfamily-related transporter, putative
PVBDA 0903020 7 NA NA conserved Plasmodium protein, unknown function
PVBDA 0903090 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0903140 10 NA NA conserved Plasmodium protein, unknown function
PVBDA 0903150 10 NA NA amino acid transporter, putative
PVBDA 0903180 6 NA NA aquaglyceroporin, putative
PVBDA 0903200 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0903230 2 NA NA LEM3 CDC50 family protein, putative
PVBDA 0903240 1 NA YES apical membrane antigen 1, putative
PVBDA 0903260 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0903350 1 NA NA alpha beta hydrolase, putative
PVBDA 0903390 5 NA NA PQ-loop repeat-containing protein
PVBDA 0903430 8 NA NA conserved Plasmodium membrane protein, unknown function
PVBDA 0903450 2 NA NA condensin-2 complex subunit D3, putative
PVBDA 0903480 5 NA NA 3-oxo-5-alpha-steroid 4-dehydrogenase, putative
PVBDA 0903510 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0903570 1 NA NA DnaJ protein, putative

PVBDA 0903580 1 0.802594 YES subtilisin-like protease 2, putative
PVBDA 0903610 6 NA NA CLPTM1 domain-containing protein, putative
PVBDA 0903700 4 NA NA conserved Plasmodium protein, unknown function
PVBDA 0903710 4 NA NA conserved Plasmodium protein, unknown function
PVBDA 0903730 22 NA NA guanylyl cyclase, putative
PVBDA 0903850 1 NA NA conserved protein, unknown function
PVBDA 0903860 1 NA NA carbonic anhydrase, putative
PVBDA 0903980 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0904000 2 NA NA phosphatidylinositol N-acetylglucosaminyltransferase subunit H, putative
PVBDA 0904090 4 NA NA conserved Plasmodium protein, unknown function
PVBDA 0904180 3 NA YES DnaJ protein, putative
PVBDA 0904240 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0904310 5 NA YES apicoplast import protein Tic20, putative
PVBDA 0904390 5 NA NA ABC transporter B family member 3, putative
PVBDA 0904400 3 NA NA conserved protein, unknown function
PVBDA 0904460 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0904510 5 NA NA protein YIPF6, putative
PVBDA 0904600 1 NA NA merozoite adhesive erythrocytic binding protein
PVBDA 0904610 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 0904630 1 NA YES fam-a protein
PVBDA 0904650 NA NA YES fam-a protein
PVBDA 0904660 2 NA NA CIR protein PIR protein
PVBDA 0904670 1 6.39699 YES fam-b protein
PVBDA 0904680 1 NA NA lysophospholipase, putative
PVBDA 0904690 NA NA YES fam-a protein
PVBDA 0904700 2 NA NA PIR protein CIR protein
PVBDA 0904710 NA NA YES fam-a protein
PVBDA 0904730 2 11.7676 NA Plasmodium exported protein, unknown function
PVBDA 0904740 1 NA YES fam-c protein
PVBDA 0904750 NA NA YES fam-a protein
PVBDA 0904770 1 NA NA PIR protein CIR protein
PVBDA 0904780 NA NA YES fam-a protein, fragment
PVBDA 0904790 1 NA NA CIR protein PIR protein
PVBDA 0904800 NA NA YES fam-c protein
PVBDA 0904810 NA NA YES fam-a protein
PVBDA 0904830 NA 8.02505 YES fam-b protein
PVBDA 0904840 NA NA YES fam-a protein
PVBDA 1000010 2 NA NA PIR protein CIR protein
PVBDA 1000020 1 NA NA PIR protein CIR protein
PVBDA 1000030 2 NA NA PIR protein CIR protein

PVBDA 1000040 2 NA NA PIR protein CIR protein
PVBDA 1000050 1 NA NA CIR protein PIR protein
PVBDA 1000060 2 NA NA PIR protein CIR protein
PVBDA 1000070 NA NA YES fam-a protein
PVBDA 1000090 NA NA YES fam-a protein
PVBDA 1000100 1 NA NA CIR protein PIR protein
PVBDA 1000110 2 NA NA CIR protein PIR protein
PVBDA 1000140 NA NA YES fam-a protein
PVBDA 1000150 1 NA NA CIR protein PIR protein
PVBDA 1000170 1 NA NA PIR protein CIR protein
PVBDA 1000180 NA NA YES fam-a protein
PVBDA 1000230 2 NA YES early transcribed membrane protein
PVBDA 1000250 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1000260 NA NA YES chitinase, putative
PVBDA 1000270 NA NA YES centrin, putative
PVBDA 1000280 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1000370 10 NA NA conserved Plasmodium protein, unknown function
PVBDA 1000380 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 1000390 7 NA NA conserved Plasmodium protein, unknown function
PVBDA 1000410 4 NA NA conserved Plasmodium protein, unknown function
PVBDA 1000450 7 NA NA phosphopantetheine adenylyltransferase, putative
PVBDA 1000560 2 NA NA ATP synthase subunit C, putative
PVBDA 1000580 2 0.124229 NA conserved Plasmodium protein, unknown function
PVBDA 1000710 1 NA NA RAP protein, putative
PVBDA 1000760 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1000840 10 NA NA P-type ATPase, putative
PVBDA 1000850 2 NA YES conserved protein, unknown function
PVBDA 1000860 NA NA YES triosephosphate isomerase, putative
PVBDA 1000880 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 1000900 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1001020 2 NA NA 6-cysteine protein
PVBDA 1001040 1 NA NA E3 ubiquitin-protein ligase, putative
PVBDA 1001070 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 1001130 NA NA YES ribosomal protein L35, putative
PVBDA 1001150 1 NA NA P1 nuclease, putative
PVBDA 1001160 10 NA NA conserved Plasmodium protein, unknown function
PVBDA 1001190 6 NA NA conserved protein, unknown function
PVBDA 1001220 3 NA NA transporter, putative
PVBDA 1001250 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 1001370 2 NA NA conserved Plasmodium protein, unknown function

PVBDA 1001410 1 NA NA cytosolic Fe-S cluster assembly factor NBP35, putative
PVBDA 1001430 7 NA NA phosphatidylinositol N-acetylglucosaminyltransferase, putative
PVBDA 1001460 8 NA NA cysteine repeat modular protein 1, putative
PVBDA 1001470 5 NA NA conserved Plasmodium protein, unknown function
PVBDA 1001500 1 NA NA GTP-binding protein, putative
PVBDA 1001510 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1001520 NA NA YES inhibitor of cysteine proteases, putative
PVBDA 1001530 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 1001550 7 NA NA conserved Plasmodium protein, unknown function
PVBDA 1001560 1 NA YES alkaline phosphatase, putative
PVBDA 1001620 2 NA NA protein kinase, putative
PVBDA 1001630 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1001650 2 NA NA signal peptidase complex subunit SPC1, putative
PVBDA 1001670 8 NA NA conserved Plasmodium protein, unknown function
PVBDA 1001740 1 NA NA HP12 protein homolog, putative
PVBDA 1001790 3 NA NA conserved Plasmodium protein, unknown function
PVBDA 1001800 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1001820 1 NA NA SNARE protein, putative
PVBDA 1001830 9 NA NA inner membrane complex suture component, putative
PVBDA 1001870 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1001930 1 NA NA NADP-specific glutamate dehydrogenase, putative
PVBDA 1001980 1 NA NA plasmepsin IX, putative
PVBDA 1002000 3 NA NA conserved Plasmodium protein, unknown function
PVBDA 1002040 3 NA NA conserved Plasmodium protein, unknown function
PVBDA 1002090 1 NA NA apicoplast ribosomal protein L15 precursor, putative
PVBDA 1002120 1 NA NA transcription initiation TFIID-like, putative
PVBDA 1002160 12 NA NA major facilitator superfamily domain-containing protein, putative
PVBDA 1002180 10 NA NA conserved Plasmodium protein, unknown function
PVBDA 1002210 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1002220 2 NA NA CorA-like Mg2+ transporter protein, putative
PVBDA 1002240 8 NA NA conserved Plasmodium protein, unknown function
PVBDA 1002260 1 3.45512 NA selenoprotein, putative
PVBDA 1002270 1 NA NA lipase, putative
PVBDA 1002330 6 NA NA ABC transporter G family member 2, putative
PVBDA 1002410 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 1002520 3 NA NA conserved Plasmodium protein, unknown function
PVBDA 1002610 5 NA NA ERAD-associated E3 ubiquitin-protein ligase HRD1, putative
PVBDA 1002660 1 NA NA cytochrome c oxidase assembly protein COX14, putative
PVBDA 1002670 3 NA YES copper transporter, putative
PVBDA 1002770 NA NA YES surface protein P113, putative

PVBDA 1002790 10 NA NA conserved Plasmodium protein, unknown function
PVBDA 1002900 5 NA NA conserved Plasmodium protein, unknown function
PVBDA 1002920 3 NA NA thioredoxin-like protein, putative
PVBDA 1002980 NA NA YES liver specific protein 1, putative
PVBDA 1003030 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1003120 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1003190 2 NA NA protein SEY1, putative
PVBDA 1003200 3 NA NA conserved Plasmodium protein, unknown function
PVBDA 1003230 1 NA NA serine C-palmitoyltransferase, putative
PVBDA 1003240 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 1003250 13 NA NA conserved Plasmodium protein, unknown function
PVBDA 1003290 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1003420 5 0.387487 NA DnaJ protein, putative
PVBDA 1003460 1 NA NA FeS assembly ATPase SufC, putative ABC transporter I family member 1, putative
PVBDA 1003470 1 NA NA 30S ribosomal protein S9, putative
PVBDA 1003560 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 1003600 NA NA YES p1 s1 nuclease, putative
PVBDA 1003610 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1003630 NA NA YES GTP-binding protein, putative
PVBDA 1003650 1 NA NA plastid replication-repair enzyme, putative
PVBDA 1003670 5 NA NA rhomboid protease ROM8, putative
PVBDA 1003680 9 NA NA conserved Plasmodium protein, unknown function
PVBDA 1003690 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 1003720 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1003750 NA NA YES rhoptry-associated protein 1, putative
PVBDA 1003800 7 NA NA cytidine diphosphate-diacylglycerol synthase, putative
PVBDA 1003830 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1003850 4 NA NA conserved Plasmodium protein, unknown function
PVBDA 1003880 1 NA NA aldo-keto reductase, putative
PVBDA 1003960 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1003980 1 NA NA plasmepsin IV, putative
PVBDA 1004040 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1004060 NA NA YES LCCL domain-containing protein
PVBDA 1004080 6 NA NA glideosome associated protein with multiple membrane spans 3, putative
PVBDA 1004190 4 NA NA RING zinc finger protein, putative
PVBDA 1004230 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1004260 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1004290 5 NA NA adenylyl cyclase alpha, putative
PVBDA 1004370 7 NA NA translocation associated membrane protein, putative
PVBDA 1004380 1 NA NA selenoprotein, putative

PVBDA 1004510 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1004520 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1004560 2 NA YES early transcribed membrane protein
PVBDA 1004600 NA NA YES fam-c protein
PVBDA 1004610 NA NA YES fam-a protein
PVBDA 1004640 1 NA YES fam-c protein
PVBDA 1004650 NA NA YES fam-a protein
PVBDA 1004690 NA 3.59063 YES fam-a protein
PVBDA 1004700 2 NA NA PIR protein CIR protein
PVBDA 1004710 1 NA NA CIR protein PIR protein
PVBDA 1004720 1 NA YES fam-c protein
PVBDA 1004730 NA NA YES fam-a protein
PVBDA 1004740 2 NA NA CIR protein PIR protein
PVBDA 1004750 1 NA NA CIR protein PIR protein
PVBDA 1004760 1 NA YES fam-c protein
PVBDA 1004770 NA NA YES fam-a protein, fragment
PVBDA 1004790 1 NA NA PIR protein CIR protein
PVBDA 1004800 1 NA YES fam-c protein
PVBDA 1004810 NA NA YES fam-a protein
PVBDA 1100010 NA NA YES fam-a protein
PVBDA 1100040 NA NA YES fam-c protein
PVBDA 1100050 1 NA NA CIR protein PIR protein
PVBDA 1100060 NA NA YES fam-a protein
PVBDA 1100070 NA NA YES fam-a protein
PVBDA 1100080 NA NA YES fam-a protein
PVBDA 1100100 1 NA NA fam-a protein
PVBDA 1100110 NA 0.945726 YES fam-a protein
PVBDA 1100120 1 3.74213 NA Plasmodium exported protein, unknown function
PVBDA 1100130 2 10.0517 NA Plasmodium exported protein, unknown function
PVBDA 1100140 1 NA NA skeleton-binding protein 1, putative
PVBDA 1100150 NA NA YES rhoptry-associated protein 2/3, putative
PVBDA 1100210 5 NA NA protein Mpv17, putative
PVBDA 1100230 1 NA YES merozoite surface protein 8, putative
PVBDA 1100280 1 NA NA 50S ribosomal protein L28, apicoplast, putative
PVBDA 1100370 14 NA NA cation transporting P-ATPase, putative
PVBDA 1100410 1 0.386529 NA ATP-dependent helicase, putative
PVBDA 1100420 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1100460 1 NA NA conserved protein, unknown function
PVBDA 1100470 4 NA NA conserved Plasmodium protein, unknown function
PVBDA 1100500 9 NA NA UDP-N-acetylglucosamine transporter, putative

PVBDA 1100510 NA 2.93648 NA conserved Plasmodium protein, unknown function
PVBDA 1100540 4 NA NA conserved Plasmodium protein, unknown function
PVBDA 1100570 2 NA NA conserved Plasmodium protein, unknown function
PVBDA 1100630 NA NA YES histone deacetylase, putative
PVBDA 1100660 7 NA NA rhomboid protease ROM4, putative
PVBDA 1100700 NA 6.84086 NA conserved Plasmodium protein, unknown function
PVBDA 1100720 NA NA YES subtilisin-like protease 1, putative
PVBDA 1100770 1 NA YES 6-cysteine protein
PVBDA 1100790 6 NA NA longevity-assurance (LAG1) protein, putative
PVBDA 1100800 8 NA NA triose phosphate transporter, putative
PVBDA 1100930 1 11.4492 NA asparagine–tRNA ligase, putative
PVBDA 1100970 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1101070 1 NA NA conserved Plasmodium protein, unknown function
PVBDA 1101080 16 NA NA stearoyl-CoA desaturase, putative
PVBDA 1101100 1 2.28887 NA conserved Plasmodium protein, unknown function
PVBDA 1101120 2 NA NA apical rhoptry neck protein, putative
PVBDA 1101240 2 NA NA orotate phosphoribosyltransferase, putative

B Number of single nucleotide polymorphisms (SNPs) per gene between P. vinckei subspecies.

PvvCY Gene ID PvbDA PvlDS PvpCR PvsEL Product
PVVCY 0100120 25 36 32 30 conserved rodent malaria protein, unknown function
PVVCY 0100220 108 96 89 100 conserved Plasmodium protein, unknown function
PVVCY 0100230 74 65 81 75 ATP-dependent RNA helicase, putative
PVVCY 0100240 13 9 12 11 MYND finger protein, putative
PVVCY 0100250 32 31 28 29 liver merozoite formation protein, putative
PVVCY 0100260 61 63 57 61 elongation factor G, putative
PVVCY 0100270 28 28 27 31 geranylgeranyltransferase, putative
PVVCY 0100280 97 85 90 87 conserved Plasmodium protein, unknown function
PVVCY 0100290 38 34 38 37 conserved Plasmodium protein, unknown function
PVVCY 0100300 66 71 66 70 conserved Plasmodium protein, unknown function
PVVCY 0100310 48 33 39 42 conserved Plasmodium protein, unknown function
PVVCY 0100320 45 45 42 43 conserved Plasmodium protein, unknown function
PVVCY 0100330 43 36 42 41 RNA-binding protein, putative
PVVCY 0100340 35 38 33 34 mitochondrial chaperone BCS1, putative
PVVCY 0100350 37 40 35 35 dihydroorotate dehydrogenase, putative
PVVCY 0100360 69 55 64 59 trophozoite exported protein 1, putative
PVVCY 0100370 46 40 43 44 cation H+ antiporter, putative
PVVCY 0100380 225 225 242 226 conserved Plasmodium protein, unknown function
PVVCY 0100390 38 35 35 38 phenylalanine–tRNA ligase, putative
PVVCY 0100400 161 157 155 168 centrosomal protein CEP76, putative
PVVCY 0100410 25 19 24 21 conserved Plasmodium protein, unknown function
PVVCY 0100420 181 179 170 177 conserved Plasmodium protein, unknown function
PVVCY 0100430 268 235 245 253 transcription factor with AP2 domain(s), putative SPE2-interacting protein, putative
PVVCY 0100440 7 6 7 7 mitochondrial ribosomal protein L41, putative
PVVCY 0100450 73 65 65 65 conserved Plasmodium protein, unknown function
PVVCY 0100460 23 20 17 17 conserved Plasmodium protein, unknown function
PVVCY 0100470 291 259 268 281 conserved Plasmodium protein, unknown function
PVVCY 0100480 31 35 37 34 glyoxalase I, putative
PVVCY 0100490 156 155 152 156 RAP protein, putative
PVVCY 0100500 7 4 10 NA conserved Plasmodium protein, unknown function
PVVCY 0100510 18 14 17 16 50S ribosomal protein L24, putative

PVVCY 0100520 47 48 50 45 RNA-binding protein, putative
PVVCY 0100530 21 17 18 16 conserved Plasmodium protein, unknown function
PVVCY 0100540 27 29 28 30 serine threonine protein kinase, putative
PVVCY 0100550 6 8 7 9 calcium-binding protein, putative
PVVCY 0100560 81 69 80 82 cyclin dependent kinase binding protein, putative
PVVCY 0100570 209 211 183 199 nucleoside diphosphate kinase, putative
PVVCY 0100580 16 20 22 19 conserved Plasmodium protein, unknown function
PVVCY 0100590 219 202 182 165 DNA repair protein RAD50, putative
PVVCY 0100600 26 24 22 24 long chain polyunsaturated fatty acid elongation enzyme, putative
PVVCY 0100610 74 73 66 67 conserved Plasmodium protein, unknown function
PVVCY 0100620 76 72 71 66 RNA-binding protein, putative
PVVCY 0100630 5 7 6 7 ubiquitin-conjugating enzyme E2, putative
PVVCY 0100640 14 8 9 9 conserved Plasmodium protein, unknown function
PVVCY 0100650 64 59 58 58 polypyrimidine tract-binding protein, putative
PVVCY 0100660 203 229 234 214 conserved Plasmodium protein, unknown function
PVVCY 0100670 150 136 132 125 coatomer alpha subunit, putative
PVVCY 0100680 25 25 24 27 glutaredoxin-like protein
PVVCY 0100690 76 62 69 66 translation initiation factor IF-2, putative
PVVCY 0100700 50 40 43 35 MYND finger protein, putative
PVVCY 0100710 61 59 58 53 RING zinc finger protein, putative
PVVCY 0100720 25 28 27 29 uroporphyrinogen III decarboxylase, putative
PVVCY 0100730 63 55 57 55 G-protein associated signal transduction protein, putative
PVVCY 0100740 66 43 82 68 para-hydroxybenzoate–polyprenyltransferase, putative
PVVCY 0100750 74 62 64 63 spindle assembly abnormal protein 6, putative
PVVCY 0100760 281 263 254 265 conserved Plasmodium protein, unknown function
PVVCY 0100770 48 46 40 45 GTP-binding protein, putative
PVVCY 0100780 113 108 111 113 conserved Plasmodium protein, unknown function
PVVCY 0100790 36 33 29 33 diphthine methyltransferase, putative
PVVCY 0100800 100 106 95 99 conserved Plasmodium protein, unknown function
PVVCY 0100810 14 13 12 11 conserved Plasmodium protein, unknown function
PVVCY 0100820 170 174 203 213 conserved Plasmodium protein, unknown function
PVVCY 0100830 50 45 50 41 sorting assembly machinery 50 kDa subunit, putative
PVVCY 0100840 13 9 10 11 conserved Plasmodium protein, unknown function
PVVCY 0100850 16 13 11 13 proteasome subunit alpha type-2, putative
PVVCY 0100860 24 36 31 28 conserved Plasmodium protein, unknown function
PVVCY 0100870 53 53 52 47 T-complex protein 1 subunit zeta, putative
PVVCY 0100880 36 40 38 40 ornithine aminotransferase, putative
PVVCY 0100890 99 86 95 99 conserved Plasmodium protein, unknown function
PVVCY 0100900 705 661 644 639 conserved Plasmodium protein, unknown function

PVVCY 0100910 66 54 53 43 zinc transporter ZIP1, putative
PVVCY 0100920 53 57 56 58 citrate synthase-like protein, putative
PVVCY 0100930 95 102 97 71 conserved Plasmodium protein, unknown function
PVVCY 0100940 77 78 84 79 mitochondrial cardiolipin synthase, putative
PVVCY 0100950 99 99 90 98 conserved Plasmodium protein, unknown function
PVVCY 0100960 160 160 163 163 conserved Plasmodium protein, unknown function
PVVCY 0100970 18 14 12 12 palmitoyltransferase DHHC2, putative
PVVCY 0100980 168 143 149 138 conserved Plasmodium protein, unknown function
PVVCY 0100990 33 31 30 34 mitochondrial ribosomal protein L19 precursor, putative
PVVCY 0101000 34 24 25 25 pre-mRNA-splicing factor SLU7, putative
PVVCY 0101010 50 49 53 53 RNA-binding protein 25, putative
PVVCY 0101020 3 5 4 4 histone H3, putative
PVVCY 0101030 4 5 7 6 peptidyl-tRNA hydrolase PTRHD1, putative
PVVCY 0101040 4 4 4 4 conserved Plasmodium protein, unknown function
PVVCY 0101050 64 61 64 59 transketolase, putative
PVVCY 0101060 78 62 68 66 transcription elongation factor SPT5, putative
PVVCY 0101070 16 12 13 11 SNARE associated Golgi protein, putative
PVVCY 0101080 6 4 5 5 succinate dehydrogenase subunit 3, putative
PVVCY 0101090 14 9 8 8 transcription factor with AP2 domain(s), putative
PVVCY 0101100 8 4 6 7 conserved Plasmodium protein, unknown function
PVVCY 0101110 43 38 43 40 SWIB MDM2 domain-containing protein, putative
PVVCY 0101120 23 19 22 24 conserved Plasmodium protein, unknown function
PVVCY 0101130 56 48 47 46 conserved Plasmodium protein, unknown function
PVVCY 0101140 3 3 2 2 60S ribosomal protein L39, putative
PVVCY 0101150 315 273 267 272 conserved Plasmodium protein, unknown function
PVVCY 0101160 28 28 28 21 lsm12, putative
PVVCY 0101170 7 5 3 6 conserved Plasmodium protein, unknown function
PVVCY 0101180 70 69 69 71 eukaryotic translation initiation factor 3 subunit L, putative
PVVCY 0101190 160 149 106 114 leucine-rich repeat protein
PVVCY 0101200 14 12 13 13 conserved protein, unknown function
PVVCY 0101210 4 6 7 8 conserved Plasmodium protein, unknown function
PVVCY 0101220 6 4 4 4 conserved Plasmodium protein, unknown function
PVVCY 0101230 17 17 14 15 cytoplasmic tRNA 2-thiolation protein 1, putative
PVVCY 0101240 66 43 81 77 6-cysteine protein
PVVCY 0101250 37 42 39 43 6-cysteine protein
PVVCY 0101260 105 84 85 86 nucleolar GTP-binding protein 1, putative
PVVCY 0101270 14 14 12 12 conserved Plasmodium protein, unknown function
PVVCY 0101280 4 7 6 5 conserved Plasmodium protein, unknown function
PVVCY 0101285 5 4 3 3 conserved Plasmodium protein, unknown function
PVVCY 0101290 117 118 113 118 rhoptry protein ROP14, putative

PVVCY 0101300 31 38 32 37 apicoplast ribosomal protein L18 precursor, putative
PVVCY 0101310 204 215 241 206 AP-3 complex subunit beta, putative
PVVCY 0101320 93 69 85 98 conserved Plasmodium protein, unknown function
PVVCY 0101330 57 60 62 53 syntaxin binding protein, putative
PVVCY 0101340 389 336 358 335 transcription factor with AP2 domain(s), putative
PVVCY 0101350 155 137 154 160 myosin-like protein, putative
PVVCY 0101360 73 69 65 68 conserved Plasmodium protein, unknown function
PVVCY 0101370 100 95 97 101 cytosolic Fe-S cluster assembly factor NAR1, putative
PVVCY 0101380 138 127 124 132 major facilitator superfamily-related transporter, putative
PVVCY 0101720 15 33 18 16 M17 leucyl aminopeptidase, putative
PVVCY 0200150 9 6 7 7 conserved Plasmodium protein, unknown function
PVVCY 0200160 32 33 30 23 TatD-like deoxyribonuclease, putative
PVVCY 0200170 2 3 3 3 conserved protein, unknown function
PVVCY 0200180 52 52 48 50 eukaryotic translation initiation factor 4E, putative
PVVCY 0200190 3 1 5 6 conserved Plasmodium protein, unknown function
PVVCY 0200200 25 19 26 24 conserved Plasmodium protein, unknown function
PVVCY 0200210 29 26 26 27 UMP-CMP kinase, putative
PVVCY 0200220 47 40 41 43 conserved Plasmodium protein, unknown function
PVVCY 0200230 90 86 92 85 replication factor c protein, putative
PVVCY 0200240 3 2 4 4 conserved Plasmodium protein, unknown function
PVVCY 0200250 141 149 141 148 kinesin-8, putative
PVVCY 0200260 11 8 9 8 adenylate kinase-like protein 1, putative
PVVCY 0200270 38 40 43 37 transcription initiation factor TFIIB, putative
PVVCY 0200280 57 53 46 53 chromatin assembly factor 1 protein WD40 domain, putative
PVVCY 0200290 160 150 114 104 phosphatidylinositol-4-phosphate 5-kinase, putative
PVVCY 0200300 150 189 105 149 bromodomain protein, putative
PVVCY 0200310 31 26 31 26 DNA-directed RNA polymerase II subunit RPB9, putative
PVVCY 0200320 15 15 14 9 FAD-linked sulfhydryl oxidase ERV1, putative
PVVCY 0200330 47 73 62 62 selenocysteine-specific elongation factor selB homologue, putative
PVVCY 0200340 21 22 18 19 conserved Plasmodium protein, unknown function
PVVCY 0200350 17 12 12 9 conserved Plasmodium protein, unknown function
PVVCY 0200360 46 42 39 48 phosphatidate cytidylyltransferase, putative
PVVCY 0200370 43 52 41 50 phenylalanine–tRNA ligase alpha subunit, putative
PVVCY 0200380 4 4 8 2 rRNA biogenesis protein RRP36, putative
PVVCY 0200390 15 17 17 14 cold-shock protein, putative
PVVCY 0200400 15 13 15 17 N-terminal acetyltransferase, putative
PVVCY 0200410 8 4 5 5 tubulin-specific chaperone a, putative
PVVCY 0200420 18 13 13 20 cleavage and polyadenylation specificity factor subunit 5, putative
PVVCY 0200430 155 146 147 155 LCCL domain-containing protein

PVVCY 0200440 15 17 18 15 photosensitized INA-labeled protein PHIL1, putative
PVVCY 0200450 38 32 30 34 conserved Plasmodium protein, unknown function
PVVCY 0200460 120 112 106 115 secreted ookinete protein, putative
PVVCY 0200470 16 9 8 12 conserved Plasmodium protein, unknown function
PVVCY 0200480 15 12 12 3 conserved Plasmodium protein, unknown function
PVVCY 0200490 34 28 31 24 mitochondrial carrier protein, putative
PVVCY 0200500 187 186 186 179 conserved Plasmodium protein, unknown function
PVVCY 0200510 55 54 52 53 conserved Plasmodium protein, unknown function
PVVCY 0200520 4 5 7 6 proteasome subunit beta type-3, putative
PVVCY 0200530 13 14 23 22 conserved Plasmodium protein, unknown function
PVVCY 0200540 126 112 113 127 double-strand break repair protein MRE11, putative
PVVCY 0200550 34 33 32 29 conserved Plasmodium membrane protein, unknown function
PVVCY 0200560 280 282 264 279 serine threonine protein kinase, putative
PVVCY 0200570 137 179 176 152 lipid sterol:H+ symporter, putative
PVVCY 0200580 9 6 8 9 conserved Plasmodium protein, unknown function
PVVCY 0200590 31 26 27 29 carbon catabolite repressor protein 4, putative
PVVCY 0200600 42 37 35 37 conserved Plasmodium protein, unknown function
PVVCY 0200610 11 5 5 5 centrin-1, putative
PVVCY 0200620 86 70 68 73 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase, putative
PVVCY 0200630 20 20 13 23 ras-related protein Rab-5C, putative
PVVCY 0200640 131 129 134 123 asparagine and aspartate rich protein 2, putative
PVVCY 0200650 32 42 43 39 conserved Plasmodium protein, unknown function
PVVCY 0200660 195 194 182 188 conserved Plasmodium protein, unknown function
PVVCY 0200670 16 11 13 15 pre-rRNA-processing protein TSR2, putative
PVVCY 0200680 110 95 110 96 calcium-transporting ATPase, putative
PVVCY 0200690 20 13 13 17 conserved Plasmodium protein, unknown function
PVVCY 0200700 40 31 36 38 V-type proton ATPase subunit C, putative
PVVCY 0200710 31 37 37 37 conserved Plasmodium protein, unknown function
PVVCY 0200720 46 39 39 41 DNA binding protein, putative
PVVCY 0200730 86 76 88 98 conserved Plasmodium protein, unknown function
PVVCY 0200740 297 314 329 327 asparagine-rich antigen, putative
PVVCY 0200750 70 86 76 73 conserved Plasmodium protein, unknown function
PVVCY 0200760 26 21 21 22 conserved Plasmodium protein, unknown function
PVVCY 0200770 26 20 16 19 conserved Plasmodium protein, unknown function
PVVCY 0200780 16 12 17 13 cyclase-associated protein, putative
PVVCY 0200790 48 47 47 45 RAP protein, putative
PVVCY 0200800 13 14 13 14 conserved Plasmodium protein, unknown function
PVVCY 0200820 51 48 45 37 major facilitator superfamily-related transporter, putative
PVVCY 0200830 158 126 114 120 conserved Plasmodium protein, unknown function

PvvCY Gene ID  PvbDA  PvlDS  PvpCR  PvsEL  Product
PVVCY 0200840  11  10  10  8  conserved protein, unknown function
PVVCY 0200850  57  50  51  51  4-hydroxy-3-methylbut-2-enyl diphosphate reductase, putative
PVVCY 0200860  448  408  417  422  ubiquitin carboxyl-terminal hydrolase 1, putative
PVVCY 0200870  69  87  58  56  StAR-related lipid transfer protein
PVVCY 0200880  168  170  153  166  conserved Plasmodium protein, unknown function
PVVCY 0200890  30  31  27  27  thrombospondin related sporozoite protein, putative
PVVCY 0200900  74  63  65  63  parasite-infected erythrocyte surface protein
PVVCY 0200910  24  20  30  29  actin-related protein
PVVCY 0200920  60  56  61  76  L-seryl-tRNA(Sec) kinase, putative
PVVCY 0200930  85  83  84  67  ATP dependent RNA helicase, putative
PVVCY 0200940  167  227  179  186  conserved Plasmodium protein, unknown function
PVVCY 0200950  189  191  197  202  zinc-carboxypeptidase, putative
PVVCY 0200960  45  48  50  43  conserved Plasmodium protein, unknown function
PVVCY 0200970  109  99  115  85  nucleoside transporter 4, putative
PVVCY 0200980  162  144  161  180  vacuolar protein sorting-associated protein 51, putative
PVVCY 0200990  27  27  33  30  vacuolar protein sorting-associated protein VTA1, putative
PVVCY 0201000  47  40  35  35  aspartate–tRNA ligase, putative
PVVCY 0201010  43  39  37  40  conserved Plasmodium protein, unknown function
PVVCY 0201020  117  100  107  99  DNA mismatch repair protein PMS1, putative
PVVCY 0201030  529  495  513  495  conserved Plasmodium protein, unknown function
PVVCY 0201040  334  309  316  317  ubiquitin carboxyl-terminal hydrolase, putative
PVVCY 0201050  71  60  57  66  conserved Plasmodium protein, unknown function
PVVCY 0201060  11  6  7  8  dolichyl-diphosphooligosaccharide–protein glycosyltransferase subunit DAD1, putative
PVVCY 0201070  61  58  48  47  mitochondrial import inner membrane translocase subunit TIM50, putative
PVVCY 0201080  88  133  137  135  vacuolar protein sorting-associated protein 53, putative
PVVCY 0201090  99  109  92  102  conserved Plasmodium protein, unknown function
PVVCY 0201100  58  48  55  49  cysteine desulfurase, putative
PVVCY 0201110  105  99  91  83  DNA (cytosine-5)-methyltransferase, putative
PVVCY 0201120  24  17  24  16  proteasome subunit alpha type-5, putative
PVVCY 0201130  69  66  64  61  conserved Plasmodium protein, unknown function
PVVCY 0201140  145  143  133  133  conserved Plasmodium protein, unknown function
PVVCY 0201150  109  109  105  112  conserved Plasmodium protein, unknown function
PVVCY 0201160  208  169  184  192  cation transporting ATPase, putative
PVVCY 0201170  279  285  281  269  conserved Plasmodium protein, unknown function
PVVCY 0201180  33  24  26  23  eukaryotic translation initiation factor 2 subunit alpha, putative
PVVCY 0201190  508  438  441  423  conserved Plasmodium protein, unknown function
PVVCY 0201200  42  49  41  51  actin-like protein, putative
PVVCY 0201210  25  18  17  14  conserved Plasmodium protein, unknown function

PvvCY Gene ID  PvbDA  PvlDS  PvpCR  PvsEL  Product
PVVCY 0201220  18  18  16  18  conserved Plasmodium protein, unknown function
PVVCY 0201230  9  10  10  9  conserved Plasmodium protein, unknown function
PVVCY 0201240  153  132  137  130  zinc finger protein, putative
PVVCY 0201250  63  57  62  60  alpha beta hydrolase, putative
PVVCY 0201260  15  16  16  12  conserved Plasmodium protein, unknown function
PVVCY 0201270  36  31  32  30  RNA-binding protein, putative
PVVCY 0201280  8  8  7  7  signal recognition particle subunit SRP9, putative
PVVCY 0201290  208  197  184  188  conserved Plasmodium protein, unknown function
PVVCY 0201300  19  11  12  11  1-cys peroxiredoxin, putative
PVVCY 0201310  86  86  79  78  60S ribosomal export protein NMD3, putative
PVVCY 0201320  15  29  20  20  ribosome biogenesis protein BRX1 homolog, putative
PVVCY 0201330  63  63  60  65  mRNA (N6-adenosine)-methyltransferase, putative
PVVCY 0201340  5  7  6  4  conserved Plasmodium protein, unknown function
PVVCY 0201350  42  38  40  42  conserved Plasmodium protein, unknown function
PVVCY 0201360  7  6  6  6  dynein light chain, putative
PVVCY 0201370  291  249  266  263  dynein heavy chain, putative
PVVCY 0201380  7  6  6  7  conserved Plasmodium protein, unknown function
PVVCY 0201390  72  72  54  63  tRNA pseudouridine synthase D, putative
PVVCY 0201400  59  47  47  36  AP-4 complex subunit beta, putative
PVVCY 0201410  111  88  100  96  transcription factor with AP2 domain(s), putative
PVVCY 0201420  15  14  12  12  conserved Plasmodium protein, unknown function
PVVCY 0300150  130  139  137  149  conserved Plasmodium protein, unknown function
PVVCY 0300160  51  40  35  35  octaprenyl pyrophosphate synthase, putative
PVVCY 0300170  16  19  16  18  conserved Plasmodium protein, unknown function
PVVCY 0300180  176  122  149  177  repetitive organellar protein, putative
PVVCY 0300190  39  57  56  63  conserved Plasmodium protein, unknown function
PVVCY 0300200  10  11  10  9  ERCC1 nucleotide excision repair protein, putative
PVVCY 0300210  10  10  8  8  conserved Plasmodium protein, unknown function
PVVCY 0300230  11  16  15  14  conserved Plasmodium protein, unknown function
PVVCY 0300240  11  14  13  13  protein MAK16, putative
PVVCY 0300250  33  31  32  31  conserved Plasmodium protein, unknown function
PVVCY 0300260  53  50  48  47  5’-3’ exonuclease, putative
PVVCY 0300270  123  80  91  85  condensin-2 complex subunit H2, putative
PVVCY 0300280  218  232  227  197  conserved Plasmodium protein, unknown function
PVVCY 0300290  28  23  22  22  conserved Plasmodium protein, unknown function
PVVCY 0300300  60  58  56  54  conserved Plasmodium protein, unknown function
PVVCY 0300310  15  12  13  13  dynein light chain, putative
PVVCY 0300320  71  70  61  66  aspartate aminotransferase, putative
PVVCY 0300330  138  158  163  145  5’-3’ exonuclease, putative
PVVCY 0300340  61  67  58  56  hexose transporter, putative

PvvCY Gene ID  PvbDA  PvlDS  PvpCR  PvsEL  Product
PVVCY 0300350  21  19  18  20  3’-5’ exonuclease, putative
PVVCY 0300360  29  29  24  26  methyltransferase, putative
PVVCY 0300370  64  68  27  45  conserved Plasmodium protein, unknown function
PVVCY 0300380  22  19  18  20  conserved Plasmodium protein, unknown function
PVVCY 0300390  39  43  43  44  conserved Plasmodium protein, unknown function
PVVCY 0300400  26  24  21  20  PCI domain-containing protein, putative
PVVCY 0300410  11  4  6  9  DNA-directed RNA polymerase II 16 kDa subunit, putative
PVVCY 0300420  64  58  58  58  conserved Plasmodium protein, unknown function
PVVCY 0300430  26  20  20  21  RNA-binding protein, putative
PVVCY 0300440  11  8  11  12  PH domain-containing protein, putative
PVVCY 0300450  94  77  79  77  26S proteasome regulatory subunit RPN1, putative
PVVCY 0300460  165  153  139  144  DNA repair protein RAD2, putative
PVVCY 0300470  38  31  35  33  cysteine desulfuration protein SufE, putative
PVVCY 0300480  54  56  54  56  pantothenate transporter, putative
PVVCY 0300490  217  172  211  208  pentafunctional AROM polypeptide, putative
PVVCY 0300500  61  78  71  81  pentafunctional AROM polypeptide, putative
PVVCY 0300510  104  122  83  107  conserved Plasmodium protein, unknown function
PVVCY 0300520  10  6  5  5  DNA-directed RNA polymerase III subunit RPC10, putative
PVVCY 0300530  35  36  36  37  adenylosuccinate lyase, putative
PVVCY 0300540  29  32  56  5  merozoite surface protein 4/5, putative
PVVCY 0300550  165  147  142  133  conserved Plasmodium protein, unknown function
PVVCY 0300560  21  23  22  20  iron-sulfur assembly protein, putative
PVVCY 0300570  127  115  127  124  serine repeat antigen 5, putative
PVVCY 0300580  124  135  139  141  serine repeat antigen 4, putative
PVVCY 0300590  148  131  178  167  serine repeat antigen 4, putative
PVVCY 0300600  106  160  97  128  serine repeat antigen 3, putative
PVVCY 0300610  169  145  169  179  serine repeat antigen 2, putative
PVVCY 0300620  100  87  85  88  conserved Plasmodium protein, unknown function
PVVCY 0300630  24  13  19  18  KRR1 small subunit processome component, putative
PVVCY 0300640  242  209  222  212  conserved Plasmodium protein, unknown function
PVVCY 0300650  115  111  117  96  conserved Plasmodium protein, unknown function
PVVCY 0300660  5  6  8  5  acyl carrier protein, putative
PVVCY 0300670  37  31  29  31  ribosome-recycling factor, putative
PVVCY 0300680  26  23  27  25  conserved Plasmodium protein, unknown function
PVVCY 0300690  25  34  28  22  conserved Plasmodium protein, unknown function
PVVCY 0300700  249  258  256  250  6-cysteine protein
PVVCY 0300710  297  276  280  281  6-cysteine protein
PVVCY 0300720  80  72  78  76  phospholipase A2, putative
PVVCY 0300730  27  25  21  23  3’ exoribonuclease, putative
PVVCY 0300740  22  19  19  16  2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase, putative

PvvCY Gene ID  PvbDA  PvlDS  PvpCR  PvsEL  Product
PVVCY 0300750  28  31  31  30  conserved protein, unknown function
PVVCY 0300760  38  39  38  34  conserved Plasmodium protein, unknown function
PVVCY 0300770  99  89  98  104  transporter, putative
PVVCY 0300780  19  20  19  19  ATP-dependent RNA helicase UAP56, putative
PVVCY 0300790  8  10  7  6  Sec61-gamma subunit of protein translocation complex, putative
PVVCY 0300800  8  6  5  6  60S ribosomal protein L37ae, putative
PVVCY 0300810  192  197  205  200  conserved Plasmodium protein, unknown function
PVVCY 0300820  25  21  18  17  monocarboxylate transporter, putative
PVVCY 0300830  5  4  3  1  50S ribosomal protein L33, putative
PVVCY 0300840  177  164  173  172  conserved Plasmodium protein, unknown function
PVVCY 0300850  34  33  33  34  conserved Plasmodium protein, unknown function
PVVCY 0300860  23  23  23  21  syntaxin, putative
PVVCY 0300870  39  33  32  30  conserved Plasmodium protein, unknown function
PVVCY 0300880  30  28  29  30  conserved Plasmodium protein, unknown function
PVVCY 0300890  72  87  78  88  conserved Plasmodium protein, unknown function
PVVCY 0300900  11  12  11  13  ras-related protein Rab-5A, putative
PVVCY 0300910  20  18  21  19  ubiquinol-cytochrome-c reductase complex assembly factor 1, putative
PVVCY 0300920  30  31  30  31  apicoplast beta-ketoacyl-acyl carrier protein synthase III precursor, putative
PVVCY 0300930  101  112  115  122  GAF domain-related protein, putative
PVVCY 0300940  7  6  8  10  UDP-N-acetylglucosamine transferase subunit ALG14, putative
PVVCY 0300950  101  78  83  92  tyrosine kinase-like protein, putative
PVVCY 0300960  58  56  60  60  asparagine–tRNA ligase, putative
PVVCY 0300970  32  57  59  53  conserved Plasmodium protein, unknown function
PVVCY 0300980  23  23  24  23  GDP-fructose:GMP antiporter, putative
PVVCY 0300990  108  99  97  93  conserved Plasmodium protein, unknown function
PVVCY 0301000  37  38  36  38  mitochondrial ribosomal protein L12 precursor, putative
PVVCY 0301010  34  34  42  35  peptide chain release factor subunit 1, putative
PVVCY 0301020  346  357  330  333  conserved Plasmodium protein, unknown function
PVVCY 0301030  374  419  407  397  conserved Plasmodium protein, unknown function
PVVCY 0301040  32  31  28  33  secreted protein with altered thrombospondin repeat domain, putative
PVVCY 0301050  21  21  20  20  SRR1-like protein
PVVCY 0301060  113  113  97  114  multidrug efflux pump, putative
PVVCY 0301070  58  52  53  58  leucyl phenylalanyl-tRNA–protein transferase, putative
PVVCY 0301080  15  18  14  16  conserved Plasmodium protein, unknown function
PVVCY 0301090  24  20  21  21  protein SIS1, putative
PVVCY 0301100  23  17  17  15  conserved Plasmodium protein, unknown function
PVVCY 0301110  3  6  2  5  conserved Plasmodium protein, unknown function
PVVCY 0301120  22  25  17  19  protein kinase 7, putative

PvvCY Gene ID  PvbDA  PvlDS  PvpCR  PvsEL  Product
PVVCY 0301130  36  40  34  35  tetratricopeptide repeat protein, putative
PVVCY 0301140  212  223  213  211  conserved Plasmodium protein, unknown function
PVVCY 0301150  10  9  10  6  conserved protein, unknown function
PVVCY 0301160  66  57  66  62  conserved Plasmodium protein, unknown function
PVVCY 0301170  96  108  114  96  conserved Plasmodium protein, unknown function
PVVCY 0301180  83  63  72  75  T-complex protein 1 subunit theta, putative
PVVCY 0301190  184  184  180  180  protein transport protein SEC31, putative
PVVCY 0301200  23  20  23  22  mitochondrial large ribosomal subunit, putative
PVVCY 0301210  62  69  69  68  conserved Plasmodium protein, unknown function
PVVCY 0301220  35  34  28  29  conserved Plasmodium protein, unknown function
PVVCY 0301230  222  226  211  224  serine threonine protein kinase, putative
PVVCY 0301240  36  26  28  31  conserved Plasmodium protein, unknown function
PVVCY 0301250  157  155  163  178  conserved Plasmodium protein, unknown function
PVVCY 0301260  53  125  87  86  rhoptry neck protein 6, putative
PVVCY 0301280  23  18  20  20  hypothetical protein
PVVCY 0301290  16  12  13  14  conserved Plasmodium protein, unknown function
PVVCY 0301300  62  83  73  79  conserved Plasmodium protein, unknown function
PVVCY 0301310  54  50  49  49  conserved Plasmodium protein, unknown function
PVVCY 0301320  36  40  40  41  conserved Plasmodium protein, unknown function
PVVCY 0301330  110  94  111  112  DNA-directed RNA polymerase II subunit RPB2, putative
PVVCY 0301340  74  64  72  70  origin recognition complex subunit 5, putative
PVVCY 0301350  26  30  28  26  palmitoyltransferase DHHC11, putative
PVVCY 0301360  234  248  229  221  DEAD DEAH helicase, putative
PVVCY 0301370  108  107  92  86  conserved Plasmodium protein, unknown function
PVVCY 0301380  6  9  9  9  conserved Plasmodium protein, unknown function
PVVCY 0301390  148  152  143  144  conserved Plasmodium protein, unknown function
PVVCY 0301400  54  65  56  63  vacuolar protein sorting-associated protein 45, putative
PVVCY 0301410  124  128  119  115  conserved Plasmodium protein, unknown function
PVVCY 0301420  52  64  60  58  MtN3-like protein
PVVCY 0301430  87  81  87  78  autophagy-related protein 11, putative
PVVCY 0301440  54  44  43  47  conserved Plasmodium protein, unknown function
PVVCY 0301450  69  57  57  58  conserved Plasmodium protein, unknown function
PVVCY 0301460  65  56  62  56  conserved Plasmodium membrane protein, unknown function
PVVCY 0301470  44  39  40  41  ATP synthase F1, alpha subunit, putative
PVVCY 0301480  137  151  147  142  conserved Plasmodium protein, unknown function
PVVCY 0301490  17  13  13  14  AP-2 complex subunit sigma, putative
PVVCY 0301500  13  10  11  8  conserved Plasmodium protein, unknown function
PVVCY 0301510  34  30  24  33  calcium-dependent protein kinase 1, putative
PVVCY 0301520  144  138  175  146  conserved Plasmodium protein, unknown function
PVVCY 0301530  21  21  18  18  E2F-associated phosphoprotein, putative

PvvCY Gene ID  PvbDA  PvlDS  PvpCR  PvsEL  Product
PVVCY 0301540  5  5  6  6  40S ribosomal protein S26, putative
PVVCY 0301550  49  45  45  45  conserved Plasmodium protein, unknown function
PVVCY 0301560  31  22  25  26  replication factor C subunit 2, putative
PVVCY 0301570  4  22  37  22  conserved Plasmodium membrane protein, unknown function
PVVCY 0301580  67  62  74  78  conserved Plasmodium protein, unknown function
PVVCY 0301590  39  31  28  31  apicoplast RNA methyltransferase precursor, putative
PVVCY 0301600  65  62  63  63  ATP-dependent RNA helicase DDX47, putative
PVVCY 0301610  10  19  17  17  small nuclear ribonucleoprotein Sm D2, putative
PVVCY 0301620  223  205  212  192  conserved Plasmodium protein, unknown function
PVVCY 0301630  31  18  29  27  pre-mRNA-processing protein 45, putative
PVVCY 0301640  81  56  63  62  conserved Plasmodium protein, unknown function
PVVCY 0301650  4  2  1  4  40S ribosomal protein S30, putative
PVVCY 0301660  13  13  14  15  conserved Plasmodium protein, unknown function
PVVCY 0301670  7  10  11  8  ribosome associated membrane protein RAMP4, putative
PVVCY 0301680  19  19  21  22  pseudouridine synthase, putative
PVVCY 0301690  99  94  89  90  replication factor C subunit 1, putative
PVVCY 0400150  15  11  12  11  conserved Plasmodium protein, unknown function
PVVCY 0400160  35  26  31  33  pre-mRNA-splicing factor PRP46, putative
PVVCY 0400170  113  90  92  96  serine threonine protein kinase, putative
PVVCY 0400180  103  85  86  86  ABC transporter B family member 4, putative
PVVCY 0400190  12  14  14  14  CDGSH iron-sulfur domain-containing protein, putative
PVVCY 0400200  46  50  49  45  conserved Plasmodium protein, unknown function
PVVCY 0400210  113  97  98  102  exportin-1, putative
PVVCY 0400220  85  79  78  80  N-ethylmaleimide-sensitive fusion protein, putative
PVVCY 0400230  140  158  161  158  conserved Plasmodium protein, unknown function
PVVCY 0400240  132  153  149  154  HAD superfamily protein, putative
PVVCY 0400250  15  19  18  18  DNA-directed RNA polymerases I, II, and III subunit RPABC2, putative
PVVCY 0400260  46  38  43  49  palmitoyltransferase DHHC1, putative
PVVCY 0400270  170  155  158  162  spindle pole body protein, putative
PVVCY 0400280  10  13  7  9  plasmoredoxin, putative
PVVCY 0400290  49  50  51  50  lipoamide acyltransferase component of branched-chain alpha-keto acid dehydrogenase complex, putative
PVVCY 0400300  79  74  72  71  IBR domain protein, putative
PVVCY 0400310  42  44  40  41  phosphatidylethanolamine-binding protein, putative
PVVCY 0400320  71  119  136  86  inner membrane complex protein 1a, putative
PVVCY 0400330  44  58  56  53  inner membrane complex protein 1e, putative
PVVCY 0400340  62  65  64  62  EH domain-containing protein, putative
PVVCY 0400350  176  161  164  167  conserved Plasmodium protein, unknown function
PVVCY 0400360  3  2  1  2  60S ribosomal protein L44, putative

PvvCY Gene ID  PvbDA  PvlDS  PvpCR  PvsEL  Product
PVVCY 0400370  12  10  12  15  1-cys-glutaredoxin-like protein-1, putative
PVVCY 0400380  14  3  17  11  circumsporozoite (CS) protein, putative
PVVCY 0400390  113  107  116  109  conserved Plasmodium membrane protein, unknown function
PVVCY 0400400  7  5  6  3  conserved Plasmodium protein, unknown function
PVVCY 0400410  32  31  26  35  elongation factor (EF-TS), putative
PVVCY 0400420  368  340  377  381  conserved Plasmodium protein, unknown function
PVVCY 0400430  87  84  82  84  conserved Plasmodium protein, unknown function
PVVCY 0400440  64  117  106  68  conserved Plasmodium membrane protein, unknown function
PVVCY 0400450  9  9  8  7  conserved Plasmodium protein, unknown function
PVVCY 0400460  466  516  547  565  conserved Plasmodium protein, unknown function
PVVCY 0400470  90  81  76  102  AP endonuclease (DNA-[apurinic or apyrimidinic site] lyase), putative
PVVCY 0400480  12  19  23  21  ubiquitin-conjugating enzyme E2, putative
PVVCY 0400490  103  114  89  91  P-loop containing nucleoside triphosphate hydrolase, putative
PVVCY 0400500  8  9  10  11  conserved Plasmodium protein, unknown function
PVVCY 0400510  31  27  26  30  conserved Plasmodium protein, unknown function
PVVCY 0400520  34  29  29  27  activator of Hsp90 ATPase, putative
PVVCY 0400530  12  10  11  11  glutaredoxin 1, putative
PVVCY 0400540  76  71  67  67  FAD-dependent glycerol-3-phosphate dehydrogenase, putative
PVVCY 0400550  27  27  26  27  conserved Plasmodium protein, unknown function
PVVCY 0400560  6  7  9  7  conserved Plasmodium protein, unknown function
PVVCY 0400570  7  5  7  5  membrane magnesium transporter, putative
PVVCY 0400580  44  36  34  34  T-complex protein 1 subunit beta, putative
PVVCY 0400590  4  4  4  4  40S ribosomal protein S23, putative
PVVCY 0400610  12  10  12  11  40S ribosomal protein S12, putative
PVVCY 0400620  11  10  14  13  60S ribosomal protein L7, putative
PVVCY 0400630  46  46  53  49  EB1 homolog, putative
PVVCY 0400640  27  29  32  28  ATP-dependent Clp protease proteolytic subunit, putative
PVVCY 0400650  26  23  27  25  conserved protein, unknown function
PVVCY 0400660  70  69  58  65  conserved Plasmodium protein, unknown function
PVVCY 0400670  148  143  138  131  conserved Plasmodium protein, unknown function
PVVCY 0400680  59  25  43  46  conserved Plasmodium protein, unknown function
PVVCY 0400690  316  335  319  321  conserved Plasmodium protein, unknown function
PVVCY 0400700  46  37  41  42  DNA polymerase delta small subunit, putative
PVVCY 0400710  124  137  140  138  conserved Plasmodium protein, unknown function
PVVCY 0400720  43  39  37  37  T-complex protein 1 subunit eta, putative
PVVCY 0400730  37  36  37  35  conserved Plasmodium protein, unknown function
PVVCY 0400740  20  19  17  18  activator of Hsp90 ATPase, putative
PVVCY 0400750  32  26  29  25  pre-mRNA-processing factor 19, putative
PVVCY 0400760  64  50  49  47  conserved Plasmodium protein, unknown function

PvvCY Gene ID  PvbDA  PvlDS  PvpCR  PvsEL  Product
PVVCY 0400770  8  8  8  8  conserved Plasmodium protein, unknown function
PVVCY 0400780  113  94  97  94  splicing factor 3B subunit 1, putative
PVVCY 0400790  51  57  56  57  dual specificity protein phosphatase, putative
PVVCY 0400800  14  13  16  19  conserved Plasmodium protein, unknown function
PVVCY 0400810  158  146  135  140  serine threonine protein kinase, putative
PVVCY 0400820  76  77  57  83  N2227-like protein, putative
PVVCY 0400840  38  46  38  44  asparagine synthetase [glutamine-hydrolyzing], putative
PVVCY 0400850  9  7  7  8  60S acidic ribosomal protein P2, putative
PVVCY 0400860  57  64  61  62  SECIS-binding protein 2, putative
PVVCY 0400870  27  25  23  23  YTH domain-containing protein, putative
PVVCY 0400880  157  185  188  188  conserved Plasmodium protein, unknown function
PVVCY 0400890  18  13  7  13  50S ribosomal protein L9, apicoplast, putative
PVVCY 0400900  41  36  38  37  calcium-dependent protein kinase 3, putative
PVVCY 0400910  413  398  404  428  phd finger protein, putative
PVVCY 0400920  104  105  105  102  phosphoglycerate mutase, putative
PVVCY 0400930  140  131  134  141  parasite-infected erythrocyte surface protein
PVVCY 0400940  10  10  12  13  eukaryotic translation initiation factor 3 subunit K, putative
PVVCY 0400950  6  9  9  8  trafficking protein particle complex subunit 4, putative
PVVCY 0400960  10  13  12  13  conserved Plasmodium protein, unknown function
PVVCY 0400970  78  63  67  68  conserved Plasmodium protein, unknown function
PVVCY 0400980  8  6  7  5  conserved Plasmodium protein, unknown function
PVVCY 0400990  37  38  36  38  pre-mRNA splicing factor, putative
PVVCY 0401000  176  163  173  183  valine–tRNA ligase, putative
PVVCY 0401010  58  48  43  50  phosphatidylinositol 3- and 4-kinase, putative
PVVCY 0401020  328  322  262  282  protein kinase, putative
PVVCY 0401030  1  1  1  1  conserved Plasmodium protein, unknown function
PVVCY 0401040  53  65  59  71  dolichyl-diphosphooligosaccharide-protein glycosyltransferase subunit 1, putative
PVVCY 0401050  22  20  22  22  plasmepsin VI, putative
PVVCY 0401060  39  41  49  46  conserved protein, unknown function
PVVCY 0401070  71  74  67  70  conserved Plasmodium protein, unknown function
PVVCY 0401080  21  12  10  11  conserved Plasmodium protein, unknown function
PVVCY 0401090  106  86  85  64  E3 ubiquitin-protein ligase, putative
PVVCY 0401100  91  83  81  81  tetratricopeptide repeat protein, putative
PVVCY 0401110  33  32  32  36  26S proteasome regulatory subunit RPN12, putative
PVVCY 0401120  38  36  35  38  glycogen synthase kinase 3, putative
PVVCY 0401130  98  83  80  85  major facilitator superfamily-related transporter, putative
PVVCY 0401160  5  4  3  4  60S ribosomal protein L26, putative
PVVCY 0401170  44  40  46  43  conserved Plasmodium protein, unknown function
PVVCY 0401180  21  22  24  19  conserved Plasmodium protein, unknown function

PvvCY Gene ID  PvbDA  PvlDS  PvpCR  PvsEL  Product
PVVCY 0401190  53  56  56  53  ubiquitin-protein ligase, putative
PVVCY 0401200  20  27  28  30  conserved Plasmodium protein, unknown function
PVVCY 0401210  24  19  20  20  conserved Plasmodium protein, unknown function
PVVCY 0401220  99  86  89  85  conserved Plasmodium protein, unknown function
PVVCY 0401230  77  64  60  57  Ras-like G protein, putative
PVVCY 0401240  56  54  54  52  conserved Plasmodium protein, unknown function
PVVCY 0401250  36  28  26  23  conserved Plasmodium protein, unknown function
PVVCY 0401260  38  42  47  40  conserved Plasmodium protein, unknown function
PVVCY 0401270  70  66  72  67  conserved Plasmodium protein, unknown function
PVVCY 0401280  23  17  11  14  co-chaperone p23, putative
PVVCY 0401290  12  7  13  15  vesicle transport v-SNARE protein, putative
PVVCY 0401300  35  30  31  27  conserved Plasmodium protein, unknown function
PVVCY 0401310  94  94  98  101  conserved Plasmodium protein, unknown function
PVVCY 0401320  23  21  21  18  DER1-like protein, putative
PVVCY 0401330  10  7  8  7  serine threonine protein phosphatase 6, putative
PVVCY 0401340  14  8  8  8  conserved Plasmodium protein, unknown function
PVVCY 0401350  15  18  18  19  conserved Plasmodium protein, unknown function
PVVCY 0401360  74  70  69  71  zinc finger protein, putative
PVVCY 0401370  67  62  56  58  conserved Plasmodium protein, unknown function
PVVCY 0401380  72  80  92  102  conserved Plasmodium protein, unknown function
PVVCY 0401390  20  18  19  18  zinc finger protein, putative
PVVCY 0401400  6  8  4  8  eukaryotic translation initiation factor 4E, putative
PVVCY 0401410  157  171  139  156  circumsporozoite- and TRAP-related protein, putative
PVVCY 0401420  157  156  169  159  conserved Plasmodium protein, unknown function
PVVCY 0401430  33  26  27  28  conserved Plasmodium protein, unknown function
PVVCY 0401440  16  14  15  15  mitochondrial ribosomal protein L29/L47 precursor, putative
PVVCY 0401450  42  37  37  38  zinc finger protein, putative
PVVCY 0401460  89  91  90  80  conserved Plasmodium protein, unknown function
PVVCY 0401470  52  47  48  44  zinc finger protein, putative
PVVCY 0401480  47  57  29  55  conserved Plasmodium protein, unknown function
PVVCY 0401490  28  22  20  25  microneme associated antigen, putative
PVVCY 0401500  16  17  14  15  mitochondrial ribosomal protein L27 precursor, putative
PVVCY 0401510  225  235  224  228  conserved Plasmodium protein, unknown function
PVVCY 0401520  49  38  35  38  inorganic pyrophosphatase, putative
PVVCY 0401530  121  124  128  127  conserved Plasmodium protein, unknown function
PVVCY 0401540  23  21  19  22  kinetochore protein NUF2, putative
PVVCY 0401550  36  34  32  31  formate-nitrite transporter, putative
PVVCY 0401560  23  18  19  17  HVA22 TB2 DP1 family protein, putative
PVVCY 0401570  5  2  6  2  40S ribosomal protein S15A, putative
PVVCY 0401580  49  79  79  71  zinc finger protein, putative

PvvCY Gene ID  PvbDA  PvlDS  PvpCR  PvsEL  Product
PVVCY 0401590  36  33  47  38  arginase, putative
PVVCY 0401600  64  85  87  96  dynein light intermediate chain 2, putative
PVVCY 0401610  69  65  73  68  Maf-like protein, putative
PVVCY 0401620  50  44  46  51  conserved Plasmodium protein, unknown function
PVVCY 0401630  11  12  17  17  vacuolar protein sorting-associated protein 46, putative
PVVCY 0401640  171  203  181  200  exoribonuclease II, putative
PVVCY 0401650  189  178  178  168  conserved Plasmodium protein, unknown function
PVVCY 0401660  127  99  109  103  coatomer subunit beta, putative
PVVCY 0401670  96  87  119  116  conserved Plasmodium protein, unknown function
PVVCY 0401680  31  31  22  21  autophagy-related protein 3, putative
PVVCY 0401690  81  84  83  84  WD repeat-containing protein 66, putative
PVVCY 0401700  14  10  13  12  conserved Plasmodium protein, unknown function
PVVCY 0401710  155  213  160  142  high molecular weight rhoptry protein 3, putative
PVVCY 0401720  420  365  358  373  dynein heavy chain, putative
PVVCY 0401730  117  103  113  114  mitochondrial carrier protein, putative
PVVCY 0401740  175  176  170  170  nucleoporin NUP100/NSP100, putative
PVVCY 0401750  10  10  9  11  conserved Plasmodium protein, unknown function
PVVCY 0401760  135  139  126  134  copper-transporting ATPase, putative
PVVCY 0401770  23  25  25  24  replication protein A1, small fragment
PVVCY 0401780  15  14  12  13  bacterial histone-like protein, putative
PVVCY 0401790  177  152  149  157  ubiquitin specific protease, putative
PVVCY 0401800  10  10  13  12  prefoldin subunit 4, putative
PVVCY 0401810  10  18  15  14  signal peptidase complex subunit 3, putative
PVVCY 0401820  200  181  200  178  conserved protein, unknown function
PVVCY 0401830  24  15  19  21  PH domain-containing protein, putative
PVVCY 0401840  86  81  79  93  AP-4 complex subunit epsilon, putative
PVVCY 0401850  24  26  25  26  GTPase-activating protein, putative
PVVCY 0401860  3  4  4  4  60S ribosomal protein L32, putative
PVVCY 0401870  105  109  104  108  LCCL domain-containing protein
PVVCY 0401880  21  23  21  18  alpha tubulin 1, putative
PVVCY 0401890  32  33  27  27  conserved Plasmodium protein, unknown function
PVVCY 0401900  91  84  80  84  conserved Plasmodium protein, unknown function
PVVCY 0401910  126  132  128  131  DEAD DEAH box helicase, putative
PVVCY 0401920  206  207  204  203  conserved Plasmodium protein, unknown function
PVVCY 0401930  6  7  9  11  ras-related protein RAB7, putative
PVVCY 0401940  5  2  4  4  protein RER1, putative
PVVCY 0401950  8  9  5  7  conserved Plasmodium protein, unknown function
PVVCY 0402020  5  8  NA  12  hypothetical protein
PVVCY 0402040  7  14  11  28  Plasmodium exported protein, unknown function
PVVCY 0500170  67  52  53  52  DNA polymerase delta catalytic subunit, putative

PvvCY Gene ID  PvbDA  PvlDS  PvpCR  PvsEL  Product
PVVCY 0500180  20  22  20  20  rhoptry neck protein 12, putative
PVVCY 0500190  16  16  25  22  enoyl-CoA hydratase-related protein, putative
PVVCY 0500200  47  47  46  49  golgi re-assembly stacking protein, putative
PVVCY 0500210  21  17  19  24  phosphomannomutase, putative
PVVCY 0500220  17  15  17  17  conserved Plasmodium protein, unknown function
PVVCY 0500230  39  35  32  38  conserved Plasmodium protein, unknown function
PVVCY 0500240  66  60  47  54  conserved Plasmodium protein, unknown function
PVVCY 0500250  93  104  100  104  conserved Plasmodium protein, unknown function
PVVCY 0500260  32  31  28  24  26S proteasome regulatory subunit p55, putative
PVVCY 0500270  42  38  29  32  tRNA pseudouridine synthase, putative
PVVCY 0500280  19  17  19  19  conserved Plasmodium protein, unknown function
PVVCY 0500290  227  226  230  231  serine threonine protein phosphatase 8, putative
PVVCY 0500300  93  79  87  85  conserved Plasmodium protein, unknown function
PVVCY 0500310  54  46  54  54  conserved Plasmodium protein, unknown function
PVVCY 0500320  12  7  8  8  PHF5-like protein, putative
PVVCY 0500330  172  190  190  190  conserved Plasmodium protein, unknown function
PVVCY 0500340  38  34  28  33  conserved Plasmodium protein, unknown function
PVVCY 0500350  51  53  52  54  conserved protein, unknown function
PVVCY 0500360  92  59  64  67  conserved Plasmodium protein, unknown function
PVVCY 0500370  297  306  322  288  eukaryotic translation initiation factor subunit eIF2A, putative
PVVCY 0500380  180  164  175  174  conserved Plasmodium protein, unknown function
PVVCY 0500390  25  49  47  45  conserved Plasmodium protein, unknown function
PVVCY 0500400  234  231  186  229  zinc finger protein, putative
PVVCY 0500410  9  6  7  6  60S ribosomal protein L30e, putative
PVVCY 0500420  119  100  104  95  conserved Plasmodium protein, unknown function
PVVCY 0500430  124  127  125  137  conserved Plasmodium protein, unknown function
PVVCY 0500440  54  52  55  52  conserved Plasmodium protein, unknown function
PVVCY 0500450  125  137  136  131  tRNA methyltransferase, putative
PVVCY 0500460  11  12  14  15  autophagy-related protein 8, putative
PVVCY 0500470  42  37  29  34  RNA-binding protein 34, putative
PVVCY 0500480  16  12  12  13  flagellar outer arm dynein-associated protein, putative
PVVCY 0500490  72  78  78  76  conserved Plasmodium protein, unknown function
PVVCY 0500500  57  53  64  60  cytoplasmic dynein intermediate chain, putative
PVVCY 0500510  25  19  27  19  rRNA (cytosine-C(5))-methyltransferase, putative
PVVCY 0500520  36  27  30  28  ribosomal silencing factor RsfS, putative
PVVCY 0500530  235  193  188  165  conserved Plasmodium protein, unknown function
PVVCY 0500540  132  130  124  129  histone acetyltransferase, putative
PVVCY 0500550  82  81  74  86  dihydrolipoamide acyltransferase, putative
PVVCY 0500560  9  8  10  8  ADP-ribosylation factor, putative
PVVCY 0500570  19  21  16  18  conserved Plasmodium protein, unknown function

PvvCY Gene ID  PvbDA  PvlDS  PvpCR  PvsEL  Product
PVVCY 0500580  187  176  172  179  conserved Plasmodium protein, unknown function
PVVCY 0500590  86  76  77  80  conserved Plasmodium protein, unknown function
PVVCY 0500600  20  19  20  21  conserved Plasmodium protein, unknown function
PVVCY 0500610  56  58  53  54  endomembrane protein 70, putative
PVVCY 0500620  46  42  40  39  ATP-dependent RNA helicase ROK1, putative
PVVCY 0500630  17  12  13  12  deoxyribose-phosphate aldolase, putative
PVVCY 0500640  621  665  633  735  conserved Plasmodium protein, unknown function
PVVCY 0500650  156  141  137  129  schizont egress antigen-1, putative
PVVCY 0500660  114  102  99  108  PHAX domain-containing protein, putative
PVVCY 0500670  82  65  69  71  RNA-binding protein, putative
PVVCY 0500680  19  18  22  23  conserved Plasmodium protein, unknown function
PVVCY 0500690  26  27  27  27  conserved Plasmodium membrane protein, unknown function
PVVCY 0500700  16  12  11  10  ZIP domain-containing protein, putative
PVVCY 0500710  26  28  22  27  serine arginine-rich splicing factor 4, putative
PVVCY 0500720  42  44  46  43  citrate synthase, mitochondrial precursor, putative
PVVCY 0500730  56  51  47  48  kelch domain-containing protein, putative
PVVCY 0500740  12  11  39  15  phospholipid scramblase, putative
PVVCY 0500750  82  77  76  73  4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase, putative
PVVCY 0500760  9  12  8  8  CDGSH iron-sulfur domain-containing protein, putative
PVVCY 0500770  74  70  63  64  conserved Plasmodium protein, unknown function
PVVCY 0500780  254  259  257  264  dynein heavy chain, putative
PVVCY 0500790  32  32  32  30  orotidine 5’-phosphate decarboxylase, putative
PVVCY 0500800  2  2  4  2  conserved Plasmodium protein, unknown function
PVVCY 0500810  10  15  11  11  HORMA domain protein, putative
PVVCY 0500820  5  3  3  2  conserved Plasmodium protein, unknown function
PVVCY 0500830  24  18  19  18  conserved Plasmodium protein, unknown function
PVVCY 0500840  101  89  91  86  conserved Plasmodium protein, unknown function
PVVCY 0500850  43  68  58  63  conserved Plasmodium protein, unknown function
PVVCY 0500860  246  221  223  217  chromodomain-helicase-DNA-binding protein 1, putative
PVVCY 0500870  26  28  18  20  conserved Plasmodium protein, unknown function
PVVCY 0500880  118  109  113  114  conserved Plasmodium protein, unknown function
PVVCY 0500890  9  10  10  9  RNA-binding protein, putative
PVVCY 0500900  9  5  7  7  conserved Plasmodium protein, unknown function
PVVCY 0500910  41  43  43  36  conserved Plasmodium protein, unknown function
PVVCY 0500920  15  14  7  13  conserved Plasmodium protein, unknown function
PVVCY 0500930  64  54  53  59  RAP protein, putative
PVVCY 0500940  42  44  39  42  conserved Plasmodium protein, unknown function
PVVCY 0500950  138  160  163  136  conserved Plasmodium protein, unknown function
PVVCY 0500960  163  185  167  161  conserved Plasmodium protein, unknown function
PVVCY 0500970  71  77  86  91  formin 2, putative

PVVCY 0500980 75 68 68 69 glutamine–fructose-6-phosphate aminotransferase [isomerizing], putative
PVVCY 0500990 24 21 23 22 ATP synthase mitochondrial F1 complex assembly factor 2, putative
PVVCY 0501000 70 64 64 64 conserved Plasmodium protein, unknown function
PVVCY 0501010 119 110 103 117 conserved Plasmodium protein, unknown function
PVVCY 0501020 539 539 556 527 conserved Plasmodium protein, unknown function
PVVCY 0501030 13 13 10 1 cytochrome c oxidase copper chaperone, putative
PVVCY 0501040 13 11 16 15 conserved Plasmodium protein, unknown function
PVVCY 0501050 5 6 6 5 conserved Plasmodium protein, unknown function
PVVCY 0501060 126 135 132 131 conserved Plasmodium protein, unknown function
PVVCY 0501070 18 29 27 23 conserved Plasmodium protein, unknown function
PVVCY 0501080 11 4 12 12 conserved Plasmodium protein, unknown function
PVVCY 0501090 59 57 50 55 conserved Plasmodium protein, unknown function
PVVCY 0501100 84 72 76 76 conserved Plasmodium protein, unknown function
PVVCY 0501110 65 52 47 54 cell division cycle protein 20 homolog, putative
PVVCY 0501120 3 3 3 3 conserved Plasmodium protein, unknown function
PVVCY 0501130 127 109 172 142 conserved Plasmodium protein, unknown function
PVVCY 0501140 7 8 10 7 40S ribosomal protein S2, putative
PVVCY 0501150 24 19 19 19 biotin–acetyl-CoA-carboxylase, putative
PVVCY 0501160 143 125 133 133 conserved Plasmodium protein, unknown function
PVVCY 0501170 31 34 28 31 U3 small nucleolar ribonucleoprotein protein MPP10, putative
PVVCY 0501180 50 52 46 47 mitochondrial ribosomal protein S22 precursor, putative
PVVCY 0501190 26 27 10 10 merozoite capping protein 1, putative peroxiredoxin, putative
PVVCY 0501200 9 15 13 13 DNA-directed RNA polymerase II subunit RPB7, putative
PVVCY 0501210 5 5 6 6 conserved Plasmodium protein, unknown function
PVVCY 0501220 31 24 24 24 topoisomerase, putative
PVVCY 0501230 8 7 6 6 centrin-3, putative
PVVCY 0501240 26 25 26 23 60S ribosomal protein L3, putative
PVVCY 0501250 16 13 12 13 palmitoyltransferase DHHC10, putative
PVVCY 0501260 51 48 50 43 methyltransferase, putative
PVVCY 0501270 63 58 38 63 protoporphyrinogen oxidase, putative
PVVCY 0501280 22 19 16 23 zinc finger, C3HC4 type, putative
PVVCY 0501290 21 11 25 26 rRNA-processing protein EBP2, putative
PVVCY 0501300 25 24 18 23 nucleolar preribosomal assembly protein, putative
PVVCY 0501310 49 51 45 49 conserved Plasmodium protein, unknown function
PVVCY 0501320 24 28 26 22 conserved Plasmodium protein, unknown function
PVVCY 0501330 106 98 145 35 merozoite TRAP-like protein, putative
PVVCY 0501340 6 3 5 3 conserved Plasmodium protein, unknown function
PVVCY 0501350 53 41 45 47 inner membrane complex protein 1m, putative

PVVCY 0501360 120 99 104 91 conserved Plasmodium protein, unknown function
PVVCY 0501370 41 41 49 50 WD repeat-containing protein, putative
PVVCY 0501380 65 57 47 56 conserved Plasmodium protein, unknown function
PVVCY 0501390 65 83 69 72 LEM3 CDC50 family protein, putative
PVVCY 0501400 6 5 NA 17 conserved Plasmodium protein, unknown function
PVVCY 0501410 44 40 54 43 adenosine deaminase, putative
PVVCY 0501420 68 66 51 51 conserved Plasmodium protein, unknown function
PVVCY 0501430 26 23 18 15 RAP protein, putative
PVVCY 0501440 128 130 137 134 conserved Plasmodium protein, unknown function
PVVCY 0501450 11 7 5 4 transcription elongation factor SPT4, putative
PVVCY 0501460 108 98 110 97 pre-mRNA-splicing factor ATP-dependent RNA helicase PRP22, putative
PVVCY 0501470 49 62 63 57 conserved Plasmodium protein, unknown function
PVVCY 0501480 99 87 96 96 conserved Plasmodium protein, unknown function
PVVCY 0501490 16 8 11 13 conserved Plasmodium protein, unknown function
PVVCY 0501500 45 36 36 38 26S proteasome regulatory subunit RPN9, putative
PVVCY 0501510 59 53 52 51 tRNA N6-adenosine threonylcarbamoyltransferase, putative
PVVCY 0501520 32 25 28 28 RNA methyltransferase, putative
PVVCY 0501530 17 17 18 16 calmodulin, putative
PVVCY 0501540 40 49 35 35 28 kDa ookinete surface protein, putative
PVVCY 0501550 38 41 41 35 25 kDa ookinete surface antigen precursor, putative
PVVCY 0501560 14 14 15 15 conserved Plasmodium protein, unknown function
PVVCY 0501570 18 16 11 12 MORN repeat-containing protein 1, putative
PVVCY 0501580 64 49 51 55 conserved Plasmodium protein, unknown function
PVVCY 0501590 55 48 48 50 conserved Plasmodium protein, unknown function
PVVCY 0501600 11 19 16 16 protein phosphatase inhibitor 3, putative
PVVCY 0501610 38 39 40 38 conserved Plasmodium protein, unknown function
PVVCY 0501620 77 93 80 92 conserved Plasmodium protein, unknown function
PVVCY 0501630 75 71 66 71 ribosome maturation factor RimM, putative
PVVCY 0501640 116 88 95 92 mRNA-decapping enzyme subunit 1, putative
PVVCY 0501650 28 23 27 27 conserved Plasmodium protein, unknown function
PVVCY 0501660 60 55 52 48 conserved Plasmodium protein, unknown function
PVVCY 0501670 48 51 43 45 phosphatidylinositol N-acetylglucosaminyltransferase subunit A, putative
PVVCY 0501680 21 24 23 25 DER1-like protein, putative
PVVCY 0501690 22 20 22 21 conserved protein, unknown function
PVVCY 0501700 43 39 41 46 conserved Plasmodium protein, unknown function
PVVCY 0501710 12 22 19 17 leucine-rich repeat protein
PVVCY 0501720 121 114 117 115 RNA polymerase II-associated protein 1, putative
PVVCY 0501730 44 35 33 32 conserved Plasmodium protein, unknown function

PVVCY 0501740 195 178 169 180 S-adenosylmethionine decarboxylase/ornithine decarboxylase, putative
PVVCY 0501760 5 NA 4 3 conserved Plasmodium protein, unknown function
PVVCY 0501780 47 45 48 43 WD repeat-containing protein 70, putative
PVVCY 0501790 57 46 43 48 myb2 transcription factor, putative
PVVCY 0501800 30 27 28 29 bromodomain protein 1, putative
PVVCY 0501810 25 25 20 27 plasmepsin VII, putative
PVVCY 0501820 10 10 10 10 ubiquitin-conjugating enzyme, putative
PVVCY 0501830 32 27 27 32 Sec1 family protein, putative
PVVCY 0501840 4 5 6 5 conserved Plasmodium protein, unknown function
PVVCY 0501850 19 16 19 14 apicoplast ribosomal protein L27 precursor, putative
PVVCY 0501860 39 35 42 38 thioredoxin-like associated protein 2, putative
PVVCY 0501870 32 27 30 28 flavoprotein subunit of succinate dehydrogenase, putative
PVVCY 0501880 126 121 118 117 conserved Plasmodium protein, unknown function
PVVCY 0501890 25 24 20 24 translation initiation factor IF-3, putative
PVVCY 0501900 10 17 11 11 ADP-ribosylation factor, putative
PVVCY 0501910 27 27 28 26 conserved Plasmodium protein, unknown function
PVVCY 0501920 62 70 67 63 methionine–tRNA ligase, putative
PVVCY 0501930 80 103 91 86 tRNA pseudouridine synthase D, putative
PVVCY 0501940 56 68 71 58 conserved Plasmodium membrane protein, unknown function
PVVCY 0501950 18 22 21 21 CCAT-binding transcription factor-like protein, putative
PVVCY 0501960 40 74 81 68 CCAT-binding transcription factor-like protein, putative
PVVCY 0501970 117 134 146 99 conserved Plasmodium protein, unknown function
PVVCY 0501980 44 64 59 84 conserved Plasmodium protein, unknown function
PVVCY 0501990 42 34 45 22 conserved Plasmodium protein, unknown function
PVVCY 0502000 15 9 14 16 conserved Plasmodium protein, unknown function
PVVCY 0502010 14 11 11 11 phosducin-like protein, putative
PVVCY 0502020 37 39 44 33 acetyl-CoA transporter, putative
PVVCY 0502030 85 89 77 80 conserved Plasmodium protein, unknown function
PVVCY 0502040 41 40 34 34 pyruvate kinase 2, putative
PVVCY 0502050 45 44 32 35 conserved Plasmodium protein, unknown function
PVVCY 0502060 17 20 17 17 ADP ATP transporter on adenylate translocase, putative
PVVCY 0502070 85 79 79 89 conserved Plasmodium protein, unknown function
PVVCY 0502080 34 40 32 36 dynamin-like protein, putative
PVVCY 0502090 32 28 29 34 DNA repair helicase RAD25, putative
PVVCY 0502100 4 7 4 5 enhancer of rudimentary homolog, putative
PVVCY 0502110 51 54 43 43 conserved Plasmodium protein, unknown function
PVVCY 0502120 8 12 16 17 conserved Plasmodium protein, unknown function
PVVCY 0502130 4 6 4 6 antigen UB05, putative
PVVCY 0502140 76 77 71 76 GDP dissociation inhibitor, putative

PVVCY 0502150 1 1 3 3 conserved Plasmodium protein, unknown function
PVVCY 0502160 13 12 13 13 conserved Plasmodium protein, unknown function
PVVCY 0502200 256 236 249 248 zinc finger protein, putative
PVVCY 0502210 70 63 67 62 serine threonine protein kinase RIO2, putative
PVVCY 0502220 146 152 124 163 conserved Plasmodium protein, unknown function
PVVCY 0502230 59 60 53 50 holo-[acyl-carrier-protein] synthase, putative
PVVCY 0502240 303 297 297 271 transcription factor with AP2 domain(s), putative
PVVCY 0502250 27 27 26 26 ribosome-recycling factor, putative
PVVCY 0502260 38 28 33 30 conserved Plasmodium protein, unknown function
PVVCY 0502270 34 50 51 48 conserved Plasmodium protein, unknown function
PVVCY 0502280 181 167 167 179 conserved Plasmodium protein, unknown function
PVVCY 0502290 9 6 6 7 conserved Plasmodium protein, unknown function
PVVCY 0502300 7 7 5 5 conserved Plasmodium protein, unknown function
PVVCY 0502310 33 33 27 32 steroid dehydrogenase, putative
PVVCY 0502320 63 58 32 53 transmembrane emp24 domain-containing protein, putative
PVVCY 0502330 420 360 375 368 erythrocyte membrane-associated antigen, putative
PVVCY 0502340 30 32 25 22 alpha tubulin 1, putative
PVVCY 0502350 18 16 15 17 40S ribosomal protein S19, putative
PVVCY 0502360 344 318 322 321 pre-mRNA-splicing helicase BRR2, putative
PVVCY 0502370 6 7 10 6 conserved Plasmodium protein, unknown function
PVVCY 0502380 31 28 31 29 eukaryotic initiation factor 4A-III, putative
PVVCY 0502390 66 66 59 63 serpentine receptor, putative
PVVCY 0502400 16 34 29 32 conserved Plasmodium protein, unknown function
PVVCY 0502410 13 9 9 9 AP-4 complex subunit sigma, putative
PVVCY 0502420 10 8 10 10 BSD-domain protein, putative
PVVCY 0502430 26 29 21 20 asparagine rich protein, putative
PVVCY 0502440 31 31 26 26 glideosome associated protein with multiple membrane spans 2, putative
PVVCY 0502450 95 106 105 104 conserved Plasmodium protein, unknown function
PVVCY 0502460 27 38 17 47 conserved Plasmodium protein, unknown function
PVVCY 0502480 75 85 89 83 tryptophan-rich antigen tryptophan-rich protein
PVVCY 0502490 103 69 90 47 tryptophan-rich protein tryptophan-rich antigen
PVVCY 0600070 28 30 24 25 XPA binding protein 1, putative
PVVCY 0600080 17 32 18 22 NIMA related kinase 3, putative
PVVCY 0600090 12 9 10 8 conserved Plasmodium protein, unknown function
PVVCY 0600100 23 12 13 11 cytochrome c oxidase assembly protein COX19, putative
PVVCY 0600110 22 17 18 20 conserved Plasmodium protein, unknown function
PVVCY 0600120 61 50 62 59 ATP dependent RNA helicase, putative
PVVCY 0600130 5 5 6 5 mitochondrial phosphate carrier protein, putative
PVVCY 0600140 87 81 87 90 dynein heavy chain, putative

PVVCY 0600150 238 233 244 241 dynein heavy chain, putative
PVVCY 0600160 20 18 19 17 peptidyl-prolyl cis-trans isomerase, putative
PVVCY 0600170 98 86 100 97 trimethylguanosine synthase, putative
PVVCY 0600180 191 175 171 178 conserved protein, unknown function
PVVCY 0600190 29 23 24 24 conserved Plasmodium protein, unknown function
PVVCY 0600200 9 9 9 10 conserved Plasmodium protein, unknown function
PVVCY 0600210 3 4 3 3 high mobility group protein B1, putative
PVVCY 0600220 118 108 123 123 origin recognition complex subunit 1, putative
PVVCY 0600230 23 19 18 17 conserved Plasmodium protein, unknown function
PVVCY 0600240 11 8 7 7 signal recognition particle subunit SRP14, putative
PVVCY 0600250 70 71 96 73 conserved Plasmodium protein, unknown function
PVVCY 0600260 83 72 78 72 major facilitator superfamily domain-containing protein, putative
PVVCY 0600270 47 40 42 47 threonylcarbamoyl-AMP synthase, putative
PVVCY 0600280 19 16 15 15 cytochrome c1 heme lyase, putative
PVVCY 0600290 34 34 32 32 nucleosome assembly protein, putative
PVVCY 0600300 11 8 8 9 ubiquitin-conjugating enzyme E2, putative
PVVCY 0600310 25 15 21 19 conserved Plasmodium protein, unknown function
PVVCY 0600320 119 150 160 165 conserved Plasmodium protein, unknown function
PVVCY 0600330 61 59 51 65 conserved Plasmodium protein, unknown function
PVVCY 0600340 7 8 5 6 eukaryotic translation initiation factor 5A, putative
PVVCY 0600350 25 30 29 28 conserved Plasmodium protein, unknown function
PVVCY 0600360 17 18 20 15 conserved Plasmodium protein, unknown function
PVVCY 0600370 26 23 26 29 cytidine deaminase, putative
PVVCY 0600380 13 8 9 13 conserved Plasmodium protein, unknown function
PVVCY 0600390 NA 8 NA NA NA
PVVCY 0600400 11 9 10 8 conserved Plasmodium protein, unknown function
PVVCY 0600410 84 105 116 118 conserved Plasmodium protein, unknown function
PVVCY 0600420 79 67 56 93 O-phosphoseryl-tRNA(Sec) selenium transferase, putative
PVVCY 0600430 56 44 51 48 HAD domain ookinete protein, putative
PVVCY 0600440 47 43 47 43 conserved Plasmodium protein, unknown function
PVVCY 0600450 90 81 89 88 kelch domain-containing protein, putative
PVVCY 0600460 189 202 206 209 zinc finger protein, putative
PVVCY 0600470 28 27 24 27 conserved Plasmodium protein, unknown function
PVVCY 0600480 10 11 12 12 targeted glyoxalase II, putative
PVVCY 0600490 205 185 187 183 high mobility group protein B3, putative
PVVCY 0600500 143 147 132 127 conserved protein, unknown function
PVVCY 0600510 24 26 24 25 shewanella-like protein phosphatase 2, putative
PVVCY 0600520 64 61 59 60 eukaryotic translation initiation factor 3 subunit C, putative
PVVCY 0600530 205 181 184 209 conserved Plasmodium protein, unknown function
PVVCY 0600540 32 34 31 34 rhodanese like protein, putative

PVVCY 0600550 61 49 58 66 Tat binding protein 1(TBP-1)-interacting protein, putative
PVVCY 0600560 89 82 77 72 DNA-directed RNA polymerase III subunit RPC2, putative
PVVCY 0600570 39 37 43 43 eukaryotic translation initiation factor 5, putative
PVVCY 0600580 55 48 52 43 conserved Plasmodium protein, unknown function
PVVCY 0600590 80 97 87 95 conserved Plasmodium protein, unknown function
PVVCY 0600600 123 118 109 107 conserved Plasmodium protein, unknown function
PVVCY 0600610 68 53 55 57 small subunit rRNA processing factor, putative
PVVCY 0600620 4 4 4 5 conserved Plasmodium protein, unknown function
PVVCY 0600630 36 32 25 34 conserved Plasmodium protein, unknown function
PVVCY 0600640 13 15 14 13 RNA-binding protein, putative
PVVCY 0600650 31 30 38 32 tRNA delta(2)-isopentenylpyrophosphate transferase, putative
PVVCY 0600660 21 15 13 18 blood stage antigen 41-3 precursor, putative
PVVCY 0600670 36 40 55 73 conserved Plasmodium protein, unknown function
PVVCY 0600680 29 27 31 28 conserved Plasmodium protein, unknown function
PVVCY 0600690 340 326 334 342 conserved Plasmodium protein, unknown function
PVVCY 0600700 393 368 355 391 cysteine repeat modular protein 3, putative
PVVCY 0600710 6 7 11 11 mitochondrial ACP precursor, putative
PVVCY 0600720 131 141 132 137 amino acid transporter, putative
PVVCY 0600730 136 137 125 118 conserved Plasmodium protein, unknown function
PVVCY 0600740 8 4 5 5 mitochondrial import inner membrane translocase subunit TIM10, putative
PVVCY 0600750 24 22 19 25 conserved protein, unknown function
PVVCY 0600760 105 91 87 91 zinc finger protein, putative
PVVCY 0600770 90 77 76 79 protein phosphatase, putative
PVVCY 0600780 19 14 12 15 conserved Plasmodium protein, unknown function
PVVCY 0600790 14 13 7 13 U6 snRNA-associated Sm-like protein LSm7, putative
PVVCY 0600800 134 127 142 138 zinc finger transcription factor, putative
PVVCY 0600810 93 93 92 93 cytosolic iron-sulfur protein assembly protein 1, putative
PVVCY 0600820 87 83 83 81 3’,5’-cyclic nucleotide phosphodiesterase, putative
PVVCY 0600830 35 32 31 28 porphobilinogen deaminase, putative
PVVCY 0600840 16 15 12 12 conserved Plasmodium protein, unknown function
PVVCY 0600850 13 8 10 8 ATP synthase mitochondrial F1 complex assembly factor 1, putative
PVVCY 0600860 80 65 65 72 ABC transporter B family member 7, putative
PVVCY 0600870 40 33 33 32 50S ribosomal protein L1, apicoplast, putative
PVVCY 0600880 27 23 22 21 SNARE protein, putative
PVVCY 0600890 98 75 75 82 zinc finger protein, putative
PVVCY 0600900 19 20 22 17 conserved Plasmodium protein, unknown function
PVVCY 0600910 33 74 71 60 conserved Plasmodium protein, unknown function
PVVCY 0600920 114 106 107 105 general transcription factor 3C polypeptide 5, putative

PVVCY 0600930 24 49 50 52 conserved Plasmodium protein, unknown function
PVVCY 0600940 153 155 153 153 conserved Plasmodium protein, unknown function
PVVCY 0600950 40 36 35 38 conserved Plasmodium protein, unknown function
PVVCY 0600960 25 23 26 31 GPI mannosyltransferase 1, putative
PVVCY 0600970 110 119 120 120 kinesin-7, putative
PVVCY 0600980 22 24 23 25 heat shock protein 20, putative
PVVCY 0600990 158 170 164 167 conserved Plasmodium protein, unknown function
PVVCY 0601000 135 120 109 121 DNA helicase MCM8, putative
PVVCY 0601010 22 19 25 23 heat shock protein DNAJ homologue Pfj4, putative
PVVCY 0601020 37 37 36 36 mitochondrial ribosomal protein S18 precursor, putative
PVVCY 0601030 224 230 228 235 lysine-specific histone demethylase 1, putative
PVVCY 0601040 47 53 49 47 DNA replication licensing factor MCM5, putative
PVVCY 0601050 155 148 160 155 non-SERCA-type Ca2+-transporting P-ATPase, putative
PVVCY 0601060 31 28 32 31 glutathione peroxidase-like thioredoxin peroxidase, putative
PVVCY 0601070 48 43 47 44 peripheral plastid protein 1, putative
PVVCY 0601080 22 14 11 11 conserved Plasmodium protein, unknown function
PVVCY 0601090 65 45 56 58 eukaryotic initiation factor 2a, putative
PVVCY 0601100 38 36 34 36 HSP40, subfamily A, putative
PVVCY 0601110 7 6 7 8 trafficking protein particle complex subunit 5, putative
PVVCY 0601120 97 92 86 88 succinyl-CoA ligase, putative
PVVCY 0601130 79 74 69 70 conserved Plasmodium protein, unknown function
PVVCY 0601140 33 18 17 30 conserved Plasmodium protein, unknown function
PVVCY 0601150 86 76 73 70 pantothenate kinase, putative
PVVCY 0601160 21 16 27 27 conserved Plasmodium protein, unknown function
PVVCY 0601170 88 71 74 80 ribonucleoside-diphosphate reductase large subunit, putative
PVVCY 0601180 21 14 15 15 conserved Plasmodium protein, unknown function
PVVCY 0601190 38 32 29 31 N-acetyltransferase, putative
PVVCY 0601200 10 10 7 8 histidine triad protein, putative
PVVCY 0601210 15 16 13 14 ATP-dependent Clp protease proteolytic subunit, putative
PVVCY 0601220 70 68 75 80 conserved Plasmodium protein, unknown function
PVVCY 0601230 141 133 146 140 conserved Plasmodium protein, unknown function
PVVCY 0601240 86 76 77 73 conserved Plasmodium protein, unknown function
PVVCY 0601250 103 96 102 96 conserved Plasmodium protein, unknown function
PVVCY 0601260 13 17 20 19 conserved Plasmodium protein, unknown function
PVVCY 0601270 145 126 139 147 conserved Plasmodium protein, unknown function
PVVCY 0601280 64 56 61 58 conserved Plasmodium protein, unknown function
PVVCY 0601290 115 96 97 101 DEAD box ATP-dependent RNA helicase, putative
PVVCY 0601320 16 15 15 15 conserved Plasmodium protein, unknown function
PVVCY 0601330 100 108 169 142 conserved Plasmodium protein, unknown function
PVVCY 0601340 548 484 522 512 conserved Plasmodium protein, unknown function

PVVCY 0601350 175 189 173 184 DNA polymerase alpha catalytic subunit A, putative
PVVCY 0601360 80 131 87 79 conserved Plasmodium protein, unknown function
PVVCY 0601370 34 38 36 34 mitochondrial ribosomal protein S12 precursor, putative
PVVCY 0601380 42 39 36 41 conserved Plasmodium protein, unknown function
PVVCY 0601390 58 51 55 55 phosphopantothenoylcysteine synthetase, putative
PVVCY 0601400 76 60 66 61 RNA-binding protein, putative
PVVCY 0601410 68 68 65 73 protein SDA1, putative
PVVCY 0601420 48 73 68 78 conserved Plasmodium protein, unknown function
PVVCY 0601430 46 40 37 40 conserved Plasmodium protein, unknown function
PVVCY 0601440 14 15 13 12 transcription initiation factor IIA subunit 1, putative
PVVCY 0601450 36 25 15 24 conserved Plasmodium protein, unknown function
PVVCY 0601460 43 41 49 34 cysteine desulfurase, putative
PVVCY 0601470 122 102 101 99 conserved Plasmodium protein, unknown function
PVVCY 0601480 19 14 13 12 eukaryotic translation initiation factor 3 subunit I, putative
PVVCY 0601490 60 52 69 65 drug metabolite transporter, putative
PVVCY 0601500 30 28 32 34 conserved Plasmodium membrane protein, unknown function
PVVCY 0601510 82 82 87 82 conserved Plasmodium protein, unknown function
PVVCY 0601520 41 32 35 38 conserved Plasmodium protein, unknown function
PVVCY 0601530 33 31 33 28 transcription initiation factor IIE subunit alpha, putative
PVVCY 0601540 51 47 41 53 queuine tRNA-ribosyltransferase, putative
PVVCY 0601550 23 18 13 13 calcium-dependent protein kinase 4, putative
PVVCY 0601560 67 60 64 63 conserved Plasmodium protein, unknown function
PVVCY 0601570 30 27 32 36 serine–tRNA ligase, putative
PVVCY 0601580 102 95 93 96 conserved Plasmodium protein, unknown function
PVVCY 0601590 25 24 27 29 thioredoxin-like protein
PVVCY 0601600 305 305 311 296 dynein heavy chain, putative
PVVCY 0601610 128 124 125 122 exported serine/threonine protein kinase, putative
PVVCY 0601620 222 179 183 174 cysteine repeat modular protein 2, putative
PVVCY 0601630 4 3 1 3 mitochondrial ribosomal protein S8 precursor, putative
PVVCY 0601640 9 6 7 8 prefoldin subunit 3, putative
PVVCY 0601650 108 103 101 105 conserved Plasmodium protein, unknown function
PVVCY 0601660 31 27 29 29 conserved Plasmodium protein, unknown function
PVVCY 0601670 30 24 26 23 conserved Plasmodium protein, unknown function
PVVCY 0601680 10 14 14 11 conserved Plasmodium protein, unknown function
PVVCY 0601690 6 5 5 5 conserved Plasmodium protein, unknown function
PVVCY 0601700 14 18 18 19 NIMA related kinase 4, putative
PVVCY 0601710 92 78 84 83 actin-related protein, putative
PVVCY 0601720 199 171 207 196 conserved Plasmodium protein, unknown function
PVVCY 0601730 36 32 29 32 LEM3 CDC50 family protein, putative
PVVCY 0601740 17 14 17 17 60S ribosomal protein L11a, putative

PVVCY 0601750 6 10 9 9 40S ribosomal protein S10, putative
PVVCY 0601760 58 54 64 64 conserved Plasmodium protein, unknown function
PVVCY 0601770 328 318 327 309 conserved Plasmodium protein, unknown function
PVVCY 0601780 15 14 12 14 exosome complex component CSL4, putative
PVVCY 0601790 16 14 16 17 small subunit rRNA processing protein, putative
PVVCY 0601800 2 10 12 9 conserved Plasmodium protein, unknown function
PVVCY 0601810 8 12 10 12 conserved Plasmodium protein, unknown function
PVVCY 0601820 82 74 56 81 ferrodoxin reductase-like protein, putative
PVVCY 0601830 9 6 7 9 conserved Plasmodium protein, unknown function
PVVCY 0601840 26 15 22 22 conserved Plasmodium protein, unknown function
PVVCY 0601850 218 216 203 212 phosphoinositide-binding protein, putative
PVVCY 0601860 22 33 18 32 Ham1-like protein, putative
PVVCY 0601870 15 11 8 10 conserved Plasmodium protein, unknown function
PVVCY 0601880 230 242 237 228 conserved Plasmodium protein, unknown function
PVVCY 0601890 21 20 19 20 conserved Plasmodium protein, unknown function
PVVCY 0601900 102 116 110 110 conserved Plasmodium protein, unknown function
PVVCY 0601910 72 65 74 74 ATP-dependent RNA helicase DBP7, putative
PVVCY 0601920 21 23 22 22 conserved Plasmodium protein, unknown function
PVVCY 0601930 61 52 76 60 conserved Plasmodium protein, unknown function
PVVCY 0601940 7 10 10 10 40S ribosomal protein S5, putative
PVVCY 0601950 43 45 48 48 secreted ookinete protein, putative
PVVCY 0601960 7 7 3 7 conserved Plasmodium protein, unknown function
PVVCY 0601970 2 4 4 1 V-type ATPase V0 subunit e, putative
PVVCY 0601980 19 19 22 20 conserved Plasmodium protein, unknown function
PVVCY 0601990 41 54 48 51 PelOta protein homologue, putative
PVVCY 0602000 101 91 89 89 rhoptry-associated leucine zipper-like protein 1, putative
PVVCY 0602010 85 70 69 73 zinc finger protein, putative
PVVCY 0602020 25 19 23 22 Obg-like ATPase 1, putative

C Brief overview of genetics in malaria.

Study Model Study Type Phenotype Genetic Genetic marker Parent isolate A Parent isolate B Main conclusion marker type Walliker et al., RMP Genetic cross Pyrimethamine Allozyme Glucose phosphate P. y. yoelii 17X P. y. yoelii 33X Genetic cross confirmed by appearance of GPI-2 1972 resistance isomerase (Pyr-R, GPI-1) (Pyr-S, GPI-2) expressing, pyrimethamine resistant parasites

Oxbrow, 1973 RMP Genetic cross Cross-immunity Allozyme Glucose phosphate P. y. yoelii NK65 P. y. nigeriensis N67 Cross-immunity is genetics-driven, undergoes isomerase (Pyr-R, cannot survive (Pyr-S, can survive independent recombination with other markers. in immune mice, GPI-1) in immune mice, GPI-2) Walliker et al., RMP Genetic cross Pyrimethamine Allozyme 6-phosphogluconate P. c. chabaudi 47AS P. c. chabaudi 10AJ Segregation of enzyme markers suggest blood stage 1975 resistance dehydrogenase, lactate (Pyr-R, 6PGD-2, LDH-3) (Pyr-S, 6PGD-3, LDH-2) parasites are genetically haploid. dehydrogenase

Walliker et al., RMP Genetic cross Virulence Allozyme Glucose phosphate P. y. yoelii YM P. y. yoelii A/C Virulence is genetics-driven and inherited in a simple 1976 isomerase (Pyr-S, virulent, GPI-1) (Pyr-R, avirulent, GPI-2) Mendelian fashion.

Rosario et al., RMP Genetic cross Chloroquine Allozyme 6-phosphogluconate P. c. chabaudi 411AS P. c. chabaudi 96AJ Chloroquine resistance is genetics-driven and inherited 1976 resistance dehydrogenase, lactate (CQ-R, Pyr-R, (CQ-S, Pyr-S, in a simple Mendelian fashion. dehydrogenase 6PGD-2, LDH-3) 6PGD-3, LDH-2) Study Model Study Type Phenotype Genetic Genetic marker Parent isolate A Parent isolate B Main conclusion marker type

Knowles et al., RMP Genetic cross Pyrimethamine Allozyme Glucose phosphate P. y. yoelii 17X P. y. nigeriensis N67 Parental: recombinant clone ratios follow Mendelian 1981 resistance isomerase, glutamate (Pyr-R, GPI-1, (Pyr-S, GPI-2, genetics. Genetic cross can be performed between dehydrogenase, adenosine GDH-4, ADA-2) GDH-2, ADA-1) subspecies. deaminase

Carlton et al., RMP Classical linkage Chloroquine RFLP 46 RFLP markers P. c. chabaudi AJ P. c. chabaudi AS(3CQ) Chloroquine resistance linked to chromosome 11 1998 analysis resistance (CQ sensitive) (CQ resistant)

Hayton et al., RMP Classical linkage Sulphadoxine- RFLP RFLP markers from Carlton P. c. chabaudi AJ P. c. chabaudi AS(50S/P) S/P resistance not associated with mutations in dhfr 2002 analysis pyrimethamine et al., 1998 (Pyr-S, S-R) (Pyr-R, S-R) gene resistance

Cravo et al., 2003 RMP Classical linkage Mefloquine RFLP RFLP markers from Carlton P. c. chabaudi AJ P. c. chabaudi AS(15MF/3) pcmdr1 gene duplication and another analysis resistance et al., 1998 (Mef-S) (Mef-R) unknown gene is associated with mefloquine resistance.

Culleton et al., RMP LGS Pyrimethamine AFLP 206 AFLP markers P. c. chabaudi AJ P. c. chabaudi AS-PYR1 A new genetic method, Linkage Group Selection, is fast 2005 resistance (Pyr-S) (Pyr-R) and efficient by dispensing off the need to clone recombinant progeny.

Martinelli et al., RMP LGS Strain-specific AFLP 275 AFLP markers P. c. chabaudi CB P. c. chabaudi AS-PYR1 msp1 locus is a major target antigen associated with 2005 immunity strain-specific immunity

Pattaradilokrat et RMP LGS Strain-specific AFLP 92 AFLP markers P. c. chabaudi CB-PYR10 P. c. chabaudi AJ msp1 locus is a major target antigen associated with al., 2007 immunity strain-specific immunity

Pattaradilokrat et RMP LGS Growth rate/ host AFLP 108 AFLP markers P. y. yoelii 17XYM P. y. yoelii 33XC Growth rate and host cell invasion preference is linked al., 2009 cell preference (invades reticulocytes (invades reticulocytes to pyebl gene encoding for a erythrocyte binding ligand and normocytes, fast grower) only, slow grower) (a C713R mutation). Study Model Study Type Phenotype Genetic Genetic marker Parent isolate A Parent isolate B Main conclusion marker type Li et al., 2011 RMP QTL linkage Growth rate Microsatellites 539 MS P. y. yoelii 17XNL P. y. nigeriensis N67 C741Y (N67) and C713R (YM) mutations in PyEBL is analysis (slow grower); (fast grower); associated with growth rate. Genetic linkage map for P. P. y. yoelii YM; P. y. yoelii 33X; yoelii with average recombination rate of 39.7 kb/cM P. y. yoelii BY265 P. y. nigeriensis NSM Abkallo et al., RMP LGS Growth rate SNPs 29,053 SNPs (called by P. y. yoelii 17X1.1pp P. y. yoelii CU Simultaneous detection of two loci (one containing 2017 Strain-specific NGS) (fast grower) (slow grower) msp1 gene) linked to strain-specific immunity. C351Y immunity substitution in PyEBL associated with host cell preference and growth rate.

Nair et al., 2017 RMP QTL linkage analysisGrowth rate SNPs 11,000 SNPs (called by P. y. yoelii YM P. y. yoelii N67 HECT-like E3 ubiquitin ligase regulates parasite growth microarray) (kills host on day 5) (kills host on day 15) and virulence.

Walliker et al., Pfalc Genetic cross Pyrimethamine Allozyme, Adenosine deaminase P. falciparum 3D7 P. falciparum HB3 Genetic cross confirmed in P. falciparum by 1987 resistance Antigen (Pyr sensitive, ADA-1) (Pyr resistant, ADA-2) recombination of isoenzyme and drug resistance markers.

Wellems et al., Pfalc Classical linkage Chloroquine RFLP 2 RFLP probes - against P. falciparum P. falciparum Chloroquine resistance not linked to mdr-like genes. 1990 analysis resistance pfmdr1 and pfmdr2 HB3(CQ sensitive) Dd2(CQ resistant)

Wellems et al., Pfalc Classical linkage Chloroquine RFLP 85 RFLP markers HB3 X Dd2 cross progeny from Wellems et al., 1990 Chloroquine resistance is linked to ∼400kB region of 1991 analysis resistance chromosome 7

Su et al., 1997 Pfalc Classical linkage Chloroquine Microsatellites 24 markers within HB3 X Dd2 cross progeny from Wellems et al., 1990 Chloroquine resistance is linked to polymorphisms in analysis resistance ∼400 kb region cg2 gene encoding for a ∼330kDa protein in 36kb in chromosome 7 region of chromosome 7.

Ferdig et al., 2004 Pfalc QTL linkage Quinine Microsatellites, Markers from HB3 X Dd2 cross progeny from Wellems et al., 1990 pfcrt, pfmdr1, pfnhe-1 associated with quinine resistance analysis resistance RFLP Su et al., 1999 Study Model Study Type Phenotype Genetic Genetic marker Parent isolate A Parent isolate B Main conclusion marker type

Hayton et al., Pfalc Classical linkage Host Microsatellites 285 MS P. falciparum 7G8 P. falciparum GB4 PfRH5, a putative erythrocyte binding protein is a 2008 analysis receptor (does not infect Aotus monkeys) (infects Aotus monkeys) parasite ligand for host receptors. Average recognition recombination rate of 36 kb/cM

Gonzales et al., Pfalc eQTL analysis Gene Microsatellites 329 MS HB3 X Dd2 cross progeny from Wellems et al., 1990 12 transcription regulatory hotspots identified. 2008 expression

Molina-Cruz et al., 2013 | Pfalc | LGS | Vector immune evasion | Microsatellites | 26 MS in chr 13 | P. falciparum 7G8 (cannot infect A. gambiae (R)) | P. falciparum GB4 (can infect A. gambiae (R)) | Pfs47 mediates evasion of mosquito immune system. Cross achieved by artificial feed via membrane feeding.

Walker-Jonah et al., 1992 | Pfalc | Linkage map | NA | RFLP | 90 RFLP markers | HB3 X Dd2 cross progeny from Wellems et al., 1990 | | Average recombination rate of 15-30 kb/cM.

Su and Wellems, 1996 | Pfalc | Linkage map | NA | Microsatellites | 188 polymorphic MS (from total of 507 identified) | HB3 X Dd2 cross progeny from Wellems et al., 1990 | | Microsatellites occur abundantly in P. falciparum and can be used to construct high density genetic maps.

Su et al., 1999 | Pfalc | Linkage map | NA | Microsatellites, RFLP | 901 RFLPs and MS | HB3 X Dd2 cross progeny from Wellems et al., 1990 | | Average recombination rate of 17 kb/cM.

Martinelli et al., 2005 | RMP | Linkage map | NA | AFLP | 672 AFLP markers | P. chabaudi AJ (CQ sensitive) | P. chabaudi AS(30CQ) (CQ resistant) | Genetic linkage map for P. chabaudi. Average recombination rate of 15.1 kb/cM.

Jiang et al., 2011 | Pfalc | Linkage map | NA | SNPs, Microsatellites | 3,184 SNPs (microarrays) and 254 MS | 7G8 X GB4 cross progeny from Hayton et al., 2008 | | High density genetic map - one marker per 6.3 kb. Average recombination rate of 12.8 kb/cM.

Miles et al., 2016 | Pfalc | Linkage map | NA | SNPs | 3D7 X HB3: 15,388 SNPs, 26,699 indels; HB3 X Dd2: 14,885 SNPs, 14,885 indels; 7G8 X GB4: 14,392 SNPs, 20,079 indels (NGS) | 3D7 X HB3 cross progeny from Walliker et al., 1987; HB3 X Dd2 cross progeny from Wellems et al., 1990; 7G8 X GB4 cross progeny from Hayton et al., 2008 | | Average recombination rate of 12.7-14.3 kb/cM.

D Pan-vinckei primers for genotyping recombinant clones.

Two polymorphic genes were chosen for each of the fourteen chromosomes, one located closest to each chromosome end. Regions were chosen within these genes such that they contained at least two distinguishing SNPs specific to each of the ten P. vinckei isolates. Forward (fwd) and reverse (rev) primers were designed against identical sequences flanking the regions containing the SNPs. One primer of each pair (seq) was also used as the sequencing primer for Sanger sequencing.
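The selection criteria above can be illustrated with a minimal sketch. It is not the procedure used to design the primers listed below; it assumes the candidate region is already available as an ungapped alignment of one sequence per isolate, and it reads "distinguishing SNPs" as "every pair of isolates differs at two or more sites", which is only one possible interpretation. The function names, the `flank` parameter and the toy sequences are hypothetical.

```python
from itertools import combinations


def snp_columns(aligned):
    """Return alignment columns where the isolates do not all share one base."""
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    return [i for i in range(length) if len({s[i] for s in seqs}) > 1]


def region_is_usable(aligned, flank, min_snps=2):
    """Check a candidate region against the two criteria in the text.

    Assumption: the first and last `flank` columns are the primer sites and
    must be identical in all isolates; every pair of isolates must differ at
    `min_snps` or more of the remaining columns.
    """
    length = len(next(iter(aligned.values())))
    cols = snp_columns(aligned)
    # Any variable column inside a primer site disqualifies the region.
    if any(c < flank or c >= length - flank for c in cols):
        return False
    # Every pair of isolates must be separable by enough SNPs.
    for a, b in combinations(aligned, 2):
        diffs = sum(1 for c in cols if aligned[a][c] != aligned[b][c])
        if diffs < min_snps:
            return False
    return True


# Toy alignment (hypothetical data): 4-base flanks around a 4-base core.
toy = {
    "iso1": "ACGTAAAATTGA",
    "iso2": "ACGTCCAATTGA",
    "iso3": "ACGTAACCTTGA",
}
usable = region_is_usable(toy, flank=4)
```

In this toy case the flanks are identical and each pair of isolates differs at two or more core positions, so the region passes both checks.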

S. no | Gene product | Gene-Primer name | Primer sequence | Primer size

1 | conserved Plasmodium protein, unknown function | PVPCR 0100470 fwd | tccttttctcatttttgtatcatt | 24
1 | conserved Plasmodium protein, unknown function | PVPCR 0100470 rev seq | ttggaaaagagtttggact | 19
2 | leucine-rich repeat protein | PVPCR 0101270 fwd seq | acatctctcaatctaactattt | 22
2 | leucine-rich repeat protein | PVPCR 0101270 rev | agatgtagatactgtaaagaatca | 24

3 | replication factor c protein, putative | PVPCR 0200310 fwd seq | aatcggtagctagcca | 16
3 | replication factor c protein, putative | PVPCR 0200310 rev | tttgtcatatacgatttttcttg | 23
4 | cysteine desulfurase, putative | PVPCR 0201200 fwd seq | ttatgctatttttgaaatcttctaatt | 27
4 | cysteine desulfurase, putative | PVPCR 0201200 rev | caacaaaaacgggtgttat | 19

5 | DNA repair protein RAD2, putative | PVPCR 0300520 fwd seq | atctacttcctttacagtgg | 20
5 | DNA repair protein RAD2, putative | PVPCR 0300520 rev | ttgttgaaattgaggaggat | 20
6 | MtN3-like protein | PVPCR 0301460 fwd | tgacgagaaacaaaataagg | 20
6 | MtN3-like protein | PVPCR 0301460 rev seq | atatcgatgatcctattgataatg | 24

7 | spindle pole body protein, putative | PVPCR 0400440 fwd seq | tggaaatttaatatacataaaggagt | 26
7 | spindle pole body protein, putative | PVPCR 0400440 rev | gttcattatttgcagaaaaagac | 23
8 | exoribonuclease II, putative | PVPCR 0401800 fwd | tgaaaatgtagaaatgaaaaaagg | 24
8 | exoribonuclease II, putative | PVPCR 0401800 rev seq | aatatcatcatttcttctattagcat | 26

9 | tyrosine kinase-like protein, putative | PVPCR 0500580 fwd seq | taatagaactatgatctttttcaaatg | 27
9 | tyrosine kinase-like protein, putative | PVPCR 0500580 rev | ctagtttatcaataaatgtgtctg | 24
10 | ubiquitin activating enzyme, putative | PVPCR 0502050 fwd | ctataaattgataaactttttttaaatttgt | 31
10 | ubiquitin activating enzyme, putative | PVPCR 0502050 rev seq | aagaaaatcaaaaatggacacc | 22
11 | trimethylguanosine synthase, putative | PVPCR 0600310 fwd seq | ttaatatatattttattaaaaaaactaaaattatcaaa | 38
11 | trimethylguanosine synthase, putative | PVPCR 0600310 rev | tgatgaagtatcttttcaaataaatg | 26
12 | lysine-specific histone demethylase 1, putative | PVPCR 0602430 fwd seq | tttttaatggtaaaacaaatgaatataatac | 31
12 | lysine-specific histone demethylase 1, putative | PVPCR 0602430 rev | gttttcatttttgactatttcaaca | 25
13 | kinesin-19, putative | PVPCR 0700450 fwd | gttatattattactatcaaggtatttatattg | 32
13 | kinesin-19, putative | PVPCR 0700450 rev seq | aaagatgataccaatttgaatagc | 24

14 | histone acetyltransferase, putative | PVPCR 0701930 fwd seq | gaggatcatcagaatataaatataataaat | 30
14 | histone acetyltransferase, putative | PVPCR 0701930 rev | cagtttcacgtgtaagtac | 19


15 | AAA family ATPase, putative | PVPCR 0800780 fwd | ttttcgaattgatataattaaaagctt | 27
15 | AAA family ATPase, putative | PVPCR 0800780 rev seq | attttatgaatttcgatccattattt | 26

16 | vacuolar protein sorting-associated protein 33, putative | PVPCR 0803550 fwd | tggatttcgaattttatttttttg | 24
16 | vacuolar protein sorting-associated protein 33, putative | PVPCR 0803550 rev seq | agtgatgatgctttgaaatt | 20
17 | CCR4-NOT transcription complex subunit 1, putative | PVPCR 0900280 fwd | aaacctctcaaatatatcagataaaaata | 29
17 | CCR4-NOT transcription complex subunit 1, putative | PVPCR 0900280 rev seq | cgatattaagtaaacatagagtattg | 26
18 | apical membrane antigen 1, putative | PVPCR 0903290 fwd seq | tatgatatagaaaatgtgcatgg | 23
18 | apical membrane antigen 1, putative | PVPCR 0903290 rev | gtccaaatttagcattttttacac | 24

19 | conserved Plasmodium protein, unknown function | PVPCR 1000440 fwd | cacgacaacttcaactatt | 19
19 | conserved Plasmodium protein, unknown function | PVPCR 1000440 rev seq | agtattgtaactgttgttatcg | 22
20 | conserved Plasmodium protein, unknown function | PVPCR 1002890 fwd seq | atattggtaacaatacgatgttt | 23
20 | conserved Plasmodium protein, unknown function | PVPCR 1002890 rev | cttcatctttcacttgatcaatat | 24
21 | trafficking protein particle complex subunit 8, putative | PVPCR 1100500 fwd | acaactctatagatttagaaaatgg | 25
21 | trafficking protein particle complex subunit 8, putative | PVPCR 1100500 rev seq | gtggacgaagatataaagtga | 21
22 | DNA repair endonuclease, putative | PVPCR 1104480 fwd | cttgacattgtttattattttatgca | 26
22 | DNA repair endonuclease, putative | PVPCR 1104480 rev seq | atacaaaattggtaacaatttctct | 25
23 | U5 small nuclear ribonuclear protein, putative | PVPCR 1200300 fwd seq | tcacatgcaactttaaaaacag | 22
23 | U5 small nuclear ribonuclear protein, putative | PVPCR 1200300 rev | atgcaaaaaataaaactttaagtatatattc | 31
24 | phosphoenolpyruvate/phosphate translocator, putative | PVPCR 1204580 fwd | atttaaagcctttttattttctatattatataa | 33
24 | phosphoenolpyruvate/phosphate translocator, putative | PVPCR 1204580 rev seq | cgtaataattgcgtttgttatc | 22
25 | DEAD/DEAH helicase, putative | PVPCR 1300780 fwd | atgacaaaagtaaaaattaatagattac | 28
25 | DEAD/DEAH helicase, putative | PVPCR 1300780 rev seq | tactaatgatttcattggggc | 21

26 | eukaryotic translation initiator factor subunit eIF2A, putative | PVPCR 1305600 fwd | atcattttttatatgcataagaagttt | 27
26 | eukaryotic translation initiator factor subunit eIF2A, putative | PVPCR 1305600 rev seq | gagctggaggaaatctttat | 20

27 | conserved Plasmodium protein, unknown function | PVPCR 1400520 fwd | catgtatattattttttgttaatgattcttc | 31
27 | conserved Plasmodium protein, unknown function | PVPCR 1400520 rev seq | ttatgacatcttaaatatattgttacaaaaa | 31
28 | osmiophilic body protein G377, putative | PVPCR 1406630 fwd seq | attttaatgttttacctaatatgtatatatct | 32
28 | osmiophilic body protein G377, putative | PVPCR 1406630 rev | taattattaacttagaaatagataatcaaat | 31
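When the primer list is transcribed for ordering or reuse, the "Primer size" column offers a simple consistency check, since it should equal the length of the listed sequence. The following is a minimal sketch of such a check on three entries copied from the table; the helper names are hypothetical, and the name parsing assumes primer names are whitespace-separated exactly as printed here.

```python
def parse_primer_name(name):
    """Split a name such as 'PVPCR 0100470 rev seq' into its parts.

    Returns (gene_id, direction, is_sequencing_primer). Assumes the layout
    '<prefix> <number> <fwd|rev> [seq]' as printed in the table.
    """
    parts = name.split()
    gene_id = " ".join(parts[:2])
    direction = parts[2]
    is_seq = "seq" in parts[3:]
    return gene_id, direction, is_seq


def size_mismatches(rows):
    """Return names of primers whose stated size differs from len(sequence)."""
    return [name for name, seq, size in rows if len(seq) != size]


# Three rows copied from the table: (primer name, sequence, stated size).
rows = [
    ("PVPCR 0100470 fwd", "tccttttctcatttttgtatcatt", 24),
    ("PVPCR 0100470 rev seq", "ttggaaaagagtttggact", 19),
    ("PVPCR 0200310 fwd seq", "aatcggtagctagcca", 16),
]
mismatches = size_mismatches(rows)
```

An empty `mismatches` list means every stated size agrees with its sequence; any name returned points to a row worth rechecking against the original table.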

E Appendix Figures

Figure E.1: Erythrocyte membrane antigen 1 pseudogenes in P. vinckei. Shown is the multiple sequence alignment of the first 50 amino acids of ema1 genes in P. vinckei with a signal peptide. The S5X mutation within the signal peptide region is found in all pseudogenes (with the exception of one or two) belonging to the ema1 family and is vinckei-specific (not present in the single P. chabaudi pseudogene).

Figure E.2: Genes under positive selection within P. vinckei subspecies. Different genes are under positive selection (Ka/Ks ratio > 1) within different P. vinckei subspecies.

[Figure E.3 panel labels: Day 3 feed; Oocysts, Day 7 post-feed; Cross progeny, Day 2 post-appearance; Before growth selection; After growth selection. Axis: percentages of EH and EL msp1 alleles (0 to 100). Legend: pvsEH, pvsEL.]

Figure E.3: Measurement of PvsEH and PvsEL msp1 allele proportions at various points during transmission.