
154 Update TRENDS in Parasitology Vol.20 No.4 April 2004

Advances in schistosome genomics

Najib M.A. El-Sayed1, Daniella Bartholomeu1, Alasdair Ivens2, David A. Johnston3 and Philip T. LoVerde4

1Department of Parasite Genomics, The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850, USA
2Pathogen Unit, The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
3Department of Zoology, The Natural History Museum, Cromwell Road, South Kensington, London, SW7 5BD, UK
4Department of Microbiology and Immunology, 138 Farber Hall, School of Medicine and Biomedical Sciences, 3435 Main Street, State University of New York, Buffalo, NY 14214, USA

In Spring 2004, the first draft of the 270 Mb genome of Schistosoma mansoni will be released. This sequence is based on the assembly and annotation of a >7.5-fold coverage, shotgun sequencing project. The key stages involved in the international collaborative efforts that have led to the generation of these sequencing data for the parasite S. mansoni are discussed here.

Despite advances in control measures, schistosomiasis remains a major cause of morbidity in 76 countries [1]. As a consequence, in 1993, the World Health Organization (http://www.who.int) advocated genomics as a way to obtain information that could be translated into new control tools. Parasite genomics offer the best prospects for identifying new targets for drugs and vaccines, in addition to improving diagnostics and dissecting the biological basis of host–parasite interactions. The Schistosoma Genome Network was set up in 1994 to achieve those aims (Table 1).

Schistosoma mansoni has a genome of ~270 Mb that is 34% G + C [2–4]. The genome is ~40% repetitive, with the remaining 60% representing single-copy sequences or small gene-families [3]. The genetic information is contained on eight pairs of chromosomes: seven autosomal pairs and one pair of sex chromosomes [5]. The genome was estimated to contain between 15 000 and 20 000 genes [6].

Before 1994, there were ~220 sequences in the database for all schistosome species. Franco et al. [7] were the first to use the expressed sequence tag (EST) strategy for schistosomes [8] to initiate the era of gene discovery. Currently, there are 139 064 public S. mansoni ESTs, representing 92% of the S. mansoni gene complement [9] (Table 1). The recent dramatic increase in the number of S. mansoni ESTs is a result of efforts by the Organization for Nucleotide Sequencing and Analysis (ONSA) network in Sao Paulo, Brazil (Table 1). In addition, the Minas Gerais Genome Network EST project expects to deposit another 100 000 ESTs by early 2004 (Table 1). A major EST initiative for Schistosoma japonicum by the Chinese National Human Genome Centre in Shanghai has brought the total number of ESTs in the public domain for this species to 45 902 [10] (Table 1).

Corresponding author: Philip T. LoVerde ([email protected]).

Genome sequencing strategy and progress

As a prelude to whole genome sequencing of S. mansoni, bacterial artificial chromosome (BAC)-end sequencing was undertaken using two BAC libraries. One (Sm1) represents ~8.5-fold haploid genome equivalents with a mean insert size of 100 kb [11], and the other (CHORI-103) represents ~19-fold haploid genome equivalents with a mean insert size of 140 kb (http://bacpac.chori.org/schis103.htm). In early 2001, Genoscope (http://www.genoscope.cns.fr) generated ~14 750 BAC-end sequences using the Sm1 library, and The Institute for Genomic Research (TIGR; http://www.tigr.org) generated 31 775 end-sequences from both BAC libraries. The end sequences not only assisted with gene discovery in the early stages of the project through the generation of 21 Mb of discontinuous sequence, but they also serve as markers for the construction of a high-resolution sequence-ready map by providing, on average, a marker of 600–700 bp every ~8500 bp.

TIGR also initiated sequencing of 8 Mb of S. mansoni genomic regions, by iteratively selecting minimally overlapping BACs for complete sequencing. TIGR initially sequenced nine selected seed BAC clones (~1 Mb) from various S. mansoni chromosomes, which was followed by an effort to extend outwards to develop contigs of BAC clones using the ‘map-as-you-go’ approach [12]. Unfortunately, although restriction enzymes with distinct specificities had been used to construct the BAC libraries to provide different genomic representations, much of the BAC-end sequence fell within genome-wide repeats (many of which appear to be retroviral elements) [13]. The ‘map-as-you-go’ strategy was foiled by the density of such interspersed repeat sequences, leading to consideration of alternative strategies for sequencing the S. mansoni genome.

Continuous improvements in the technologies used for DNA sequencing, enhancements of assembly algorithms, reduction in sequencing costs and the award of additional funds from the National Institutes of Health (http://www.nih.gov/) have enabled TIGR to adopt the whole-genome shotgun (WGS) approach as an optimal sequencing strategy. This involves the generation of a very large number of sequence reads (to obtain a coverage range from onefold to eightfold, depending on the goals of the project) from both ends of randomly selected clones derived from different insert-size libraries. The generation of paired end-sequences greatly improves the efficiency of the sequence assembly by

Table 1. Schistosoma genome websites of interest (a)

Genome sequencing and annotation
TIGR Schistosoma mansoni Genome Project: http://www.tigr.org/tdb/e2k1/sma1
The S. mansoni Genome Project at WTSI: http://www.sanger.ac.uk/Projects/S_mansoni/
TIGR FTP server: http://www.tigr.org/tigr-scripts/license/new.pl?genre=euk
WTSI FTP server: ftp://ftp.sanger.ac.uk/pub/databases/Trematode/
WTSI trace server: http://trace.ensembl.org
TIGR S. mansoni annotation database: http://www.tigr.org/tdb/e2k1/sma1/sma1.shtml
GeneDB annotation and curation database: http://www.genedb.org
The S. japonicum Genome Project at the Chinese National Human Genome Center, Shanghai: http://schistosoma.chgc.sh.cn

Gene discovery
WTSI EST Project: http://www.sanger.ac.uk/Projects/S_mansoni/
TIGR S. mansoni Gene Index (SmGI): http://www.tigr.org/tdb/tgi/smgi/
Shanghai S. japonicum EST project: http://schistosoma.chgc.sh.cn
ONSA S. mansoni ORESTES Project: http://verjo18.iq.usp.br/schisto/
Centro de Pesquisas Rene Rachou-FIOCRUZ EST project, gene ontology and microsatellite analysis: http://www.cpqrr.fiocruz.br/dna
Universidade Federal de Minas Gerais EST project: http://www.icb.ufmg.br/~lgb/schisto
Minas Gerais Genome Network EST Project: http://bioinfo.cenapad.ufmg.br
WTSI microarrays: http://www.sanger.ac.uk/PostGenomics/PathogenArrays/Schisto/

Resources and repositories
Schistosoma Genome Network: http://www.nhm.ac.uk/hosted_sites/schisto/
S. mansoni BAC library at BACPAC resources at the CHORI: http://bacpac.chori.org/schis103.htm

(a) Abbreviations: BAC, bacterial artificial chromosome; BACPAC, bacterial artificial chromosome–P1-derived artificial chromosome; CHORI, Children’s Hospital Oakland Research Institute; EST, expressed sequence tag; FTP, file transfer protocol; NIAID, National Institute of Allergy and Infectious Diseases; ONSA, Organization for Nucleotide Sequencing and Analysis; ORESTES, open-reading frame expressed-sequence tags; TIGR, The Institute for Genomic Research; WTSI, The Wellcome Trust Sanger Institute.
providing information on spacing and orientation. Whereas end sequences from small (2–3 kb) and intermediate (15–20 kb) insert-size libraries provide the bulk of the sequence coverage, the end sequences from large insert-size libraries (e.g. 50–100 kb) are essential for ordering of the contig groups and independent verification of overall genome structure. WGS sequencing was initiated at TIGR in October 2002 and entered an accelerated phase at the beginning of 2003. In June 2003, the Wellcome Trust Sanger Institute (WTSI; http://www.sanger.ac.uk/) joined the effort, complementing the ~1.1 Mb of finished BAC clone sequencing that they had already completed.

To date, a combined total of 2.8 million reads has been generated, thus achieving 7.5-fold coverage of the genome. Based on the Lander–Waterman model for shotgun sequencing [14], an estimated ~0.5% of the 270 Mb genome is unsequenced (Figure 1), with ~2500 gaps (average length of each gap 100 bp) remaining. It is important to emphasize that these numbers do not indicate that sequencing should cease. In practice, the number of gaps is frequently greater than simulation predicts. The Lander–Waterman model assumes that the sequenced clones are uniformly sampled from the genome; however, this is rarely the case because some regions are

[Figure 1 plot. Axes: cumulative no. of shotgun reads (x1000), fold coverage and predicted genome sequenced (%), plotted by date from 10/2002 to 12/2003. Key: WTSI reads, TIGR reads, coverage, predicted genome unsequenced (%).]

Figure 1. Schistosoma mansoni whole-genome shotgun sequencing progress. The cumulative progress of the NIAID-funded project at TIGR and the Wellcome Trust-funded project at WTSI is reported by month. All TIGR sequence data are available for searching (http://tigrblast.tigr.org/er-blast/index.cgi?project=sma1) or downloading (http://www.tigr.org/tigr-scripts/license/new.pl?genre=euk), with reference to TIGR’s data release policy. All the reads (genome shotgun and ESTs) from the WTSI are available for searching and/or downloading in format (http://www.sanger.ac.uk/Projects/S_mansoni/), and the raw reads are available from: http://trace.ensembl.org. Abbreviations: EST, expressed sequence tag; NIAID, National Institute of Allergy and Infectious Diseases; TIGR, The Institute for Genomic Research; WTSI, The Wellcome Trust Sanger Institute.

unclonable and others exhibit low coverage, regardless of the number of sequences generated. At the twofold sequence coverage milestone, the randomness of the S. mansoni libraries was assessed by searching the raw sequences for single-copy genes that had been previously described in the literature. The number of hits obtained for each query was in agreement with the estimated genome coverage. A similar confirmation of library representation and genome coverage was also obtained by aligning the random shotgun sequences to the BACs that had been completely sequenced. Because smaller contigs impose serious limitations on gene identification, our aim was to maximize assembly by generating the largest contigs possible. Therefore, both sequencing centers continue to generate sequence (Figure 1). Data exchange and sequence assembly are being conducted collaboratively, using procedures and methodologies already established between WTSI and TIGR for other joint sequencing projects. It has not been proposed to finish the entire genome (i.e. close all gaps, resolve assembly anomalies caused by repeated sequences). However, once the initial assemblies have been assessed, additional funding will be sought by both centers to further exploit the resources and data already generated, with a view to finishing at least some regions of particular biological interest. Additional mapping, scaffolding and validation information will be gained from the ongoing fluorescent in situ hybridization (FISH) mapping projects. As the project progresses, data will be made available to the research community via regular updates to file transfer protocol (FTP) sites and websites (Table 1), in keeping with the data release policy of both institutes. A preliminary automatic analysis of the assembled sequence contigs (e.g. gene predictions, similarity searches, domain identification) will be undertaken on an agreed assembly with the resultant information displayed on both websites (Table 1). Funding for manual curation and database development is being sought.

Physical and genetic mapping

Construction of a chromosome hybridization map of the S. mansoni genome was initiated to identify a set of overlapping clones that represent each chromosome, thus enabling researchers to map the location of genes, microsatellite markers and other genetic elements to facilitate genetic studies. This was feasible because methods were available to identify individual chromosomes (based on size, shape and banding patterns) [5] and to map genes by FISH [15]. With the construction of yeast artificial chromosome (YAC) [16] and BAC libraries, it was possible to map large-insert genomic clones to the karyotype and identify contigs. To date, >100 YACs, 370 BACs, repetitive elements and various genes have been mapped to the schistosome karyotype [15] (H. Hirai and P.T. LoVerde, unpublished). In addition, using YAC clones mapped to chromosome three as a starting point, a minimal tile path for chromosome three was initiated and several BAC contigs assembled. However, the repetitive nature of the genome has made both hybridization and sequence-based BAC chromosome walking impractical.

Future prospects

This is an exciting time for the schistosomiasis research community. Current timelines suggest that shotgun sequence assembly will proceed within months, not years. A rough auto-annotated, draft genome sequence based on eightfold or ninefold coverage should be available by early 2004. At that stage, the quality and coverage of the draft will determine what further activities should be undertaken. Within a similar time frame, a fourfold to fivefold coverage of the S. japonicum genome will be available (Sheng-Yue Wang, pers. commun.).

The seeds of functional genomics have already been planted [17–24]. With this enormous amount of information soon to be available, it is now up to the schistosome community to perform the experiments that will lead to novel vaccine and drug targets, to improved diagnostics and to elucidation of the mechanisms of host–parasite co-evolution. As stated by Colley et al. [25], it is hoped that this resource will attract new investigators and funding to the field.

Acknowledgements

This work is supported by a grant from the NIH (U01 AI48828) and the Wellcome Trust-funded project at The Wellcome Trust Sanger Institute. N.E. also holds a faculty appointment in the Department of Microbiology and Tropical Medicine, The George Washington University, Washington, DC 20037, USA.

References

1 Engels, D. et al. (2002) The global epidemiological situation of schistosomiasis and new approaches to control and research. Acta Trop. 82, 139–146
2 Hillyer, G.V. (1974) Buoyant density and thermal denaturation profiles of schistosome DNA. J. Parasitol. 60, 725–727
3 Simpson, A.J.G. et al. (1982) The genome of Schistosoma mansoni: Isolation of DNA, its size, bases and repetitive sequences. Mol. Biochem. Parasitol. 6, 125–137
4 Marx, K.A. et al. (2000) Experimental DNA melting behavior of the three major Schistosoma species. Mol. Biochem. Parasitol. 107, 303–307
5 Short, R.B. (1983) Presidential address: Sex and the single schistosome. J. Parasitol. 69, 3–22
6 Franco, G.R. et al. (2000) The Schistosoma gene discovery program: state of the art. Int. J. Parasitol. 30, 453–463
7 Franco, G.R. et al. (1995) Identification of new Schistosoma mansoni genes by the EST strategy using a directional cDNA library. Gene 152, 141–147
8 Adams, M.D. et al. (1992) Sequence identification of 2,375 human brain genes. Nature 355, 632–634
9 Verjovski-Almeida, S. et al. (2003) Transcriptome analysis of the acoelomate human parasite Schistosoma mansoni. Nat. Genet. 35, 148–157
10 Hu, W. et al. (2003) Evolutionary and biomedical implications of a Schistosoma japonicum complementary DNA resource. Nat. Genet. 35, 139–147
11 Le Paslier, M.C. et al. (2000) Construction and characterization of a Schistosoma mansoni bacterial artificial chromosome library. Genomics 65, 87–94
12 Venter, J.C. et al. (1996) A new strategy for genome sequencing. Nature 381, 364–366
13 Copeland, C.S. et al. (2003) Boudicca, a retrovirus-like long terminal repeat retrotransposon from the genome of the human blood fluke Schistosoma mansoni. J. Virol. 77, 6153–6166
14 Lander, E.S. and Waterman, M.S. (1988) Genomic mapping by fingerprinting random clones: a mathematical analysis. Genomics 3, 231–239
15 Hirai, H. and LoVerde, P.T. (1995) FISH techniques for constructing

physical maps on schistosome chromosomes. Parasitol. Today 11, 310–314
16 Tanaka, M. et al. (1995) Yeast artificial chromosome (YAC)-based genome mapping of Schistosoma mansoni. Mol. Biochem. Parasitol. 69, 41–51
17 Davis, R.E. et al. (1999) Transient expression of DNA and RNA in parasitic helminths by using particle bombardment. Proc. Natl. Acad. Sci. U. S. A. 96, 8687–8692
18 Ashton, P.D. et al. (2001) Linking proteome and genome: how to identify parasite proteins. Trends Parasitol. 17, 198–202
19 Hoffmann, K.F. et al. (2002) Identification of Schistosoma mansoni gender-associated gene transcripts by cDNA microarray profiling. Genome Biol. 3, 41
20 Wippersteg, V. et al. (2002) Characterisation of the protease ER60 in transgenic Schistosoma mansoni larvae. Int. J. Parasitol. 32, 1219–1224
21 Wippersteg, V. et al. (2002) HSP70-controlled GFP expression in transiently transformed schistosomes. Mol. Biochem. Parasitol. 120, 141–150
22 Rossi, A. et al. (2003) Cloning of 5′ and 3′ flanking regions of the Schistosoma mansoni calcineurin A gene and their characterization in transiently transformed parasites. Mol. Biochem. Parasitol. 130, 133–138
23 Boyle, J.P. et al. (2003) Using RNA interference to manipulate endogenous gene expression in Schistosoma mansoni sporocysts. Mol. Biochem. Parasitol. 128, 205–215
24 Skelly, P.J. et al. (2003) Suppression of cathepsin B expression in Schistosoma mansoni by RNA interference. Int. J. Parasitol. 33, 363–369
25 Colley, D.G. et al. (2001) Infectious disease. Medical helminthology in the 21st century. Science 293, 1437–1438

1471-4922/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
doi:10.1016/j.pt.2004.02.002
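
The Lander–Waterman estimate discussed under 'Genome sequencing strategy and progress' can be illustrated with a short calculation. The sketch below is a minimal, idealized version of the model [14], not the procedure the sequencing centers used: the function names are hypothetical, the mean read length is back-calculated from the reported 2.8 million reads and 7.5-fold coverage of a 270 Mb genome, and the minimum-overlap fraction is an assumed parameter that the article does not state. Because those parameters are assumptions, the output need not reproduce the ~0.5% and ~2500 gaps quoted in the text.

```python
import math


def coverage(num_reads, read_len, genome_size):
    """Fold coverage c = N * L / G."""
    return num_reads * read_len / genome_size


def fraction_unsequenced(c):
    """Expected fraction of the genome covered by no read: exp(-c)."""
    return math.exp(-c)


def expected_contigs(num_reads, c, min_overlap_frac=0.0):
    """Expected number of contigs, N * exp(-c * sigma), where
    sigma = 1 - (fraction of a read that must overlap to be detected).
    The number of gaps is approximately the number of contigs.
    min_overlap_frac is an assumption; the article gives no value."""
    sigma = 1.0 - min_overlap_frac
    return num_reads * math.exp(-c * sigma)


if __name__ == "__main__":
    genome_size = 270_000_000   # ~270 Mb, from the text
    num_reads = 2_800_000       # combined reads, from the text
    c = 7.5                     # reported fold coverage

    # Mean read length implied by the reported figures (not stated in the text).
    read_len = c * genome_size / num_reads
    print(f"implied mean read length: {read_len:.0f} bp")
    print(f"fraction unsequenced:     {fraction_unsequenced(c):.5f}")
    for theta in (0.0, 0.05, 0.10):
        gaps = expected_contigs(num_reads, c, theta)
        print(f"min overlap {theta:.2f}: about {gaps:.0f} contigs/gaps")
```

The model assumes reads are sampled uniformly from the genome. As the text notes, unclonable and low-coverage regions violate that assumption, so real projects usually end up with more gaps than this calculation predicts.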

Research Focus

A new boost for malaria vaccines

Dodie S. Pouniotis*, Owen Proudfoot*, Gabriela Minigo, Jennifer C. Hanley and Magdalena Plebanski

Vaccine and Infectious Diseases Unit, The Austin Research Institute, Studley Road, Heidelberg, Victoria 3084, Australia

* These two authors contributed equally to the article.
Corresponding author: Magdalena Plebanski ([email protected]).

Plasmodium falciparum kills 1–3 million people each year, most of them children in sub-Saharan Africa. Immunization with a specific sequence of different vaccine carriers (prime–boost) is particularly effective in conferring sterile protective immunity against the pre-erythrocytic stage in animal models of malaria. Here, we review the immunogenicity and protective efficacy data from the first human trial of a prime–boost vaccine against pre-erythrocytic malaria. This vaccine is based on priming with DNA and boosting with modified vaccinia virus Ankara strain. The limitations and future challenges facing this and other vaccines against human malaria are discussed.

Human clinical trials of recombinant vaccines, which aim to induce cellular immunity, are under way for a range of viral infectious diseases. These efforts are followed closely by attempts to develop a vaccine against the pre-erythrocytic stage of the malaria parasite, with organizations such as the Malaria Vaccine Initiative (MVI; http://www.mvi.org) providing in recent years a much-needed financial boost to support clinical trials. One example of a human clinical trial for vaccines is the prime–boost approach [1], where humans are primed with one delivery system (usually DNA) and boosted by another (usually a replication-deficient virus), with the aim to induce high numbers of interferon gamma (IFN-γ)-producing CD8+ T cells. This protocol had previously been demonstrated to induce protective IFN-γ and CD8+ T-cell-mediated pre-erythrocytic immunity in animals, including sterile immunity in mice (resulting in a total absence of blood-stage parasites) [2,3]. Sterile immunity was not achieved in the human trial, but delayed development of blood-stage parasitemia was observed. The potential utility of prime–boost vaccination in malaria-endemic populations is discussed.

Protective immune responses

Pre-erythrocytic vaccines have aimed to induce antibodies to target sporozoites before hepatocyte invasion, or T-cell immunity to target intrahepatic forms of the parasite (reviewed in Refs [4,5]). In murine malaria, vaccines aimed at inducing CD8+ T-cell responses [irradiated sporozoites, DNA, live viral or bacterial vectors, and virus-like-particles (VLP) alone or in combination in prime–boost protocols] have shown that protection is crucially dependent on IFN-γ. This cytokine is capable of turning on nitric oxide (NO) production by hepatocytes and immune cells. CD4+ T cells support the induction and maintenance of such CD8+ T cells, as well as high-affinity antibodies, and can further promote parasite clearance via IFN-γ secretion.

In humans, irradiated sporozoites induce both humoral and cellular immunity, and complete protection against live sporozoite challenge [6–8]. Both antibodies and CD8+ T cells are associated with protection. Human clinical trials using RTS,S, a recombinant VLP vaccine expressing the circumsporozoite (CS) pre-erythrocytic antigen delivered in an oil-in-water adjuvant, indicate that it is possible to induce sterile pre-erythrocytic protection in ~50% of volunteers, even in the absence of CD8+ T cells or antibodies [9]. High levels of CS-specific IFN-γ-producing CD4+ T cells have been detected in some of these protected individuals. In a field setting, RTS,S induced a similar