<<

Testing the effect of in planta RNA silencing on Plasmodiophorabrassicae infection

A thesis submitted in partial fulfilment of the requirements for the Degree of Doctor of Philosophy at Lincoln University

S. R. Bulman 2006 ii

Contents

Abstract...... , ...... v Acknowledgements ...... vii List of Tables ...... viii List of Figures ...... ix CHAPTER 1. Literature Review ...... 1 brassicae ...... 1 RNA silencing ...... 5 Plant defence by RNA silencing ...... 8 RNA silencing versus P. brassicae ...... 10 Molecular biology of P. brassicae ...... 10 Choice of cDNA cloning strategy in P. brassicae ...... 12 Overall strategy ...... 14 CHAPTER 2. Identification of genes from the obligate intracellular plant pathogen, Plasmodiophora brassicae ...... 15 Abstract ...... 15 Introduction ...... 15 Materials and methods ...... 17 Results ...... 19 CHAPTER 3. Genomic DNA sequences from Plasmodiophora brassicae ...... 28 Abstract ...... 28 Introduction ...... 28 Materials and methods ...... 29 Results ...... 32 Discussion ...... 37 CHAPTER 4. Testing plant-derived RNA silencing against the intracellular pathogen Plasmodiophora brassicae ...... 42 Abstract ...... 42 Introduction ...... 42 Material and methods ...... 44 Results ...... ; ...... 47 Discussion ...... ;...... 51 CHAPTER 5. Discussion ...... 54 Summary of research findings ...... 54 SSH cDNA cloning techniques ...... 55 Full-length cDNA cloning techniques ...... 56 iii

The future of P. brassicae cDNA cloning ...... 57 Genomic sequences from P. brassicae by PCR-walking ...... 57 Multiple P. brassicae actin genes ...... 58 New details of the P. brassicae genome ...... 59 Evidence for RNA silencing machinery in plasmodiophorids? ...... 59 Assessment of P. brassicae infection and choice of first hpRNA target ...... 60 Design of actin hpRNA construct...... 61 Assessment of hpRNA expression ...... 62 In planta siRNAs showed no effect on P. brassicae infection ...... 63 Why might RNA silencing have failed to be induced? ...... 63 Future research ...... 64 Literature ...... - ...... 66 Appendices ...... 85 Appendix 1. Identification of genes from the obligate intracellular plant pathogen, Plasmodiophora brassicae ...... 85 Quantitative RT-PCR analysis of P. brassicae gene expression ...... 89 Appendix 2. Genomic DNA sequences from Plasmodiophora brassicae ...... 92 Degenerate PCR for Dicer proteins ...... 96 Appendix 3. Testing plant-derived RNA Silencing against the intracellular pathogen Plasmodiophora brassicae ...... 99 Reporter-gene expression in P. brassicae galls ...... 104 iv

Abstract of a thesis submitted in partial fulfilment of the requirements for the Degree of Doctor of Philosophy

Testing the effect of in planta RNA silencing on Plasmodiophora brassicae infection

S. R. Bulman 2006

In the late 1990s, a series of landmark publications described RNA interference (RNAi) and related RNA silencing phenomena in nematodes, plants and fungi. By manipulating RNA silencing, biologists have been able to create tools for specifically inactivating genes. In organisms from trypanosomes to insects, RNA silencing is now indispensible for studying gene function. RNA silencing has been used in a project aimed at systematically knocking out all genes in the model plant Arabidopsis thaliana.

RNA silencing has a natural role in defending eukaryotic cells against virus replication. By assembling virus DNA sequences in a form that triggers RNA silenCing, biologists have created plants resistant to specific viruses. In this study, we set out to test if a similar approach would protect plants against infection by the agriculturally important Brassica pathogen, Plasmodiophora brassicae. P. brassicae is an obligate intracellular biotroph, from the little studied eukaryotic supergroup, the .

To identify the gene sequences that would be starting materia-I for P. brassicae RNA silencing, new P. brassicae genes were gathered by cDNA cloning or genomic PCR­ walking. Using suppression subtractive hybridisation (SSH) and oligo-capping cloning of full-length cDNAs, 76 new gene sequences were identified. A large proportion of the cDNAs were predicted to contain Signal peptides for ER translocation. v

In addition to the new cDNA identified here, partial sequences for the P. brassicae actin and TPS genes were published by other researchers close to the beginning of this study. Using peR-walking, full-length genomic DNA sequences from both genes were obtained. Later, genomic DNA sequences spanning or flanking a total of 24 P. brassicae genes were obtained. The P. brassicae genes were rich in typical eukaryotic spliceosomal introns. Transcription of P. brassicae genes also appears likely to begin from initiator elements rather than TATA-box-containing promoters.

A segment of the P. brassicae actin gene was assembled in hairpin format and transformed into Arabidopsis thaliana. Observation of simultaneous knockdown of the GUS marker gene as well as detection of siRNAs indicated that the hpRNA sequences induced RNA silencing. However, inoculation of these plants with P. brassicae resulted in heavy club root infection. We were unable to detect decreases in actin gene expression in the infecting P. brassicae, at either early or late stages of infection. We conclude that, within the limits of the techniques used here, there is no evidence for induction of RNA silencing in P. brassicae by in planta produced siRNAs.

Keywords: Plasmodiophora brassicae, club root, Rhizaria, RNA interference, in planta RNA silencing, Arabidopsis thaliana, suppression subtractive hybridisation, cDNA cloning, genome, intron. vi

Acknowledgements

I offer thanks to my supervisors Tony Conner, Colin Eady and Hayley Ridgway for their guidance. They directed me away from tempting side projects and gave timely advice to keep the project on track. John Marshall was very kind in allowing me to continue work at his Crop & Food Research laboratory. I worked with Judith Candy on this project and she was always happy to exchange ideas on Plasmodiophora biology. I am grateful to the BioProtection CoRE facility and staff for the opportunity to complete my PhD on this interesting and satisfying subject. Crop & Food Research provided additional financial support as well as generous access to facilities. Simon Hayes, Robert Lamberts, Carolyn Barry and Astrid Erasmuson made the high quality figures for my PhD; it's production would have been so much more difficult without their cheerful help. I am indebted to Daniel Park who spent too much of his time making my writing coherent. Jenny Drummond and Ruth Butler provided advice on thesis production and statistics respectively. Sarah put up with it all. Cheers. vii

List of Tables

Chapter 2 Table 1. Summary of cDNA clones from P. brassicae 24-27 Chapter 3 Table 1. Details of P. brassicae genes 33 Chapter 4 Table 1. Quantitative RT-PCR in pH8AG lines 50 Appendix 1 Table 1. Arabidopsis tha/iana cDNAs 85-88 Appendix 1 Table 2. PCR primers used for qRT-PCR 90 Appendix 1 Table 3. Quantitative RT-PCR for selected P. brassicae genes and the Arabidopsis actin 8 gene 91 Appendix 2 Table 1. PCR primer and adaptor sequences used for PCR walking 92 Appendix 3 Table 1. PCR primer sequences 99 Appendix 3 Table 2. Segregation ratios of pH8AG T2 seedlings on kanamycin containing media 100 viii

List of Figures

Chapter 1 Figure 1. Club root disease 2 Chapter 1 Figure 2. The plasmodiophorid life cycle 3 Chapter 1 Figure 3. Infection of host cell (encystment) by a P. brassicae zoospore 4 Chapter 1 Figure 4. The major groups of 11 Chapter 3 Figure 1. Diagram of the genomic sequences obtained in this study 34 Chapter 3 Figure 2. Conserved sequences at intron splice sites 36 Chapter 3 Figure 3. Conserved TSS sequences 37 Chapter 4 Figure 1. Vector assembly 45 Chapter 4 Figure 2. GUS staining of T2 seedlings 48 Chapter 4 Figure 3. Infected pH8AG plants from line 4 and 7 at 30 dpi 49 Chapter 4 Figure 4. Detection of P. brassicae actin transcription at 30 dpi 49 Chapter 4 Figure 5. siRNAs blot hybridised with 153 bp actin hpRNA fragment 50 Appendix 2 Figure 1. Maximum likelihood phylogenetic tree constructed from ribosomal l7a proteins 93 Appendix 2 Figure 2. Intron sequences from P. brassicae genes 94-95 Appendix 2 Figure 3. Alignment of Dicer genes 97 Appendix 2 Figure 4. DCl sequence from Brassica rapa club root gall 98 Appendix 2 Figure 5. BLAST alignments of Dicer sequence from Brassica rapa with DCll sequence from Arabidopsis 98 Appendix 3 Figure 1. Sequence of pH8AG insert 100 Appendix 3 Figure 2. Primary P. brassicae infection in 8 dpi root hairs 101 Appendix 3 Figure 3. Southern hybridisation 101 Appendix 3 Figure 4. PCR detection of pH8AG insert 101 Appendix 3 Figure 5. Segregation of pH8AG insert 102 Appendix 3 Figure 6. Pot grown Arabidopsis infected with P. brassicae 102 Appendix 3 Figure 7. siRNA acrylamide gel 103 Appendix 3 Figure 8. Arabidopsis plants containing Arabidopsis promoter-GUS reporter genes 104 1

CHAPTER 1. literature Review

c Plasmodiophora brassicae

P. brassicae is a protistan pathogen that causes club root disease in plants of the Brassicaceae family. Club root is a soil or water borne disease that causes large distinctive galls on the roots and hypercotyls of plants (Fig. 1). Worldwide, economically important crops such as cabbages and rapeseed are subject to severe losses from club root (Ludwig-Mulier, 1999). Initial P. brassicaeinfection is made by -zoospores that enter root hairs and root epidermal cells. After a period of about 10 days, the primary infection phase is followed by a second phase that occurs in the root cortex and stele. This phase features plant cell division and enlargement, tissue disorganisation and formation of the characteristic galls. Secondary development is accompanied by substantial changes in phytohormone levels (Devos et aI., 2005) with heavy infection leading to significant changes in plant structure. Club root infection is first visible to the naked eye as a swelling in the plant hypocotyl after two weeks. Over a further ten days, severe swelling of the hypocotyl, main root and lateral roots develops. Infection is often evident in the field as wilting of the plant during hot weather (Crute, 1986). At plant death, rotting galls release mature spores into the soil where they may remain dormant for extended periods.

P. brassicae belongs in the Plasmodiophorida, an enigmatic order of protists (Braselton, 2001). The Plasmodiophorida contains several other important plant­ pathogens including Spongospora subterranea that causes potato powdery scab disease and Spongospora nasturtiithat causes water-cress crooked root disease. Spongospora and Po/ymyxa species also serve as vectors for many plant viruses including barley yellow mosaic virus, potato mop top virus and beet soilborne mosaic virus (Rochon et aI., 2004). These diseases are serious problems for crop growers worldwide. Aside from parasites of plants, plasmodiophorids from the Woronina genus infect stramenopiles such as pythium and Sapro/egnia species (Braselton, 2001). Protists from the closely related Phagomyxida order also infect stramenopile hosts in marine environments (Schnepf, 1994). 2

Figure 1. Club root disease. A heavily infected Brassica rapa plant showing advanced secondary stage club root infection in hypercotyl and roots.

Defining features of the Plasmodiophorida are obligate intracellular parasitism, flagellated zoospores, cruciform nuclear divisions and multi-nucleate plasmodia. In the second half of the 20th century, extensive ultrastructural and karyological studies illuminated most aspects of the plasmodiophorid's complicated lifecycle (summarised in Dylewski 1990; Braselton 1995). The following lifecycle description (numbers in the lifecycle description refer to those in Fig. 2) is based predominantly on the study of P. brassicae by Aist and Williams (1971). The term zoosporangial is used for primary stages of development and sporogenic for secondary development.

The rv3-5J.JM in diameter resting spores (1) of P. brassicae are resistant to environmental extremes and may remain viable in the soil for years. In response to root exudates, spores hatch to give primary zoospores (2) with two flagellae of unequal length. When swimming zoospores come into contact with the host, they attach to the root, and encystment begins (Fig. 3). Over a period of two to three hours the morphology of the zoospore changes. Most strikingly, a long tubular cavity termed the Rohr is formed, oriented towards the root (Keskin and Fuchs, 1969). Within the Rohr is a dense-staining, sharp-pointed rod called the Stachel. About 15 minutes before cell penetration a vacuole enlarges to fill half of the zoospore, and an attachment structure, the adhesorium, forms between the zoospore and the root. 3

primary ? infection

Figure 2. The plasmodiophorid life cycle. See text for discussion of life cycle stages.

In a rapid process the vacuole enlarges to fill the entire zoospore, forcing the Stachel into the host cell. The contents of the zoospore are injected, -like, after the Stachel (3), and the plasmodiophorid begins to synchronously divide. This phase of development, termed primary or zoosporangial development (4), results in multinucleate protoplasts or plasmodia. These cell divisions are characterised by a persistent nuclear envelope and nucleolus. During metaphase the nucleolus elongates in the direction of the poles and at right angles to the equatorial chromatin plate, forming a cross-like configuration. This is known as cruciform nuclear division. After the cruciform divisions, a series of non-cruciform divisions segment the primary plasmodia (5) into secondary zoospores (6). Secondary zoospores are released into the environment where they re-infect new host cells (7). In this secondary development stage (7), cruciform divisions again precede non-cruciform divisions in the formation of plasmodia (8). However, the second-stage divisions are considered to be meiotic and the plasmodium is cleaved into new resting spores (9). Mature resting spores are released into the soil upon death of the host. 4

A B

HW HW He He c D

HW HW He He

Figure 3. Infection of host cell (encystment) by a P. brassicaezoospore. After cell attachment the zoospore forms a Rohr and Stachel apparatus with a large cellular vacuole (A) The vacuole begins to enlarge and the adhesorium forms. (8) The vacuole enlarges further and the Stachel is forced into the host cell. (C) The zoospore cytoplasmic contents are extruded into the host cell and a papilla forms. V, vacuole; Ad, adhesorium; FAM, fibrillar adhesive material; L, lipid body; HC, host cytoplasm; HW, host wall; P, Rohr plug; Pa, papilla; PC, parasite cytoplasm; RW, Rohr wall; Sch, Schlauch; St, Stachel; and ZW, zoospore wall. From Aist and Williams (1971).

There is some uncertainty about the transition from primary to secondary P. brassicae development. Environmental conditions probably influence the transition, because some primary infections appear to lead directly to spore formation (2 -> 7) with zoosporangial and sporogenic development occurring at the same time in the host (Braselton, 1995; Claxton et aI., 1996; Mithen and Magrath, 1992). Further, although secondary infection is usually ascribed to reinfection by secondary zoospores, since 1935 there have been persistent suggestions of cell-to-cell movement by myxamoeba (Buczacki, 1983). This mechanism, which could account for the movement in gall-forming plasmodiophorids between the site of primary infection in root-hairs/epidermal cells and the site of secondary infection in cortical tissues, has found increasing recent popularity (Mithen and Magrath, 1992; Claxton et aI., 1996; Clay and Walsh, 1997). 5

As will be discussed later, the biology - especially the molecular biology - of P/asmodiophora brassicae is poorly understood because of the difficulty of working with plasmodiophorids in the laboratory.

RNA silencing

The first observations of RNA silencing were made in transgenic petunia carrying extra pigment-producing genes. Instead of deep purple flowers, the petunia produced white flowers (Napoli et aI., 1990). Studies of this and related silencing phenomena were greatly accelerated by studies of RNA interference (RNAi) in the nematode Caenorhabditis e/egans (Fire et aI., 1998), post transcriptional gene silencing in plants (Hamilton and Baulcombe, 1999) and quelling in fungi (Cogoni and Macino, 1997; Cogoni and Macino, 1999). These processes share the uniting features 1) double stranded (ds)RNA in the cell is the trigger for RNA silencing; 2) dsRNAs are processed by RNase III-like enzymes, Dicers (Bernstein et aI., 2001), to give 20-26 nucleotide small interfering (si)RNAs (Hamilton and Baulcombe, 1999); 3) siRNAs are incorporated into a multi-subunit RNA-induced silenCing complex (RISC) containing an Argonaute protein (Liu et aI., 2004). The siRNA then serves as a guide to target fully or near fully complementary RNA or DNA for inactivation.

The term RNA silencing encompasses a number of related pathways leading to gene inactivation (reviewed in Meister and Tuschl, 2004). Plants are especially rich in these pathways, having multiple Dicer-like (DCl) proteins. The four Arabidopsis DCl proteins each have central but partly redundant roles in individual silencing pathways (reviewed in Brodersen and Voinnet, 2006; Henderson et aI., 2006; Deleris et aI., 2006). One of the most important facets of RNA silencing in both animals and plants is the micro (mi)RNA pathway. miRNAs are non-coding RNA species that form RNA stem-loop-like precursor structures that are processed into 20-24 bp small RNAs (Carrington and Ambros, 2003; Kim and Nam, 2006). Across a range of organisms, miRNAs may mediate postranscriptional mRNA cleavage or translational repression, but of the 1V100 plant miRNAs identified to date (Xie et aI., 2005), most have near­ perfect homologyeto their target mRNAs, meaning that targeted transcripts are cleaved (reviewed in Mallory and Vaucheret, 2006). In both animals and plants, inactivation of miRNA production leads to large phenotypic growth defects, indicating 6 the central role of miRNAs in coordinating development. In plants, these broad changes may be controlled by miRNA targeting of regulatory cascades mediated by auxin and transcription factors (e.g. Guo et aI., 2005; Jones-Rhoades and Bartel, 2004). In Arabidopsis, DCl1 is responsible for production miRNAs.

As well as targeting mRNAs, some siRNAs interact directly with DNA to cause methylation and heterochromatin formation. siRNAs corresponding to transposons and silenced genes in heterochromatin regions have been detected in plants, and loss of this class of siRNAs leads to increased transposon mobilisation (Vastenhouw et aI., 2003; Zilberman et aI., 2003). Methylation specific siRNAs are predominantly 24 bp in length and are processed by Arabidopsis DCl3 (Xie et aI., 2004). Further RNA silenCing pathways have recently been discovered in plants. A 22 bp siRNA was found to be processed by DCl2 from a naturally occuring Arabidopsis anti-sense transcript (i.e. transcripts from two overlapping genes in opposite orientations) (Borsani et aI., 2005). This siRNA appears to playa role in salt tolerance. Since other overlapping mRNAs are known, siRNAs from such transcripts may be common. A new group of 21 bp transacting (tasi)RNAs, processed by DCl4 from genomic loci, have recently been identified in Arabidopsis. These may have a role in modulating the adult to juvenile transition or auxin responses (Vasquez et aI., 2004; Peragine et aI., 2004).

The two aspects of RNA silenCing of most relevance to this study are its role in plant­ microbe interactions, and in the use of RNA silencing as a gene knockout tool. In addition to protecting the genome against transposon replication, RNA silenCing has a critical role in defence against viruses (Waterhouse et aI., 2001; Wang and Metzlaff, 2005). RNA silencing mutants are more susceptible to viral replication (Mourrain et aI., 2000) and several viruses have specific defences against host RNA silenCing (Voinnet et al., 2000). Arabidopsis DCl4 produces 21 bp siRNAs that guide viral defence, while DCl2 has a secondary antiviral role (Deleris et aI., 2006). The role of RNA silencing in interactions with other micro-organisms is a matter of much current interest (e.g. Fritz et aI., 2006). An Arabidopsis miRNA that is induced by Pseudomonas syringae infection has been idenitifed (Navarro et aI., 2006). This miRNA plays a role in downregulating host genes involved in plant susceptibility. The complex involvement of RNA silencing in the interaction between plants and the crown gall-causing Agrobacterium tumefaciens has also been confirmed (Dunoyer et 7 ai., 2006). Unlike most pathogens, Agrobacterium exports a discrete DNA/protein complex (T-DNA) into plant cells. In an apparent defence against this infection, plants produce 21 bp siRNAs corresponding to some of these Agrobacterium genes (Dunoyer et ai., 2006). These include the iaaM oncogene that is essential for successful cell transformation. On the other side of the coin, successful Agrobacterium infection requires manipulation of the endogenous plant miRNA pathways. Hosts with defective miRNA pathways were essentially immune to infection (Dunoyer et ai., 2006).

The early studies of RNAi in C. elegans showed that dsRNA could be a tool for gene knockdown (Fire et ai., 1998). RNA silencing was induced simply by bathing nematodes in dsRNAs (Tabara et ai., 1998), and is now routinely applied to genome wide studies of C. elegansbiology (e.g. Kamath eta!., 2003; Kim etai., 2005). RNA silencing is also widely used as a research tool for genetics in Drosophila 52 cells (Boutros et aI., 2004) and Trypanosoma brucei(Motyka and Englund, 2004). In C. elegans and Drosophila, RNA silencing could be induced by microinjection of RNAs. Lacking such a technique for plants, Chuang and Meyerowitz (2000) showed that when transgenes in inverted repeat (IR) format were introduced into cells, transcription led to efficient gene silencing. Addition of an intron to the IRs produced even more effective induction, and these hairpin format (hp)RNAs have become the basis for RNA silencing in plants (Smith et ai., 2000). Via virus-based vectors and , Agrobacterium infiltration, hpRNAs are widely used for transient RNA silencing in tobacco and tomato (Bendahmane et ai., 2000; Liu et ai., 2002; Lu et ai., 2003). In Arabidopsis, stable transformation with transgenes carrying hpRNAs is the current method of choice for RNA silencing (Wesley et ai., 2001). Vectors for the assembly and constitutive expression of hpRNAs are available (e.g. Helliwell et ai., 2002). Induction of gene knockdown by these methods is sufficiently simple that large scale, genome-wide knockdowns of Arabidopsisgenes have been generated (e.g. AGRIKOLA http://www.agrikola.orgl).

Beyond model organisms, the presence and distribution of RNA silencing mechanisms is unknown, especially in the less studied groups of single-celled eukaryotes.. Conserved components of the RNA silencing machinery like Dicer proteins, have been found in all major groups of eukaryotes for which whole genome sequences are available (Cerutti and Casas-Mollano, 2006). However, RNA silencing appears to be 8 absent from many organisms, and has frequently been lost from closely related . lineages. For example, while RNA silencing is routinely used as a tool in

Trypanosoma bruce~ the RNA silencing machinery is absent from Trypanosoma cruzi (Motyka and Englund, 2004).

An important consideration in this study is the ability of RNA silencing to spread from cell to cell, or across cell membranes. In C elegansthe transmembrane protein SID1 is responsible for movement of dsRNA into cells (Winston et aI., 2002; Feinberg a,nd Hunter, 2003). However, sid-J appears to have a limited phylogenetic distibution; currently, BLAST (Altschul et aI., 1997) queries of GenBank with the C elegans sid-J sequence shows that sid-J homologues are present mostly in vertebrates .. sid-J genes have been detected in some insects, together with direct evidence for systemic silencing spread in the beetle Tribolium castaneum and the grasshopper Schistocerca americana (Bucher et aI., 2002; Dong and Friederich, 2005).

Non-SID1 mediated systemic silenCing spread has been found in other organisms. RNA silenCing spreads over considerable distances in plants by transfer of siRNAs through plasmodesmata and phloem (Hamilton et aI., 2002; Palauqui et aI., 1997; Voinnet et aI., 1998). Shorter range spread, involving an RNA dependent RNA polymerase, occurs through short iterative RNA silenCing cycles into a nearly constant number of neighbouring cells (Himber et aI., 2003).

Two recent studies are perhaps of the most relevance to inducing RNA silencing in P. brassicae by external siRNA application. These have shown gene inactivation stimulated by external application of siRNA duplexes in Entamoeba histolytica (Vayssie et aI., 2004) and Acanthamoeba (Lorenzo-Morales et aI., 2005). No molecular details of the silenCing mechanisms in amoeba are yet available.

Plant defence by RNA silencing

Following from the natural role of RNA silencing in defence against viral replication, an early application of RNA silencing was in protecting plants against viruses. By utilising IRs corresponding to virus sequences, a number of research groups have 9 created transgenic plants resistant to infection by specific viruses (e.g. Wang et aI., 2000; Di Nicola-Negri et aI., 2005). An obvious question arising from this method of plant-protection was, might RNA silencing help defend plants against other pathogens? This idea has found some empirical support in tomato plants expressing hpRNAs directed at Agrobacterium tumefacienstumour-forming genes. These plants were resistant to gall formation although the bacteria were still able to replicate in the plants (Escobar et aI., 2001; Escobar et aI., 2003). However, since Agrobacterium genes required for successful infection are directly exposed to the plant RNA silencing machinery, this infection is conceptually more similar to that of viruses than of pathogens that require passage of RNA silencing across the cell membrane.

No examples of an in p/anta RNA silencing signal spreading across a eukaryotic pathogen membrane, have yet been published. Evidence for this event is being most actively pursued in studies of phytopathogenic nematodes. In C e/egans, RNA silencing can be induced not only by soaking in dsRNA, but also by ingestion of bacteria engineered to transcribe dsRNAs (Timmons et aI., 2001). It can also be induced by feeding nematodes on material from plants that have transcribed appropriate dsRNA (Boutla et aI., 2002). Building on these C e/egansstudies, several research groups have now utilised RNA silencing techniques to knockdown gene expression in plant parasitic nematodes (Chen et aI., 2005; Rosso et aI., 2005; Urwin et aI., 2002). RNA silencing was not induced by external siRNA; instead, it was necessary to induce the juvenile nematodes to feed on siRNA material, by the use of a neurotransmitter (Urwin et aI., 2002). No evidence for nematode gene silencing by in p/anta expressed dsRNA has yet been published, but this is one of the expressed aims of the research (Bakhetia et aI., 2005).

Beyond nematodes, there have been no publications on in p/anta RNA silencing versus other eukaryotes. In unpublished research, transgenic plants expressing hairpins targeted at mycotoxin genes of Aspergillus fungi species are being created in the laboratory of Nancy Keller (http://www.plantpath.wisc.edu/fac/npk.htm) (8th International Mycological Congress, Cairns 2006). 10

RNA silencing versus P. brassicae

Although no clues as to the presence of RNA silencing in the plasmodiophorids were available, P. brassicae had a number of features that made it appear an attractive candidate for testing plant-derived RNA silencing. Firstly, because P. brassicae is an intracellular biotroph, the pathogen would be fully exposed to plant-transcribed RNAs during infection. The exchange of compounds between P. brassicae and its host is not understood, but the envelope surrounding actively dividing P. brassicae plasmodia has been partly characterised as a simple structure of two closely appressed tri-Iayered membranes (Williams and McNabola, 1970). As P. brassicae had been observed to engulf host organelles and cytoplasm (Buczacki, 1983), phagotrophic ingestion might be another possible route to siRNA uptake. Lastly, as P. brassicae infects Arabidopsis, the advantages of well established RNA silencing techniques, a complete host genome sequence, easy transformation and fast generation time were all available in the host plant. For these reasons, we set out to apply in p/anta RNA silencing versus P. brassicae.

Molecular biology of P. brassicae

To examine the effect of in p/anta hpRNAs, we aimed to test a number of gene targets from P. brassicae. However, studies of P. brassicae genomics were virtually non-existant. Even sequencing of plasmodiophorid genes for phylogenetics had only just begun, meaning the taxonomic position of the group remained in doubt. Historically, the plasmodiophorids were classified with fungi, but, with ribosomal RNA sequence-based phylogenies, the plasmodiophorids were found to be more closely allied to a diverse assemblage of protists in the (Cavalier-Smith and Chao, 1997; Bulman et aI., 2001; Cavalier-Smith and Chao, 2003). Further evidence for a close relationship between the plasmodiophorids and cercozoans came with confirmation that they shared a unique polyubiquitin amino acid insertion (Archibald and Keeling, 2004). The Cercozoa has since been subsumed into the newest eukaryotic supergroup, the Rhizaria (Bass et aI., 2005; Nikolaev et aI., 2004). Despite the Rhizaria including a highly diverse group of protiSts (Fig. 4) including , formanifera and radiolarians, very little is known about the molecular biology of any organism in the group (Ad I et aI., 2005; Keeling et aI., 11

2005). The only available rhizarian genomic information has been provided by ""3500 EST sequences from the Bige/owie//a natans (Rogers et aI., 2004; Archibald et aI., 2003) and ""1600 EST sequences from the formaniferan Reticu/omyxa fi/osa (Burki and Pawlowski, 2006). The very recent announcement of a genome project for B. natans will be the first of it's kind from a rhizarian (http://www.jqi.doe.qov/seguencinq/cspsegplans2007.html). At this time, there is little opportunity to draw on genomic information from closely related organisms to the plasmodiophorids.

Discicristates Excavates Heterolobosea Kinetopla,dds Dlplonemlds Euglenlds Core jakoblds Trlmastlx Oxymonads Trlchomonads Hypermastlgotes Csrpedlemons5 Retortamonads Diplomonads Ualawlmonacls

Cercomonadl Eugtyphlda Pbaeod.,.. Cercozoa HeleromlUda Thaumatomonads Chlor.rachnlophyteS Alveolates Phytomyxlda Haplosportdla Polycystlnes Acanlharla Rhlzarla DIatoms Raphldlophytes _--:;;:;;:::;...-'2' Ascomycetes Eustlgmatophytes Basidiomycetes Chrysophyte. Zygomycetes Phaeophytes Mlcrosporldla 801kJophytel Chytrid8 Bl6stocy.tls Nucfeailds Actlnophrylds Animals Stramenopiles LabyrlnthuUcl$ CIloenoftltgel.... Opisthokonts Thrauslrochytr1ds c.psuportJ Oomycele. Ic:hthyosporea OpaI'nids Bicosoecids Dlctyoatellds Haptophytes Myxogaatrlda Proto.... 1ds Cryptomonads Lobo... Archamoebae Chromalveolates Amoebozoa 'Untkontst

Figure 4. The major groups of eukaryotes. Diagram shows the diversity of known eukaryotes and the 5-6 suprgroups that the eukaryotes are currently thought to fall into. The plasmodiophorids can be found within the Phytomyxids, in the Rhizaria supergroup (blue). From Keeling et ai, (2005). 12

At the start of this study, the only DNA sequences available from plasmodiophorids were a short (1V1S0bp) DNA sequence from the P. brassicaetrehalose-6-phosphate synthase (TPS) gene (Brodmann et aI., 2002) and less than 10 other cDNA sequences deposited in GenBank. However, the latter sequences were far from characterised. Thus, gene sequences that might be suitable targets for RNA silencing were unavailable, meaning that the first step necessary in our study was to identify genes from P. brassicae.

Choice of cDNA cloning strategy in P. brassicae

To create multiple hpRNA, we needed to identify a range of new P. brassicae gene sequences. In most organisms, this could be achieved by the relatively straightforward process of cloning and sequencing random ESTs. However, this simple approach was not possible in P. brassicae because techniques for experimentation on this organism lag some way behind those for other plant pathogens. A brief description of the status of P. brassicae experimental manipulation will facilitate an understanding of our research approach. As with most biotrophic pathogens, no protocols exist for axenic culture of plasmodiophorids. Procedures for the gnotobiotiC culture of P. brassicae in plant callus tissue were first published several decades ago (Ingram, 1969; Tommerup and Ingram 1971; Williams et aI., 1969) and have recently been re-established (Asano and Kageyama, 2006; J. Candy unpublished data). However, these methods have yet to be widely adopted for use in laboratory experiments, primarily because they do not replicate the complete infection cycle of P. brassicae. Instead, host plants are typically infected by flooding soil with solutions of P. brassicae resting spores that have been prepared from rotted club root galls. Overall, the primary methods for manipulating P. brassicae infection remain soil-based and have progressed little from those summarised by Buczacki (1983).

Significant consideration was given to finding the best developmental stage and experimental set-up for P. brassicae gene identification. P. brassicae resting spore mixtures, which are the most concentrated source of P. brassicae material, were presumed to be relatively metabolically inert and thus a poor source of RNA. In preliminary research, we were unable to obtain significant amounts of RNA from 13 these mixtures. Methods for germinating spores are neither rapid, nor allow for concerted production of zoospores (e.g. Williams et aI., 1970). Since even differential centrifugation protocols yield spore preparations that retain some microbial contamination, this germination of zoospores is accompanied by a growth of contaminating microbes. Collectively, there were no means to obtain pure samples of P. brassicae.

After considering all of these factors, it was concluded that the most favourable lifecycle stage for identifying P. brassicae genes was during secondary plant infection. At this time, a significant amount of infected but otherwise healthy gall material can be collected, allowing the preparation of RNA samples consisting almost entirely of either plant or P. brassicae transcripts. Discerning P. brassicae mRNAs from a mixture of plant mRNAs would be easier than from the mixed microbial RNAs that might derive from spore preparations. Moreover, identifying P. brassicae RNAs transcribed in the plant would offer the opportunity to find genes involved in the plant-pathogen interaction. Another step in designing our experiments was to choose Arabidopsis as a host, since the fully sequenced and annotated Arabidopsis genome should render host cDNAs easily identifiable by BLASTN (Altschul et aI., 1997).

A problem that remained with the use of plant galls as a source of RNA, was the likelihood that P. brassicae transcripts would be heavily outnumbered by plant transcripts. Cross sections of club root galls suggest that up to 50% of cells can be infected (Kobelt et aI., 2000; J. Siemens pers. comm.), but an infected Arabidopsis root sample will also contain a significant amount of non-gall material. Moreover, in prior unpublished research, we found only one putative plasmodiophorid (from S. subterranea) gene among several cDNA-AFLP clones from Plasmodiophora and S. subterranea galls. Similar difficulties in obtaining P. brassicae cDNAs from club root galls were noted by another researcher (J. Siemens pers. comm.). To reduce the number of host cDNAs sequenced, we chose suppression subtractive hybridisation (SSH), a strategy for identifying differentially expressed transcripts between two RNA samples. The SSH technique has been used by other research groups to identify genes from eukaryotic plant pathogens transcribed during plant infection (Bittner­ Eddy et aI., 2003; Thara et aI., 2003). 14

To augment the SSH gene identification approach, we also aimed to obtain full­ length sequences from the P. brassicaeTPS gene and actin genes (the latter published shortly after the beginning of this work).

Overall strategy

In summary, to test the effect of in p/anta generated RNA silencing signals on P. brassicae infection, we aimed to; 1) Identify gene sequences from P. brassicae. 2) Select genes likely to be essential for P. brassicae growth. 3) Construct a hpRNA format DNA vector containing P. brassicae gene sequences, and transform these into Arabidopsis. 4) Infect the Arabidopsis plants and measure transcription of the genes in P. brassicae. 15

CHAPTER 2. Identification of genes from the obligate intracellular plant pathogen, Plasmodiophora brassicae

Abstract

Plasmodiophora brassicae is an intracellular pathogen that infects plants in the Brassicaceae family. Although an important pathogen group, information on the genomic makeup of the plasmodiophorids is almost completely lacking. We performed suppression subtractive hybridisation between RNA from P. brassicae­ infected and uninfected Arabidopsistissue, screening 232 clones from the resulting SSH library. In addition, we used an oligo-capping procedure to screen 305 full­ length cDNA clones from the infected tissue. A total of 76 new P. brassicae gene sequences were located, the majority of which were extended to full-length at the 5' end by the use of RACE amplification. Many of the unisequences were predicted to contain signal peptides for ER translocation. Although we located few sequences in total, these markedly increase available data from the plasmodiophorids, and provide new opportunities to examine plasmodiophorid biology. Our study also points towards the best methods for future plasmodiophorid gene discovery.

Introduction

The Plasmodiophorida is an enigmatic protistan order containing several economically significant plant pathogens. Plasmodiophorids cause potato powdery scab disease, characterised by unsightly lesions on tubers and vigour-reducing root galls (Spongospora subterranea), water-cress crooked root disease (Spongospora nasturtil), and club root disease of Brassicaceae plants (Plasmodiophora brassicae) (Braselton, 1995; 2001). Spongospora and Po/ymyxa species also serve as vectors for many plant viruses including barley yellow mosaic virus, potato mop top virus and 16 beet soilborne mosaic virus (Rochon et a!., 2004). These diseases are serious problems for crop growers worldwide.

Despite their agricultural importance, the plasmodiophorids remain poorly understood, especially at the molecular level. Plasmodiophorid transformation techniques do not exist, no species are cultured axenically, and growth with plant hosts in vitro remains difficult to establish (Ingram, 1969; Tommerup and Ingram, 1971). The protein-coding sequences currently known from plasmodiophorids are limited to the actin and ubiquitin genes (Archibald and Keeling, 2004) and partial sequences from three other genes (Graf et a!., 2004; Brodmann et a!., 2002). Phylogenetic analysis of the former sequences and small subunit ribosomal RNA sequences (Archibald and Keeling, 2004, Cavalier-Smith and Chao, 1997; Bulman et a!., 2001) showed that the Plasmodiophorida are not related to true fungi but to a grouping of diverse protozoans in the eukaryotic supergoup, the Rhizaria (e.g. Cavalier-Smith and Chao, 2003; Nikolaev et a!., 2004). Although random sequencing of expressed sequence tags (EST) has provided a method to rapidly identify genes in diverse organisms, only a single EST collection has been made from an organism in the Rhizaria, the amoeboflagellate alga 8ige/owie//a natans (Rogers et a!., 2004; Archibald et a!., 2003).

As our first step to understanding the molecular biology of plasmodiophorid species, we aimed to identify genes from P. brassicae. Since P. brassicae resting spores - the principal life stage occurring outside the plant - yield little RNA (SRB unpublished), the plant infection period is the most tractable plasmodiophorid growth stage. Following primary P. brassicae infection in plant root hairs and epidermal cells, secondary infection of the root cortex and stele stimulates formation of distinctive hypertrophic galls. To allow easy identification of host sequences, we searched for P. brassicae genes from Arabidopsis gall tissue and, expecting that there would be more plant genes than P. brassicae genes, we initially used the suppression subtractive hybridisation (SSH; Diatchenko et a!., 1996) technique to reduce the number of plant genes to be sequenced from the mixed plant-pathogen sample (e.g. Thara eta!., 2003; Bittner-Eddy eta!., 2003). 17

Materials and methods

Plant growth and RNA preparation The P. brassicae isolate used in this research was obtained from Christchurch, New Zealand (Falloon et aI., 1997). To produce inoculant, B. rapa ssp. pekinensis cv Wong Bok plants were infected with the isolate, and fresh club material was collected from the plants at maturity. Clubs were allowed to rot at 4°C for one week prior to maceration (Ultra Turrax; Janke & Kunkel/IKA-WERK) and filtration through cheesecloth. Spores were collected by centrifugation at 4000g, resuspended in H20, and stored at -20°C.

Fourteen day old A. tha/iana Columbia seedlings were inoculated with spore suspension (tv 1x106 spore/mL), and grown under glasshouse conditions for a further 30 days. Root masses were collected from tv50 plants showing visible signs of secondary infection. Roots were extensively washed in tap water and then inspected with a dissecting microscope to ensure the absence of necrosis and to trim away fine roots tangled with potting-mix. The final samples included clubbed hypocotyl and root material as well as some secondary roots that did not have obvious infection. To avoid problems with potting mix tangling in fine roots, uninfected Arabidopsis were grown on 112 MS agar media. Four g of material, consisting of the entire root mass and hypocotyl from each specimen, was harvested from these plants.

Total RNA was extracted with Trizol reagent (Invitrogen). To reduce RNA from surface microbes, infected club material was first soaked for tv20 min in Trizol, then immediately homogenised in 10 mL fresh reagent with an Ultra Turrax. DNA was removed from the RNA preparations by digestion with DNAse1 (Roche). Poly­ adenylated messenger RNA was selected from total RNA samples using the Poly-A Tract Kit (Promega).

SSH eDNA preparation Suppression subtraction hybridisation was performed with the PCR-Select cDNA Subtraction Kit (Clontech). To select for P. brassicae sequences, subtraction was performed in a single direction with cDNA from infected galls as tester and cDNA from the roots of uninfected agar-grown Arabidopsis as driver. Only minor changes 18 in gene expression profiles between soil-grown and agar-grown Arabidopsiswere found by Schmid et al. (2005). PCR fragments generated from SSH were cloned into the pGEM-T vector.

Full-length eDNA preparation Oligo-capped (OC), full-length cDNAs were prepared using the GeneRacer Kit (Invitrogen). Approximately two ug total RNA plus 50 ng poly-A RNA (the same RNA as for the SSH) from infected tissue was used as starting material. RNA oligonucleotides were ligated to the 5' end of RNAs (Maruyama and Sugano, 1994; Suzuki et aI., 1997) and cDNA template was generated by reverse-transcription from an oligo-dT primer. The cDNA template was PCR amplified for 19-24 cycles using Accuprime High Fidelity Taq (Invitrogen), with nested PCR primers (Appendix 1 Table 2) modified to include a Notl site in the 5' primer (GR5NNot) and a BamHI site in the 3' primer (GR3NBamHI). Extension times during the PCR were for 4 min. To select larger transcripts, PCR fragments were separated by agarose gel electrophoresis, and DNA fragments > 800 bp were purified using the QIAquick Gel Extraction Kit (Qiagen). PCR fragments were cloned into pGEM-T, or restriction digested and directionally ligated into pBluescript. Colonies were screened with PCR primers located in the vector sequences flanking the inserts, and positive products were sequenced from the 5' end.

PCRand RACE To avoid sequencing redundant clones as the project progressed, we identified some common P. brassicae clones by gene-specific PCR. The clones that were PCR­ detected were PbsHSPl (SSH and OC), PbsHSP2 (OC), and PbSUNKl (OC) (primer sequences in Appendix 1 Table 2). 3' RACE and 5'RACE reactions were performed with Accuprime High Fidelity Taq and a gene-specific primer paired with the appropriate 5' or 3' GR nested primer. The template used for RACE amplifications was a 1/50 dilution of the GR5NNot/GR3NBamHI PCR products used for cDNA cloning. RACE amplification products were gel purified and directly sequenced. PCR primers for ten randomly selected genes (Table 1) were sent to another laboratory in Germany. The primers were used in amplification reactions with DNA prepared from DNAse treated spores of a German P. brassicae isolate as template (Klewer et aI., 2001). 19

Bioinformatic examination DNA sequences were manually checked for quality and incorrect base calls, then trimmed of vector sequences, using Sequencher (Gene Codes). Host plant ESTs were identified by BLASTN (Altschul et aI., 1997); tv100% nucleotide similarity to Arabidopsissequences was used as a criteria. Overlapping cDNA sequences were identified by using CAP3 with default settings (Huang and Madan, 1999). Contigs were assembled from the original ABI sequence trace files with Sequencher. SSH unisequences were queried against GenBank using BLASTX, both with and without a low complexity filter. From each full-length P. brassicae sequence, the ORF following the ATG initiation codon was translated to amino-acids and then compared to the GenBank non-redundant database with BLASTP. Amino-acid sequences were examined for conserved protein motifs and putative transmembrane domains using the InterProScan tool (http://www.ebLac.uk/InterProScan/ Quevillon et aI., 2005). The best Evalues <10-4, and the organism from which these sequences came, were recorded (Table 1). Putative functions or gene features based on matches to conserved protein motifs were recorded. N-terminal signal peptides were identified from amino acid sequences with the SignalP 3.0 programme (http://www.cbs.dtu.dk/services/SignaIP/ Bendtsen et aI., 2004) and iPSORT (http://hc.ims.u-tokyo.ac.jp/iPSORT/ Bannai et aI., 2002).

Results

SSH sequences Of 232 randomly chosen clones screened from the SSH library, DNA sequence was obtained from 160 clones ranging in size from 140 bp to 550 bp. The SSH sequences were highly redundant, with 17 copies of the PbsHSP1 clone sequenced and a further 72 PbsHSP1 clones identified by PCR prior to sequencing (i.e. 89 of the 232 SSH clones in total). Thirty one clones were easily identified as being from Arabidopsisby BLASTN (Appendix 1 Table 1). Four SSH sequences matched the P. brassicae 185 ribosomal rRNA gene (Castlebury and Domier, 1998) and another four were from the 285 ribosomal rRNA gene. After assembly of the SSH sequences with CAP3 (Huang and Madan, 1999), a total of 48 non-rRNA P. brassicae unisequences were identified from the SSH collection (Table 1). 20

Full-length cDNA sequences To obtain DNA sequence beyond the short SSH fragments, an oligo-capped (OC) pool of cDNAs was created from the P. brassicae-Arabidopsis RNA sample. Prior to completing RACE amplifications, the integrity of this template was tested by cloning and sequencing a small number of OC cDNAs. As a higher percentage of P. brassicae cDNAs than expected was found, we eventually screened a total of 305 OC clones. DNA sequences were obtained from 223 of the clones with inserts that ranged in size between tV 300 bp and 1.7 kb. The P. brassicae OC cDNAs were highly redundant, with numerous copies of three genes, PbsHPS1, PbsHSP2 and PbSUNK1, identified by PCR prior to sequencing. Of the 305 OC clones screened, 123 were from these three genes (Table 1), and 103 were from Arabidopsis (Appendix 1 Table 1). Assembly of the OC sequences gave a total of 44 unisequences (Table 1). There was si9nificant overlap of genes identified by the two methods; ten of the 48 SSH unisequences fell within the longer sequences from the OC cDNA collection.

RACE Of 21 SSH sequences extended by oligo-capped RACE, full-length gene sequences were obtained for all but two (Table 1). A full-length sequence for the tV2.7 kb in size PLRSA gene was retrieved, confirming that cDNAs longer than 2 kb remained in the amplified OC pool, but the abundance of short products hampered our ability to access them. Assembly of the RACE sequences together with the OC and SSH unisequences showed that some SSH clones fell within the unsequenced 3' end of OC cDNAs while several non-overlapping SSH clones were derived from the same gene. For example, four SSH unisequences, comprised of 13 clones, proved to be from the PbSUR2 gene.

Primers for 10 randomly selected genes (Table 1) applied to genomic DNA from a German P. brassicae isolate, gave PCR products of the expected size (data not shown).

Bioinformatic comparisons In all, we collected 76 P. brassicae unisequences that were subsequently examined for possible gene function. Amino-acid translations and BLASTP comparisons to the 21

GenBank non-redundant database showed that 60 of the sequences had matches with Fvalues less than an arbitrarily chosen 10e-04 (Table 1). Searching via InterProScan identified conserved CDC26-like (PD739011), TLC (IPR006634) and von Willebrand factor (IPR002035) protein motifs in three further P. brassicae sequences (Table 1).

Nucleotide comparisons showed that partial cDNA sequences matching the ribosomal 517, cyclophilin A and polyubiquitin (Graf et aI., 2004; Archibald and Keeling, 2004) cDNA clones were previously identified from P. brassicaewhile a fifth gene (PtGST2) showed greatest similarity to a gene from P. betae (Mutasa-Gottgens et aI., 2000). BLASTP matches of the PtGST2 gene to the GenBank non-redundant database were consistent with the earlier report that the P. betae GST sequence appeared closely related to bacterial GST sequences. However, tBLASTn comparisons to the GenBank EST database revealed many higher matches to eukaryotic sequences (highest F value = 5e-61 to Mesostigma viride). tBLASTn also gave much higher EST matches for the PLSUNK10 gene (highest Fvalue = 5e-24 to Amphidinium carterae), emphasising the current paucity of genomic data available for the Rhizaria.

Amino-acid sequence alignments confirmed that several genes were members of families. PbsHSPl and PbsHSP2 shared 45% identity over 101 amino-acids, PL:6UNKl and PL:6UNK2 shared 56% identity over 162 amino-acids, and PL:6URl and PL:6UR2 had 40% identity over 395 amino-acids. The 5'-UTR of PbUBI1 also had no obvious sequence similarity to that from PbUBI2, indicating that the P. brassicae ubiquitin cDNA clones were likely from separate genes.

Given the small overall number of genes, we did not separate the P. brassicae sequences into a full range of functional categories (Table 1). However, some obvious groupings were apparent; 11 unisequences encoded ribosomal proteins and a further 12 encoded proteins with possible roles in abiotic stress response. Five of the latter proteins had chaperone or protein folding functions, including the two most common sequences for small heat shock proteins, and six had redox or detOXifying functions.

Seventeen (IV 23%) of the P. brassicae unisequences identified here had predicted ER translocation signal peptides. A large number of these sequences were without 22 similarity to characterised genes, but the P1:SEC61, PbTLC and PbTPP1 genes were similar to, or likely homologues of, characterised genes from other organisms that are processed by the cellular secretory system (Osborne et aI., 2005; Voigt et aI., 1996; Tomkinson, 1999). In addition to the signal peptide-containing sequences, five further sequences (Table 1) were predicted to contain mitochondrial localisation signals. Similarity to characterised genes again provided supporting evidence that three of these proteins (PtcHAP10, P1i..47, PbIDS) were likely to be located in the mitochondria.

Discussion

No concerted isolation of plasmodiophorid genes has previously been completed. Plasmodiophorids cannot be grown axenically and it is difficult to produce resting spore preparations that are entirely free of microbial contamination. Consequently, confirmation that isolated genes are indeed from a plasmodiophorid is harder than it would be for other micro-organisms. We chose to search for P. brassicae genes transcribed during secondary plant infection because club root galls provide large amounts of otherwise healthy tissue, meaning that almost all cDNAs should be either from P. brassicae or the plant host. Use of Arabidopsisthen alloweo simple identification of sequences from the plant host, and the probability of amplifying prokaryotic sequences was reduced by reverse transcription from a poly-T primer. No unusually high BLAST matches to microbial nucleotide sequences were seen, suggesting that none of the cDNAs had an obvious contaminant origin. Conversely, there was clear evidence for a plasmodiophorid origin for a number of genes. Several genes were members of families, while others had previously been identified from P. brassicae, or showed high Similarity to a P. betae gene. We were also able to amplify DNA fragments matching all 10 selected genes from a DNA sample of a German P. brassicae isolate. Although performing confirmatory PCR tests would be prudent prior to further research on some P. brassicae genes, we believe the combined evidence shows the Arabidopsis club root system to be suitable for larger scale isolation of plasmodiophorid genes. Moreover, as this study suggested a higher than expected percentage of P. brassicae cDNAs in secondary stage infected Arabidopsistissue, future sequencing of random clones from a standard cDNA library might be a cost effective means to access new P. brassicae genes. 23

Secondary P. brassicae development promotes plant cell enlargement and tissue disorganisation, plus substantial changes in phytohormone expression (Devos et aI., 2005). It appears inevitable that P. brassicae-like many other phytopathogens­ secretes proteins into the host cell to effect this modification of cell structure. Of the P. brassicae sequences with signal peptides, two matched functionally characterised genes that provided clues of possible roles in pathogenicity. The first gene, PLPPIl, contained an FKBP family peptidyl prolyl cis-trans isomerase domain. Surface exposed or secreted PPlase domain-containing proteins have been found in several intracellular pathogens, including Legione//a, Trypanosoma cruziand Theileria annulata (Engleberg et aI., 1989; Moro et aI., 1995; Pain et aI., 2005). The second gene, PLPDAl contained a 'NodB-type' chitooligosaccharide deacetylase domain (Davies et aI., 2005). This family includes members known to deacetylate an increasing number of substrates, including xylan and chitin. Historical explanations (Braselton, 2001) for secondary penetration of cortical cells by P. brassicae, by re­ encystment or passive redistribution, have appeared unsatisfactory. More recently, ultrastructural evidence has been presented for cell to cell movement by a myxamoeboidstage in both P. brassicae and S. nasturtii (Mithen and MacGrath, 1992; Claxton et aI., 1996). Since this myxamoeboid movement presumably requires the action of plant cell-wall modifying enzymes, confirmation of xylan deacetylation by PLPDA would strengthen the case for between-cell penetration by plasmodiophorids.

A number of effector/avirulence molecules from phytopathogenic eukaryotes are short, cysteine rich proteins (see Skamnioti and Ridout, 2005). Eight P. brassicae genes with signal peptides and no similarity to characterised genes were predicted to be shorter than 171 aa. Cysteines were most common in PL:6UNK4 with 9 residues, followed by PL:6UNKl and PL:6UNK2 containing 6 and 5 residues each. These and several of the other proteins predicted to be secreted by P. brassicae appeal as candidates for further investigation as virulence molecules. However, because plasmodiophorids continue to resist routine manipulation in the laboratory, finding functions for these genes will remain highly challenging. 24

Name Accession SSH OC Length Protein information E- Organism

value

Ribosomal Proteins (11)

PL114e AM180191 2 470 Ll4e ribosomal 1e-30 Pan troglodytes

PL110 AM180192 1 850 LlO ribosomal 2e-66 Rana sylvatica

Pt617e AM180193 1 490 S17e ribosomal 8e-38 P. brassicae

Pt629 AM180194 2 300 S29 ribosomal mt 5e-06 Pan troglodytes

PL128e AM180225 1 RACE 540 L28e ribosomal 3e-ll Suberites domuncula

Pt627e AM180195 1 280 S27e ribosomal 4e-22 Pan troglodytes

PL124e AM180196 1 580 L24e ribosomal 4e-15 Prunus avium

PL147 AM180197 1 540 Mt 39-S ribosomal 2e-16 Apis mellitera

protein L47 mt

PL129 AM 177644 1 L29 ribosomal 2e-13 Oryza sativa PL1lOe AM 177645 5 190 LlOe ribosomal 2e-12 Pectinaria gouldii Pt618 AM 177646 1 180 S18 ribosomal 5e-08 Argopecten irradians

Chaperones or Stress-related (12)

Pb5HSP1 AM180226 89 71 630 small HSP1 GPCR ge-09 Glycine max

Pb5HSP2 AM180227 4 33 700 small HSP2 GPCR 5e-08 Lactobacillus johnsonii

Pb5HSP3 AM180228 1 RACE 550 small HSP3/HSP20 6e-ll Rickettsia felis

Pl:tlSP70 AM180229 2(2) 1.0 kb HSP70 1e- Phytophthora

RACE 126 cinnamomi

PtcHAP10 AM180198 3 440 Mitochondrial 6e-22 Griffithsia japonica

chaperonin 10 mt

PtcYCA AM180199 2 620 Cyc\ophilin A 2e-66 Tetraodon nigroviridis

PbTRX AM180200 1 520 Thloredoxin mt ge-07 Kluyveromyces lactis

Pl:GRX AM180201 2 470 Glutaredoxin 4e-17 Pongo pygmaeus

P~RX AM180230 1 RACE 770 Peroxiredoxiri" GPCR 1e-60 Ralstonia

metallidurans

PblDS AM180231 1 RACE 1.2 kb Trans-Isoprenyl 1e-30 Quercus robur

Diphosphate Synthase

mt

Pl:GSTl AM180232 13 7 630 GST 1 GPCR 3e-16 Shewanella

amazonensis

Pl:GST2 AM180202 1 690 GST2 3e-55 Po/ymyxa betae 25

cDNAs with predicted ER signal peptides (17)

PLSUNK1 AM180233 10 19 570 Unknown GPCR --

PLSUNK2 AM180234 3 12 620 Unknown --

PLSUNK3 AM 180203 3 630 Unknown - -

PLSUNK4 AM180235 3 8 610 Unknown --

PLSUNK5 AM180236 4 5 540 Unknown --

PLSUNK6 AM 180204 1 390 Unknown --

PLSUNK7 AM180205 1 510 Unknown --

PLSUNK8 AM180206 1 300 Unknown --

PLSUNK9 AM180237 8 RACE 1 1.1 kb Unknown --

PLSUR1 AM 180239 4 RACE 2 1.4 kb Unknown GPCR 4e-16 Usti/ago maydis

PLSUR2 AM180240 13(4) 1.4 kb Unknown GPCR 2e-19 Usti/ago maydis

RACE

P~DA1 AM180241 1 RACE 940 Polysaccharide 2e-19 Streptococcus

deacetylase GPCR aga/actiae

P~PI1 AM180207 1 700 FKBP PPlase 1e-21 Porphyromonas

gingiva/is

PLSec61 AM180242 1 RACE 1.6 kb Sec61 0.0 Cryptosporidium

parvum

PbTLC AM180208 1 990 TLC domain 2e-1O Gibberel/a zeae

(IPR006634)

PbTPP AM180243 3(2) 1.8 kb Peptidase GPCR 1e-62 Macaca fascicu/aris

RACE

PLSUNK10 AM180238 1 RACE 1.0 kb Unknown, vWF domain 1e-05 Neurospora crassa

(IPR002035)

"Other" cDNAs (36)

PLSNAP AM 180244 1 RACE 980 SNAP protein 7e-32 Dictyoste/ium

discoideum

PlAAB1 AM 177647 2 220 Rab small GTPase 2e-20 Dictyoste/ium

discoideum

PINPS29 AM180209 1 640 Phosphoesterase 5e-56 Tetraodon nigroviridis

(Vacuolar protein

sorting 29)

PMSL AM180245 1 RACE 430 AMP-dependent 1e-ll Gal/us gal/us

synthetase/ligase 26

PLDGATl AM180210 1 780 Diacylglycerol 6e-18 Arabidopsis thaliana

acyltransferase

PIRSA AM 180246 1 RACE 2.7 kb Puromycin sensitive le- Strongylocentrotus

aminopeptidase GPCR 116 purpuratus

PtuBI1 AM180211 1 1.7 kb Ubiquitin 1 O.OBN P. brassicae

PtuBI2 AM180212 1 1.7 kb Ubiquitin 2 O.OBN P. brassicae

PtuCE2 AM 177648 1 270 Ubiquitin-conjugating 4e-27 Theileria annulata

enzyme E2

PreIFl AM180213 1 440 SUI1/eIFl 2e-19 Aspergillus nidulans

PLttISH4 AM180214 1 430 Histone H4 ge-37 Apis mellifera

PLCDC26 AM180215 1 320 CDC26-like --

(PD739011)

P&lTF2 AM180216 1 520 Nuclear transport 4e-32 Arabidopsis thaliana

factor 2

PLSRPl AM 177649 1 100 SRP1, Importin a 2e-05 Anopheles gambiae

PbRanA AMl77650 1 170 RanA small GTPase 2e-14 Emericella nidulans

PLCSP AM180247 2 RACE 1 470 Cold shock 4e-15 Oryza sativa

protein/DNA binding

PLCOF AM180217 1 620 Cofilin 2e-19 Dicfyostelium

discoideum

P~TUB AM180248 2(2) 1.6 kb ~-tubulin 0.0 Tetrahymena

RACE thermophila

PtfIM AMl77651 1 350 CH domain; Fimbrin- 3e-24 Arabidopsis thaliana

like

PLtv'lOBl AM180218 2 740 MOBl 3e-78 canis familiaris

PL{]LYT AM180219 1 690 Glycosyltransferase 4e-07 Desulfovibrio

desulfuricans

PL{]PA AM 177652 1 230 G protein a subunit 4e-l0 Homo sapiens

PLNAS AM 177653 1 160 Vacuolar ATP synthase 2e-09 Daniorerio

PLDEAH AM180220 1 1.0 kb DEAD/DEAH helicase ge-72 Tetraodon nigroviridis

PbEBP2 AM180221 1 890 rRNA processing 3e-16 Schizosaccharomyces

protein EBP2 pombe

PLUM AM180222 1 430 LIM domain protein ge-14 Bostaurus

Pb1433 AM 177654 1 220 14-3-3 protein 8e-06 Spinacia oleracea

PLSTPKl AM 177656 1 320 Serine/Threonine 4e-16 Magnaporthe grisea 27

protein kinase ,t',_

Pl6TPK2 AM180249 1 RACE 1.7 kb 5erinefThreonine 3e-06 Homo sapiens

protein kinase

Pl6TPK3 AM177655 1 270 5erinefThreonine- 7e-04 Drosophila

protein kinase melanogaster

PtuNK4 AM 177657 1 90 Unknown - -

PtuNK3 AM 177658 1 120 Unknown - -

PtuNK2 AM 177659 1 130 Unknown -- PtuNK1 AM180250 1 RACE 780 Unknown - - PtuNK5 AM180223 1 490 Unknown - -

PtuNK6 AM180224 1 360 Unknown - -

Table 1. Summary of cDNA clones from P. brassieae. Name = cDNA name. Accession = EMBL accession number. SSH or OC columns indicate the cDNA source library and the number of clones. Bracketed numbers in the SSH column indicate the number of non­ overlapping SSH unisequences for a single gene. Length = size of cDNAs in base pairs, or kilo bases where noted. Protein information = gene function information collated from BLASTX matches and/or matches to conserved protein domains. Evalue = highest BLASTP

probability, or BN = highest BLASTN probability, for matches to the GenBank non-redundant

database. Organism = source organism for highest database match. mt = sequence with

predicted mitochondrial localisation signal. RACE = EST amplified by RACE. GPCR = ESTs amplified from a German P. brassicae field isolate. 28

CHAPTER 3. Genomic DNA sequences from Plasmodiophora brassicae

Abstract

P/asmodiophora brassicae, a pathogen of Brassicaceae plants, is grouped within the eukaryotic supergroup, the Rhizaria. Although a large diversity of protists are found in the Rhizaria, genomes of organisms within the group have barely been examined. In this study, we identified DNA sequences spanning or flanking 24 P. brassicae genes, eventually sequencing close to 44 kb of genomic DNA. Evidence from this preliminary genome survey suggested that splicing is an important feature of P. brassicae gene expression; the P. brassicae genes were rich in spliceosomal introns and two mini-exons of less than 20 bp were identified. Consensus splice sites and branch-point sequences in P. brassicae introns were similar to those found in other eukaryotes. Examination of the promoter and transcription start sites of genes indicated that P. brassicae transcription is likely to begin from initiator elerpents rather than TATA-box containing promoters. Where neighbouring genes were confirmed, intergenic distances were short, ranging from 44-470 bp, but a number of larger DNA fragments containing no obvious genes were also sequenced.

Introduction

Molecular phylogenetic studies are continually increasing knowledge of eukaryotic evolution and the breadth of microbial diversity. Eukaryotes are currently thought to coalesce into five or six supergroups, one of which, the Rhizaria, contains mostly amoeboid protists. Prominent protistan groups in the Rhizaria includes the radiolarians, Foraminifera and Cercozoa (Keeling et aI., 2005). Despite the Rhizaria being circumscribed largely on the basis of recent molecular evidence, very little is known about the genomes of these organisms. cDNA sequences have only been identified in large numbers from the amoeboflagellate alga 8ige/owie//a natans (Archibald et ai., 2003; Rogers et aI., 2004) and the foraminiferan Reticu/omyxa 29 filosa (Burki et aI., 2006). The first genome sequencing project on a rhizarian (B. natans) has just begun (http://www.jgi.doe.gov/seguencing/cspsegplans2007.html).

The Plasmodiophorida is a cercozoan order (Archibald and Keeling 2004; Bass et aI., 2005; Cavalier-Smith and Chao 1997) of special interest because it contains several plant-pathogens of worldwide importance (Braselton 1995; 2001). These include Spongospora subterranea, the cause of potato powdery scab disease, and Polymyxa species that serve as vectors of many plant viruses (Rochon et aI., 2004). However, because plasmodiophorids are obligate intracellular parasites, gene sequences from these organisms are only now being identified. The most studied plasmodiophorid is Plasmodiophora brassicae, which causes club root disease in Brassicaceae including Arabidopsis thaliana (Ludwig-Muller 1999). Reports of genomic structures in P. brassicae are largely confined to a few examples of introns in genes. An early study showed the presence of self-splicing type I introns in the P. brassicae ribosomal small subunit gene (Castlebury and Domier 1998), and a single spliceosomal intron sequence was identified in a DNA fragment from the P. brassicae trehalose-6- phosphate synthase (TPS) gene (Brodmann et aI., 2002). More recently, cDNA and genomic DNA sequences of a P. brassicae serine-threonine kinase gene (PbSTKL1) were published (Ando et aI., 2006). The presence of multiple introns in PbSTKLl was rep~rted, but no information on the structure of these introns was presented.

We recently identified a collection of cDNA sequences from P. brassicae (Bulman et aI., 2006). With an initial aim to gain information about the structure of P. brassicae promoters, we set out to isolate sequences adjacent to a selection of these cDNAs. During this process, the sequences of many introns were also discovered. Here we present the results of this small-scale genome survey from P. brassicae.

Materials and methods

Initial peR primer design For genome walking, pairs of PCR primers facing into the unknown flanking DNA were designed in the 5' and 3' regions of genes, using the Primer3 programme (Rozen and Skaletsky 2000). The first primers were designed from the GenBank sequences for the P. brassicae actin gene (Archibald and Keeling 2004) and the 30 trehalose-6-phosphate synthase (TPS) gene (Brodmann et aI., 2002). The three actin sequences (AY452179-81) were 640 bp long and began about 370 bp downstream of the ATG start codon. The 217 bp TPS fragment (AF334707) was estimated to be 760 bp downstream of the ATG.

Further PCR primers were designed in 14 genes (Table 1) from the cDNA libraries described in Bulman et al. (2006). Two cDNAs, PbCAL and PbCHAp, were identified from the oligo-capped library after publication of the previous sequences (see Table 1 for details of these two genes).

PCR walking DNA was extracted from A. thaliana Columbia gall tissue infected with P. brassicae, using a CTAB method (Russell and Bulman 2005). The genomic DNA sequences that bordered P. brassicae genes were amplified from this DNA with Accuprime High Fidelity Taq (Invitrogen) using one of two DNA walking techniques. 1. Suppression­ PCR walking (Siebert et aI., 1995). Genomic DNA was digested with the Pst-1 endonuclease and then ligated to a SUP-Pst adaptor (primer and adaptor sequences provided in Appendix 2 Table 1). Flanking DNA was PCR amplified with a gene specific primer paired with the Sup1a primer. 2. A universal PCR walking technique modified from Sarkar et al. (1993). First round PCRs were carried out with genomic DNA as template, and each of four degenerate primers (UNU-4) paired with a gene specific primer. A second round of four PCRs were performed with the T7 primer paired with a nested gene-specific primer. One IJL of each first round PCR reaction was used as template.

Amplified products were separated on 1% agarose gels and DNA fragments purified with the QIAquick Gel Extraction Kit (Qiagen). DNA fragments were directly sequenced with gene specific primers. Fragments longer than tv 800 bp were cloned in pGEM-T (Promega) and sequenced with the M13 forward and reverse primers. Longer DNA fragments were obtained by an iterative process of sequencing, design of new PCR primers, and renewed PCR-walking.

RACE and RT-PCR amplification Genomic DNA sequences were compared to the GenBank non-redundant or EST databases by BLASTX or tBLASTX respectively (Altschul et aI., 1997). In nine cases, 31 translated BLAST yielded hits that suggested the presence of new genes in these sequences (all nine Evalues < 1e-08). PCR primers for RACE were designed in regions of highest similarity and used to amplify cDNA sequences by 3'RACE and 5'RACE. RACE reactions were performed on an oligo-capped cDNA template as described in Bulman et al. (2006). In a number of cases where 5'RACE products were cloned and then sequenced, the sequence of the longest clone was recorded as the full-length cDNA.

Bioinformatics DNA contigs were assembled from ABI sequence trace files with Sequencher (GeneCodes). cDNA sequences were aligned to genomic DNA sequences using Spidey (http://www.ncbLnlm.nih.gov/spideyl) with some manual alignment. Putative splice sites were searched for by manually examining intron sequences for the 3' most conserved splice site motif CTRAY, as in Kupfer et al. (2004). In introns lacking a complete CTRAY, the 3' most motif with only one variant nucleotide was recorded. It was assumed that the fourth 8. nucleotide in the splice sites would always be invariant.

As a simple estimate of intron position, the intron-containing genes were divided into quarters, and the number of introns in each quarter recorded.

SEQ LOGOs (Crooks et aI., 2004; Schneider and Stephens 1990) were constructed at http://weblogo.berkeley.edu/. P. brassicae promoters were manually examined for TATA boxes. The P. brassicae L7a amino acid sequence was aligned to the L7a dataset from Russell et al. (2005) using Clustal X (Thompson et aI., 1997). A neighbour joining phylogenetic trees was constructed with PAUP*4.0b10 (Swofford 2002) using default parameters. A maximum likelihood tree was constructed using PhyML (Guindon and Gascuel 2003) with the input tree from BIOJ and the WAG substitution matrix. 32

Results

Genomic DNA sequences from actin and TPS Genomic DNA sequence stretching 2.2 kb in the 5' direction and 350 bp in the 3' direction from the GenBank P. brassicae actin sequences was obtained (see Fig. 1 for a diagram of all genomic sequences obtained). This sequence contained the entire actin coding region; with PCR primers on either side of this region, we re-amplified and directly sequenced this gene (PbACT1).

During sequencing of P. brassicae actin cDNAs, a clone was found with a 5' UTR sequence that did not match that of PbACTI. DNA-walking with primers based on this cDNA yielded the sequence of a second P. brassicae actin gene (PbAC71l). PbAC71and PbAC71Idiffered by 44 bp across the 1128 bp coding sequences. Neither gene sequence exactly matched any of the three actin P. brassicae GenBank sequences (Archibald and Keeling 2004). Both PbAC71and PbAC71Ihad greatest similarity to the actin 3 sequence (AY452181), matching 711 and 715 of 733 bp respectively.

With PCR primers designed from the short GenBank P. brassicae TPS sequence (PbTPS; Brodmann et aI., 2002), a complete 2640 bp PbTPScDNA was amplified by RACE. By DNA walking, genomic DNA sequence spanning from 870 bp upstream of the PbTPStranscription start site (TSS) to 80 bp from the 3' end of the PbTPScDNA was obtained (Fig. 1).

New genes were identified in flanking DNA DNA sequences that flanked a further 12 genes from Bulman et al. (2006) were obtained (Table 1A). The new genomic DNA sequences obtained in this study are deposited under EMBL accessions AM411663-AM411680. Using RACE and RT-PCR, the sequences of seven new cDNAs in these genomic DNA sequences were identified (Table 1B).

In total, about 44 kb of P. brassicae genomic DNA was sequenced (Fig. 1). The longest DNA fragment of 6.4 kb contained four closely spaced genes, separated by 44 bp, 215 bp and 122 bp (Fig. 1). On another genomic DNA fragment, 97 bp 33 separated the 3' end of the PbGTPXcDNA from the TSS of the PbPDA cDNA. The maximum distance between confirmed cDNAs was 474 bp (PbURM-PbACTl). Several

A. P. brassicae gene sequences for primer design

Name Accession Name Accession Name Accession PbACTf'B AY452179-81 PbsHSPl AM180226 PbPDA AM180241

PbACTlf'B AY452179-81 PbsHSP2 AM180227 PbIDS AM180231

PbTP~B AF334707 PbSUNKl AM180233 PbHISH4 AM180214

PbPTUB AM180248 PbSUNK2 AM180234 PbMOB AM180218

PbCYCA AM180199 PbGSTl AM180232 PbPSA AM 180246

.B. New P. brassicae gene sequences

Name Accession Length Protein information Evalue Organism

caltractin-like; IPR011992 (ca2+- PbCAL§ AM411655 520 0.001 Naeg/eria gruberi binding EF-Hand superfamily). Cystidine-histeine dependent amino- PbCHAP AM411658 760 2e-35 Rhodospiri/lum rubrum peptidases (CHAP) SP

PbCALUP AM411654 730 Conserved hypothetical proteins 2e-17 . Canis tamiliaris

PbPEPS28 AM411657 1470 Serine carboxypeptidase S28 family 4e-74 Arabidopsis thaliana

Transmembrane 9 superfamily PbENDO AM411656 2140 0.0 Oryza sativa protein member 4 SP

PbL7A AM411659 900 Ribosomal L7Ae 3e-58 Theileria annulata

PbGTPX AM411660 980 GTPase 1e-24 Drosophila me/anogaster

cd02429; Peptidyl-tRNA hydrolase PbGSTUP AM411661 490 5e-20 Daniorerio type 2 (PTH2-like)

PbURM AM411653 330 ubiquitin-related modifier-1 2e-ll Arabidopsis thaliana

Table 1. Details of P. brassicae genes. A. P. brassicae genes from Bulman et at. (2006a) and from GenBank (GB) that were used for initial peR primer design. B. Information on new P. brassicaegenes presented in this study. Name = cDNA name. Accession = EMBL accession number. Length = size of cDNAs in base pairs. Protein information = gene function information collated from BLAST)( matches and/or matches to conserved protein domains. SP = signal peptide predicted by SignalP 3.0 programme (Bendtsen et at., 2004). Evalue = highest BLASTP probability for matches to the GenBank non-redundant database.

Organism = source organism for highest database match. § = cDNAs first identified by screening of an oligo-capped library. 34

_ = cDNA sequence but no genomic sequence q:J = intron r-__~13~0~0 __~~ ~3000 PbURM PbACTI

Mf------,>! 600 12100 PbACTII

~8ZQ...70---I=====t=t=tl =tl =tl :1=1:t=I1=t1 ::::ti 4000 PbTPS

~7~00L-t=1 I~====C>I 550 13000 ~ I I II I ~ 1400 PbBTUB PbMOB

210 "'- FP ~9----++, K II I I I I I Rt-----t-t-tlI--+-t---~ II ~ 6400 PbPEPS28 PbENDO PbCAL PbCAWP

1450 580 4100 ~ ~ 1 ~1000 PbL7A PbCHAP PbHISH3 PbHISH4 (7)

600 1400 ~O I :>I-.lli.j 2900 PbGSTUP PbGSTl PbGTPX PbPDA

2500 ~r----fI--+I-----1~ 1900 H--I-+II+-1 ++-11 +-1----,>1 PbIDS PbTPP

I I I I II I II I I I I I I I I I I I ~ 440 I 4700 PbPSA

1100 750 220011-----+--I--+-r.a. PbSUNK2

Figure 1. Diagram of the genomic sequences obtained in this study. Numbers = DNA lengths in base pairs. Numbers on the right of each fragment indicate the total length of the sequence. The sizes of genes and the positions of introns are drawn approximately to scale. * = five genes for which the lengths of the promoter sequences are shown in a single diagram. BLAST matches indicate that a histone H3 gene (PbHISH3(7)) is found adjacent to the PbHISH4gene. A partial mRNA sequence from this gene was amplifiable by RT-PCR but the complete gene was not retrieved by RACE amplification; the uncertainty about this gene is denoted by (7). 35

P. brassicae genomic DNA fragments of up to 2.2 kb in length contained no obvious genes by BLASTX similarity (Fig. 1).

Genomic DNA sequence was obtained containing the PbCHAPgene, a likely cysteine, histidine-dependent amino-peptidase (CHAP). The CHAP domain is predominantly found in bacterial genes, but has also been characterised in trypanosome genes (Bateman and Rawlings 2003; Rigden et aI., 2003). A ribosomal L7a gene was found upstream of PbCHAP(Fig. 1). Phylogenetic comparisons of the P. brassicae L7a amino acid sequence with orthologs from a broad variety of organisms (Russell et aI., 2005) showed PbL7a to group most closely with a sequence from the chlorarachnion 8. natans, albeit with low bootstrap support throughout the tree (Appendix 2 Fig. 1).

Introns were located in P. brassicaegenes Genomic DNA that spanned the coding region of several genes was sequenced. Alignment of the genomic sequences with the corresponding cDNAs revealed the presence of 73 spliceosomal introns in 14 of the P. brassicae genes (see Fig. 1). The greatest number of introns (20) was found in the PbPSA gene. By comparison, the closely related puromycin-sensitive aminopeptidase gene of humans (NM_006310) contains 22 introns. The conserved ribosomal L7a gene from P. brassicae contained two introns, at the 115 aa and 250 aa positions relative to the Saccharomyces cerevisiae L7a gene.

The sequences of 14 further introns were extracted from the P. brassicae STKLl gene (Ando et aI., 2006), giving a total of 87 P. brassicae introns ranging in size from 46-103 bp, at an average of 55.7 bp (Appendix 2 Fig. 2). The GC content of the introns was 50.2%. All of the introns were bordered by the standard 5' IGT and AGI 3' nucleotide pairs, except two that were bordered by 5' IGC nucleotides (Fig. 2). Searches for conserved splice sites showed that 44 introns contained CTRAY motifs (Kupfer et aI., 2004) between 8 and 22 bp (average 13 bp) from the 3' intron end. A further 35 introns had CTRA Y motifs with one variant nucleotide, all of which lay between 8 and 27 bp (average 13.6 bp) from the 3' end of the intron. Of these 35 motifs, nine varied at the central nucleotide (CTCAY) whereas the remaining motifs had variant nucleotides at either end (NTRAYjCTRAN). Six of the remaining introns had CTRAY motifs with two variant nucleotides between 10 and 21 bp from the 3' intron end. 36

Ten, 26, 32, and 19 introns were found in each quarter (from 5' to 3') of the intron­ containing P. brassicae genes.

Amongst the P. brassicae exons, we identified two very short exons, or micro-exons, of 4 bp (PbMOBexon 4) and 19 bp (PbCARPEPexon 7).

A 35 52 5 o 51 28 9 9 14 14 16 19 C 27 14 9 2 11 23 13 23 30 26 23 16 G 16 5 64 87 0 13 12 49 21 22 18 20 27 T 9 16 9 o 85 12 24 16 34 21 29 28 25

J'l'l-Exon-

i:i)1 T A 0 - 1- ~ ~ "3 4 5 ~ 7 8 9 10 11 12 13 Sequence position

A 13 11 13 10 14 12 17 1 87 0 19 22 20 C 20 22 23 23 23 23 18 60 o 0 13 20 20 G 18 24 21 19 23 19 26 0 o 87 45 14 24 T 36 30 30 35 27 33 26 26 o 0 10 31 23 ~:,l~ ~&:~A _~Exon-

B 0 1 2 3 4 5 6 7 8 9 10 11 12 13 Sequence position

Figure 2. Conserved sequences at intron splice sites. A. SEQLOGOs were constructed from sequences at the 5' border (A) and 3' border (8) of 87 P. brassicae introns. The intronjexon boundaries are indicated. Exact numbers of nucleotides at each positions are shown above the SEQLOGOs.

Promoter and TSS sequences The 5'UTR sequences from 65 full length P. brassicae cDNAs, identified here or in Bulman et a/. (2006), were assembled. These ranged from 2 bp to 112 bp in size. The GC content of the UTRs was 57.6%, compared with 57.4% for the P. brassicae cDNAs coding sequences. The GC content of P. brassicae cDNA sequences was relatively high and similar to the 57-58% reported for Phytophthora spp. (Qutob et a/., 2000).

All but one of the full-length cDNAs began with an A nucleotide. As indicated by a sequence logo plot of the first 10 nucleotides of the UTR sequences, the initial A nucleotide was most often followed by T nucleotides in both the +2 and +3 positions (Fig. 3). The promoter sequences that were collected from the 24 P. brassicae 37

genes were assembled. A sequence logo of 5 bp on either side of the predicted TSS of these genes is shown in Fig. 3. This again showed the ATT sequence at the start of transcripts, but also revealed a tendency towards the C nucleotide being present at the -1 position.

Manual examination failed to show the presence of TATA boxes in any of the P. brassicae promoter sequences.

A 64 5 18 7 7 12 12 21 28 16 C 0 8 6 25 20 7 26 9 12 15 G 0 13 0 14 16 30 12 12 11 16 T 1 39 40 18 21 15 14 20 11 14

~:~~X 5 6 7 8 9 10 A +1 2 3 4 Sequence position

A 7 3 2 0 23 0 6 1 1 4 C 6 8 6 20 0 1 0 10 5 3 G 6 9 11 4 1 0 6 0 5 7 14 T 5 4 4 12 3 1 17 18 8 11 3

~ :l . O cATI_ _ 2 3 4 5 +1 7 8 9 10 11 B Sequence position

Figure 3. Conserved TSS sequences. A. A SEQLOGO constructed from the first 10 nucleotides of 65 full-length P. brassicae cDNAs. B. A SEQLOGO constructed from 5 nucleotides on either side of the TSS. The first nucleotide of transcripts is marked by + 1. Exact numbers of nucleotides at each positions are shown above the SEQLOGOs.

Discussion

Because of the obligate nature of plasmodiophorid-plant relationships, it remains very difficult to obtain pure P. brassicae DNA or RNA. Many typical techniques of molecular biology are made difficult by the need to work with a mixture of host plant and plasmodiophorid DNA. Here, PCR-walking techniques were used to obtain P. brassicae DNA sequences, thus avoiding the necessity to construct and screen genomic DNA libraries. Initially, we used PCR-walking to extend the genomic DNA sequences of the actin and TPS P. brassicae genes previously identified by other researchers (Archibald and Keeling 2004; Brodmann et aI., 2002). A notable finding 38 from the cloning of the actin sequences, was confirmation that there are multiple copies of the P. brassicae actin gene. Five and six actin genes, respectively, are found in the genomes of the oomycete Lagenidium giganteum and the unicellular alga Emi/iana hux/eyi, so the presence of multiple actin genes in a protist is not unusual (Bhattacharya et aI., 1993; Bhattacharya and Stickel 1994). However, the multiple actin sequences may also be related to the unusual variability seen in P. brassicae field populations. Single-spore lineages taken from P. brassicae field isolates display considerable differences in pathogenicity and DNA structure (Buczacki et aI., 1975; Manzanares-Dauleux et aI., 2001; Some et aI., 1996). Hence, the different actin sequences could be from multiple genes carried within a single P. brassicae genome, or they may reflect between-lineage genetic variations. Discerning between these possibilities will require further research with single-spore isolates.

An advantage of the oligo-capping cDNA cloning approach is that most sequences should be full-length and include the TSSs. In combination with study of previously identified cDNAs (Bulman et aI., 2006), examination of the P. brassicae sequences suggested that P. brassicae gene transcription begins from initiator element (Inr)-like motifs. In eukaryotes, gene transcription is controlled by combinations of DNA elements in a core promoter within 200-300 bp of the TSS. These include the well characterised TATA box motif, usually found 25-35 bp upstream of the TSS in some plant and metazoan genes (Molina and Grotewold 2005; Yang et aI., 2007), and the Inr element surrounding the TSS (Smale 1997). Genes from several eukaryotes appear to have only Inrs; for example, transcription of Trichomonas genes begins from an adenosine within a TCA+ 1YW site (Quon et aI., 1994), and oomycete genes contain the Inr element TCA+1TTYY in the -2 to +5 region around TSSs (McLeod et aI., 2004). TATA box elements were not evident in any of the 24 P. brassicae promoter sequences. Examination of the P. brassicae genes pOinted to a consensus sequence of CA+1TT at the TSS, a sequence that resembles Inr elements from other organisms. This is especially so as a number of the full-length cDNA sequences were composed of multiple clones ending at different points, suggesting that the longer sequence might not end at the true TSS.

In addition to examining sequences surrounding the start of genes, study of genomic DNA sequences revealed the presence of many introns in P. brassicae. The mean 39 size of the P. brassicae introns (56 bp) was similar to that seen in several fungi species (50-70 bp) (Kupfer et aI., 2004), being considerably shorter than introns seen in most eukaryotes but larger than the near-minimum sized 25 bp introns from Paramecium tetraure/ia (Aury et aI., 2006; Keeling and Siamovits 2005). Almost all of the introns were bordered by the 5' I GT and 3' AG I nucleotide pairs. The majority of spliceosomal introns in other organisms are U2 dependent, and are bordered by these almost invariant dinucleotides (see Sharp and Burge 1997). Two of the P. brassicae introns had 5' IGC, which is the most common splice site variant in other organisms; for example, being detected in 0.08-1.98% of introns in five species of fungi (Kupfer et aI., 2004). Beyond the bordering dinucleotides, the consensus splice sites of the P. brassicaeintrons - 5' AGIGTANG and 3' YAGI - were similar to those from other organisms. The 5' consensus is usually the longer of the two, extending to GIGTRWGT in fungi and IGTRNGT in oomycetes, whereas the 3' consensus is generally confined to YAGI (Kamoun 2003; Kupfer et aI., 2004; Win et aI., 2006). We were also able to distinguish putative branch point motifs at the 3' ends of the majority of P. brassicae introns. Such motifs have previously been shown to be YNCTRAY in metazoans and RCTRAY in fungi (Berglund et aI., 1997; Kupfer et aI., 2004). In P. brassicae, 85/87 (98%) of introns contained putative branch point motifs within 8-27 bp from the 3' intron end, compared with more than 98% of fungal introns with potential branch points 13-36 bp from the 3' end (Kupfer et aI., 2004).

A ribosomal L7a gene containing two introns was identified upstream of the P. brassicae CHAP-domain gene. Phylogenetic comparison with L7a genes from a broad range of organisms showed that the P. brassicae gene grouped most closely with a gene from the other rhizarian, B. natans. An intron at a conserved position of the L7a gene has previously been identified from members of the Ophistokonta, Amoebozoa and Excavata, but was absent from members of the Chromalveolata and Plantae (Russell et aI., 2005). The intron was not found in B. natans, the only rhizarian then examined, and neither of the P. brassicae introns was at the conserved position. Although the conserved L7a intron alone is likely to be a weak marker of early eukaryotic evolution, the absence of this intron in P. brassicae lends some further support for the grouping of the Rhizaria with Chromalveolata and Plantae. 40

The prevalence of introns has not been reported in any other rhizarian organism. Introns appear to be present in all other eukaryotes that have been examined (Vanacova et aI., 2005), but many protist species contain few introns (Jeffares et aI., 2006). For example, only three introns have been identified in the Giardia lamblia genome (Russell et aI., 2005) and 65 introns in Trichomonas vaginalis (Vanacova et aI., 2005; Carlton et aI., 2007). By comparison, splicing must be a prominent feature of P. brassicae transcription, since spliceosomal introns were plentiful in the genes examined and two micro-exons were detected. Little information on the prevalence of micro-exons in different organisms has been published, but, in a computational study, Volfovsky et al. (2003) found that between 0.5-1.6% of mRNAs, from four relatively intron-rich organisms, contained micro-exons. Two mini-exons in 19 fully sequenced genes seems unexpectedly high, were this to be further reflected in the overall P. brassicae genome.

As in other intron-rich organisms, the distribution of P. brassicae introns showed no strong bias towards the 5' end of genes. Spliceosomal introns show a 5' bias in many organisms, but this bias is especially pronounced in intron-poor protists such as T. vaginalis (Jeffares et aI., 2006; Mourier and Jeffares 2003).

The forces at work on the P. brassicae genome remain uncertain. A feature of many parasites is a reduction in genome size, and, in some cases, a reduction in the number of introns (Mourier and Jeffares 2003). For example, the intracellular microsporidian Encephalitozoon cuniculi has a mean intergenic distances of 129 bp and only 23 introns (Katinka et aI., 2001). All of the intergenic distances between confirmed P. brassicae genes were relatively short, but genomic fragments of up to 2.2 kb without obvious genes were also sequenced. More detailed examination of gene transcription from larger genomic regions will confirm if P. brassicae has a compact genome but intron rich genes. It has often been postulated that intron loss is correlated to the reproduction rate of organisms, such that unicellular organisms with short life cycles would have fewer introns (see Jeffares et aI., 2006). However, recent studies indicate that some free-living protists have a significant number of introns (Archibald et aI., 2002; Siamovits and Keeling 2006). Although intron densities in extant eukaryotes show no clear phylogenetic pattern, it is thought that the ancestral had a high to moderate numbers of introns (Roy and Gilbert 2006). In this context, P. brassicae might be seen as an example of a eukaryote that 41 has retained ancient intron numbers. The (probable) presence of an intron in a Po/ymyxa betae gene (Mutasa-Gottgens et aI., 2000) and a 51 bp intron in a repeat of a Spongospora subterranea polyubiquitin gene (Archibald and Keeling 2004) is the only evidence of introns in other plasmodiophorid species. Given that P. brassicae was the sole phytomyxid to have self-splicing introns in the ribosomal SSU (Bulman et aI., 2001; Castlebury and Domier 1998), it will be interesting to see if all of the phytomyxid parasites are rich in introns, or if the P. brassicae genome is especially permissive to intron presence. 42

CHAPTER 4. Testing plant-derived RNA silencing against the intracellular pathogen Plasmodiophora brassicae

Abstract

Plasmodiophora brassicae is an intracellular pathogen that infects plants in the Brassicaceae family. A number of P. brassicae cDNA coding sequences have recently been identified, but tools for in vivo manipulation of P. brassicae gene expression are not available. RNA silencing has rapidly become an indispensible tool for knockdown of gene expression in a range of organisms. We tested for the presence of RNA silencing in P. brassicae. Due to the obligate intracellular lifestyle of P. brassicae, we attempted to induce silencing during infection of transgenic Arabidopsis thaliana plants. Stable transgenes containing hairpin-format segments of the P. brassicae actin gene, were introduced into Arabidopsis. Transformants showing transcription and processing of the hairpin RNAs were infected with P. brassicae but these plants showed full development of club root disease. No changes in transcription of the endogenous P. brassicae actin genes were measured.

Introduction

Plasmodiophora brassicae is a plasmodiophorid pathogen that causes club root disease of Brassicaceae plants (Ludwig-MOiler, 1999). Club root is distinguished by the growth of large hypertrophic galls on plant roots and hypercotyls. The disease is a problem for growers worldwide. The plasmodiophorids, which infect both plant and heterokont hosts (Braselton, 2001), are phylogenetically separated from other groups of plant pathogens, falling within the recently defined eukaryotic supergroup, the Rhizaria (Nikolaev et aI., 2005; Keeling et aI., 2005). Although the Rhizaria contains a wide diversity of protiSts including the Formanifera and Cercozoa, the first genome sequencing projects on any organism from the group has just been initiated. 43

Consequently, the molecular biology of the Rhizaria is less well understood than in . other major groups of eukaryotes.

On one side of the club root interaction, a preliminary understanding of changes in plant gene expression is' being gathered (Siemens et aI., 2006; Ando et aI., 2006b). By comparison, the molecular biology of P. brassicae - and all p.lasmodiophorids - remains largely untouched, primarily because these parasites are difficult to manipulate in the laboratory. All are obligate intracellular biotrophs that cannot be axenically cultured in vitro. Protocols for gnotobiotiC growth of P. brassicae in plant callus tissues have been developed (Ingram, 1969; Asano and Kageyama, 2006), but these are yet to be widely adopted for experimental purposes. Instead, experimental infections with P. brassicae continue to be made on soil-grown plants using solutions of resting spores. Methods for laboratory manipulations of P. brassicae have changed little since those summarised by Buczacki (1983).

Recently, a number of new P. brassicae genes were sequenced, many of which had unknown functions (Archibald and Keeling, 2004; Ando et aI., 2006a; Bulman et aI., 2006a). To further investigate the roles of these genes, techniques for gene manipulation in vivo are highly desirable. In many organisms, RNA silencing induced by the presence of double stranded (ds)RNAs (for a review see Brodersen and Voinnet, 2006), has been adapted for gene inactivation. A widely used technique to produce silencing in plants is stable transformation with transgenes that transcribe hairpin-format (hp)RNAs (Wesley et aI., 2001). Vectors for hpRNA assembly and transformation of Arabidopsis have been widely adopted for gene knockdown (Helliwell et aI., 2002).

One of the most important roles for RNA silencing is defending genomes against viruses and transposable elements (Waterhouse et aI., 2001). Building on this role, several plant species have been engineered with RNA silencing-mediated resistance to specific viruses (e.g. Wang et aI., 2000). An obvious question arising from this application of RNA silencing was, would endogenously expressed siRNAs protect plants against other pathogens? There was some evidence supporting such an approach, since hpRNAs directed at tumour-forming genes of Agrobacterium tumefaciens produced tomato plants resistant to crown gall formation (Escobar et aI., 2001). In multicellular pathogens, this approach is being most actively pursued in 44 biotrophic plant parasitic nematodes, in part because of the ease with which RNA silencing is induced in Caenorhabditis e/egans (Bakhetia et aI., 2005).

In the absence of other methods, the in p/anta expression of hpRNAs appeared an attractive approach for in vivo manipulation of P. brassicae gene expression. Given· that P. brassicae has an intimate biotrophic relationship with the plant cell, and the·· easily manipulated Arabidopsis is an established P. brassicae host, we investigated whether siRNAs transcribed in Arabidopsis could cause gene knockdown in P. brassicae.

Material and methods

Vector construction DNA manipulations were performed using standard techniques in £ co/iDH5a (Sam brook and Russell, 2001). All DNA fragments were amplified using Accuprime High Fidelity Taq (Invitrogen). A diagram of the vectors assembled in this study is shown in Fig. 1. A completely conserved 153 bp sequence at the 5' end of the P. brassicaeactin I and II genes (corresponding to nucleotides 59-212 from the ATG start codon) was chosen as an RNA silencing target (Bulman et aI., 2006b). BLASTN comparisons of the sequence against the Arabidopsis genome gave a best match of 49/59 bp to the actin 12 gene (At3g46520), but no hits of more than 15 consecutive nucleotides to any of the Arabidopsis genome. The 5' peR primer for this fragment was modified to contain an attBl site at the 5' end (peR primers shown in Appendix

3 Table 1). A 201 bp DNA fragment from the £ co/i~-glucuronidase (GUS) gene (509-710 bp from the ATG) was amplified, including a 3' peR primer modified to contain an attB2.. site at the 3' end. The actin 3' primer and the GUS 5' primers were designed with sequences complementary to one another. The two DNA fragments were gel purified with the QIAquick Gel Extraction Kit (Qiagen) then fused together using overlap extension peR (Sam brook and Russell, 2001). The fused product was gel purified, then ligated into the pGEM-T vector (Promega). Insert-containing plasm ids were purified (Qiagen) and sequenced. A plasmid containing the correct actin-GUS fusion (Appendix 3 Fig. 1) was used for recombination of the insert into the pDONR201 vector (Invitrogen) by a BP c10nase (Invitrogen) reaction as in Helliwell et al. (2002). An insert-containing vector was again purified, and then used 45 to recombine the actin-GUS fusion into the pHelisgate8 RNA silencing vector (Helliwell et al., 2002) via an LR clonase (Invitrogen) reaction.

The correct orientation of the pHelisgate inserts was verified by Xba I restriction enzyme digestion. The final vector (pH8AG) was transformed into Agrobacterium tumefaciensGV3101 (Koncz and Schell, 1986) byelectroporation.

pH8AG :::•••• = UTRs

Figure 1. Vector assembly. The positions of the DNA fragments from the P. brassicae actin gene(s) (top) and the GUS gene from the pCAMBIA 1303 vector (bottom) that were used for hpRNAs are shown. The two fragments were fused and inserted into the pHelisgate8 vector in inverted format. The positions of the 3' end (UTR) peR primers used to assess actin expression are shown.

Transgenic Plant Construction Arabidopsis thaliana Columbia plants were transformed by floral dipping (Clough and Bent, 1998). A plant line (ALC.03) expressing a translational fusion of the GUS and GFP genes driven by the cauliflower mosaic virus (CaMV) 35S promoter, was generated by transforming plants with the pCAMBIA 1303 vector (http://www.cambia.org/). Transformed ALC.03 plants were selected on MS media (Invitrogen) containing hygromycin. GUS expression in plants was assayed by histochemical staining (Jefferson et ai., 1987). GFP expression in the Arabidopsis roots was unreliable compared with GUS expression. Many seedlings that appeared 46

, , to be GFP negative were positive when stained for GUS activity. Expression of the mgfp5 gene has been reported to be weaker than for some other forms of the GFP gene (http://www;cambia.org/daisy/cambia/home.html). Homozygous AtC.03 plantS were floral~dipped with pH8AG and transformed plants were selected on kanamycin­ containing media.

DNA and RNA techniques DNA was extracted from both P. brassicae-infected and uninfected Arabidopsis tha/iana Columbia plants using a CTAB method (Russell and Bulman, 2005). Plants containing pH8AG inserts were identified by PCR with the PbACT2attB1/HLOOPR1 primer pair (all PCR primers shown'in Appendix 3 Table 1). Southern blotting was performed as in Sambrook and Russell (2001).

RNA from uninfected Arabidopsis plants and P. brassicae-infected roots was extracted using TRIzol reagent (Invitrogen). For northern blotting, N 10 I-Ig total RNA was separated on formaldehyde-agarose gels and transferred to Hybond-N+ (GE Healthcare).

For siRNA gel blotting (Hamilton and Baulcombe, 1999), N 20 I-Ig total RNA was separated on 1.5 mm thick 15% polyacrylamide and 8M urea gels, with a Protean II apparatus (BioRad). RNA was electro-blotted onto Hybond-N+ with an LKB NovaBlot (GE Healthcare) semi-dry apparatus. DNA oligonucleotides of 18-24 bp were used as size markers and hybridisation controls.

Blots were UV crosslinked in a Bio-Link device (Vilber Lourmat). All hybridisations were performed in modified Church buffer (Sam brook and Russell, 2001), at 65°C for Southern and northern blots, and at 40°C for the siRNA blots. Radioactively labelled probes were produced by random-priming of PCR amplified DNA fragments with 0- p32dCTP using the RediPrime kit (GE Healthcare) (PCR primer positions shown in Fig. 1). Southern blots were hybridised with a DNA fragment from the nptlI gene, amplified with the primers NPTIIa/b. Northern blots were hybridised with a P. brassicae actin DNA fragment 735-986 bp downstream from the ATG start codon (amplified with PbACTR1F/R primers). siRNA blots were hybridised with the P. brassicae actin DNA fragment used for the pH8AG assembly. 47

RT-PCR For RT-PCR, all RNA samples were pre-treated with DNAse 1 (Roche). Two IJg of total RNA was reverse transcribed with the SuperScript III system and the CTRT primer (Invitrogen). DNA was amplified with Platinum qPCR SYBR Green Supermix­ UDG (Invitrogen) in 20 IJL reactions. PCR primers are in Appendix 3 Table 1. Duplicate reactions in each run were analysed on an ABI 7700 Sequence Detector (Applied Biosystems). The P. brassicae actin I and II genes were specifically amplified with primers in the 3'UTR of each gene, paired with a non-gene specific P. brassicae actin primer (primer sequences for RT-PCR given in Appendix 3 Table 1). The P. brassicae actin genes were also separately amplified with a non-gene specific primer pair. The Arabidopsis actin 8 gene and the P. brassicae ~-tubulin (AM 180248) genes were amplified as standards. CT values were normalised to the value for the latter gene for each template. Relative gene expression levels were calculated using the 2 -Mer method (Livak and Schmittgen, 2001).

Plant infection For P. brassicae infections, Arabidopsisseedlings were transferred to soil and grown in a glasshouse. Infection was initiated by applying 1 mL of P. brassicae resting spore inoculum (tv 1x106 sporejmL) to the base of each plant (Bulman et aI., 2006a). Secondary stage disease was assessed at 30 days post infection (dpi) by visually examining roots for galls. Infected root material for RNA extraction was collected from tv 40 seedlings at 8 dpi and galled roots from 4-5 plants at 30 dpi. Infection at 8 dpi was confirmed by staining roots in trypan bluejlactophenol solution (Appendix 3 Fig. 2).

Results

Arabidopsis lines containing hpRNA constructs typically show a range of RNA knockdown of the targeted gene (e.g. Helliwell et aI., 2002). To obtain a measure of hpRNA expression prior to carrying out time-consuming P. brassicae infection assays, we utilised the ability of chimeric constructs to simultaneously knockdown transcription from more than one gene (Helliwell et aI., 2002). For this purpose, a GUS gene that was constitutively expressed in Arabidopsis, was targeted. For the hpRNA, a 153 bp fragment from the P. brassicae actin gene was fused to a GUS DNA 48 sequence, and assembled in the pHelisgate8 vector, to produce pH8AG. Homozygous ALC.03 plants expressing a GUS/GFP marker gene were floral dipped with Agrobacterium containing pH8AG, yielding eight kanamycin-resistant transformants. Segregation ratios of T1 progeny on kanamycin suggested that all except line 6 had single insertions (Appendix 3 Table 1). Southern hybridisation confirmed that lines 2, 4, 5, and 7 contained single pH8AG insertions, whereas line 6 had two inserts (Appendix 3 Fig. 3). Histochemical staining showed that T1 seedlings of lines 2, 4, 5, 7 and 8 displayed either little or no GUS expression or strong expression, at ratios of IV 3: 1 (Fig. 2). Only a few line 6 seedlings stained positively for GUS, consistent with the presence of two pH8AG insertions. All seedlings from line 1 were GUS-positive and this line was found not to contain the hpRNA section of pH8AG by PCR (Appendix 3 Fig. 4). Line 3 was discarded because most seedlings of this line grew in a stunted form. Comparison of PCR and histochemical staining showed that GUS-positive T1 seedlings from lines 4 and 7 did not contain the pH8AG insert whereas GUS-negative plants did (Appendix 3 Fig. 5).

a b

Figure 2. GUS staining of T2 seedlings. Seedlings transformed with pH8AG, lines 4 (A) and 7 (6). T2 seedlings showed high/low GUS expression at 1V1:3 ratios. A single plant showing elevated GUS expression and two with reduced GUS expression are shown from each plant.

Homozygous plants were selected, and lines 4, 5 and 7 were chosen for further infection experiments. Four-five seedlings each from these lines were transferred to soil and infected with P. brassicae spores (Appendix 3 Fig. 6). Visual examination of infection in these Arabidopsis plants showed heavy galling at 30 dpi (Fig. 3). Visible levels of infection, in five separate experiments, were comparable in the pH8AG plants to those seen in the ALC.03 control plants. 49

Figure 3. Infected pH8AG plants from line 4 and 7 at 30 dpi. A. pH8AG line 4; B. pH8AG line 7; C. ALC.03; D. uninfected pH8AG line 4 plant.

Hybridisation to northern blots with a P. brassicae actin gene fragment showed that transcription of P. brassicae actin was clearly detectable in the infected pH8AG hpRNA lines (Fig. 4). Transcription levels were similar to those seen in the control ALC.03 plants. No cross-hybridisation with Arabidopsis actin gene sequences was observed in RNA from uninfected seedlings.

1234567

Figure 4. Detection of P. brassicae actin transcription at 30 dpi. Northern blot of RNA hybridised with actin fragment P2 (Fig. 1). Lanes 1 and 2 = uninfected root RNA; lanes 3-7 30 dpi root gall RNA. Lanes 1. ALC.03; 2. pH8AG line 4; 3. Arabidopsis Columbia wild type; 4. ALC.03; 5. pH8AG line 4; 6. pH8AG line 5; 7. pH8AG line 7. 50

Total RNA from galls on infected Arabidopsis plants was separated on acrylamide gels (Appendix 3 Fig. 7), blotted onto nylon membrane and hybridised with the labelled 153 bp P. brassicae actin fragment. Small RNAs were detected in RNA from uninfected seedlings of pH8AG lines 4, 5 and 7, but were barely detectable in RNA prepared from club root galls of the same lines (Fig. 5).

123405678

,24

Figure 5. siRNAs blot hybridised with 153 bp actin hpRNA fragment. 0 = 24 bp DNA oligonucleotide size marker. Lanes 1-4 = uninfected seedling RNA and 5-8 = 30 dpi root gall RNA. Lane 1. AtC.03; 2. pH8AG line 4; 3. pH8AG line 5; 4. pH8AG line 7; 5. AtC.03; 6. pH8AG line 4; 7. pH8AG line 5; 8. pH8AG line 7.

pHBAG line 4 vs A~.03 pHBAG line 5 vs A~.03 pHBAG line 7 vs A~.03 eDNA name 2-MCT 2-MCT 2-MCT

B dpi 30 dpi B dpi 30 dpi B dpi 30 dpi

P~TUB 1.0 1.0 1.0 1.0 1.0 1.0 (0.63-1.59) (O.93-1.0B) (0.93-1.07) (O.B3-1.21) (0.43-2.31) (O.B9-1.13) Pb\CTI 1.91 1.04 0.84 2.4 3.03 0.99 (1.37-2.03) (0.93-1.24) (0.74-0.95) (2.1-2.73) (1.39-6.59) (O.B7-1.12) Pb\CTII 2.85 1.13 1.34 2.95 3.23 1.14 (2.04-3.43) (1.01-1.1B) (1.27-1.4) (1.2-3.63) (1.77-5.9) (1.0-1.3) Pb\cY" 1.75 0.97 0.24 1.4 0.56 1.05 (1.24-1.B3) (0.72-1.29) (0.1-0.57) (1.22-1.61) (0.31-1.02) (0.97-1.14) AtACTB 3.36 0.65 2.0 1.96 3.56 1.8 (2.27-4.11) (0.59-0.72) (1.69-2.36) (1.69-2.25) (1.93-6.54) (1.61-1.99)

Table 1. Quantitative RT-PCR in pH8AG lines. SYBR green qRT-PCR for the PbACf I and II genes and the Arabidopsis actin 8 gene. PbAcrM represents amplification with the PbACfR1F/R primer pair, that would be expected to amplify from both P. brassicae actin genes. Within each sample, the CT values were calibrated to those for P. brassicae ~-tubulin.

Differences between CT values for pH8AG lines and the control AtC.03 line are represented as MCf fold differences in expression (2- ) Figures in brackets represent the fold difference ± standard deviation. 51

Gene cloning has shown that there are at least two P. brassicae actin genes and possibly several more (Bulman et aI., 2006b). To test the possibility that transcription of the targeted P. brassicae actin genes was indeed reduced, but other actin genes escaped knockdown, we used quantitative RT-PCR. When calibrated to

P. brassicae ~-tubulin, no convincing reductions in transcription of PbACTI or PbACTII were observed in 8dpi primary-infected roots or 30dpi secondary infected galls (Fig. 6).

Discussion

We set-out to test whether plant-expressed siRNAs might be able to down-regulate the transcription of P. brassicae genes. The intimate biotrophic relationship of plasmodiophorids with the host cell and the appearance of phagocytosis-like ingestion in P. brassicae (Buczacki, 1983) both suggest that P. brassicae might be a suitable target for in-p/anta induction of RNA silencing. This was tested in the model plant Arabidopsis, an established host for P. brassicae in which methods for constructing and expressing hpRNA constructs are readily available (Wesley et aI., 2001; Helliwell et aI., 2002).

Knockdown of constitutively expressed GUS, and parallel detection of siRNAs, indicated that PbACT-GUS hpRNA constructs were expressed and converted to siRNAs in the transgenic Arabidopsis plants. Nonetheless, the infected hpRNA­ containing Arabidopsis lines showed no visible differences in secondary stage club root disease symptoms. Analyses of RNA from the hpRNA-containing plants also showed no reductions in transcription of the P. brassicae actin genes at either primary or secondary stages of infection. Together, these experiments indicated that in p/anta expressed siRNAs had no effect on transcription of the P. brassicae actin genes.

There are many reasons why external siRNAs might fail to induce clear RNA knockdown in P. brassicae. The most obvious explanation is that RNA silencing mechanisms do not exist in P. brassicae. Conserved protein components of the RNA silencing machinery, such as Dicer, have been found in all major groups of eukaryotes for which whole genome sequences are available (Cerutti and Casas- 52

Mollano, 2006). However, RNA silencing appears to be absent from many organisms, and has frequently been lost from closely related lineages. For example, RNA silencing is routinely used as a tool to knockdown gene expression in Trypanosoma brucei(Motyka and Englund, 2004), but th€!o RNA silencing machinery is absent from Trypanosoma cruzi. The limited amount of gene sequencing from 8igelowie//a natans and filosa provides no evidence of RNA silencing­ associated genes in Rhizarians.

The ability of RNA silencing signals to cross cell membranes is also poorly understood (May and Plasterk, 2005). Should RNA silencing mechanisms be present in P. brassicae, these still may not be triggered by the external application of siRNAs. Soaking Caenorhabditis elegans in double stranded RNA (Tabara et aI., 1998) or the protists Entamoeba histolytica (Vayssie et aI., 2004) and Acanthamoeba (Lorenzo­ Morales et aI., 2005) in siRNA duplexes, effectively stimulates RNA silencing. However, external application of dsRNAs is not a universal means for stimulating RNA silencing,and uptake of siRNAs is inefficient compared with longer dsRNAs in C elegans (Feinberg and Hunter, 2003).

Finally, it is possible that our failure to detect RNA silencing may have been because of insufficient hpRNA expression. Although the CaMV35S promoter is routinely used to give constitutive plant gene expression, we observed very low or patchy GUS expression driven by the CaMV35S promoter in older Arabidopsis plants. Gel blot analyses showed that P. brassicae actin siRNAs were barely detectable in RNA from infected gall tissue. Elevated siRNA levels might be especially important for the knockdown of a highly transcribed gene like actin. It seems unlikely that RNA silencing was absent solely because of low hpRNA expression, since we detected no differences in the levels of actin gene transcripts in younger plants where expression of both GUS and the actin hpRNA from the CaMV35S promoter appeared strong. The CaMV35S promoter is down-regulated in the feeding structures formed by the nematodes Heterodera schachtiiand Meloidogyne incognita (Goddijn et aI., 1993). As a consequence, researchers aiming to target in planta siRNAs at plant-parasitic nematodes have proposed the use of promoters that display high expression in these feeding cells (Bakhetia et aI., 2005). Use of promoters that are highly expressed in club root galls would improve the chances of observing RNA silencing in P. brassicae. 53

Ultimately, further work with RNA silencing in P. brassicae may only be worthwhile if developments in genome sequencing provide further evidence of RNA silencing mechanisms in the plasmodiophorids or closely related organisms. The recently announced B. natans genome project may reveal new information on the presence of RNA silencing in the Rhizaria. 54

CHAPTER 5. Discussion

Summary of research findiri'gs

At the beginning of this study, suitable gene sequences from P. brassicae for hpRNA assembly were unavailable. 10 identify new gene· sequences, suppression subtractive hybridisation was performed between RNA from P. brassicae-infected and uninfected Arabidopsistissue. Full-length cDNA clones from the infected tissue were also screened. From 537 clones, a total of 76 new P. brassicae gene sequences were located. Full-length gene sequences for most of the P. brassicae genes were obtained by RACE amplification. Many of the unisequences were predicted to contain signal peptides for ER translocation.

In addition to the new cDNA identified here, partial sequences for the P. brassicae actin and TPS genes were published by other researchers close to the beginning of this study. Using PCR-walking, full-length genomic DNA sequences from both genes were obtained. Later, genomic DNA sequences spanning or flanking a total of 24 P. brassicae genes were obtained. The P. brassicae genes were rich in typical eukaryotic spliceosomal introns. Transcription of P. brassicae genes also appears likely to begin from initiator elements rather than TATA-box-containing promoters.

A segment of the P. brassicae actin gene was assembled in hairpin format and transformed into Arabidopsis tha/iana. Observation of simultaneous knockdown of the GUS marker gene as well as detection of siRNAs indicated that these hpRNA sequences induced RNA silencing. However, inoculation of these plants with P. brassicae resulted in heavy club root infection. We were unable to detect decreases in actin gene expression in the infecting P. brassicae, at either early or late stages of infection. We conclude that, within the limits of the techniques used here, there is no evidence for induction of RNA silenCing in P. brassicae by in p/anta produced siRNAs. 55

SSH cDNA cloning techniques

The host plant sequences were easily and unequivocally identified by BLA.STN/ confirming the success of using Arabidopsis as a host. Once the B. rapa genome sequence is released to the public/ these same advantages will be available with Chinese cabbage as a club root host.

Without knowing the initial transcript representation in club root galls, it is impossible to fully assess the effectiveness of SSH for removing plant cDNAs. However, the SSH technique appeared to be successful for enriching P. brassicae cDNAs since only tv13% of the screened clones were from Arabidopsis. Allowing for one highly redundant P. brassicae SSH sequence/ the percentage of Arabidopsis clones was still only tv19%. Based on our prior exploratory studies/ we expected the percentage of ArabidopsiscDNAs to be much higher. Cloning of oligo-capped full-length cDNAs indicated that the percentage of P. brassicae transcripts from galls (the starting material for the SSH) might be as high as 66%. However, this cDNA collection was highly redundant (see next section) and unlikely to closely reflect the true gall transcriptome. After removing the three most redundant full-length clones, 43% of the DC transcripts were from P. brassicae.

The formation of galls during secondary club root infection is driven by pronounced changes in phytohormone expression (Devos et a!., 2005; Ludwig-Mulier et aI./ 1999)/ resulting in significantly different gene expression profiles to those found in typical plant roots (Siemens et aI., 2006). It is more typical to use SSH to isolate differentially expressed genes between closely related RNA sources rather than from such substantially different RNA samples. In this study/ SSH was used to titrate away as many plant genes from the gall RNA as possible. The use of uninfected root RNA for subtraction was successful, but RNA from hormone-induced callus might offer some advantages as an RNA driver/ were this procedure to be repeated.

Aside from accessing a high percentage of pathogens cDNAs/ SSH had some clear shortcomings for the identification of P. brassicae genes. The SSH clones fell within a range of small sizes and many derived from the 3' end of genes. This often made it difficult to assign putative functions to the sequences using database comparisons/ or even to be confident of the direction of transcription. SSH would likely be unsuitable for larger scale P. brassicae gene identification. With respect to the 56

requirements of this study, even where putative identities could confidently be assigned to the sequences, the clones were too short to accommodate a region for hpRNA design as well as another downstream region for RT-PCR or northern probing. Altogether, techniques of gaining further sequence information were required to complement the SSH findings.

Full-length cDNA cloning techniques

To obtain full-length cDNA sequences of genes identified from the SSH collection, we performed RACE amplification. After improvements to the PCR steps, the oligo­ capped RACE technique proved highly successful. We were able to retrieve full­ length gene sequences from almost all of the SSH clones chosen for RACE. We also cloned and sequenced a selection of random full-length cDNAs from the oligo-capped template. This collection of tv300 cDNAs contained a higher than expected percentage of P. brassicae cDNAs, some of which were from the same genes as the SSH clones. As such, the OC collection complemented the SSH clone collection.

The selection of OC cDNAs was highly redundant, necessitating PCR pre-screening steps to avoid repeatedly sequencing copies of three especially redundant genes. Even with this pre-screening, later batches of cDNAs contained less and less sequences from new genes. Thus, the usefulness of sequencing a larger number of full-length clones from this collection was limited. Using qRT-PCR, we compared the abundance of transcripts in the OC library versus their abundance in 1st strand cDNA from the gall RNA source (Appendix 1 Quantitative RT-PCR analysis of P. brassicae gene expression). This showed that the OC library did not contain a representative selection of transcripts. This was likely caused by the use of PCR amplification during library construction. Thus, the higher than expected ratio of P. brassicae transcripts may not be an accurate reflection of the club root transcriptome. The cDNA clones also had a short length compared with the average length of cDNAs (e.g. tv1.4 kb in Candida a/bicans) (Braun et aI., 2005).

Large (or even medium) scale gene identification was not a central aim of the study, . so the number of cDNAs sequenced was relatively small compared with typical EST studies. For example, recent publications have described tv1600 EST sequences 57 from Reticu/omyxa ff/osa/ a representative of the little studied Rhizaria (Burki and Pawlowski/ 2006)/ to "'75 000 EST sequences from Phytophthora infestans(Randall et aI./ 2005). By comparison/ the number of ESTs located in studies from biotrophic pathogens have always been much smaller (Bittner-Eddy et aI./ 2003; Thara et aI./ 2003; Broeker et aI./ 2006). Importantly/ our ability to sequence a larger number of P. brassicae cDNA clones was circumscribed by the high redundancy in both the SSH and oligo-capped cDNA collections.

The future of P. brassicae cDNA cloning

The findings presented here/ in particular the greater than expected percentage of P. brassicae transcripts in RNA from club root galls/ have some implications for future cDNA identification. With continuing decreases in sequencing costs/ construction of a typical cDNA library from club root galls and sequencing of random cDNAs may be a cost effective means to identify new P. brassicae genes. The eventual availability of the Brassica rapa genome (http://www.brassica-rapa.org/) will make it easier to work with Chinese cabbage/ a favoured P. brassicae host that provides larger amounts of material from each developmental stage than is available from Arabidopsis. SSH appears to be a more suitable tool for identifying genes expressed at a particular stage of P. brassicae infection/ rather than for concerted gene identification. For example/ it would be highly interesting to identify genes expressed during primary infection by subtracting uninfected root cDNAs from P. brassicae-infected root cDNAs at 7-10 dpi. At this time/ the amount of P. brassicae tissue in the roots would be quite low/ and the background levels of plant gene transcripts would be similar.

Genomic sequences from P. brassicae by peR-walking

In parallel with our use of SSH to identify new P. brassicae genes/ we sought to extend the lengths of the actin (Archibald and Keeling/ 2004) and TPS (Brodmann et aI./ 2002) gene sequences so that they might be used as targets for RNA silencing. A P. brassicae ubiquitin gene sequence was also published (Archibald and Keeling/ 2004)/ but this appeared to be an unsuitable target for RNA silencing. The ubiquitin coding sequence was very highly conserved at the nucleotide level/ meaning that in 58 p/anta expression of P. brassicae-ubiquitin hpRNAs would probably also result in knockdown of the endogenous Arabidopsis ubiquitin genes. The work on the genomic sequences of the TPS and actin genes was performed in parallel with the SSH and DC cDNA cloning approaches, since it was unclear if one or all of these techniques would be successful.

We chose PCR-walking techniques to obtain new genomic sequences, because, based on previous experience, these techniques are well suited to working with DNA samples from mixtures of organisms. PCR-walking indeed proved to be successful for retrieving P. brassicae DNA sequences from gall DNA, obviating the need for long-winded construction of typical genomic DNA libraries.

Multiple P. brassicae actin genes-

Two actin genes were sequenced from the New Zealand P. brassicae isolate and three actin genes were previously identified in a Japanese isolate (Archibald and Keeling,2004). None of the actin sequences appeared to be from the same gene, with differences occuring at 38 nucleotide positions over the five sequences. All but 3 of the differences were at 3rd base position of codons. The presence of these divergent sequences suggests that P. brassicae might contain at least five actin genes, or that the New Zealand and Japanese isolates are highly divergent at the nucleotide level. Five and six actin genes are respectively found in the genomes of the oomycete Lagenidium giganteum and the unicellular alga Emi/iana hux/eyi, so the presence of multiple actin genes in a protist would not be unusual (Bhattacharya et aI., 1993; Bhattacharya and Stickel, 1994).

The divergent actin sequences might also arise from genetically distinct single spores within the P. brassicae field isolate. P. brassicae samples from the field are highly variable, each consisting of races that exhibit differing pathogenicity to hosts with different resistance genes (Buczacki et aI., 1975; Some et aI., 1996). Multiple pathotypes are found within a field and among single spore isolates from the same gall (Some et aI., 1996; Manzanares-Dauleux et aI., 2001). The basis for this inherent variabilty is not understood, but experiments with RAPD DNA markers showed a high degree of genetiC variability between single spore isolates from the same field (Manzanares-Dauleux et aI., 2001). Clearly it is possible that this genetiC 59 variation would extend to coding sequences including the actin genes. The number of actin genes in the New Zealand isolate could be illuminated by screening more RACE clones, or, more especially, by Southern hybridisation. The genes identified in this study will provide highly useful targets for comparing genetic diversity in P. brassicae isolates at the local and global level. The isolation of single spore isolates will be necessary to answer these questions.

New details of the P. brassicae genome

A PCR walk from the 5' end of the actin gene yielded a genomic DNA fragment that contained both the actin promoter and a new P. brassicae gene. Spurred by this discovery, we used PCR walking to obtain DNA sequence beyond the 5' ends of a selection of P. brassicae genes. This offered the chance to identify new genes and to characterise several P. brassicae promoters. These promoters were to be used as tools elsewhere in the project.

Despite it's small scope, this survey of genomic DNA sequences gave surprising new details of P. brassicae genome structure. The principal new findings were that P. brassicae genes are surpisingly intron-rich and that P. brassicae genes are likely to start transcription from initiator elements (discussed in Chapter 3). There was some evidence from the close spacing of several genes to suggest that the P. brassicae genome is relatively compact, but several DNA fragments up to 2.2 kb that contained no obvious genes were also sequenced. Since only BLASD< versus the GenBank non-redundant database were used to search for genes, the presence or absence of genes in these DNA fragments might be confirmed by northern blotting of gall RNA. The likelihood of a reduced genome size provides encouragment for the future of P. brassicae genomics, since sequencing of larger genomic DNA clones would reveal many new P. brassicae genes. Such clones could easily be identified from a mixed plant-P. brassicae genomic DNA library by blotting.

Evidence for RNA silencing machinery in plasmodiophorids? 60

It was tempting to think that the isolation of P. brassicae genomic DNA sequences might cast lighton the presence of RNA silencing in P. brassicae. For example, since RNA silencing protects the genome against transposon replication, detection of intact mobile elements could indicate a continuing requirement for RNA silencing by P. brassicae. No transposable elements were detected in our survey, although the chance of finding such elements in this manner was low. A more direct means to find evidence for RNA silencing was also attempted, using degenerate PCR to amplify Dicer-like proteins from plasmodiophorids (Appendix 2 Degenerate PCR for Dicer proteins). This technique led to the identification of a 8. rapa DCL-1 fragment and another from the aphid, Myzuspersicae. However, no putative plasmodiophorid Dicer sequences were detected. Given that the Dicer-like gene sequences are known to be poorly conserved across eukaryotes (Cerutti and Casas-Mollano, 2006), and the host plant sequences were possibly preferentially amplified, these findings do not show the absence of Dicer-like genes in P. brassicae. Ultimately, a much greater exploration of plasmodiophorid genomes will be required before the presence or absence of RNA silencing machinery in P. brassicae is confirmed.

Assessment of P. brassicae infection and choice of first hpRNA target

Based on the limited means for assessing P. brassicae infection, a lethal phenotype was regarded as an important criteria for our chosen hpRNA targets. Once plants are exposed to P. brassicae spores, there are two typical points for infection assessment. Firstly, at 7-10 dpi, primary infection can be assessed by microscopically searching stained roots for root hair infection. Then, after at least 23 dpi, secondary infection can be assessed by the degree of root galling. This is the most reliable point for assessing infection. Since the distribution of primary root infection on roots is patchy, and the size of galls are dependent on plant vigour, applying quantification to measurements of P. brassicae infection is extremely difficult. Methods for quantifying P. brassicae infection of Arabidopsis have been extensively tested and reviewed by Siemens et aI., (2002), but such measurements remain less than 100% reliable. Greatly reduced gall formation/infection would be the ideal marker of hpRNA activity. 61

The first full-length P. brassicae cDNAs obtained in this work were the TPS and actin sequences. The TPS gene is involved in synthesis of trehalose, a disaccharide produced by microbes in response to various environmental stresses (Thevelein, 1984). Although trehalose is difficult to detect in Arabidopsis, trehalose metabolism appears to be complex as reflected by the presence of as many as 11 Arabidopsis TPS genes (Leyman et aI., 2001). Although TPS gene deletion resulted in much reduced plant pathogenicity in the fungal pathogen Magnaporthe grisea (Foster et aI., 2003), it seemed unlikely that knockdown of the TPS gene would prevent P. brassicae growth. RT-PCR reactions also indicated that the TPS gene was not highly transcribed, meaning that measuring knockdown would be more difficult than in a highly expressed gene.

By contrast, disruption of the actin gene seemed likely to be detrimental to P. brassicae survival. Actin is an important cytoskeleton protein generally considered to be essential for cell function. RNAi knockdown of actin has been shown to be lethal in both Trypanosoma brucei (Gam' a-Salcedo et aI., 2004) and C e/egans (Kamath et aI., 2003). Consequently, from the two initial P. brassicae genes, we chose to create hpRNAs from the actin sequence.

Design of actin hpRNA construct

It was desirable to place the hpRNA at the 5' end of the actin gene, allowing for the design of qRT-PCR primers in the downstream part of the gene. We also sought to avoid off-target knockdown of Arabidopsisgenes through sequence homology. Off­ target RNA silencing effects have been a major concern in animals, although these appeared to be less of a problem in plants (Watson et aI., 2005). Comparison with the Arabidopsis genome showed that nucleotides 150-650 of the P. brassicae actin gene had high similarity to the Arabidopsis actin genes, especially Arabidopsis actin 12 (At3g46520). Although no matches of 20 or more contiguous bases were present, to avoid (most of) this region, the hpRNA was designed from nucleotides 59-212 of the P. brassicae actin gene. This segment had a maximum match of 15 contiguous nucleotides to any sequence in the Arabidopsis genome. A similar strategy was followed for the ~-glucuronidase (GUS) gene, choosing a sequence with a highest match of 17 contiguous nucleotides to the Arabidopsisgenome. 62

The hpRNAs were assembled in the pHelisgate vector system that has been widely adopted for RNA silencing in Arabidopsis (Helliwell et a/., 2002).

Assessment of hpRNA expression

As with all transgenic plants, expression of introduced hpRNAs shows considerable variability between plant lines. Had we measured P. brassicae gene expression in randomly chosen hpRNA-plants and found it unchanged, this might simply have reflected poor hpRNA expression, rather than a failure to induce RNA silencing in Plasmodiophora. Therefore, to gain an indication of hpRNA expression levels prior to infection assays, we utilised the ability of chimeric hpRNAs to simultaneously inactivate more than one gene. Chimeric hpRNAs have been used to select Arabidopsis plants with highly expressed hpRNAs by Helliwell et a/. (2002). The chalcone synthase (CHS) gene plus another gene Were targeted, and successful knockdown was determined by reduced seed pigmentation (Helliwell et a/., 2002). Since we had no experience with the CHS technique, we instead decided to use knockdown of the visible markers GUS and GFP to select for active hpRNAs. Knockdown of GFP has frequently been used in studies of RNA silencing mechanisms (e.g. Dunoyer et a/., 2006). In our actin-GUS hpRNA lines, GUS knockdown appeared to be an excellent indicator of hpRNA expression. In the majority of lines, greatly reduced GUS expression was seen to co-segregate with the hpRNA transgenes. Seedlings from one kanamycin resistant line showed no GUS knockdown, and this line was subsequently found not to contain the hpRNA part of the transgene.

Initially, we hoped to observe GFP expression in roots as an indicator of knockdown. However, GFP expression in plant roots did not always correlate with the presence of transgenes or with subsequent assessment of GUS expression. Expression of the mgfpS gene has been reported to be weak in some plants, and may be improved by using other forms of the GFP gene (http://www.cambia.org/daisy/cambia/home.html). 63

In planta siRNAs showed no effect on P. brassicae infection

At the end of this study, we were unable to find evidence that in p/anta expressed hpRNA transgenes versus the P. brassicae actin gene had any effect on club root infection. By measuring reporter gene knockdown and the presence of siRNAs, we are confident that P. brassicae was exposed to siRNAs homologous to the P. brassicae actin gene during infection. However, we saw no differences in disease expression nor measured any decrease in the level of overall actin transcripts at the primary or secondary infection stages. Using qRT-PCR, we observed no specific down-regulation of the two actin sequences that were targeted in this New Zealand P. brassicae field isolate. Thus, even if the continued growth of P. brassicae in these lines were due to the presence of other divergent actin sequences, we did not succeed in reducing transcripts of the two targeted genes.

Why might RNA silencing have failed to be induced?

Clearly there are many points at which in p/anta dsRNA might fail to induce silenCing in P. brassicae. Firstly, the RNA silencing machinery may simply be absent in P. brassicae. RNA silencing has been induced in few protiSts, and there is no evidence for either ifs presence or absence in the plasmodiophorids.

Alternatively, should P. brassicae be capable of mounting an RNA silencing defence, the in p/anta expressed siRNAs may not induce this response. The phylogenetic distribution of RNA silencing induction by external exposure to dsRNAs is unknown. Observations of systemic silencing spread have been confined to a few organisms, and the mechanisms that mediate this spread remain incompletely understood (May and Plasterk, 2005). Interestingly, in C e/egans, in which spread of RNA silencing has been best characterised, SID! has a preference for longer dsRNAs and imports siRNAs rather poorly (Feinberg and Hunter, 2003). Thus, induction of RNA silenCing by in-p/anta hpRNAs might be ineffective against SID!-containing organisms because of the rapid conversion of dsRNA to siRNAs by the plant DCl enzymes. Even in triple 64 del2. de8 de/4 Dicer mutant Arabidopsis plants, the remaining DCLl is capable of producing siRNAs from inverted repeats (Henderson et aI., 2006; Deleris et aI., 2006). Collectively, RNA silencing by external dsRNA application is continuing to be empirically tested in a variety of organisms. To date, this has met with only sporadic success.

Another factor to consider is the effect of single-spore variability in P. brassieae. As discussed earlier, single-spore lineages from P. brassieae field isolates exhibit substantial differences in pathogenicity and genetic makeup. It seems quite possible that some single-spore isolates have different RNA silencing responses and escape the effects of the hpRNA constructs. The isolation and use of single-spore P. brassieae isolates may be a pre-requisite before further studies of this kind are carried out.

Future research

During this project, we shifted from our initial strategy of testing hpRNAs versus several P. brassieae genes, to more exhaustively testing the effect of a single hpRNA. This was to accomodate the introduction of internal controls for the effectiveness of hpRNA expression. Since not all genes appear equally susceptible to RNA silencing, testing hpRNAs versus other P. brassicae genes would be valuable to confirm the results with the P. brassieae actin sequence. We have recently assembled a new hpRNA construct targeting the P. brassieae MOB gene (AM180218). T1 Arabidopsis seedlings containing these constructs are now being grown.

The CaMV35S promoter has been shown to be down-regulated in the syncytia (feeding sites) of plant pathogenic nematodes (Goddijn et aI., 1993). As a consequence, considerable effort has been expended in identifying new plant promoters that are up-regulated in syncytia (Barthels et aI., 1997). It has been suggested that these promoters be used for nematode-targeted hpRNA expression (Rosso et aI., 2005). To see if the CaMV35S promoter is also down-regulated in club root galls, we inspected GUS expression in galls on the ALC.03 line. Expression in the galls was patchy (Appendix 3 Reporter-gene expression in P. brassieae galls. Fig. 8). There was also good evidence of GUS expression in uninfected roots but reduced, 65

patchy expression in adjoining gall tissue. Because there was some evidence of age­ related decline in GUS expression in the ALC.03 line, we cannot yet state with certainty that CaMV35S expression is downregulated by P. brassicae infection. Experiments are continuing in this area.

In unpublished research, we have identified two promoters that are strongly up­ regulated in club root galls (Appendix 3 Fig. 8). These promoters could be used to drive high hpRNA (or other transgene) expression in club root galls. Overall, however, further attempts to apply dsRNA to P. brassicae will only be worthwhile if plasmodiophorid genome sequencing reveals evidence of plasmodiophorid RNA silencing pathways. Such experiments would also be facilitated by development of . better P. brassicae infection assays and production of single spore isolates.

Beyond the study of RNA silenCing, this work provides a comparatively large amount of starting material for studying the P. brassicae-Arabidopsis interaction, and the evolution of the plasmodiophorids. 66

Literature

Adl SM, Simpson AGB, Farmer MA, Andersen RA, Anderson OR, Barta JR, Browser 55, Brugerolle G, Fensome RA, Fredericq S, et aI., (2005) The new higher level classification of eukaryotes with emphasis on the of protists. J Eukaryot Microbiol 52: 399-451.

Aist J and Williams P (1971) The cytology and kinetics of cabbage root hair penetration by Plasmodiophora brassicae. Can J Bot 49: 2023-2034.

Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller Wand Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389-3402.

Ando 5, Yamada T, Asano T, Kamachi 5, Tsushima 5, Hagio T and Tabei Y (2006a)

'~." Molecular cloning of PbSTKLl gene from Plasmodiophora brassicae expressed during club root development. J Phytopathol 154: 185-189.

Ando 5, Tsushima 5, Tagiri A, Kamachi 5, Konagaya K-I, Hagio T and Tabei Y (2006b) Increase in BrA01 gene expression and aldehyde oxidase activity during development in Chinese cabbage (Brassica rapa L.). Mol Plant Pathol 7: 223-234.

Asano T and Kageyama K (2006) Growth and movement of secondary plasmodia of Plasmodiophora brassicae in turnip suspension-culture cells. Plant Pathol 55: 145- 151.

Archibald JM, Longet D, Pawlowski J and Keeling P (2003) A novel polyubiquitin structure in Cercozoa and Formanifera: evidence for a new eukaryotic supergroup. Mol Bioi Evol 20: 62-66.

Archibald JM and Keeling PJ (2004) Actin and ubiquitin protein sequences support a cercozoanjforaminiferan ancestry for the plasmodiophorid plant pathogens. J Eukaryot Microbiol 51: 113-118. 67

Archibald JM, O'Kelly 0 and Doolittle WF (2002) The chaperonin genes of jakobid and jakobid-like : Implications for eukaryotic evolution. Mol Bioi Evol 19: 422-431.

Bakhetia M, Charlton WL, Urwin PE, McPherson MJ and Atkinson HJ (2005) RNA interference and plant parasitic nematodes. Trends Plant Sci 10: 362-367.

Bannai H, Tamada Y, Maruyama 0, Nakai K and Miyano S (2002) Extensive feature detection of N-terminal protein sorting signals. Bioinformatics 18: 298-305.

Barthels N, van der Lee FM, Klap J, Goddijn OJM, Karimi M, Puzio P, Grundler FMW, Ohl SA, Lindsey K, Robertson L, Robertson WM, Van Montagua M, Gheysen G and Sijmons PC (1997) Regulatory sequences of Arabidopsisdrive reporter gene expression in nematode feeding structures. Plant Cell 9: 2119-2134.

Bass D, Moreira D, Lopez-Garcia P, Polet S, Chao EE, von der Heyden S, Pawlowski J, and Cavalier-Smith T (2005) Polyubiquitin insertions and the phylogeny of Cercozoa and Rhizaria. Protist 156: 149-161.

Bateman A and Rawlings ND (2003) The CHAP domain: a large family of amidases including GSP amidase and peptidoglycan hydrolases. Trends Biochem Sci 28: 234- 237.

Bendahmane A, Querci M, Kanyuka K and Baulcombe D (2000) Agrobacterium transient expression system as a tool for the isolation of disease resistance genes: application to the Rx2 locus in potato. Plant J 21: 73-81.

Bendtsen JD, Nielsen H, von Heijne G and Brunak S (2004) Improved prediction of signal peptides: SignalP 3.0. J Mol Bioi 340: 783-795.

Berglund JA, Chua K, Abovich N, Reed Rand Rosbash M (1997) The splicing factor BBP interacts specifically with the pre-mRNA branchpoint sequence UACUAAC. Cell 89: 781-787. 68

Bernstein E, Caudy A, Hammond S and Hannon G (2001) Role for a bidentate ribonuclease in the initiation step of RNA interference. Nature 409: 363-366.

Bhattacharya D, Stickel SK and Sogin ML (1993) Isolation and molecular phylogenetic analysis of actin coding regions from Emiliania huxleyi, a prymnesiophyte alga, by reverse-transcriptase and PCR methods. Mol Bioi Evol 10: 689-703.

Bhattacharya D and Stickel SK (1994) Sequence analysis of duplicated actin genes in Lagenidium giganteum and Pythium irregulare (Oomycota). J Mol Evol 39: 56-61.

Bittner-Eddy PD, Allen RL, Rehmany AP, Birch P and Beynon JL (2003) Use of suppression subtractive hybridization to identify downy mildew genes expressed during infection of Arabidopsis thaliana. Mol Plant Pathol 4: 501-507.

Borsani 0, Zhu JH, Verslues PE, Sunkar Rand Zhu JK (2005) Endogenous siRNAs derived from a pair of natural cis-antisense transcripts regulate salt tolerance in Arabidopsis. Cell 123: 1279-1291.

Boutla A, Kalantidis K, Tavernarakis N, Tsagris M and Tabler M (2002) Induction of RNA interference in Caenorhabditis elegans by RNAs derived from plants exhibiting post-transcriptional gene silencing. Nucleic Acids Res 30: 1688-1694.

Boutros M, Kiger AA, Armknecht S, Kerr K, Hild M, Koch B, Haas SA, Consortium HF, Paro Rand Perrimon N (2004) Genome-wide RNAi analysis of growth and viability in Drosophila cells. Science 303: 832-835.

Braselton JP (1995) Current status of the plasmodiophorids. Crit Rev Microbiol 21: 263-275.

Braselton JP (2001) Plasmodiophoromycota. In: The Mycota VII, Part A, Systematics and Evolution. McLaughlin OJ, McLaughlin EG and Lemke PA (Eds.), Springer-Verlag, Berlin-Heidelberg, pp 81-91. 69

Braun BR, van Het Hoog M, d'Enfert C, Martchenko M, Dungan J, Kuo A, Inglis DO, Uhl MA, Hogues H, Berriman M et a/., (2005) A human-curated annotation of the Candida albicansgenome. PLoS Genet 1: 36-57.

Brodersen P and Voinnet 0 (2006) The diversity of RNA silencing pathways in plants. Trends Genet 22: 268-280.

Brodmann A, Schuller A, Ludwig-Mulier J, Aeschbacher RA, Wiemken A, Boller T and Wingler A (2002) Induction of trehalase in Arabidopsis plants infected with the trehalose-producing pathogen Plasmodiophorabrassicae. Mol Plant Microbe In 15: 693-700.

Broeker K, Bernard F and Moerschbacher BM (2006) An EST library from Puccinia graminisf. sp. triticireveals genes potentially involved in fungal differentiation. FEMS Micro Lett 256: 273-281.

Bucher G, Scholten J and Klingler M (2002) Parental RNAi in Tribolium (Coleoptera). Curr Bioi 12: R85-R86.

Buczacki S, Toxopeus H, Mattusch P, Johnston T, Dixon G and Hobolth L (1975) Study of physiologic specialisation in Plasmodiophora brassicae: proposals for attempted rationalisation through an international approach. Trans Br Mycol Soc 65: 295-303.

Buczacki ST (1983) Plasmodiophora. In: Zoosporic Plant Pathogens. Buczacki ST Ed., London-New York, Academic Press, pp 161-191.

Bulman SR, Johannes Siemens J, Ridgway HJ, Eady C and Conner AJ (2006a) Identification of genes from the obligate intracellular plant pathogen, Plasmodiophora brassicae. FEMS Micro Lett (in press).

Bulman SR, Ridgway HJ, Eady C and Conner AJ (2006b) Genomic DNA sequences from Plasmodiophora brassicae. (in prep!). 70

Bulman SR, Ridgway HJ, Eady C and Conner Al (2006c) Testing plant-derived RNA silencing against the intracellular pathogen Plasmodiophora brassicae. (in prep!).

Bulman SR, Kuhn SF, Marshall JW and Schnepf E (2001) A phylogenetic analysis of the SSU rRNA from members of the Plasmodiophorida and Phagomyxida. Protist 152: 43-51.

Bulman SR and Marshall JW (1998) Detection of Spongospora subterranea in potato tuber lesions using the polymerase chain reaction (PCR). Plant Path 47: 759-766.

Burki F and Pawlowski J (2006) Monophyly of Rhizaria and Multigene Phylogeny of Unicellular Bikonts. Mol Bioi Evol, eFIRST date: 07 JUL 2006

Carrington JC and Ambros V (2003) Role of microRNAs in plant and animal development. Science 301: 336-338.

Castlebury LA and Domier LL (1998) Small subunit ribosomal RNA gene phylogeny of Plasmodiophora brassicae. Mycologia 90: 102-107.

Cavalier-Smith T and Chao EE (1997) Sarcomonad ribosomal RNA sequences, rhizopod phylogeny, and the origin of euglyphid amoebae. Arch Protist 147: 227- 236.

Cavalier-Smith T and Chao EE (2003) Phylogeny and classification of phylum Cercozoa (Protozoa). Protist 154: 341-358.

Cerutti Hand Casas-Mollano J (2006) On the origin and functions of RNA-mediated silencing: from protists to man. Curr Genetics 50: 81-99.

Chuang CF and Meyerowitz EM (2000) Specific and heritable genetic interference by double-stranded RNA in Arabidopsis thaliana. Proc Natl Acad Sci USA 97: 4985-4990.

Chen Q, Rehman S, Smant G and Jones JT (2005) Functional analysis of pathogenicity proteins of the potato cyst nematode Globodera rostochiensis using RNAi. Mol Plant Microbe In 18: 621-625. 71

Claxton J, Potter U, Blakesely D and Clarkson J (1996) An ultrastructural study of the interaction between Spongospora subterraneaf.sp. nasturtiiand watercress roots. Mycol Res 100: 1431-1439.

Clay C and Walsh J (1997) Spongospora subterranea f. sp. nasturtii, ultrastructure of the plasmodial-host interface,· food vacuoles, flagellar· apparatus and exit pores. Mycol Res 101: 737-744.

Clough SJ and Bent AF (1998) Floral dip: a simplified method for Agrobacteriurrr mediated transformation of Arabidopsis thaliana. Plant J 16: 735-743.

Cogoni C and Macino G (1997) Isolation of quelling-defective (qde) mutants impaired in posttranscriptional transgene-induced gene silencing in Neurospora crassa. ProcNatl Acad Sci USA 94: 10233-10238.

Cogoni C and Macino G (1999) Gene silencing in Neurospora crassa requires a protein homologous to RNA-dependent RNA polymerase. Nature 399: 166-169.

Crooks GE, Hon G, Chandonia JM, and Brenner SE (2004) WebLogo: A sequence logo generator. Genome Research 14: 1188-1190.

Crute I (1986) The relationship between Plasmodiophora brassicae and its hosts: The application of concepts relating to variation in inter-organismal associations. In: Advances in Plant Pathology. Academic Press, London, pp 1-52.

Davies GJ, Gloster TM and Henrissat B (2005) Recent structural insights into the expanding world of carbohydrate-active enzymes. Curr Opin Struct Bioi 15: 637-645.

Deleris A, Gallego-Bartolome J, Bao JS, Kasschau KD, Carrington JC and Voinnet 0 (2006) Hierarchical action and inhibition of plant Dicer-like proteins in antiviral defense. Science 313: 68-71.

Deutsch M and Long M (1999) Intron-exon structures of eukaryotic model organisms. Nucleic Acids Res 27: 3219-3228. 72

Devos S, Vissenberg K, Verbelen JP and Prinsen E (2005) Infection of Chinese cabbage by Plasmodiophora brassicae leads to a stimulation of plant growth: impacts on cell wall metabolism and hormone balance. New Phytol 166: 241-250.

Diatchenko L, Lau YFC, Campbell AP, Chenchik A, Moqadam F, Huang B, Lukyanov S, Lukyanov K, Gurskaya N, Sverdlov ED and Siebert PD (1996) Suppression subtractive hybridization: a method for generating differentially regulated or tissue­ specific cDNA probes and libraries. Proc Natl Acad Sci USA 93:6025-6030.

Di Nicola-Negri E, Brunetti A, Tavazza M and Ilardi V (2005) Hairpin RNA-mediated silencing of Plum pox virus Pi and HC-Pro genes for efficient and predictable resistance to the virus. Transgenic Res 14: 989-994.

Dong Y and Friedrich M (2005) Nymphal RNAi: systemic RNAi mediated gene knockdown in juvenile grasshopper. BMC Biotechnol 5: - OCT 32005

Dunoyer, P, Himber, C and Voinnet, a (2006) Induction, suppression and requirement of RNA silencing pathways in virulent Agrobacterium tumefaciens infections. Nat Genet 38: 258-263.

Dylewski DP (1990) Phylum .Plasmodiophoromycota. In: Handbook of Protoctista. Margulis L, Corliss JO, Melkonian M, Chapman OJ (Eds.), Jones and Bartlett Pub!', Boston, pp 399-416.

Engleberg NC, Carter C, Weber DR, Cianciotto NP and Eisenstein BI (1989) DNA sequence of mip, a Legione//a pneumophila gene associated with macrophage infectivity. Infect Immun 57: 1263-1270.

Escobar MA, Civerolo EL, Polito VS, Pinney KA and Dandekar AM (2003) Characterization of oncogene-silenced transgenic plants: implications for Agrobacterium biology and post-transcriptional gene silencing. Mol Plant Pathol 4: 57-65. 73

Escobar MA, Civerolo EL, Summerfelt KR and Dandekar AM (2001) RNAi-mediated oncogene silencing confers resistance to crown gall tumorigenesis. Proc Natl Acad Sci USA 98: 13437-13442.

Falloon RE, Nott HM, Cheah L-H and Wright PJ (1997) Control of clubroot of vegetable brassicas. CropInfo Report No. 323. Christchurch, New Zealand Institute for Crop & Food Research Ltd.

Feinberg E and HUnter C (2003) Transport of dsRNA into cells by the transmembrane protein SID-1. Science 301: 1545-1547.

Fire A, Xu S, Montgomery M, Kostas S, Driver S and Mello C (1998) Potent and specific genetic interference by double-stranded RNA in Caenorhabditis e/egans. Nature 391: 806-811.

Foster A, Jenkinson J and Talbot N (2003) Trehalose synthesis and metabolism are required at different stages of plant infection by Magnaporthe grisea. EMBO J 22: 225-235.

Fritz JH, Girardin SE and Philpott DJ (2006) Innate immune defense through RNA interference. Sci STKE (339): pe27.

Garcia-Salcedo JA, Perez-Morga D, Gijon P, Dilbeck V, Pays E and Nolan DP (2004) A differential role for actin during the life cycle of Trypanosoma brucei. EMBO J 23: 780-789.

Goddijn OJ, Lindsey K, van der Lee FM, Klap JC and Sijmons PC (1993) Differential gene expression in nematode-induced feeding structures of transgenic plants harbouring promoter-gusA fusion constructs. Plant J 4: 863-873.

Graf H, Fahling M and Siemens J (2004) Chromosome polymorphism of the obligate biotrophic parasite P/asmodiophora brassicae. J Phytopathol 152: 86-91.

Guindon Sand Gascuel 0 (2003) A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. SystematiC Biology 52: 696-704 74

Guo HS, Xie Q, Fei JF and Chua NH (2005) miRNA microRNA directs mRNA cleavage of the transcription factor NAC1 to downregulate auxin signals for Arabidopsis lateral root development. Plant Cell 17: 1376-1386.

Hamilton AJ and Baulcombe DC (1999) A species of small antisense RNA in posttranscriptional gene silencing in plants. Science 286: 950-952.

Hamilton A, Voinnet 0, Chappell Land Baulcombe D (2002) Two classes of short interfering RNA in RNA silencing. EMBO J 21: 4671-4679.

Helliwell CA,Wesley SV, Wielopolska AJ and Waterhouse PM (2002) High­ throughput vectors for efficient gene silencing in plants. Funct Plant Bioi 29: 1217- 1225.

Henderson IR, Zhang XY, Lu C, Johnson L, Meyers BC, Green PJ and Jacobsen SE (2006) Dissecting Arabidopsis thaliana DICER function in small RNA processing, gene silencing and DNA methylation patterning. Nat Genet 38: 721-725.

Himber C, Dunoyer P, Moissiard G, Ritzenthaler C and Voinnet 0 (2003) Transitivity­ dependent and -independent cell-to-cell movement of RNA silenCing. EMBO J 22: 4523-4533.

Huang X and Madan A (1999) CAP3: A DNA sequence assembly program. Genome Res 9: 868-877.

Ingram IC (1969) Growth of Plasmodiophora brassicae in host callus. J Gen Micro 55: 9-18.

Jeffares DC, Mourier T and Penny D (2006) The biology of intron gain and loss. Trends Genet 22: 16-22.

Jones-Rhoades MW and Bartel DP (2004) Computational identification of plant microRNAs and their targets, including a stress-induced miRNA. Mol Cell 14: 787- 799. 75

Jefferson RA, Kavanagh TA and Bevan MW (1987) GUS fusions: B -glucuronidase as a sensitive and versatile gene fusion marker in higher plants. EMBO J 6: 3901-3907

Kamath RS, Fraser AG, Dong Y, Poulin G, Durbin R, Gotta M, Kanapin A, Le Bot N, Moreno S, Sohrmann M, Welchman DP, Zipperlen P and Ahringer J (2003) Systematic functional analysis of the Caenorhabditis elegansgenome using RNAi. Nature 421: 231-237.

Kamoun S (2003) Molecular genetics of pathogenic oomycetes. Eukaryot Cell 2: 191-199.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, Delbac F, EI Alaoui H, Peyret P, Saurin W, Gouy M, Weissenbach J and Vivares CP (2001) Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414: 450- 453.

Keeling PJ, Burger G, Durnford DG, Lang BF, Lee RW, Pearlman RE, Roger AJ and Gray MW (2005) The tree of eukaryotes. Trends Ecol Evol 20: 670-676

Keskin B and Fuchs WH (1969) Der infektions vorgang bei Polymyxa betae. Arch Mikrobiol 68: 218-226.

Kim VN and Nam JW (2006) Genomics of microRNA. Trends Genet 22: 165-173.

Kim JK, Gabel HW, Kamath RS, Tewari M, Pasquinelli A, Rual JF, Kennedy S, Dybbs M, Bertin N, Kaplan JM, Vidal M, Ruvkun G (2005) Functional genomic analysis of RNA interference in C elegans. Science 308: 1164-1167.

Klewer A, LuerBen H, Graf H and Siemens J (2001) Restriction fragment length polymorphism markers to characterize Plasmodiophora brassicae single-spore isolates with different virulence patterns. J Phytopathol 149: 121-127. 76

Kobelt P, Siemens J and Sacristan MD (2000) Histological characterisation of the incompatible interaction between Arabidopsis thaliana and the obligate biotrophic pathogen Plasmodiophora brassicae. Mycol Res 104: 220-225.

Koncz C and Schell J (1986) The promoter of tl-dna gene 5 controls the tissue­ specific expression of chimeric genes carried by a novel type of Agrobacterium binary vector. Mol Gen Genet 204: 383-396.

Kupfer DM, Drabenstot SD, Buchanan KL, Lai H, Zhu H, Dyer DW, Roe BA and Murphy JW (2004) Introns and splicing elements of five diverse fungi. Eukaryot Cell 3: 1088-1100.

Leyman B, Van Dijck P and Thevelein JM (2001) An unexpected plethora of trehalose biosynthesis genes in Arabidopsis thaliana. Trends Plant Sci 6: 510-513.

Liu J, Carmell MA, Rivas FV, Marsden CG, Thomson JM, Song JJ, Hammond SM, Joshua-Tor L and Hannon GJ (2004) Argonaute2 is the catalytic engine of mammalian RNAi. Science 305: 1437-1441.

Liu Y, Schiff M and Dinesh-Kumar S (2002) Virus-induced gene silencing in tomato. Plant J 31: 777-786.

Livak KJ and Schmittgen TD (2001) Analysis of relative gene expression data using real-time quantitative PCR and the 2 ·Mer method. Methods 25: 402-408.

Lorenzo-Morales J, Ortega-Rivas A, Foronda P, Abreu-Acosta N, Ballart D, Martinez E and Valladares B (2005) RNA interference (RNAi) for the silencing of extracellular serine proteases genes in Acanthamoeba: molecular analysis and effect on pathogenecity. Mol Biochem Parasit 144: 10-15.

Lu R, Martin-Hernandez A, Peart J, Malcuit I and Baulcombe D (2003) Virus-induced gene silencing in plants. Methods 30: 296-303. 77

Ludwig-MUlier J (1999) Plasmodiophora brassicae, the causal agent of clubroot disease: a review on molecular and biochemical events in pathogenesis. J Plant Dis Prot 106: 109-127.

Mallory AC and Vaucheret H (2006) Functions of microRNAs and related small RNAs in plants. Nature Genet 38 (Supplement): S31-S36

Manzanares-Dauleux MJ, Divaret I, Baron F and Thomas G (2001) Assessment of biological and molecular variability between and within field isolates of Plasmodiophora brassicae. Plant Pathol 50: 165-173.

Maruyama K and Sugano S (1994) Oligo-capping: a simple method to replace the cap structure of eukaryotic mRNAs with oligoribonucleotides. Gene 138: 171-174.

May RC and Plasterk RHA (2005) RNA interference spreading in C elegans. Methods Enz 392: 308-315.

McLeod A, Smart CD and Fry WE (2004) Core promoter structure in the oomycete Phytophthora infestans. Eukaryot Cell 3: 91-99.

Meister G and Tuschl T (2004) Mechanisms of gene silencing by double-stranded RNA. Nature 431: 343-349.

Mithen R and Magrath R (1992) A contribution to the life history of Plasmodiophora brassicae: secondary plasmodia development in root galls of Arabidopsis thaliana. Mycol Res 96: 877-885.

Moro A, Ruiz-Cabello F, Fernandez-Cano A, Stock RP and Gonzalez A (1995) Secretion by Trypanosoma cruzi of a peptidyl-prolyl cis-trans isomerase involved in cell infection. EMBO J 14: 2483-2490.

Motyka SA and Englund PT (2004) RNA interference for analysis of gene function in trypanosomatids. Curr Opin Microbiol 7: 362-368. 78

Mourrain P, Beclin C, Elmayan T, Feuerbach F, Godon C, Morel J, Jouette D, Lacombe A,- Nikic S, Picault N, et a/., (2000) Arabidopsis SGS2 and SGS3 genes are required for posttranscriptional gene silencing and natural virus resistance. Cell 101: 533-542.

Mourier T and Jeffares DC (2003) Eukaryotic Intron Loss. Science 300: 1393.

Mutasa-Gottgens ES, Chwarszczynska DM, Halsey K and Asher MJC (2000) Specific polyclonal antibodies for the obligate plant parasite Po/ymyxa - a targeted recombinant DNA approach. Plant Pathol 49: 276-287.

Napoli C, Lemieux C and Jorgensen R (1990) Introduction of a chimeric chalcone synthase gene into petunia results in reversible co-suppression of homologous genes in trans. Plant Cell 2: 279-289.

Navarro L, Dunoyer P, Jay F, Arnold B, Dharmasiri N, Estelle M, Voinnet 0 and Jones JDG (2006) A plant miRNA contributes to antibacterial resistance by repressing auxin Signalling. Science 312: 436-439.

Nikolaev SI, Berney C, Fahrni JF, Bolivar I, Polet S, Mylnikov AP, Aleshin W, Petrov NB and Pawlowski J (2004) The twilight of and rise of Rhizaria, an emerging supergroup of amoeboid eukaryotes. Proc Natl Acad Sci USA 101: 8066- 8071.

Osborne AR, Rapoport TA and van den Berg B (2005) Protein translocation by the Sec61jSecY channel. Annu Rev Cell Dev Bioi 21: 529-550.

Pain A, Renauld H, Berriman M, Murphy L, Yeats CA, Weir W, Kerhornou A, Aslett M, Bishop R, Bouchier C et a/., (2005) Genome of the host-cell transforming paraSite Theileria annulata compared with T. paIVa. Science 309: 131-133.

Palauqui J, Elmayan T, Pollien J and Vaucheret H (1997) Systemic acquired silencing: transgene-specific post-transcriptional silencing is transmitted by grafting from silenced stocks to non-silenced scions. EMBO J 16: 4738-4745. 79

Peragine A, Yoshikawa M, Wu G, Albrecht HL and Poethig RS (2004) SGS3 and SGS2/SDE1/RDR6 are required for juvenile development and the production of trans­ acting siRNAs in Arabidopsis. Gene Dev 18: 2368-2379.

Quevillon E, Silventoinen V, Pillai 5, Harte N, Mulder N, Apweiler R and Lopez R (2005) InterProScan: protein domains identifier. Nucleic Acids Res 33: W116-W120.

Quon DVK, Delgadillo MG, Khachi A, Smale ST and Johnson PJ (1994) Similarity between a ubiquitous promoter element in an ancient eukaryote and mammalian initiator elements. Proc Natl Acad Sci USA 91: 4579-4583.

Qutob D, Hraber PT, Sobral BW and Gijzen M (2000) Comparative analysis of expressed sequences in Phytophthora sojae. Plant Physiol123: 243-254

Randall TA, Dwyer RA, Huitema E, Beyer K, Cvitanich C, Kelkar H, Fong AMVA, Gates K, Roberts 5, Yatzkan E et aI., (2005) Large-scale gene discovery in the Oomycete Phytophthora infestans reveals likely components of phytopathogenicity shared with true fungi. Mol Plant Microbe In 18: 229-243.

Rigden DJ, Jedrzejas MJ and Galperin MY (2003) Amidase domains from bacterial and phage autolysins define a family of gamma-D,L-glutamate-specific amidohydrolases. Trends Biochem Sci 28: 230-234.

Rochon D, Kakani K, Robbins M and Reade R (2004) Molecular aspects of plant virus transmission by olpidium and plasmodiophorid vectors. Annu Rev Phytopathol 42: 211-241.

Rogers MB, Archibald JM, Field MA, Li C, Striepen B and Keeling PJ (2004) ­ targeting peptides from the chlorarachniophyte 8ige/owie//a natans. J Eukaryot Microbiol 51: 529-535.

Rosso M-N, Dubrana MP, Cimbolini N, Jaubert 5 and Abad P (2005) Application of RNA interference to root-knot nematode genes encoding esophageal gland proteins. Mol Plant Microbe In 18: 615-620. 80

Rozen Sand Skaletsky HI (2000) Primer3 on the WWW for general users and for biologist programmers. In: Bioinformatics Methods and Protocols: Methods in Molecular Biology. Krawetz S, Misener S (Eds.), Humana Press, Totowa, NJ, pp 365- 386.

Russell AG, Shutt TE, Watkins RF and Gray MW (2005) An ancient spliceosomal intron in the ribosomal protein L7a gene (RpI7a) of Giardia lamblia. BMC Evol Bioi 5: - AUG 18

Russell J and Bulman S (2005) The liverwort Marchantia foliacea forms a specialized symbiosis with arbuscular mycorrhizal fungi in the genus Glomus. New Phytol 165: 567-579.

Sambrook J and Russell D (2001) Molecular Cloning: A Laboratory Manual. 3rd ed. Cold Spring Harbor, New York, Cold Spring Harbor Laboratory Press.

Sarkar G, Turner RT and Bolander ME (1993) Restriction-site PCR: a direct method of unknown sequence retrieval adjacent to a known locus by using universal primers. PCR Methods App 2: 318-322.

Schmid M, Davison TS, Henz SR, Pape UJ, Demar M, Vingron M, Scholkopf B, Weigel D and Lohmann JU (2005) A gene expression map of Arabidopsis thaliana development. Nature Genet 37: 501-506.

Schnepf E (1994) A Phagomyxa-like endoparasite of the centric marine diatom Bellerochea malleus. a phagotrophic Plasmodiophoromycete. Botanica Acta 107: 374-382.

Sharp PA and Burge CB (1997) Classification of introns: U2-type or U12-type. Cell 91: 875-879.

Siebert PD, Chenchik A, Kellogg DE, Lukyanov KA and Lukyanov SA (1995) An improved PCR method for walking in uncloned genomic DNA. Nucleic Acids Res 23: 1087-1088. 81

Siemens J, Nagel M, Ludwig-Muller J and Sacristan MD (2002) The interaction of P/asmodiophora brassicae and Arabidopsis tha/iana: Parameters for disease quantification and screening of mutant lines. J Phytopathol 150: 592-605.

Siemens J, Keller I, Sarx J, Kunz S, Schuller A, Nagel W, Schmulling T, Parniske M and Ludwig-Muller J (2006) Transcriptome analysis of Arabidopsis clubroots indicate a key role for cytokinins in disease development. Mol Plant Microbe In 19: 480-494.

Skamnioti P and Ridout C (2005) Microbial avirulence determinants: guided missiles or antigenic flak? Mol Plant Path 6: 551-559.

Smale ST (1997) Transcription initiation. Biochim Biophys Acta 1351: 73-88.

Smith N, Singh S, Wang M, Stoutjesdijk P, Green A and Waterhouse P (2000) Total silencing by intron-spliced hairpin RNAs. Nature 407: 319-320.

Some A, Manzanares M, Laurens F, Baron F, Thomas G and Rouxel F (1996) Variation for virulence on Brassica napus L. amongst P/asmodiophora brassicae collections from France and derived single-spore isolates. Plant Pathol 45: 432-439.

Suzuki Y, Yoshitomo-Nakagawa K, Maruyama K, Suyama A and Sugano S (1997) Construction and characterization of a full length-enriched and a 5'-end-enriched cDNA library. Gene 200: 149-156.

Swofford DL (2002) PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sinauer ASSOciates, Sunderland, Massachusetts.

Tabara H, Grishok A and Mello C (1998) RNA silencing in C e/egans. soaking in the genome sequence. Science 282: 430-431.

Thara VK, Fellers JP and Zhou J-M (2003) In p/anta induced genes of Puccinia tdticina. Mol Plant Pathol 4: 51-56.

Thevelein J (1984) Regulation of trehalose mobilisation in fungi. Microbiol Rev 48: 42-59. 82

Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F and Higgins DG (1997) The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 24: 4876-4882.

Timmons L, Court D and Fire A (2001) Ingestion of bacterially expressed dsRNAs can produce specific and potent genetic interference in Caenorhabditis e/egans.

J Gene 263: 103-112.

Tomkinson B (1999) Tripeptidyl peptidases: enzymes that count. Trends Biochem Sci 24: 355-359.

Tommerup DS and Ingram IC (1971) The life cycle of P/asmodiophora brassicae Woron. in Brassica tissue cultures and intact roots. New Phytol 70: 327-332.

Urwin PE, Lilley 0 and Atkinson HJ (2002) Ingestion of double-stranded RNA by preparasitic juvenile cyst nematodes leads to RNA interference. Mol Plant Microbe In 15: 747-752

Vanacova S, Van W, Carlton JM and Johnson PJ (2005) Spliceosomal introns in the deep-branching eukaryote Trichomonas vagina/is. Proc Natl Acad Sci USA 102: 4430-4435.

Vastenhouw N, Fischer S, Robert V, Thijssen K, Fraser A, Kamath R, Ahringer J and Plasterk R (2003) A genome-wide screen identifies 27 genes involved in transposon silencing in Ce/egans. Curr Bioi 13: 1311-1316.

Vazquez F, Vaucheret H, Rajagopalan R, Lepers C, Gasciolli V, Mallory AC, Hilbert JL, Bartel DP and Crete P (2004) Endogenous trans-acting siRNAs regulate the accumulation of ArabidopsismRNAs. Molecular Cell 16: 69-79.

Vayssie L, Vargas M, Weber C and Guillen N (2004) Double-stranded RNA mediates homology-dependent gene silencing of gamma-tubulin in the human paraSite Entamoeba histo/ytica. Mol Biochem Parasitol138: 21-28. 83

Voigt S, Jungnickel B, Hartmann E and Rapoport TA (1996) Signal sequence­ dependent function of the TRAM protein during early phases of protein transport across the endoplasmic reticulum membrane. J Cell Bioi 134: 25-35.

Voinnet 0, Lederer C and Baulcombe D (2000) A viral movement protein prevents spread of the gene silencing signal in Nicotiana benthamiana. Cell 103: 157-167.

Voinnet 0, Vain P, Angell Sand Baulcombe DC (1998) Systemic spread of sequence­ specific transgene RNA degradation in plants is initiated by localized introduction of ectopic promoterless DNA. Cell 95: 177-187.

Volfovsky N, Haas BJ and Salzberg SL (2003) Computational discovery of internal micro-exons. Genome Res 13: 1216-1221.

Wang MB, Abbott DC and Waterhouse PM (2000) A single copy of a virus-derived transgene encoding hairpin RNA gives immunity to barley yellow dwarf virus. Mol Plant Pathol 1: 347-356.

Wang MB and Metzlaff M (2005) RNA silencing and antiviral defense in plants. Curr Opin Plant Bioi 8: 216-222.

Waterhouse PM, Wang MB and Lough T (2001) Gene silencing as an adaptive defence against viruses. Nature 411: 834-842.

Watson JM, Fusaro AF, Wang MB and Waterhouse PM (2005) RNA silencing platforms in plants. FEBS Letters 579: 5982-5987.

Wesley SV, Helliwell CA, Smith NA, Wang MB, Rouse DT, Liu Q, Gooding PS, Singh SP, Abbott D, Stoutjesdijk PA, Robinson SP, Gleave AP, Green AG and Waterhouse PM (2001) Construct design for efficient, effective and high-throughput gene silencing in plants. Plant J 27: 581-590.

Willhoeft U, Campos-Gongora E, Touzni S, Bruchhaus I and Tannich E (2001) Introns of Entamoeba histo/yticaand Entamoeba dispar. Protist 152: 149-156. 84

Williams PH, Aist SJ and Aist JR (1970) Response of cabbage root hairs to infection by PJasmodiophora brassicae. Can J Bot 49: 41-47.

Williams PH and McNabola SS (1970) Fine structure of the host-parasite interface of

PJasmodiophora brassicae in cabbage~ Phytopathology 60: 1557-1561.

Williams PH, Reddy MN and Strandberg JO (1969) Growth of noninfected and PJasmodiophora brassicae infected cabbage callus in culture. Can J Bot 47: 1217- 1221.

Win J, Kanneganti TD, Torto-Alalibo T and Kamoun S (2006) Computational and comparative analyses of 150 full-length cDNA sequences from the oomycete plant pathogen Phytophthora infestans. Fungal Genet Bioi 43: 20-33.

Winston WM, Molodowitch C and Hunter CP (2002) Systemic RNAi in C eJegans requires the putative transmembrane protein SID-1. Science 295: 2456-2459.

Xie Z, Johansen LK, Gustafson AM, Kasschau KD, Lellis AD, Zilberman D, Jacobsen SE and Carrington JC (2004) Genetic and functional diversification of small RNA pathways in plants. PLoS Bioi 2: E104.

Xie ZX, Allen E, Fahlgren N, Calamar A, Givan SA and Carrington JC (2005) ExpreSSion of Arabidopsis MIRNA genes. Plant Physiol 138: 2145-2154.

Zilberman D, Cao X and Jacobsen S (2003) ARGONAUTE4 control of locus-specific siRNA accumulation and DNA and histone methylation. Science 299: 716-719. 85

Appendices

Appendix 1. Identification of genes from the obligate intracellular plant pathogen, Plasmodiophora brassicae

ArabidopsiscDNA clones SSH cllA At4g33070 pyruvate decarboxylase SSH c2C M Atlg03220 dermal glycoprotein, putative SSH c3C M At4g30210 NADPH-cytochrome p4S0 reductase, putative SSH c6C M At4g33070 pyruvate decarboxylase SSH cSD B Atlg62480 vacuolar calcium-binding protein-related SSH c9E B At3g1S4S0 Unknown SSH c2E M Atlg21720 20S proteasome beta subunit Cl SSH c4H B At2g31360 delta 9 desaturase SSH c7H M At2g03090 expansin, putative (EXP1S) SSH c2K At 26S SSH c4K At3g22440 putative hydroxyproline rich SSH cSK AtSg623S0 invertase/pectin methylesterase inhibitor SSH c6K AtlgS2300 ribosomal protein L37 SSH c7K At3g22440 putative hydroxyproline rich SSH PLAS c18 At2g36220 unknown SSH PLAS c19 At3g22440 putative hydroxyproline rich SSH PLAS c20 AtSg623S0 invertase/pectinmethylesterase SSH PLAS c21 Atlg77120 alcohol dehydrogenase SSH PllOS cll Atlg07940 EFla SSH PI 105 c23 AtSg64260 Phi protein SSH PllOS c30 At3g44300 Nitrilase II SSH PI 105 c34 At4g33070 pyruvate decarboxylase SSH PllOS c38 At4g33070 pyruvate decarboxylase SSH PllOS c61 At4g33070 pyruvate decarboxylase SSH PllOS c83 AtlglO140 Unknown SSH PI30S cA4 AtSgS7S60 xyloglucan endotransglycosylase THC4 SSH PI30S c19 At3gSS430 putative beta-l,3-glucanase SSH PI30S cA36 Atlg77760 nitrate reductase 1 (NR1) 86

xyloglucan endo-1,4-beta-D-glucanase (EC 3.2.1.- SSH PI305 cA40 At4g30270 ) . SSH PI305 cA42 At4g27450 Unknown SSH PI305 cA58 At3g11930 Putative ER6 - ethylene response gene

GRc1A At4g28240 wound-responsive protein-related GRc2B Atlg55675 Unknown GRc5B At4g25570 cytochrome B561 family protein GRc8B Atlg32920 Unknown GRc10B At2g30490 cinnamate-4-hydroxylase GRc1C At5g09440 phosphate-responsive protein, putative GRc2C At4g28060 cytochrome c oxidase subunit 6b, putative GRc9C At5g03210 Unknown DNAJ heat shock N-terminal domain-containing GRc8E Atlg80920 protein GRc9E Atlg19570 GSH-dependent dehydroascorbate reductase 1 GR cllG At2g20490 nucleolar RNA-binding Nop10p family protein GR PIAc3H At3g48450 nitrate-responsive NO! protein, putative GR PIAc8H At2g15970 cold-acclimation protein, putative GR PIAclOH At5g02380 metallothionein protein 2B GR PIAc12H At3g16640 translationally controlled tumor family protein GR PIAc14H At3g49910 60S ribosomal protein L26 GR PIAc15H At5g53500 WD-40 repeat family protein ER lumen protein receptor, putative/HDEL GR PIAc42 At3g25040 receptor GR PIAc44 Atlg15270 Unknown/hypothetical GR PIAc46 Atlg54580 acyl carrier protein, chloroplast, putative ACP GR PIAc48 Atlg15270 Unknown/hypothetical GR PIAc49 At3g01130 Unknown GR c10J At3g47520 malate dehydrogenase GR c12J AtgDNA GR PIAc53J At3g02190 60S ribosomal protein L39 GR PIAc55J At gDNA GR PIAc56J At4g28085 Unknown GR PIAc58 Atlg27970 NTF2 GR PIAc61 At3g47520 malate dehydrogenase GR PIAc62 At3g22550 Unknown; senescence-related GR PIAc63 At3g16640 translationally controlled tumor family protein GR PIAc67* At3g22550 Unknown; senescence-related 87

GR PIAc7S At3g44735 phytosulfokines-related GR PIAc77 At4g13220 Unknown GR PIAc78 At3g13S20 arabinogalactan-protein GR PIAc82 At3g47S20 malate dehydrogenase GR PIAc83 AtSgS6360 calmodulin-binding GR PIAc8S At4g276S2 Unknown GR PIAc86 At4g2808S unknown GR PIAc88 At4g3386S 529 ribosomal protein GR PIAc91 AtSgSOnO ABA responsive AtHVA22 mitochondrial NADH:ubiquinone oxidoreductase GR PIAc92 At4g201S0 subunit? GR PIAc94 AtlgS604S ribosomal protein L41 GR PIAc9S AtgDNA GR 4X c1S At4g40030 histone H3.2 GR4X c20 At4g306S0 hydrophobic protein GR 4X c28 At2g47170 ADP-ribosylation factor 1 GR 4X c46* Atlg63840 RING zinc finger protein GR PIBc1S At3g47S20 Malate dehydrogenase GR PIBc16 At gDNA GR PIBc18 Atlg200S0 C-8,7 sterol isomerase GR PIBc21 At2g23090 unknown GR PIBc27 At4g2808S unknown GR PIBc28 Atlg2320S invertase/pectin methyl esterase GR PIBc29 AtgDNA GR PIBc32 Atlg02130 Ras related small GTP binding protein GR PIBc33 AtlgS604S ribosomal protein L41 GR OS c1 At3g04400 605 ribosomal protein GR PIOSB c62 AtSg62900 Unknown protein GR N5Pl1 c8 AtSg024S0 605 ribosomal protein L36 GR NBPl1 cS4 At2g27960 cyclin-dependent kinase GR NBPl1 c64 At4g3386S 405 ribosomal protein 529 GR NBPl1 c6S At3g62400 cytochrome c oxidase subunit Sc GR NBPl1 c87 AtSg6S6S0 Unknown GR NBPl1 c93 Atlg27400 605 ribosomal protein L17 GR PI2NBOS(M) c3 AtSg18380 405 ribosomal protein 516 GR PI2NBOS(M) c4 AtgDNA GR PI2NBOS(M) c8 Atlg09640 eEF-1B gamma, putative GR PI2NBOS c44 AtSg47930 Ribosomal 527 88

GR PI2NB05 c45 At5g16470 Zinc finger GR PI2NB05 c50 At5g48810 putative cytochrome b5 protein GR PI2NB05 c53 At5g58375 Unknown GR PI2NB05 c59 At2g46510 succinate dehydrogenase GR PI2NB05 c68 At3g01130 cystathionine gamma-synthase GR PI2NB05 c69 At3g52800 zinc finger (AN1-like) family protein GR PI2NB05 c87 At4g33565 Unknown GR PI3NB05 c2 Atlg32920 Unknown GR PI3NB05 c5 Atlg65980 peroxiredoxin type 2 GR PI3NB05 c6 Atlg65980 peroxiredoxin type 2 GR PI405 c2 At4g02520 GST GR PI405 c7 At5g54750 TRAPP component Bet3 GR PI405 c8 At4g 11 150 vacuolar ATP synthase subunit E GR PI405 c10 Atlg27030 Unknown GR PI405 cll At5g11090 serine rich GR PI405 c15 Atlg20440 ABA responsive gene, dehydrin GR PI405 c18 Atlg09560 putative germ in protein GR PI405 c19 At5g35735 similar to auxin-induced protein AIR12 GR PI405 c21 At2g36830 aquaporin GR PI405 c24 At4g23470 hydroxyproline-rich glycoprotein GR PI405 c26 Atlg62880 cornichon family protein GR PI405 c30 Atlg62880 cornichon family protein GR PI405 c31 Atlg62880 cornichon family protein GR PI405 c32 Atlg19660 wound-responsive family protein GR PI40S c36 Atlg64970 gamma-tocopherol methyltransferase GR PI405 c38 At4g09580 Unknown GR PI40S c41 At2g36830 Aquaporin GR PI405 c42 Atlg51070 bHLH transcription factor GR PI405 c45 At3g11940 putative 405 ribosomal protein 55 GR PI405 c53 At4g18340 beta-1,3-glucanase-like protein GR PI405 c59 At3g09900 Ras-related GTP-binding protein GR PI405 c60 Atlg27730 putative salt-tolerance zinc finger protein GR PI405 c64 At2g44610 Ras-related GTP-binding protein, putative GR PI40S c65 At3g16640 translationally controlled tumor family

Appendix1 Table 1. Arabidopsis tha/iana cDNAs. Summary of Arabidopsis cDNAs sequenced during this study. Protein function descriptions are from TAIR-curated annotations that accompany GenBank sequences. 89

Quantitative RT-peR analysis of P. brassicae gene expression The SSH cDNA collection had one especially common P. brassicae clone. The DC collection was also highly redundant, containing three especially common P. brassicae clones. To test whether the cDNA redundancy in our SSH and DC collections reflected the biology of secondary stage infection, we used qRT -PCR to examine transcription of ten P. brassicae genes and the A. tha/iana actin 8 gene (Atlg49240). All amplified fragments were between 100-300 bp in length (primer sequences in Appendix 1, Table 2). First strand cDNA for RT-PCR was generated with the SuperScript III system (Invitrogen). Two ug of total RNA was transcribed with the GR oligo dT primer. DNA was amplified with the Platinum qPCR SYBR Green Supermix-UDG (Invitrogen). Reactions were 12.5 IJL in volume and were analysed on an ABI 7700 Sequence Detector. Duplicate reactions were performed in each run. Two runs/cDNA were performed for the 7 dpi cDNA samples and 2-6 runs/cDNA for

the OC template and 30 dpi cDNA samples. CT values were normalised to P.

brassicae ~-tubulin for each template. Relative gene expression levels were calculated using the 2 -Mer method (Livak and Schmittgen 2001).

qPCR confirmed that, in comparison with the P. brassicae ~-tubulin gene used as a control, short mRNAs were more abundant in the DC template than in the first strand cDNA (Appendix 1, Table 3). Differences in qPCR between the amplified and non­ amplified cDNA template also emphasised that some highly redundant DC cDNAs had become overrepresented in the library. Thus, amplification steps during library construction yielded clones with much reduced lengths compared with average eukaryotic cDNAs (e.g. 1V1.4 kb in Candida a/bicans) (Braun et aI., 2005), and a clone frequency that was not a reliable approximation of gene transcription levels.

We also compared the expression levels of the same genes at 7dpi and 30dpi. qRT­ PCR on cDNA from a primary-infection sample showed a low percentage of pathogen RNA compared with the 30 dpi sample (Appendix 1, Table 3). These experiments showed that quantitative PCR can be used to measure transcription of P. brassicae genes, but a more detailed dissection of plasmodiophorid growth stages will be required to understand developmental regulation of P. brassicae gene transcription. 90

Primer pairs for qRT-peR

P~TUB BTUBFl CTGCTCATCTCCAAGATCCGA BTUBR2 ACGAACGACTCATCGGAGTT Ptl2B L2BFl CCAACCAGGAAGTGATCTGG L2BRl ACTTCTTCGCGTTCAGGTTG PLPRX PEROFl CAGCCACAAGAAGTGGATCA PERORl CTTGTCTGGCCCAATGATG Ph5HSPl CRAAFl GACTCCTCCCAAAGCCAGGT CRAARl TCCGCGTACAGCATCTTGC Ph5HSP2 CRABFl CCAAGTAACGACCTACGCAGC CRABRl CGGGCGTGTCGATGTAGAG Pt6UNKl CRACFl CTGCATTGCTTGAGTTAACATGG CRACRl CAGCTCCAGCGTTGACACAC PtHISH4 HISFl GTGCTGCGCAACAACATC HISRl CTCCGTGTAGGTGACCGAGT PtGSTl GSTFl CCGTCGAACATCAAGATCAC GSTRl GGAACGAAGTCTGGAACGTC PI£YCA CYCA1F GTCATCCCCAACTTCATGCT CYCA1R ACAATGTCCATGCCCTCAAT PtGST2 GST2Fl CCCAACAAGGACAAGGATTT GST2Rl GCGACCGTGAACTTGTTG AlACTB ACTINBF GTATGTTGCCATTCAAGCTGTTCTA ACTINBR GAGCTTGGTTTTCGAGGTCTCC

Appendix 1 Table 2. PCR primers used for qRT-PCR. All primers are written in the 5' to 3' direction. 91

eDNA name PL:f3TUB PttlISH4 Ptl28 PLsHSP1 PLsHSP2 PI£'(CA Size (bp) 1570 430 540 630 700 620

30 dpi vs 1.00 0.10 0.29 3.94 0.09 0.90 7dpi 2-MCT (0.8-1.26) (0.06-0.17) (0_23-0.37) (1.9-8.1) (0.03-0.35) (0_66-1.2)

DC template 1.00 43.4 8.28 1016 7281 52.7 vs 30 dpi (1.2-0.86) (29-65) eDNA 2-MCT (6.4-10.8) (530-1951) (3492-1.5 x (36-78) 1004)

eDNA name P~ST1 P~ST2 PteRX Pl:6UNK1 AtACT8 Size (bp) 630 700 770 570 1620

30 dpi vs 2.58 0.08 0.53 55.7 2.3 x 10-05 7dpi 2-MCT (1.8-3.7) (0_06-0.11) (0.39-0.71) (31-98) (1.3-4.1 x 10-

05) DC template 48.8 140 8.69 729 0.29 vs 30 dpi (32-76) (114-172) (7.3-10.3) eDNA 2-MCT (419-1269) (0.18-0.47)

Appendix 1 Table 3. Quantitative RT-PCR for selected P. brassicaegenes and the Arabidopsis actin 8 gene. Template was 7 days post infection and 30 days post infection tissue first

strand cDNA, and amplified OC template. Changes between lJ.CT values (MCT) were calculated for 7 to 30 days infection and 30 days to OC template, and then converted to

approximate fold differences of expression by 2 -MCf. Figures in brackets represent each fold difference ± standard deviation. 92

Appendix 2. Genomic DNA sequences from Plasmodiophora brassicae

Primers for Universal PCR-walking

UNIl AATACGACTCACTATAGNlOGATC

UNI2 AATACGACTCACTATAGNlOGAATTC

UNI3 AATACGACTCACTATAGN'OGCGC

UNI4 GTAATACGACTCACTATAGGGCN'OCGCG

T7 CGCGTAATACGACTCACTATAG

Primers for Suppression PCR-walking

SUPPstlA GTAATACGACTCACTATAGGGCTCGAGCGGCCGCCCGGGCAGGTTGCA

SUPBA P-ACCTGCCC-NH [C6]

SUPla GTAATACGACTCACTATAGGGC

Appendix 2 Table 1. PCR primer and adapto,-A sequences used for PCR walking. The SUPB primer has a 5' phosphate group and a 3' blocking amine group. 93

Haloarcula marismortul

Methanococcus /annaschii

SpJronucteus barl

Giardia lamblia

600 Eugena g-aci/is ,--_____ Diclyoslelium discoideum

346 Drosophila melanogasler

Ciona inteslina/is

Homo sapiens

AcantiJamoeba castellanii 375 '--___ Saccharomyces ceTevlsiae

Arabldopsls Ihaliana

,--___ Trimaslix pytlfonnis

Jahoba libera

0----- RecDnomonas americana Seculamonas ecuadoriensis

Malawimonas jakoblformls

P/asmodJophora brassicae

'--___ nalans

Cyan/dioschyzon merolae

'--____ Harlmannel/a vermiformis

399,------Naegleria g-uber; Trichomonas vaginalis 0.1

Appendix 2 Figure 1. Maximum likelihood phylogenetic tree constructed from ribosomal L7a proteins. Amino acid sequences from a range of organisms (from Russell et al. 2005). Values of greater than 50% for five hundred bootstrap pseudoreplicates are shown above nodes. 94

L7A 1-2 IIIGTATTGCAATTGTGTTTGAGCCTATAAACGGAGCGCTAACATGGTCGACAGill IIIGTACGTTTTCTCTGTCGTCGCGTGTGTCTCACGAAGACCTAACGTTGTGTCTAGiII PDA 1-2 IIIGTAAGGACAATTTGAAATTGCAGGACGCGGCGT~ACAGTGTATAGill IIIGTCGTTCATCGCTTGTTCTCGCTTTTGCACCTGGTTGCTGACTAGTTCACGTTCAGIII URM 1-2 IIIGTCGGCCAGCCAGTCGATGTCGACTTTGGAAATGTGTGCGTCTGACCGTGGCGTCTAGIII IIIGTAGGGGTTTCCGTACGCGGCCCCCTGTCTGCTCGACTAATGTGTGTGTCAGill BTUB 1-2 IIIGTATGGGCGCGGCGCTCCATCCCGACGCCGACCTTCCCGCTGATTTCCGTTTGTGTACTCGTGTAGIII IIIGTAAGTCTTGGTGTTTCTGGACCGTTCGATGATCGCAGTCTGAATGGGTTCTTCAGIII IDS 1-3 TAAATGGCAATGCCCCGTTTAGGCTTCTGACACGGTGGTGATTAACCCTCCATACCGTGCTATCGTATAGIII TAGGGCGCGACGTTAAATGTTCGGTGCGTTGCTGCTGAACTCGTGTCGTACCAGill TGAGAGGATCGCGAATTATCGGTTTGCTTCGACTCCTCTGGGTCTGACATATGCTGTCAGIII IENDO 1-5 TTGGTGCCGTGCGTTCTTGGAGCGCGGACGGCTTCTCTAATTTCCGTCCCGCCTTCCTAGIII TATTTGTTTTAATTAATGGAGATACCAGGCCGATGCTGACTGATTTGTACAGill TATGATCTGGATGAACCCGCGAACTTGCACCGTGCTGACTCCTGCTTGCAGilli TATCTGTGAACAGTGTCGATTCGCCATACAATATCGTCTCTGACTTTGCTTTTTCTAGIII TGGGTCCGGAGTTCCAACCAATTCTTCGGAAACCACTATTCTGACGGTCCGCGTGCATCAGIII ICALF 1 IIIGTACACCGTTCAGCATCAGAAGACATGTGCGAGCTCTGACAAACCGGGCGCACACAGIII CALUPl-2 IIIGTAAGCCTCGAACCGGCTGGGTCGCTCTGCCGTCGTTGACGTGTGTGTGCCGCAGill IIIGTAAGCGCTATGGTGCGTGCTCCTCTCTGGACGTGCCGGTTGATGTTCTTAACCAGIII PSA 1-20 TACACCTGATGCTTCTGGTCGGTTACTGGTTATCAACG~TCTGTTTCAGill TACGTTGAACCGGGGTTCGATAATTTAATCTGAACGAATATGTAGill TTTTCTAATGTTCTCTTCTTGAGTTTGAAAATGTTGCTGACGGTATCCATCGTATAGIll TTTGGTTTTAATGGATTTACACATCTTCCCTTCCTAATATTCCATGTCAGill TTGGCCTTTCCCACACTTTCCAGAATTGATACTGACAGCACATGCA~ TACGCATAATATTCGAGGCAATGTGACAATCCTAACAACCGTTGCTGCTCGATAGTCAGTGGAAGATTGCTGTACGTCTTGA ITGAAGCTCACTTGCTATCAGIII TGACGCCGTATGCTATTGGCGAATGATATGTCTCTGAACCAATGTCGCTTGACGCAGIll CGAGAACGTACGCATCG~CTGTGCAGIII GTI3C(3m~T1rGC,TC:GC:CC~GTTTACCCATCTGTT T GTGCGTTCTGACCAAATGTATAGill CTGGTGCCGAATG~TGTTTCATGTA~ GT,~T(:A~rG1rTl~C;AC:GCTGGAAAATGTGCCAACTGATTAGCTGCCTTTCGCA GTI:rn3G1~T!'ATGA'TG(:A(~Al~;~'~A_TCCCGC ~TGGATGGTGTCTTA GTI:A(;TC;GA,CTCT IGT(3C(;TC:CG~CGACAI:C(~AT~GTGTGCAGIII GT.AC'TCIGC'TC'rTICCICT(3T(3A(3T;~G(3Cj~G~rTAT TGACCCGATAAACA~ TTGCGTGTTCGCCATTTGCGAATGTAGACGTCGTTGACTACTATTTCGTAGill GTICT(3Tl~T(;Gl~T1:TG~C~AA. CATCAGCTGCTTAATTCTAATCCAAGGTTCTA

~~.~~,~~,~~/~~(~~~~~r'~~·~~TCAGACTCGCTGGCGCTCATG'TG·rT~rA~rC(:G1~AC:AG GTAA~GI:Tl~C(;Cl~GC,C1'GG~CG:GTCATGCTAAGCTATGGATAATAACTGCGAGCGTCCCAGIIi MOB 1-7 TTAAGCCCGGTTGCCCTGCTCGCGTCGCGATGCCCGCTAAGCACTGCTCTGGAGCAGIII TCGGTTTCGCCACACGAAATGCTCCTTCGC;GTTCCTAGAGTTAATCGATGGCTGCAGill TACGTCCGGGTTGGCGGTTGCAATCTATAAGAGTTGAACCTAATCCATTCCTTTGTTGCAGIIi I TGACCGATTCCACAGTGGTGGATATGTGTCCAGGACCCTAATCTGCCACAGill 95

TCATATGCTTTTCGTCAGGTATCGCACTCGATTTGGTTTTAACACGGTGGTCAACGGCAGili TGAGCCAACGATGTTTCTCGTTTATGTCCGCCTAGCCAGTCTGATTCTGGTCGTCCGGCAGIII TATCGTCCATCGTTCACAAACGTGTCCGTGTATTTGACTGCCTCCCAGili ITREH 1-9 TGTTATTGGGCAACTGCTGCGACGTCCGTCGCAGCCGCCTGATTCGTTCGr CA "~~~r~"~CGATGCGAGTTGCACGTCTCGCTCGATTCCTGACGGTGGTGTGCGCA TTCCCGTGTACTGTTTCTATTGGGCAAGTCTTAACATTTCAATGTCA ~~1~0f'~r'~TTGTTTGGATTCGCTCGTACAGTGGCGTGTGCTGATTGTTACGTTCGCTCTGTCTAGili GT.AAI:A'rC·rG(~A1'G(~ATTTGTGATGTACTTGTGTTTGACTGTACTCTTCTTAGili ~~,~,_~~. ~CGAATGCCGCTTACCTCGCCCCACCTGGTTCCGATATCTGAACCGTTGCAGili CGTTCGTTGATTGGTAGTAGTTTGT~CTGCGGTCA~ ~~I~~r~~~'~CCCCCGATCACGTGTACGCGTCCTATGCTGACGTTCCTGTTCAGili GTICT(:TC;CC;CG:GTCGICC(:T1rGP,TG:CCTTIGG(:TC;G1'TT'G_CT_IG~I~(3ATGTATTCCCCAGili GSTUP 1-2 CATGCCCGGGTGTGCTGCCGCCTGTCGTATTCCATTGTGCTGATGACCGAAGCGTGr CA TATGTTACGTAGACTGTATCGCAGTACTGCTGGCCCCGTCTCACGTTTTACAACCA TGCGTACCTAACCGGGCACTCGGCGCCGAGTGACTGCACTATAATGCTGTCCGCCA IPEPS28 1-8 GT'rC(3C(~G(;TG~GTAICC(:G(~G(;TG~TGGICT(3C 1r G1:~:~'~_TCTGTTCGTCAGili ATGACCTGAATCACGACGAACTI CTCTGCCCAATGGACAGACTCTGAATTCGAATCGTCA TG'rG'rA(:AC;A1rG1~CCGTCTGGTGTCGATGTGAATTGACGCCAGTGCTcGIA GT 'TT(GC(:A(:A(~AC;CP,CC:GA.GCCCATACTAGTAAGCACGTTCTCATTGGCAGTATA TATATGCTGTGGTGTCTCGAATTCAGTCTGATGACTATTTGCTCCTTAGili CATGAGCGCACGTCCAGCGTTTGTGACCTGAAACGAAATCACTGTCAGili TCGTGCTCCTGCTGTGTAACCGACCTGACCGCTGCCTACAGili

Appendix 2 Figure 2 Intron sequences from P. brassicae genes. Three nucleotides from the exons on either side of the introns are included (purple). Putative splice site motifs are underlined. Blue nucleotides indicate splice site motifs with one variant nucleotide and green nucleotides indicate motifs with two variant sites. 96

Degenerate peR for Dicer proteins In an attempt to directly detect the presence of RNA silencing components in plasmodiophorids, we applied degenerate PCR for Dicer-like genes. An alignment of 8 Dicer-like genes from a range of organisms was made with Clustal X (Thompson et aI., 1997) (Appendix 2 Fig. 3a). Four primers were designed in conserved amino­ acid regions of the Dicer genes (Appendix 2 Fig. 3b). Components ofthe PCR reactions were standard (Bulman et aI., 1998), but with primers at 1IJM concentration. The PCR parameters followed a touchdown protocol; annealing temperatures were two cycles each at 60°C, 55°C, 50°C, 45°C, then 35 cycles at 50°C. Extension times were for 3 min. PCR reactions were carried out on genomic DNA templates from B. rapa club root galls, potato-Spongospora subterranea galls, and cDNA from Arabidopsis club root galls. Of the three PCR primer combinations used (GDSFF/VEALR, GDSFFjRLEFR, VEALFjRLEFR), a strong band of tv300 bp was amplified with the VEALFjRLEFR pair from the B. rapa gall genomic DNA sample. After cloning, two sequences were obtained from the band. One gave no significant hits with BLASTX whereas the other sequence (Appendix 4 Fig. 4) was found to have high similarity to Arabidopsis DCLl at both the amino-acid and nucleotide levels (Appendix 4 Fig. 5). No further work was performed on this sequence because this degree of nucleotide conservation indicated that it was from the B. rapa genome rather than from P. brassicae. Many P. brassicae cDNA sequence show high amino­ acid similarity to genes from other eukaryotes, but few have conservation at the nucleotide level (Bulman et aI., 2006a). A tv300 bp fragment was amplified from potato-Spongospora subterranea gall DNA, but this was found to derive from a plant transposon sequence.

The effectiveness of our Dicer amplification strategy was further demonstrated by the identification of a Dicer1 sequence from cDNA of the aphid Myzus persicae (this sequence has been extended by 3'RACE). Anophelesgambiae 1 Drosophilamelanogaster 1 ------Zebrafish 1 ------Caenorhabditiselegans 1 ------MiRVRADLQCFNliIlB1 Arabidopsisthalianadcl1 1 ------MVMEDEPREATIKP Spombe 1 ------IDISSFLLr~~~~n~l~ Neurospora Dictyosteliumdiscoideum i ~;;~~~~;;~~;;;~~~~~~~~~Ii~1I

Anophelesgambiae 48 QSMR-TNRWSSHPEAPGKAIYLTRMDRSL!sslvs NlTDlQvANlDDVEDSEGSH Drosophilamelanogaster 47 QmLSRRARRHGRVSVYLSCEVGTSTEPcsiY~ILT~T~RVWQE------Zebrafish 1 ------Caenorhabditiselegans 43 I HBEVHTSFKiGQVHGQTSSG Arabidopsisthalianadcl1 51 NDFF(3G:rDIUilfDSIKNGGGL SQD1:-J ~i TV'TPQviiAK E'IJKENGLQKNGG Spombe 42 ~~ C1KLEE(Jl e! IQESN------­ Neurospora 44 EPKPKKISAK~TAAFNSwDEEHQE~RDQRK------­ Dictyosteliumdiscoideum 61 §ii~!~~FSIN?QDNSCGDSKRIALF~RVpLlTQQAGAiEAiTNLKiCKLYGEIND-

Anophelesgambiae 107 E~GAlNTPVTDVASAD , FFGSETTrlQYlEQGrrlRlQe IS ~C ~ - - -NYGRjE Drosophilamelanogaster 97 Q~--MQIPFDHCWTD SILRPEGFlyuJETR~JLlSSVEL!iLilc D---SAVYSi Zebrafish 1 ------Caenorhabditiselegans 101 LwE--IKEQCDQFMKRHHlvvITAQCU]DLBRHAYlKDE!MC~F!mC ! ---ALGS~ Arabidopsisthalianadcl1 111 KRiEFlKEEGDKDRKRARlCSYQSERSNLSGRGHVNNF RiGDRF~R i DEAGNIm Spombe 61 ------Neurospora 95 ------Dictyosteliumdiscoideum 120 -----IRTRAFVRSKEYnlLVSTVGS ~EVRH!N&LeFYF&TF~ ~TGEHDFii

Anophelesgambiae 164 LWEICARLTHQAPSSDRPAQR • LGLAG~GAGCTPERlCWElHYLERCL RARDmTA§ Drosophilamelanogaster 152 IRPLFENHIMPAP----PADR • LGLAG~SAGCELQQe S ~TLEQSVLCQiiTA§ Zebrafish 1 ------Caenorhabditiselegans 156 P---YRSIMVDYKLLKKDKPV • LGLTASllKAKVAPEIDlMEQIKKLESAMDSvUBTAS Arabidopsisthalianadcl1 171 KKRECNNYRRDGRDREVRGYW ·DKVGSN~RSGTWEADHERDVKKVSGGNREC!vKA Spombe 61 ------Neurospora 95 ------AAIEAARVA Dictyosteliumdiscoideum 175 VVDYIRKTDLNYRPRILGLTASLISIGNSTiDIVQRSIKnlEERlLSRVFKPTSU] TNSU

Anophelesgambiae 224 iDTSiLRFS ~PT~lBLECIPPKPSNlTQLLRMLDQRQiAFLKQ;'RYEPLAlYGLDGNDS Drosophilamelanogaster 208 BBvTiLRYCS~PHEY!vQCApm~]D~SLVLADv!NTHKSFLLDWRYDPYEgYGT----- Zebrafish 1 ------Caenorhabditiselegans 213 ~ S ! SKYG~PYEii IICKnmE&GCIGIPNFD rrl IEIFDETVAFVN------Arabidopsisthalianadcl1 231 mENKSKPEE~EKVlEEQARRiQIDvlEQAKAKNTIAFeETGAQJTLIAI L! IKS----- Spombe 61 ------Neurospora 104 G1Dv!PAIGFDSERBiTSPREDQI ELFERAKQQNTIAvlDTGSGiTLlAAM!LRWVITG- Dictyosteliumdiscoideum 235 QSELQPELVSFKTSGQESLIETSUCEFLVSNSKElQIF!NYSDVDLDGVRGNPSFLK---

Anophelesgambiae Drosophilamelanogaster ;~~ ~:~~~~=~~~~~~~~~~~~~~~~:~:~~:'s S~~~ D~ :~~~~HF Zebrafish 1 ------Caenorhabditiselegans 260 Arabidopsisthalianadcl1 286 Spombe 61 Neurospora 163 ------EL IREKGL • Dictyosteliumdiscoideum 292 ------ALJ RLDKYS Q~ESI

Anophelesgambiae 344 LTTD - IQElvKvp~HF FCMVYTTI!QAiATVA~FAQHD----TELE ~RYST Pi Drosophilamelanogaster 3 06 YQCN - liKu:lVKTPrtIH CLVSTAlDoIYSLC ~iAFHRHlGSGSi SwQ ERYSS Pi Zebrafish 1 Caenorhabditiselegans 303 Arabidopsisthalianadcl1 332 Spombe 100 Neurospora 209 Dictyosteliumdiscoideum 337 Anophelesgambiae 399 i~LL WFGEQRNRPKDRNQPATLHHHQQVHNQQRILYCFCRNZECKELEKSYI,TFG Drosophilamelanogaster 365 IRWLLQ RcFKpE------EVHTQADGLRRMRHQ.DQADFNRLs~TLE Zebrafish 1 ------Caenorhabditiselegans 358 ETFNPEFQKERMKLEKAEHLSAIIFVDQRYIAYSLLLMIRHIKSWEPKFKFV Arabidopsisthalianadcl1 392 TTPKDKRPAIFGMTASPVNLKGVSSQVDCAIKIRNLETKNDSTVCTIKO!KEL Spombe 159 YHR------~LSKKHFTL dl IFG Neurospora 269 D------EQLERR~! ILG Dictyosteliumdiscoideum 397 ~Q~DL FVETR------FGASNLTSMLKKE

Anophelesgambiae 459 AQIADlD~ IKQLDGLLlSIRKR~RLNV]HKVVDNASAE---GGGTLiBQSPRHGHESR Drosophilamelanogaster 408 SKCRMlDQMDQPPTETRILiATL~ILHTTEDRQTNRSAARVTPTPT ~HAKPKPSSGAN Zebrafish 1 ------Caenorhabditiselegans 418 NPD~GASGRNLASS QGLHK .TEVLiRFHRNEINCLIATSVLEEiiDiKQCNLVIK Arabidopsisthalianadcl1 452 EKHVPlPSEIWEYD Tli IWSLTIKQMlAAVEEAAQASSRKSKWQFi!GARDAGAKDE Spombe 184 MTASPFTIGNLYHR--- ~YQW .LFDS~~SENELADYFCLPEESBIYSNKLWP Neurospora 288 LTAS D DP---RRlAAELEALLHSQIATAADPAALQHTICKPKT~ EYVRGRPD Dictyosteliumdiscoideum 424 PFQE ' HTlRLVGHNGVDGlDSEKlQSlliKFRDGKCRLIVTTNVLE ~DiQDCNIVIC

Anophelesgambiae 516 REFEFISP Drosophilamelanogaster 468 PDLKFLRC Zebrafish 1 Caenorhabditiselegans 478 FDRP!D~ SYVQ ------Arabidopsisthalianadcl1 512 LRQVYGVSERT Spombe 241 PS ~C ------Neurospora 345 SE . ;;Q Dictyosteliumdiscoideum 484 YDGI! SrlSLI

Anophelesgambiae 576 QYTVDKVATNPQNCLKQTTIEHRKQEEVLKRFRMHECNLLIGTSVLEEGIELPKCNLVIR Drosophilamelanogaster 528 QYTTDRVAD-PTTEPKEAELEHRRQEEVLKRFRMHDCNVLIGTSVLEEGIDVPKCNLVVR Zebrafish 27 ------Caenorhabditiselegans 530 ------Arabidopsisthalianadcl1 564 Spombe 294 Neurospora 395 Dictyosteliumdiscoideum 536

Anophelesgambiae 636 Drosophilamelanogaster 587 Zebrafish 27 ~~~~~;~~~I~:t :~~~~~~~~~~~~~~~~~~~~~~~~~ Caenorhabditiselegans 530 ------QIIPIEENGVT CAE ------Arabidopsisthalianadcl1 564 ------ERv!FQVD FQES LSE------Spombe 294 Neurospora 395 Dictyosteliumdiscoideum 536 ======~~------QILSi RERQLESmiITKNQ~~~~i ~LNWKKiN------~======

Anophelesgambiae 666 ---ENAASRNEQQED------Drosophilamelanogaster 647 SDSDDSAMPNSSGSDPYTFGTARGTVKILNPEVFSKQPPTACDIKLQEIQDELPAAAQLD Zebrafish 46 ------Caenorhabditiselegans 550 Arabidopsisthalianadcll 582 Spombe 314 Neurospora 414 Dictyosteliumdiscoideum 559

Anophelesgambiae 678 IEDVSIDGMIPEGP------KDDNTRETl niQD~ETAiDA Drosophilamelanogaster 707 Zebrafish 46 ~~~~SI~~!:~:~::=~~~:~~~=~~::~=~~~~=:~==~~~!~~~~ Caenorhabditiselegans 550 PIK~~T:KN------PMPNKKTAQI~VALE Arabidopsisthalianadcl1 582 LLQ ~~~G------~KVAAEiGKPENG Spombe 314 ---LKIFVEDWKNNK------YSING ~~mF ~TD Neurospora 414 - EUTEKHVKQI REAHELVN------AHTFSP D ~ Dictyosteliumdiscoideum 559 -INNEYFTNNQLDGG------PU] KDli c LSIYH Anophelesgarnbiae 718 RNGEPPDiLK I.CFNHCIEiYiSSGGAVALQSNGTCAS Drosophilamelanogaster 767 EPPE QSRFSACIAAY -KP------HLLTGAS Zebrafish 73 PVGKE~YE LDL IEEETSl pG- -GS------Caenorhabditiselegans 578 ~GRES m HID , PDEYAP~G------­ Arabidopsisthalianadc11 610 P~:D,p~N- ~nnEHI~ IGAA IGKVTPKVQSLIK------­ Spornbe 341 SDSVR FVE TAFTLSI FI KTLNLP------­ Neurospora 446 si~~~-n~/7fJrc~~p:~C!FERGVGAQRCI FVRQRNTAMLLA~U'QoiE------­ Dictyosteliumdiscoideum 588 Tnll,.,rr'."r FoBPD I INQmDQ~ S IRMEjJ!DEYLSS;JIVILN ------

Anophelesgarnbiae 778 LWLGNAIQTLNKYCAKLPSDTFTKLipIWRCAT~KGRKlYQYTIRLPINSPWKEDIL­ Drosophilamelanogaster 820 VDLGSAIALVNKYCARLPSDTFTK~WRCTRNElAGV~FQYTLRLPINSPLKHDIVG Zebrafish 118 Caenorhabditiselegans 624 Arabidopsisthalianadcl1 655 ======------LL======i~U~~~~~~QHTADF~iA~ ~IVFIER------======Spornbe 388 Neurospora 491 Dictyosteliumdiscoideum 634 ------TCIO] EQSIYRU] SLLNDNG------

Anophe1esgarnbiae 837 Drosophi1ame1anogaster 880 LPMPTQTLARRLAALQACVELHRIGELDDQLQPIGKEGFRALEPDWECFELEPEDEQIVQ Zebrafish 126 Caenorhabditiselegans 641 Arabidopsisthalianadc11 674 Spornbe 388 Neurospora 491 Dictyosteliumdiscoideum 654

Anophelesgarnbiae 837 Drosophilamelanogaster 940 Zebrafish 126 Caenorhabditiselegans 641 Arabidopsistha1ianadcl1 674 Spornbe 388 Neurospora 491 Dictyosteliumdiscoideum 654

Anophelesgarnbiae 878 Drosophilamelanogaster 1000 Zebrafish 158 Caenorhabditiselegans 676 Arabidopsisthalianadcl1 711 Spornbe 411 Neurospora 522 Dictyosteliumdiscoideum 688

Anophelesgarnbiae 938 Drosophilamelanogaster 1060 zebrafish 218 Caenorhabditise1egans 736 Arabidopsistha1ianadcl1 771 Spornbe 471 Neurospora 582 Dictyosteliumdiscoideum 748

Anophelesgarnbiae 996 Drosophilamelanogaster 1105 ~~~~:~~:~EE :RQ~~Ql?iI)PC~ RUi Q Zebrafish 264 EKSEA;~I Caenorhabditise1egans 784 1 ENMPRIIPK - Arabidopsistha1ianadc11 824 NSAiGL SQLPGDRYAI CRLQL Spornbe 521 ivY G GLYAiSLLYNFCNTLSRDiYT PCLSGWYCFEVt1.~C Neurospora 638 SRQY ~P ~QSIvCLANFTATLPHPPETSLS TTVPG-I FQC EVI ·oms Dictyosteliumdiscoideum 806 DKP GF RiYESSLE~CI IQGGI NKNiNcIi IDLNLQYIKRNLEDHYlLQC QS;mS Anophelesgambiae 1056 Drosophilamelanogaster 1164 Zebrafish 324 Caenorhabditiselegans 843 Arabidopsisthalianadcl1 884 Spombe 581 R------Neurospora 697 Dictyosteliumdiscoideum 866 TSLISKRTKFINY~ ·FIi~D(:~KN1~I

Anophelesgambiae 1104 Drosophilamelanogaster 1212 Zebrafish 372 ~~I~~===~~~======---ESLQ------Caenorhabditiselegans 891 R SSTSNIPQ---ASASDSKESNTSVPHS Arabidopsisthalianadcl1 932 EKADQDiEGEPVPGTAR------Spombe 629 EEDELKD------~~'~~m'~nv~Tnm Neurospora 745 RLAvlSK~TEYA------­ Dictyosteliumdiscoideum 926 STKlGiFTMIWDVFITPSR------TIFIQ

Anophelesgambiae 1148 Drosophilamelanogaster 1256 ------EENVDRQ---QElIQEDEDF~ ~ '~~-,~ Zebrafish 416 ~~ ~~.~ ~~~rv ---- Q----TLPPDF~i pN.D Caenorhabditiselegans 948 ----KEK-- rKI~DI)NI: ~NSm~ Arabidopsisthalianadcl1 976 : YM~iR~CVTIF'GSSKDPFLS-----EVSEF Spombe 664 ~u~~~uu~uIF~ pLiFVQAHS-----FPKIDS Neurospora 783 ~~~~~l~nT~mT SRPLL SRSALPmiAS-----FPLF PG------Dictyosteliumdiscoideum 976 SIEPIIQlNNKH I EiRIK ~LEGGILvlPEIYSYAGNSNSQLREYSSwiv SNQ

Anophelesgambiae 1203 VLRKTKEQKIAQAQ DASAPEVED~KEAPNVRD~DEEDGLKMENGVIAEVE Drosophilamelanogaster 1308 Zebrafish 463 ~~~:~~===:~~_~~ I D~~~~~~A~~~~~~=:~~=~~~~~~~~======~~~ Caenorhabditiselegans 1000 ------I Q IVKiQLilKSIEDQERET~K&D------Arabidopsisthalianadcl1 1031 A------T FKGSI • TENQLSSLKKFHlRL! s------Spombe 708 ------W .. EDSF . SHLLELLKKSTR~LQFG------Neurospora 832 ------SVQADDT EQLTRF TlKAFMDVFSKE~TAiN ------Dictyosteliumdiscoideum 1036 IGTHTVKIWSGI~]vnNVRKFFRCIGDIFSTTIPTVTLPQNRDYRDQD------

Anophelesgambiae 1263 KSQVDGEDDTGDKKTDSDGTLL TlSN~GVG----TDNiMGEEGRRGASSPSFmR Drosophilamelanogaster 1356 KSAIELIIE-GEEKLQ TlSNDMIDDIASFN-QEnEDEDDAFHL ~LPANIK Zebrafish 481 TSS------Caenorhabditiselegans 1034 . PE~IGVEISSRDiRMDGEDQDTlGLTQGL Arabidopsisthalianadcl1 1066 · pAKlYLFVP----V~TSMEPIKGgNwELiE Spombe 742 Neurospora 866 Dictyosteliumdiscoideum 1084

Anophelesgambiae 1319 Y~DC~NSSAN SSDEYDE-----EDDyyLYDGSEKPKNAIEP------SQ ~SGTD Drosophilamelanogaster 1414 Zebrafish 493 ~:~~~~~~~~~~~~~~~~~~~::~~~~~~~~~~~~:~~~:~~::~::= Caenorhabditiselegans 1078 H~ IID~EL ~DYTARLTSNRNGIGAWSGSESIVPSGWGDWDGPEPDNSPMPFQ Arabidopsisthalianadcl1 1109 Spombe 742 ~~~~Q ~ I:~QRAR!~~~======~~~=~::~~~=:~:~~~~:~ Neurospora 885 KSPAHLIDRKAL SENEKVP------YT Dictyosteliumdiscoideum 1130 L~QA~ PSGI RPSMVKFN----PIDCGDEHRTLEICSVSTTSRCKLNRQvgSLLS

Anophelesgambiae 1368 NANDSGDK-PGSRNRT!TQSQDTivNLGGRLKIEFKiETimEAI ~ECILQRQRTQ Drosophilamelanogaster 1474 DDDDNAGP-L~HHNYSSDDDnlAnDIDAGRIAF~ETIE~ Q~EKRQKQ Zebrafish 513 APNTPDEK-LETI'TL PiTDLNKDCFPNLPNG------UQ~ SDDLPHRS C Caenorhabditiselegans 1138 ILGGPGGI Q~~ ~.GRVFDPSTASSSLSQTVQm;TVSPPKQ LUKE!EQFKKLQNa Arabidopsisthalianadcl1 1157 KSHPTYC~RGAV~SF~SGLlpVRDAFEKEVEIDLS KlKLMMADGCMVAEDLIG Spombe 763 -CTDYRFIENLIDVDTBQNFFKLPEPVQNVTDLQS~VLLVNPQSIYEQYAFEG EF Neurospora 909 FLEPDDFFQDKFlvn pYDGARKFFTHHRRHDMKP~PVP~ IVAPNHRAWRGLGTTH Dictyosteliumdiscoideum 1186 TLGTQDNiFFALQDHYlNQVAQI INnTNASKQAlvmFFPW ITEGELYQ~PYIRRILI Anophelesgambiae 1427 E!QNDMLY~S KNAVDGFCYSP~DRQCAEEIQAEQRIQaTKD T6RQQGSLiRW Drosophilamelanogaster 1533 Q ANERQYQ KNLLIGFNFKHED -----Q PAT1R S1 ~ LKTEB ESGGMLl pH Zebrafish 558 Q QLGPLERD TQTTTSVSIRP~ P -- ---A PQPWPS I CT · SSDLCD------­ Caenorhabditiselegans 1198 KQAKERLEALE EDMEKPR~EnDvNLEDYG I QENQE TPTNFPKTBDEE1EEI S1 Arabidopsisthalianadcl1 1217 TAAHSGKRFY 1CYDMSAETSFPRKEGYLGPLEYNTY I YYKamYGvnlNCKQQPl 1K Spombe 822 M1PAKKKDKAPS~ LCKKLPL ~~LWGNRAKS1PKSQQVRSFY1N------­ Neurospora 969 N~ SLWSKSRGFM1FQADQPIVEAAL1STRRDFLDDTLRilDVEPQ ------­ Dictyosteliumdiscoideum 1246 KLKMER1QQKCH1E1KDSRML!GVCDPTNSLPPNTVFVQ ~EDEDDDDDGRKYEKvBEG

Anophelesgambiae 1487 NEpl sVSHWRSKLDESEAATLGALMDDLGGQPFTELVPYVEPEQLFELLVRNGSSGERWL Drosophilamelanogaster 1588 DQalvLKRS----DAAEAQVAKVSMMEL----LKQLLPYVNEDVLAKKLG-----DRREL Zebrafish 605 ------PHVKKP ------______Caenorhabditiselegans 1258 GARKKQE1DDN ---AAKTDVLER ------Arabidopsisthalianadcl1 1277 GRGlSYCKNLLSPRFEQSGESETVLDKT------Spombe 868 Neurospora 1016 Dictyosteliumdiscoideum 1306 LVMl 1KNPCTHPGDVRYLKAVDN------

Anophelesgambiae 1547 RLHDLYCLNRPFFP - - - -ERYT1FRGGSQFDNFLDESCQEGEGASSPPKV1ELT1GDPFP Drosophilamelanogaster 1635 LLSDLVELNADWVARHEQETYNVMGCGDSFDNYNDHHR------Zebrafish 611 Caenorhabditiselegans 1278 Arabidopsisthalianadcl1 1305 Spombe 868 Neurospora 1016 Dictyosteliumdiscoideum 1329

Anophelesgambiae 1603 Drosophilamelanogaster 1673 ------Zebrafish 611 ------PKSETATSTPKLQYER1EBEPPTSTC~ PSETu l§ :ED(: RI~ C~G ------~-~-I-~- ~PAgl ~s~P~K~P Caenorhabditiselegans 1278 ------ENCEVL ~ 1NEKSRSFSFEKES~~(3RI~F~Q§! SI --EiVSH1 SD1 Arabidopsisthalianadcl1 1305 ------YYVFLPPELCVVHP~sisL1 VQLKNL1SYP1 Spombe 868 Neurospora 1016 ======;;; ~~;~~~~~~I s AL~~~~~~ ~ Dictyosteliumdiscoideum 1329 ------1R~HlRNVLVFSTKGDVPNF FciDKSL1GNRSESE

Anophelesgambiae 1663 Drosophilamelanogaster 1722 Zebrafish 653 Caenorhabditiselegans 1328 Arabidopsisthalianadcl1 1355 Spombe 910 Neurospora 1065 Dictyosteliumdiscoideum 1379

Anophelesgambiae 1709 Drosophilamelanogaster 1768 Zebrafish 699 Ba v u! u~'~~,. uuTDKWD Caenorhabditiselegans 1374 Arabidopsisthalianadcl1 1401 Spombe 956 Neurospora 1125 Dictyosteliumdiscoideum 1428

Anophelesgambiae 1759 Drosophilamelanogaster 1818 Zebrafish 759 Caenorhabditiselegans 1424 Arabidopsisthalianadcl1 1450 Spombe 1005 Neur ospor a 1174 Dictyosteliumdiscoideum 1478 Anophelesgambiae 1790 EICQJm~~RAKRtvDRFDLQ--HADSN~DGNED------DDEDDEDDDDDD Drosophilarnelanogaster 1849 QICE ~~DALG ------QN QNGQL------DDSNDS-----­ Zebrafish 819 AFGK SLQPTDPG KAPKKAHNSHFSPD EFDYSSWDAMCYLDPSKAGEEDDFV Caenorhabditiselegans 1454 PENKTGwiI!GDVSKS------TT IETITFP------­ Arabidopsisthalianadcl1 1471 EENSDiFmDiEMEDGi------Spombe 1011 Neurospora 1184 ------Dictyosteliurndiscoideurn 1490 KIMGKI~QIDQLVYIG------

Anophelesgambiae 1839 DDTTTGEGNEHEADNGS Drosophilarnelanogaster 1881 ------CNDFSC Zebrafish 879 VGFWNPSEENCGTDIGKQSD:sI I~~E~QC~ Caenorhabditiselegans 1481 ------KQARVGNDDISPm:~~II~QH@S Arabidopsisthalianadcl1 1487 ------I!EG:~sB~~1!ss Spombe 1011 Neurospora 1184

Dictyosteliurndiscoideurn 1507 ------LJJ.: •• 'r '.,' ... o...> r.:/LJ.'I..o...> ... ,r.:/ v..... "' ...... , ... ""!=' I~ '~.L.J

Anophelesgambiae 1899 IREPPVKLNSNNETALTPYKATGQNDGPLSTGVT6AIGHWVA~SIr:. Drosophilarnelanogaster 1929 ITR---QLDGGNQEQRIPGSTKPNAEN------viT GAWPTPRSP IF- Zebrafish 939 EK---QSSGGS------AEL GWLKI oR iiiH­ Caenorhabditiselegans 1534 ---KSDVPS------P~L IDTPT AS FL Arabidopsisthalianadcl1 1528 PD------EIDGTL - ES ~S- Spombe 1048 ISN------WDEaFDLNTYADS RNVQ Neurospora 1217 KA KK------HRMISYG YAVYQ ~TWQTESA Dictyosteliurndiscoideurn 1545 LRHYSAESEES------I I I LDQGFISDKiSi D1

Anophelesgambiae 1959 ITFGGIETGAAATSRE~."r.:II .L.J~, ~I~~ Drosophilarnelanogaster 1979 ------APNATEE.L.JLJ~I .L.JO"'>· UU, ~~ Zebrafish 974 ------PDAERT.L.J... ~I ... O"'> ~I .L.J ... , Caenorhabditiselegans 1570 N------NLWQQF Arabidopsisthalianadcl1 1557 ------~~~ Vu~~~~~ ~I~ Spombe 1080 FP------Neurospora 1251 N------SAQ>'I.LJ,·~ ... ~,,_ ~, ~ >'I. Dictyosteliurndiscoideurn 1575 KG------EI KNDlIKT~~~'~~lu~I= T

Anophelesgambiae 2016 Drosophilarnelanogaster 2027 Zebrafish 1021 Caenorhabditiselegans 1612 Arabidopsisthalianadcl1 1592 Spombe 1118 Neurospora 1293 Dictyosteliurndiscoideurn 1619

Anophelesgambiae 2076 Drosophilarnelanogaster 2087 ==~:~~~IE ~~~~~~~~~:~~~~~~~~~~~~~~ Zebrafish 1081 ~~·~~_L~--NEMQG ISELRRS E------EKE Caenorhabditiselegans 1672 ~CS'=IRN,rnmuFN ~MVT EID------EGQE Arabidopsisthalianadcl1 1652 --SSKPGFNS G------L Spombe 1178 TASENPW~F------­ Neurospora 1353 T~m~~,E EAINAGKT~YSR------DYWVHIT Dictyosteliurndiscoideurn 1679 hl~~=~u~KESTLEIIGSTATMLFDE------NSNLN

Anophelesgambiae 2134 Drosophilarnelanogaster 2126 Zebrafish 1121 Caenorhabditiselegans 1716 Arabidopsisthalianadcl1 1683 Spombe Neurospora 1400 Dictyosteliurndiscoideurn 1724 Anophelesgambiae 2191 Drosophilamelanogaster 2183 Zebrafish 1178 Caenorhabditiselegans 1773 Arabidopsisthalianadcll 1738 Spombe 1266 ~!L!~C~GC:E------DFGTKCV1EEVKS~~ ~~ ••~~~. Neurospora 1460 ~~~a~lKFRCNEWRSFAKELDTDVTEGRGGRGGN!AVAGE1SE1NP Dictyosteliumdiscoideum 1784 ~~,~~~.SK------WYETLVQQ~YiTKNPu~JUU"C ~ L

Anophelesgambiae 2224 ~Gyr~mKRR------Drosophilamelanogaster 2216 Zebrafish 1210 Caenorhabditiselegans 1806 iOOEO()RFtOspsLTTV------­ Arabidopsisthalianadcll 1771 ~1I1~Kl~~EE;mE:KH1NNGNAGEDQGENENGNKKNGHQP Spombe 1310 Neurospora 1520 ~~i! ~~II~~iI~~~~;II;~;;~ERLGCNCKGVPMEVDGGVPEADVDGEVSNLLLYSCNCKFSKKKPSDEQ1KGDGK Dictyosteliumdiscoideum 1819 uu,••• ~~L' a 1VENSM1PS1FNNKNHNNTT11111111111

Anophelesgambiae Drosophilamelanogaster Zebrafish Caenorhabditiselegans Arabidopsisthalianadcll 1831 FTRQTLND1CLRKNWPMPSYRCVKEGGPAHAKRFTFGVRVNTSDRGWTDEC1GEPMPSVK Spombe 1370 VKSLT------Neurospora 1580 HGTAV------Dictyosteliumdiscoideum 1879 11111------

Anophelesgambiae Drosophilamelanogaster Zebrafish Caenorhabditiselegans Arabidopsisthalianadcll 1891 KAKDSAAVLLLELLNKTFS Spombe gij32416996jrefjxP_328976.1j Dictyosteliumdiscoideum 3a.

DicerGDSFF GAAACRATYGGMGAYTCYTTYYTGAA DicerVEALR GTCAGCWAYYGATTTRTC DicerVEALF WGTSGAAGCYYTGATCGG DicerRLEFR RGCRTCKCCWAGRAAYTC

3b.

Appendix 2 Figure 3. A. Alignment of Dicer genes from Anopheles gambiae (AA073809), Drosophila melanogaster (NP _524453), Zebrafish (AAQ90464), Caenorhabditis elegans (NP _ 498761), Arabidopsis thaliana (AAF03534), Schizosaccharomyces pombe (Q09884), Neurospora crassa (XP _328976) and Dictyoste/ium discoideum (AJ314909). Sequences were aligned with Clustal X and the alignment is shown in Boxshade output. The position of the degenerate PCR primers is shown in green. B. Sequences of primers in 5' to 3' orientation. In A., GDSFF and VEALF are in the 5' to 3' direction while VEALR and RLEFR are in the 3' to 5' direction. 98

Brassica rapa Del sequence >CC_Dicer_T7_sq AGTCGAAGCTCTGATCGGTGTTTACTACGTGGAAGGAGGTAAAGCTGCGG CGAACCATTTGATGAAATGGATTGGGATCCACGTGGAGGATGATCCCGAA GAAACCGAAGGGGCCGTGAAACCGGTTAGTGTTCCGGAGAGCGTGCTCAA GAGCATAGACTTTGTTGGTTTAGAGAGAGCTTTGAAGTACGAGTTTACGG AGAAAGGTCTTCTTGTGGAAGCTATCACGCATGCTTCGAGACCGTCTTCG GGTGTTTCGTGTTACCAGAGGTTGGAGTTTCTTGGAGATGCT

Appendix 2 Figure 4. DCl sequence from Brassica rapa club root gall. Primer sequences shown in italics.

SA. >giI30677869IrefINM_099986.21 UniGene infoGeo Arabidopsis thaliana DCLl (DICER­ LIKE1); ATP-dependent helicase/ ribonuclease III (DCLl) mRNA, complete cds length=6l88 Score = 194 bits (98), Expect = 1e-46 Identities = 230/274 (83%), Gaps = 0/274 (0%) Strand=Plus/Plus

Query 12 TGATCGGTGTTTACTACGTGGAAGGAGGTAAAGCTGCGGCGAACCATTTGATGAAATGGA 71 1111 11111111 11111 11111 11111 III II II 1111111111111111 Sbjct 4899 TGATTGGTGTTTATTACGTCGAAGGGGGTAAGATTGCAGCTAATCATTTGATGAAATGGA 4958

Query 72 TTGGGATCCACGTGGAGGATGATCCCGAAGAAACCGAAGGGGCCGTGAAACCGGTTAGTG 131 1111111 11111111111111111 II III III II I 11111 1111 II Sbjct 4959 TTGGGATTCACGTGGAGGATGATCCTGATGAAGTCGATGGAACATTGAAAAATGTTAATG 5018

Query 132 TTCCGGAGAGCGTGCTCAAGAGCATAGACTTTGTTGGTTTAGAGAGAGCTTTGAAGTACG 191 1111 11111 11111111111111 111111111111 I 111111111 I II II I Sbjct 5019 TTCCAGAGAGTGTGCTCAAGAGCATCGACTTTGTTGGTCTTGAGAGAGCTCTTAAATATG 5078

Query 192 AGTTTACGGAGAAAGGTCTTCTTGTGGAAGCTATCACGCATGCTTCGAGACCGTCTTCGG 251 111111 11111111111111111 11111111 II 11111111 11111 11111 I Sbjct 5079 AGTTTAAAGAGAAAGGTCTTCTTGTTGAAGCTATAACACATGCTTCAAGACCATCTTCAG 5138

Query 252 GTGTTTCGTGTTACCAGAGGTTGGAGTTTCTTGG 285 1111111111111111111 11111 III 1111 Sbjct 5139 GTGTTTCGTGTTACCAGAGATTGGAATTTGTTGG 5172

5B. >giI15223286IrefINP_171612.11 UniGene infoGene info DCL1 (DICER-LIKE1); ATP­ dependent helicase/ ribonuclease III [Arabidopsis thalianal Length=1909 Score = 178 bits (452), Expect = 6e-44, Identities = 87/97 (89%), positives = 92/97 (94%), Gaps = 0/97 (0%), Frame = +2

Query 2 VEALIGVYYVEGGKAAANHLMKWIGIHVEDDPEETEGAVKPVSVPESVLKSIDFVGLERA 181 VEALIGVYYVEGGK AANHLMKWIGIHVEDDP+E +G +K V+VPESVLKSIDFVGLERA Sbjct 1506 VEALIGVYYVEGGKlAANHLMKWIGIHVEDDPDEVDGTLKNVNVPESVLKSIDFVGLERA 1565

Query 182 LKYEFTEKGLLVEAITHASRPSSGVSCYQRLEFLGDA 292 LKYEF EKGLLVEAITHASRPSSGVSCYQRLEF+GDA Sbjct 1566 LKYEFKEKGLLVEAITHASRPSSGVSCYQRLEFVGDA 1602

Appendix 2 Figure 5. BLAST alignments of Dicer sequence from Brassica rapa with DCll sequence from Arabidopsis. A. nucleotide sequences; B. amino-acid sequences. 99

Appendix 3. Testing plant-derived RNA silencing against the intracellular pathogen Plasmodiophora brassicae

Vector assembly, hybridisation fragments and PCR screening

PbACTR2at tbl HYB PbACTR2 GUSOL HYB TCCATGCTllIIIII GUSR2attb2 GGGGACCACTTTGTACAAGAAAGCTGGGTCCGCTAGTGCCTTGTCCAGT GUSR1F1PbACTOL TACCCGATAGCATGGATCCTAACTATGCCGGAATCCAT

NPT II a HYB ATGACTGGGCACAACAGACAATCGGCTGCT NPT II b HYB CGGGTAGCCAACGCTATGTCCTGATAGCGG HANLOOPRl CCCGGTAGTGATCTTATTTCATTATG

Primers for qRT-PCR

PbACTR1FHYB CACCCAGGTCATCACGGTCGGC PbACTR1RHYB ATACCTTGACGCGCATCGA PbACTF3 GGTCGATCCTGTCGTCGCT PbACT3UTRIR ACACATTGACGTCGCGATAA PbACT3UTRIIR ATGATTCGATGCAAGCCTCT BTUBFl CTGCTCATCTCCAAGATCCGA BTUBR2 ACGAACGACTCATCGGAGTT AtACT8F GTATGTTGCCATTCAAGCTGTTCTA AtACT8F GAGCTTGGTTTTCGAGGTCTCC

CTRT TCTAGAATTCAGCGGCCGCT 30 VN

Appendix 3 Table 1. PCR primer sequences. Primers written in 5' to 3' direction. The italics in the vector-assembly primers indicate attBl. and attB2. sequences. Colours in the vector­ assembly primers indicate the actin and GUS sequences as shown in Appendix 3 Fig. 1. HYB

= primers also used for amplifying fragments for hybridisations. 100

>PbACTGUS pH8 c 2 GCACAATCCCACTATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTC ATTTCATTTGGAGAGGACACGCTCGAGACAAGTTTGTACAAAAAAGCAGG

TACACCACGCCGAACACCTGGGTGGACGATATCACCGTGGTGACGCATGT CGCGCAGGACTGTAACCACGCGTCTGTTGACTGGCAGGTGGTGGCCAATG GTGATGTCAGCGTTGAACTGCGTGATGCGGATCAACAGGTGGTTGCAACT GGACAACGCACTAGCGGACCCAGCTTTCTTGTACAAAGTGGTCTCGAGGA ATTCGGTACCCCAGCTTGGTAAGGAAATAATTATTTTCTTTTTTCCTTTT

Appendix 3 Figure 1. Sequence of pH8AG insert. Sequence of pH8AG between CaMV35S promoter and pHelisgate8 intron. Red = P. brassicae actin sequence; blue = GUS sequence.

pH8AG line -/- -/+ and +/+ X2 value

pH8AG line 1 30 79 0.37 pH8AG line 2 28 82 0.01 pH8AG line 3 21 45 1.63 pH8AG line 4 9 47 2.38 pH8AG line 5 29 96 0.22 pH8AG line 6 5 96 21.65 pH8AG line 7 28 100 0.66 pH8AG line 8 58 157 0.22

Appendix 3 Table 2. Segregation ratios of pH8AG T2 seedlings on kanamycin containing media. To assess if the segregation ratios significantly differed from 1:3 ratios, a X2 goodness of fit test was calculated. Only the value for line 6 exceeded the 950/0 X2 value of 3.841. 101

Appendix 3 Figure 2. Primary P. brassicae infection in 8 dpi root hairs. Roots were stained with trypan blue/lactophenol solution.

1 234 5

Appendix 3 Figure 3. Southern hybridisation. 8al1i111 -digested DNA from pH8AG plant lines hybridised with an nptIIDNA fragment. Lanes 1. line 2; 2. line 4; 3. line 5; 4. line 6; 5. line 7.

Appendix 3 Figure 4. PCR detection of pH8AG insert. PCR with a primer in the pHelisgate8 intron (HLOOPR1) paired with a primer in the PbACT DNA fragment (PbACTR2attb1). 1-8 = pH8AG lines 1-8. + = positive plasmid control; - = negative water control. 102

'-"' ...., ...... _- ~ Line 4

Line 7

Appendix 3 Figure 5. Segregation of pH8AG insert. Line 4 (top) and line 7 (bottom). Tl progeny from each line were grown on non-selective media. peR (see Appendix 3 Fig. 4) for the pH8AG insert was performed on DNA from a single leaf of each plant. The remainder of the plant was then stained for GUS expression. All plants lacking the pH8AG insert showed GUS expression in roots and young leaves.

Appendix 3 Figure 6. Pot grown Arabidopsis infected with P. brassicae. 103

Appendix 3 Figure 7. siRNA acrylamide gel. Gel stained with ethidium bromide prior to transfer to nylon membrane. 0 = DNA oligonucleotides used as size markers. 104

Reporter-gene expression in P. brassicae galls In unpublished research prior to this study, about 40 promoterless-GUS tagged Arabidopsis lines were infected with Plasmodiophora brassicae. The lines had previously been shown to have up-regulated GUS expression in response to nematode infection (Florian Grundler, pers. comm.). The plants were histochemically stained and examined for changes in GUS expression as a result of P. brassicae infection. Two lines ("ALC.904" and "ALC.184") with elevated GUS expression in club root galls were identified. To assess levels of CaMV35S-driven expression during secondary stage P. brassicae infection, club root galls from ALC.03 plants were stained for GUS expression and compared to stained galls from ALC.904. GUS expression in ALC.904 galls was strong and relatively consistent across galls whereas GUS expression in ALC.03 galls was weaker and patchy. Non-galled ALC.03 roots continued to show GUS expression while expression in adjoining galls was considerably reduced.

,

c

Appendix 3 Figure 8. Arabidopsis plants containing Arabidopsis promoter-GUS fusions. Promoter-reporter fusions = (ALC.904) (A and B) or pCAMBIA 1303 (ALC.03) (C and D). Plants were infected with P. brassicae, harvested at 30 dpi and stained for GUS expression. Note that in D, non-galled roots show GUS expression, but the gall shows patchy GUS expression.