Expressed sequences tags of the anther smut fungus, Microbotryum violaceum, identify mating and pathogenicity genes

Roxana Yockteng, Sylvain Marthey, Hélène Chiapello, Annie Gendrault, Michael Hood, Francois Rodolphe, Benjamin Devier, Patrick Wincker, Carole Dossat, Tatiana Giraud

To cite this version:

Roxana Yockteng, Sylvain Marthey, Hélène Chiapello, Annie Gendrault, Michael Hood, et al. Expressed sequences tags of the anther smut fungus, Microbotryum violaceum, identify mating and pathogenicity genes. BMC Genomics, BioMed Central, 2007, 8 (1), pp.272. doi:10.1186/1471-2164-8-272. hal-02333218

HAL Id: hal-02333218 https://hal.archives-ouvertes.fr/hal-02333218 Submitted on 31 May 2020


Research article (Open Access)

Expressed sequences tags of the anther smut fungus, Microbotryum violaceum, identify mating and pathogenicity genes

Roxana Yockteng1,2, Sylvain Marthey3, Hélène Chiapello3, Annie Gendrault3, Michael E Hood4, François Rodolphe3, Benjamin Devier1, Patrick Wincker5, Carole Dossat5 and Tatiana Giraud*1

Address:
1 UMR 8079 CNRS-UPS, Ecologie, Systématique et Evolution, Bâtiment 360, Université Paris-Sud, F-91405 Orsay Cedex, France
2 UMR 5202, CNRS-MNHN, Origine, Structure et Evolution de la Biodiversité, Département Systématique et Evolution, 16 rue Buffon CP 39, 75005, Paris, France
3 INRA, Unité Mathématique, Informatique et Génome, Domaine Vilvert, Jouy-en-Josas, F-78352, France
4 Department of Biology, Amherst College, Amherst, MA 01002, USA
5 Génoscope, UMR CNRS 8030, 2 Gaston Crémieux, CP 5706, 91507 Evry, France

* Corresponding author: Tatiana Giraud

Received: 17 November 2006
Accepted: 10 August 2007
Published: 10 August 2007

BMC Genomics 2007, 8:272 doi:10.1186/1471-2164-8-272

This article is available from: http://www.biomedcentral.com/1471-2164/8/272

© 2007 Yockteng et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background: The basidiomycete fungus Microbotryum violaceum is responsible for the anther-smut disease in many plants of the family Caryophyllaceae and is a model in genetics and evolutionary biology. Infection is initiated by dikaryotic hyphae produced after the conjugation of two haploid sporidia of opposite mating type. This study describes M. violaceum ESTs corresponding to nuclear genes expressed during conjugation and early hyphal production.

Results: A normalized cDNA library generated 24,128 sequences, which were assembled into 7,765 unique genes; 25.2% of them displayed significant similarity to annotated proteins from other organisms, 74.3% showed weak similarity to the same set of known proteins, and 0.5% were orphans. We identified putative pheromone receptors and genes that in other fungi are involved in the mating process. We also identified many sequences similar to genes known to be involved in pathogenicity in other fungi. The M. violaceum EST database, MICROBASE, is available on the Web and provides access to the sequences, assembled contigs, annotations and programs to compare similarities against MICROBASE.

Conclusion: This study provides a basis for cloning the mating type locus, for further investigation of pathogenicity genes in the anther smut fungi, and for comparative genomics.

Background

Deciphering the molecular mechanisms involved in infection is important for the control of devastating crop diseases. Furthermore, the comparison of pathogenicity-related genes from different fungi provides insight into the evolution of host-pathogen interactions, thereby advancing our understanding of host specificity, virulence, and the emergence of new diseases. Modern sequencing technologies have led to a remarkable increase in genomic data available for identifying genes by similarity searches [1]. Key genes involved in pathogenicity in several fungi have been compiled into the PHI database [2].

In the smut fungi of monocot hosts (e.g. Ustilago maydis and U. hordei, major pathogens of corn and barley, respectively), the sexual phase and the genes linked to the mating-type loci play a key role in development and pathogenicity [3]. Mating-type loci determine sexual compatibility: only individuals differing at these loci can mate. In U. maydis, cell recognition and fusion are regulated by a pheromone/receptor system that resides at the a locus. After fusion, the dikaryon is maintained and cells switch to filamentous growth if they are heterozygous for the second mating type locus, the b locus [4,5]. The b locus encodes two homeodomain proteins that function as transcriptional regulators after dimerization. The majority of sexual basidiomycete fungi possess such a system, called "tetrapolar", where the unlinked a and b loci (respectively called B and A in some ) are both involved in sexual compatibility and are often multiallelic [5,6]. Other members of this phylum are "bipolar", due to the a and b loci being tightly linked (e.g. in U. hordei, [7]) or due to one of the two mating type loci having lost its role in mating type specificity (e.g. in Coprinellus disseminatus, [8]). Tetrapolarity is likely ancestral [9] and promotes outcrossing as it increases the number of available mating types. The study of mating-type loci is important for understanding the infection process and the evolution of mating systems in basidiomycetes.

A widely recognized model to study host-pathogen coevolution and fungal genetics is the anther smut fungus Microbotryum violaceum (Pers.) Deml and Oberw. (formerly Ustilago violacea (Pers.) Fuckel), which is a basidiomycete, obligate parasite of more than 100 perennial species of Caryophyllaceae [10]. In plants infected by M. violaceum, fungal teliospores are produced in anthers and diseased plants are usually completely sterilized, the pollen being replaced by fungal spores and the stigmas and ovaries being reduced. New infections occur when fungal spores are transported from a diseased to a healthy plant by the insects that usually serve as pollinators. Once deposited on a host plant, diploid teliospores undergo meiosis and give rise to four haploid cells, two of mating type A1 and two of mating type A2, M. violaceum having a bipolar mating system. Each of these post-meiotic cells can bud off yeast-like sporidia on the plant surface. New infectious dikaryons are produced only after conjugation of two cells of opposite mating types [11]. The fungus then grows endophytically and causes perennial systemic infections.

Although M. violaceum is related to major crop pathogens like U. maydis and Puccinia spp., it has no impact on human activities, making it valuable for the study of natural host-pathogen coevolution and avoiding the risk of dispersal to crops. However, one present limitation of this model is that little genomic sequence data are available, except for studies on transposable elements and on the genomic defense mechanism against the accumulation of mobile elements [12]. In particular, the mating-type locus has resisted several cloning attempts (T. Giraud and M.E. Hood, unpublished; J. Kronstad, pers. comm.) and there exist few sequences of expressed genes from M. violaceum in public databases. Only a few Microbotryum genes that contribute to hyphal development and subsequent infectious capability have been described [13].

The generation of Expressed Sequence Tags (EST) is an efficient tool to discover novel genes and investigate their expression at different developmental stages (e.g., [14,15]). Therefore, a cDNA library has been built from pools of mating haploid cells and growing infectious hyphae for a single dikaryotic isolate of M. violaceum collected from the host plant species . Genes involved in mating and in early pathogenic development were expected to be expressed under these conditions because they represent the mating and infectious stages. We generated 24,128 ESTs from this library, on which we performed similarity searches in order to identify genes with functions known to be important for these developmental stages.

Results and discussion

EST sequence analysis

The cDNA library created from poly(A)+ mRNA from seven-day-old mixed A1 and A2 cultures produced enough material to sequence 40,000 clones. A total of 28,430 sequences were obtained (success rate of 71%) with an average read length of 815 bp, which is similar to that of the U. maydis EST library [14]. Some ribosomal (n = 109), mitochondrial (n = 16) and vector (n = 14) sequences were identified. After discarding them, a total of 24,128 ESTs were obtained (85% of the initial sequences). After trimming vector and low quality sequences, the average cDNA read was fairly short, at 345 ± 167 bases (mean ± SD). We indeed did not select the sizes of mRNA, as recommended for normalized libraries.

These 24,128 ESTs were assembled into 4,178 contigs while 3,587 remained as singlets (Figure 1). This corresponds to a redundancy of 85% (number of ESTs assembled in clusters/total number of ESTs), which is very high compared to the redundancy obtained in other fungal EST libraries such as Phytophthora parasitica (49.5%, [16]), Botrytis cinerea (67%, [15]) or Ustilago maydis (72.3%, [14]). This does not result from a low efficiency of our normalization, but from the large scale of the present study compared to the ones cited above. Our library was indeed 6.65× the size of the P. parasitica library, 3.74× that of B. cinerea and 8.4× that of U. maydis. The number of unisequences (i.e. all contigs and singlets) identified in our library represents 7,765 putative unique genes, which lies within the range of total gene numbers in fungi (5,570 to 16,597 [17]).

Figure 1: Distribution of Microbotryum violaceum ESTs. EST redundancy among the 7,765 unisequences obtained from a cDNA library of the basidiomycete fungus Microbotryum violaceum. The number of ESTs is indicated above each bar.

Online database: MICROBASE

A website is available with open access to the EST sequences, unisequences and annotations [18]. Several tools are provided for visualising contig assemblies and for searching our ESTs or unisequences by BLAST, by annotation, by function, and by sequence ID. The database was named MICROBASE, after Microbotryum EST database.

Functional classification

Similarity searches performed on the 7,765 unisequences indicated that 1,953 (25.2%) had a highly significant similarity to UniProt or GenBank entries (E-value ≤ 10^-10). Among these, 125 unisequences were similar to strictly "hypothetical" or "unknown" proteins. A total of 818 sequences (10.5%) could be classified in a putative cellular function according to the characterization scheme outlined by the Gene Ontology Consortium. In addition, 5,772 (74.3%) unisequences showed moderate to very low similarity (10^-10 < E-value < 10^-1) to the UniProt and GenBank databases. A total of 125 unisequences (0.5%) had no similarity to any existing UniProt or GenBank entry ("orphans"). This high frequency of genes without a significant BLAST hit is similar to that in previous fungal EST libraries (e.g., [14,15]). In some cases, this lack of similarity to protein database entries could be due to the sequence being derived from the 5' or 3' untranslated region of the cDNA [19]. Among the 1,953 sequences that had a highly significant similarity to GenBank entries, 93.48% had their most significant hit against fungal sequences and 4.79% against sequences from other organisms: animals (1.6%), plants (2.58%), protozoa (0.37%) and bacteria (0.24%).

Regarding the Gene Ontology classes in our M. violaceum EST library, the molecular function class was the most abundant (37.65%), followed by the cellular component class (30.68%), and by the biological process class (16.35%). Within the molecular function class, sequences classified in catalytic and binding activities were the most abundant (Figure 2). We also found 33 unisequences with significant similarity to genes belonging to, or linked to, the mating-type loci of other basidiomycetes (see the Table in Manual Annotations in the "Annotations" section at MICROBASE) and 70 unisequences with significant similarity to genes that have been shown experimentally to be involved in pathogenicity in other fungi according to the PHI-base [2] (Table 1, see also Manual Annotations in the "Annotations" section at MICROBASE). In addition, 148 unisequences (15.31%) showed significant similarity to transposable elements.

Sequences relevant to mating-types

Our cDNA library contained 70 sequences presenting a similarity (E-value ≤ 10^-10) with genes belonging to, or linked to, the MAT loci in other fungi. According to the Gene Ontology classification, most of these sequences (61%) would have molecular functions. Thirteen sequences were similar to pheromone receptors, transporters or response factors, mainly from the other basidiomycete species Coprinopsis cinerea, Schizophyllum commune, Ustilago maydis and Cryptococcus neoformans. We also identified 332 sequences similar to genes regulating mating, morphogenesis and pathogenesis, such as the mitogen-activated protein kinase (MAPK) and the cAMP dependent protein kinase (PKA), components of the PKA/MAPK network in U. maydis [4]. Other sequences had a significant similarity to transcription factors, like the Prf1 of U. maydis [20], which are essential for the interconnection between the PKA and MAPK pathways.

The most interesting ESTs regarding the MAT locus of M. violaceum were those constituting the four unisequences similar to pheromone receptors. These four unisequences (the singlet pr0aaa87yh06 and the contigs 588, 2096 and 660) showed significant sequence similarity to each other but not enough to be assembled in a single contig. We designed primers (Table 2) within each of the four unisequences similar to pheromone receptors and performed

Figure 2: Molecular function categories of Microbotryum violaceum sequences. Distribution of the 797 contigs and singlets having a significant BLAST hit in public databases into molecular function classes according to the Gene Ontology classification.
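As a rough cross-check of the assembly and annotation figures reported in the Results, the sketch below recomputes the sequencing success rate, the redundancy (ESTs assembled in clusters divided by total ESTs) and the similarity fractions from the counts given in the text, and bins a BLAST E-value using the two thresholds quoted there (10^-10 and 10^-1). The variable and function names are illustrative and are not part of MICROBASE; treating a missing hit or an E-value of 10^-1 or more as "no hit" is an assumption made here for completeness.

```python
# Counts quoted in the Results section of the text.
clones_sequenced = 40_000
raw_reads = 28_430
clean_ests = 24_128
contigs = 4_178
singlets = 3_587
strong_hits = 1_953   # E-value <= 1e-10
weak_hits = 5_772     # 1e-10 < E-value < 1e-1

# Unisequences are all contigs plus all singlets.
unisequences = contigs + singlets


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


def similarity_class(e_value):
    """Bin a best-hit E-value with the thresholds quoted in the text.

    Assumption (not stated in the source): a missing hit (None) or an
    E-value of 1e-1 or more is reported as "no hit".
    """
    if e_value is None or e_value >= 1e-1:
        return "no hit"
    if e_value <= 1e-10:
        return "highly significant"
    return "moderate to very low"


if __name__ == "__main__":
    print("unisequences:", unisequences)
    print("sequencing success rate (%):", pct(raw_reads, clones_sequenced))
    # Redundancy = ESTs assembled in clusters / total ESTs.
    print("redundancy (%):", pct(clean_ests - singlets, clean_ests))
    print("highly significant hits (%):", pct(strong_hits, unisequences))
    print("moderate to very low hits (%):", pct(weak_hits, unisequences))
    print("class of E = 6e-53:", similarity_class(6e-53))
```

With these inputs the recomputed values agree, after rounding, with the 71% success rate, 85% redundancy, 25.2% and 74.3% quoted in the text.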


Table 1: Contigs of Microbotryum violaceum blasting significantly to the pathogenicity-related genes reported in the PHI-database. A blank "Putative function" cell means the same function as the row above.

Putative function | Contig or singlet ID | E-value | EMBL accession | PHI accession | Gene Ontology category | Gene Ontology class
ABC-transporter | 1816 | 6.00E-53 | BAC67162 | 391 | Transporter activity/catalytic activity | Molecular function
 | 211 | 1.00E-16 | AAK15314 | 310 | Transporter activity/catalytic activity | Molecular function
Acetolactate synthase | 3362 | 8.00E-20 | AAR29084 | 358 | Catalytic activity | Molecular function
Adenylate cyclase | pr0aaa104yj02scm1.1 | 4E-08 | AAG60619 | 241 | Catalytic activity | Molecular function
ATP molecular dependent chaperone | 342 | 1.00E-60 | AAA02743 | 463 | Binding | Molecular function
Benomyl/methotrexate resistance | 547 | 2E-11 | CAA37820 | 26 | Transporter activity | Molecular function
Capsule protein | pr0aaa63yn21scm1.1 | 8.00E-15 | BAC76819 | 139 | Transporter activity | Molecular function
Carnitine acetyl transferase | 1367 | 2E-14 | AAB88887 | 120 | Catalytic activity | Molecular function
 | 830 | 1E-12 | AAB88887 | 120 | Catalytic activity | Molecular function
Chitin synthase | 1149 | 1.00E-31 | AAC34496 | 236 | Catalytic activity | Molecular function
 | pr0aaa19yo09scm1.1 | 1E-11 | AAC35278 | 237 | Catalytic activity | Molecular function
 | 2395 | 9.00E-27 | AAT77184 | 337 | Catalytic activity | Molecular function
Cyclophilin | 229 | 9.00E-23 | AAG13968 | 249 | Catalytic activity | Molecular function
 | 2275 | 1.00E-20 | AAF69795 | 213 | Catalytic activity | Molecular function
Exopolygalacturonase PGX1 | pr0aaa12yk07scm1.1 | 1.00E-30 | AAK81847 | 181 | Catalytic activity | Molecular function
G protein alpha subunit | 3233 | 4.00E-22 | AAC49724 | 76 | Binding | Molecular function
Glyoxaloxidase 1 | 1675 | 9.00E-15 | CAD79488 | 352 | Catalytic activity | Molecular function
G-protein beta subunit 1 | 402 | 9.00E-15 | AAP55639 | 316 | Binding | Molecular function
Guanine nucleotide exchange factor Cdc24 | pr0aaa67yi20scm1.1 | 9.00E-19 | AAO25556 | 283 | Enzyme regulator activity | Molecular function
Guanyl nucleotide exchange factor Sql2 | pr0aaa26ye11scm1.1 | 2E-14 | AAO19638 | 319 | Enzyme regulator activity | Molecular function
 | pr0aaa94yf22scm1.1 | 6.00E-15 | AAA02743 | 463 | Binding | Molecular function
Imidazole glycerol phosphate dehydratase | 2572 | 3.00E-17 | AAB88888 | 121 | Cellular process/metabolic process | Biological process
Isocitrate lyase | 1103 | 2.00E-39 | AAM89498 | 261 | Catalytic activity | Molecular function
 | pr0aaa36yl22scm1.1 | 1.00E-36 | AAM89498 | 261 | Catalytic activity | Molecular function
MAP Kinase | pr0aaa24yh13scm1.1 | 6.00E-29 | AAO27796 | 342 | Binding | Molecular function
 | 1219 | 4.00E-23 | CAC47939 | 245 | Binding | Molecular function
 | 908 | 2.00E-16 | AAK49432 | 234 | Binding | Molecular function
 | 955 | 8.00E-31 | AAF15528 | 151 | Binding | Molecular function
 | 374 | 5.00E-25 | AAF15528 | 151 | Binding | Molecular function
Mitochondrial glycoprotein, Mrb1 | 1454 | 9.00E-19 | AAT81148 | 367 | Multiorganism process | Biological process
NADH-ubiquinone oxidoreductase 49 kDa subunit, mitochondrial precursor | 2040 | 3.00E-32 | EAA69636 | 445 | Multiorganism process/catalytic activity | Biological process/molecular function
Peroxisome biogenesis – Pex6 protein | 2886 | 3.00E-31 | AAK16738 | 226 | Metabolic process/cellular process | Biological process
Pheromone receptor CPRa1p | 660 | 2.00E-32 | AAK31936 | 292 | Signal transducer activity | Molecular function
Phosphatidylinositol 3-kinase | 4039 | 1.00E-27 | CAA70254 | 195 | Catalytic activity | Molecular function
Polygalacturonase | 3187 | 2E-11 | CAA71246 | 247 | Catalytic activity | Molecular function
Protein kinase | 444 | 1.00E-16 | AAW46354 | 360 | Binding/catalytic activity | Molecular function
 | 2829 | 9.00E-48 | AAB68613 | 85 | Binding/catalytic activity | Molecular function
 | 2748 | 2.00E-23 | AAC09291 | 158 | Binding/catalytic activity | Molecular function
Protein mannosyltransferase | 1910 | 1.00E-51 | AAF16867 | 452 | Catalytic activity | Molecular function
 | pr0aaa54yd01scm1.1 | 8.00E-48 | CAA67930 | 104 | Catalytic activity | Molecular function
Putative branched-chain amino acid aminotransferase | 1888 | 9.00E-21 | AAD45321 | 157 | Catalytic activity | Molecular function
Rab subfamily of small GTPases, Rsr1p | 231 | 8.00E-26 | CAC41973 | 339 | Binding | Molecular function
 | pr0aaa81ye23scm1.1 | 5.00E-22 | CAC41973 | 339 | Binding | Molecular function
 | pr0aaa90yb05scm1.1 | 5E-10 | CAC41973 | 339 | Binding | Molecular function
 | 3235 | 1.00E-44 | CAC41973 | 339 | Binding | Molecular function
Ras-like small GTPases CaRho1 | 3583 | 4.00E-38 | BAA24262 | 270 | Binding | Molecular function
Topoisomerase I | 1694 | 1.00E-17 | AAB39507 | 80 | Binding/catalytic activity | Molecular function
Transcriptional repressor | pr0aaa11yo22scm1.1 | 2.00E-20 | AAB63195 | 211 | Cellular process/metabolic process | Biological process
 | pr0aaa104ym01scm1.1 | 9.00E-20 | AAB63195 | 211 | Cellular process/metabolic process | Biological process
Transmembrane protein | 631 | 5.00E-42 | AAD51594 | 267 | Binding | Molecular function
 | pr0aaa92yb19scm1.1 | 5.00E-32 | AAD51594 | 267 | Binding | Molecular function
 | 2916 | 7.00E-25 | AAD51594 | 267 | Binding | Molecular function
 | pr0aaa47yd11scm1.1 | 7.00E-22 | AAD51594 | 267 | Binding | Molecular function
 | 1972 | 3.00E-19 | AAD51594 | 267 | Binding | Molecular function
 | 3153 | 1E-13 | AAD51594 | 267 | Binding | Molecular function
 | pr0aaa62yh02scm1.1 | 7E-13 | AAD51594 | 267 | Binding | Molecular function
 | 2016 | 2E-11 | AAD51594 | 267 | Binding | Molecular function
Trehalose-6-phosphate phosphatase | 2473 | 9.00E-49 | AAN46744 | 322 | Catalytic activity | Molecular function
 | 960 | 2.00E-21 | AAN46744 | 322 | Catalytic activity | Molecular function
Uac | pr0aaa75yc14scm1.1 | 9E-14 | AAA57469 | 22 | Catalytic activity | Molecular function
Urease | pr0aaa84yg09scm1.1 | 3E-13 | AAC62257 | 194 | Metabolic process | Biological process
 | 1663 | 4E-11 | AAC62257 | 194 | Metabolic process | Biological process
Vacuolar (H+)-ATPase subunit | 2474 | 2E-13 | AAK81705 | 235 | Localization | Cellular component
Virulence associated DEAD box protein 1 | 1516 | 3E-13 | AAV41010 | 423 | Binding | Molecular function
 | 2939 | 4E-12 | AAV41010 | 423 | Binding | Molecular function
 | pr0aaa70ym18scm1.1 | 8E-11 | AAV41010 | 423 | Binding | Molecular function
Hypothetical protein | 1112 | 1.00E-28 | EAL03139 | 290 | Binding | Molecular function
 | 1224 | 4.00E-16 | EAL03139 | 290 | Binding | Molecular function
 | pr0aaa11yc08scm1.1 | 7E-13 | EAL03139 | 290 | Binding | Molecular function


PCRs on A1 and A2 sporidial lines of ten strains of M. violaceum. Amplification products were larger than expected from the ESTs for 3 of the unisequences, indicating the presence of introns (Table 2). The amplifications corresponding to each of the four unisequences were specific to a single mating type (Table 2).

Furthermore, the p-distance [21] showed that sequences of singlet pr0aaa87yh06 and contig 588 were highly similar (p = 0.273) and identical on the second halves of the sequences (p = 0.000). Inspection of the chromatograms showed that one of the 3 ESTs assembled in the contig 588 was of very poor quality on the first half of the sequence, suggesting that the singlet pr0aaa87yh06 and contig 588 were probably transcripts of the same gene. This was checked by designing primers on the most different parts of the two unisequences, whose amplification products indeed yielded identical sequences, including the intronic parts.

These two sequences were less similar to the contigs 2096 and 660 (p = 0.702 and 0.793 respectively). The contigs 2096 and 660 overlapped over only 25 bp, but aligned perfectly with each other at their edges (p = 0.000), suggesting that they represent ESTs from the same gene. Contigs 2096 and 660 were not assembled into a single contig because the region of overlap with sufficient PHRED quality sequence was too short. The PCR performed using the forward primer of Contig 2096 and the reverse primer of Contig 660 (Table 2) yielded a single amplification product whose sequence read without apparent heterogeneity on the chromatograms. This indicates that the contigs 2096 and 660 indeed correspond to the same pheromone receptor.

Microbotryum violaceum thus appears to carry a single pheromone receptor at the A1 locus and a single pheromone receptor at the A2 locus, which would be in agreement with its bipolar status. In contrast, tetrapolar species such as C. cinereus and S. commune have several pheromone receptors at each of the alternate forms of the B mating type locus [22,23]. A genome walking approach allowed us to obtain the complete sequence of the putative A1 and A2 pheromone receptors of Microbotryum violaceum (Figure 3A; GenBank accession numbers EF584742 and EF584741, respectively for the A1 and A2 pheromone receptors).

The putative pheromone receptors identified in our cDNA library did not show highly significant similarity to the

Table 2: Unisequences of Microbotryum violaceum blasting against pheromone receptors. For each of the four unisequences significantly blasting against pheromone receptors: best hits, primers designed for PCR amplification, expected size from the EST sequence, rough amplification size obtained, and mating type of the sporidia that gave amplification products.

Unisequence | Number of ESTs | Best hits | Primers | Contig size obtained from ESTs (bp) | Rough amplification size (bp) | Amplification in sporidia of mating type
Contig588 | 3 | Rcb1 of Coprinopsis cinerea; Bbr2 of Schizophyllum commune | F: GGAAGGCCATTACAAGAAAGG; R: TGTGCTTTTCGCTCTTAGCA | 350 | 500 | A2 only
Contig660 | 4 | Rcb2 of C. cinerea; B alpha 8 of S. commune | F: ACGATTCCAGTAGGCGTGAA; R: CTGCGTCACGATACCTTTCTT | 551 | 800 | A1 only
Contig2096 | 3 | Bbr2 of S. commune; Rcb3 of C. cinerea | F: TCCTTTGTCACGACAAGCAC; R: CCAATTTTCACGCCTACTGG | 213 | 220 | A1 only
Singlet pr0aaa87yh06 | 1 | Rcb1 of C. cinerea; Bbr2 of S. commune | F: ATCAGAATATGACGGCAGCA; R: AAGAAAGGGAACTCCAAATGC | 383 | 600 | A2 only

F: forward primer; R: reverse primer.
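The p-distance used in the text to compare the four unisequences is the proportion of aligned sites at which two sequences differ. A minimal sketch is given below, assuming the two sequences are already aligned to equal length and that sites with a gap in either sequence are skipped (pairwise deletion). The function name and the toy sequences are hypothetical and are not taken from the M. violaceum data.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of compared aligned sites at which two sequences differ.

    Assumptions of this sketch: the inputs are pre-aligned (equal length),
    comparison is case-insensitive, and any site with a gap in either
    sequence is ignored (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped sites
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return differing / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: 2 differing sites out of 10.
    toy_a = "ACGTACGTAC"
    toy_b = "ACGTTCGTAA"
    print(p_distance(toy_a, toy_b))
```

Under this definition, p = 0.000 means the compared region is identical, while values such as 0.702 or 0.793 mean that most aligned sites differ.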


pheromone receptors of U. maydis and U. hordei, which explains why they hybridized only weakly on Southern blots [24], and why attempts to clone the M. violaceum mating type locus by designing degenerate primers from the U. maydis sequences have failed (T. Giraud, unpublished).

The cloning of the complete mating type locus of M. violaceum is currently under way, starting from the pheromone receptors obtained in the present library. The complete sequence of the mating-type locus will make it possible to identify the organisation and composition of this genomic region, and thus to understand how the transition occurred between tetrapolarity and bipolarity in M. violaceum or its ancestral lineages. One tentative hypothesis given the data at hand is that there is a single allele of each mating type locus and that the two mating type loci are linked, as in U. hordei [7]. Recombination is indeed suppressed along most of the sex chromosomes in M. violaceum [25].

Other sequences relevant to pathogenesis

A total of 70 sequences had a high similarity to genes shown experimentally to play a role in pathogenicity in other fungi (Table 1). An important class of proteins in pathogenicity is the secretome, which plays important roles in penetration and colonization of plant tissues [26]. No sequence in MICROBASE presented high similarity with genes encoding cell wall-degrading enzymes, such as lyases, lipases or proteases, and we detected only two polygalacturonases. Plant pathogens that kill host cells, like Magnaporthe grisea and Fusarium graminearum, contain in their genome many genes involved in degradation of cell tissue. In contrast, it is not surprising to find a reduced number of genes involved in such hydrolytic functions in fungi with a biotrophic life style in which host damage is

Figure 3: Putative pheromone receptors in Microbotryum violaceum. A) Diagram of the two putative pheromone receptor genes identified in the EST library of Microbotryum violaceum, respectively linked to the A1 and A2 mating type. B) Alignment of the two putative pheromone receptors of Microbotryum violaceum with the most similar published protein sequences of other fungi: B2 and B-alpha of Schizophyllum commune, the transmembrane pheromone receptor of Coprinellus disseminatus and Rcb3B5 of Coprinopsis cinerea.


minimized, like M. violaceum. Similar conclusions have been drawn from the complete genome sequence of Ustilago maydis [27], which also has a biotrophic life style. The genome of U. maydis contained in contrast numerous secreted proteins with unknown functions, and even with no similarity to any other proteins in the databases. The total number (594) of proteins predicted to be secreted in MICROBASE was similar to that in the genome of U. maydis [27], and the percentage of secreted proteins without a significant hit in databases was also very high (86.4% in MICROBASE). This suggests that the specific and intimate relationships between biotrophic fungi and their host plant select for specific secreted functions.

In several fungi, the cAMP signalling and two MAP kinase pathways have been implicated in regulating various plant infection processes, in particular in monocot-infecting smuts [28]. Several contigs of M. violaceum were similar to enzymes of these molecular pathways, including G proteins, protein kinases and Ras proteins. In U. maydis for instance, disruption of Ras2 resulted in loss of pathogenicity and dramatic changes in cell morphology [29]. Another important molecular pathway in pathogenic fungi is the Calcineurin/cyclophilin signalling [30], for which we also detected putative genes in MICROBASE. Other important molecules involved in pathogenicity belong to the secondary metabolism, which includes P450 genes, such as the putative ones present in MICROBASE, or the small peptides synthesized by nonribosomal peptide synthases (NRPS). We detected contigs similar to NRPS, such as the one similar to CPS1 [31].

Expressed transposable elements

Our library presented 148 unisequences with significant similarity to transposable elements (TE), with an additional 10 showing putative or weak similarity to TEs. The 148 unisequences, when categorized by the major types of Class I (RNA-based replication) and Class II (DNA-based replication) transposable elements, were in similar relative frequencies as TEs from the M. violaceum genomic survey [12] (Figure 4). Putative Class II DNA transposons and a maturase sequence from a mitochondrial Group II Intron were also identified among the expressed sequences, but were not found in the prior genomic survey. Although Hood et al. [32] showed that the RIP (repeat-induced point mutation) genome defense has been very active in M. violaceum, our results suggest that the transposable elements can escape this genomic mechanism of defense to some extent, at least regarding the transcriptional activity. In fact, there is evidence of RIP mutation among the expressed TE sequences; five unisequences could be aligned with the genomic consensus of the copia-like integrase gene from a prior analysis of RIP in M. violaceum [32], and among these alignments mutations at RIP recognition sites were 2 to 3 times more frequent than at any other sites.

Figure 4: Comparison of expressed and genomic copies of Microbotryum violaceum transposable elements. Class I elements are represented by the Copia, Gypsy, and Non-LTR (long-terminal repeat) categories; Class II elements are represented by the Helicase and DNA Transposon categories. The Group II Intron category corresponds to a mitochondrial mobile element. The data on genomic survey sequences are from ref [12].

Prior studies have reported that some unidentified transposable elements may be active only during mitosis, whereas others would be active during meiosis [33], and the conditions under which our library was built may therefore lead to an underestimation of the TE transcriptional activity. A more specific study is required to understand the importance of the RIP mechanism in the accumulation of transposable elements in the genome of M. violaceum, especially as RIP-affected and non-functional TE copies may still be transcribed. Comparing TE transcripts in MICROBASE with the genomic copies should help to estimate the impact of the RIP defense mechanism in M. violaceum. We did not identify any EST similar to the RID (RIP defective) DNA methyltransferase gene required for RIP in Neurospora crassa [34], although we detected several sequences similar to methyltransferases.

Conclusion

This study, providing the first extensive genomic dataset on M. violaceum, has permitted the detection of many genes putatively involved in mating, some of which were shown to be linked to the mating-type locus, and also many genes possibly involved in pathogenesis. Studies of reverse genetics are however required to validate these

Page 9 of 12 (page number not for citation purposes) BMC Genomics 2007, 8:272 http://www.biomedcentral.com/1471-2164/8/272

putative biological functions. Studies of comparative genomics among fungi should also benefit from the existence of resources such as the MICROBASE [35]. This extensive database will allow not only comparison of sequence evolution among species, but also searches for the presence of genes and for the size of gene families. Such comparative approaches yield valuable insights into the evolution of host-pathogen interactions [35]. Furthermore, it is now possible to clone and sequence the whole mating type locus of M. violaceum, which will allow its organization to be elucidated. Comparison with the mating type loci of other basidiomycetes will then provide insights into its evolution, in particular into the mechanism of the transition between tetra- and bipolarity. Finally, the high expression level of transposable elements raises questions about the importance of the RIP genome defense, and how it can be escaped.

Methods

Microbotryum violaceum strain and culture conditions
Teliospores from the strain 100.02 of M. violaceum, collected from the host Silene latifolia in 2001 in the Alps, near Tirano in Italy, were plated on GMB1 medium [36]. On such nutritive media, diploid teliospores germinate and produce haploid sporidia of the two mating types, A1 and A2. A1 and A2 sporidia lines from the strain 100.02 were identified by pairing with existing stocks of known mating type.

A mixed suspension of A1 and A2 sporidia (250 μL of each) was plated on water agar supplemented with α-tocopherol (10 IU/g) and incubated at 4°C for one week. These conditions of low nutrients with α-tocopherol are thought to mimic the host plant surface for the fungus, because sporidia conjugate and produce hyphae of a few cells [37]. This was checked using a light microscope (400×).

RNA isolation, cDNA library construction and sequencing
Total RNA was extracted from conjugated cells and hyphae using the Trizol reagent following the manufacturer's protocol (Invitrogen, The Netherlands). Extractions yielded 50 μg–500 μg of total RNA. Poly (A+) RNA was purified using the Absolutely mRNA Purification Kit (Stratagene, CA). The SuperSmart cDNA Synthesis Kit (Clontech, CA) was used to synthesize cDNA, and the library was normalized using the Trimmer kit (Evrogen, Moscow), which reduces the quantity of the most abundant cDNA copies. cDNAs were ligated into the pGEM-T vector (Promega, WI). To test the quality of the ligation, we transformed ultracompetent cells (XL10-Gold, Stratagene, CA) and amplified inserts from 100 clones. The average size of inserts was 500 bp. Individual colonies were examined using the blue-white selection for the vector, giving >50% of vector with inserts and an estimate of 2.0 × 10^5 cfu. Forty thousand clones were then sequenced in one direction by the Genoscope (Evry, France) using the primer of the cDNA synthesis kit (SMART II A Oligonucleotide 5'-AAG CAG TGG TAT CAA CGC AGA GTA CGC GGG-3').

Sequence analyses and EST clustering
Raw sequence data were cleaned from vector and adaptor sequences. Contaminating sequences, such as plasmid, E. coli, mitochondrial or ribosomal fungal sequences, were removed from the analyses. PHRED software [38,39] was used for base-calling the chromatogram trace files. Only sequences with a PHRED score over 20 on at least 100 bp were released in the EST division of the EMBL-EBI (European Molecular Biology Laboratory – European Bioinformatics Institute) Nucleotide Sequence Database.

ESTs were aligned and assembled into contigs using the CAP3 software [40] when the criterion of a minimum identity of 95% over 50 bp was met. When an EST could not be assembled with others in a contig, it remained as a "singlet". The contigs and the singlets should thus correspond to sequences of unique genes, and will be called hereafter "unisequences". The consensus sequences of the contigs and the sequences of the singlets were compared to the sequences in the GenBank database and in the Uniprot database using the tBLASTx and the BLASTx algorithms [41]. Unisequences showing significant similarity (E-value ≤ 10^-4) to database entries were annotated using the most significant matches. Unisequences were also classified into Gene Ontology functional categories [42] based on BLAST similarities to known genes of the NCBI nr (non-redundant) protein database and using the Blast2GO annotation tool [43]. Sequences were also compared to the pathogenicity genes assembled in the PHI database [2,44] and to the genes linked to the mating-type locus in other fungi using a manually built list of such genes. The sequences showing significant similarity to transposable elements were also recorded. WoLF PSORT version 2.0 [45] was used to predict protein localization using the higher prediction score for external compartments. Finally, using a modified version of the ESTIMA tool [46] we developed a public database named MICROBASE, dedicated to Microbotryum violaceum EST management and analysis. This database includes information on EST sequences, contigs, annotations, gene ontology functional categories and search programs to compare similarities of any sequence against the database. MICROBASE is freely accessible through a web interface [18].

Amplification of putative pheromone receptors
Primers were designed in the four unisequences with significant sequence similarity to pheromone receptors (Table 2) and amplifications were performed on DNA extracted from sporidia of known mating type, from ten


different strains of M. violaceum, of various geographical origins. DNA was extracted from single-sporidial colonies using the Chelex (Biorad) protocol [47]. PCR amplifications were performed using a PTC 100 thermal cycler (MJ Research), with 65°C as the annealing temperature so that the amplification would be as specific as possible, using the Qbiogene (Irvine, CA) Taq polymerase following the manufacturer's recommendations.

Genome walking
High quality genomic DNA was isolated from a Microbotryum violaceum strain from S. latifolia. The DNA was digested by blunt end cutting enzymes (DraI, PvuII, EcoRV and StuI) provided in the Universal GenomeWalker kit (BD Biosciences, Clontech, USA). The digested DNA was then purified and ligated overnight with the adaptors provided in the kit. The genome walking approach was followed according to the manufacturer's instructions.

Authors' contributions
TG and RY contributed to the conception and design of the study, to the acquisition and analysis of data, to coordination of the study, and were involved in drafting the manuscript. SM, HC, MEH participated in data analysis and drafting of the manuscript. FR and AG were involved in data analysis. CR and PW carried out the sequencing and first data analysis. BD performed sequences and analyses of the pheromone receptors. All authors read and approved the final manuscript.

Acknowledgements
We thank Jessie Abbate for the genome walking libraries and Bernard Lejeune for helpful discussions and advice. Muriel Viaud, Bernard Lejeune and Marc-Henri Lebrun provided helpful comments on an earlier draft of the manuscript. Joelle Amselem provided help in sequence analysis. This work was funded by ACI Jeunes Chercheurs and by the "Consortium National de Recherche en Génomique" for sequencing the library.

References
1. Xu J: Microbial ecology in the age of genomics and metagenomics: concepts, tools, and recent advances. Molecular Ecology 2006, 15:1713-1729.
2. Winnenburg R, Baldwin TK, Urban M, Rawlings C, Kohler J, Hammond-Kosack KE: PHI-base: a new database for pathogen host interactions. Nucl Acids Res 2006, 34:D459-464.
3. Brachmann A, Weinzierl G, Kamper J, Kahmann R: Identification of genes in the bW/bE regulatory cascade in Ustilago maydis. Molecular Microbiology 2001, 42:1047-1063.
4. Feldbrugge M, Kamper J, Steinberg G, Kahmann R: Regulation of mating and pathogenic development in Ustilago maydis. Current Opinion in Microbiology 2004, 7:666-672.
5. Raper JR: Genetics of sexuality in higher fungi. New York, Ronald Press; 1966.
6. Raper JR, Flexer AS: Mating systems and evolution of the Basidiomycetes. In Evolution in the Higher Basidiomycetes. Edited by: Petersen RH. Knoxville, TN, University of Tennessee Press; 1971:149–167.
7. Bakkeren G, Kronstad JW: Linkage of mating-type loci distinguishes bipolar from tetrapolar mating in Basidiomycetous smut fungi. Proc Natl Acad Sci U S A 1994, 91(15):7085-7089.
8. James TY, Srivilai P, Kues U, Vilgalys R: Evolution of the bipolar mating system of the mushroom Coprinellus disseminatus from its tetrapolar ancestors involves loss of mating-type-specific pheromone receptor function. Genetics 2006, 172:1877-1891.
9. Hibbett DS, Donoghue MJ: Analysis of character correlations among wood decay mechanisms, mating systems, and substrate ranges in homobasidiomycetes. Systematic Biology 2001, 50:215-242.
10. Thrall PH, Biere A, Antonovics J: Plant-life history and disease susceptibility – the occurrence of Ustilago violacea on different species within the Caryophyllaceae. Journal of Ecology 1993, 81:489-498.
11. Day AW: Mating type and morphogenesis in Ustilago violacea. Bot Gaz 1979, 140:94-109.
12. Hood ME: Repetitive DNA in the automictic fungus Microbotryum violaceum. Genetica 2005, 124:1-10.
13. Wang L, Perlin MH: Isolation of a novel gene (HSGc11) whose expression is apparently limited to the hyphal stage of Microbotryum violaceum. International Journal of Plant Sciences 1998, 159:206-212.
14. Sacadura NT, Saville BJ: Gene expression and EST analyses of Ustilago maydis germinating teliospores. Fungal Genet Biol 2003, 40(1):47-64.
15. Viaud M, Legeai F, Pradier JM, Brygoo Y, Bitton F, Weissenbach J, Brunet-Simon A, Duclert A, Fillinger S, Fortini D, Gioti A, Giraud C, Halary S, Lebrun I, Le Pecheur P, Samson D, Levis C: Expressed sequence tags from the phytopathogenic fungus Botrytis cinerea. European Journal of Plant Pathology 2005, 111:139-146.
16. Panabieres F, Amselem J, Galiana E, Le Berre JY: Gene identification in the oomycete pathogen Phytophthora parasitica during in vitro vegetative growth through expressed sequence tags. Fungal Genet Biol 2005, 42:611-623.
17. Galagan JE, Henn MR, Ma LJ, Cuomo CA, Birren B: Genomics of the fungal kingdom: Insights into eukaryotic biology. Genome Research 2005, 15:1620-1631.
18. Database MICROBASE [http://genome.jouy.inra.fr/microbase]
19. Skinner W, Keon J, Hargreaves J: Gene information for fungal plant pathogens from expressed sequences. Current Opinion in Microbiology 2001, 4:381-386.
20. Hartmann HA, Kahmann R, Bolker M: The pheromone response factor coordinates filamentous growth and pathogenicity in Ustilago maydis. Embo Journal 1996, 15:1632-1641.
21. Nei M, Kumar S: Molecular evolution and phylogenetics. New York, Oxford University Press; 2000.
22. Fowler TJ, Mitton MF, Vaillancourt LJ, Raper CA: Changes in mate recognition through alterations of pheromones and receptors in the multisexual mushroom fungus Schizophyllum commune. Genetics 2001, 158:1491-1503.
23. Halsall JR, Milner MJ, Casselton LA: Three subfamilies of pheromone and receptor genes generate multiple B mating specificities in the mushroom Coprinus cinereus. Genetics 2000, 154:1115-1123.
24. Bakkeren G, Gibbard B, Yee A, Froeliger E, Leong S, Kronstad J: The a-loci and b-loci of Ustilago maydis hybridize with DNA-sequences from other smut fungi. Molecular Plant-Microbe Interactions 1992, 5:347-355.
25. Hood ME, Antonovics J, Koskella B: Shared forces of sex chromosome evolution in haploid-mating and diploid-mating organisms: Microbotryum violaceum and other model organisms. Genetics 2004, 168:141-146.
26. Kars I, van Kan JAL: Intracellular enzymes and metabolites involved in pathogenesis of Botrytis. In Botrytis: Biology, Pathology and Control. Edited by: Elad Y, Williamson B, Tudzynski P and Delen N. Kluwer Academic Publisher; 2004.
27. Kamper J, Kahmann R, Bolker M, Ma LJ, Brefort T, Saville BJ, Banuett F, Kronstad JW, Gold SE, Muller O, Perlin MH, Wosten HAB, de Vries R, Ruiz-Herrera J, Reynaga-Pena CG, Snetselaar K, McCann M, Perez-Martin J, Feldbrugge M, Basse CW, Steinberg G, Ibeas JI, Holloman W, Guzman P, Farman M, Stajich JE, Sentandreu R, Gonzalez-Prieto JM, Kennell JC, Molina L, Schirawski J, Mendoza-Mendoza A, Greilinger D, Munch K, Rossel N, Scherer M, Vranes M, Ladendorf O, Vincon V, Fuchs U, Sandrock B, Meng S, Ho ECH, Cahill MJ, Boyce KJ, Klose J, Klosterman SJ, Deelstra HJ, Ortiz-Castellanos L, Li WX, Sanchez-Alonso P, Schreier PH, Hauser-Hahn I, Vaupel M, Koopmann E, Friedrich G, Voss H, Schluter T, Margolis J, Platt D, Swimmer C, Gnirke A, Chen F, Vysotskaia V, Mannhaupt G, Guldener U, Munsterkotter M, Haase D, Oesterheld M, Mewes HW, Mauceli EW, DeCaprio D, Wade CM, Butler J, Young S, Jaffe DB, Calvo S, Nusbaum C, Galagan J, Birren BW: Insights from the genome of the biotrophic fungal plant pathogen Ustilago maydis. Nature 2006, 444:97-101.
28. Lee N, D'Souza CA, Kronstad JW: Of smuts, blasts, mildews, and blights: cAMP signaling in phytopathogenic fungi. Annu Rev Phytopathol 2003, 41:399-427.
29. Lee N, Kronstad JW: ras2 Controls morphogenesis, pheromone response, and pathogenicity in the fungal pathogen Ustilago maydis. Eukaryotic Cell 2002, 1:954-966.
30. Viaud M, Brunet-Simon A, Brygoo Y, Pradier JM, Levis C: Cyclophilin A and calcineurin functions investigated by gene inactivation, cyclosporin A inhibition and cDNA arrays approaches in the phytopathogenic fungus Botrytis cinerea. Molecular Microbiology 2003, 50:1451-1465.
31. Lu SW, Kroken S, Lee BN, Robbertse B, Churchill ACL, Yoder OC, Turgeon BG: A novel class of gene controlling virulence in plant pathogenic ascomycete fungi. Proc Natl Acad Sci U S A 2003, 100:5980-5985.
32. Hood ME, Katawczik M, Giraud T: Repeat-induced point mutation and the population structure of transposable elements in Microbotryum violaceum. Genetics 2005, 170:1081-1089.
33. Garber ED, Ruddat M: Genetics of Ustilago violacea. XXXIV. Genetic evidence for a transposable element functioning during mitosis and two transposable elements functioning during meiosis. International Journal of Plant Sciences 1998, 159:1018–1022.
34. Freitag M, Williams RL, Kothe GO, Selker EU: A cytosine methyltransferase homologue is essential for repeat-induced point mutation in Neurospora crassa. Proc Natl Acad Sci U S A 2002, 99:8802-8807.
35. Xu JR, Peng YL, Dickman MB, Sharon A: The dawn of fungal pathogen genomics. Annu Rev Phytopathol 2006, 44:337-366.
36. Thomas A, Shykoff J, Jonot O, Giraud T: Sex-ratio bias in populations of the phytopathogenic fungus Microbotryum violaceum from several host species. International Journal of Plant Sciences 2003, 164:641-647.
37. Day AW, Garber ED: Ustilago violacea, anther smut of the Caryophyllaceae. Advances in Plant Pathology 1988, 6:457-482.
38. Ewing B, Green P: Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Research 1998, 8:186-194.
39. Ewing B, Hillier L, Wendl MC, Green P: Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Research 1998, 8:175-185.
40. Huang X, Madan A: CAP3: A DNA Sequence Assembly Program. Genome Res 1999, 9:868-877.
41. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 1997, 25:3389-3402.
42. Database Gene Ontology [http://www.geneontology.org]
43. Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M: Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 2005, 21:3674-3676.
44. PHI database [http://www.phi-base.org]
45. Horton P, Park KJ, Obayashi T, Nakai K: WoLF PSORT: protein localization predictor. Nucleic Acids Res 2007, 35(Web Server issue):W585-7.
46. Kumar CG, LeDuc R, Gong G, Roinishivili L, Lewin HA, Liu L: ESTIMA, a tool for EST management in a multi-project environment. BMC Bioinformatics 2004, 5:176-187.
47. Bucheli E, Gautschi B, Shykoff JA: Differences in population structure of the anther smut fungus Microbotryum violaceum on two closely related host species, Silene latifolia and S. dioica. Molecular Ecology 2001, 10:285-294.
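Supplementary sketch (EST release filter). The Methods state that only sequences with a PHRED score over 20 on at least 100 bp were released. The following is a minimal illustration of one possible reading of that criterion, not the pipeline actually used: it assumes that the 100 qualifying bases are counted over the whole read and need not be contiguous. The function names and the default parameters are hypothetical; only the thresholds 20 and 100 come from the text.

```python
from typing import Sequence


def phred_error_probability(score: float) -> float:
    """Convert a PHRED quality score Q into the base-call error probability 10^(-Q/10)."""
    return 10 ** (-score / 10)


def count_high_quality_bases(quals: Sequence[int], min_score: int = 20) -> int:
    """Count the bases whose PHRED score is strictly above min_score."""
    return sum(1 for q in quals if q > min_score)


def passes_release_filter(
    quals: Sequence[int], min_score: int = 20, min_bases: int = 100
) -> bool:
    """Return True when at least min_bases bases score above min_score.

    Assumption: qualifying bases are counted over the whole read and do not
    have to form one contiguous stretch; the source text does not specify this.
    """
    return count_high_quality_bases(quals, min_score) >= min_bases
```

A PHRED score of 20 corresponds to an expected error rate of 1 in 100 base calls, which is why it is a common cut-off for retaining a base.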

