bioRxiv preprint doi: https://doi.org/10.1101/582205; this version posted March 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.

The Zymoseptoria tritici ORFeome: a functional community resource

Yogesh Chaudhari^*,a, Timothy C. Cairns^*,b, Yaadwinder Sidhu^a, Victoria Attah^a, Graham Thomas^a, Michael Csukai^c, Nicholas J. Talbot^d, David J. Studholme^a,e, Ken Haynes^a,f.

*Authors contributed equally to this study

^a Biosciences, , Exeter EX4 4QD, United Kingdom

^b Current address: Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China

^c Syngenta, Jealott’s Hill International Research Centre, Bracknell, RG42 6EY

^d Current address: The Sainsbury Laboratory, University of East Anglia, Norwich Research Park, NR4 7UH, United Kingdom

^e Corresponding author ([email protected])

^f Author deceased



Abstract:

Libraries of protein-encoding sequences can be generated by identification of open reading frames (ORFs) from a genome of choice that are then assembled into collections of plasmids termed ORFeome libraries. These represent powerful resources to facilitate functional genomic characterization of genes and their encoded products. Here, we report the generation of an ORFeome for Zymoseptoria tritici, which causes the most serious disease of wheat in temperate regions of the world. We screened the genome of strain IPO323 for high confidence gene models, identifying 4075 candidates from 10,933 predicted genes. These were amplified from genomic DNA, cloned into the Gateway® Entry vector pDONR207, and sequenced, providing a total of 3022 quality-controlled plasmids. The ORFeome includes genes predicted to encode effectors (n = 410) and secondary metabolite biosynthetic proteins (n = 171), in addition to genes residing on dispensable chromosomes (n = 122), or those that are preferentially expressed during plant infection (n = 527). The ORFeome plasmid library is compatible with our previously developed suite of Gateway® Destination vectors, which have various combinations of promoters, selection markers, and epitope tags. The Z. tritici ORFeome constitutes a powerful resource for functional genomics, and offers unparalleled opportunities to understand the biology of Z. tritici.

Keywords: ORFeome, Zymoseptoria tritici, Mycosphaerella graminicola, functional genomics.

Dedication: This paper is dedicated to the memory of Ken Haynes, who led the study and was an outstanding fungal biologist, as well as an inspirational colleague, friend and mentor to his fellow co-authors.


1. Introduction

Fungal pathogens kill more people per year than malaria, and result in crop destruction or post-harvest spoilage that destroys enough food to feed approximately 10% of the population (1, 2). Technological advances in fungal genomics, transcriptomics, proteomics, metabolomics, bioinformatics, and network analyses, however, now enable pathogenic fungi to be studied as integrated systems, providing unparalleled opportunities to understand their biology (3, 4). Functional genomic approaches, which define the function and interactions of genes and their encoded products at a genome or near-genome level, are increasingly used to dissect host interactions, virulence factors, drug resistance, and infectious growth during fungal disease (5–9). However, a significant constraint on functional genomic experiments is the high cost of reagents and labour, because thousands, or tens of thousands, of genes from a given fungal pathogen must be studied.

To address this challenge, community-accessible libraries have been developed, which consist of hundreds or thousands of individual genes, null mutants, or over-expression strains, and which enable facile, high-throughput experimentation by the end user at minimal expense (10–17). ORFeomes are collections of open reading frames (ORFs) that are encoded in a library of plasmid vectors. These resources have been generated for several model organisms, including humans, Escherichia coli, Caenorhabditis elegans, and Arabidopsis thaliana (18–23). ORFeomes have also been developed for the fungal kingdom, including fission yeast (18), budding yeast (24) and, most recently, the human pathogenic yeast Candida albicans (8). Usually, ORFeomes are compatible with the Gateway® cloning technology (Invitrogen), which enables rapid and high-throughput recombinase-based transfer of an ORF coding sequence to generate expression vectors (25, 26). Community access to hundreds or thousands of such plasmids in a single library enables highly flexible generation of expression vectors for high-throughput functional genomic experiments.

The filamentous ascomycete fungus Zymoseptoria tritici (previously Mycosphaerella graminicola) (27) causes Septoria tritici blotch, an important foliar disease of wheat (28). Z. tritici is a significant threat to international food security, and even with access to some resistant wheat cultivars and frequent fungicide applications, the estimated average yield losses due to this pathogen are still 10% (28). Even more strikingly, approximately 70% of agricultural fungicides in Europe are deployed to control just this single disease (29), which likely drives triazole resistance in fungal pathogens of humans as well as plants (30). These challenges are compounded by very high levels of genome plasticity and gene flow between populations of Z. tritici (31, 32). While recent efforts have characterized the underlying cellular biology and infectious growth (33, 34), transcriptionally deployed secondary metabolite loci (35, 36), and components of the secreted effector arsenal (37–40), the vast majority of genes and encoded proteins remain uncharacterized in the laboratory (27).

Recently, there has been a community-wide effort to develop numerous tools, techniques, and resources for Z. tritici. This research toolkit includes mutants in the non-homologous end joining pathway for highly efficient gene targeting (41), optimization of conditional expression systems (42), a range of fluorescent translational gene fusions for sub-cellular localization studies (43, 44), optimized virulence assays (45), and a suite of Gateway® Destination vectors (46, 47). These Gateway® Destination vectors have been validated using a pilot Gateway® Entry library to generate 32 over-expression mutants, demonstrating the role of a fungal-specific transcription factor in in vitro hyphal growth (48).

In this study, we report the generation of an improved functional genomics community resource to supplement these tools: a Gateway®-compatible Z. tritici ORFeome, which to our knowledge is the first such library for a plant-infecting fungus. This library is compatible with numerous Gateway® Destination vectors that provide multiple functionalities in Z. tritici, including numerous selection markers, epitope tags, and promoters (26, 47). For ORFeome construction, we first screened the IPO323 reference genome for high confidence gene models, yielding 4075 candidate ORFs from a possible 10,933 predicted genes. These were PCR amplified from genomic DNA and cloned into the Gateway® Entry vector pDONR207. Quality of ORF sequences was verified by a combination of Sanger and Illumina sequencing, yielding 3022 plasmids that passed quality control checks with 100% sequence verification.

The Z. tritici ORFeome described in this study is freely available to the research community. This resource can be rapidly utilized to interrogate the broadest aspects of Z. tritici biology, including identification of novel drug targets, mechanisms of drug detoxification and resistance, and pathogen virulence factors, which may ultimately enable development of new disease control strategies.


2. Materials and Methods

2.1 Strains used in this study

E. coli One Shot® ccdB Survival™ 2 T1R cells were used for propagation of pDONR207 (Invitrogen, UK) and Destination vectors. All Gateway® Entry and expression vectors were propagated in Escherichia coli DH5α (Invitrogen, UK).

2.2 Plasmids used in this study

For generation of Gateway® Entry vectors, this study utilized the Gateway® Donor vector pDONR207 (Invitrogen, UK), which contains a gentamicin resistance gene for selection in E. coli. This plasmid also contains a ccdB gene flanked by attP sequences for Gateway®-mediated recombination using the BP reaction.

2.3 Construction of the Z. tritici ORFeome

Generation of Gateway® Entry vectors was conducted as described previously (48). For PCR amplification of each gene of interest, forward primers were designed to include the attB1 site (ggggacaagtttgtacaaaaaagcaggcttg) and the first 20 bp of the gene, and reverse primers to include the attB2 site (ggggaccactttgtacaagaaagctgggtc) and the last 20 bp of the gene. The stop codon was excluded to enable C-terminal epitope tagging by the end user. Primers were synthesised by Sigma-Aldrich, UK, and are listed in Supplementary File S1. PCRs were conducted using Phusion® High-Fidelity DNA Polymerase (NEB, UK) with a 65 °C primer annealing temperature and an extension time of 30 seconds/kb, using Z. tritici IPO323 genomic DNA as template. PCR amplicons of predicted sizes were confirmed by gel electrophoresis, PEG purified, and suspended in 10 μl TE buffer (40 mM TRIS base, 20 mM glacial acetic acid, 0.1 mM EDTA, pH 8). For construction of Gateway® Entry vectors, 150 ng of pDONR207 was mixed with 2.5 μl of purified PCR product and 0.5 μl of Gateway® BP Clonase™, with TE buffer added to a total volume of 10 μl. Reactions were incubated at 25 °C for 12–24 h, then treated with Proteinase K (Invitrogen, UK) following the manufacturer’s instructions. E. coli strain DH5α was transformed with 5 μl of each reaction mixture. LB supplemented with gentamicin (50 μg/ml) was subsequently used to select transformants, which were grown overnight in LB medium with selection, and plasmids were extracted using a Plasmid Mini Kit (Qiagen, UK). Plasmids were indexed and stored in 96-well plates at -20 °C. All Entry vectors are detailed in Supplementary File S1.
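The primer design rule described in this section can be sketched in a few lines of Python. This is an illustrative sketch and not the authors' pipeline: the function names (`reverse_complement`, `design_gateway_primers`, `extension_seconds`) are hypothetical, and it assumes that the gene-specific part of the reverse primer is the reverse complement of the last 20 bp immediately upstream of the stop codon. Only the attB1 and attB2 sequences and the 30 seconds/kb extension rate are taken from the text.

```python
from typing import Tuple

# attB sites as given in the Methods (lowercase marks the Gateway tails).
ATTB1 = "ggggacaagtttgtacaaaaaagcaggcttg"
ATTB2 = "ggggaccactttgtacaagaaagctgggtc"
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_gateway_primers(gene: str, anneal_len: int = 20) -> Tuple[str, str]:
    """Build attB1/attB2-tailed primers for a gene given 5'->3' from its start codon.

    Assumption: a terminal stop codon, if present, is dropped before the
    reverse primer is chosen, so the clone can be C-terminally tagged.
    """
    gene = gene.upper()
    if gene[-3:] in STOP_CODONS:
        gene = gene[:-3]
    if len(gene) < anneal_len:
        raise ValueError("gene is shorter than the annealing region")
    forward = ATTB1 + gene[:anneal_len]
    reverse = ATTB2 + reverse_complement(gene[-anneal_len:])
    return forward, reverse


def extension_seconds(amplicon_bp: int, seconds_per_kb: int = 30) -> int:
    """Extension time at the given rate, rounded up to a whole second."""
    return (amplicon_bp * seconds_per_kb + 999) // 1000


# Toy gene (hypothetical sequence, for illustration only).
toy_gene = "ATG" + "GCT" * 10 + "TAA"
fwd, rev = design_gateway_primers(toy_gene)
print(fwd)
print(rev)
print(extension_seconds(1500))
```

In the output, the lowercase prefix is the attB tail and the uppercase suffix is the gene-specific annealing region, which makes it easy to check by eye that the stop codon has not been carried into the reverse primer.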

2.4 Sanger and Illumina sequencing for quality control of Gateway® Entry vectors

In order to confirm replacement of the ccdB gene with the ORF-encoding sequence, and to confirm high-fidelity PCR amplification, a total of 688 Gateway® Entry vectors were randomly selected and Sanger sequenced (Eurofins, UK) using primer GOXF (tcgcgttaacgctagcatgga). Quality control reactions are given in Supplementary File S2. A second quality control experiment was conducted using two rounds of Illumina HiSeq 2500 sequencing of 3396 pooled Gateway® Entry vectors. Sequencing data are summarized in Supplementary Table S1 and S2. Raw sequencing data are available at the Sequence Read Archive (49) (accessions SRX1267196 and SRX1265386).


3. Results

Interrogation of the 10,933 predicted genes in the Z. tritici IPO323 reference genome (29) returned 4075 high confidence gene models that had unambiguous start and stop codons (data not shown). This analysis was conducted prior to RNA-seq and comparative genomic analyses (50), which have since improved Z. tritici gene models. High-confidence gene models were complemented with a total of 1345 priority ORFs that had been requested during consultation with members of the Z. tritici research community. These latter ORFs were included in the ORFeome construction project even if they failed our gene model quality control (primers for all PCR amplifications are included in Supplementary Table S1). An overview of the ORFeome construction project is shown in Figure 1A. PCR amplification utilized genomic DNA as template, which was chosen over cDNA in order to maintain alternative splice variants during downstream ORFeome expression in Z. tritici. Additionally, in order to enable C-terminal epitope tagging of encoded ORFs using a variety of Destination vectors (26), primers were designed to omit the native stop codon. If expression of the encoded ORFs with a 3’ stop codon is desired, we have developed numerous Destination vectors for this purpose (46).
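The screening criteria are not detailed beyond "unambiguous start and stop codons", so the following is only a sketch of what such a filter might look like. It assumes the input is the predicted coding sequence (introns removed) of each gene model, and the specific checks (ATG start, a single in-frame terminal stop, length divisible by three, no ambiguous bases) are assumptions chosen for illustration rather than the authors' documented procedure. The names `has_unambiguous_start_and_stop` and `filter_gene_models`, and the gene identifiers in the example, are hypothetical.

```python
from typing import Dict

STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_unambiguous_start_and_stop(cds: str) -> bool:
    """Check a predicted coding sequence for a clean start and a single terminal stop."""
    cds = cds.upper()
    # Need at least a start and a stop codon, in frame.
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    # Reject ambiguous bases such as N.
    if set(cds) - set("ACGT"):
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    starts_with_atg = codons[0] == "ATG"
    ends_with_stop = codons[-1] in STOP_CODONS
    has_internal_stop = any(codon in STOP_CODONS for codon in codons[1:-1])
    return starts_with_atg and ends_with_stop and not has_internal_stop


def filter_gene_models(models: Dict[str, str]) -> Dict[str, str]:
    """Keep only the gene models whose coding sequence passes the check."""
    return {gene_id: cds for gene_id, cds in models.items()
            if has_unambiguous_start_and_stop(cds)}


# Hypothetical gene models, for illustration only.
example_models = {
    "gene_ok": "ATGGCTGCTTAA",
    "gene_no_start": "GCTGCTGCTTAA",
    "gene_internal_stop": "ATGTAAGCTTAA",
    "gene_ambiguous": "ATGGCNGCTTAA",
}
print(sorted(filter_gene_models(example_models)))
```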

A total of 4896 PCR reactions were conducted in 51 × 96-well plates, which yielded 4174 amplicons of the predicted molecular weight as determined by gel electrophoresis (data not shown). A total of 3396 of these genes were successfully cloned into pDONR207 using the Gateway® BP reaction, as determined by bacterial growth on selection agar, and plasmids were extracted for each (Figure 1A and Supplementary File S2). Over 650 Gateway® Entry plasmids were randomly selected for sequence verification using Sanger sequencing, with 99.5% passing quality control (Supplementary File S1 and S2). A second quality control experiment was conducted, in which all 3396 ORFs were pooled and sequenced using an Illumina HiSeq 2500 (Supplementary File S1). When combined with Sanger sequencing experiments, a total of 3022 Z. tritici ORFs passed quality control checks (Figure 1A). Genes that are represented in the quality-controlled Z. tritici ORFeome (n = 3022) are plotted as a function of chromosomal locus (Figure 1B), and cover both core and accessory chromosomes (Table 1). All 3396 plasmids are available to end users (Supplementary File S2), with the caveat that plasmids which failed our quality control need to be sequence verified by the end user. A summary of ORFeome coverage for the 3022 quality-controlled ORFs amongst predicted effector-encoding genes, secondary metabolite biosynthetic genes, various other functional groups, chromosomal loci, and differentially expressed genes during infection (35) is provided in Table 1. These data indicate that the ORFeome will be applicable for functional genomic experiments to test diverse hypotheses regarding, for example, gene function, expression, or genomic location.
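The attrition through the construction pipeline can be made explicit by computing the yield of each stage from the counts reported above. The sketch below is a simple bookkeeping aid; the stage labels and the function name `stage_yields` are introduced here for illustration, and only the four counts and the plate layout come from the text.

```python
# Counts reported in the Results for each stage of ORFeome construction.
STAGES = [
    ("PCR reactions attempted", 4896),
    ("amplicons of predicted size", 4174),
    ("cloned into pDONR207", 3396),
    ("passed sequencing quality control", 3022),
]


def stage_yields(stages):
    """Return (stage, count, % of previous stage, % of first stage) tuples."""
    first = stages[0][1]
    rows = []
    for (_, previous), (name, count) in zip(stages, stages[1:]):
        rows.append((name, count, 100 * count / previous, 100 * count / first))
    return rows


# 51 plates of 96 wells account for all PCR reactions.
assert 51 * 96 == STAGES[0][1]

for name, count, step_pct, overall_pct in stage_yields(STAGES):
    print(f"{name}: {count} ({step_pct:.1f}% of previous stage, "
          f"{overall_pct:.1f}% of all PCR reactions)")
```

On these figures, each stage retains roughly 81–89% of its input, and about 62% of the attempted PCR reactions end as quality-controlled plasmids.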


4. Discussion

We have generated an ORFeome library of Z. tritici that covers high confidence gene models from the reference genome isolate IPO323 (29) for functional genomic analyses. The ORFeome contains 3022 sequence-verified clones. Genes represented in this library encode secreted proteins and putative effectors, or are putatively involved in a diverse range of processes, including biosynthesis of secondary metabolite toxins, drug detoxification, signal sensing and transduction, and regulation of gene expression, amongst many others (see Table 1, Supplementary File S1 and S2). The library will therefore facilitate functional genomic experiments for diverse aspects of Z. tritici biology.

Our strategy prioritised high-confidence gene models for ORFeome construction over a genome-wide cloning approach. While this has resulted in a partial ORFeome for Z. tritici IPO323, we believe that a focus on accurate gene models will avoid large-scale future updates of this resource. For example, the first C. elegans ORFeome (51) underwent various revisions and additions due to improved gene model predictions (20). More importantly, high confidence gene models will likely limit expression of incorrect ORFs during cost- and labour-intensive experiments by end users.

ORFeomes for several model organisms are amplified as intron-free sequences from cDNA libraries (20, 21). In contrast, we amplified ORF sequences from genomic DNA in order to maintain the opportunity to generate alternative splice variants (e.g. due to intron skipping) in subsequent Z. tritici over-expression or localisation experiments. While the extent of alternative splicing in fungi is not comprehensively determined, an estimated 6.1% of Z. tritici genes have splice variants (52). Alternative splicing is thought to predominantly occur for genes required for virulence, multicellularity, and dimorphic switching (52). Our ORFeome will therefore facilitate the study of splice variants that may have critical impacts on infection, or, alternatively, encode promising drug targets in this pathogen.

We have previously demonstrated that the application of a pilot (n = 32) collection of putative DNA-binding protein encoding genes in Entry vectors can enable medium-throughput gene functional analyses in Z. tritici (48). The ORFeome generated in this study will drastically increase the throughput of these capabilities for the research community. We predict that functional genomics in Z. tritici will enable systems-level understanding of a diverse range of processes, including but not limited to growth and development, sensing and signal transduction, virulence, host-pathogen interactions, toxin biosynthesis, drug resistance, and chemical-genetic interactions. Such advances may ultimately lead to the development of novel fungicides and novel resistant wheat cultivars.

Data Availability

All reagents generated during the current study will be made available. Plasmid requests should be sent to: [email protected] and [email protected]. Raw sequencing data are available at the Sequence Read Archive (accessions SRX1267196 and SRX1265386).

Supplementary Data

Supplementary File S1: List of genes included in the Z. tritici IPO323 ORFeome project. Listed for each ORF are: PCR primers used to amplify sequence, KOG/GO terms, and quality control information from Illumina/Sanger sequencing.

Supplementary File S2: In silico characterisation of the ORFeome library. Listed for each ORF are: presence at various genomic loci (dispensable chromosome, putative secondary metabolite cluster, subtelomeric locus), transcriptional deployment during infection, and various exemplar GO categories (e.g. DNA binding, GTPase, etc.).

Conflict of interests

The authors declare that they have no competing interests.

Funding

This work was funded by a BBSRC BBR grant (BB/I025956/1) to KH and collaborators and a BBSRC CASE studentship (BB/J500793/1), supported by Syngenta UK, to YS.


Table 1. Summary information for the Z. tritici ORFeome.

Predicted category | Subcategory | IPO323 genome (no. genes) | ORFeome¹ (no. genes) | Coverage in ORFeome (% of genome total) | Reference
Total no. genes | | 10933 | 3022 | 27.6 | (29)
Secreted proteins² | Signal peptide | 909 | 538 | 59.1 | (53, 54)
 | EffectorP | 1438 | 410 | 28.5 |
Chromosomal loci | Core chromosome | 10278 | 2900 | 28.2 | (29, 36)
 | Accessory chromosome | 654 | 122 | 18.6 |
 | Subtelomeric³ | 2501 | 644 | 25.7 |
 | Secondary metabolite cluster | 682 | 171 | 25.0 |
Differentially expressed during infection⁴ | 1 DPI | 626 | 194 | 30.9 | (35)
 | 4 DPI | 769 | 256 | 33.2 |
 | 9 DPI | 812 | 271 | 33.3 |
 | 14 DPI | 637 | 206 | 32.3 |
 | 21 DPI | 718 | 203 | 28.2 |
Exemplar GO terms | DNA binding (GO:0003677) | 491 | 76 | 15.4 | (53)
 | GTPase activity (GO:0003924) | 82 | 12 | 14.6 |
 | Protein kinase activity (GO:0004672) | 301 | 53 | 17.6 |
 | Transmembrane transport (GO:0055085) | 950 | 145 | 15.2 |

¹ ORFs cloned into pDONR207 that passed quality control (n = 3022, Supplementary File S1 and S2) were assigned various functional categories and are reported both as number of genes, and as a percentage of the predicted total for the IPO323 genome.

² Putative effector-encoding genes were predicted from amino acid coding sequences using default parameters in the EffectorP prediction algorithm (54). ORFs encoding predicted signal peptides and exemplar GO terms were retrieved using the Ensembl BioMart pipeline (53).

³ Subtelomeric genes were defined as those residing within 300 kb of the chromosome end. Genes predicted to reside in secondary metabolite biosynthetic gene clusters were identified from AntiSMASH and SMURF predictions (36).

⁴ Differentially expressed genes were defined from transcriptional profiling by Rudd and co-workers, with the number of genes significantly upregulated in planta at various days post infection (DPI) relative to in vitro growth on Czapek-Dox media reported (35). The total predicted number of genes in the IPO323 genome belonging to each functional category is also shown.

ORFeome clones for each functional category are given in Supplementary File S1 and S2.
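The coverage column of Table 1 can be recomputed from the two count columns. The sketch below does this with integer arithmetic; the row labels, the dictionary name `TABLE1`, and the helper names are introduced here for illustration. As transcribed, every reported value agrees with the quotient truncated to one decimal place, and a few rows (for example signal peptide, 538/909) would read 0.1 higher if rounded instead. That is an observation about the table's numbers, not a statement of how the authors calculated them.

```python
# (genome count, ORFeome count, reported coverage %) transcribed from Table 1.
TABLE1 = {
    "Total genes": (10933, 3022, 27.6),
    "Signal peptide": (909, 538, 59.1),
    "EffectorP": (1438, 410, 28.5),
    "Core chromosome": (10278, 2900, 28.2),
    "Accessory chromosome": (654, 122, 18.6),
    "Subtelomeric": (2501, 644, 25.7),
    "Secondary metabolite cluster": (682, 171, 25.0),
    "1 DPI": (626, 194, 30.9),
    "4 DPI": (769, 256, 33.2),
    "9 DPI": (812, 271, 33.3),
    "14 DPI": (637, 206, 32.3),
    "21 DPI": (718, 203, 28.2),
    "DNA binding": (491, 76, 15.4),
    "GTPase activity": (82, 12, 14.6),
    "Protein kinase activity": (301, 53, 17.6),
    "Transmembrane transport": (950, 145, 15.2),
}


def coverage_tenths(genome_count: int, orfeome_count: int) -> int:
    """Coverage in tenths of a percent, truncated (integer arithmetic)."""
    return orfeome_count * 1000 // genome_count


def mismatches(table):
    """Return the labels whose reported coverage differs from the truncated quotient."""
    bad = []
    for label, (genome, orfeome, reported) in table.items():
        if coverage_tenths(genome, orfeome) != round(reported * 10):
            bad.append(label)
    return bad


for label, (genome, orfeome, reported) in TABLE1.items():
    computed = coverage_tenths(genome, orfeome) / 10
    print(f"{label}: {orfeome}/{genome} = {computed:.1f}% (reported {reported}%)")
print("rows that disagree:", mismatches(TABLE1))
```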


References

1. Denning,D.W. and Bromley,M.J. (2015) How to bolster the antifungal pipeline. Science, 347, 1414–1416.
2. Fisher,M.C., Henk,D.A., Briggs,C.J., Brownstein,J.S., Madoff,L.C., McCraw,S.L. and Gurr,S.J. (2012) Emerging fungal threats to animal, plant and ecosystem health. Nature, 484, 186–194.
3. Cairns,T.C., Studholme,D.J., Talbot,N.J. and Haynes,K. (2015) New and Improved Techniques for the Study of Pathogenic Fungi. Trends Microbiol., 24, 35–50.
4. Meyer,V., Andersen,M.R., Brakhage,A.A., Braus,G.H., Caddick,M.X., Cairns,C.T., de Vries,R.P., Haarmann,T., Hansen,K., Hertz-Fowler,C., et al. (2016) Current challenges of research on filamentous fungi in relation to human welfare and a sustainable bio-economy: a white paper. Fungal Biol. Biotechnol., 3, 1–17.
5. Jeon,J., Park,S.-Y., Chi,M.-H., Choi,J., Park,J., Rho,H.-S., Kim,S., Goh,J., Yoo,S., Choi,J., et al. (2007) Genome-wide functional analysis of pathogenicity genes in the rice blast fungus. Nat. Genet., 39, 561–565.
6. Son,H., Seo,Y.S., Min,K., Park,A.R., Lee,J., Jin,J.M., Lin,Y., Cao,P., Hong,S.Y., Kim,E.K., et al. (2011) A phenome-based functional analysis of transcription factors in the cereal head blight fungus, Fusarium graminearum. PLoS Pathog., 7, e1002310.
7. Chauvel,M., Nesseir,A., Cabral,V., Znaidi,S., Goyard,S., Bachellier-Bassi,S., Firon,A., Legrand,M., Diogo,D., Naulleau,C., et al. (2012) A versatile overexpression strategy in the pathogenic yeast Candida albicans: identification of regulators of morphogenesis and fitness. PLoS One, 7, e45912.
8. Legrand,M., Bachellier-Bassi,S., Lee,K.K., Chaudhari,Y., Tournu,H., Arbogast,L., Boyer,H., Chauvel,M., Cabral,V., Maufrais,C., et al. (2018) Generating genomic platforms to study Candida albicans pathogenesis. Nucleic Acids Res., 46, 6935–6949.
9. Schwarzmuller,T., Ma,B., Hiller,E., Istel,F., Tscherner,M., Brunke,S., Ames,L., Firon,A., Green,B., Cabral,V., et al. (2014) Systematic phenotyping of a large-scale Candida glabrata deletion collection reveals novel antifungal tolerance genes. PLoS Pathog., 10, e1004211.
10. Giaever,G., Chu,A.M., Ni,L., Connelly,C., Riles,L., Véronneau,S., Dow,S., Lucau-Danila,A., Anderson,K., André,B., et al. (2002) Functional profiling of the Saccharomyces cerevisiae genome. Nature, 418, 387–391.
11. Winzeler,E.A., Shoemaker,D.D., Astromoff,A., Liang,H., Anderson,K., Andre,B., Bangham,R., Benito,R., Boeke,J.D., Bussey,H., et al. (1999) Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science, 285, 901–906.
12. Giaever,G. and Nislow,C. (2014) The yeast deletion collection: A decade of functional genomics. Genetics, 197, 451–465.
13. Homann,O.R., Dea,J., Noble,S.M. and Johnson,A.D. (2009) A phenotypic profile of the Candida albicans regulatory network. PLoS Genet., 5.
14. Liu,O.W., Chun,C.D., Chow,E.D., Chen,C., Madhani,H.D. and Noble,S.M. (2008) Systematic genetic analysis of virulence in the human fungal pathogen Cryptococcus neoformans. Cell, 135, 174–188.
15. Dunlap,J.C., Borkovich,K.A., Henn,M.R., Turner,G.E., Sachs,M.S., Glass,N.L., McCluskey,K., Plamann,M., Galagan,J.E., Birren,B.W., et al. (2007) Enabling a community to dissect an organism: overview of the Neurospora functional genomics project. Adv. Genet., 57, 49–96.


16. Roemer,T., Jiang,B., Davison,J., Ketela,T., Veillette,K., Breton,A., Tandia,F., Linteau,A., Sillaots,S., Marta,C., et al. (2003) Large-scale essential gene identification in Candida albicans and applications to antifungal drug discovery. Mol. Microbiol., 50, 167–181.
17. Noble,S.M., French,S., Kohn,L.A., Chen,V. and Johnson,A.D. (2010) Systematic screens of a Candida albicans homozygous deletion library decouple morphogenetic switching and pathogenicity. Nat. Genet., 42, 590–598.
18. Matsuyama,A., Arai,R., Yashiroda,Y., Shirai,A., Kamata,A., Sekido,S., Kobayashi,Y., Hashimoto,A., Hamamoto,M., Hiraoka,Y., et al. (2006) ORFeome cloning and global analysis of protein localization in the fission yeast Schizosaccharomyces pombe. Nat. Biotechnol., 24, 841–847.
19. Rajagopala,S.V., Yamamoto,N., Zweifel,A.E., Nakamichi,T., Huang,H.K., Mendez-Rios,J.D., Franca-Koh,J., Boorgula,M.P., Fujita,K., Suzuki,K., et al. (2010) The Escherichia coli K-12 ORFeome: a resource for comparative molecular microbiology. BMC Genomics, 11, 470.
20. Lamesch,P., Milstein,S., Hao,T., Rosenberg,J., Li,N., Sequerra,R., Bosak,S., Doucette-Stamm,L., Vandenhaute,J., Hill,D.E., et al. (2004) C. elegans ORFeome version 3.1: Increasing the coverage of ORFeome resources with improved gene predictions. Genome Res., 14, 2064–2069.
21. Wiemann,S., Pennacchio,C., Hu,Y., Hunter,P., Harbers,M., Amiet,A., Bethel,G., Busse,M., Carninci,P., Diekhans,M., et al. (2016) The ORFeome Collaboration: A genome-scale human ORF-clone resource. Nat. Methods, 13, 191–192.
22. Gong,W., Shen,Y.-P., Ma,L.-G., Pan,Y., Du,Y.-L., Wang,D.-H., Yang,J.-Y., Hu,L.-D., Liu,X.-F., Dong,C.-X., et al. (2004) Genome-wide ORFeome cloning and analysis of Arabidopsis transcription factor genes. Plant Physiol., 135, 773–782.
23. Li,Q.R., Carvunis,A.R., Yu,H., Han,J.D.J., Zhong,Q., Simonis,N., Tam,S., Hao,T., Klitgord,N.J., Dupuy,D., et al. (2008) Revisiting the Saccharomyces cerevisiae predicted ORFeome. Genome Res., 18, 1294–1303.
24. Gelperin,D.M., White,M.A., Wilkinson,M.L., Kon,Y., Kung,L.A., Wise,K.J., Lopez-Hoyo,N., Jiang,L., Piccirillo,S., Yu,H., et al. (2005) Biochemical and genetic analysis of the yeast proteome with a movable ORF collection. Genes Dev., 19, 2816–2826.
25. Walhout,A.J., Temple,G.F., Brasch,M.A., Hartley,J.L., Lorson,M.A., van den Heuvel,S. and Vidal,M. (2000) GATEWAY recombinational cloning: application to the cloning of large numbers of open reading frames or ORFeomes. Methods Enzymol., 328, 575–592.
26. Alberti,S., Gitler,A.D. and Lindquist,S. (2007) A suite of Gateway® cloning vectors for high-throughput genetic analysis in Saccharomyces cerevisiae. Yeast, 24, 913–919.
27. Talbot,N.J. (2015) Taming a wild beast: Developing molecular tools and new methods to understand the biology of Zymoseptoria tritici. Fungal Genet. Biol., 10.1016/j.fgb.2015.05.004.
28. Fones,H. and Gurr,S. (2015) The impact of Septoria tritici Blotch disease on wheat: An EU perspective. Fungal Genet. Biol., 79, 3–7.
29. Goodwin,S.B., M’Barek,S.B., Dhillon,B., Wittenberg,A.H., Crane,C.F., Hane,J.K., Foster,A.J., Van der Lee,T.A., Grimwood,J., Aerts,A., et al. (2011) Finished genome of the fungal wheat pathogen Mycosphaerella graminicola reveals dispensome structure, chromosome plasticity, and stealth pathogenesis. PLoS Genet., 7, e1002070.

30. Chowdhary,A., Kathuria,S., Xu,J. and Meis,J.F. (2013) Emergence of azole-resistant Aspergillus fumigatus strains due to agricultural azole use creates an increasing threat to human health. PLoS Pathog., 9, e1003633.

31. Zhan,J., Pettway,R.E. and McDonald,B.A. (2003) The global genetic structure of the wheat pathogen Mycosphaerella graminicola is characterized by high nuclear diversity, low mitochondrial diversity, regular recombination, and gene flow. Fungal Genet. Biol., 38, 286–297.

32. Möller,M., Habig,M., Freitag,M. and Stukenbrock,E.H. (2018) Extraordinary genome instability and widespread chromosome rearrangements during vegetative growth. Genetics, 10.1534/genetics.118.301050.

33. Steinberg,G. (2015) Cell biology of Zymoseptoria tritici: Pathogen cell organization and wheat infection. Fungal Genet. Biol., 79, 17–23.

34. King,R., Urban,M., Lauder,R.P., Hawkins,N., Evans,M., Plummer,A., Halsey,K., Lovegrove,A., Hammond-Kosack,K. and Rudd,J.J. (2017) A conserved fungal glycosyltransferase facilitates pathogenesis of plants by enabling hyphal growth on solid surfaces. PLoS Pathog., 13.

35. Rudd,J.J., Kanyuka,K., Hassani-Pak,K., Derbyshire,M., Andongabo,A., Devonshire,J., Lysenko,A., Saqi,M., Desai,N.M., Powers,S.J., et al. (2015) Transcriptome and metabolite profiling of the infection cycle of Zymoseptoria tritici on wheat reveals a biphasic interaction with plant immunity involving differential pathogen chromosomal contributions and a variation on the hemibiotrophic lifestyle. Plant Physiol., 167, 1158–1185.

36. Cairns,T. and Meyer,V. (2017) In silico prediction and characterization of secondary metabolite biosynthetic gene clusters in the wheat pathogen Zymoseptoria tritici. BMC Genomics, 18.

37. Lee,W., Rudd,J.J., Hammond-Kosack,K.E. and Kanyuka,K.K. (2013) Mycosphaerella graminicola LysM effector-mediated stealth pathogenesis subverts recognition through both CERK1 and CEBiP homologues in wheat. Mol. Plant. Microbe. Interact., 27, 236–243.

38. Kettles,G.J., Bayon,C., Canning,G., Rudd,J.J. and Kanyuka,K. (2017) Apoplastic recognition of multiple candidate effectors from the wheat pathogen Zymoseptoria tritici in the nonhost plant Nicotiana benthamiana. New Phytol., 213, 338–350.

39. Saintenac,C., Lee,W.-S., Cambon,F., Rudd,J.J., King,R.C., Marande,W., Powers,S.J., Bergès,H., Phillips,A.L., Uauy,C., et al. (2018) Wheat receptor-kinase-like protein Stb6 controls gene-for-gene resistance to fungal pathogen Zymoseptoria tritici. Nat. Genet., 10.1038/s41588-018-0051-x.

40. Zhong,Z., Marcel,T.C., Hartmann,F.E., Ma,X., Plissonneau,C., Zala,M., Ducasse,A., Confais,J., Compain,J., Lapalu,N., et al. (2017) A small secreted protein in Zymoseptoria tritici is responsible for avirulence on wheat cultivars carrying the Stb6 resistance gene. New Phytol., 10.1111/nph.14434.

41. Sidhu,Y.S., Cairns,T.C., Chaudhari,Y.K., Usher,J., Talbot,N.J., Studholme,D.J., Csukai,M. and Haynes,K. (2015) Exploitation of sulfonylurea resistance marker and non-homologous end joining mutants for functional analysis in Zymoseptoria tritici. Fungal Genet. Biol., 79, 102–109.

42. Marchegiani,E., Sidhu,Y., Haynes,K. and Lebrun,M.H. (2015) Conditional gene expression and promoter replacement in Zymoseptoria tritici using fungal nitrate reductase promoters. Fungal Genet. Biol., 79, 174–179.

43. Kilaru,S., Schuster,M., Ma,W. and Steinberg,G. (2017) Fluorescent markers of various organelles in the wheat pathogen Zymoseptoria tritici. Fungal Genet. Biol., 105, 16–27.

44. Kilaru,S., Schuster,M., Studholme,D., Soanes,D., Lin,C., Talbot,N.J. and Steinberg,G. (2015) A codon-optimized green fluorescent protein for live cell imaging in Zymoseptoria tritici. Fungal Genet. Biol., 79, 125–131.

45. Fones,H.N., Steinberg,G. and Gurr,S.J. (2015) Measurement of virulence in Zymoseptoria tritici through low inoculum-density assays. Fungal Genet. Biol., 79, 89–93.

46. Sidhu,Y.S., Chaudhari,Y.K., Usher,J., Cairns,T.C., Csukai,M. and Haynes,K. (2015) A suite of Gateway® compatible ternary expression vectors for functional analysis in Zymoseptoria tritici. Fungal Genet. Biol., 79, 180–185.

47. Mehrabi,R., Mirzadi Gohari,A., da Silva,G.F., Steinberg,G., Kema,G.H.J. and de Wit,P.J.G.M. (2015) Flexible gateway constructs for functional analyses of genes in plant pathogenic fungi. Fungal Genet. Biol., 79, 186–192.

48. Cairns,T.C., Sidhu,Y.S., Chaudhari,Y.K., Talbot,N.J., Studholme,D.J. and Haynes,K. (2015) Construction and high-throughput phenotypic screening of Zymoseptoria tritici over-expression strains. Fungal Genet. Biol., 79, 110–117.

49. Leinonen,R., Sugawara,H. and Shumway,M. (2011) The sequence read archive. Nucleic Acids Res., 39.

50. Grandaubert,J., Bhattacharyya,A. and Stukenbrock,E.H. (2015) RNA-seq-based gene annotation and comparative genomics of four fungal grass pathogens in the genus Zymoseptoria identify novel orphan genes and species-specific invasions of transposable elements. G3 Genes|Genomes|Genetics, 5, 1323–1333.

51. Reboul,J., Vaglio,P., Rual,J.F., Lamesch,P., Martinez,M., Armstrong,C.M., Li,S., Jacotot,L., Bertin,N., Janky,R., et al. (2003) C. elegans ORFeome version 1.1: Experimental verification of the genome annotation and resource for proteome-scale protein expression. Nat. Genet., 34, 35–41.

52. Grützmann,K., Szafranski,K., Pohl,M., Voigt,K., Petzold,A. and Schuster,S. (2014) Fungal alternative splicing is associated with multicellular complexity and virulence: A genome-wide multi-species study. DNA Res., 21, 27–39.

53. Aken,B.L., Ayling,S., Barrell,D., Clarke,L., Curwen,V., Fairley,S., Fernandez Banet,J., Billis,K., García Girón,C., Hourlier,T., et al. (2016) The Ensembl gene annotation system. Database, 2016, baw093.

54. Sperschneider,J., Dodds,P.N., Gardiner,D.M., Singh,K.B. and Taylor,J.M. (2018) Improved prediction of fungal effector proteins from secretomes with EffectorP 2.0. Mol. Plant Pathol., 19, 2094–2110.


Figure 1: Schematic workflow for ORFeome design, generation, and quality control (A). Note that while 374 Entry vectors did not pass quality control, these plasmids are listed in Supplementary Table 2 and can still be distributed. The 3022 ORFs passing quality control were plotted as a function of chromosomal locus (B). ORFs are shown as vertical light grey lines, with genes encoding predicted effectors and secondary metabolite biosynthetic genes highlighted in blue and red, respectively. Chromosome numbers are shown, with chromosomes 1–13 being core and 14–21 dispensable.
