bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 1, Oc-j genome
1 The genome of the butternut canker pathogen, Ophiognomonia clavigignenti-
2 juglandacearum shows an elevated number of genes associated with secondary metabolism
3 and protection from host resistance responses in comparison with other members of the
5 Guangxi Wua, Taruna A. Schuelkeb,c, Kirk Brodersa,d
6 a Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort 7 Collins, Colorado, United States
8 b Department of Molecular, Cellular, & Biomedical Sciences, University of New Hampshire, 9 Durham, New Hampshire, United States
10 c Ecology, Evolution, and Marine Biology Department at the University of California, Santa 11 Barbara.
12 dSmithsonian Tropical Research Institute, Apartado 0843-03092, Balboa, Ancon,
13 Republic of Panamá.
14
15 Keywords
16 Canker pathogen, Efflux pump, Cytochrome P450, Juglans, Fagales bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 2, Oc-j genome
17 Abstract
18 Ophiognomonia clavigignentijuglandacearum (Oc-j) is a plant pathogenic fungus that causes
19 canker and branch dieback diseases in the hardwood tree butternut, Juglans cinerea. Oc-j is a
20 member of the order of Diaporthales, which includes many other plant pathogenic species,
21 several of which also infect hardwood tree species. In this study, we sequenced the genome of
22 Oc-j and achieved a high-quality assembly and delineated the phylogeny of Oc-j within the
23 Diaporthales order using a genome-wide multi-gene approach. We also further examined
24 multiple gene families that might be involved in plant pathogenicity and degradation of complex
25 biomass, which are relevant to a pathogenic life-style in a tree host. We found that the Oc-j
26 genome contains a greater number of genes in these gene families compared to other species in
27 Diaporthales. These gene families include secreted CAZymes, kinases, cytochrome P450, efflux
28 pumps, and secondary metabolism gene clusters. The large numbers of these genes provide Oc-j
29 with an arsenal to cope with the specific ecological niche as a pathogen of the butternut tree.
30 Introduction
31 Ophiognomonia clavigignenti-juglandacearum (Oc-j) is an Ascomycetous fungus in the
32 family Gnomoniaceae and order Diaporthales. Like many of the other species within the
33 Diaporthales, Oc-j is a canker pathogen, and is known to infect the hardwood butternut (Juglans
34 cinerea) (Figure 1). The Diaporthales order is composed of 13 families [1], which include
35 several plant pathogens, saprophytes, and enodphtyes [2]. Numerous tree diseases are caused by
36 members of this order. These diseases include dogwood anthracnose (Discula destructiva),
37 butternut canker (Ophiognomonia clavigignenti-juglandacearum), apple canker (Valsa mali &
38 Valsa pyri), Eucalyptus canker (Chrysoporthe autroafricana, C. cubensis and C.
39 deuterocubensis), and perhaps the most infamous and well-known chestnut blight bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 3, Oc-j genome
40 (Cryphonectira parastica). Furthermore, several species of the Diaporthales also cause important
41 disease of crops including soybean canker (Diaporthe aspalathi), soybean seed decay (Diaporthe
42 langicolla) and sunflower stem canker (Diaporthe helianti). In addition to pathogens, there are
43 also a multitude of species that have a saprotrophic or endophytic life strategy [3]. However, the
44 saprophytic and endophytic species have not been studied as extensively as the pathogenic
45 species in the Diaporthales.
46 Like many of the tree pathogens in Diaporthales, Oc-j is an invasive species, introduced
47 into the U.S. from an unknown origin. This introduction caused extensive damage among the
48 butternut population in North America during the latter half of the 20th century. The first report
49 of butternut canker was in Wisconsin in 1967 [4], and in 1979, the fungus was described for the
50 first time as Sirococcus clavigignenti-juglandacearum (Sc-j) [5]. Recent phylogenetic studies
51 have determined the pathogen which causes butternut canker is a member of the genus
52 Ophiognomonia and was reclassified as Ophiognomonia clavigignenti-juglandacearum (Oc-j)
53 [6]. The sudden emergence of Oc-j, its rapid spread in native North American butternuts, the
54 scarcity of resistant trees, and low genetic variability in the fungus point to a recent introduction
55 of a new pathogenic fungus that is causing a pandemic throughout North America [7].
56 While Oc-j is a devastating pathogen of a hardwood tree species, many of the species in
57 the genus Ophiognomonia are endophytes or saprophytes of tree species in the order Fagales and
58 more specifically the Juglandaceae or walnut family [8,9]. This relationship may support the
59 hypothesis of a host jump, where the fungus may have previously been living as an endophyte or
60 saprophyte before coming into contact with butternut. In fact, a recent study from China reported
61 Sirococcus (Ophiognomonia) clavigignenti-juglandacearum as an endophyte of Acer truncatum,
62 which is a maple species native to northern China [10]. The identification of the endophyte bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 4, Oc-j genome
63 strain was made based on sequence similarity of the ITS region of the rDNA. A recent
64 morphological and phylogenetic analysis of this isolate determined that while it is not Oc-j, this
65 isolate is indeed more closely related to Oc-j than any other previously reported fungal species
66 [11]. The endophyte isolate also did not produce conidia in culture in comparison to Oc-j which
67 produces abundant conidia in culture. It is more likely that these organisms share are common
68 ancestor and represent distinct species.
69 While the impact of members in Diaporthales on both agricultural and forested
70 ecosystems is significant [2], there has been limited information regarding the genomic evolution
71 of this order of fungi. Several species have recently been sequenced and the genome data made
72 public. This includes pathogens of trees and crops as well as an endophytic and saprotrophic
73 species [12]. However, these were generally brief genome reports and a more thorough
74 comparative analysis of the species within the Diaporthales has yet to be completed.
75 Here we report the genome sequence of Oc-j and use it in comparative analyses with
76 those of tree and crop pathogens within Diaporthales. Comparative genomics of several members
77 of the Diaporthales order provides valuable insights into fundamental questions regarding fungal
78 lifestyles, evolution and phylogeny, and adaptation to diverse ecological niches, especially as
79 they relate to plant pathogenicity and degradation of complex biomass associated with tree
80 species.
81 Results and Discussion
82 Genome assembly and annotation of Oc-j
83 The draft genome assembly of Oc-j contains a total of 52.6 Mbp and 5,401 contigs, with an N50
84 of 151 Kbp. The completeness of the genome assembly was assessed by identifying universal bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 5, Oc-j genome
85 single-copy orthologs using BUSCO with lineage dataset for Sordariomycota [13]. Out of 3,725
86 total BUSCO groups searched, we found that 3,378 (90.7%) were complete and 264 (7.1%) were
87 fragmented in the Oc-j genome, while only 83 (2.2%) were missing. This result indicates that the
88 Oc-j genome is relatively complete.
89 Genome-wide multi-gene phylogeny of Diaporthales
90 A total of 340 genes were used to generate a multigene phylogeny of the order Diaporthales. All
91 sequenced species in Diaporthales (one genome per species for 12 species) were used. These
92 species include: Cryphonectria parasitica (jgi.doe.gov), Chrysoporthe cubensis [14],
93 Chrysoporthe deuterocubensis [14], Chrysoporthe austroafricana [15], Valsa mali [12], Valsa
94 pyri [12], Diaporthe ampelina [16], Diaporthe aspalathi [17], Diaporthe longicolla [18],
95 Diaporthe helianthi [19], and Melanconium sp. The outgroups used were Neurospora crassa
96 [20] and Magnaporthe grisea [21]. We show that the order Diaporthales is divided into two main
97 branches. One branch includes Oc-j, C. parasitica, and the Chrysoporthe species (Figure 2), in
98 which Oc-j is the outlier, indicating its early divergence from the rest of the branch. The other
99 branch includes the Valsa and Diaporthe species.
100 Gene content comparison across the Diaporthales
101 To examine the functional capacity of the Oc-j gene repertoire, the PFAM domains were
102 identified in the protein sequences (Supplementary Table 1). For comparison, we also
103 examined eight of the above mentioned 13 related species, where protein sequences were
104 successfully retrieved (see Methods for details).
105 Secreted CAZymes bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 6, Oc-j genome
106 CAZymes are a group of proteins that are involved in degrading, modifying, or creating
107 glycosidic bonds and contain predicted catalytic and carbohydrate-binding domains [22]. When
108 secreted, CAZymes can participate in degrading plant cell walls during colonization by fungal
109 pathogens; therefore, a combination of CAZyme and protein secretion prediction was used to
110 identify and classify enzymes likely involved in cell wall degradation in plant pathogens [22].
111 Overall, the Oc-j genome contains 576 putatively secreted CAZymes, more than any of the other
112 eight species included in this analysis, and 60 CAZymes more than D. helianthi, the species with
113 the second most (Figure 3, Suppl. Table 1). Given that all except bread mold N. crassa, which
114 has the lowest number of secreted CAZymes, are plant pathogens, this result indicates that Oc-j
115 contains an especially large gene repertoire for cell wall degradation. This pattern also holds true
116 when compared to other tree pathogens. In a recent comparative genomic analysis of tree
117 pathogens, it was reported that the black walnut pathogen, Geosmithia morbida, had 406
118 CAZymes, the lodgepole pine pathogen Grosmania clavigera had 530 CAZymes, and the plane
119 tree canker pathogen, Ceratocystis platani, had 360 CAZymes [23]. Since secreted CAZymes are
120 involved in degrading plant cell walls [22], the large number of secreted CAZymes in Oc-j might
121 help facilitate its life-style infecting and colonizing butternut trees.
122 Kinase
123 Among a total of 77 kinase related PFAM domains found in the nine above mentioned species,
124 61 domains are present in more genes in Oc-j than the average of the eight related species, while
125 only eight domains are present in fewer genes in Oc-j (Figure 3, Suppl. Table 1). Given that
126 kinases are involved in signaling networks, this result might indicate a more complicated
127 signaling network in Oc-j. One example of kinase gene family expansion is the domain family of bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 7, Oc-j genome
128 fructosamine kinase (PF03881). Oc-j contains 29 genes with this domain, while N. crassa has
129 none and the other related species have between 3-8 genes (Figure 3).
130 Cytochrome P450
131 Cytochrome P450s (CYPs) are a superfamily of monooxygenases that play a wide range of roles
132 in metabolism and adaptation to ecological niches in fungi [24]. Among the nine species
133 included in the PFAM analysis, N. crassa has only 31 CYPs, while the other eight plant-
134 pathogenic species have between 104 – 223 CYPs. This result likely reflects that the ecological
135 niche of N. crassa, bread, is more favorable for fungal growth than live plants with active
136 defense mechanisms, thus fewer CYPs are needed for N. crassa to cope with relatively simple
137 substrates. The Oc-j genome contains 223 CYPs (Figure 3, Suppl. Table 1), more than any of
138 the other seven plant-pathogenic fungal species. This may be an evolutionary response to the
139 number and diversity of secondary metabolites, such as juglone, produced by butternut as well as
140 other Juglans species, that would need to be metabolized by any fungal pathogen trying to
141 colonize the tree.
142 Efflux pumps
143 To overcome host defenses, infect and maintain colonization of the host, fungi employ efflux
144 pumps to counter intercellular toxin accumulation [25]. Here, we examine the presence of two
145 major efflux pump families, ATP-binding cassette (ABC) transporters and transporters of the
146 major facilitator superfamily (MFS) [25], in Diaporthales genomes. ABC transporters (PF00005,
147 pfam.xfam.org) are present in all of the nine species included in the PFAM analysis, ranging
148 from 25 genes in N. crassa to 51 genes in V. mali, while Oc-j has 47 genes (Figure 3, Suppl.
149 Table 1). The major facilitator superfamily are membrane proteins which are expressed bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 8, Oc-j genome
150 ubiquitously in all kingdoms of life for the import or export of target substrates. The MFS is a
151 clan that contains 24 PFAM domain families (pfam.xfam.org). The Oc-j genome contains 653
152 MFS efflux pump genes, the most among all nine species included in the PFAM analysis (Figure
153 3, Suppl. Table 1). C. parasitica has 552 genes, the second most. Interestingly, it was recently
154 shown that a secondary metabolite juglone extracted from Juglans spp. could be used as
155 potential efflux pump inhibitors in Staphylococcus aureus, inhibiting the export of antibiotics out
156 of the bacterial cells [26]. In line with this previous finding, our results suggest that Oc-j genome
157 contains a large arsenal of efflux pumps likely due to the need to cope with hostile secondary
158 metabolites such as the efflux pump inhibitor juglone.
159 Secondary metabolism gene clusters
160 Secondary metabolite toxins play an important role in fungal nutrition and virulence [27]. To
161 identify gene clusters involved in the biosynthesis of secondary metabolites in Oc-j and related
162 species, we scanned their genomes for such gene clusters. We found that the Oc-j genome
163 contains a remarkably large repertoire of secondary metabolism gene clusters, when compared to
164 the closely related C. parasitica and the Chrysoporthe species (Figure 3, Suppl. Table). While
165 C. parasitica has 44 gene clusters and the Chrysoporthe species have 18 – 48 gene clusters, the
166 Oc-j genome has a total of 72 gene clusters (Figure 3), reflecting a greater capacity to produce
167 various secondary metabolites. Among the 72 gene clusters in Oc-j, more than half are type 1
168 polyketide synthases (t1pks, 39 total, including hybrids), followed by non-ribosomal peptide
169 synthetases (nrps, 14 total, including hybrids) and terpene synthases (9) (Supplementary Table
170 1).
171 When compared to all species included in this study, we found that Oc-j has the third highest
172 number of clusters, only after D. longicolla and Melanconium sp.; however, when normalized by bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 9, Oc-j genome
173 total gene number in each species, Oc-j is shown to have 6.4 secondary metabolism gene clusters
174 per 1000 genes, the highest among all species included in this study. The D. helianthi genome
175 contains only six clusters, likely due to the poor quality of the genome assembly with an N50 of
176 ~6 kbps (Figure 3, Suppl. Table 1). Cluster numbers can vary greatly within the same genus.
177 Conclusion
178 We constructed a high-quality genome assembly for Oc-j, and delineated the phylogeny of
179 Diaporthales with a genome-wide multi-gene approach, revealing two major branches. We then
180 examined several gene families relevant to plant pathogenicity and complex biomass
181 degradation. We found that the Oc-j genome contains large numbers of genes in these gene
182 families. These genes might be essential for Oc-j to cope with its niche in the hardwood butternut
183 (Juglans cinerea). Future research will need to focus on understanding the prevalence of these
184 genes associated with complex biomass degradation among other members of the
185 Ophiognomonia genus which include endophytes, saprophytes and pathogens. It will be
186 interesting to know which of these different classes of genes are important in the evolution of
187 distinct fungal lifestyles and niche adaptation. Future research will also focus on comparing the
188 butternut canker pathogen to canker pathogens of other tree species which produce large
189 numbers of secondary metabolites known to be important in host defense. Our genome also
190 serves as an essential resource for the Oc-j research community.
191 Methods
192 DNA extraction and library preparation
193 For this study, the type culture of Oc-j (ATCC 36624) recovered from an infected butternut tree
194 in Wisconsin in 1978, was sequenced. For DNA extraction, isolates were grown on cellophane- bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 10, Oc-j genome
195 covered potato dextrose agar for 7-10 d, and mycelia was collected and lyophilized. DNA was
196 extracted from lyophilized mycelia using the CTAB method as outlined by the Joint Genome
197 Institute for whole genome sequencing (Kohler A, Francis M. Genomic DNA Extraction,
198 Accessed 12/12/2015 http://1000.fungalgenomes.org/home/wp-
199 content/uploads/2013/02/genomicDNAProtocol-AK0511.pdf). The total DNA quantity and
200 quality were measured using both Nanodrop and Qubit, and the sample was sent to the Hubbard
201 Center for Genome Studies at the University of New Hampshire, Durham, New Hampshire.
202 DNA libraries were prepared using the paired-end Illumina Truseq sample preparation kit, and
203 were sequenced on an Illumina HiSeq 2500.
204 Genome assembly and annotation
205 We corrected our raw reads using BLESS 0.16 [28] with the following options: -kmerlength 23 -
206 verify –notrim. Once our reads were corrected, we trimmed the reads at a phred score of 2 both
207 at the leading and trailing ends of the reads using Trimmomatic 0.32 [29]. We used a sliding
208 window of four bases that must average a phred score of 2 and the reads must maintain a
209 minimum length of 25 bases. Next, de novo assembly was built using SPAdes-3.1.1 [30] with
210 both paired and unpaired reads and the following settings: -t 8 -m 100 --only-assembler.
211 Genome sequences were deposited at ncbi.nlm.nih.gov under Bioproject number PRJNA434132.
212 Gene annotation was performed using the MAKER2 pipeline [31] in an iterative manner as is
213 described in [32], with protein evidence from related species of Melanconium sp., Cryphonectria
214 parasitica, Diaporthe ampelina, (from jgi.doe.gov) and Diaporthe helianthi [19], for a total of
215 three iterations. PFAM domains were identified in all the genomes using hmmscan with trusted
216 cutoff [33]. Only nine species were included in the PFAM analysis. For the three Chrysoporthe
217 species, D. aspalathi, and D. longicolla, protein sequences were not readily available for bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 11, Oc-j genome
218 downloading online and e-mail requests were unsuccessful. Secondary metabolism gene clusters
219 were identified using antiSMASH 4.0 [34].
220
221 Species phylogeny
222 Core eukaryotic proteins identified by CEGMA [35] were first aligned by MAFFT [36] and then
223 concatenated. Only proteins that were present in all genomes and all sequences were longer than
224 90% of the Saccharomyces cerevisiae ortholog were used. Phylogeny was then inferred using
225 maximum likelihood by RAxML [37] with 100 bootstraps and then midpoint rooted.
226 Secreted CAZyme prediction
227 SignalP [38] was used to predict the presence of secretory signal peptides. CAZymes were
228 predicted using CAZymes Analysis Toolkit [39] based on the most recent CAZY database
229 (www.cazy.org). Proteins that both contain a signal peptide and are predicted to be a CAZyme
230 are annotated as a secreted CAZyme.
231 Conflicts of interest
232 The authors have no conflict of interest.
233
234 Acknowledgements. Special thanks to Dr. Matt MacManus. This started as an in-class project
235 for PhD student TS and provided the basis for this final manuscript. Dr. Broders was supported
236 by the Simon’s Foundation Grant number 429440 to the Smithsonian Tropical Research
237 Institute.
238 bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 12, Oc-j genome
239 References
240 [1] H. Voglmayr, L.A. Castlebury, W.M. Jaklitsch, Juglanconis gen. nov. on Juglandaceae, and 241 the new family Juglanconidaceae (Diaporthales), Persoonia Mol. Phylogeny Evol. Fungi. 38 242 (2017) 136–155. doi:10.3767/003158517X694768. 243 [2] A.Y. Rossman, D.F. Farr, L.A. Castlebury, A review of the phylogeny and biology of the 244 Diaporthales, Mycoscience. 48 (2007) 135–144. doi:10.1007/S10267-007-0347-7. 245 [3] L.A. Castlebury, A.Y. Rossman, W.J. Jaklitsch, L.N. Vasilyeva, A preliminary overview of 246 the Diaporthales based on large subunit nuclear ribosomal DNA sequences, Mycologia. 94 247 (2002) 1017–1031. 248 [4] D.W. Renlund, Forest pest conditions in Wisconsin, Annual Report, 53 p., Wisconsin 249 Department of Natural Resources, Madison, WI, 1971. 250 [5] V.M.G. Nair, C.J. Kostichka, J.E. Kuntz, Sirococcus clavigignenti-juglandacearum: An 251 Undescribed Species Causing Canker on Butternut, Mycologia. 71 (1979) 641–646. 252 doi:10.2307/3759076. 253 [6] K.D. Broders, G.J. Boland, Reclassification of the butternut canker fungus, Sirococcus 254 clavigignenti-juglandacearum, into the genus Ophiognomonia, Fungal Biol. 115 (2011) 70– 255 79. doi:10.1016/j.funbio.2010.10.007. 256 [7] K.D. Broders, A. Boraks, A.M. Sanchez, G.J. Boland, Population structure of the butternut 257 canker fungus, Ophiognomonia clavigignenti-juglandacearum, in North American forests, 258 Ecol. Evol. 2 (2012) 2114–2127. doi:10.1002/ece3.332. 259 [8] M.V. Sogonov, L.A. Castlebury, R.A. Y., L.C. Mejia, J.F. White, Leaf-inhabiting genera of 260 the Gnomoniaceae, Diapothathales, Stud. Mycol. (2008) 62: 1-79. 261 [9] D.M. Walker, L.A. Castlebury, A.Y. Rossman, L.C. Mejia, J.F. White, Phylogeny and 262 taxonomy of Ophiognomonia (Gnomoniaceae, Diaporthales), including twenty-five new 263 species in this highly diverse genus, Fungal Diversity. (2012) 57: 85-147. 264 [10] X. Sun, L.-D. Guo, K.D. Hyde, Community composition of endophytic fungi in Acer 265 truncatum and their role in decomposition, Fungal Divers. 47 (2011) 85–95. 266 doi:10.1007/s13225-010-0086-5. bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 13, Oc-j genome
267 [11] K. Broders, A. Boraks, L. Barbison, J. Brown, G.J. Boland, Recent insights into the 268 pandemic disease butternut canker caused by the invasive pathogen Ophiognomonia 269 clavigignenti-juglandacearum, For. Pathol. 45 (2015) 1–8. doi:10.1111/efp.12161. 270 [12] Z. Yin, H. Liu, Z. Li, X. Ke, D. Dou, X. Gao, N. Song, Q. Dai, Y. Wu, J.-R. Xu, Z. Kang, L. 271 Huang, Genome sequence of Valsa canker pathogens uncovers a potential adaptation of 272 colonization of woody bark, New Phytol. 208 (2015) 1202–1216. doi:10.1111/nph.13544. 273 [13] F.A. Simão, R.M. Waterhouse, P. Ioannidis, E.V. Kriventseva, E.M. Zdobnov, BUSCO: 274 assessing genome assembly and annotation completeness with single-copy orthologs, 275 Bioinformatics. 31 (2015) 3210–3212. doi:10.1093/bioinformatics/btv351. 276 [14] B.D. Wingfield, I. Barnes, Z. Wilhelm de Beer, L. De Vos, T.A. Duong, A.M. Kanzi, K. 277 Naidoo, H.D.T. Nguyen, Q.C. Santana, M. Sayari, K.A. Seifert, E.T. Steenkamp, C. Trollip, 278 N.A. van der Merwe, M.A. van der Nest, P. Markus Wilken, M.J. Wingfield, IMA Genome- 279 F 5: Draft genome sequences of Ceratocystis eucalypticola, Chrysoporthe cubensis, C. 280 deuterocubensis, Davidsoniella virescens, Fusarium temperatum,Graphilbum fragrans, 281 Penicillium nordicum, and Thielaviopsis musarum, IMA Fungus. 6 (2015) 493–506. 282 doi:10.5598/imafungus.2015.06.02.13. 283 [15] B.D. Wingfield, P.K. Ades, F.A. Al-Naemi, L.A. Beirn, W. Bihon, J.A. Crouch, Z.W. de 284 Beer, L. De Vos, T.A. Duong, C.J. Fields, G. Fourie, A.M. Kanzi, M. Malapi-Wight, S.J. 285 Pethybridge, O. Radwan, G. Rendon, B. Slippers, Q.C. Santana, E.T. Steenkamp, P.W.J. 286 Taylor, N. Vaghefi, N.A. van der Merwe, D. Veltri, M.J. Wingfield, IMA Genome-F 4: Draft 287 genome sequences of Chrysoporthe austroafricana, Diplodia scrobiculata, Fusarium 288 nygamai, Leptographium lundbergii, Limonomyces culmigenus, Stagonosporopsis tanaceti, 289 and Thielaviopsis punctulata, IMA Fungus. 6 (2015) 233–248. 290 doi:10.5598/imafungus.2015.06.01.15. 291 [16] J. Savitha, S.D. Bhargavi, V.K. Praveen, Complete Genome Sequence of the Endophytic 292 Fungus Diaporthe (Phomopsis) ampelina, Genome Announc. 4 (2016) e00477-16. 293 doi:10.1128/genomeA.00477-16. 294 [17] S. Li, Q. Song, A.M. Martins, P. Cregan, Draft genome sequence of Diaporthe aspalathi 295 isolate MS-SSC91, a fungus causing stem canker in soybean, Genomics Data. 7 (2016) 262– 296 263. doi:10.1016/j.gdata.2016.02.002. bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 14, Oc-j genome
297 [18] S. Li, O. Darwish, N. Alkharouf, B. Matthews, P. Ji, L.L. Domier, N. Zhang, B.H. Bluhm, 298 Draft genome sequence of Phomopsis longicolla isolate MSPL 10-6, Genomics Data. 3 299 (2015) 55–56. doi:10.1016/j.gdata.2014.11.007. 300 [19] R. Baroncelli, F. Scala, M. Vergara, M.R. Thon, M. Ruocco, Draft whole-genome sequence 301 of the Diaporthe helianthi 7/96 strain, causal agent of sunflower stem canker, Genomics 302 Data. 10 (2016) 151–152. doi:10.1016/j.gdata.2016.11.005. 303 [20] J.E. Galagan, S.E. Calvo, K.A. Borkovich, E.U. Selker, N.D. Read, D. Jaffe, W. FitzHugh, 304 L.-J. Ma, S. Smirnov, S. Purcell, B. Rehman, T. Elkins, R. Engels, S. Wang, C.B. Nielsen, J. 305 Butler, M. Endrizzi, D. Qui, P. Ianakiev, D. Bell-Pedersen, M.A. Nelson, M. Werner- 306 Washburne, C.P. Selitrennikoff, J.A. Kinsey, E.L. Braun, A. Zelter, U. Schulte, G.O. Kothe, 307 G. Jedd, W. Mewes, C. Staben, E. Marcotte, D. Greenberg, A. Roy, K. Foley, J. Naylor, N. 308 Stange-Thomann, R. Barrett, S. Gnerre, M. Kamal, M. Kamvysselis, E. Mauceli, C. Bielke, 309 S. Rudd, D. Frishman, S. Krystofova, C. Rasmussen, R.L. Metzenberg, D.D. Perkins, S. 310 Kroken, C. Cogoni, G. Macino, D. Catcheside, W. Li, R.J. Pratt, S.A. Osmani, C.P.C. 311 DeSouza, L. Glass, M.J. Orbach, J.A. Berglund, R. Voelker, O. Yarden, M. Plamann, S. 312 Seiler, J. Dunlap, A. Radford, R. Aramayo, D.O. Natvig, L.A. Alex, G. Mannhaupt, D.J. 313 Ebbole, M. Freitag, I. Paulsen, M.S. Sachs, E.S. Lander, C. Nusbaum, B. Birren, The 314 genome sequence of the filamentous fungus Neurospora crassa, Nature. 422 (2003) 859–868. 315 doi:10.1038/nature01554. 316 [21] R.A. Dean, N.J. Talbot, D.J. Ebbole, M.L. Farman, T.K. Mitchell, M.J. Orbach, M. Thon, R. 317 Kulkarni, J.-R. Xu, H. Pan, N.D. Read, Y.-H. Lee, I. Carbone, D. Brown, Y.Y. Oh, N. 318 Donofrio, J.S. Jeong, D.M. Soanes, S. Djonovic, E. Kolomiets, C. Rehmeyer, W. Li, M. 319 Harding, S. Kim, M.-H. Lebrun, H. Bohnert, S. Coughlan, J. Butler, S. Calvo, L.-J. Ma, R. 320 Nicol, S. Purcell, C. Nusbaum, J.E. Galagan, B.W. Birren, The genome sequence of the rice 321 blast fungus Magnaporthe grisea, Nature. 434 (2005) 980–986. doi:10.1038/nature03449. 322 [22] A. Morales-Cruz, K.C.H. Amrine, B. Blanco-Ulate, D.P. Lawrence, R. Travadon, P.E. 323 Rolshausen, K. Baumgartner, D. Cantu, Distinctive expansion of gene families associated 324 with plant cell wall degradation, secondary metabolism, and nutrient uptake in the genomes 325 of grapevine trunk pathogens, BMC Genomics. 16 (2015) 469. doi:10.1186/s12864-015- 326 1624-z. bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 15, Oc-j genome
327 [23] T.A. Schuelke, G. Wu, A. Westbrook, K. Woeste, D.C. Plachetzki, K. Broders, M.D. 328 MacManes, Comparative Genomics of Pathogenic and Nonpathogenic Beetle-Vectored 329 Fungi in the Genus Geosmithia, Genome Biol. Evol. 9 (2017) 3312–3327. 330 doi:10.1093/gbe/evx242. 331 [24] W. Chen, M.-K. Lee, C. Jefcoate, S.-C. Kim, F. Chen, J.-H. Yu, Fungal Cytochrome P450 332 Monooxygenases: Their Distribution, Structure, Functions, Family Expansion, and 333 Evolutionary Origin, Genome Biol. Evol. 6 (2014) 1620–1634. doi:10.1093/gbe/evu132. 334 [25] J.J. Coleman, E. Mylonakis, Efflux in Fungi: La Pièce de Résistance, PLOS Pathog. 5 335 (2009) e1000486. doi:10.1371/journal.ppat.1000486. 336 [26] T. Zmantar, H. Miladi, B. Kouidhi, Y. Chaabouni, R. Ben Slama, A. Bakhrouf, K. 337 Mahdouani, K. Chaieb, Use of juglone as antibacterial and potential efflux pump inhibitors 338 in Staphylococcus aureus isolated from the oral cavity, Microb. Pathog. 101 (2016) 44–49. 339 doi:10.1016/j.micpath.2016.10.022. 340 [27] B.J. Howlett, Secondary metabolite toxins and nutrition of plant pathogenic fungi, Curr. 341 Opin. Plant Biol. 9 (2006) 371–375. doi:10.1016/j.pbi.2006.05.004. 342 [28] Y. Heo, X.-L. Wu, D. Chen, J. Ma, W.-M. Hwu, BLESS: Bloom filter-based error 343 correction solution for high-throughput sequencing reads, Bioinformatics. 30 (2014) 1354– 344 1362. doi:10.1093/bioinformatics/btu030. 345 [29] A.M. Bolger, M. Lohse, B. Usadel, Trimmomatic: a flexible trimmer for Illumina sequence 346 data, Bioinformatics. 30 (2014) 2114–2120. doi:10.1093/bioinformatics/btu170. 347 [30] A. Bankevich, S. Nurk, D. Antipov, A.A. Gurevich, M. Dvorkin, A.S. Kulikov, V.M. Lesin, 348 S.I. Nikolenko, S. Pham, A.D. Prjibelski, A.V. Pyshkin, A.V. Sirotkin, N. Vyahhi, G. Tesler, 349 M.A. Alekseyev, P.A. Pevzner, SPAdes: A New Genome Assembly Algorithm and Its 350 Applications to Single-Cell Sequencing, J. Comput. Biol. 19 (2012) 455–477. 351 doi:10.1089/cmb.2012.0021. 352 [31] C. Holt, M. Yandell, MAKER2: an annotation pipeline and genome-database management 353 tool for second-generation genome projects, BMC Bioinformatics. 12 (2011) 491. 354 doi:10.1186/1471-2105-12-491. 355 [32] J. Yu, G. Wu, W.M. Jurick, V.L. Gaskins, Y. Yin, G. Yin, J.W. Bennett, D.R. Shelton, 356 Genome Sequence of Penicillium solitum RS1, Which Causes Postharvest Apple Decay, 357 Genome Announc. 4 (2016) e00363-16. doi:10.1128/genomeA.00363-16. bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 16, Oc-j genome
358 [33] A. Bateman, L. Coin, R. Durbin, R.D. Finn, V. Hollich, S. Griffiths-Jones, A. Khanna, M. 359 Marshall, S. Moxon, E.L.L. Sonnhammer, D.J. Studholme, C. Yeats, S.R. Eddy, The Pfam 360 protein families database, Nucleic Acids Res. 32 (2004) D138–D141. 361 doi:10.1093/nar/gkh121. 362 [34] T. Weber, K. Blin, S. Duddela, D. Krug, H.U. Kim, R. Bruccoleri, S.Y. Lee, M.A. 363 Fischbach, R. Müller, W. Wohlleben, R. Breitling, E. Takano, M.H. Medema, antiSMASH 364 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters, Nucleic 365 Acids Res. 43 (2015) W237–W243. doi:10.1093/nar/gkv437. 366 [35] G. Parra, K. Bradnam, I. Korf, CEGMA: a pipeline to accurately annotate core genes in 367 eukaryotic genomes, Bioinformatics. 23 (2007) 1061–1067. 368 doi:10.1093/bioinformatics/btm071. 369 [36] K. Katoh, K. Misawa, K. Kuma, T. Miyata, MAFFT: a novel method for rapid multiple 370 sequence alignment based on fast Fourier transform, Nucleic Acids Res. 30 (2002) 3059– 371 3066. doi:10.1093/nar/gkf436. 372 [37] A. Stamatakis, RAxML version 8: a tool for phylogenetic analysis and post-analysis of large 373 phylogenies, Bioinformatics. 30 (2014) 1312–1313. doi:10.1093/bioinformatics/btu033. 374 [38] T.N. Petersen, S. Brunak, G. von Heijne, H. Nielsen, SignalP 4.0: discriminating signal 375 peptides from transmembrane regions, Nat. Methods. 8 (2011) 785. doi:10.1038/nmeth.1701. 376 [39] B.H. Park, T.V. Karpinets, M.H. Syed, M.R. Leuze, E.C. Uberbacher, CAZymes Analysis 377 Toolkit (CAT): Web service for searching and analyzing carbohydrate-active enzymes in a 378 newly sequenced organism using CAZy database, Glycobiology. 20 (2010) 1574–1584. 379 doi:10.1093/glycob/cwq106. 380
381
382
383
384
385 bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 17, Oc-j genome
386 Supplementary Table 1. PFAM and antiSMASH annotation of Oc-j and related species
387
388 Figure Legend
389 Figure 1. Symptoms of infection caused by Oc-j on the A) leaves, B) branches and C) trunk as
390 well as as D) signs of fruting bodies on an infected branch.
391
392 Figure 2. Phylogeny of Diaporthales and related species inferred using maximum likelihood by
393 RAxML [37] with 1000 bootstraps and then midpoint rooted. The first and second numbers in
394 parentheses represent the genome sizes in Mb and the number of predict protein models,
395 respectively
396
397 Figure 3. Abundance of genes in specific classes including ABC transporters, Cytochrome
398 P450, Kinases, CAZymes, and MFS Efflux pumps present in nine species within the order
399 Diaporthales
400
401
402
403
404
405
406
407
408 bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 18, Oc-j genome
409
410 bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.