bioRxiv preprint doi: https://doi.org/10.1101/820977; this version posted October 28, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Wu et al., 1, Oc-j genome

The genome of the butternut canker pathogen, Ophiognomonia clavigignenti-juglandacearum, shows an elevated number of genes associated with secondary metabolism and protection from host resistance responses in comparison with other members of the Diaporthales

Guangxi Wu (a), Taruna A. Schuelke (b, c), Kirk Broders (a, d)

(a) Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, Colorado

(b) Department of Molecular, Cellular, & Biomedical Sciences, University of New Hampshire, Durham, New Hampshire, United States

(c) Ecology, Evolution, and Marine Biology Department at the University of California, Santa Barbara

(d) Smithsonian Tropical Research Institute, Apartado 0843-03092, Balboa, Ancon, Republic of Panamá

Keywords

Canker pathogen, Efflux pump, Cytochrome P450

Abstract

Ophiognomonia clavigignenti-juglandacearum (Oc-j) is a pathogenic fungus that causes canker and branch dieback diseases in the hardwood butternut, Juglans cinerea. Oc-j is a member of the order Diaporthales, which includes many other plant pathogenic species, several of which also infect hardwood tree species. In this study, we sequenced the genome of Oc-j, achieved a high-quality assembly, and delineated the phylogeny of Oc-j within the Diaporthales order using a genome-wide multi-gene approach. We also examined multiple gene families that might be involved in plant pathogenicity and degradation of complex biomass, which are relevant to a pathogenic life-style in a tree host. We found that the Oc-j genome contains a greater number of genes in these gene families compared to other species in Diaporthales. These gene families include secreted CAZymes, kinases, cytochrome P450s, efflux pumps, and secondary metabolism gene clusters. The large numbers of these genes provide Oc-j with an arsenal to cope with its specific ecological niche as a pathogen of the butternut tree.

Introduction

Ophiognomonia clavigignenti-juglandacearum (Oc-j) is an Ascomycetous fungus in the family Gnomoniaceae and order Diaporthales. Like many of the other species within the Diaporthales, Oc-j is a canker pathogen, and is known to infect the hardwood butternut (Juglans cinerea) (Figure 1). The Diaporthales order is composed of 13 families [1], which include several plant pathogens, saprophytes, and endophytes [2]. Numerous tree diseases are caused by members of this order. These diseases include dogwood anthracnose (Discula destructiva), butternut canker (Ophiognomonia clavigignenti-juglandacearum), apple canker (Valsa mali & Valsa pyri), Eucalyptus canker (Chrysoporthe austroafricana, C. cubensis and C. deuterocubensis), and perhaps the most infamous and well-known, chestnut blight (Cryphonectria parasitica). Furthermore, several species of the Diaporthales also cause important diseases of crops, including soybean canker (Diaporthe aspalathi), soybean seed decay (Diaporthe longicolla) and sunflower stem canker (Diaporthe helianthi). In addition to pathogens, there are also a multitude of species that have a saprotrophic or endophytic life strategy [3]. However, the saprophytic and endophytic species have not been studied as extensively as the pathogenic species in the Diaporthales.

Like many of the tree pathogens in Diaporthales, Oc-j is an invasive species, introduced into the U.S. from an unknown origin. This introduction caused extensive damage among the butternut population in North America during the latter half of the 20th century. The first report of butternut canker was in Wisconsin in 1967 [4], and in 1979, the fungus was described for the first time as Sirococcus clavigignenti-juglandacearum (Sc-j) [5]. Recent phylogenetic studies have determined that the pathogen which causes butternut canker is a member of the genus Ophiognomonia, and it was reclassified as Ophiognomonia clavigignenti-juglandacearum (Oc-j) [6]. The sudden emergence of Oc-j, its rapid spread in native North American butternuts, the scarcity of resistant trees, and low genetic variability in the fungus point to a recent introduction of a new pathogenic fungus that is causing a pandemic throughout North America [7].

While Oc-j is a devastating pathogen of a hardwood tree species, many of the species in the genus Ophiognomonia are endophytes or saprophytes of tree species in the order Fagales and more specifically the Juglandaceae or walnut family [8,9]. This relationship may support the hypothesis of a host jump, where the fungus may have previously been living as an endophyte or saprophyte before coming into contact with butternut. In fact, a recent study from China reported Sirococcus (Ophiognomonia) clavigignenti-juglandacearum as an endophyte of Acer truncatum, which is a maple species native to northern China [10]. The identification of the endophyte strain was made based on sequence similarity of the ITS region of the rDNA. A recent morphological and phylogenetic analysis of this isolate determined that while it is not Oc-j, this isolate is indeed more closely related to Oc-j than any other previously reported fungal species [11]. The endophyte isolate also did not produce conidia in culture, whereas Oc-j produces abundant conidia in culture. It is more likely that these organisms share a common ancestor and represent distinct species.

While the impact of members of Diaporthales on both agricultural and forested ecosystems is significant [2], there has been limited information regarding the genomic evolution of this order of fungi. Several species have recently been sequenced and the genome data made public. These include pathogens of trees and crops as well as an endophytic and saprotrophic species [12]. However, these were generally brief genome reports, and a more thorough comparative analysis of the species within the Diaporthales has yet to be completed.

Here we report the genome sequence of Oc-j and use it in comparative analyses with those of tree and crop pathogens within Diaporthales. Comparative genomics of several members of the Diaporthales order provides valuable insights into fundamental questions regarding fungal lifestyles, evolution and phylogeny, and adaptation to diverse ecological niches, especially as they relate to plant pathogenicity and degradation of complex biomass associated with tree species.

Results and Discussion

Genome assembly and annotation of Oc-j

The draft genome assembly of Oc-j contains a total of 52.6 Mbp and 5,401 contigs, with an N50 of 151 Kbp. The completeness of the genome assembly was assessed by identifying universal single-copy orthologs using BUSCO with the lineage dataset for Sordariomycota [13]. Out of 3,725 total BUSCO groups searched, we found that 3,378 (90.7%) were complete and 264 (7.1%) were fragmented in the Oc-j genome, while only 83 (2.2%) were missing. This result indicates that the Oc-j genome is relatively complete.
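The two assembly summaries quoted above (N50 and BUSCO percentages) can be illustrated with a minimal sketch. The function names and the toy contig lengths are hypothetical and are not taken from the Oc-j assembly; only the BUSCO counts (3,378 / 264 / 83) come from the text.

```python
def n50(contig_lengths):
    """Return the N50: the contig length L such that contigs of
    length >= L together cover at least half of the assembly."""
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs given")


def busco_percentages(complete, fragmented, missing):
    """Convert BUSCO group counts into percentages of all groups searched."""
    total = complete + fragmented + missing
    counts = {"complete": complete, "fragmented": fragmented, "missing": missing}
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Toy contig lengths (hypothetical, for illustration only).
print(n50([100, 80, 50, 30, 20]))

# BUSCO counts as reported in the text.
print(busco_percentages(3378, 264, 83))
```

Applied to the reported counts, the second call reproduces the percentages given in the text.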

Genome-wide multi-gene phylogeny of Diaporthales

A total of 340 genes were used to generate a multigene phylogeny of the order Diaporthales. All sequenced species in Diaporthales (one genome per species for 12 species) were used. In addition to Oc-j, these species include: Cryphonectria parasitica (jgi.doe.gov), Chrysoporthe cubensis [14], Chrysoporthe deuterocubensis [14], Chrysoporthe austroafricana [15], Valsa mali [12], Valsa pyri [12], Diaporthe ampelina [16], Diaporthe aspalathi [17], Diaporthe longicolla [18], Diaporthe helianthi [19], and Melanconium sp. The outgroups used were Neurospora crassa [20] and Magnaporthe grisea [21]. We show that the order Diaporthales is divided into two main branches. One branch includes Oc-j, C. parasitica, and the Chrysoporthe species (Figure 2), in which Oc-j is the outlier, indicating its early divergence from the rest of the branch. The other branch includes the Valsa and Diaporthe species.

Gene content comparison across the Diaporthales

To examine the functional capacity of the Oc-j gene repertoire, the PFAM domains were identified in the protein sequences (Supplementary Table 1). For comparison, we also examined eight of the above-mentioned 13 related species, for which protein sequences were successfully retrieved (see Methods for details).

Secreted CAZymes

CAZymes are a group of proteins that are involved in degrading, modifying, or creating glycosidic bonds and contain predicted catalytic and carbohydrate-binding domains [22]. When secreted, CAZymes can participate in degrading plant cell walls during colonization by fungal pathogens; therefore, a combination of CAZyme and protein secretion prediction was used to identify and classify enzymes likely involved in cell wall degradation in plant pathogens [22]. Overall, the Oc-j genome contains 576 putatively secreted CAZymes, more than any of the other eight species included in this analysis, and 60 CAZymes more than D. helianthi, the species with the second most (Figure 3, Suppl. Table 1). Given that all except the bread mold N. crassa, which has the lowest number of secreted CAZymes, are plant pathogens, this result indicates that Oc-j contains an especially large gene repertoire for cell wall degradation. This pattern also holds true when compared to other tree pathogens. In a recent comparative genomic analysis of tree pathogens, it was reported that the black walnut pathogen, Geosmithia morbida, had 406 CAZymes, the lodgepole pine pathogen Grosmannia clavigera had 530 CAZymes, and the plane tree canker pathogen, Ceratocystis platani, had 360 CAZymes [23]. Since secreted CAZymes are involved in degrading plant cell walls [22], the large number of secreted CAZymes in Oc-j might help facilitate its life-style of infecting and colonizing butternut trees.

Kinase

Among a total of 77 kinase-related PFAM domains found in the nine above-mentioned species, 61 domains are present in more genes in Oc-j than the average of the eight related species, while only eight domains are present in fewer genes in Oc-j (Figure 3, Suppl. Table 1). Given that kinases are involved in signaling networks, this result might indicate a more complicated signaling network in Oc-j. One example of kinase gene family expansion is the domain family of fructosamine kinase (PF03881). Oc-j contains 29 genes with this domain, while N. crassa has none and the other related species have between 3 and 8 genes (Figure 3).
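The per-domain comparison described above (gene counts in Oc-j versus the average of the other species) could be sketched as follows. This is only an illustration under stated assumptions: the function name, the species labels `species_A` and `species_B`, the second domain, and all counts other than the 29 and 0 reported for PF03881 are hypothetical.

```python
from statistics import mean


def compare_to_others(counts, focal):
    """counts maps domain -> {species: number of genes with that domain}.

    Returns (n_more, n_fewer): the number of domains for which the focal
    species has more / fewer genes than the mean of all other species.
    """
    more = fewer = 0
    for per_species in counts.values():
        others = [n for sp, n in per_species.items() if sp != focal]
        average = mean(others)
        if per_species[focal] > average:
            more += 1
        elif per_species[focal] < average:
            fewer += 1
    return more, fewer


# Illustrative counts: only Oc-j = 29 and N. crassa = 0 for PF03881 are
# from the text; everything else here is a hypothetical placeholder.
toy_counts = {
    "PF03881": {"Oc-j": 29, "N. crassa": 0, "species_A": 3, "species_B": 8},
    "PF_hypothetical": {"Oc-j": 2, "N. crassa": 4, "species_A": 5, "species_B": 3},
}
print(compare_to_others(toy_counts, "Oc-j"))
```

Domains where the focal count equals the average are counted in neither group, which is consistent with the reported 61 and 8 not summing to 77.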

Cytochrome P450

Cytochrome P450s (CYPs) are a superfamily of monooxygenases that play a wide range of roles in metabolism and adaptation to ecological niches in fungi [24]. Among the nine species included in the PFAM analysis, N. crassa has only 31 CYPs, while the other eight plant-pathogenic species have between 104 and 223 CYPs. This result likely reflects that the ecological niche of N. crassa, bread, is more favorable for fungal growth than live plants with active defense mechanisms; thus, fewer CYPs are needed for N. crassa to cope with relatively simple substrates. The Oc-j genome contains 223 CYPs (Figure 3, Suppl. Table 1), more than any of the other seven plant-pathogenic fungal species. This may be an evolutionary response to the number and diversity of secondary metabolites, such as juglone, produced by butternut as well as other Juglans species, that would need to be metabolized by any fungal pathogen trying to colonize the tree.

Efflux pumps

To overcome host defenses and to infect and maintain colonization of the host, fungi employ efflux pumps to counter intracellular toxin accumulation [25]. Here, we examine the presence of two major efflux pump families, ATP-binding cassette (ABC) transporters and transporters of the major facilitator superfamily (MFS) [25], in Diaporthales genomes. ABC transporters (PF00005, pfam.xfam.org) are present in all of the nine species included in the PFAM analysis, ranging from 25 genes in N. crassa to 51 genes in V. mali, while Oc-j has 47 genes (Figure 3, Suppl. Table 1). Members of the major facilitator superfamily are membrane proteins that are expressed ubiquitously in all kingdoms of life for the import or export of target substrates. The MFS is a clan that contains 24 PFAM domain families (pfam.xfam.org). The Oc-j genome contains 653 MFS efflux pump genes, the most among all nine species included in the PFAM analysis (Figure 3, Suppl. Table 1). C. parasitica has 552 genes, the second most. Interestingly, it was recently shown that the secondary metabolite juglone, extracted from Juglans spp., could be used as a potential efflux pump inhibitor in Staphylococcus aureus, inhibiting the export of antibiotics out of the bacterial cells [26]. In line with this previous finding, our results suggest that the Oc-j genome contains a large arsenal of efflux pumps, likely due to the need to cope with hostile secondary metabolites such as the efflux pump inhibitor juglone.

Secondary metabolism gene clusters

Secondary metabolite toxins play an important role in fungal nutrition and virulence [27]. To identify gene clusters involved in the biosynthesis of secondary metabolites in Oc-j and related species, we scanned their genomes for such gene clusters. We found that the Oc-j genome contains a remarkably large repertoire of secondary metabolism gene clusters when compared to the closely related C. parasitica and the Chrysoporthe species (Figure 3, Suppl. Table 1). While C. parasitica has 44 gene clusters and the Chrysoporthe species have 18 to 48 gene clusters, the Oc-j genome has a total of 72 gene clusters (Figure 3), reflecting a greater capacity to produce various secondary metabolites. Among the 72 gene clusters in Oc-j, more than half are type 1 polyketide synthases (t1pks, 39 total, including hybrids), followed by non-ribosomal peptide synthetases (nrps, 14 total, including hybrids) and terpene synthases (9) (Supplementary Table 1).

When compared to all species included in this study, we found that Oc-j has the third highest number of clusters, after only D. longicolla and Melanconium sp.; however, when normalized by total gene number in each species, Oc-j is shown to have 6.4 secondary metabolism gene clusters per 1000 genes, the highest among all species included in this study. The D. helianthi genome contains only six clusters, likely due to the poor quality of the genome assembly, which has an N50 of ~6 Kbp (Figure 3, Suppl. Table 1). Cluster numbers can vary greatly within the same genus.
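The normalization used above is a simple density: clusters divided by total gene number, scaled to 1000 genes. The sketch below, with hypothetical function names, shows the calculation and its inverse. The total gene number of Oc-j is not stated in this section, so the roughly 11,250 genes obtained here is only what the rounded figures (72 clusters, 6.4 per 1000 genes) imply, not a reported value.

```python
def clusters_per_1000_genes(n_clusters, n_genes):
    """Secondary metabolism gene clusters per 1000 predicted genes."""
    return 1000 * n_clusters / n_genes


def implied_gene_count(n_clusters, density_per_1000):
    """Gene count implied by a cluster count and a per-1000-genes density."""
    return 1000 * n_clusters / density_per_1000


# 72 clusters and 6.4 clusters per 1000 genes are taken from the text;
# the resulting gene count is an inference from rounded numbers.
approx_genes = round(implied_gene_count(72, 6.4))
print(approx_genes)
print(round(clusters_per_1000_genes(72, approx_genes), 1))
```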

Conclusion

We constructed a high-quality genome assembly for Oc-j, and delineated the phylogeny of Diaporthales with a genome-wide multi-gene approach, revealing two major branches. We then examined several gene families relevant to plant pathogenicity and complex biomass degradation. We found that the Oc-j genome contains large numbers of genes in these gene families. These genes might be essential for Oc-j to cope with its niche in the hardwood butternut (Juglans cinerea). Future research will need to focus on understanding the prevalence of these genes associated with complex biomass degradation among other members of the genus Ophiognomonia, which includes endophytes, saprophytes and pathogens. It will be interesting to know which of these different classes of genes are important in the evolution of distinct fungal lifestyles and niche adaptation. Future research will also focus on comparing the butternut canker pathogen to canker pathogens of other tree species which produce large numbers of secondary metabolites known to be important in host defense. Our genome also serves as an essential resource for the Oc-j research community.

Methods

DNA extraction and library preparation

For this study, the type culture of Oc-j (ATCC 36624), recovered from an infected butternut tree in Wisconsin in 1978, was sequenced. For DNA extraction, isolates were grown on cellophane-covered potato dextrose agar for 7-10 d, and mycelia were collected and lyophilized. DNA was extracted from lyophilized mycelia using the CTAB method as outlined by the Joint Genome Institute for whole genome sequencing (Kohler A, Francis M. Genomic DNA Extraction, Accessed 12/12/2015 http://1000.fungalgenomes.org/home/wp-content/uploads/2013/02/genomicDNAProtocol-AK0511.pdf). The total DNA quantity and quality were measured using both Nanodrop and Qubit, and the sample was sent to the Hubbard Center for Genome Studies at the University of New Hampshire, Durham, New Hampshire. DNA libraries were prepared using the paired-end Illumina Truseq sample preparation kit, and were sequenced on an Illumina HiSeq 2500.

Genome assembly and annotation

We corrected our raw reads using BLESS 0.16 [28] with the following options: -kmerlength 23 -verify -notrim. Once our reads were corrected, we trimmed the reads at a phred score of 2 at both the leading and trailing ends of the reads using Trimmomatic 0.32 [29]. We used a sliding window of four bases that must average a phred score of 2, and the reads must maintain a minimum length of 25 bases. Next, a de novo assembly was built using SPAdes-3.1.1 [30] with both paired and unpaired reads and the following settings: -t 8 -m 100 --only-assembler. Genome sequences were deposited at ncbi.nlm.nih.gov under Bioproject number PRJNA434132.
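The trimming criteria described above (leading and trailing quality 2, a four-base sliding window averaging at least 2, minimum length 25) can be illustrated with a simplified sketch. This is not Trimmomatic's code: the function name, the order in which the steps are applied, and the edge handling are assumptions made for illustration, and the real tool may differ in detail.

```python
def trim_read(quals, leading=2, trailing=2, window=4, min_avg=2, min_len=25):
    """Return the (start, end) slice of a read to keep, or None if the
    trimmed read is shorter than min_len.

    quals is a list of per-base phred quality scores.
    """
    start, end = 0, len(quals)
    # Remove bases below the leading threshold from the 5' end.
    while start < end and quals[start] < leading:
        start += 1
    # Remove bases below the trailing threshold from the 3' end.
    while end > start and quals[end - 1] < trailing:
        end -= 1
    # Scan windows from the 5' end; cut the read at the first window
    # whose mean quality falls below min_avg.
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < min_avg:
            end = i
            break
    # Discard reads that end up too short.
    return (start, end) if end - start >= min_len else None


# Hypothetical quality strings, for illustration only.
print(trim_read([0] * 2 + [30] * 30 + [0] * 5))
print(trim_read([30] * 26 + [0] * 5 + [30] * 5))
print(trim_read([30] * 10))
```

With thresholds as low as 2, only bases of very low reported quality are removed, so most of each read is retained.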

Gene annotation was performed using the MAKER2 pipeline [31] in an iterative manner as described in [32], with protein evidence from the related species Melanconium sp., Cryphonectria parasitica, Diaporthe ampelina (from jgi.doe.gov) and Diaporthe helianthi [19], for a total of three iterations. PFAM domains were identified in all the genomes using hmmscan with trusted cutoff [33]. Only nine species were included in the PFAM analysis. For the three Chrysoporthe species, D. aspalathi, and D. longicolla, protein sequences were not readily available for downloading online and e-mail requests were unsuccessful. Secondary metabolism gene clusters were identified using antiSMASH 4.0 [34].

Species phylogeny

Core eukaryotic proteins identified by CEGMA [35] were first aligned by MAFFT [36] and then concatenated. Only proteins that were present in all genomes, and for which every sequence was longer than 90% of the Saccharomyces cerevisiae ortholog, were used. Phylogeny was then inferred using maximum likelihood by RAxML [37] with 100 bootstraps and then midpoint rooted.
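The protein filter and the concatenation step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the genome labels and the toy sequences are hypothetical, and the alignment itself (done with MAFFT in the text) is assumed to have already been computed.

```python
def keep_protein(lengths_by_genome, reference_length, genomes, min_fraction=0.9):
    """True if the protein is present in every genome and every copy is
    longer than min_fraction of the reference ortholog length."""
    return all(
        genome in lengths_by_genome
        and lengths_by_genome[genome] > min_fraction * reference_length
        for genome in genomes
    )


def concatenate(alignments, genomes):
    """Join per-protein alignments ({genome: aligned sequence}) into one
    concatenated sequence per genome, keeping the protein order."""
    return {g: "".join(aln[g] for aln in alignments) for g in genomes}


# Hypothetical genomes and sequences, for illustration only.
genomes = ["genome_A", "genome_B"]
print(keep_protein({"genome_A": 95, "genome_B": 100}, 100, genomes))
print(keep_protein({"genome_A": 95}, 100, genomes))
print(concatenate(
    [{"genome_A": "MK-", "genome_B": "MKL"}, {"genome_A": "AA", "genome_B": "A-"}],
    genomes,
))
```

Requiring presence in all genomes keeps the concatenated matrix free of whole-gene gaps, at the cost of discarding proteins missing from even one assembly.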

Secreted CAZyme prediction

SignalP [38] was used to predict the presence of secretory signal peptides. CAZymes were predicted using CAZymes Analysis Toolkit [39] based on the most recent CAZY database (www.cazy.org). Proteins that both contain a signal peptide and are predicted to be a CAZyme are annotated as secreted CAZymes.
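The rule stated above amounts to a set intersection between two prediction results. A minimal sketch, assuming each tool's output has already been reduced to a collection of protein identifiers (the function name and identifiers below are hypothetical):

```python
def secreted_cazymes(signal_peptide_ids, cazyme_ids):
    """Proteins predicted to carry a signal peptide AND to be a CAZyme."""
    return set(signal_peptide_ids) & set(cazyme_ids)


# Hypothetical protein identifiers, for illustration only.
with_signal_peptide = ["g1", "g2", "g3"]
predicted_cazymes = ["g2", "g3", "g4"]
print(sorted(secreted_cazymes(with_signal_peptide, predicted_cazymes)))
```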

Conflicts of interest

The authors have no conflict of interest.

Acknowledgements. Special thanks to Dr. Matt MacManus. This work started as an in-class project for PhD student TS and provided the basis for this final manuscript. Dr. Broders was supported by Simons Foundation Grant number 429440 to the Smithsonian Tropical Research Institute.

References

[1] H. Voglmayr, L.A. Castlebury, W.M. Jaklitsch, Juglanconis gen. nov. on Juglandaceae, and the new family Juglanconidaceae (Diaporthales), Persoonia Mol. Phylogeny Evol. Fungi. 38 (2017) 136–155. doi:10.3767/003158517X694768.
[2] A.Y. Rossman, D.F. Farr, L.A. Castlebury, A review of the phylogeny and biology of the Diaporthales, Mycoscience. 48 (2007) 135–144. doi:10.1007/S10267-007-0347-7.
[3] L.A. Castlebury, A.Y. Rossman, W.J. Jaklitsch, L.N. Vasilyeva, A preliminary overview of the Diaporthales based on large subunit nuclear ribosomal DNA sequences, Mycologia. 94 (2002) 1017–1031.
[4] D.W. Renlund, Forest pest conditions in Wisconsin, Annual Report, 53 p., Wisconsin Department of Natural Resources, Madison, WI, 1971.
[5] V.M.G. Nair, C.J. Kostichka, J.E. Kuntz, Sirococcus clavigignenti-juglandacearum: An Undescribed Species Causing Canker on Butternut, Mycologia. 71 (1979) 641–646. doi:10.2307/3759076.
[6] K.D. Broders, G.J. Boland, Reclassification of the butternut canker fungus, Sirococcus clavigignenti-juglandacearum, into the genus Ophiognomonia, Fungal Biol. 115 (2011) 70–79. doi:10.1016/j.funbio.2010.10.007.
[7] K.D. Broders, A. Boraks, A.M. Sanchez, G.J. Boland, Population structure of the butternut canker fungus, Ophiognomonia clavigignenti-juglandacearum, in North American forests, Ecol. Evol. 2 (2012) 2114–2127. doi:10.1002/ece3.332.
[8] M.V. Sogonov, L.A. Castlebury, A.Y. Rossman, L.C. Mejia, J.F. White, Leaf-inhabiting genera of the Gnomoniaceae, Diaporthales, Stud. Mycol. 62 (2008) 1–79.
[9] D.M. Walker, L.A. Castlebury, A.Y. Rossman, L.C. Mejia, J.F. White, Phylogeny and taxonomy of Ophiognomonia (Gnomoniaceae, Diaporthales), including twenty-five new species in this highly diverse genus, Fungal Divers. 57 (2012) 85–147.
[10] X. Sun, L.-D. Guo, K.D. Hyde, Community composition of endophytic fungi in Acer truncatum and their role in decomposition, Fungal Divers. 47 (2011) 85–95. doi:10.1007/s13225-010-0086-5.

[11] K. Broders, A. Boraks, L. Barbison, J. Brown, G.J. Boland, Recent insights into the pandemic disease butternut canker caused by the invasive pathogen Ophiognomonia clavigignenti-juglandacearum, For. Pathol. 45 (2015) 1–8. doi:10.1111/efp.12161.
[12] Z. Yin, H. Liu, Z. Li, X. Ke, D. Dou, X. Gao, N. Song, Q. Dai, Y. Wu, J.-R. Xu, Z. Kang, L. Huang, Genome sequence of Valsa canker pathogens uncovers a potential adaptation of colonization of woody bark, New Phytol. 208 (2015) 1202–1216. doi:10.1111/nph.13544.
[13] F.A. Simão, R.M. Waterhouse, P. Ioannidis, E.V. Kriventseva, E.M. Zdobnov, BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs, Bioinformatics. 31 (2015) 3210–3212. doi:10.1093/bioinformatics/btv351.
[14] B.D. Wingfield, I. Barnes, Z. Wilhelm de Beer, L. De Vos, T.A. Duong, A.M. Kanzi, K. Naidoo, H.D.T. Nguyen, Q.C. Santana, M. Sayari, K.A. Seifert, E.T. Steenkamp, C. Trollip, N.A. van der Merwe, M.A. van der Nest, P. Markus Wilken, M.J. Wingfield, IMA Genome-F 5: Draft genome sequences of Ceratocystis eucalypticola, Chrysoporthe cubensis, C. deuterocubensis, Davidsoniella virescens, Fusarium temperatum, Graphilbum fragrans, Penicillium nordicum, and Thielaviopsis musarum, IMA Fungus. 6 (2015) 493–506. doi:10.5598/imafungus.2015.06.02.13.
[15] B.D. Wingfield, P.K. Ades, F.A. Al-Naemi, L.A. Beirn, W. Bihon, J.A. Crouch, Z.W. de Beer, L. De Vos, T.A. Duong, C.J. Fields, G. Fourie, A.M. Kanzi, M. Malapi-Wight, S.J. Pethybridge, O. Radwan, G. Rendon, B. Slippers, Q.C. Santana, E.T. Steenkamp, P.W.J. Taylor, N. Vaghefi, N.A. van der Merwe, D. Veltri, M.J. Wingfield, IMA Genome-F 4: Draft genome sequences of Chrysoporthe austroafricana, Diplodia scrobiculata, Fusarium nygamai, Leptographium lundbergii, Limonomyces culmigenus, Stagonosporopsis tanaceti, and Thielaviopsis punctulata, IMA Fungus. 6 (2015) 233–248. doi:10.5598/imafungus.2015.06.01.15.
[16] J. Savitha, S.D. Bhargavi, V.K. Praveen, Complete Genome Sequence of the Endophytic Fungus Diaporthe (Phomopsis) ampelina, Genome Announc. 4 (2016) e00477-16. doi:10.1128/genomeA.00477-16.
[17] S. Li, Q. Song, A.M. Martins, P. Cregan, Draft genome sequence of Diaporthe aspalathi isolate MS-SSC91, a fungus causing stem canker in soybean, Genomics Data. 7 (2016) 262–263. doi:10.1016/j.gdata.2016.02.002.

[18] S. Li, O. Darwish, N. Alkharouf, B. Matthews, P. Ji, L.L. Domier, N. Zhang, B.H. Bluhm, Draft genome sequence of Phomopsis longicolla isolate MSPL 10-6, Genomics Data. 3 (2015) 55–56. doi:10.1016/j.gdata.2014.11.007.
[19] R. Baroncelli, F. Scala, M. Vergara, M.R. Thon, M. Ruocco, Draft whole-genome sequence of the Diaporthe helianthi 7/96 strain, causal agent of sunflower stem canker, Genomics Data. 10 (2016) 151–152. doi:10.1016/j.gdata.2016.11.005.
[20] J.E. Galagan, S.E. Calvo, K.A. Borkovich, E.U. Selker, N.D. Read, D. Jaffe, W. FitzHugh, L.-J. Ma, S. Smirnov, S. Purcell, B. Rehman, T. Elkins, R. Engels, S. Wang, C.B. Nielsen, J. Butler, M. Endrizzi, D. Qui, P. Ianakiev, D. Bell-Pedersen, M.A. Nelson, M. Werner-Washburne, C.P. Selitrennikoff, J.A. Kinsey, E.L. Braun, A. Zelter, U. Schulte, G.O. Kothe, G. Jedd, W. Mewes, C. Staben, E. Marcotte, D. Greenberg, A. Roy, K. Foley, J. Naylor, N. Stange-Thomann, R. Barrett, S. Gnerre, M. Kamal, M. Kamvysselis, E. Mauceli, C. Bielke, S. Rudd, D. Frishman, S. Krystofova, C. Rasmussen, R.L. Metzenberg, D.D. Perkins, S. Kroken, C. Cogoni, G. Macino, D. Catcheside, W. Li, R.J. Pratt, S.A. Osmani, C.P.C. DeSouza, L. Glass, M.J. Orbach, J.A. Berglund, R. Voelker, O. Yarden, M. Plamann, S. Seiler, J. Dunlap, A. Radford, R. Aramayo, D.O. Natvig, L.A. Alex, G. Mannhaupt, D.J. Ebbole, M. Freitag, I. Paulsen, M.S. Sachs, E.S. Lander, C. Nusbaum, B. Birren, The genome sequence of the filamentous fungus Neurospora crassa, Nature. 422 (2003) 859–868. doi:10.1038/nature01554.
[21] R.A. Dean, N.J. Talbot, D.J. Ebbole, M.L. Farman, T.K. Mitchell, M.J. Orbach, M. Thon, R. Kulkarni, J.-R. Xu, H. Pan, N.D. Read, Y.-H. Lee, I. Carbone, D. Brown, Y.Y. Oh, N. Donofrio, J.S. Jeong, D.M. Soanes, S. Djonovic, E. Kolomiets, C. Rehmeyer, W. Li, M. Harding, S. Kim, M.-H. Lebrun, H. Bohnert, S. Coughlan, J. Butler, S. Calvo, L.-J. Ma, R. Nicol, S. Purcell, C. Nusbaum, J.E. Galagan, B.W. Birren, The genome sequence of the rice blast fungus Magnaporthe grisea, Nature. 434 (2005) 980–986. doi:10.1038/nature03449.
[22] A. Morales-Cruz, K.C.H. Amrine, B. Blanco-Ulate, D.P. Lawrence, R. Travadon, P.E. Rolshausen, K. Baumgartner, D. Cantu, Distinctive expansion of gene families associated with plant cell wall degradation, secondary metabolism, and nutrient uptake in the genomes of grapevine trunk pathogens, BMC Genomics. 16 (2015) 469. doi:10.1186/s12864-015-1624-z.

[23] T.A. Schuelke, G. Wu, A. Westbrook, K. Woeste, D.C. Plachetzki, K. Broders, M.D. MacManes, Comparative Genomics of Pathogenic and Nonpathogenic Beetle-Vectored Fungi in the Genus Geosmithia, Genome Biol. Evol. 9 (2017) 3312–3327. doi:10.1093/gbe/evx242.

[24] W. Chen, M.-K. Lee, C. Jefcoate, S.-C. Kim, F. Chen, J.-H. Yu, Fungal Cytochrome P450 Monooxygenases: Their Distribution, Structure, Functions, Family Expansion, and Evolutionary Origin, Genome Biol. Evol. 6 (2014) 1620–1634. doi:10.1093/gbe/evu132.

[25] J.J. Coleman, E. Mylonakis, Efflux in Fungi: La Pièce de Résistance, PLOS Pathog. 5 (2009) e1000486. doi:10.1371/journal.ppat.1000486.

[26] T. Zmantar, H. Miladi, B. Kouidhi, Y. Chaabouni, R. Ben Slama, A. Bakhrouf, K. Mahdouani, K. Chaieb, Use of juglone as antibacterial and potential efflux pump inhibitors in Staphylococcus aureus isolated from the oral cavity, Microb. Pathog. 101 (2016) 44–49. doi:10.1016/j.micpath.2016.10.022.

[27] B.J. Howlett, Secondary metabolite toxins and nutrition of plant pathogenic fungi, Curr. Opin. Plant Biol. 9 (2006) 371–375. doi:10.1016/j.pbi.2006.05.004.

[28] Y. Heo, X.-L. Wu, D. Chen, J. Ma, W.-M. Hwu, BLESS: Bloom filter-based error correction solution for high-throughput sequencing reads, Bioinformatics. 30 (2014) 1354–1362. doi:10.1093/bioinformatics/btu030.

[29] A.M. Bolger, M. Lohse, B. Usadel, Trimmomatic: a flexible trimmer for Illumina sequence data, Bioinformatics. 30 (2014) 2114–2120. doi:10.1093/bioinformatics/btu170.

[30] A. Bankevich, S. Nurk, D. Antipov, A.A. Gurevich, M. Dvorkin, A.S. Kulikov, V.M. Lesin, S.I. Nikolenko, S. Pham, A.D. Prjibelski, A.V. Pyshkin, A.V. Sirotkin, N. Vyahhi, G. Tesler, M.A. Alekseyev, P.A. Pevzner, SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing, J. Comput. Biol. 19 (2012) 455–477. doi:10.1089/cmb.2012.0021.

[31] C. Holt, M. Yandell, MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects, BMC Bioinformatics. 12 (2011) 491. doi:10.1186/1471-2105-12-491.

[32] J. Yu, G. Wu, W.M. Jurick, V.L. Gaskins, Y. Yin, G. Yin, J.W. Bennett, D.R. Shelton, Genome Sequence of Penicillium solitum RS1, Which Causes Postharvest Apple Decay, Genome Announc. 4 (2016) e00363-16. doi:10.1128/genomeA.00363-16.

[33] A. Bateman, L. Coin, R. Durbin, R.D. Finn, V. Hollich, S. Griffiths-Jones, A. Khanna, M. Marshall, S. Moxon, E.L.L. Sonnhammer, D.J. Studholme, C. Yeats, S.R. Eddy, The Pfam protein families database, Nucleic Acids Res. 32 (2004) D138–D141. doi:10.1093/nar/gkh121.

[34] T. Weber, K. Blin, S. Duddela, D. Krug, H.U. Kim, R. Bruccoleri, S.Y. Lee, M.A. Fischbach, R. Müller, W. Wohlleben, R. Breitling, E. Takano, M.H. Medema, antiSMASH 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters, Nucleic Acids Res. 43 (2015) W237–W243. doi:10.1093/nar/gkv437.

[35] G. Parra, K. Bradnam, I. Korf, CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes, Bioinformatics. 23 (2007) 1061–1067. doi:10.1093/bioinformatics/btm071.

[36] K. Katoh, K. Misawa, K. Kuma, T. Miyata, MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform, Nucleic Acids Res. 30 (2002) 3059–3066. doi:10.1093/nar/gkf436.

[37] A. Stamatakis, RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies, Bioinformatics. 30 (2014) 1312–1313. doi:10.1093/bioinformatics/btu033.

[38] T.N. Petersen, S. Brunak, G. von Heijne, H. Nielsen, SignalP 4.0: discriminating signal peptides from transmembrane regions, Nat. Methods. 8 (2011) 785. doi:10.1038/nmeth.1701.

[39] B.H. Park, T.V. Karpinets, M.H. Syed, M.R. Leuze, E.C. Uberbacher, CAZymes Analysis Toolkit (CAT): Web service for searching and analyzing carbohydrate-active enzymes in a newly sequenced organism using CAZy database, Glycobiology. 20 (2010) 1574–1584. doi:10.1093/glycob/cwq106.


Supplementary Table 1. PFAM and antiSMASH annotation of Oc-j and related species

Figure Legend

Figure 1. Symptoms of infection caused by Oc-j on the A) , B) branches and C) trunk, as well as D) signs of fruiting bodies on an infected branch.

Figure 2. Phylogeny of Diaporthales and related species inferred using maximum likelihood by RAxML [37] with 1000 bootstraps and then midpoint rooted. The first and second numbers in parentheses represent the genome size in Mb and the number of predicted protein models, respectively.

Figure 3. Abundance of genes in specific classes including ABC transporters, Cytochrome P450, Kinases, CAZymes, and MFS Efflux pumps present in nine species within the order Diaporthales.
