Genetics: Early Online, published on October 24, 2019 as 10.1534/genetics.119.302617

1 Programmed cell death in Neurospora crassa is controlled by the

2 allorecognition determinant rcd-1

3 Asen Daskalov*§, Pierre Gladieux‡, Jens Heller*† and N. Louise Glass*†

6 *Plant and Microbial Biology Department, The University of California, Berkeley, CA 94720

7 †Environmental Genomics and Systems Biology Division, The Lawrence Berkeley National

8 Laboratory, 1 Cyclotron Road, Berkeley, CA 94720

9 ‡UMR BGPI, INRA, CIRAD, Montpellier SupAgro, Univ Montpellier, Montpellier, France

10 §Current address: CNRS, UMR 5248, European Institute of Chemistry and Biology, University

11 of Bordeaux, Pessac, France

20 Short running title: Identification of regulator of cell death-1

21 Keywords: Neurospora, programmed cell death, allorecognition, heterokaryon

22 incompatibility, cell fusion

Copyright 2019.

24 Corresponding author

25 N. Louise Glass

26 Environmental Genomics and Systems Biology Division

27 The Lawrence Berkeley National Laboratory

28 1 Cyclotron Road, Berkeley, CA 94720

29 510.643.2546

30 [email protected]

46 Abstract

47 Non-self recognition following cell fusion between genetically distinct individuals of the

48 same species in filamentous fungi often results in a programmed cell death (PCD) reaction,

49 where the heterokaryotic fusion cell is compartmentalized and rapidly killed. The

50 allorecognition process plays a key role as a defense mechanism that restricts genome

51 exploitation, resource plundering and the spread of deleterious senescence plasmids and

52 mycoviruses. Although a number of incompatibility systems have been described that

53 function in mature hyphae, less is known about the PCD pathways in asexual spores, which

54 represent the main infectious unit in various human and plant fungal pathogens. Here, we

55 report the identification of regulator of cell death-1 (rcd-1), a novel allorecognition gene,

56 controlling PCD in germinating asexual spores of Neurospora crassa; rcd-1 is one of the most

57 polymorphic genes in the genomes of wild N. crassa isolates. The co-expression of two

58 antagonistic rcd-1-1 and rcd-1-2 alleles was necessary and sufficient to trigger cell death in

59 fused germlings and in hyphae. Based on analysis of wild populations of N. crassa and N.

60 discreta, rcd-1 alleles appeared to be under balancing selection and associated with trans-

61 species polymorphisms. We shed light on genomic rearrangements that could have led to

62 the emergence of the incompatibility system in Neurospora and show that rcd-1 belongs to a

63 much larger gene family in fungi. Overall, our work contributes towards a better

64 understanding of allorecognition and PCD in an underexplored developmental stage of

65 filamentous fungi.

67 Introduction

68 Allorecognition, or discrimination against conspecific but genetically distinct non-self,

69 is an important phenomenon in various colonial organisms, where it limits genome

70 exploitation and resource plundering (Debets and Griffiths 1998; Laird et al. 2005; Ho et al.

71 2013; Bastiaans et al. 2016). The phenomenon has been described in early diverging

72 invertebrate metazoans, including the ascidian Botryllus schlosseri, the cnidarian

73 Hydractinia symbiolongicarpus and in the social amoeba Dictyostelium discoideum, where it

74 restricts colonial chimerism to closely related individuals (Laird et al. 2005; Lakkis et al.

75 2008; Rosengarten and Nicotra 2011). In filamentous fungi, where cell fusion (anastomosis)

76 between genetically different fungal colonies is often abortive, allorecognition mechanisms

77 play a similar role by triggering a programmed cell death (PCD) reaction termed vegetative

78 or heterokaryon incompatibility (HI) (Glass and Dementhon 2006; Paoletti 2016; Daskalov

79 et al. 2017). The cell death reaction is spatially and temporally restricted to the

80 heterokaryotic fusion cells in the contact zone between two genetically distinct individuals

81 and is often associated with the formation of a pigmented demarcation line or barrage

82 between incompatible strains (Perkins et al. 2000). Fusion cells undergo extensive

83 vacuolization, hyper-septation, ROS production, lipid droplet accumulation and plasma

84 membrane disruption (Saupe et al. 2000; Glass and Kaneko 2003). HI has been described in

85 dozens of filamentous ascomycete species and the populations of numerous

86 phytopathogenic species have been characterized in terms of vegetative compatibility

87 groups (VCGs). Strains belonging to the same VCG are able to form heterokaryons and

88 transmit genetic content horizontally and often exhibit similar virulence and host

89 specificity, while strains of different VCGs are incompatible and do not form viable

90 heterokaryons (Zeise and Von Tiedemann 2002; Fan et al. 2018). By restricting chimerism

91 between individuals, HI limits the horizontal transmission of deleterious cytoplasmic

92 elements (i.e. mycoviruses, senescence plasmids), thus functioning as a fungal defense

93 mechanism (Debets et al. 1994; Zhang et al. 2014). In Cryphonectria parasitica, the causal

94 agent of chestnut blight disease, systematic disruption of allorecognition genes that

95 prevented heterokaryon formation between strains of different VCG resulted in the creation

96 of a super mycovirus donor strain (Zhang and Nuss 2016). The ‘super mycovirus donor

97 strain’ carries a hypo-virulence plasmid and hence can be an effective biocontrol agent

98 (Zhang and Nuss 2016). Similar approaches underscore the importance of genetically

99 identifying and characterizing novel allorecognition loci in fungi.

100 Neurospora crassa is a model filamentous ascomycete species for studying cell-cell

101 communication, anastomosis formation and heterokaryon incompatibility (Zhao et al. 2015;

102 Herzog et al. 2015; Daskalov et al. 2017). At least 11 loci control HI in N. crassa (Mylyk

103 1975). Several het genes (for heterokaryon incompatibility) have been molecularly

104 identified and shown to induce cell death when co-expressed with incompatible alleles of

105 the same gene (allelic HI systems) or with incompatible alleles of a different gene (non-

106 allelic HI systems) (Kaneko et al. 2006; Paoletti 2016). The identified het genes are often

107 highly polymorphic with several different alleles segregating in near equal frequencies in

108 wild populations of N. crassa (Wu et al. 1998; Hall et al. 2010). Such a distribution pattern of

109 antagonistic alleles in wild isolates is an indication that a particular locus is under balancing

110 selection. Balancing selection has been described to operate on the self-incompatibility

111 locus (SI) in flowering plants, the major histocompatibility complex (MHC) in mammals and

112 has been associated with allorecognition loci in other animals (Richman 2000; Nydam et al.

113 2017). Alleles of a gene under balancing selection can be maintained in populations

114 undergoing speciation events, which results in a trans-species segregation of multiple allelic

115 haplogroups (Klein et al. 1998). Evolutionary hallmarks of long-term balancing selection

116 and trans-species polymorphism have been recently used in a population genomics analysis

117 of N. crassa wild isolates to successfully identify novel het genes (Zhao et al. 2015). However,

118 in spite of the frequently shared molecular hallmarks of evolution, het gene function seems

119 poorly conserved between species (Paoletti 2016).

120 The products of several het genes belong to a large family of fungal proteins,

121 analogous to NOD-like receptors (NLRs) (Dyrka et al. 2014). NLR proteins (also known as

122 nucleotide-binding site and leucine-rich repeats containing proteins or NBs-LRR) are

123 intracellular innate immune receptors controlling cell death in plants and animals in

124 response to immunogenic cues (Dyrka et al. 2014; Duxbury et al. 2016; Jones et al. 2016).

125 The similarities between het determinants and genes involved in innate immunity in other

126 eukaryotic lineages have prompted the hypothesis that HI may have originated from

127 signaling pathways mediating interspecific biotic interactions, possibly in the context of an

128 uncharacterized fungal innate immunity system (Paoletti and Saupe 2009; Dyrka et al. 2014;

129 Uehling et al. 2017). Thus, the molecular identification and characterization of novel

130 allorecognition factors in fungi could help us elucidate the evolutionary history of non-self

131 recognition processes in Eukaryotes.

132 Previously identified HI-inducing systems are suppressed during the Neurospora

133 sexual cycle and also in germinating conidia (germlings), which play a fundamental role for

134 vegetative propagation (Shiu and Glass 1999; Ishikawa et al. 2012). However, recently

135 Heller et al. identified a gene pair controlling PCD and allorecognition in germlings of N.

136 crassa (Heller et al. 2018). Germling-regulated death (GRD) is triggered when germlings of

137 incompatible haplotypes undergo cellular fusions during early colony establishment. The

138 cell death reaction occurs very rapidly (~20 min) with the appearance of large vacuoles in

139 the cytoplasm of fused incompatible germlings preceding cellular lysis and uptake of vital

140 dyes (Heller et al. 2018). The two genes controlling GRD are linked in the genome of

141 Neurospora. One gene (plp-1) encodes an NLR-like protein containing a phospholipase

142 domain and the other (sec-9) encodes a homolog of SEC9 from yeast, a protein involved in

143 plasma membrane fusion and exocytosis (Brennwald et al. 1994; Heller et al. 2018).

144 The discovery of GRD represents an exciting possibility to study fungal PCD in a well-

145 controlled biological system and during a developmental stage of the fungal life cycle

146 (asexual spores) of great importance for clinical and agricultural applications, because it

147 constitutes the infectious stage of many human and plant fungal pathogens (Brown et al.

148 2012). Here, we further investigated the molecular basis of GRD and identified a novel

149 genetic determinant of allorecognition in N. crassa. By using bulked segregant analysis

150 (BSA) of GRD-specific DNA pools and population genomics data, the NCU05712 locus was

151 identified as a second GRD-controlling allorecognition factor, termed regulator of cell death-

152 1 (rcd-1). Viable germling fusion of cells containing incompatible alleles of rcd-1 (rcd-1-1

153 and rcd-1-2) was prevented by PCD induction. The rcd-1 alleles are highly polymorphic in N.

154 crassa populations and show signs of balancing selection. Furthermore, we identified

155 genomic rearrangements in the rcd-1 locus, which could be at the origin of this

156 allorecognition system in Neurospora. Such genomic rearrangements could represent a

157 general evolutionary mode for fungal incompatibility systems. Moreover, we show that

158 rcd-1 belongs to a large gene family in filamentous ascomycete species. The latter

159 observation suggests that rcd-1 might be an important component of cell death pathways in

160 fungi with a broader role in non-self recognition.

161 Materials and Methods

162 Strains, Growth media, and Molecular constructs

163 Strains were grown using standard procedures and protocols that can be found on

164 the Neurospora homepage at FGSC (www.fgsc.net/Neurospora/NeurosporaProtocolGuide).

165 Vogel’s minimal medium (VMM) (with supplements, if required) was used to culture all

166 strains, except when specified otherwise (Vogel 1956). Crosses were performed on

167 Westergaard’s synthetic cross medium (Westergaard and Mitchell 1947). For flow

168 cytometry experiments, thermo-reversible solid Vogel’s medium was obtained by substituting

169 the agar with 20% Pluronic F-127 (Sigma–Aldrich).

170 All wild N. crassa strains in the present study were isolated from Louisiana (USA),

171 have been described previously, and are available at the FGSC (Dettman et al. 2003; Palma-

172 Guerrero et al. 2013; Zhao et al. 2015; Corcoran et al. 2016) (Table S1). All engineered

173 strains were from the genetic background of the wild type lab strain (FGSC2489). The

174 ΔNCU05712 deletion strain was obtained from the single gene deletion collection of N.

175 crassa strains at the FGSC (Colot et al. 2006).

176 The rcd-1-1 and rcd-1-2 alleles with native promoters were cloned in the pMF272

177 vector (Freitag et al. 2004) using the restriction enzymes NotI and ApaI and introduced in

178 the his-3 locus of a Δrcd-1 strain. Molecular fusions of rcd-1 with fluorescent proteins (GFP

179 or mCherry) were produced by cloning the rcd-1 alleles in a pMF272-derived vector using

180 MscI/PacI (for the rcd-1-1 allele) and XbaI/PacI (for the rcd-1-2 allele) under the regulation

181 of the tef-1 promoter. Positive transformants were backcrossed with a Δrcd-1 strain to

182 construct homokaryotic strains that were subsequently verified by PCR.

183 Bulked segregant analyses and genome re-sequencing

184 Bulked segregant analysis was performed as described (Heller et al. 2018). The GRD

185 phenotype of progeny from a cross between FGSC2489 and a wild isolate JW258 (FGSC

186 10679) was determined by pairing germlings with each parental strain and assessing PCD.

187 Two GRD-specific DNA pools were produced (60 ng from 50 progeny strains in each DNA

188 pool) for library preparation and sequencing. All paired-end libraries were sequenced on a

189 HiSeq2000 sequencing platform using standard Illumina operating procedures (Vincent J.

190 Coates Genomics Sequencing Laboratory, University of California, Berkeley). The mapped

191 reads for each group of 50 pooled segregants are available at the Sequence Read Archive

192 (PRJNA556444; https://www.ncbi.nlm.nih.gov/sra/PRJNA556444).
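
The bulked segregant logic can be illustrated with a minimal sketch. It assumes per-SNP read counts are already available for both pools, computes the alternate-allele frequency in each pool, and reports genomic windows where the two pools diverge strongly. The input tuple layout, the function names, the window size and the `min_delta` cutoff are hypothetical choices for illustration and do not describe the pipeline used in this study.

```python
from collections import defaultdict

def allele_frequency(alt_reads, total_reads):
    """Fraction of reads supporting the alternate allele at one SNP."""
    return alt_reads / total_reads

def linked_windows(snps, window=100_000, min_delta=0.9):
    """Return sorted (chromosome, window_start) pairs whose mean
    allele-frequency difference between the two pools is >= min_delta.

    `snps` is an iterable of tuples (hypothetical layout):
    (chromosome, position, alt_pool1, total_pool1, alt_pool2, total_pool2)
    """
    deltas = defaultdict(list)
    for chrom, pos, alt1, tot1, alt2, tot2 in snps:
        if tot1 == 0 or tot2 == 0:
            continue  # skip sites without coverage in one of the pools
        diff = abs(allele_frequency(alt1, tot1) - allele_frequency(alt2, tot2))
        deltas[(chrom, (pos // window) * window)].append(diff)
    return sorted(key for key, values in deltas.items()
                  if sum(values) / len(values) >= min_delta)

# Toy read counts: two SNPs in a linked window, one unlinked SNP near 50%.
toy_snps = [
    ("III", 120_000, 30, 30, 0, 28),
    ("III", 150_000, 25, 26, 1, 30),
    ("I", 50_000, 14, 30, 16, 30),
]
print(linked_windows(toy_snps))
```

In this toy input, unlinked SNPs sit near 50% in both pools and are ignored, whereas SNPs in a linked window approach 100% in one pool and 0% in the other.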

193 Flow cytometry

194 Flow cytometry was performed as described (Heller et al. 2018). We recorded 20,000

195 events per sample for each experiment. Experiments were performed at least three times.

196 Data were analyzed with a custom MATLAB (MathWorks) script, with ungerminated conidia

197 excluded from the analyses (Gonçalves et al. 2019). Cell death is shown as the average

198 percentage of fluorescent events from all experiments.
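
The summary statistic described above (the average percentage of fluorescent events across experiments) can be sketched as follows. This is not the MATLAB script used in the study: the fluorescence threshold, the function names and the toy values are hypothetical, and the gating step that excludes ungerminated conidia is assumed to have been applied already.

```python
def percent_fluorescent(events, threshold):
    """Percentage of recorded events whose fluorescence exceeds `threshold`."""
    if not events:
        raise ValueError("no events recorded")
    positive = sum(1 for value in events if value > threshold)
    return 100.0 * positive / len(events)

def mean_cell_death(experiments, threshold):
    """Average the per-experiment percentages over all replicate experiments."""
    per_experiment = [percent_fluorescent(events, threshold)
                      for events in experiments]
    return sum(per_experiment) / len(per_experiment)

# Three toy replicate experiments with four events each (arbitrary units).
toy_experiments = [
    [10, 500, 20, 30],
    [15, 600, 700, 5],
    [8, 9, 10, 900],
]
print(mean_cell_death(toy_experiments, threshold=100))
```

Averaging the per-experiment percentages, rather than pooling all events, gives each replicate equal weight regardless of how many events it contributed.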

199 Evolutionary analysis of DNA and protein sequences

200 We characterized homologs of NCU05712 in a collection of 194 publicly available

201 genomes representing 16 Neurospora species ( II) (Kruys et

202 al. 2015), one non-Neurospora Lasiosphaeriaceae II species, 2 species in Lasiosphaeriaceae

203 IV, one species in Lasiosphaeriaceae I, and 14 species of thermophilic sub-family

204 (Table S1). The dataset included resequencing data for 27 N. crassa isolates

205 from Louisiana (Galagan et al. 2003; Zhao et al. 2015), 92 N. tetrasperma strains from North

206 America, Europe and Oceania (Ellison et al. 2011; Corcoran et al. 2016) and 43 N. discreta

207 PS4 from North America, Europe and Asia (Gladieux et al. 2015). We used publicly available

208 genome assemblies and predicted genes where possible. Other genomes were assembled

209 using ABySS (Simpson et al. 2009) and annotated using Augustus, with N. crassa as the

210 reference species and intron locations obtained by mapping other species’ proteins onto

211 each focal genome using Exonerate (Slater and Birney 2005).

212 We identified NCU05712 homologs in Neurospora genomes using BlastP against

213 predicted proteins and keeping hits with an e-value < 1e-5 and score > 300. The resulting

214 set of protein sequences was filtered to remove proteins longer than 270 aa, aligned

215 using Clustalo (Sievers and Higgins 2018) and used to build an HMM profile using hmmbuild

216 (Finn et al. 2011). We then searched for homologs in non-Neurospora genomes using

217 hmmsearch (Finn et al. 2011), keeping only proteins in the inclusion threshold (Table S2).

218 We removed from the dataset homologs identified in the Neurospora tetrasperma reference

219 genome, because gene models predicted premature stop codons. N. tetrasperma rcd-1 ORF

220 sequences were aligned using BLAST Global align (at NCBI) and sequences with SNPs or

221 micro indels (up to 6 nucleotides) leading to predicted stop codons in the N. tetrasperma

222 rcd-1 ORFs were removed.
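
The hit-filtering criteria given above (e-value < 1e-5, score > 300, and removal of proteins longer than 270 aa) can be expressed as a short sketch. The `Hit` record and its field names are hypothetical; parsing of the actual BlastP output is not shown, and the length criterion reflects one reading of the filtering step.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    protein_id: str   # hypothetical identifier of the subject protein
    evalue: float     # BlastP e-value
    score: float      # BlastP score
    length: int       # protein length in amino acids

def keep_hit(hit, max_evalue=1e-5, min_score=300, max_length=270):
    """Apply the e-value, score and length criteria to one hit."""
    return (hit.evalue < max_evalue
            and hit.score > min_score
            and hit.length <= max_length)

toy_hits = [
    Hit("protA", 1e-80, 450.0, 257),   # passes all three criteria
    Hit("protB", 1e-3, 420.0, 244),    # fails the e-value cutoff
    Hit("protC", 1e-60, 310.0, 512),   # fails the length cutoff
    Hit("protD", 1e-20, 120.0, 250),   # fails the score cutoff
]
retained = [hit.protein_id for hit in toy_hits if keep_hit(hit)]
print(retained)
```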

223 For each taxon with resequencing data, genomic variation at rcd-1 was compared to a

224 set of orthologous genes. Orthologous relationships were analysed using Orthofinder

225 (Emms and Kelly 2015) using Diamond (Buchfink et al. 2015) as the sequence aligner. We

226 generated sets of single-copy orthologs present in 95% of isolates, and removed coding

227 sequences not starting with ‘ATG’ or having more than 10 Ns. We used TranslatorX (Abascal et al.

228 2010) to align coding sequences, while keeping the coding frame (protein sequence

229 aligner: Muscle; sequence cleaning: GBlocks with parameters -g "-b2 8 -b3 3 -b4 3 -b5 n").

230 The number of protein variants was estimated based on non-synonymous substitutions

231 using Biopython, Ksmax (the maximum number of synonymous substitutions between

232 sequences pairs) using yn00 in PAML (Yang 2007), and π (the average number of nucleotide

233 differences between sequence pairs), s (number of segregating sites standardized by

234 sequence length) and Tajima’s D (a measure of skewness of the allele frequency spectrum)

235 using Egglib v3 (De Mita and Siol 2012). Gene genealogy of rcd-1 was inferred using the

236 GTRGAMMA model in RAxML v8.2.10 (Stamatakis 2014). The genome genealogy within

237 used the same approach and program by concatenating coding sequences at

238 874 single copy orthologs present in at least 90% of isolates.
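
To make the summary statistics concrete, the sketch below computes the number of segregating sites, the average number of pairwise nucleotide differences, and Tajima's D using Tajima's standard formula. It is a plain-Python illustration only: the analyses in this study used Egglib, and the function names and toy alignment are hypothetical. The sketch assumes equal-length aligned sequences without gaps or ambiguous bases.

```python
from itertools import combinations
from math import sqrt

def segregating_sites(seqs):
    """Number of alignment columns with more than one nucleotide state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def mean_pairwise_differences(seqs):
    """Average number of differing sites over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)

def tajimas_d(seqs):
    """Tajima's D for equal-length, gap-free aligned sequences."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        # The variance term is zero for n < 4, so D is undefined there.
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    k = mean_pairwise_differences(seqs)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Toy alignment: two divergent haplogroups at equal frequency.
toy_alignment = ["AAAA", "AAAA", "TTTT", "TTTT"]
print(segregating_sites(toy_alignment),
      mean_pairwise_differences(toy_alignment),
      tajimas_d(toy_alignment))
```

Two divergent haplogroups at intermediate frequency, as in the toy alignment, inflate pairwise differences relative to the number of segregating sites and yield a positive D, which is the pattern expected under balancing selection.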

239 Long-term positive selection was assessed using a maximum likelihood framework

240 for parameter inference and codon-based substitution models that allow the dN/dS ratio ω

241 to vary among sites and among branches, as implemented in PAML’s codeml program. For

242 the site-specific analysis, we compared three codon substitution models: M7 which is a

243 model of purifying selection that allows individual sites to evolve under differing levels of

244 constraint and assumes a beta distribution of ω among negatively selected sites (0 < ω < 1), M8a which is a

245 model of purifying selection that adds neutral sites ω=1 to M7, and M8, which is a model of

246 positive selection that adds positively selected sites (ω > 1) to M7. For the branch-site

247 analysis, assuming variable selective pressures among branches and sites, we specified rcd-

248 1-2 as the foreground branch and rcd-1-1 as the background branch. We compared model

249 A1, which is a model of purifying selection (0 < ω ≤ 1) with model A, which adds positively

250 selected sites along the foreground branch (ω > 1) to model A1. Nested models were

251 compared using likelihood ratio tests.
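
A likelihood ratio test between nested models can be sketched as follows. The test statistic is twice the difference in log-likelihood, compared against a chi-square distribution. The degrees of freedom used here (2 for M7 vs. M8, 1 for M8a vs. M8 and for A1 vs. A) follow common practice and are an assumption, as are the log-likelihood values, which are invented for illustration.

```python
from math import erfc, exp, sqrt

def lrt_statistic(lnl_null, lnl_alt):
    """Twice the log-likelihood difference between nested models."""
    return 2.0 * (lnl_alt - lnl_null)

def chi2_sf(x, df):
    """Upper-tail chi-square probability, closed forms for df = 1 or 2."""
    if x <= 0:
        return 1.0
    if df == 1:
        return erfc(sqrt(x / 2.0))
    if df == 2:
        return exp(-x / 2.0)
    raise ValueError("only df = 1 or df = 2 are supported in this sketch")

# Hypothetical log-likelihoods for a null model and its alternative.
lnl_m7, lnl_m8 = -2510.4, -2502.1
statistic = lrt_statistic(lnl_m7, lnl_m8)
p_value = chi2_sf(statistic, df=2)
print(statistic, p_value)
```

A small p-value indicates that the model allowing positively selected sites fits significantly better than its nested null model.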

252 To gather rcd-1 homologs outside the Sordariales, we used the HMMER web server

253 (Finn et al. 2011). The initial query sequence was rcd-1-1 from N. crassa strain FGSC2489.

254 Three iterative searches were performed to retrieve as many rcd-1 homologs as possible

255 (Table S2). Significant hits were aligned with Clustal, sorted by size, and sequences < 230 aa

256 or > 350 aa were discarded. Sequences were assigned to taxonomic groups (Order) using

257 NCBI Taxonomy, followed by manual curation of “Unclassified” taxa using Index Fungorum.

258 Pseudogymnoascus destructans accounted for most of the sequences left “Unclassified” after

259 manual curation. Protein sequences were aligned using Clustalo and a protein genealogy

260 was inferred using model PROTGAMMAJTTF in RAxML v8.2.10.

261 Microscopy

262 Microscopy was performed on Zeiss Axioskop 2 equipped with a Q Imaging Retiga-

263 2000R camera (Surrey), using a 40×/1.30 Plan-Neofluar oil immersion objective and the

264 iVision Mac4.5 software. Agar squares (∼1 cm2) were excised from plates with growing

265 germlings and observed under the microscope after 4-5 hrs of growth (Heller et al. 2016).

266 Data availability

267 Mapped sequencing reads for BSA analysis of pooled segregant strains are available

268 at the Sequence Read Archive (PRJNA556444). Files used for evolutionary analysis of rcd-1

269 homologs are deposited at FileSender and are publicly accessible with the following link

270 https://filesender.renater.fr/?s=download&token=f9913ffb-5ea6-e7c2-27c4-cc7e751aecf6.

271 The deposited datasets are as follows – File 1: DNA sequences of rcd-1 homologs in

272 Sordariaceae used for phylogeny in Figure 5 and positive selection analyses. File 2: Protein

273 sequences of rcd-1 homologs in fungi and bacteria used for phylogeny presented in Figure

274 S11. File 3: DNA sequences of genes representing the genomic background in N. crassa used

275 for analyses of balancing selection at rcd-1 in N. crassa. File 4: DNA sequences of genes

276 representing the genomic background in N. discreta used for analyses of balancing selection

277 at rcd-1 in N. discreta. File 5: DNA sequences of genes representing the genomic background

278 in N. tetrasperma used for analyses of balancing selection at rcd-1 in N. tetrasperma. Raw

279 flow cytometry data files are available upon request. Supplementary Tables and Figures are

280 deposited on Figshare.

282 Results

283 BSA and population genomics identify NCU05712 as a highly polymorphic candidate

284 gene controlling GRD

285 To identify novel GRD-inducing genes, we used progeny from a cross between two

286 wild isolates, the lab reference strain FGSC2489 and JW258 (FGSC10679). Genomic analysis

287 of the progeny from this cross (FGSC2489 x JW258) was used previously to identify kind

288 recognition determinants that act during chemotropic interactions between germlings and

289 two linked GRD-inducing genes, plp-1 and sec-9 (Heller et al. 2016, 2018). The GRD

290 phenotype (Fig. 1A, B) segregated 3:1 in progeny of this cross suggesting that two

291 independent loci regulated the GRD phenotype (sec-9/plp-1 and a second unidentified

292 locus). To identify the second GRD locus, we performed a bulked segregant analysis (BSA)

293 with progeny from an F2 backcross between FGSC2489 and an F1 progeny (Segregant 121)

294 that showed GRD with FGSC2489 in spite of being isogenic to FGSC2489 at sec-9/plp-1 (GRD

295 due to second locus). More than a hundred F2 progeny strains were phenotyped for GRD in

296 co-cultures with germlings of FGSC2489. Approximately half of the progeny formed viable

297 heterokaryons with FGSC2489 germlings, while the other half showed rapid death of

298 germlings ~20 min post-fusion (Video 1). We extracted DNA from F2 progeny strains and

299 sequenced two GRD-specific DNA pools (compatible with FGSC2489 and incompatible with

300 FGSC2489) for BSA analysis. A random SNP distribution of ~50% was observed for the two

301 groups on all chromosomes with the exception of a large region (~900 kbp) on chromosome

302 III, where SNPs segregated at ~100% frequency between GRD groups (Fig. 1C). Within this

303 interval we searched for highly polymorphic genes, hypothesizing that strongly divergent

304 alleles might represent good candidates for being involved in allorecognition. While most

305 genes in this region showed high identity (>95%) between the two parental strains

306 (FGSC2489 and JW258), NCU05712 was highly polymorphic and parental alleles showed

307 only 55% identity in the ORF region (Fig. 1D). The high degree of sequence divergence

308 between the two allelic haplogroups of NCU05712 was not observed at any of the adjacent

309 genes, which showed identity percentage ranging from 95 to 100%.
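
The segregation ratios reported above follow from simple enumeration. The sketch assumes unlinked loci, each inherited from either parent with equal probability, and assumes that a difference at any one locus is enough to trigger GRD with a given parent. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def incompatible_fraction(n_loci):
    """Fraction of progeny differing from a given parent at one or more of
    `n_loci` unlinked loci, each inherited from either parent equally often."""
    genotypes = list(product(("parent", "other"), repeat=n_loci))
    incompatible = [g for g in genotypes if "other" in g]
    return Fraction(len(incompatible), len(genotypes))

print(incompatible_fraction(2))  # two segregating GRD loci
print(incompatible_fraction(1))  # one segregating GRD locus
```

With two segregating loci, three of four progeny genotypes are incompatible with a given parent (3:1). With a single segregating locus, as in the backcross to Segregant 121, half of the progeny are incompatible.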

310 We asked how the observed level of polymorphism in NCU05712 compared to a set

311 of reference genes representing the broader gene polymorphism found in the genomes of N.

312 crassa isolates from Louisiana (the population of origin of the lab reference strain

313 FGSC2489). We computed five summary statistics of genetic variation for the set of

314 reference genes and compared the scores obtained for each statistic, representing the

315 overall polymorphism level in the genome, to scores obtained with NCU05712 for the same

316 five statistics (Table 1). Our analysis ranked the GRD candidate gene NCU05712 in the top

317 1% of most polymorphic genes in the N. crassa genome for all five statistics (Table 1).

318 Alleles at NCU05712 were in the 99.9th percentile for three of the five computed statistics:

319 the number of amino acid sequences based on nonsynonymous substitutions (Pvar), the

320 average number of nucleotide differences between sequences (π) and Tajima’s neutrality

321 statistic D (Table 1). NCU05712 equally featured among the most polymorphic genes in

322 Neurospora discreta, but not in Neurospora tetrasperma, consistent with the fact that we

323 could retrieve only one of the NCU05712 allelic lines in the latter species (Table 1). These

324 results indicate that NCU05712 is one of the most polymorphic genes in the genomes of at

325 least two Neurospora species. Based on our analysis, we hypothesized that NCU05712 was

326 the likeliest candidate gene to control GRD in the segregating region.
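
Ranking a candidate gene against the genomic background amounts to an empirical percentile. The sketch below shows one way to compute it, assuming one value of a given statistic per reference gene. The function name and the toy background values are hypothetical and do not reproduce Table 1.

```python
def percentile_rank(value, reference):
    """Percentage of reference values that are strictly below `value`."""
    if not reference:
        raise ValueError("reference set is empty")
    below = sum(1 for r in reference if r < value)
    return 100.0 * below / len(reference)

# Toy background: 1000 reference genes with evenly spaced statistic values.
toy_background = [i / 1000.0 for i in range(1000)]
candidate_value = 0.9995
print(percentile_rank(candidate_value, toy_background))
```

A gene in the top 1% for a statistic has a percentile rank of at least 99 under this definition.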

327 NCU05712 is necessary and sufficient for GRD

328 To test the hypothesis that NCU05712 controls GRD, a NCU05712 deletion strain

329 from the Neurospora deletion collection (Colot et al. 2006; Dunlap et al. 2007) was

330 characterized. First, we showed that the ΔNCU05712 mutant was not affected in growth or

331 conidiation as compared to the parental strain (FGSC2489) (Fig. S1). We then compared the

332 frequency of GRD in pairings between FGSC2489 germlings with an incompatible segregant

333 strain (seg29) carrying the JW258 allele of NCU05712 or in pairs of ΔNCU05712 germlings

334 with seg29 using flow cytometry and the vital dye propidium iodide (PI) (Fig. 2A; Video 1).

335 When germlings of FGSC2489 were mixed in 1:1 ratio with germlings from seg29, we

336 observed ~30 % of pairs that exhibited PI uptake (a proxy for GRD). This frequency of GRD

337 was significantly higher than that observed for mixtures of ΔNCU05712 and seg29

338 germlings, where GRD was ~5 % (Fig. 2A); approximately 5% cell death was also observed

339 in germlings undergoing self-fusion in a single strain (Fig. S2). The strong reduction in GRD

340 in ΔNCU05712 + seg29 pairings segregated with the hygromycin resistance marker used to

341 construct the deletion strain (Fig. S3). Furthermore, the GRD phenotype was restored when

342 NCU05712 was reintroduced into the ΔNCU05712 strain (Fig. 2A). We concluded that

343 NCU05712 was necessary for the allorecognition reaction and named the locus regulator of

344 cell death or rcd-1. We termed the two allelic haplogroups as rcd-1-1 (as represented by

345 FGSC2489) and rcd-1-2 (as represented by JW258).

346 To test if polymorphism at the rcd-1 locus was sufficient to trigger cell death, we

347 targeted an rcd-1-1 allele or an rcd-1-2 allele to the his-3 locus of a Δrcd-1 strain. Paired

348 germlings of engineered rcd-1-1 and rcd-1-2 isogenic strains, expressing the antagonistic

349 rcd-1 alleles in otherwise isogenic backgrounds, produced a strong GRD reaction with a

350 ~45% cell death frequency (Fig. 2B). Strains expressing just rcd-1-1 or rcd-1-2 did not show

351 GRD above the background level when grown by themselves or in co-cultivations with a

352 Δrcd-1 strain transformed by an empty control vector (pMF274) (Fig. 2B,C; Fig. S3). In

353 addition, we observed an incompatible reaction in progeny bearing both rcd-1-1

354 and rcd-1-2 (Fig. S4), while heterokaryotic fusion cells bearing fluorescently-tagged RCD-1-1

355 and RCD-1-2 showed strong vacuolization and compartmentation of fused cells. These

356 results indicated that allelic differences at the rcd-1 locus were necessary and sufficient to

357 trigger cell death in fused germling pairs and also in hyphae in N. crassa and established rcd-

358 1 as a second allorecognition system controlling GRD in N. crassa.

359 rcd-1 does not act downstream of plp-1 in the induction of GRD

360 The similar GRD phenotype between the two loci that regulate death upon fusion of

361 germlings, sec-9/plp-1 and rcd-1, suggested that a functional relationship might exist in the

362 induction of GRD by allelic differences at these loci. Therefore, we assessed whether the

363 GRD phenotype induced in germlings with different sec-9/plp-1 allelic specificities was

364 dependent on rcd-1, by assessing sec-9/plp-1-induced GRD in a Δrcd-1 mutant. Genetic

365 differences at sec-9/plp-1 were sufficient to induce GRD in a Δrcd-1 mutant background with

366 a frequency of cell death and cellular phenotype that was indistinguishable from strains

367 inducing sec-9/plp-1-triggered GRD in a wild type genetic background (Fig. S5). Thus, RCD-1

368 does not function downstream of SEC-9/PLP-1 in the execution of cell death, suggesting that

369 these two GRD pathways act independently.

370 Genomic rearrangements at the rcd-1 locus in Sordariaceae

371 The rcd-1-1 and rcd-1-2 alleles encode proteins of unknown function of 257 and 244

372 amino acids, respectively. The two allelic variants showed 38% identity (54% similarity) on

373 the protein level (Fig. 3). The coding region of both rcd-1 alleles consisted of two exons,

374 which were confirmed by sequencing rcd-1-1 cDNA from FGSC2489. The analysis of rcd-1

375 gene structure showed a partial duplication of rcd-1-1 in the 3’ end of the rcd-1-2 allele in

376 the JW258 strain. The duplicated rcd-1-1 in the rcd-1-2 background lacked 212 bp from the

377 5’end of the ORF including the start codon of the gene and the promoter (Fig. 4). We

378 examined the rcd-1 locus of other rcd-1-2-encoding N. crassa strains and found that all

379 sequenced wild isolates of the rcd-1-2 genotype carried a partial duplication of the rcd-1-1

380 allele at the 3’ end of rcd-1-2. However, unlike JW258, the other N. crassa rcd-1-2 strains

381 have a shorter partial duplication of the rcd-1-1 allele (rcd-1-1Δ1-610) comprising mostly the

382 second exon (Fig. 4). These data suggest that an rcd-1-2 allele inserted into an rcd-1-1 allele,

383 thereby disrupting the gene structure of the latter.

384 Sequenced N. crassa strains carrying the rcd-1-2 allele had a partially duplicated rcd-

385 1-1Δ1-610 or rcd-1-1Δ1-212 copy, with the vast majority of strains carrying rcd-1-1Δ1-610 and only

386 JW258 presenting the longer duplication (Fig. S6). Yet, the insertion site in the intergenic

387 region upstream of the rcd-1-2 locus in sequenced rcd-1-2 N. crassa strains was the same,

388 strongly suggesting that one ancestral insertion event was the origin of the rcd-1-1 gene

389 disruption in N. crassa and that the partially duplicated rcd-1-1Δ1-610 and rcd-1-1Δ1-212 alleles

390 evolved during a subsequent independent event (Fig. S7). The molecular signature of the

391 rcd-1 gene disruption event was also found in the genomes of other related Neurospora

392 species. For example, in the genome of N. sublineolata (FGSC5508), downstream of the rcd-

393 1-2 allele was a partially duplicated rcd-1-1Δ1-459 sequence, while in N. hispaniola

394 (FGSC8817) the rcd-1-2 allele had the same insertion sites as in N. crassa JW258 strain and

395 thus carried a partially duplicated rcd-1-1Δ1-212 allele (Fig. 4; Fig. S6). This observation

396 suggested that the genomic rearrangement was a relatively ancient event and likely

397 predates speciation. Although we did not find a partially duplicated rcd-1-1 ORF in N.

398 discreta rcd-1-2 strains, we identified clear breakpoints in the sequence homology of the

399 rcd-1 locus between N. discreta strains carrying rcd-1-1 and rcd-1-2 alleles (Fig. S8). As the

400 sudden decrease in sequence similarity was in the upstream/downstream region of the rcd-

401 1-2 gene, as in the N. crassa strains, we concluded that in N. discreta the insertion of

402 rcd-1-2 led to the complete loss of the rcd-1-1 ORF. Given the observed

403 degeneration of the pseudogenic partial rcd-1-1 alleles in the rcd-1-2 strains of N. crassa and

404 N. hispaniola, the rcd-1-2 locus in N. discreta strains likely resulted from the same

405 ancestral event, in which an rcd-1-2 allele disrupted an rcd-1-1 locus. These data suggest that the

406 partial rcd-1-1 allele present in the rcd-1-2 strain was not sufficient to induce GRD.

407 Consistent with this hypothesis, we quantified the amount of cell death in mixtures of

408 germlings expressing the rcd-1-1 allele and germlings expressing the rcd-1-2 allele lacking

409 the partial duplication of rcd-1-1 (rcd-1-2-NPD) or rcd-1-2 in the presence of partially

410 duplicated rcd-1-1Δ1-610. A significant difference in the frequency of cell death between these

411 two germling mixtures was not observed (Fig. S9).

412 Evolutionary signatures of balancing and long-term positive selection

413 associated with rcd-1

414 As rcd-1 is a novel allorecognition determinant, we investigated the evolutionary

415 signatures associated with the gene. Natural selection has been shown to operate on

416 allorecognition loci via a mechanism of negative frequency-dependent selection that favors

417 rare alleles and results in the maintenance of multiple alleles at relatively high frequencies

418 (Wu et al. 1998; Hall et al. 2010; Zhao et al. 2015). The high positive Tajima’s D scores

419 calculated for rcd-1 in N. crassa (D = 3.491) and N. discreta (D = 2.637) indicated an excess of

420 intermediate frequency alleles at rcd-1 in the two Neurospora populations (Table 1). To

421 deepen our analysis of the allelic distribution of rcd-1, we investigated the phylogenetic

422 history of this locus in members of the Sordariaceae. We analyzed 194 fungal genomes from

423 26 species and identified 113 rcd-1 homologs in 104 of the sequenced strains (Table S1).

424 Two Chaetomiaceae species ( cochliodes and hyrcaniae) and one

425 Lasiosphaeriaceae II had two copies of rcd-1, and one Lasiosphaeriaceae IV (Cladorrhinum

426 bulbillosum) had seven copies. However, rcd-1 homologs were not identified in 90 genomes,

427 representing 12 species of Chaetomiaceae and Lasiosphaeriaceae I and II. We found that 81

428 of these genomes belonged to N. tetrasperma (65) and N. discreta (16) strains, where gene

429 models encoding longer versions (>500 aa) of rcd-1 were predicted in different lineages,

430 likely corresponding to annotation errors. When examined manually, rcd-1 alleles from N.

431 tetrasperma strains (FGSC7586, FGSC9035) carried premature stop codons, similar to an

432 rcd-1 allele from N. hispaniola (FGSC8817).
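
For readers less familiar with the statistic, the following minimal Python sketch shows how Tajima's D can be computed from a set of aligned sequences. It is an illustration only and is not the pipeline used in this study; the function name `tajimas_d` and the toy alignments are assumptions introduced here. D becomes positive when the mean pairwise diversity exceeds the value expected from the number of segregating sites, as happens when two divergent haplotypes are both common.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of equal-length aligned sequences (no gaps)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    # S: number of segregating (polymorphic) alignment columns.
    seg_sites = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if seg_sites == 0:
        return 0.0  # D is undefined without polymorphism; 0.0 is returned in this sketch

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Normalizing constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical toy alignments (not data from this study).
balanced = ["AAAAAA"] * 3 + ["TTTTTT"] * 3   # two haplotypes at equal frequency
singleton = ["AAAAAA"] * 5 + ["TTTTTT"]      # one rare haplotype

d_balanced = tajimas_d(balanced)
d_singleton = tajimas_d(singleton)
```

In this toy case the balanced alignment yields a positive D and the alignment with a single rare haplotype yields a negative D, which mirrors the interpretation of the positive scores reported for rcd-1.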

433 Phylogenetic analysis (maximum-likelihood) of rcd-1 in 27 wild isolates of N. crassa

434 from Louisiana revealed a nearly equal distribution of two main polyphyletic clades (14 rcd-

435 1-1-like and 13 rcd-1-2-like homologs, respectively) (Fig. 5A; Table S1). Furthermore, in N.

436 discreta, a relatively equal distribution (60-40%) between the two clades for the 35

437 analyzed rcd-1 homologs was also identified. The lack of monophyletic segregation between

438 the N. crassa and N. discreta alleles indicates that the allele-specific polymorphisms predate

439 the speciation event and represent an example of trans-species polymorphism (Fig. 5).

440 Other Neurospora species carrying rcd-1 alleles also split between the two clades with N.

441 pannonica and N. sitophila rcd-1 alleles being closer to rcd-1-1 alleles, while N. terricola, N.

442 africana and N. sublineolata carried sequences situated on the rcd-1-2 branch. The allele-

443 specific polymorphisms were distributed over the entire length of the allelic variants (Fig.

444 S10). We did not find marks of trans-species polymorphism on the adjacent genes

445 (NCU05713 and NCU05711) to rcd-1 (NCU05712), which are situated at ~1.3 kb and ~0.9

446 kb, respectively, from NCU05712 (Fig. S11). The equal distribution of allelic frequencies in

447 the wild populations of Neurospora and the observed trans-species polymorphism suggest

448 that the rcd-1 locus is under long-term balancing selection.
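
The link between negative frequency-dependent selection and roughly equal allele frequencies can be illustrated with a deliberately simple deterministic model. The sketch below is not derived from the data in this study: the linear fitness function, the selection strength `s` and the starting frequencies are all assumptions chosen for illustration. Each allele's fitness declines as it becomes more common, so a rare allele increases in frequency until both alleles are equally frequent.

```python
def next_frequency(p, s):
    """One generation of a two-allele model in which an allele's fitness
    declines linearly with its own frequency (strength s, 0 < s < 1)."""
    w1 = 1.0 - s * p          # fitness of allele 1, present at frequency p
    w2 = 1.0 - s * (1.0 - p)  # fitness of allele 2, present at frequency 1 - p
    mean_w = p * w1 + (1.0 - p) * w2
    return p * w1 / mean_w


def simulate(p0, s, generations):
    """Return the allele-1 frequency trajectory, starting from p0."""
    trajectory = [p0]
    for _ in range(generations):
        trajectory.append(next_frequency(trajectory[-1], s))
    return trajectory


# Hypothetical parameters chosen for illustration only.
rare_start = simulate(p0=0.05, s=0.5, generations=200)
common_start = simulate(p0=0.95, s=0.5, generations=200)
```

Under these assumptions both trajectories converge on a frequency of 0.5 regardless of the starting point, which is the qualitative pattern expected for a two-allele system maintained by balancing selection.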

449 Positive diversifying selection has been reported previously to operate on different

450 allorecognition genes in Neurospora and (Bastiaans et al. 2014; Zhao et al. 2015).

451 Sites under positive selection could be of importance for subsequent functional analyses of

452 the rcd-1 allorecognition process. We thus tested for long-term positive selection by

453 comparing, with likelihood-ratio tests, models assuming positive selection with null models

454 in PAML’s codeml program. For site-specific analyses, positive selection model M8 (which

455 assumes that one class of sites has ω > 1) provided a better fit to the data than purifying

456 selection models M7 and M8a (which assume 0 < ω < 1 and 0 < ω ≤ 1, respectively;

457 Likelihood ratio test: P=0.0397 for M7 vs M8, P=0.0374 for M8a vs M8) (Table 2). Under

458 M8a, a single amino acid had a >0.95 posterior Bayesian probability of being under positive

459 selection using the Bayes empirical Bayes method. For branch-site analyses, positive

460 selection model A (which assumes positive selection ω > 1 along foreground branch rcd-1-2)

461 provided a better fit to the data than purifying selection model A (which assumes 0 < ω ≤ 1

462 along all branches; Likelihood ratio test: P=0.0003). Under models M8 and A, respectively,

463 one and two amino acids (plus the stop codon) had a >0.95 posterior Bayesian probability of

464 being under positive selection using the Bayes empirical Bayes method. Both models identified

465 position 173 (V173 in RCD-1-2; boxed in red, Fig. 3) as being under positive selection, while

466 model A additionally identified H26 (RCD-1-2) (a histidine residue present in the same position in the RCD-

467 1-1 variant from FGSC2489 and the RCD-1-2 variant of JW258, but with a greater sequence

468 diversity outside of the two parental strains) (boxed in red, Fig. 3, Fig. S9).
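
The likelihood-ratio tests described above compare the maximized log-likelihoods of nested codon models. As a minimal sketch, the Python snippet below computes the test statistic 2ΔlnL and its P value from a chi-square distribution. The log-likelihood values are hypothetical, and the degrees of freedom (2 for M7 vs M8; 1 for M8a vs M8 and for the branch-site comparison) follow common codeml practice and are assumed here rather than taken from this study.

```python
from math import erfc, exp, sqrt


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution for df = 1 or 2."""
    if x <= 0:
        return 1.0
    if df == 1:
        return erfc(sqrt(x / 2.0))
    if df == 2:
        return exp(-x / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, P value) for a pair of nested models."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods (not values from Table 2).
lnl_m7 = -2350.0   # null model, no class of sites with omega > 1
lnl_m8 = -2346.5   # alternative model, allows a class of sites with omega > 1

statistic, p_value = likelihood_ratio_test(lnl_m7, lnl_m8, df=2)
```

A P value below the chosen threshold indicates that the model allowing ω > 1 fits significantly better than the null model; the Bayes empirical Bayes step that assigns posterior probabilities to individual sites is performed by codeml itself and is not reproduced here.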

469 Overall, the evolutionary analysis of rcd-1 uncovered molecular signatures typically

470 associated with allorecognition genes. These results suggest that the rcd-1 allorecognition

471 process operates in the wild.

472 Distribution of rcd-1 homologs in

473 Functional allorecognition genes are generally poorly conserved between different

474 fungal species (Fedorova et al. 2005; Paoletti 2016; Daskalov et al. 2017). However, genes

475 encoding proteins carrying predicted motifs associated with allorecognition genes fall in

476 several large gene families and are distributed throughout fungi (Van der Nest et al. 2014;

477 Zhao et al. 2015; Paoletti 2016). As rcd-1 is not related to any currently known fungal cell

478 death-inducing determinant, we decided to explore the broader distribution of rcd-1

479 homologs. BLAST searches with either of the allelic variants of RCD-1 retrieved ~260

480 significant hits (E value < 0.05) from the non-redundant proteins database (NCBI). All

481 identified sequences were encoded in the genomes of filamentous ascomycetes

482 (), predominantly in species in the . The number of rcd-1

483 homologs increased to ~940 sequences when we performed a profile HMM search using

484 HMMER with RCD-1-1 as the initial query sequence (Finn et al. 2011; Potter et al. 2018); the

485 HMM search was iterated until no new sequences were retrieved, which occurred after

486 the third iteration. Almost all significant hits (>99.9% of hits with E value < 0.005) were

487 distributed in 24 fungal orders (23 in Ascomycota and 1 in Basidiomycota) and three

488 bacterial orders containing only four sequences (Fig. 6; Table S2). The unique rcd-1 homolog

489 found in Basidiomycota was from Naematelia encephala (Tremellales), while the

490 prokaryotic sequences were distributed in three Gram-negative species (Flavobacteriales

491 and Xanthomonadales) and one rcd-1 homolog in the filamentous actinomycete

492 Actinoplanes missouriensis (Micromonosporales) (Table S2). The four bacterial sequences

493 grouped near the root of the tree in two different nodes with other fungal rcd-1 homologs

494 (Fig. 6). Two fungal orders – Hypocreales (324) and Eurotiales (112) – contained

495 approximately half of the analyzed rcd-1 homologs. Yet, rcd-1 genealogy did not follow

496 Ascomycota phylogeny, because a large number of nodes grouped rcd-1 homologs from

497 phylogenetically distant taxa (Fig. 6) (interactive tree at:

498 https://itol.embl.de/tree/1472106711587311563969558#). For example, rcd-1 homologs

499 from Hypocreales (black in Fig. 6) are interspersed among homologs from other orders of

500 Ascomycota and are found on most branches of the tree. A similar distribution is observed

501 for homologs belonging to Eurotiales (sky blue), Magnaporthales (purple) and in general

502 most orders with a high number of rcd-1 homologs (Fig. 6). Sequences from Sordariales

503 (pink) were scattered in distant nodes of the tree, with previously analyzed rcd-1 homologs

504 from the Sordariaceae (mainly Neurospora species) (Fig. 5A) – where the incompatible rcd-

505 1-1 and rcd-1-2 alleles could be found – grouped together in a separate branch of the tree.

506 The latter finding was not surprising as the sequences situated on the Sordariaceae branch

507 are alleles of rcd-1 from a single species (N. crassa, N. discreta or N. tetrasperma) and/or rcd-1

508 orthologs from closely related species.

509 At least one rcd-1 homolog was present in ~180 fungal species, with a mean of 5 and

510 a median of 3 rcd-1 homologs per strain (sequenced genome). The genomes of more than a

511 dozen ascomycete taxa contained more than 10 rcd-1 homologs. Amongst the species with

512 the highest numbers of rcd-1 homologs were several Trichoderma species, in which individual

513 strains carried up to 23 rcd-1 genes in their genome (T. atroviride ATCC 20476).

514 Remarkably, the genome of the closely related species T. reesei QM6A contained only two

515 rcd-1 genes. We concluded from these results that the allorecognition determinant rcd-1

516 identified in N. crassa is a member of a large gene family, widespread in the Ascomycota.

517 The phylogeny of rcd-1 in fungi suggests that the gene has experienced lineage-specific gene

518 expansions and was subject to horizontal transfers between species. Similar distribution

519 patterns have also been reported for other families of nonself recognition determinants in

520 fungi (Dyrka et al. 2014; Zhao et al. 2015).
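
A mean copy number that exceeds the median, as reported above, is the pattern expected when a few genomes carry lineage-specific expansions while most carry only a few copies. The short Python sketch below illustrates this with invented numbers; the list `copies_per_genome` is hypothetical and does not reproduce Table S2.

```python
from statistics import mean, median

# Hypothetical copy numbers for ten genomes (not values from Table S2).
# Most genomes carry few copies; one carries a large expansion.
copies_per_genome = [1, 1, 2, 2, 3, 3, 3, 4, 8, 23]

mean_copies = mean(copies_per_genome)      # pulled upward by the expanded genome
median_copies = median(copies_per_genome)  # robust to the expanded genome
```

In this toy distribution a single genome with 23 copies raises the mean to 5 while the median stays at 3.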

521

522 Discussion

523 Here we investigated the molecular basis of programmed cell-death occurring during

524 non-self recognition in germinating conidia of N. crassa and identified NCU05712 as a novel

525 allorecognition determinant, which we named regulator of cell death-1 or rcd-1. Two

526 incompatible alleles, rcd-1-1 and rcd-1-2, were identified at the rcd-1 locus and the co-

527 expression of these alleles in fused germlings and hyphae was necessary and sufficient to

528 trigger massive vacuolization and cell-death. The two alleles showed evidence of balancing-

529 selection and were associated with trans-species polymorphism, two molecular hallmarks

530 associated with genes involved in the control of non-self discrimination (Richman 2000;

531 Hall et al. 2010; Milgroom et al. 2018). Furthermore, close examination of the rcd-1 locus

532 revealed that an rcd-1-2 allele inserted in an rcd-1-1 allele, disrupting the gene structure of

533 the latter. We find this observation particularly intriguing as it underscores a trend of

534 genomic rearrangements frequently associated with non-self recognition loci. For example,

535 several distinct GRD-specific haplotypes are present at the recently identified plp-1 locus in

536 wild populations of Neurospora and Podospora (Heller et al. 2018). Multiple rearranged

537 haplotypes were also found at the doc-1 (determinant of communication-1) locus, which

538 defines non-self recognition at distance (pre-contact) in germlings (Heller et al. 2016), while

539 in Cryphonectria parasitica, the incompatibility locus vic4 contains two idiomorphic genes

540 (Choi et al. 2012). Population genomics data and evolutionary signatures of rcd-1 suggest

541 that the genomic rearrangements may have initiated events that led to the emergence of this

542 incompatibility system in at least some Neurospora species. This hypothesis is based on the

543 following observations: i) rcd-1 alleles appear to be under balancing selection, evenly

544 distributed in the populations of N. crassa and N. discreta, which suggests that natural

545 selection is acting on the rcd-1-1/rcd-1-2 incompatibility and ii) all currently sequenced rcd-

546 1-2 strains from N. crassa and N. discreta carry molecular evidence of a rearrangement in

547 the rcd-1 locus. If the rcd-1-1/rcd-1-2 incompatibility system predates the rearrangements

548 in the rcd-1 locus, and assuming that the adaptive value of a hypothetical wild type rcd-1-2

549 and the ‘inserted rcd-1-2’ is equivalent, one could ask why the locus carrying the insertion is

550 so strongly over-represented in N. crassa and N. discreta. A plausible explanation is that the

551 rcd-1-2 allele has been horizontally transferred or introgressed; this event could have been

552 possible only if it disrupted the rcd-1-1 allele (co-expression of both alleles is lethal).

553 Interspecies transfer of heterokaryon incompatibility genes and mating-related

554 genes has been previously documented in Ophiostoma novo-ulmi (Paoletti et al. 2006). In

555 Neurospora, introgression has been shown to play a role in shaping mating-type

556 chromosomes and possibly the evolution of the rsk (resistant to spore killer) locus (Sun et

557 al. 2012; Svedberg et al. 2018). Further phylogenetic analyses are needed to explore the

558 evolutionary origin of rcd-1 incompatibility, notably sequencing of additional Neurospora

559 strains and/or species closely related to Neurospora within the Sordariales. The

560 accumulation of such genomic data could identify more rcd-1 haplotypes from species with

561 population samples and offer a better view of the allelic diversity of rcd-1 in Sordariales,

562 which could allow a more precise hypothesis regarding the emergence of rcd-1-1/rcd-1-2

563 incompatibility. The discovery of the rcd-1 incompatibility system represents an exciting

564 opportunity to investigate the evolution of allorecognition genes and how particular

565 genomic rearrangements impact their distribution in fungi.

566 Several key contributions have been made recently towards a better understanding

567 of the molecular basis of allorecognition in N. crassa, revealing three key steps or checkpoints

568 that regulate early colony establishment (Dyer 2019). The first checkpoint regulates

569 chemotropic interactions between germlings; Neurospora populations show five

570 communication groups, which are determined by allelic specificity at the determinant of

571 communication or doc loci. Isolates from the same communication group (and with identical

572 doc allelic specificity) show robust chemotropic interactions, while isolates from different

573 communication groups (with different doc allelic specificity) show significantly reduced

574 communication and thus, cell fusion (Heller et al. 2016). The second checkpoint, controlled

575 by the cwr-1 and cwr-2 (cell wall remodeling) loci, regulates whether cell wall dissolution

576 and cytoplasmic mixing occur between adhered germlings (Gonçalves et al. 2019). Cells

577 with identical allelic specificity at cwr-1 and cwr-2 undergo cell fusion and cytoplasmic

578 mixing, while adhered germlings with alternate cwr-1 cwr-2 allelic specificity are blocked. If

579 cells have identity at the doc and cwr loci, somatic cell fusion proceeds, but the viability and

580 fitness of fused germlings are dependent upon the GRD loci – plp-1/sec-9 (Heller et al. 2018)

581 and rcd-1. These loci also control allorecognition in mature hyphae and thus could play a

582 critical role in fungal inter-individual relations beyond germling interactions and colony

583 establishment.

584 In future work, investigations into the molecular mechanisms of rcd-1-induced cell

585 death are of major interest as the gene is active in a developmental stage (asexual spores)

586 frequently associated with threats to human and plant health. Targeting of endogenous cell-

587 death pathways in fungi could represent a viable strategy in the fight against various fungal

588 pathogens; it was recently proposed that mice clear inhaled spores from their lungs by

589 targeting fungal PCD (Shlezinger et al. 2017; Kulkarni et al. 2019). The identification of rcd-1

590 diversifies the inventory of allorecognition genes identified in fungi. Considering the

591 widespread nature and abundance of rcd-1 homologs, especially in the Pezizomycotina, we

592 hypothesize that conspecific non-self discrimination is unlikely to be the sole role

593 performed by rcd-1 in all of these species. This hypothesis is especially intriguing in light of

594 recent reports positioning various allorecognition determinants inside broader gene

595 families that control innate immunity in animals and plants (Paoletti and Saupe 2009;

596 Dyrka et al. 2014; Gonçalves et al. 2017). Future work on the molecular mechanism of

597 recognition and induction of cell death mediated by RCD-1 will test its role in the paradigm

598 of allorecognition and potentially provide clues to alternative functions of this gene family

599 in the Pezizomycotina.

600

601 Acknowledgements

602 This work was supported by a Laboratory Directed Research and Development Program

603 of Lawrence Berkeley National Laboratory under U.S. Department of Energy Contract No. DE-

604 AC02-05CH11231 to NLG. We thank the Berkeley Flow Cytometry Facility and the CNR

605 Biological Imaging Facility for their technical support. This work used the Vincent J. Coates

606 Genomics Sequencing Laboratory (UC, Berkeley), supported by NIH S10 Instrumentation

607 Grants S10RR029668 and S10RR027303. The authors wish to thank Amy Powell and Don

608 Natvig for the kind permission to use Thielavia genome sequences for analysis of rcd-1

609 homology.

610

611 References

612 Abascal, F., R. Zardoya, and M. J. Telford, 2010 TranslatorX: multiple alignment of

613 nucleotide sequences guided by amino acid translations. Nucleic Acids Res. 38: W7-

614 13.

615 Bastiaans, E., A. J. M. Debets, and D. K. Aanen, 2016 Experimental evolution reveals that

616 high relatedness protects multicellular cooperation from cheaters. Nat. Commun. 7:

617 11435.

618 Bastiaans, E., A. J. M. Debets, D. K. Aanen, A. D. van Diepeningen, S. J. Saupe et al., 2014

619 Natural variation of heterokaryon incompatibility gene het-c in

620 reveals diversifying selection. Mol. Biol. Evol. 31: 962–974.

621 Brennwald, P., B. Kearns, K. Champion, S. Keränen, V. Bankaitis et al., 1994 Sec9 is a SNAP-

622 25-like component of a yeast SNARE complex that may be the effector of Sec4

623 function in exocytosis. Cell 79: 245–258.

624 Brown, G. D., D. W. Denning, N. A. R. Gow, S. M. Levitz, M. G. Netea et al., 2012 Hidden killers:

625 human fungal infections. Sci. Transl. Med. 4: 165rv13.

626 Buchfink, B., C. Xie, and D. H. Huson, 2015 Fast and sensitive protein alignment using

627 DIAMOND. Nat. Methods 12: 59–60.

628 Choi, G. H., A. L. Dawe, A. Churbanov, M. L. Smith, M. G. Milgroom et al., 2012 Molecular

629 characterization of vegetative incompatibility genes that restrict hypovirus

630 transmission in the chestnut blight Cryphonectria parasitica. Genetics 190:

631 113–127.

632 Colot, H. V., G. Park, G. E. Turner, C. Ringelberg, C. M. Crew et al., 2006 A high-throughput

633 gene knockout procedure for Neurospora reveals functions for multiple

634 transcription factors. Proc. Natl. Acad. Sci. USA 103: 10352–10357.

635 Corcoran, P., J. L. Anderson, D. J. Jacobson, Y. Sun, P. Ni et al., 2016 Introgression maintains

636 the genetic integrity of the mating-type determining chromosome of the fungus

637 Neurospora tetrasperma. Genome Res. 26: 486–498.

638 Daskalov, A., J. Heller, S. Herzog, A. Fleißner, and N. L. Glass, 2017 Molecular mechanisms

639 regulating cell fusion and heterokaryon formation in filamentous fungi. Microbiol.

640 Spectr. 5:.

641 De Mita, S., and M. Siol, 2012 EggLib: processing, analysis and simulation tools for

642 population genetics and genomics. BMC Genet. 13: 27.

643 Debets, A. J. M., and A. J. F. Griffiths, 1998 Polymorphism of het-genes prevents resource

644 plundering in Neurospora crassa. Mycol. Res. 102: 1343–1349.

645 Debets, F., X. Yang, and A. J. F. Griffiths, 1994 Vegetative incompatibility in Neurospora: its

646 effect on horizontal transfer of mitochondrial plasmids and senescence in natural

647 populations. Curr. Genet. 26: 113–119.

648 Dettman, J. R., D. J. Jacobson, and J. W. Taylor, 2003 A multilocus genealogical approach to

649 phylogenetic species recognition in the model eukaryote Neurospora. Evolution 57:

650 2703–2720.

651 Dunlap, J. C., K. A. Borkovich, M. R. Henn, G. E. Turner, M. S. Sachs et al., 2007 Enabling a

652 community to dissect an organism: Overview of the Neurospora functional genomics

653 project. Adv. Genet. 57: 49–96.

654 Duxbury, Z., Y. Ma, O. J. Furzer, S. U. Huh, V. Cevik et al., 2016 Pathogen perception by NLRs

655 in plants and animals: Parallel worlds. Bioessays 38: 769–781.

656 Dyer, P. S., 2019 Self/Non-self recognition: Microbes playing hard to get. Curr. Biol. 29:

657 R866–R868.

658 Dyrka, W., M. Lamacchia, P. Durrens, B. Kobe, A. Daskalov et al., 2014 Diversity and

659 variability of NOD-like receptors in fungi. Genome Biol. Evol. 6: 3137–3158.

660 Ellison, C. E., J. E. Stajich, D. J. Jacobson, D. O. Natvig, A. Lapidus et al., 2011 Massive changes

661 in genome architecture accompany the transition to self-fertility in the filamentous

662 fungus Neurospora tetrasperma. Genetics 189: 55–69.

663 Emms, D. M., and S. Kelly, 2015 OrthoFinder: solving fundamental biases in whole genome

664 comparisons dramatically improves orthogroup inference accuracy. Genome Biol.

665 16: 157.

666 Fan, R., H. M. Cockerton, A. D. Armitage, H. Bates, E. Cascant-Lopez et al., 2018 Vegetative

667 compatibility groups partition variation in the virulence of Verticillium dahliae on

668 strawberry. PLoS One 13: e0191824.

669 Fedorova, N. D., J. H. Badger, G. D. Robson, J. R. Wortman, and W. C. Nierman, 2005

670 Comparative analysis of programmed cell death pathways in filamentous fungi. BMC

671 Genomics 6: 177.

672 Finn, R. D., J. Clements, and S. R. Eddy, 2011 HMMER web server: interactive sequence

673 similarity searching. Nucleic Acids Res. 39: W29-37.

674 Freitag, M., P. C. Hickey, N. B. Raju, E. U. Selker, and N. D. Read, 2004 GFP as a tool to analyze

675 the organization, dynamics and function of nuclei and microtubules in Neurospora

676 crassa. Fungal Genet. Biol. 41: 897–910.

677 Galagan, J. E., S. E. Calvo, K. A. Borkovich, E. U. Selker, N. D. Read et al., 2003 The genome

678 sequence of the filamentous fungus Neurospora crassa. Nature 422: 859–868.

679 Gladieux, P., B. A. Wilson, F. Perraudeau, L. A. Montoya, D. Kowbel et al., 2015 Genomic

680 sequencing reveals historical, demographic and selective factors associated with the

681 diversification of the fire-associated fungus Neurospora discreta. Mol. Ecol. 24:

682 5657–5675.

683 Glass, N. L., and K. Dementhon, 2006 Non-self recognition and programmed cell death in

684 filamentous fungi. Curr. Opin. Microbiol. 9: 553–558.

685 Glass, N. L., and I. Kaneko, 2003 Fatal attraction: nonself recognition and heterokaryon

686 incompatibility in filamentous fungi. Eukaryotic Cell 2: 1–8.

687 Gonçalves, A. P., J. Heller, A. Daskalov, A. Videira, and N. L. Glass, 2017 Regulated forms of

688 cell death in fungi. Front. Microbiol. 8: 1837.

689 Gonçalves, A. P., J. Heller, E. A. Span, G. Rosenfield, H. P. Do et al., 2019 Allorecognition upon

690 fungal cell-cell contact determines social cooperation and impacts the acquisition of

691 multicellularity. Curr. Biol. 29: 3006–3017.

692 Hall, C., J. Welch, D. J. Kowbel, and N. L. Glass, 2010 Evolution and diversity of a fungal

693 self/nonself recognition locus. PLoS One 5: e14055.

694 Heller, J., C. Clavé, P. Gladieux, S. J. Saupe, and N. L. Glass, 2018 NLR surveillance of essential

695 SEC-9 SNARE proteins induces programmed cell death upon allorecognition in

696 filamentous fungi. Proc. Natl. Acad. Sci. USA 115: E2292–E2301.

697 Heller, J., J. Zhao, G. Rosenfield, D. J. Kowbel, P. Gladieux et al., 2016 Characterization of

698 greenbeard genes involved in long-distance kind discrimination in a microbial

699 eukaryote. PLoS Biol. 14: e1002431.

700 Herzog, S., M. R. Schumann, and A. Fleißner, 2015 Cell fusion in Neurospora crassa. Curr.

701 Opin. Microbiol. 28: 53–59.

702 Ho, H.-I., S. Hirose, A. Kuspa, and G. Shaulsky, 2013 Kin recognition protects cooperators

703 against cheaters. Curr. Biol. 23: 1590–1595.

704 Ishikawa, F. H., E. A. Souza, J.-Y. Shoji, L. Connolly, M. Freitag et al., 2012 Heterokaryon

705 incompatibility is suppressed following conidial anastomosis tube fusion in a fungal

706 plant pathogen. PLoS One 7: e31175.

707 Jones, J. D. G., R. E. Vance, and J. L. Dangl, 2016 Intracellular innate immune surveillance

708 devices in plants and animals. Science 354:6316.

709 Kaneko, I., K. Dementhon, Q. Xiang, and N. L. Glass, 2006 Nonallelic interactions between

710 het-c and a polymorphic locus, pin-c, are essential for nonself recognition and

711 programmed cell death in Neurospora crassa. Genetics 172: 1545–1555.

712 Klein, J., A. Sato, S. Nagl, and C. O’hUigín, 1998 Molecular trans-species polymorphism.

713 Annu. Rev. Ecol. Syst. 29: 1–21.

714 Kruys, Å., S. M. Huhndorf, and A. N. Miller, 2015 Coprophilous contributions to the

715 phylogeny of Lasiosphaeriaceae and allied taxa within Sordariales (Ascomycota,

716 Fungi). Fungal Divers. 70: 101–113.

717 Kulkarni, M., Z. D. Stolp, and J. M. Hardwick, 2019 Targeting intrinsic cell death pathways to

718 control fungal pathogens. Biochem. Pharmacol. 162: 71–78.

719 Laird, D. J., A. W. De Tomaso, and I. L. Weissman, 2005 Stem cells are units of natural

720 selection in a colonial ascidian. Cell 123: 1351–1360.

721 Lakkis, F. G., S. L. Dellaporta, and L. W. Buss, 2008 Allorecognition and chimerism in an

722 invertebrate model organism. Organogenesis 4: 236–240.

723 Milgroom, M. G., M. L. Smith, M. T. Drott, and D. L. Nuss, 2018 Balancing selection at nonself

724 recognition loci in the chestnut blight fungus, Cryphonectria parasitica,

725 demonstrated by trans-species polymorphisms, positive selection, and even allele

726 frequencies. Heredity 121: 511–523.

727 Mylyk, O. M., 1975 Heterokaryon incompatibility genes in Neurospora crassa detected using

728 duplication-producing chromosome rearrangements. Genetics 80: 107–124.

729 Nydam, M. L., E. E. Stephenson, C. E. Waldman, and A. W. De Tomaso, 2017 Balancing

730 selection on allorecognition genes in the colonial ascidian Botryllus schlosseri. Dev.

731 Comp. Immunol. 69: 60–74.

732 Palma-Guerrero, J., C. R. Hall, D. Kowbel, J. Welch, J. W. Taylor et al., 2013 Genome wide

733 association identifies novel loci involved in fungal communication. PLoS Genet. 9:

734 e1003669.

735 Paoletti, M., 2016 Vegetative incompatibility in fungi: From recognition to cell death,

736 whatever does the trick. Fungal Biol. Rev. 30: 152–162.

737 Paoletti, M., K. W. Buck, and C. M. Brasier, 2006 Selective acquisition of novel

738 and vegetative incompatibility genes via interspecies gene transfer in the globally

739 invading eukaryote Ophiostoma novo-ulmi. Mol. Ecol. 15: 249–262.

740 Paoletti, M., and S. J. Saupe, 2009 Fungal incompatibility: evolutionary origin in pathogen

741 defense? Bioessays 31: 1201–1210.

742 Perkins, D. D., A. Radford, and M. S. Sachs, 2001 The Neurospora Compendium:

743 chromosomal loci. Academic Press

744 Potter, S. C., A. Luciani, S. R. Eddy, Y. Park, R. Lopez et al., 2018 HMMER web server: 2018

745 update. Nucleic Acids Res. 46: W200–W204.

746 Richman, A., 2000 Evolution of balanced genetic polymorphism. Mol. Ecol. 9: 1953–1963.

747 Rosengarten, R. D., and M. L. Nicotra, 2011 Model systems of invertebrate allorecognition.

748 Curr. Biol. 21: R82-92.

749 Saupe, S. J., C. Clavé, and J. Bégueret, 2000 Vegetative incompatibility in filamentous fungi:

750 Podospora and Neurospora provide some clues. Curr. Opin. Microbiol. 3: 608–612.

751 Shiu, P. K., and N. L. Glass, 1999 Molecular characterization of tol, a mediator of mating-

752 type-associated vegetative incompatibility in Neurospora crassa. Genetics 151: 545–

753 555.

754 Shlezinger, N., H. Irmer, S. Dhingra, S. R. Beattie, R. A. Cramer et al., 2017 Sterilizing

755 immunity in the lung relies on targeting fungal apoptosis-like programmed cell

756 death. Science 357: 1037–1041.

757 Sievers, F., and D. G. Higgins, 2018 Clustal Omega for making accurate alignments of many

758 protein sequences. Protein Sci. 27: 135–145.

759 Simpson, J. T., K. Wong, S. D. Jackman, J. E. Schein, S. J. M. Jones et al., 2009 ABySS: a parallel

760 assembler for short read sequence data. Genome Res. 19: 1117–1123.

761 Slater, G. S. C., and E. Birney, 2005 Automated generation of heuristics for biological

762 sequence comparison. BMC Bioinformatics 6: 31.

763 Stamatakis, A., 2014 RAxML version 8: a tool for phylogenetic analysis and post-analysis of

764 large phylogenies. Bioinformatics 30: 1312–1313.

765 Sun, Y., P. Corcoran, A. Menkis, C. A. Whittle, S. G. E. Andersson et al., 2012 Large-scale

766 introgression shapes the evolution of the mating-type chromosomes of the

767 filamentous ascomycete Neurospora tetrasperma. PLoS Genet. 8: e1002820.

768 Svedberg, J., S. Hosseini, J. Chen, A. A. Vogan, I. Mozgova et al., 2018 Convergent evolution of

769 complex genomic rearrangements in two fungal meiotic drive elements. Nat.

770 Commun. 9: 4242.

771 Uehling, J., A. Deveau, and M. Paoletti, 2017 Do fungi have an innate immune response? An

772 NLR-based comparison to plant and animal immune systems. PLoS Pathog. 13:

773 e1006578.

774 Van der Nest, M. A., A. Olson, M. Lind, H. Vélëz, K. Dalman et al., 2014 Distribution and

775 evolution of het gene homologs in the basidiomycota. Fungal Genet. Biol. 64: 45–57.

776 Vogel, H. J., 1956 A convenient for Neurospora crassa. Microbial Genet. Bull.

777 13: 42–46.

778 Westergaard, M., and H. K. Mitchell, 1947 NEUROSPORA V. A synthetic medium favoring

779 sexual reproduction. Am. J. Bot. 34: 573–577.

780 Wu, J., S. J. Saupe, and N. L. Glass, 1998 Evidence for balancing selection operating at the

781 het-c heterokaryon incompatibility locus in a group of filamentous fungi. Proc. Natl.

782 Acad. Sci. USA 95: 12398–12403.

783 Yang, Z., 2007 PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24:

784 1586–1591.

785 Zeise, K., and A. Von Tiedemann, 2002 Host specialization among vegetative compatibility

786 groups of Verticillium dahliae in relation to Verticillium longisporum. J. Phytopathol.

787 150: 112–119.

788 Zhang, D.-X., and D. L. Nuss, 2016 Engineering super mycovirus donor strains of chestnut

789 blight fungus by systematic disruption of multilocus vic genes. Proc. Natl. Acad. Sci.

790 USA 113: 2062–2067.

791 Zhang, D.-X., M. J. Spiering, A. L. Dawe, and D. L. Nuss, 2014 Vegetative incompatibility loci

792 with dedicated roles in allorecognition restrict mycovirus transmission in chestnut

793 blight fungus. Genetics 197: 701–714.

794 Zhao, J., P. Gladieux, E. Hutchison, J. Bueche, C. Hall et al., 2015 Identification of

795 allorecognition loci in Neurospora crassa by genomics and evolutionary approaches.

796 Mol. Biol. Evol. 32: 2417–2432.

797

798 Figure Legends

799 Figure 1. Bulked segregant analysis (BSA) and comparative genomics identify a

800 GRD-controlling candidate gene. A. Cartoon of germling-regulated death (GRD). Shown are

801 two genetically distant germlings, represented by nuclei (circles) in pink and green,

802 undergoing cellular fusion (left). The GRD reaction (right) manifests with extensive

803 vacuolization of the germlings (white shapes) and cell lysis (shown with dashed line). B.

804 GRD reaction occurring between genetically incompatible asexual spores of two different

805 strains as revealed by the uptake of the vital dye propidium iodide (black arrows) at fusion

806 points between incompatible germlings. One strain is marked with calcofluor-white (blue)

807 and the other strain expresses cytoplasmic GFP (green). Scale bar is 5 µm. C. SNP

808 segregation on linkage group III after sequencing of two genomic DNA pools of 50

809 segregant strains each from the FGSC2489 X JW258 cross. Purple line: SNP frequencies of

810 pooled DNA of segregant strains showing GRD with FGSC2489. Green line: SNP frequencies

811 of pooled DNA of segregant strains compatible with FGSC2489. D. Genomic organization of

812 the locus containing the GRD candidate gene NCU05712 or regulator of cell death-1.

813 Percentage of identity between the parental strains (FGSC2489 and JW258) is shown for

814 each ORF in the region.
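
A minimal sketch of how per-pool SNP frequencies such as those plotted in panel C could be derived from read counts. The input tuples (position, alternate-allele reads, total reads), the function names and the use of the largest frequency difference as a summary are all illustrative assumptions, not the pipeline used in this study.

```python
# Hypothetical per-SNP read counts for each pool:
# (position on the linkage group, reads with the non-reference allele, total reads).
grd_pool = [(1000, 48, 100), (2000, 95, 100), (3000, 99, 100), (4000, 60, 100)]
compatible_pool = [(1000, 52, 100), (2000, 6, 100), (3000, 1, 100), (4000, 45, 100)]


def snp_frequencies(counts):
    """Return (position, percent of reads carrying the non-reference allele)."""
    return [(pos, 100.0 * alt / total) for pos, alt, total in counts if total > 0]


def most_divergent_snp(pool_a, pool_b):
    """Return the SNP position where the two pools differ most in allele frequency."""
    freq_a = dict(snp_frequencies(pool_a))
    freq_b = dict(snp_frequencies(pool_b))
    shared = sorted(set(freq_a) & set(freq_b))
    return max(shared, key=lambda pos: abs(freq_a[pos] - freq_b[pos]))


peak = most_divergent_snp(grd_pool, compatible_pool)
print("SNP with the largest frequency difference between pools:", peak)
```

In a bulked segregant design, unlinked SNPs are expected near 50% in both pools, while SNPs linked to the causal locus approach 100% in one pool and 0% in the other.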

815

816 Figure 2. NCU05712 (rcd-1) is necessary and sufficient for GRD induction in N. crassa.

817 A. A flow cytometry-based quantification of GRD using the vital dye propidium iodide (PI)

818 (Heller et al. 2018) shows that rcd-1 is necessary for the GRD phenotype. B. GRD

819 quantification with isogenic strains of N. crassa carrying different rcd-1 alleles. Strains

820 carrying an ectopic rcd-1 allele at the his-3 locus are indicated with (h::) preceding the allele.

821 Experiments were performed in triplicate with 20,000 events counted per experiment. *P <

822 0.0006, one-way ANOVA with Tukey’s multiple comparisons test. C. Reconstitution of rcd-

823 1-1/rcd-1-2 GRD phenotype in isogenic N. crassa strains. Cells undergoing GRD (white

824 arrows) show strong vacuolization and uptake of propidium iodide (PI). Deletion of rcd-1

825 abolishes GRD and genetically distinct strains can undergo cell fusion to form viable cell

826 networks. Fusion points between genetically distinct germlings are shown with black

827 asterisks. Scale bar is 5 µm.

828

829 Figure 3. Alignment of GRD-inducing allelic variants RCD-1-1 and RCD-1-2 from the

830 wild type reference strain FGSC2489 and the wild isolate JW258. Identity between the

831 two proteins is shown in dark blue, similarity in light blue. Amino acid residues found to

832 be under positive selection are boxed in red.

833

834 Figure 4. Genetic rearrangements in the rcd-1 locus in Neurospora. rcd-1-1 (green) and

835 rcd-1-2 (pink) alleles from different species are shown as cartoons in the rcd-1 locus. The

836 extended region in pink in rcd-1-2 represents the sequence with a sharp decrease in

837 homology (or regions with lack of homology) at nucleotide level between strains of the rcd-

838 1-1 and rcd-1-2 genotype. Percentage identities between two different genomic regions are

839 shown between black lines delimiting the comparisons. Numbers over the rcd-1-1 allele

840 indicate the limits of the truncated ORF in the genome of different Neurospora species and

841 strains. Red line in the rcd-1-2 allele of Neurospora hispaniola (Nhis-8817) represents a

842 premature stop codon. Species names are abbreviated as follows: N. africana (Nafr), N.

843 hispaniola (Nhis), N. crassa (Ncra), N. sublineolata (Nsub), N. discreta (Ndis).

844

845 Figure 5. Phylogenetic distribution of rcd-1 in Sordariaceae (A) as compared with

846 species phylogeny (B). Species are color-coded and nodes with bootstrap values above

847 0.7 are indicated with black dots. Detailed species and strain information is available in

848 Table S1.

849

850 Figure 6. Phylogenetic distribution of rcd-1 in fungi and bacteria. Maximum likelihood

851 tree of ~940 rcd-1 homologs. Branches of the phylogenetic tree are color-coded in

852 accordance with the taxonomic rank at order level for the species from which the various

853 rcd-1 sequences have been retrieved. Order names are indicated in the panel next to the

854 phylogenetic tree. The gene family expansion in the Sordariaceae family (order Sordariales),

855 which displays the various rcd-1-1 and rcd-1-2 Neurospora alleles from this study, is

856 highlighted. Branches corresponding to the four bacterial sequences are indicated with red

857 stars. Bacterial phylogenetic orders are indicated with the letter (b) in the legend of the

858 figure. Sequences from species with no clearly established phylogeny at order level are

859 grouped under ‘unclassified’.

860

861 Figure S1. Morphological characterization of rcd-1 engineered strains in this study.

862 A. Deletion of NCU05712 does not lead to growth, conidiation (left panels, bottom right

863 panels) or germination defects (top right panels). B. Introduction of an rcd-1-2 allele

864 targeted to the his-3 locus in an rcd-1 deletion strain in the otherwise FGSC2489 genetic

865 background does not produce any visible phenotype.

866

867 Figure S2. Germling death frequencies of individual N. crassa strains used in this

868 study. Flow cytometry experiments were performed at least in triplicate with 20,000 events

869 counted per experiment (see Materials and Methods). *P < 0.0001, one-way ANOVA with

870 Tukey’s multiple comparisons test, ns – non-significant. Positive control for the GRD

871 reaction is shown as the percentage of propidium iodide (PI) positive cells in a mixture of

872 germlings expressing the incompatible rcd-1-1 and rcd-1-2 alleles. Strains carrying an rcd-1

873 allele in the his-3 locus are indicated with (his3::) preceding the allele name.

874

875 Figure S3. Suppression of the GRD phenotype segregates with the hygromycin

876 resistance cassette in the NCU05712 locus. A. Segregant strains (from cross ΔNCU05712 x

877 FGSC2489) grown on minimal medium (MM) with the antibiotic hygromycin. Half of the

878 strains were found to be resistant to hygromycin (left panel) and half to be sensitive to

879 hygromycin (right panel). B. Germling mixtures of hygromycin-resistant segregant strains

880 (and thus ΔNCU05712) and an rcd-1-2-expressing strain (seg29) (four left panels) do not

881 show GRD, while hygromycin-sensitive segregant strains (four right panels) produce GRD

882 in the presence of seg29. Black arrowheads show vacuolated pairs of germlings undergoing

883 GRD. Scale bar is 3 µm.

884

885 Figure S4. Heterokaryon incompatibility induced by RCD-1 allelic variants in hyphae

886 of N. crassa. A. Germinating co-expressing rcd-1-1 and rcd-1-2 alleles show

887 growth arrest approximately 24 hrs after germination and progressive melanization (over

888 a period of 10 days). B. Hyphal fusion between strains expressing GFP-RCD-1-1 and

889 mCherry-RCD-1-2 resulted in vacuolization and cellular death (red arrowheads) and

890 cytoplasmic mixing restriction by compartmentation (white arrowheads). Black boxes

891 indicate image crop position. Scale bars are 10 µm.

892

893 Figure S5. PLP-1/SEC-9 GRD incompatibility is not RCD-1-dependent. GRD reaction

894 triggered by antagonistic PLP-1/SEC-9 alleles in the presence (A) or absence (B) of RCD-1,

895 revealed by the uptake of propidium iodide (red). Strains are stained with calcofluor-white

896 (blue) or marked with Concanavalin A-coupled fluorochrome (green). Arrowheads show cells

897 undergoing GRD, some of which are already stained with the vital dye (PI). Scale bars are 5

898 µm. C. Flow cytometry quantification of the GRD reaction. ns – non-significant, unpaired

899 t-test. Strains are indicated by their GRD-inducing genotype and the rcd-1 genetic

900 background is shown for all strains. The sec-9 allele is shown in superscript (plp-1FGSC2489 +

901 sec-9JW199) for germlings that undergo GRD following cell fusion (Heller et al. 2018).

902

903 Figure S6. Nucleotide alignment of rcd-1 locus from different Neurospora species.

904 Colors show conservation with darkest purple corresponding to strong (above threshold of

905 90% of the sequences) and light purple to weak conservation. Exons encoding rcd-1-1 and

906 rcd-1-2 alleles are boxed in green and red, respectively. Sequences encoding partially

907 duplicated rcd-1-1 exons are boxed with a dashed green line.

908

909 Figure S7. Nucleotide alignment of the NCU05713 – NCU05712 (rcd-1) intergenic

910 regions from N. crassa strains. Color shows conservation with threshold of 90%. A

911 reference rcd-1-1 sequence from FGSC2489 strain is boxed in green. Red arrow indicates

912 conservation breakpoint corresponding to the insertion point of rcd-1-2 alleles.

913

914 Figure S8. Nucleotide alignment of rcd-1 loci in different N. discreta strains. Color

915 shows conservation with threshold of 90%. Strains carrying an rcd-1-2 allele are boxed in

916 purple. Unboxed strains carry an rcd-1-1 allele. Red arrows delimit the upstream and

917 downstream breakpoints of homology surrounding the rcd-1 ORF.

918

919 Figure S9. Partial duplication of rcd-1-1 found in strains from the rcd-1-2 genetic

920 background does not impact the GRD phenotype. All rcd-1 alleles are expressed

921 ectopically from the his-3 locus. Germling death frequencies are measured by the uptake of

922 propidium iodide (PI) using flow cytometry. Experiments were performed at least in

923 triplicate with 20,000 events counted per experiment. *P < 0.004, **P < 0.0002, one-way

924 ANOVA with Tukey’s multiple comparisons test. NPD – no partial duplication.

925

926 Figure S10. Alignment of RCD-1 allelic variants from Neurospora/Sordaria

927 phylogenetic branch. Proteins are shown in ClustalX colors and RCD-1-1 (FGSC2489) and

928 RCD-1-2 (JW258) are highlighted in green and magenta boxes, respectively. Color scheme

929 is based on the nature of the amino acid residues: hydrophobic – blue, polar – green,

930 negative charge – magenta, positive charge – red, aromatic – cyan, glycine – orange, proline

931 – yellow, cysteine – pink, any/gap – white.

932

933 Figure S11. Signatures of balancing-selection and trans-species polymorphism

934 associated with NCU05712 but absent on the adjacent NCU05711 and NCU05713

935 genes. Maximum likelihood analyses performed with MEGA7. N. discreta strains are shown

936 in blue, the reference lab strain FGSC2489 (rcd-1-1 allele) in green and JW258 (rcd-1-2

937 allele) in pink. Bootstrap values of 1 are shown next to each node.

938

[Figure 1. Panels A to D. Panel C: y-axis, SNP frequencies [%]; x-axis, SNP position. Panel D: ORFs NCU05705 to NCU05718 with percentage identity between the parental strains; scale bar 5 kb.]

[Figure 2. Panels A to C. Panels A and B: y-axis, % PI positive cells. Panel C: image channels DIC, GFP, CFW and merge.]

[Figure 3. Protein alignment of rcd1-1 (Ncra-OR74A) and rcd1-2 (Ncra-JW258).]

[Figure 4. Cartoons of the rcd-1 locus in Nafr-1740, Nhis-8817, Ncra-OR74A, Ncra-JW258, Ncra-D30, Nsub-5508, Ndis-MM2 and Ndis-MM20, with percentage identities between regions; scale bar 1 kb.]

[Figure 5. Panels A and B. Phylogenetic trees with strain-level tip labels. Taxa legend: Neurospora tetrasperma, Neurospora sitophila, Neurospora crassa, Neurospora discreta, Neurospora africana, Neurospora terricola, Neurospora sublineolata, Neurospora pannonica, mycetomatis. Scale bars 0.1.]

[Figure 6. Maximum likelihood tree with the Sordariaceae expansion labeled. Order legend: Botryosphaeriales, Capnodiales, Chaetothyriales, Diaporthales, Dothideales, Eurotiales, Flavobacteriales (b), Glomerellales, Helotiales, Hypocreales, Magnaporthales, Micromonosporales (b), Onygenales, Ophiostomatales, Orbiliales, Pezizales, Phaeomoniellales, Pleosporales, Sordariales, Taphrinales, Togniniales, Tremellales, Unclassified, Venturiales, Verrucariales, Xanthomonadales (b), Xylariales. Scale bar 0.8.]
Table 1. Average summary statistics (and 95% percentile) for rcd-1 and reference genes representing the genomic background in three Neurospora species.

Species/Gene set            S                π                D                KSmax            Pvar

Neurospora crassa
  rcd-1                     0.451**          0.220***         3.491***         1.762**          22.0***
  reference genes (n=5632)  0.043 (0.136)    0.011 (0.031)    -0.490 (1.072)   0.261 (0.248)    8.0 (18.0)

Neurospora discreta
  rcd-1                     0.491**          0.202***         2.637**          2.318**          16.0**
  reference genes (n=5265)  0.032 (0.085)    0.008 (0.019)    0.244 (1.580)    0.324 (0.149)    5.7 (12.0)

Neurospora tetrasperma
  rcd-1                     0.116            0.023            -0.908           0.218            6.0
  reference genes (n=4698)  0.054 (0.154)    0.011 (0.032)    -0.0367 (2.190)  0.480 (0.319)    7.735 (17.0)

S is the number of polymorphic sites per base pair. π is the average number of nucleotide differences between sequence pairs. D is Tajima's neutrality statistic (Tajima 1989). KSmax is the maximum number of synonymous substitutions between sequence pairs. Pvar is the number of amino-acid sequences based on nonsynonymous substitutions. *** in 99.9% percentile, ** in 99% percentile, * in 95% percentile.
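
The first three statistics in Table 1 can be illustrated on a toy alignment. The sketch below computes the proportion of polymorphic sites, the average pairwise nucleotide difference per site, and Tajima's D using the standard formulas from Tajima (1989). The four sequences and the function name are hypothetical, and the code assumes equal-length sequences without gaps or ambiguous bases; it is not the software used to produce the table.

```python
import math
from itertools import combinations


def summary_stats(seqs):
    """Compute S per bp, pi per bp and Tajima's D for aligned, gap-free sequences."""
    n = len(seqs)
    length = len(seqs[0])
    # Number of segregating (polymorphic) alignment columns.
    seg = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    # Mean number of differences over all sequence pairs.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg + e2 * seg * (seg - 1)
    d = (k - seg / a1) / math.sqrt(variance) if variance > 0 else float("nan")
    return {"S_per_bp": seg / length, "pi_per_bp": k / length, "tajima_D": d}


# Hypothetical toy alignment with two divergent haplotype groups.
toy_alignment = ["ATGCA", "ATGCT", "ACGCA", "ACGCT"]
stats = summary_stats(toy_alignment)
print(stats)
```

A positive D, as in this toy case and in the rcd-1 rows for N. crassa and N. discreta, indicates an excess of intermediate-frequency variants relative to the neutral expectation.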

Table 2. Parameter estimates and likelihood scores of negative selection (A1, M7 and M8a) and positive selection models (A, M8)

Model  dN/dS  Parameter estimates                                                   PSS    Likelihood

M7     0.364  p = 1.06499, q = 1.81851                                              NA     -7758.9

M8     0.365  p0 = 0.99506 (p1 = 0.00494), p = 1.13305, q = 2.02613, ω = 2.65855    1 (1)  -7755.6

M8a    0.359  p0 = 0.96236 (p1 = 0.03764), p = 1.21055, q = 2.35599, ω = 1.00000    NA     -7757.8

A1            site class    0        1        2a        2b                          NA     -7774.8
              proportion    0.75587  0.20934  0.02724   0.00754
              background w  0.24450  1.00000  0.24450   1.00000
              foreground w  0.24450  1.00000  1.00000   1.00000

A             site class    0        1        2a        2b                          4 (3)  -7764.5
              proportion    0.76963  0.21413  0.01270   0.00353
              background w  0.25112  1.00000  0.25112   1.00000
              foreground w  0.25112  1.00000  15.33251  15.33251

The dN/dS ratio is an average across all site categories. PSS, number of positively selected sites (number of sites with posterior probability cutoff P>95%).
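
The likelihood scores in Table 2 are typically compared with likelihood ratio tests between nested models. The sketch below computes the test statistic 2ΔlnL for three pairings (M7 vs. M8, M8a vs. M8, A1 vs. A) and a chi-square P value. The pairings and the degrees of freedom (2, 1 and 1) follow common practice for these models and are assumptions here, since this section does not state them; the function names are hypothetical.

```python
import math

# Log-likelihood scores taken from Table 2.
lnl = {"M7": -7758.9, "M8": -7755.6, "M8a": -7757.8, "A1": -7774.8, "A": -7764.5}

# (null model, alternative model, degrees of freedom); the df values are assumed.
comparisons = [("M7", "M8", 2), ("M8a", "M8", 1), ("A1", "A", 1)]


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or df = 2 are supported in this sketch")


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return the statistic 2 * (lnL_alt - lnL_null) and its chi-square P value."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(max(stat, 0.0), df)


results = {
    (null, alt): likelihood_ratio_test(lnl[null], lnl[alt], df)
    for null, alt, df in comparisons
}
for (null, alt), (stat, p_value) in results.items():
    print(f"{null} vs {alt}: 2*dlnL = {stat:.1f}, P = {p_value:.2g}")
```

Under these assumptions, each positive selection model fits better than its null at the 5% level, with the strongest support for the branch-site comparison (A1 vs. A).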