bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 The plot thickens: haploid and triploid-

2 like thalli, hybridization, and biased

3 mating type ratios in 4 5 S. Lorena Ament-Velásquez1+, Veera Tuovinen1,2**+, Linnea Bergström1, Toby 6 Spribille3, Dan Vanderpool4, Juri Nascimbene5, Yoshikazu Yamamoto6, Göran Thor2, 7 Hanna Johannesson1* 8 9 1Department of Organismal Biology, Uppsala University, Norbyvägen 18D, 752 36 Uppsala, 10 Sweden. 11 2Department of Ecology, Swedish University of Agricultural Sciences, P.O. Box 7044, SE-750 12 07 Uppsala, Sweden. 13 3Biological Sciences CW 405, University of Alberta, Edmonton, AB T6G 2R3, Canada. 14 4 Department of Biology, Indiana University, Bloomington, IN 47405, USA 15 5Department of Biological, Geological and Environmental Sciences, University of Bologna, via 16 Irnerio 42, I-40126, Bologna, Italy. 17 6Department of Bioproduction Science, Faculty of Bioresource Sciences, Akita Prefectural 18 University Shimoshinjyo-nakano, Akita, 010-0190 Akita, Japan. 19 20 *Corresponding author: E-mail: [email protected] 21 + These authors contributed equally 22 ** Current address 23 24 Keywords: , heterothallism, ploidy 25 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

26 Abstract

27 The study of the reproductive biology of fungal symbionts has been traditionally

28 challenging due to their complex and symbiotic lifestyles. Against the common belief of

29 haploidy, a recent genomic study found a triploid-like signal in Letharia. Here, we used

30 genomic data from a pure culture and from thalli, together with a PCR survey of the MAT locus,

31 to infer the genome organization and reproduction in Letharia. We found that the read count

32 variation in the four Letharia specimens, including the pure culture derived from a single sexual

33 spore of L. lupina, is consistent with haploidy. By contrast, the L. lupina read counts from a

34 thallus’ metagenome are triploid-like. Characterization of the mating-type locus revealed a

35 conserved heterothallic configuration across the , along with auxiliary genes that we

36 identified. We found that the mating-type distributions are balanced in North America for L.

37 vulpina and L. lupina, suggesting widespread sexual reproduction, but highly skewed in Europe

38 for L. vulpina, consistent with predominant asexuality. Taken together, we propose that

39 Letharia fungi are heterothallic and typically haploid, and provide evidence that triploid-like

40 individuals are rare hybrids between L. lupina and an unknown Letharia lineage, reconciling

41 classic systematic and genetic studies with recent genomic observations.

42 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

43 Introduction

44 The way an organism reproduces is one of its most important traits, since patterns of

45 inheritance drastically affect a ability to adapt and survive over evolutionary time (Bell

46 1982; Whittle et al. 2011). In Fungi, mating types are key players in reproduction, as they

47 define the sexual identity of an individual at the haploid stage, making it able to recognize and

48 mate with compatible cell types (Kronstad and Staben 1997). Researchers can use the

49 structure of the mating-type (MAT) locus, and the frequency of mating types in natural

50 populations to shed light on the reproductive biology of a species. In particular, the

51 configuration of the MAT locus can inform us about the species’ sexual system.

52

53 The most basic MAT configuration occurs in heterothallic ascomycetes, where either of two

54 allelic variants (also referred to as idiomorphs as they represent highly dissimilar sequences

55 that encode different transcription factors) are found at the MAT locus (Butler 2007): one

56 contains a gene encoding a transcriptional activator protein with an α-box domain (referred to

57 herein as MAT1-1-1, see Turgeon and Yoder (2000)), and the other contains a gene encoding

58 a high-mobility group (HMG) protein (MAT1-2-1). In heterothallic species, successful mating is

59 only possible between haploid individuals of opposite mating types (Glass et al. 1988). By

60 contrast, both idiomorphs are often present in homothallic species, allowing them to mate with

61 itself, and in many cases with any other individual (e.g., Billiard et al. 2012; Gioti et al. 2012). In

62 addition, the idiomorphic regions of many ascomycetes harbour additional MAT genes, which

63 may contribute to the sexual development of the species (e.g., Ferreira et al. 1998; Mandel et

64 al. 2007; Tsui et al. 2013; Dyer et al. 2016; Hutchinson et al. 2016). Moreover, the relative

65 frequency of allelic variants at the MAT-locus in natural populations can be used to investigate

66 the reproductive mode of the species (e.g., Mandel et al. 2007), as an even distribution

67 indicates frequent sexual reproduction while an uneven distribution points towards a relatively bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

68 high frequency of asexual propagation. Within the class , which includes

69 many lichen symbionts, multiple studies have found a single mating type per sample, indicating

70 a widespread heterothallic sexual system (Singh et al. 2012; Ludwig et al. 2017; Allen et al.

71 2018; Dal Grande et al. 2018; Pizarro et al. 2019). However, homothallic Lecanoromycetes do

72 exist, such as some members of the genus Xanthoria s. lat. (Honegger et al. 2004; Scherrer et

73 al. 2005), emphasizing that taxa must be investigated in a case-by-case manner.

74

75 Ascomycete lichen symbionts are usually assumed to grow vegetatively as haploids, as the

76 majority of (Honegger and Scherrer 2008). However, there have been suggestions

77 of alternative genetic compositions present in lichens, such as different nuclear organization,

78 like dikaryosis (n+n), diploidy, and higher ploidy levels (Tripp et al. 2017; Tripp and Lendemer

79 2018; Pizarro et al. 2019) or as intrathallus variation, in which multiple individuals (genotypes)

80 of the lichen fungal symbiont come together into a single thallus but without cell fusion (Jahns

81 and Ott 1997). The latter phenomenon is also known as “chimeras” or as “mechanical hybrids”

82 (Murtagh et al. 2000; Dyer et al. 2001; Altermann 2004). Commonly, authors make no

83 distinction between different nuclear organizations and the “chimera” scenarios (Honegger et

84 al. 2004; Mansournia et al. 2012). This is partly due to the fact that very little is known about

85 the genetics and composition of thalli, as few species have been successfully grown in isolation

86 from their symbiotic partners (Crittenden et al. 1995; Honegger et al. 2004; Honegger and

87 Zippler 2007). Historically, it has remained extremely difficult to artificially reconstruct the full

88 lichen symbiosis and to mate different individuals in the lab (Bubrick and Galunj 1986; Jahns

89 1993; Stocker-Wörgötter 2001). However, the advent of next generation sequencing

90 technologies has opened the door to re-evaluate many of our assumptions on lichen biology,

91 including the genetic composition of the dominant ascomycete partner (Tripp et al. 2017; Dal

92 Grande et al. 2018; Armaleo et al. 2019; McKenzie et al. 2020). In particular, Tripp et al. (2017)

93 inferred the ploidy level of seven unrelated fungal lichen symbionts using sequencing data from bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

94 cultures derived from multiple spores, as well as from thalli containing all symbiotic partners

95 (i.e. metagenomic data). They found evidence of some lineages deviating from haploidy. As

96 compelling as such results may initially seem, such a pattern could have multiple explanations,

97 not all of which involve ploidy changes. Multispore cultures could represent a collection of

98 siblings that would naturally look different from haploid samples. Moreover, it is unclear if lichen

99 fungi have a vegetative incompatibility system in place as other ascomycetes (Dyer et al. 2001;

100 Sanders 2014), which could influence the fusion or lack thereof between germinating spores.

101 Thus, there is no clear expectation as to what would happen in a multispore culture in which

102 different genotypes might meet and undergo cell fusion. Furthermore, the metagenomes

103 studied might include sexual fruiting bodies (apothecia) or nearby tissue, which presumably

104 contains fertilized (dikaryotic or diploid) cells. Hence, general ploidy levels in lichen thalli

105 remain unclear.

106

107 In a recent study, McKenzie et al. (2020) used the Oxford Nanopore MinION technology to

108 produce long-read data from whole thalli of two species from the lichen genus Letharia:

109 Letharia lupina and Letharia columbiana. Unexpectedly, they discovered the presence of three

110 Letharia genotypes in the metagenomic data of both samples, suggesting that Letharia fungal

111 symbionts are triploid-like. They took advantage of the fact that one of the three genotypes

112 within the L. lupina sample was markedly different from the other two, allowing them to phase

113 the assembly graph and to produce a highly continuous haploid-like genome assembly. While

114 methodologically convenient, their findings are nothing short of puzzling, since Letharia is a

115 genus that belongs to the diverse family , whose members are heterothallic and

116 in general assumed to be haploid (Honegger and Zippler 2007; Pizarro et al. 2019). In plant

117 and animal taxa, triploid individuals often have difficulties during meiosis as the chromosomes

118 are unbalanced (Ramsey and Schemske 1998; Tiwary et al. 2005; Köhler et al. 2009), and

119 many higher ploidy populations often persist via asexual reproduction or selfing (Levin, 1975; bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

120 Mable 2003; Bicknell and Koltunow 2004). The sexual system of Letharia has not been

121 studied, but it can be hypothesised that it follows the heterothallic, self-infertile ancestral state.

122 How, then, do triploid or triploid-like fungal symbionts deal with sexual reproduction under

123 heterothallism?

124

125 In this study, we explored the reproductive biology of Letharia. In their seminal paper, Kroken

126 and Taylor (2001a) studied the diversity within this genus and defined six cryptic, closely

127 related species based on the analyses of 12 loci and their coalescent patterns. That and

128 several subsequent studies have further refined these species hypotheses based on

129 morphological characters, photosynthetic partner lineage, and their assumed main

130 reproductive mode — sexual or asexual (Kroken and Taylor 2001a; Altermann et al. 2014;

131 Altermann et al. 2016). Sexual Letharia lineages (referred to as “apotheciate”) produce multiple

132 apothecia that release sexual fungal spores, while asexual lineages (“sorediate”) produce

133 abundant asexual symbiotic propagules also known as soredia. The distinction, however, is not

134 clear-cut as apotheciate species might have isidia, another type of symbiotic asexual

135 propagule, and sorediate species occasionally produce apothecia (Kroken and Taylor 2001a).

136 The global species diversity centre of Letharia is found in western North America, where all six

137 lineages of the genus are present. In contrast, only the sorediate species L. lupina and L.

138 vulpina are known in parts of Europe, Asia and Africa. , the only representative

139 of the genus in Norway and Sweden, is red-listed in these countries, where it produces

140 apothecia extremely rarely and the genetic variation is low (Högberg et al. 2002; Arnerup et al.

141 2004; SLU ArtDatabanken 2015; Henriksen and Hilmo 2015).

142

143 Here we aimed at collecting Letharia samples to represent the diversity of species as

144 previously defined. We gathered genomic data from thalli of four taxa (as metagenomes), as

145 well as from a pure culture derived from a single spore of the species L. lupina, and performed bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

146 PCR surveys of the mating type in additional specimens. With this data in hand, we re-

147 evaluated the ploidy level and verified the sexual system in Letharia. We also contrasted

148 patterns of mating type frequencies in different populations of the sorediate species to get

149 insight into the occurrence of sexual vs. asexual reproduction in the species of conservation

150 concern.

151 Methods

152 Lichen material used in the study

153 We sequenced the genome and the transcriptome of the pure-cultured L. lupina (culture

154 number: 0602M), originating from a single spore from Canada and maintained in the culture

155 collection at Akita Prefectural University (Yamamoto et al. 1998). Likewise, we sequenced both

156 the metagenome and the metatranscriptome of one lichen thallus from each of four Letharia

157 taxa: L. columbiana, L. lupina, L. ‘rugosa’, and L. vulpina, all collected from Montana, U.S.A.

158 (Table S1, SRA: SRP149293 from Tuovinen et al. (2019)) (fig 1A). In addition, we used 317

159 Letharia thalli for PCR analyses of the mating type from Sweden (N=98; six localities with 1-20

160 thalli/locality), Switzerland (N=18; one locality), Italy (N=60; six localities with10 thalli/locality)

161 and the U.S.A. (N=138; 14 localities with 1-31 thalli/locality; Table S1). All lichen specimens

162 were collected from different trees in order to avoid collecting clones of the individual thalli

163 dispersed on the same tree trunk, except for eight putative L. gracilis thalli which we received

164 from B. McCune, all collected from the same tree. In Sweden, only a small part of each thallus

165 (< 3 cm in length) was collected due to the red-listed status of L. vulpina. After collection, the

166 specimens were air-dried and stored at –20°C until DNA was extracted. The thalli used for

167 RNA extraction were snap frozen in the field in liquid nitrogen, transferred to the lab and stored

168 at –80°C until further processing. Letharia species were identified based on a combination of

169 morphology and internal transcribed spacer (ITS) sequence, as the combination of these bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

170 characteristics should distinguish the previously described cryptic species (Kroken and Taylor

171 2001a). In addition, the algal ITS identity was used to assign two sorediate specimens to

172 species following Altermann et al. (2016) (data not shown). We aligned the ITS sequences

173 from all specimens included in this study (GenBank accessions MG645014–MG645052 from

174 Tuovinen et al. (2019)) with the previously published Letharia ITS sequences from Kroken and

175 Taylor (2001a, TreeBASE ID # 1126), Altermann et al. (2014, TreeBASE ID # 15485) and

176 Altermann et al. (2016, NCBI PopSet 1072900244) using MUSCLE in AliView v.1.18 (Larsson

177 2014). Detailed information about the specimens, their collection sites and ITS variants is given

178 in Table S1, and the specimens are deposited in the Uppsala University Herbarium (UPS).

179 DNA and RNA extraction 180 For the reference genome and transcriptome sequencing, mycelia of the pure culture of L.

181 lupina was placed in a 2 ml safe lock tube with two sterile 3 mm tungsten carbide beads, frozen

182 in a liquid nitrogen and pulverized in a TissueLyzer II (Qiagen) with 25 r/sec for 1 min. In

183 addition, pieces of the vegetative thallus part, carefully avoiding any apothecia or their close

184 proximity, of four Letharia used for metagenome and metatranscriptome sequencing and 317

185 specimens used for PCR screening were physically disrupted as described in Tuovinen et al.

186 2019. DNA was extracted from both mycelia and lichen tissue by using the DNeasy Plant Mini

187 kit, following the manufacturer’s instructions with a prolonged incubation at 65°C for 1 hr. RNA

188 of the mycelia from the pure culture was extracted with RNeasy Plant Mini Kit (Qiagen) with

189 RLT buffer. RNA from the thalli was extracted with the Ambion Ribopure Yeast Kit (Life

190 Technologies). For detailed description of DNA and RNA extraction of the lichen thalli, see

191 Tuovinen et al. 2019. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

192 Library preparation and NGS sequencing 193 Libraries for genome sequencing were prepared from 100 ng of DNA using the TruSeq Nano

194 DNA Sample Preparation Kit (#15041110, rev B). The pure culture was sequenced twice on a

195 split lane of Illumina MiSeq v3 with 300bp paired-end sequencing. The DNA from each thallus

196 was sequenced on 1/5 lane of Illumina HiSeq 2500 with 100bp paired-end sequencing by the

197 SNP&SEQ Technology Platform, Science for Life Laboratory at Uppsala University (Tuovinen

198 et al. 2019).

199 200 Libraries for the mRNA sequencing were made by using the Illumina TrueSeq Stranded mRNA

201 Kit (#15031047, rev E) and sequenced with Illumina HiSeq 2500. The transcriptome of the pure

202 culture was sequenced on 1/5 of a lane at the SNP&SEQ Technology Platform, and the

203 metatranscriptomes from the thalli on 1/6 of a lane each at the Huntsman Cancer Institute,

204 University of Utah sequencing core (Tuovinen et al. 2019).

205 Quality control and genome and transcriptome assembly 206 Both the DNA and RNA raw reads were trimmed and quality controlled using Trimmomatic 0.32

207 (Bolger et al. 2014) (with options ILLUMINACLIP:/adapters.fa:2:20:10:4:true LEADING:10

208 TRAILING:10 SLIDINGWINDOW:4:15 MINLEN:25 HEADCROP:0). Data quality was inspected

209 using FastQC (Patel and Jain 2012). The draft genome of the pure cultured L. lupina was de

210 novo assembled from the trimmed reads using SPAdes v. 3.5.0 (Bankevich et al. 2012) with k-

211 mers 21,33,55,77,97,127 and the --careful flag. The raw reads of the metagenomes were

212 assembled with SPAdes using k-mers 55, 75, 85, 95, as preliminary analyses using trimmed

213 reads performed worse (not shown). Likewise, downstream analyses of the L. lupina pure

214 culture were done with trimmed reads while the raw reads were used for the metagenomes. A

215 summary of the genome assembly statistics is presented in Table S2 as calculated with

216 QUAST 3.2 (Gurevich et al. 2013). The filtered RNA reads of the pure cultured L. lupina, bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

217 excluding the orphan reads produced after trimming, were assembled using Trinity 2.0.6

218 (Grabherr et al. 2011; Haas et al. 2013).

219 Genome characterization of L. lupina 220 As the long-read L. lupina assembly from McKenzie et al. (2020), hereafter referred to as the

221 McKenzie L. lupina assembly, was derived from an entire lichen thallus, we compared it to the

222 SPAdes assembly of the pure-cultured L. lupina. We produced whole genome alignments with

223 the NUCmer program from the MUMmer package v. 4.0.0beta2 (Kurtz et al. 2004) with options

224 -b 2000 -c 200 --maxmatch. This analysis revealed only small alignments (less than 11 kbp)

225 between the contig 1 of the McKenzie L. lupina assembly and multiple scaffolds from the

226 SPAdes assembly of the pure-cultured L. lupina. To evaluate if the contig 1 of the McKenzie L.

227 lupina assembly really is part of the genome of the ascomycete and not other symbiotic

228 partners, we explored the repetitive content of both assemblies with RepeatMasker v. 4.0.8

229 (http://www.repeatmasker.org/) and the repeat library from McKenzie et al. (2020) with

230 parameters -a -xsmall -gccalc -gff -excln.

231 Phylogenetic analyses of marker genes 232 Alignments were produced with MAFFT version 7.245 (Katoh and Standley 2013) and options -

233 -maxiterate 1000 --retree 1 --localpair. We used the program RAxML v. 8.0.20 (Stamatakis

234 2014) to infer a Maximum Likelihood (ML) tree of the previously published Letharia ITS variants

235 along with variants discovered in our data. Likewise, we produced ML trees from the idiomorph

236 sequences (MAT1-1 or MAT1-2) as above but partitioning by gene and intergenic regions.

237 Branch support was assessed with 1000 bootstrap pseudo-replicates. The specimens with

238 unique ITS variants were putatively assigned to species based on the ML tree, previously

239 published informative SNPs that can separate between L. vulpina and L. lupina, and previously bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

240 described morphological characters (Kroken and Taylor 2001a; McCune and Altermann 2009;

241 Altermann et al. 2014; Altermann et al. 2016).

242

243 In order to place our genomic samples within the diversity of the genus, we extracted the

244 sequence from 11 markers used in previous Letharia studies (ribosomal ITS, the 18S intron,

245 chitin synthase I, and the anonymous loci 11, 2, 13, DO, CT, BA, 4, and CS) from the haploid

246 individuals (see below) and fused them with the concatenated alignment from Altermann et al.

247 2016 (TreeBASE study ID S18729). We then inferred a ML topology with IQ-TREE v. 1.6.8

248 (Nguyen et al. 2015). Branch support was estimated with 100 standard bootstraps. Rooting

249 was arbitrarily done with the well-supported L. columbiana (L. ‘lucida’) lineage.

250 Network analysis of genomic data 251 Previous population studies using few markers have shown that the Letharia species are very

252 closely related (Kroken and Taylor 2001a; Altermann et al. 2014; Altermann et al. 2016). In

253 order to approximate the genetic distance and relationships between the Letharia individuals

254 for which we had whole-genome data available, we used as reference the genome annotation

255 of the McKenzie L. lupina. We randomly selected 1000 gene features, extracted their

256 corresponding sequences from the McKenzie L. lupina assembly (including introns), and used

257 them as queries in BLAST searches performed on the SPAdes assembly of our four

258 metagenomes and the L. lupina pure culture. We also included the long-read assembly of the

259 L. columbiana individual from McKenzie et al. (2020), hereafter referred to as the McKenzie L.

260 columbiana. We filtered the BLAST output to include only hits that had at least a 250bp-long hit

261 alignment and 90% identity to the query. We extracted the sequence corresponding to the

262 surviving hits and aligned them using MAFFT version 7.458 (Katoh and Standley 2013) with

263 options --adjustdirection --maxiterate 1000 --retree 1 --localpair. With the aim of approximating

264 single-copy orthologs, only those alignments that contained a single hit for all samples were bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

265 retained. The resulting 680 genes were concatenated into a matrix of 1,184,497 bp from which

266 35,458 bp were variable. We used the matrix as input for SplitsTree v. 4.14.16, build 26 Sep

267 2017 (Huson and Bryant 2006) to produce an unrooted split network with a NeighborNet

268 (Bryant and Moulton 2004) distance transformation (uncorrected distances) with an EqualAngle

269 splits transformation and excluding gap sites.

270

271 Inferring ploidy levels based on the minor allele frequency (MAF) 272 distribution 273 As the McKenzie samples were inferred to be triploid-like (McKenzie et al. 2020), we

274 determined the ploidy levels of our metagenomes and the pure culture by calculating the

275 distribution of the minor (i.e. less common) allele in read counts of biallelic SNPs (Yoshida et

276 al. 2013; Ament-Velásquez et al. 2016). Normally, a diploid individual should have a minor

277 allele frequency (MAF) distribution around 0.5, while in a triploid individual the mode should be

278 around 1/3. A tetraploid individual is expected to have two overlapping curves, one centered

279 around 0.25 and another around 0.5. The width of the curves is proportional to factors like

280 coverage variance, mapping biases, and hidden paralogy (Heinrich et al. 2011). In addition,

281 any given sample is expected to have a large number of very low frequency alleles due to

282 sequencing errors. As a result, a haploid sample should only exhibit a curve corresponding to

283 the sequencing errors. Note that a caveat of this approach is that the haploid signal cannot be

284 distinguished from an extremely homozygous higher-ploidy level individual, as these will show

285 few heterozygous sites. Moreover, a diploid or higher ploidy signal in MAF distributions could

286 correspond to a heterokaryon condition (n+n, n+n+n, etc), which strictly speaking is composed

287 of haploid nuclei. Hence, “ploidy” is used here in a broad sense to include genetically different

288 heterokaryons.

289 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

290 To produce MAF distributions, we mapped the Illumina reads of all of our metagenomes to the

291 pure-cultured L. lupina, which should exclude most reads from other symbionts, using BWA v.

292 0.7.17 (Li and Durbin 2010) with PCR duplicates marked by Picard v. 2.23.0

293 (http://broadinstitute.github.io/picard/). To call variants we used VarScan2 v. 2.4.4 (Koboldt et

294 al. 2012) with parameters --p-value 0.1 --min-var-freq 0.005 to ensure the recovery of low

295 frequency alleles and no further influence of the variant caller on the shape of the MAF

296 distribution. Only biallelic SNPs from contigs larger than 100 kbp and with no missing data

297 were used. We further filtered out all SNPs overlapping with repeated elements annotated

298 above with RepeatMasker, and modified the output of VarScan2 to be compliant with the vcf

299 format (https://samtools.github.io/hts-specs/). Based on the general coverage levels of all

300 samples, we excluded variants with coverage above 300x or below 50x. These high quality

301 SNPs were processed with the R packages vcfR v. 1.10.0 (Kamvar et al. 2015; Knaus and

302 Grünwald 2017), dplyr v. 0.8.0.1 (Wickham et al. 2018), ggplot2 v. 3.1.1 (Wickham 2016), and

303 cowplot v. 1.0.0 (Wilke 2019). MAF distributions were scaled such that the area of all bars

304 amount to 1. A full Snakemake v. 5.19.2 (Köster and Rahmann 2012) pipeline is available at

305 https://github.com/johannessonlab/Letharia/tree/master/LichenPloidy.

306

307 Measuring distance of the subgenomes within the L. lupina 308 metagenome 309 Given the presence of two distinguishable genotypes in the metagenome of L. lupina (see

310 Results), we calculated a relative measure of distance of each subgenome with respect to the

311 other Letharia lineages (L. columbiana, L. rugosa, and L. vulpina). We assumed that the pure-

312 cultured L. lupina corresponds to the “true” L. lupina genotype, and assigned all sites in the L.

313 lupina metagenome with identical genotype to the pure culture as “lupina”, while all alternative

314 alleles were treated as “other”. For each Letharia lineage, we counted the proportion of sites bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

315 that were not identical to either the “lupina” or the “other” variant. We made a further distinction

316 between areas along the scaffolds that were polymorphic (having both “lupina” and “other”

317 alleles), or fixed for either allele within the L. lupina metagenome. As input we used the filtered

318 SNPs produced above but we discarded all sites with MAF < 0.2 (L. lupina metagenome) or

319 MAF < 0.9 (haploid samples) under the assumption that they correspond to sequencing errors.

320 Only sites with data from all samples were considered.

321 Identifying scaffolds of the genomes containing the MAT locus 322 We performed searches with the BLAST 2.2.31+ suit (Camacho et al. 2009) of the MAT-gene

323 domains (HMG and α; see Martin et al. 2010) in the assembly of the L. lupina pure culture. As

324 queries we used the corresponding gene sequences of Polycauliona polycarpa (Xanthoria

325 polycarpa; Lecanoromycetes, Genbank accession numbers AJ884598 and AJ884599).

326 Furthermore, we searched for APN2 (encoding a basic endonuclease/DNA lyase) and SLA2

327 (encoding a protein that binds to cortical patch actin) since these genes are typically flanking

328 the MAT genes in Pezizomycotina (Heitman et al. 2007). As queries we used the ortholog

329 clusters (OG5 126768 and OG5 129034 for APN2 and SLA2, respectively) retrieved from

330 OrthoMCL-DB (Chen et al. 2006) and restricted to Pezizomycotina. Finally, we used the MAT

331 locus retrieved from the L. lupina pure culture assembly as a query to perform BLAST searches

332 against the assemblies of the metagenomes (L. lupina, L. vulpina, L. columbiana, and L.

333 ‘rugosa’).

334 Annotating the MAT locus 335 We used three ab initio gene predictors to annotate Letharia scaffolds containing the MAT

336 locus: SNAP release 2013-02-16 (Korf 2004), Augustus v. 3.2.2 (Stanke and Waack 2003) and

337 GeneMark-ES v. 4.32 (Lomsadze et al. 2005; Ter-Hovhannisyan et al. 2008). All gene

338 predictors were trained for the Letharia genomes as described in the Appendix. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

339

340 We used STAR v. 2.5.1b (Dobin et al. 2013) to map filtered RNAseq reads to the entire

341 genome assembly of the pure-cultured L. lupina, or to the relevant scaffold extracted from the

342 metagenomes. Cufflinks v. 2.2.1 (Trapnell et al. 2010) was used to create transcript models

343 from which ORFs were inferred using TransDecoder v. 3.0.1 (Haas et al. 2013). The ortholog

344 sequences of APN2 and SLA2 of Aspergillus oryzae (OrthoMCL-DB: RIB40_aory) and the

345 MAT genes of Polycauliona polycarpa were aligned to the Letharia scaffolds using Exonerate

346 v. 2.2.0 (Slater and Birney 2005) with the options --exhaustive TRUE --maxintron 1000. Using

347 all these sources of evidence, we created weighted consensus gene structures using

348 EVidenceModeler (EVM) v. 1.1.1 (Haas et al. 2008) that were subject to manual curation with

349 Artemis release 16.0.0 (Rutherford et al. 2000).

350

351 To determine the identity of auxiliary MAT genes found in Letharia, we used the protein

352 sequences of L. rugosa (MAT1-1) and L. columbiana (MAT1-2) to conduct BLASTp and

353 DELTA-BLAST searches in the NCBI Genbank database (Boratyn et al. 2012) (searches

354 conducted the 29th of April of 2020), as well as HMMER v. 3.3 (www.hmmer.org; Eddy 2011)

355 with the uniprotrefprot database, version 2019_09. We also compared the sequences with the

356 protein models provided by Pizarro et al. (2019) for multiple lecanoromycete species. In

357 addition, we re-annotated the scaffold with the MAT locus of the species Leptogium

358 austroamericanum and (Pizarro et al. 2019), as well as the MAT1-2

359 idiomorph of Cladonia grayi (accession number MH795990; Armaleo et al. 2019) using the

360 pipeline above in order to confirm their gene order and composition. We found that the protein

361 model of the MAT1-1 auxiliary gene as provided by Pizarro et al. (2019) is relatively variable at

362 the C’ terminus, which could be explained by inconsistencies in the gene annotation process in

363 the absence of sources of evidence additional to the ab initio prediction (e.g. RNAseq data), or

364 by true biological differences. Hence, only the first exon was analysed further with IQ-TREE. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

365 Likewise, we only used the first and second exon of the MAT1-2 auxiliary gene for an IQ-TREE

366 analysis.

367 PCR screening of the mating type and additional annotation of 368 the idiomorphs 369 We designed idiomorph specific primer pairs to screen for the mating-type ratios in different

370 populations based on the annotation of the MAT-loci from the genomes (MAT1-1: MAT11L6F

371 and MAT11L6kR, MAT1-2: MAT12LF1 and MAT12LR1; Table S3). Each sample was amplified

372 with primer pairs for both idiomorphs and the presence/absence of the PCR-product of each

373 idiomorph was determined by using agarose gel electrophoresis. A subsample of each

374 idiomorph was sequenced to ensure the amplification of the correct target sequence with the

375 PCR primers. We tested whether the distribution of mating types in the sorediate species (L.

376 lupina and L. vulpina) in U.S.A., the Alps and Sweden follows the 1:1 ratio expected for a

377 population that regularly reproduces sexually by using an exact binomial test with 95%

378 confidence interval.

379

380 In order to sequence and annotate both idiomorphs for each Letharia species, we selected

381 specimens of the same species (based on ITS sequence information) as the individual from

382 which the genomic data originated, but of the opposite mating type, and amplified a circa 3800

383 bp long part of the MAT locus using primers LMATF2 and LMATR1 and APN2LF and SLA2LR

384 (Table 1). The same procedure was done for L. ‘barbata’ and the putative L. gracilis

385 specimens. The entire MAT locus was then sequenced with primers designed for Letharia

386 based on the genome annotations (Table 1). The PCR-obtained sequences of the idiomorphs

387 were annotated using the trained ab initio gene predictors and the alignments (with Exonerate)

388 of the final gene models. We define the idiomorphic part of the MAT locus as the whole region bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

389 between APN2 and SLA2 that lack sequence similarity between individuals of different mating

390 types. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

391 Results

392 Letharia species in the study 393 Based on morphology and ITS sequence variation, we assigned the collected 322 specimens

394 to five of the six previously recognized Letharia species (Kroken and Taylor 2001a; Altermann

395 2009; Altermann et al. 2016), including those used for metagenome sequencing (see

396 Appendix). While traditionally-used markers generally provide little resolution to the Letharia

397 species complex (fig. S1 and S2; Altermann et al. 2014; Altermann et al. 2016), a NeighborNet

398 split network of 680 genes extracted from all (meta)genome assemblies confirmed clustering

399 by species for these samples (fig. 1B).

400

401 High correspondence between our pure culture assembly and the 402 McKenzie assembly of L. lupina 403 The McKenzie L. lupina assembly was produced by manual curation of metagenomic long-read

404 data and is one of the best genome assemblies for an ascomycete lichen symbiont currently

405 available (McKenzie et al. 2020). However, it could still contain sequences from other symbiotic

406 partners. As a sanity check, we compared the L. lupina pure culture assembly generated by us

407 to the McKenzie L. lupina assembly and found that all contigs, except contig 1, have good

408 correspondence between the two. There are very few obvious translocations or inversions,

409 which might be either assembly artifacts or biological differences between samples (fig. S3).

410 Conversely, contig 1 only produced very small (< 11 kbp) alignments to multiple scaffolds of

411 the L. lupina pure culture assembly (fig. S4). This contig is the largest in the McKenzie L.

412 lupina assembly (> 3 Mbp), with a size comparable to the smallest chromosomes of well-

413 studied ascomycetes like Podospora anserina (Espagne et al. 2008). Remarkably, contig 1 is

414 composed of 87% repetitive elements, in stark contrast with most other contigs that harbor

415 around 30% repetitive elements (fig. S4). We reasoned that if the repetitive elements present bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

416 in contig 1 also occur in the other contigs of either the McKenzie L. lupina or the pure-cultured

417 L. lupina, then this large contig is indeed part of the L. lupina genome and not of another

418 symbiont. Accordingly, we found that the five most abundant repeats of contig 1 are present

419 elsewhere in both assemblies (fig. S6 and fig. S7), and were probably not recovered as such

420 in the pure culture assembly due to the limitations of short-read technology. Hence, we

421 conclude that the McKenzie L. lupina assembly does reflect the genome of the ascomycete L.

422 lupina and the observation of a triploid-like signal is not an artifact of additional unrelated fungal

423 sequences in their assembly.

424 Genomic and metagenomic datasets show haploid-like patterns 425 except for the metagenome of L. lupina 426 As the results of McKenzie et al. (2020) showed a triploid-like signal for L. lupina and L.

427 columbiana, we inferred the ploidy levels in our genomic and metagenomic samples by

428 producing MAF distributions using the L. lupina pure culture as reference for variant calling. As

429 expected under haploidy, all but one sample show an exponential decay of low frequency

430 variants produced by sequencing errors, including the L. lupina pure culture (fig. 1C). The

431 exception to the haploid pattern is our L. lupina metagenome, which has a clear peak centered

432 around ⅓, consistent with triploidy. Note that the exponential decay of the variants of the L.

433 lupina pure culture is steeper than for the other samples. This difference results from higher

434 coverage and longer reads, as the L. lupina pure culture was sequenced with MiSeq, as

435 opposed to HiSeq for other samples, leading to better read mapping. In addition, its own

436 genome was used as reference for mapping, minimizing the variance further. We obtained

437 similar results using the McKenzie L. lupina as reference (data not shown). bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

438

439 Figure 1. Relatedness and minor allele frequency (MAF) distributions of Letharia specimens. A) Thalli

440 used for metagenome sequencing in this study. B) Unrooted NeighborNet split network based on 680

441 genes from our samples with genomic or metagenomic data available and those from McKenzie et al.

442 (2020). C) Folded MAF distributions of the metagenomes from Letharia specimens in (A), as well as the

443 whole genome derived from the pure culture of a single L. lupina spore. All samples show distributions

444 concordant with haploidy, with the exception of the triploid-like metagenome of L. lupina. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

445 Letharia has a heterothallic MAT-locus architecture 446 Having established that most metagenomes and the pure-cultured L. lupina are most likely

447 haploid, we proceeded to characterize the MAT locus to give us an indication of the sexual

448 system in Letharia. The MAT locus with the flanking genes APN2 and SLA2 was found within a

449 single scaffold for the pure-cultured L. lupina (submitted to GenBank under the accession

450 number MK521630) and from the metagenomes of L. columbiana, L. ‘rugosa’, and L. vulpina

451 (GenBank numbers MK521629, MK521632 and MK521631, respectively). Only one mating-

452 type idiomorph was recovered in each assembled genome, suggesting that Letharia is

453 heterothallic.

454

455 The genome of the pure-cultured L. lupina and the metagenomes of L. vulpina, and L. ‘rugosa’

456 contained the MAT1-1 idiomorph (with the α domain of the MAT1-1-1 gene), while the MAT1-2

457 idiomorph (with the HMG domain of the MAT1-2-1 gene) was found in the metagenome of L.

458 columbiana. Fig. 2 shows the gene content of the MAT1-1 idiomorph as represented by L.

459 ‘rugosa’, and the MAT1-2 as represented by L. columbiana. Notably, alignments with

460 Exonerate revealed the remnants of the last exon of the MAT1-1-1 gene in the idiomorph

461 MAT1-2 (the * in fig. 2B). Albeit homologous, this piece of exon is highly diverged from the

462 MAT1-1-1 in Letharia MAT1-1, as revealed by the percentage of pairwise identity along the

463 idiomorphs (fig. 2C). This same pattern was observed in the species of the genus Cladonia

464 (family Cladoniaceae) by Armaleo et al. (2019), suggesting that the pseudogenized exon might

465 be ancestral to both families. In addition, a smaller ORF was predicted weakly based on

466 transcriptomic data of L. columbiana downstream of APN2 in both idiomorphs (as lorf, from

467 Letharia open reading frame, in fig. 2; see also fig. S8-S11). Likewise, the transcriptomic data

468 revealed an antisense transcript for the MAT1-1-1 gene in all species (fig. 2 and fig. S8-S11).

469 See Appendix for more details.

470 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

A L. 'rugosa'

APN2 lorfψ MAT1-1-7 MAT1-1-1 SLA2

B L. columbiana

MAT1-2-1

APN2 lorf MAT1-2-14 * SLA2

C 1.00 0.75 0.50

0.25 dentity

I 0.00 471 0 2500 5000 Position (bp) 7500 10000 12500

472 Figure 2. The structure of the MAT idiomorphs in the genus Letharia. The typical gene content of the

473 MAT1-1 idiomorph as exemplified by L. ‘rugosa’ (A) is contrasted to the MAT1-2 idiomorph of L.

474 columbiana (B). The final gene models with intron-exon boundaries are shown along with the transcript

475 models (orange). In (A) the antisense transcript of the MAT1-1-1 gene is marked with a darker orange

476 colour. Notice that the lorf gene is pseudogenized in L. ‘rugosa’ (lorfΨ). In (B) a star (*) shows the

477 location of a sequence homologous, but highly divergent, to the last exon of the MAT1-1-1 gene. The

478 pairwise identity between idiomorphs of the MAT locus is plotted in (C) using overlapping windows of

479 100 bp (incremental steps of 10 bp along the sequence), showing the drop in identity. The white gap

480 towards the 3’ end of the area of low identity corresponds to a track of 129 bp in the MAT1-2 idiomorph

481 of all species. The positions of forward (blue) and reverse (red) primers used for PCR screening are

482 marked with arrows.

483

484 We further obtained the sequence of the MAT region from additional 46 thalli by PCR, which

485 corresponded to either the MAT1-1 or MAT1-2 idiomorph for all species (except L. cf. gracilis,

486 for which all samples had a MAT1-2 idiomorph). As the idiomorphs are very similar between

487 species (fig. S12), we can use the sequence comparison between L. rugosa and L.

488 columbiana to estimate the size of the non-recombining region in Letharia by examining the

489 level of divergence between genomic sequences of MAT1-1 and MAT1-2 (fig. 2C). Towards bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

490 APN2, the identity between sequences drops from 98%-99% (overlapping windows of 100 bp

491 with incremental steps of 10 bp) to on average 48%, marking the beginning of the idiomorphic

492 region. On the 3’ end, towards SLA2, a plateau region of around 70% identity covers

493 approximately 500bp. The sequence identity reaches this plateau of intermediate values just

494 before the end of the idiomorph, and it is followed by an indel of 129 bp present in MAT1-2, but

495 not in the MAT1-1, for all Letharia species. We consider the 3’-end of this indel the boundary

496 for the 3’-end of the idiomorph. Using these boundaries, we estimate that the size of each

497 idiomorph of Letharia is around 3.8 kbp.

498 The auxiliary genes present in Letharia MAT idiomorphs are 499 ancestral to the Lecanoromycetes 500 We found that both idiomorphs of the MAT locus of each Letharia species harbour additional

501 ORFs (one in each idiomorph) designated herein as MAT1-1-7 and MAT1-2-14 (fig. 2). The

502 nomenclature of these additional genes is contentious and discussed extensively in the

503 Appendix.

504

505 Pizarro et al. (2019) reported a number of ORFs associated with the lecanoromycete

506 idiomorphs, but it was not clarified whether these so-called “auxiliary MAT genes” are

507 homologous across lichen symbionts or other fungal taxa, making comparisons difficult. Hence,

508 we used protein sequence and synteny comparisons to define homology of these genes. We

509 found that most, but not all, Lecanoromycetes have the same auxiliary gene in the MAT1-1

510 idiomorph as Letharia (i.e., MAT1-1-7; left panel of fig. 3) and at the same location (data not

511 shown). In addition, this ORF seems to be homologous to a gene predicted for members of the

512 class Eurotiomycetes (left panel of fig. 3). By contrast, the auxiliary genes present in the

513 MAT1-1 idiomorph of Leptogium austroamericanum (Collemataceae, Lecanoromycetes) and bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

514 Dibaeis baeomyces (Icmadophilaceae, Lecanoromycetes) (Pizarro et al. 2019) are not related

515 to MAT1-1-7, and are located in a different position (between MAT1-1-1 and SLA2).

516

AuxMAT1_Letharia_rugosa AuxMAT2_Letharia_columbiana

AuxMAT1_Cornicularia_normoerica AuxMAT2_Evernia_prunastri

AuxMAT1_Usnea_strigosa AuxMAT2_Cetraria_islandica 100

AuxMAT1_Canoparmelia_schelpei AuxMAT2_Cetraria_conmixta

AuxMAT1_Punctelia_borreri AuxMAT2_Oropogon_secalonicus MAT1-1-7 AuxMAT1_Parmeliopsis_ambigua AuxMAT2_Protousnea_magellanica MAT1-2-14 AuxMAT1_Hypotrachyna_scytodes AuxMAT2_Notoparmelia_tenuirima

AuxMAT1_Canoparmelia_nairobiensis AuxMAT2_Parmelia_saxatilis

AuxMAT1_Bulbotrix_sensibilis AuxMAT2_Xanthoparmelia_chlorochroa

55 AuxMAT1_Parmelinella_wallichiana AuxMAT2_Canoparmelia_texana 0.5 AuxMAT1_Platismatia_glauca AuxMAT2_Hypogymnia_subphysodes 0.3 AuxMAT1_Melanelia_stygia AuxMAT2_Pseudevernia_fufuracea 61 98 AuxMAT1_Alectoria_sarmentosa AuxMAT2_Parmotrema_austrosinense

AuxMAT1_Pseudephebe_pubescens AuxMAT2_Flavoparmelia_pubescens 72 99 AuxMAT1_Cladonia_macilenta AuxMAT2_Melanelixia_glabra Lecanoromycetes 100 AuxMAT1_Cladonia_metacoralifera 81 AuxMAT2_Rhizoplaca_melanphthalma AuxMAT1_Gyalolechia_flavorubescens AuxMAT2_Cladonia_grayi__MH795990.1

98 AuxMAT1_Umbilicaria_muehlenbergii AuxMAT2_Lasallia_hispanica 100

Lasallia_pustulata__SLM38448.1 Lasallia_pustulata__KAA6408227.1 Paracoccidioides_lutzii__XP_002792086.2 99 Paracoccidioides_brasiliensis__ODH46739.1 Parmeliaceae Paracoccidioides_brasiliensis__KGY14940.1 96 Umbilicariaceae Paracoccidioides_brasiliensis__ODH15599.1 Cladoniaceae Eurotiomycetes Helicocarpus_griseus__PGG98885.1 38 Lecanoraceae 30 MAT1-1-9_Trichophyton_verrucosum__XP003023854.1 MAT1-1-7_Coccidioides_immitis__ABS19617.1 517 MAT1-1-8_Diplodia_sapinea__AHA91691.1 Dothidiomycetes

518 Figure 3. Maximum likelihood trees of the auxiliary genes MAT1-1-7 (left) and MAT1-2-14 (right).

519 Colored vertical bars mark the different families within the lichen-forming class Lecanoromycetes, and

520 the Letharia sequences are highlighted. MAT1-1-7 is found outside Lecanoromycetes, while MAT1-2-14

521 seems unique for this class. Notice that the homologous genes of MAT1-1-7 in Trichophyton verrucosum

522 and Diplodia sapinea have been named differently by other authors. Branches with bootstrap values

523 above 70% are thickened. Branch lengths are proportional to the scale bar (amino acid substitution per

524 site).

525

526 For the Letharia auxiliary gene in the MAT1-2 idiomorph (MAT1-2-14 in fig. 2), we also found

527 homologs in other Lecanoromycetes (right panel of fig. 3), but no homolog outside of

528 Lecanoromycetes. This gene was not identified by Armaleo et al. (2019) in Cladonia, but our bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

529 prediction pipeline recovered it in the Cladonia grayi genome (right panel of fig. 3) in the

530 expected position.

531

532 Taken together, our data shows that the general idiomorph architecture of Letharia is

533 conserved at least between Parmeliaceae, Cladoniaceae, and Umbilicariaceae (fig. 3). Given

534 the presence of MAT1-1-7 in a member of Teloschistaceae, and of MAT1-2-14 in a member

535 from Lecanoraceae (fig. 3), potentially the general structure of the idiomorph is the same also

536 in these families.

537 The allele frequencies along the genome of the L. lupina 538 metagenome suggest hybridization 539 To explore the triploid-like signal in our L. lupina metagenome, we mapped its reads to the

540 pure culture assembly. We assigned the SNPs used for the MAF distribution as either matching

541 the pure culture (“lupina” allele) or having an alternative allele (“other”) and produced an

542 unfolded allele frequency distribution of the metagenome sample (fig. 4). We found that the

543 allele frequency distribution of the “lupina” allele is not identical to that of the “other” allele. In

544 fact, there seems to be a tendency for the “lupina” allele to be in higher frequency (⅔ or 1).

545 Hence, we plotted the distribution of the “other” allele along the pure culture assembly and

546 found that, while there are large regions polymorphic at ⅓ and ⅔ frequencies, there are also

547 areas of loss of heterozygosity (LOH), mostly in favor of the “lupina” allele (fig. 5). The

548 observed genome-wide MAF distribution patterns suggests that there are three haplotypes

549 within the L. lupina metagenome. Two of these haplotypes are more similar to each other, as

550 reported for the McKenzie L. lupina long-read data, and they correspond to the pure culture L.

551 lupina. Such signal is highly suggestive of hybridization between L. lupina and another Letharia

552 lineage, presumably leading to triploidy (i.e. allopolyploidization) or to a triploid-like condition

553 (n+n+n or 2n+n). bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

0.05

0.04

allele 0.03 other

0.02 lupina Frequency of sites 0.01

0.00 0.00 0.25 0.50 0.75 1.00 Minor allele frequency 554

555 Figure 4. Unfolded allele frequency distribution of the L. lupina metagenome, with alleles classified as

556 either identical to that of the L. lupina pure culture (“lupina”) or an alternative allele (“other”).

557

558 The reads of the L. lupina metagenome that map to the scaffold containing the MAT locus also

559 maintain a ⅔ proportion in favor of the “lupina” allele (fig. 6). BLAST searches revealed that

560 our L. lupina metagenome contains both MAT idiomorphs. As expected, the coverage of the

561 MAT1-1 idiomorph is around ⅔ of the mean (fig. S13), confirming that two haplotypes are

562 MAT1-1 and the other haplotype is MAT1-2. In addition, we found both idiomorphs in the raw

563 reads of the McKenzie L. columbiana metagenome, but not in the McKenzie L. lupina one, in

564 which we only found MAT1-1. Thus, the triploid-like Letharia thalli can have either both or a

565 single mating type.

566 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

567

568 Figure 5. Frequency of the “other” (alternative) allele in the reads of the L. lupina metagenome when

569 mapped to the L. lupina pure culture assembly (only the eight largest scaffolds are shown). Golden

570 points represent the raw allele frequency at each site, while the solid line connects the median allele

571 frequency of non-overlapping 7.5 kb-long windows. Sites and windows overlapping with repetitive

572 elements were discarded (missing data). Representative areas with long tracks of loss of heterozygosity

573 (LOH) are marked with arrows.

574 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

575

576 Figure 6. Frequency of the “lupina” and the “other” (alternative) allele in the reads of the L. lupina

577 metagenome along the scaffold containing the MAT locus in the L. lupina pure culture assembly. Points

578 represent the raw allele frequency at each site, while the solid lines connect the median allele frequency

579 of non-overlapping 2.5 kb-long windows. Sites and windows overlapping with repetitive elements were

580 discarded (missing data). The genes within the MAT1-1 locus are marked with black rectangles.

581

582 In an effort to determine the unknown parental lineage, i.e., the origin of the “other” allele, we

583 compared the “lupina” allele and the “other” allele with their homologs in L. columbiana, L.

584 ‘rugosa’ and L. vulpina. Surprisingly, in areas where the metagenome remains polymorphic, the

585 “other” allele is twice as divergent from any other known Letharia lineage as the “lupina” allele

586 (left panel of fig. 7). Specifically, ~31% of the “lupina” alleles are different from any other

587 sampled Letharia, while the “other” allele is different ~69% of the time. This finding suggests

588 that the unknown parental lineage is considerably more divergent than any of the sampled

589 species. Surprisingly, this pattern is inverted at sites where LOH occurred in favor of the “other”

590 allele (middle of fig. 7), which suggests that the L. lupina pure culture itself (used as reference)

591 has some tracks of highly divergent regions introgressed from the unknown Letharia lineage.

592 Notably, at positions where LOH occurs in favor of the “lupina” allele, the proportion of different

593 sites is intermediate in value (right of fig. 7). This might be the result of averaging sites of

594 different ancestry (either “lupina” or “other”) that cannot be recognized as such because they

595 are also introgressed in the reference. Moreover, L. columbiana has a higher proportion of bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

596 sites that are different from the “lupina” allele compared to L. ‘rugosa’ and specially L. vulpina.

597 This pattern might reflect the fact L. columbiana has a slightly higher genetic distance to L.

598 lupina than the other species sampled here (fig. 1; fig. S2), but it raises the question as to why

599 such a difference is not observed in the other two categories of sites.

600

Polymorphic (n = 286512) Fixed other (n = 132464) Fixed lupina (n = 848914) 1.00

0.75 ● ● ● allele

0.50 ● other

● ● ● lupina 0.25

0.00 L.columbiana L.rugosa L.vulpina L.columbiana L.rugosa L.vulpina L.columbiana L.rugosa L.vulpina Proportion of sites that have different alleles different Proportion of sites that have Species 601

602 Figure 7. Relative genetic distance between the alleles present in the L. lupina metagenome ("lupina" or

603 "other") and the sampled Letharia species: L. columbiana, L. ‘rugosa’ and L. vulpina. All the genotyped

604 sites can be classified as those that are polymorphic in the L. lupina metagenome (that is, both the

605 "lupina" or "other" alleles are present), or as sites that are fixed for either the "other" or the "lupina" allele

606 (sites with loss of heterozygosity). If the subgenomes within the L. lupina metagenome belong to the

607 same species, the proportion of sites that have different alleles from those present in the sampled

608 Letharia species should be roughly the same for both the “other” and “lupina” alleles. However, such a

609 pattern is not observed, suggesting that the “other” allele originates from an unknown, highly divergent

610 Letharia lineage. n: number of genotyped sites. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

41 44 7 6 98

6 6 12 Letharia lupina MAT1-2 53 Letharia lupina MAT1-1 Letharia vulpina MAT1-2 Letharia vulpina MAT1-1 Collection site 611 612 Figure 8. A map showing the collection sites and mating-type distribution in the sorediate species in the

613 U.S.A., the Swiss and Italian Alps and Sweden. The numbers correspond to the number of sampled

614 thalli (n), and the areas of the pie charts are proportional to n.

615

616 Regional distribution of mating types in sorediate species shows 617 variation in reproductive mode 618 619 We amplified a single idiomorph in all specimens collected with two exceptions: in one L.

620 vulpina from Italy and one L. lupina from Switzerland we amplified both idiomorphs (Table S1).

621 With the aim to investigate the primary reproductive mode of the sorediate species (L. lupina

622 and L. vulpina), we characterized the regional occurrence of their mating types. In the U.S.A.,

623 even when only a few individuals of each species were collected from the same locality, both

624 mating types were sampled. The mating-type ratio in the U.S.A. (when all American samples of

625 a species were treated as one region) of neither sorediate species deviated from a 1:1 ratio bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

626 (exact binomial test with 95% confidence interval: N=85; p=0.8 for L. lupina and N=13; p=1 for

627 L. vulpina). Furthermore, both mating types were present in L. lupina in the Alps and the

628 mating-type ratio in these populations did not deviate from 1:1 (N=18; p=0.2). However, the

629 ratio of mating types in L. vulpina is significantly skewed from a 1:1 ratio (N=59; p=1.8e-10) in

630 the Alps, and in Sweden only one mating-type was found in 98 samples (fig. 8, Table S1).

631 These findings suggest that sorediate species reproduce sexually in U.S.A, but that L. vulpina

632 reproduces sexually at very low frequency in Europe.

633 Discussion

634 Our combined approach of next-generation sequencing from representative samples, along

635 with regional PCR surveys of the mating types, has allowed us to revise key aspects of the

636 reproductive biology of Letharia. Specifically, we were able to determine haploidy in most

637 genomic samples and a triploid-like signal in one, we revealed conserved heterothallism across

638 the genus, and we uncovered differences in mating type frequencies between North American

639 and European specimens of sorediate species. McKenzie et al. (2020) qualified the genome

640 architecture of Letharia as “enigmatic”, referring to the triploid-like nature of their two samples.

641 Here we further point out the unexpectedly high content of repeated elements in the contig 1 of

642 their L. lupina assembly. Future research should aim at clarifying if this contig is a core feature

643 of the Letharia genome, or if it represents some form of accessory or B chromosome, which

644 typically have a high repeat content (Bertazzoni et al. 2018; Ahmad and Martins 2019).

645

646 In contrast to previous studies addressing ploidy levels (Tripp et al. 2017; McKenzie et al.

647 2020), we made use of a pure culture derived from a single spore, which removes confounding

648 factors related to the presence of other unrelated fungi within the thallus, and to unknown

649 effects of growing alongside siblings. In effect, our pure culture represents the immediate state

650 after sexual reproduction in L. lupina. The fact that the metagenomes of L. columbiana, L. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

651 ‘rugosa’, and L. vulpina are also haploid further suggests that a single spore is enough, with

652 symbionts, for the development of a lichen thallus, and that a Letharia lichen is not formed from

653 a collection of spores as it has been suggested for other species (Jahns and Ott 1997; Murtagh

654 et al. 2000). Crucially, in their classic study, Kroken & Taylor (2001b) separately genotyped

655 ascomata and maternal thalli from four specimens of L. gracilis and two of L. lupina using

656 restriction enzymes. They consistently found heterozygosity in the fruiting bodies but not in the

657 thalli, which is indicative of outcrossing and concordant with a dominant haploid life stage.

658 Moreover, we detected very few specimens in our regional sampling with both mating types by

659 PCR. Hence, it appears that the majority of Letharia thalli in nature are in fact haploid.

660

661 The conspicuous pattern of divergent haplotypes within both our and the McKenzie L. lupina

662 metagenomes is consistent with hybridization. A triploid-like hybrid could be formed in a

663 number of ways, but here we highlight some possible models (fig. 9). It has been proposed

664 before that multiple genotypes could coexist within a single lichen thallus, forming chimeras

665 (Jahns and Ott 1997; Dyer et al. 2001; Mansournia et al. 2012), which in the case of the

666 triploid-like L. lupina thalli would be of three individuals of two divergent lineages (fig. 9Ai).

667 However, the read count ratios along the genome are perfectly aligned to triploidy in all three,

668 individually prepared metagenome samples, and there is no reason to expect that different

669 genotypes growing together in a single thallus without cell fusion would maintain such strict

670 proportions of hyphal biomass and DNA content in different thalli. Rather, it is more plausible

671 that the metagenomes are indeed triploid-like in one of three nuclear configurations: 3n, 2n + n,

672 or n+n+n. In plants and animals, hybridization is a common cause of triploidization via

673 fertilization by unreduced gametes, which leads to triploid offspring that are often unstable and

674 aneuploid (Köhler et al. 2099; Mable 2003). However, the absence of diploid-like variants in our

675 data suggest that there is no obvious aneuploidy, and hence the triploid-like configuration is

676 genome-wide. Moreover, the dominant life stage of fungi is haploidy, and hence the products of bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

677 meiosis are directly the offspring (spores). It follows that triploidization might occur during

678 meiosis and spore development.

679

680 An alternative explanation, to our knowledge absent from the literature, is the faulty packaging

681 of nuclei into ascospores (fig. 9Aii). Normally, the packaging of nuclei during spore

682 development is strictly controlled, but it is known from model ascomycetes like Podospora and

683 Neurospora that occasionally this fails, and spores with an unexpected number of nuclei are

684 accidentally formed at low rates (Ames 1934; Page 1936; Raju and Newmeyer 1977; Raju

685 1992). Crosses between different species have been shown to induce or exacerbate

686 irregularities in spore development (Perkins et al. 1976; Perkins 1994; Jacobson 1995).

687 Notably, our L. lupina and the McKenzie L. columbiana metagenomes contain both mating

688 types (in a 1:2 ratio), but the McKenzie L. lupina metagenome has only one. These two types

689 of triploid-like cases could be formed by differential segregation of the MAT locus along the

690 ascus: under first division segregation, all nuclei in an accidentally trikaryotic spore (two mitotic

691 sister nuclei plus one non-sister nucleus) can be of the same mating type, while under second

692 division segregation such a spore would have two sister nuclei of one mating type and a non-

693 sister nucleus of the other mating type (fig. 9Aii). Yet another model where fertilization occurs

694 from one diploid parent would require the absence of meiosis to preserve the triploid (3n)

695 configuration in the spores (fig. 9Aiii). However, given a functional heterothallic life-style,

696 triploid spores should always have both mating types, which does not match the observation in

697 the McKenzie L. lupina metagenome. Finally, the triploid-like lichens could be produced via

698 vegetative fusion of different individuals of the ascomycete lichen symbiont (fig. 9Aiv), forming

699 heterokaryons (Tripp and Lendemer 2018). Nonetheless, fusion of ascomycetes not involved in

700 lichens typically leads to dikaryons (n+n) only. In the absence of knowledge on vegetative

701 incompatibility systems in lichenized fungi, it is unclear how likely heterokaryosis would lead to

702 trikaryons (n+n+n). bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

703

704 Note that these different models lead to specific predictions on the distribution of allele

705 frequency variation along the genome (fig. 9B). For example, in the trikaryotic spore two of the

706 genotypes should be identical, as they are the sister products of mitosis, and there should be

707 perfectly aligned tracks of swapped ancestry between the subgenomes produced by

708 recombination (left side of fig. 9B). This would not be the case in the heterokaryon produced

709 by fusion unless the individuals involved are siblings (right side of fig. 9B). Moreover, the

710 trikaryotic spore, the heterokaryon produced by fusion, and even the chimera would not be

711 able to recover the observed distribution of allele frequencies in the absence of introgressed

712 tracks in the parentals (compare fig.9Bi with fig. 9Bii). With the available data, it is not

713 possible to distinguish between these different scenarios, but our results strongly suggest that

714 there is extensive hybridization and introgression between L. lupina and an additional Letharia

715 lineage. In addition, it remains unclear if triploid-like Letharia individuals are infertile or if the

716 presence of both mating types allows for self-compatibility.

717

718 Historically, the of the Letharia genus has been chronically plagued by cryptic

719 lineages and blurry species definitions (Kroken and Taylor 2001a; Altermann 2009; Altermann

720 et al. 2014; Altermann et al. 2016), partially due to their recent diversification (judging by their

721 high genetic similarity), which inevitably leads to high incomplete lineage sorting. It is likely that

722 hybrids and gene flow further complicate matters. Clearly, L. lupina is not the only taxon

723 producing triploid-like thalli. Specifically, while we found that our L. columbiana metagenome is

724 haploid, the sample of McKenzie et al. (2020) turned out to be triploid-like. In addition, one of

725 the L. vulpina samples assayed here with PCR had both mating types. Hence, it is possible

726 that hybridization and formation of triploid-like individuals occurs in other lineages of the genus,

727 contributing further to the taxonomical difficulty in this group. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

A) Models to expl ain a trip loid -li ke signal Assump tions

i) Chime ra n n - Any n uclear ra tio - Diff erent ra tios in diff erent n sections of the tha llus - Diff erent mycelia ca n coexist No fus ion during the tha lllus development

ii) Mispackagi ng of n ucl ei i nto ascospore s nn+n+n Normal spore Faulty spore Meiosis Mitosis format ion format ion Sa me MAT - Trikaryosis with nu clear n FDS of MAT or ra tio of 1: 2 is main ta ined Microconidium Fertilization in each cell of the tha llus Trichogyne 2n n - Hybrid cross a ff ects spore packaging Apothecium karyoga my Both development MAT or SDS of MAT

iii ) Diplo id fe rti lizatio n 3n - 3n in the whole tha llus 2n No meiosis mitosis? Microconidium Fertilization - No reduction through meiosis Trichogyne 3n - Triploid conta ins both MAT n n idiomorphs (unless MAT wa s non-f unctiona l to begin with) Apothecium karyoga my 2n development Diploid and h aploid Meiosis nuclei? Aneuploidy? Abortion?

iv) Vege tative fusion

n n n+n+n - Trikaryosis with nu clear ra tio of 1: 2 is main ta ined n in each cell of the tha llus

Fu s i o n - Weak or a bsent heterokaryon incompatibility

B) Loss of heter ozygo sity und er d iffe re nt mod els i) Pure parentals

1 1

F( ) 2/3 F( ) 2/3 1/3 1/3 0 0

ii) Parentals with introgression

1 1

F( ) 2/3 F( ) 2/3 1/3 1/3 728 0 0 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

729 Figure 9. (Previous page) How is a triploid-like hybrid of Letharia formed? (A) We present four main

730 models that can explain a triploid-like pattern where only two out of three (sub)genomes belong to the

731 same species. More complex models or combinations are also possible. The L. lupina allele ancestry is

732 represented by green and the unknown Letharia lineage by yellow. Two shades of green, light and dark,

733 correspond to different variants of the same species. In model (i) chimeric thalli formed from three

734 independent haploid (n) mycelia that do not undergo cell fusion. In model (ii) normal sexual reproduction

735 with fertilization involves haploid parents, followed by a standard meiosis but with mispackaging of nuclei

736 during ascospore formation, creating trikaryotic (n+n+n) spores. These spores can have a single mating

737 type if the MAT locus undergoes first division segregation (FDS), or both mating types if it follows second

738 division segregation (SDS). Model (iii) implies sexual reproduction where one of the parents is diploid

739 (2n), leading to a triploid (3n) zygote during karyogamy, and a failure of meiosis to preserve the triploid

740 condition in the resulting spores. In the last model (iv), vegetative fusion of three individuals leads to

741 trikaryotic mycelia. For each model, a number of testable assumptions is given, highlighting in red those

742 that are unlikely or do not conform to the observed data. Model (i) is unlikely given the triploid-like

743 observation in three independently sequenced lichen thalli (ours and McKenzie’s). Model (iii) requires

744 multiple events (e.g. previous diploidization of one of the parentals, no meiotic reduction, a bypass of the

745 heterothallic system to produce spores of the same mating type in case of the McKenzie L. lupina), and

746 given the occurrence of meiosis after karyogamy, aneuploidy is likely to occur in the resulting spores,

747 which is also incompatible with the observed data. Hence, we interpret the models (ii) and (iv) as the

748 most likely hypotheses, since they can accommodate thalli with a single and two MAT idiomorphs. Still,

749 even under those two models, previous introgression between the two parentals is necessary in order to

750 recover the observed areas of loss of heterozygosity along the genome, as shown by schematic

751 representations in (B). The frequency of the yellow allele is shown along cartoon chromosomes in the

752 absence (i) or presence (ii) of introgression. The black dot on the chromosomes represents the

753 centromere or any repetitive area that leads to missing data.

754

755 In a large-scale study, Pizarro et al. (2019) found that a single MAT idiomorph was found in

756 representative samples across Lecanoromycetes, consistent with widespread heterothallism. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

757 As expected, all Letharia lineages studied here also follow this pattern. We further

758 characterized both idiomorphs and determined the size of the non-recombining region (3.8

759 kbp), which has not been studied but for a handful of taxa (Scherrer et al. 2005; Dal Grande et

760 al. 2018; Armaleo et al. 2019; Pizarro et al. 2019). Moreover, we refined the annotation of the

761 auxiliary genes present in Letharia using transcriptomic data and homology searches to other

762 fungal classes. This analysis confirmed that synteny is conserved in both idiomorphs for

763 Parmeliaceae, Cladoniaceae, and Umbilicariaceae, and potentially for other families within

764 Lecanoromycetes, making Letharia an adequate reference for the mating type architecture of

765 this group. However, the MAT locus in Collemataceae and Icmadophilaceae seems to be

766 different, highlighting that the MAT locus architecture is not conserved across all lichen-forming

767 Lecanoromycetes.

768

769 Since heterothallism implies sexual self-incompatibility, the relative proportion of mating-types

770 in a population is critical for a population’s ability to reproduce sexually (Consolo et al. 2005;

771 Olarte et al. 2012; Saleh et al. 2012; Singh et al. 2012). Here we determined that both

772 sorediate species, L. lupina and L. vulpina, likely reproduce sexually often enough to maintain

773 balanced MAT frequencies in North American populations, despite the fact that apotheciate

774 thalli are relatively rare. In Europe, nonetheless, the MAT frequencies are biased, to the point

775 that only one mating type was found in the Swedish L. vulpina populations. The extreme

776 scarcity of apothecia and low genetic variation in Sweden (Högberg et al. 2002) could be

777 related to the lack of mating partner, as the populations may be pushed to predominantly clonal

778 reproduction. Alternatively, the absence of the MAT1-1 idiomorph might reflect the preferred

779 route of reproduction as mainly clonal propagation can skew the local distribution of mating

780 types even if the populations rarely recombine (Singh et al. 2015). Clonality at the edge of a

781 species’ geographical range is commonly observed across different organism groups and can

782 have ecological advantage over short time (e.g., Billingham et al. 2003; Kawecki 2008; Meloni bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

783 et al. 2013). In the Swedish range edge populations of L. vulpina, clonality may be especially

784 beneficial as it preserves the combination of the compatible symbionts adapted to the local

785 habitat, in addition to reassuring the survival of the individual and the species. However,

786 despite the short-term advantages of clonality, the lack of recombination should eventually lead

787 to limited evolutionary potential of the species and accumulation of deleterious mutations

788 (Kawecki 2008). Currently, there seems to be no, or very low, potential for sexual reproduction

789 of L. vulpina in the studied Swedish populations, which in combination with the low genetic

790 diversity may reduce the fitness and the adaptive potential of these lichens in a changing

791 environment.

792

793 Conclusions

794 795 Hybridization and polyploidization are emerging as significant influencers of fungal evolution

796 and speciation, based largely on the study of pathogens or human-associated species (Allman

797 et al. 2011; Marcet-Houben and Gabaldón 2015; Depotter et al. 2016; Steenwyk et al. 2020).

798 The results presented here, along with recent studies (Keuler et al. 2020; McKenzie et al.

799 2020), suggest that lichen fungal symbionts are also subject to these phenomena. The genus

800 Letharia is a classical model of evolutionary biology amongst lichen-forming species thanks to

801 the work of S. Kroken, J. W. Taylor and others (Kroken and Taylor 2001a; Kroken and Taylor

802 2001b; Högberg et al. 2002; Altermann et al. 2014; Altermann et al. 2016). Now, Letharia is

803 elevated into the genomics era, refueling research on lichen reproductive biology. In the

804 process new aspects of its biology are discovered, which are indeed enigmatic.

805 Supplementary Figure and Table Captions bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

806 Figures 807 Letharia cf. columbiana U 072 Letharia cf. columbiana U 154 808 Letharia lucida BH10 Letharia columbiana KU745838.1 Letharia cf. columbiana U 039 Letharia cf. columbiana metagenome Letharia cf. columbiana U 023 809 Letharia cf. columbiana U 152 L. columbiana Letharia lucida CL3 Letharia cf. columbiana U 069 Letharia lucida SG1 Letharia cf. columbiana U 002 Letharia columbiana SA280b 810 Letharia columbiana SA280e Letharia cf. columbiana U 133 Letharia columbiana SA280d Letharia cf. rugosa U 124 Letharia rugosa SA280a Letaria cf. rugosa metagenome 811 Letharia vulpina KR2a Letharia rugosa CL1 Letharia rugosa U 046 Letharia vulpina NO1 Letharia vulpina MD2a 812 Letharia vulpina SP2b Letharia vulpina U 049 Letharia vulpina CC1b Letharia vulpina metagenome Letharia vulpina ID7a Letharia rugosa CP2 813 Letharia aff. columbiana KU745811.1 L. vulpina Letharia vulpina FP1a L. rugosa Letharia vulpina U 019 Letharia vulpina QH1 1a Letharia vulpina CC1a 814 Letharia vulpina U 083 Letharia cf. vulpina It 017 Letharia cf. vulpina Sw 006 Letharia aff. vulpina PU14 Letharia vulpina It 003 Letharia vulpina PV1a 815 Letharia vulpina WP1a Letharia vulpina TK1a Letharia vulpina CC5a Letharia rugosa V1 Letharia cf. barbata U 125 816 Letharia lupina ID6a Letharia barbata SA279d Letharia lupina KB3a Letharia lupina U 001 Letharia lupina AB5a Letharia sp. KU745847.1 817 Letharia lupina LA2 Letharia lupina FM3a Letharia sp SJ5a Letharia lupina TP3b Letharia lupina U011 818 Letharia lupina BS1a Letharia lupina pure culture Letharia lupina FM5a Letharia lupina U 003 Letharia lupina BH12 Letharia cf. lupina U 056 819 Letharia lupina AB2b Letharia lupina LM5a Letharia lupina SJ6a Letharia barbata U 073 Letharia lupina U 005 820 Letharia lupina TC1 Letharia barbata CL2 Letharia barbata U 071 Letharia lupina BH4 Letharia lupina U 065 Letharia cf. barbata U 077 821 Supplementary Figure 1. Unrooted maximum likelihood tree Letharia columbiana AF115762.2 L. lupina / L. barbata Letharia barbata TB1 Letharia aff. columbiana KU745829.1 Letharia lupina AB3b Letharia sp. VW3 822 of ITS with all obtained variants from this study (marked with Letharia lupina EP1S Letharia cf. lupina U 042 Letharia barbata TC1 Letharia lupina AB11a Letharia lupina U014 Letharia aff. columbiana KU745809.1 823 U, Sw or It) and previously published ones, the code referring Letharia barbata LP1 Letharia lupina U010 Letharia barbata U 025 Letharia lupina AB4a Letharia lupina MN1 Letharia lupina CR2 824 to the sequences from Kroken & Taylor (2001a), Alterman et Letharia lupina AB1c Letharia lupina DH6 Letharia lupina BH1 Letharia lupina YA1 Letharia lupina U004 825 al. (2014), Alterman et al. (2016) or NCBI accession number. Letharia lupina CC4b Letharia lupina CY5a Letharia lupina U090 Letharia lupina SK3 Letharia lupina Ya8a Letharia lupina BH11 826 The unique variants obtained in this study are marked red. Letharia lupina U 116 Letharia lupina KB3b Letharia lupina BH9 Letharia lupina U 012 Letharia lupina TC5a 827 The Italian variant marked with * is assigned to L. vulpina Letharia cf. gracilis U 146 Letharia gracilis KU745833.1 Letharia vulpina WC2a Letharia gracilis K148 L. gracilis / Letharia gracilis K141 L.vulpina Letharia vulpina It 006 * 828 based on algal ITS. Branch lengths are proportional to the Letharia gracilis SY1 Letharia lupina DA1 Letharia aff. vulpina WN1 Letharia cf. columbiana U 151 Letharia cf. columbiana U 108 829 scale bar (nucleotide substitutions per site). Letharia lucida T3 Letharia columbiana SA280f Letharia lucida DP2 L. columbiana Letharia columbiana U 135 Letharia columbiana U 107 Letharia cf. columbiana U 132 Letharia cf. columbiana U 153 Letharia cf. columbiana U 123

0.005 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Letharia_columbiana_8827v1 46 Letharia_columbiana_EP1 96 47 Letharia_columbiana_8829v1 50 Letharia_columbiana_ST1 49 63 Letharia_columbiana_DP2 52 Letharia_columbiana_L20a Letharia_columbiana_T3 85 Letharia_columbiana_SG1 61 Letharia_cf_columbiana_MG 70 Letharia_columbiana_8825v1 56 Letharia_columbiana_9781 51 Letharia_columbiana_8828v1 Letharia_columbiana_CL3 Letharia_columbiana_barbata_TB1 8 4 Letharia_columbiana_9783 13 54 Letharia_columbiana_barbata_CL2 2 Letharia_columbiana_barbata_8816v1 Letharia_columbiana_barbata_9099 3 Letharia_columbiana_barbata_TB2 0 Letharia_columbiana_barbata_ST1 Letharia_columbiana_barbata_9096 06 Letharia_columbiana_barbata_LP1 8 Letharia_columbiana_barbata_OC1 0 Letharia_columbiana_barbata_TC1 0 Letharia_columbiana_9771 Letharia_columbiana_barbata_9779 0 Letharia_columbiana_barbata_9100 26 96 1 Letharia_columbiana_barbata_8819v1 65 Letharia_columbiana_rugosa_CL1 Letharia_cf_rugosa_MG 141 Letharia_columbiana_barbata_8830v1 65 Letharia_columbiana_barbata_8818v1 58 1 Letharia_columbiana_barbata_9097 60 Letharia_columbiana_barbata_9776 Letharia_columbiana_barbata_9778 13 Letharia_columbiana_barbata_8826v1 4 Letharia_columbiana_barbata_9098 18 Letharia_columbiana_barbata_9774 50 Letharia_columbiana_barbata_9093 Letharia_columbiana_barbata_9094 2 Letharia_gracilis_B3 Letharia_gracilis_B2 0 51 Letharia_gracilis_B5 0 Letharia_gracilis_K141 Letharia_gracilis_B1 0 Letharia_gracilis_K1 20 Letharia_gracilis_9782 0 Letharia_gracilis_B4 Letharia_gracilis_K148 7 Letharia_columbiana_rugosa_8824v1 1 95 Letharia_columbiana_rugosa_PM1 20 Letharia_columbiana_rugosa_PM2 5 Letharia_columbiana_rugosa_V1 12 55 73 Letharia_columbiana_rugosa_CP1 0 Letharia_columbiana_9784 Letharia_columbiana_rugosa_CP2 90 Letharia_vulpina_SW1 0.004 56 Letharia_vulpina_H1 57 Letharia_vulpina_9789 32 Letharia_vulpina_AV2 12 Letharia_vulpina_9791 0 Letharia_vulpina_CW8 2 15 Letharia_vulpina_IT2 70 Letharia_vulpina_IT1 54 Letharia_vulpina_CC1b_MG 3 Letharia_lupina_8820v1 26 Letharia_lupina_9795 0 Letharia_lupina_AP1 Letharia_lupina_T23b 0 Letharia_lupina_8821v1 0 4 Letharia_lupina_DP1 18 Letharia_lupina_8823v1 Letharia_lupina_L22b 13 Letharia_lupina_ST2 Letharia_lupina_BS1a_pure_culture 0 46 Letharia_lupina_TC1 46 Letharia_lupina_9091 0 Letharia_lupina_9780 56 Letharia_lupina_9787 Letharia_lupina_SH1 0 36 Letharia_lupina_BC1 68 Letharia_lupina_9092 0 53 Letharia_lupina_CL1 Letharia_lupina_T4 0 Letharia_lupina_T1 0 Letharia_lupina_MV1 0 Letharia_lupina_SG1 1 Letharia_lupina_T23a 1 Letharia_lupina_9794 830 Letharia_lupina_TB1

831 Supplementary Figure 2. Maximum likelihood phylogeny of the Letharia genus produced from

832 11 concatenated markers. Colours highlight specimens with our Illumina data and other

833 sequences are from previous studies. Numbers above branches represent bootstrap support

834 values. Branch lengths are proportional to the scale bar (nucleotide substitutions per site). The

835 tree was rooted arbitrarily with the highly supported clade containing the metagenome of L.

836 columbiana. MG: metagenome.

837 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

contig_2 contig_3 contig_4 contig_5 contig_6 contig_7 contig_8 contig_9 contig_10 contig_11 contig_12 contig_13 contig_14 contig_15 contig_16 contig_17 contig_18 contig_19 contig_20 contig_21 contig_22 contig_23contig_24contig_25contig_26contig_27contig_28contig_30 ● ● ● ●●

●● ●●●

●● ●●

●●●●● ●●

●●●

●●●● ●● ● ● ●

● ●● ● ● ● ● ●●● ● ●●●

● ●● ●● ● ●● ●●

●●●

●●● ● ●● ● ●●● ●●● ●●●

●● ● ●● ● ● ●● ●● ●● ●● ●●●

●●●● ●● ● ●●●●

● ●● ●● ● ●●● ●●●● ●●● ●●● ●●●●● ●● ● ●●●

● ●● ●● ● ●●● ●●● ●●●● ●

●●● ●●●

●●●

●●●

●●● ●●●

●●

●● ●●● ●● ●●

●● ● ●●●● ● ●● ●●

●●● ●●●● ●● ● ●● ●●● ●●●

●● ●●● ●● ●● ●● ●●

● ● ● ● ● ●● ●

●● ●● ●●● ●● ●●●

●●● ●●●● ●●● ● ●●●

●● ●●●●● ●● ●●● ●●● ● ●●● ●●

●●● ● ● ●

●●● ●●●

●●●

●● ●● ●● ●● ●● ● ●●● ●●● ●●●● ● ●●

●●●

●● ●●● ●● ● ●● ●●● ●● ●●● ●● ● ●● ●

● ●●

●● ●●● ●●

●●●●

●● ●●● ● ●

pure − culture assembly ● ● ●● ● ●● ●● ●● ●● ● ● ● ●●●● ● ●● ● ●●●

●● ●●●● ●

●●● ● ●● ●

●●

●● ● ●●

L. lupina ● ●

●●

●● ● ●● ● ●●● ●●● ●●● ●● ● ●●●●

● ●

● ●●●●

●●● ●●●

●●● ●●●● ●●

●●●

●● ●●●●● ● ● ● ●●● ●● ●● ●●●●

●● ● ●●●● Scaffolds in the Scaffolds ●● ● ●

●● ● ● ●●

●● ●● ●●● ●● ● ●●

●● ● ●●●● ● ●● ● ● ● ●

●● ● ●● ●●● ● ●● ● ●● ●●●

●●●

●●●● ●●●● ●●● ●●● ●●●

●● ● ●●● ●

●●● ●●●● ●●

●● ●●●

●●●●● ●● ●●● ●

●● ●● ●●● ●●

● ●● ●● ● ●●●● ●●● ●● ●●

●● ●●●● ●● ●●●

●●● ●● ●●●

●●●●● ● ●●● ● ● ●●●● ●●

●●●

●● ●● ● ● ●● ●●●● ●●● ●●● ● ●

● ●●● ●● ● ●●● ●●● ● ●

●●●● ●●

●● ● ●● ●● ●●● ●●● ● ●● ●●●●

●● ●● ●●● ● ● ● ● ●●● ●●●● ● ●●

●●●

●●●

●●● ●●● ●●● ●●●●●● ● ●● ●●

●●●● ● ●● ● ●●●● ●●● ●●●● ●●●● ●●●● 838 Contigs in the McKenzie L. lupina assembly

839 Supplementary Figure 3. Dotplot of MUMmer alignments between the assemblies of the

840 McKenzie L. lupina metagenome and our L. lupina pure culture. Scaffold names of the pure

841 culture assembly are omitted for clarity. Blue indicates alignments in the same sense, while red

842 marks inversions. The Contig 1 of the McKenzie L. lupina assembly is excluded since

843 alignments are too short and scarce to be visible.

844

845 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. NODE_1_length_1806572_cov_94.6139 NODE_10_length_962430_cov_96.5332

●● ●● ●● NODE_108_length_80712_cov_68.8444 NODE_11_length_896412_cov_85.115 ●● ●●

●● ●● NODE_112_length_76363_cov_93.7491

●● NODE_114_length_76054_cov_87.6141

● ● ●● NODE_119_length_64917_cov_103.435 ● ● ● ● ● ● ●● NODE_12_length_849307_cov_84.3712

●● NODE_120_length_64401_cov_97.1726 ● ● NODE_13_length_821745_cov_94.4626

●●

● ●

● ● ●● NODE_15_length_789525_cov_88.5282

●● ●● ●●●●

●● ●

● NODE_16_length_772694_cov_112.824

●● ●●

● ●

●● ● ● ●● ● ● NODE_18_length_649330_cov_86.4146

●● ●● NODE_19_length_627902_cov_80.9922

●● NODE_2_length_1652252_cov_91.7152

●●

●●

● ● NODE_20_length_605854_cov_95.0328 ●● NODE_23_length_548310_cov_79.0293

●●

●● ●● ●●

● ● ●● ●● NODE_25_length_510018_cov_89.0879 ●● ●● ●● ●● NODE_27_length_486249_cov_91.309

●● ●● NODE_28_length_473487_cov_82.1946

●● ●●

● ● NODE_29_length_461304_cov_78.9969

●● ●● ●●

●● ●● NODE_3_length_1466808_cov_89.1546 pure − culture assembly

●● NODE_30_length_459496_cov_94.9912

●● NODE_31_length_457871_cov_85.4556 ●● ●● ●● ● ● ● ● ●● ●● ●● ●● NODE_32_length_452574_cov_125.769 L. lupina NODE_33_length_431664_cov_86.6863 NODE_34_length_427061_cov_89.2992

●● ●● ●●

● ● ●● NODE_35_length_415149_cov_85.338 NODE_38_length_354543_cov_95.9375

●● ●● NODE_40_length_321006_cov_86.5116

●● ●● NODE_41_length_305573_cov_85.3819 Scaffolds in the Scaffolds

●● NODE_51_length_251990_cov_84.65

●● NODE_58_length_232401_cov_79.7107

●● NODE_6_length_1305920_cov_104.694

●●

●● ●● NODE_61_length_216292_cov_113.064

●● NODE_63_length_206381_cov_148.173

● ● NODE_66_length_191983_cov_78.4299 NODE_68_length_190041_cov_99.4027

●● ●● ●●

●●

●● NODE_7_length_1185015_cov_91.9694

●● ●● ●● ●● ●● ●● ●● NODE_79_length_150117_cov_93.6199

●●

●● NODE_8_length_1027700_cov_86.0268 ●● ● ● NODE_80_length_147794_cov_113.047

●● ●● NODE_81_length_147732_cov_78.7389 NODE_84_length_146021_cov_85.3318 NODE_89_length_131933_cov_81.0981 ●● NODE_93_length_124789_cov_107.37

●● NODE_97_length_108849_cov_83.4206

●●

●●

●● ● ● ● ●

●●

● ●

●●

●● ●● ●● ●●

●● ●● 0e+00 1e+06 2e+06 3e+06 846 Contig 1 of the McKenzie L. lupina assembly

847 Supplementary Figure 4. Dotplot of MUMmer alignments between the Contig 1 of the

848 McKenzie L. lupina metagenome and our L. lupina pure-culture assembly. Blue indicates

849 alignments in the same sense, while red marks inversions.

850 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

3e+06 Repeat content

2e+06 80

60

1e+06 40 20 Size of contig (bp) Size

0e+00

contig_1contig_2contig_3contig_4contig_5contig_6contig_7contig_8contig_9 contig_10contig_11contig_12contig_13contig_14contig_15contig_16contig_17contig_18contig_19contig_20contig_21contig_22contig_23contig_24contig_25contig_26contig_27contig_28contig_29contig_30contig_31 Scaffold in the L. lupina McKenzie assembly 851

852 Supplementary Figure 5. Repeat content proportions of the contigs in the McKenzie L. lupina

853 metagenome assembly.

854

60000 Top repeats in contig_1 ● ● ● ● lelup−families2−10 40000 lelup−families3−45 ● ● lelup−families4−322 ● ● lelup−families4−693 20000

Repeat coverage (bp) Repeat coverage lelup−families4−970

0 ●

NODE_1_length_1806572_cov_94.6139 NODE_2_length_1652252_cov_91.7152 NODE_3_length_1466808_cov_89.1546 NODE_4_length_1408614_cov_87.6619 NODE_5_length_1381321_cov_89.3474 NODE_6_length_1305920_cov_104.694 NODE_7_length_1185015_cov_91.9694 NODE_8_length_1027700_cov_86.0268 Scaffold in the L. lupina pure−culture assembly 855

856 Supplementary Figure 6. Coverage (in total number of base pairs) of the top five most

857 abundant repetitive elements in the Contig 1 from the McKenzie L. lupina metagenome across

858 the biggest scaffolds of the L. lupina pure culture assembly.

859 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

● Top repeats 2e+05 in contig_1

● lelup−families2−10 lelup−families3−45 lelup−families4−322 1e+05 lelup−families4−693

Repeat coverage (bp) Repeat coverage lelup−families4−970 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0e+00 ● ● ● ● ● ● ● ●

contig_1contig_2contig_3contig_4contig_5contig_6contig_7contig_8contig_9 contig_10contig_11contig_12contig_13contig_14contig_15contig_16contig_17contig_18contig_19contig_20contig_21contig_22contig_23contig_24contig_25contig_26contig_27contig_28contig_29contig_30contig_31 Scaffold in the L. lupina McKenzie assembly 860

861 Supplementary Figure 7. Coverage (in total number of base pairs) of the top five most

862 abundant repetitive elements in the Contig 1 across all the other contigs of the McKenzie L.

863 lupina metagenome assembly.

864

13 kb

0 kb 2 kb 4 kb 6 kb 8 kb 10 kb 12 kb L. columbiana

Augustus Augustus prediction Augustus prediction Augustus prediction Augustus prediction

Augustus prediction Augustus prediction Augustus prediction Augustus prediction

GeneMark.mRNA00011 GeneMark.mRNA00012 GeneMark.mRNA00013 GeneMark.mRNA00014 GeneMark GeneMark.gene00011 GeneMark.gene00012 GeneMark.gene00013 GeneMark.gene00014

intron00001.mRNA00011.11 intron00001.mRNA00012.12 intron00001.mRNA00013.13 intron00001.mRNA00014.14

SNAP Letharia_columbiana_MAT-snap.13 Letharia_columbiana_MAT-snap.14 Letharia_columbiana_MAT-snap.15 Letharia_columbiana_MAT-snap.16 Letharia_columbiana_MAT-snap.17 Letharia_columbiana_MAT-snap.13 Letharia_columbiana_MAT-snap.14 Letharia_columbiana_MAT-snap.15 Letharia_columbiana_MAT-snap.16 Letharia_columbiana_MAT-snap.17

Cufflinks CUFF.20.1 CUFF.21.1 CUFF.22.1 CUFF.26.1 metagenome CUFF.26.2

ORF ORF ORF TransDecoder metagenome ORF ORF ORF ORF ORF

ORF

Final models APN2 lorf MAT1-2-1 MAT1-2-14 SLA2 APN2 lorf MAT1-2-1 MAT1-2-14 SLA2

1500

1000

500

Coverage per site per Coverage 0

Metatranscriptome 35000 40000 865 Position (bp)

866 Supplementary Figure 8. Annotation of the MAT locus of Letharia columbiana. The main

867 sources of evidence for annotation are shown. These include the ab initio gene models

868 (Augustus, GeneMark and SNAP), transcript models (Cufflinks), and ORFs from transcripts

869 (TransDecoder). The final models are presented in black. At the bottom, we present the depth

870 of coverage per site from the metatranscriptome.

871 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

12 kb L. lupina 0 kb 2 kb 4 kb 6 kb 8 kb 10 kb 12 kb

Augustus Augustus prediction Augustus prediction Augustus prediction Augustus prediction

Augustus prediction Augustus prediction Augustus prediction Augustus prediction

GeneMark.mRNA00011 GeneMark.mRNA00012 GeneMark.mRNA00013 GeneMark.mRNA00014 GeneMark GeneMark.gene00011 GeneMark.gene00012 GeneMark.gene00013 GeneMark.gene00014

intron00001.mRNA00011.11 intron00001.mRNA00012.12 intron00001.mRNA00014.14

SNAP Letharia_lupina_MAT-snap.13 Letharia_lupina_MAT-snap.14 Letharia_lupina_MAT-snap.15 Letharia_lupina_MAT-snap.16 Letharia_lupina_MAT-snap.17 Letharia_lupina_MAT-snap.13 Letharia_lupina_MAT-snap.14 Letharia_lupina_MAT-snap.15 Letharia_lupina_MAT-snap.16 Letharia_lupina_MAT-snap.17

CUFF.14075.1 CUFF.14076.1 CUFF.14077.1 CUFF.14078.1 CUFF.14079.1 CUFF.14080.1 CUFF.14081.1

Cufflinks CUFF.14080.2 pure culture CUFF.14080.3

Cufflinks CUFF.38.1 CUFF.39.1 CUFF.41.1 metagenome CUFF.38.2 CUFF.39.2

CUFF.40.1

TransDecoder ORF ORF

pure culture ORF ORF

ORF ORF ORF

TransDecoder ORF ORF ORF

metagenome ORF ORF

ORF

ORF

Final models APN2 lorf MAT1-1-7 MAT1-1-1 SLA2

APN2 lorf MAT1-1-7 MAT1-1-1 SLA2

2000

1500

1000

500

Coverage per site per Coverage 0

Pure-culture

transcriptome 50000 55000 872 Position (bp)

873 Supplementary Figure 9. Annotation of the MAT locus of Letharia lupina. The main sources of

874 evidence for annotation are shown. These include the ab initio gene models (Augustus,

875 GeneMark and SNAP), transcript models (Cufflinks), and ORFs from transcripts

876 (TransDecoder). The final models are presented in black. Note that the transcriptomic models

877 and ORFs for both the pure culture and the lichen thallus (metagenome) are shown. The depth

878 of coverage at the bottom is only for the pure culture.

879

13 kb L. 'rugosa'0 kb 2 kb 4 kb 6 kb 8 kb 10 kb 12 kb

Augustus Augustus prediction Augustus prediction Augustus prediction Augustus prediction

Augustus prediction Augustus prediction Augustus prediction Augustus prediction

GeneMark.mRNA00023 GeneMark.mRNA00024 GeneMark.mRNA00025 GeneMark.mRNA00026 GeneMark GeneMark.gene00023 GeneMark.gene00024 GeneMark.gene00025 GeneMark.gene00026

intron00001.mRNA00023.23 intron00001.mRNA00024.24 intron00001.mRNA00026.26

SNAP Letharia_rugosa_MAT-snap.13 Letharia_rugosa_MAT-snap.14 Letharia_rugosa_MAT-snap.15 Letharia_rugosa_MAT-snap.16 Letharia_rugosa_MAT-snap.13 Letharia_rugosa_MAT-snap.14 Letharia_rugosa_MAT-snap.15 Letharia_rugosa_MAT-snap.16

CUFF.11.1 CUFF.14.1 CUFF.21.1 CUFF.23.1 CUFF.32.3 Cufflinks CUFF.11.2 CUFF.24.1 CUFF.32.2 metagenome CUFF.24.2 CUFF.32.1

CUFF.24.3

ORF ORF ORF ORF ORF

ORF ORF ORF ORF ORF TransDecoder metagenome ORF ORF ORF

ORF

ORF

ORF

Final models APN2 lorf MAT1-1-7 MAT1-1-1 SLA2 APN2 MAT1-1-7 MAT1-1-1 SLA2

800

600

400

200

Coveragesite per 0

Metatranscriptome 150000 153000 156000 159000 162000 880 Position (bp)

881 Supplementary Figure 10. Annotation of the MAT locus of Letharia ‘rugosa’. The main

882 sources of evidence for annotation are shown. These include the ab initio gene models

883 (Augustus, GeneMark and SNAP), transcript models (Cufflinks), and ORFs from transcripts bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

884 (TransDecoder). The final models are presented in black. At the bottom, we present the depth

885 of coverage per site from the metatranscriptome.

886

12 kb L. vulpina 0 kb 2 kb 4 kb 6 kb 8 kb 10 kb 12 kb

Augustus Augustus prediction Augustus prediction Augustus prediction Augustus prediction

Augustus prediction Augustus prediction Augustus prediction Augustus prediction

GeneMark.mRNA00010 GeneMark.mRNA00011 GeneMark.mRNA00012 GeneMark.mRNA00013 GeneMark GeneMark.gene00010 GeneMark.gene00011 GeneMark.gene00012 GeneMark.gene00013

intron00001.mRNA00010.10 intron00001.mRNA00011.11 intron00001.mRNA00013.13

SNAP Letharia_vulpina_MAT-snap.13 Letharia_vulpina_MAT-snap.14 Letharia_vulpina_MAT-snap.15 Letharia_vulpina_MAT-snap.16

Letharia_vulpina_MAT-snap.13 Letharia_vulpina_MAT-snap.14 Letharia_vulpina_MAT-snap.15 Letharia_vulpina_MAT-snap.16

Cufflinks CUFF.10.2 CUFF.1.1 CUFF.3.1

metagenome CUFF.10.1 CUFF.2.2 CUFF.3.2

CUFF.2.1

TransDecoder ORF ORF ORF metagenome ORF ORF ORF

ORF ORF

Final models APN2 lorf MAT1-1-7 MAT1-1-1 SLA2

APN2 lorf MAT1-1-7 MAT1-1-1 SLA2

1000

500

Coverage per site per Coverage 0

Metatranscriptome 30000 32500 35000 37500 40000 887 Position (bp)

888 Supplementary Figure 11. Annotation of the MAT locus of Letharia vulpina. The main sources

889 of evidence for annotation are shown. These include the ab initio gene models (Augustus,

890 GeneMark and SNAP), transcript models (Cufflinks), and ORFs from transcripts

891 (TransDecoder). The final models are presented in black. At the bottom, we present the depth

892 of coverage per site from the metatranscriptome.

893

894

a) MAT1-1 b) MAT1-2 c) ITS 98 Letharia_columbiana_DP2_U_018 Letharia_cf_columbiana_MG 100 91 Letharia_cf_columbiana_U_023 100 Letharia_columbiana_DP2_U_107 Letharia_cf_columbiana_U_023 81 Letharia_columbiana_DP2_U_018 Letharia_columniana_DP2_U_107 Letharia_columbiana_DP2_U037 79 Letharia_cf_columbiana_MG Letharia_columbiana_DP2_U037 Letharia_columbiana_DP2_U_110 Letharia_vulpina_CC1b_MG Letharia_columbiana_DP2_U_110 Letharia_lupina_BH1_U_009 90 Letharia_lupina_BH1_U_086 Letharia_barbata_LP1_U073 Letharia_lupina_BH1_U_004 73 Letharia_vulpina_Fp1a_U_048 84 Letharia_lupina_BH1_U_029 Letharia_cf_barbata_U077 64 Letharia_lupina_Ab5a_U_045 86 Letharia_vulpina_Pv1a_It_008 Letharia_lupina_Ab5a_U_105 Letharia_cf_gracilis_U_146 91 Letharia_lupina_Ab5a_U_006 Letharia_vulpina_CC1b_It_002 Letharia_cf_gracilis_Fp1a_U_148 Letharia_lupina_Ab5a_U_027 92 Letharia_barbata_LP1_U073 Letharia_cf_barbata_U_125 Letharia_rugosa_CL1_U_046 Letharia_cf_barbata_U077 63 Letharia_barbata_CL2_U_071 100 Letharia_rugosa_CL1_U_127 Letharia_cf_barbata_U_125 Letharia_cf_barbata_U_120 60 94 62 Letharia_lupina_BS1a_U_097 Letharia_cf_gracilis_Pv1a_U_147 Letharia_lupina_BS1a_U_078 Letharia_cf_rugosa_MG Letharia_lupina_BS1a_pure_culture Letharia_cf_gracilis_Fp1a_U_144 Letharia_lupina_BS1a_U_068 Letharia_cf_rugosa_U_124 Letharia_lupina_Ab2b_U013 91 89 Letharia_vulpina_Pv1a_It_046 Letharia_lupina_Ab2b_MG 100 Letharia_lupina_Ab2b_U_010 Letharia_barbata_CL2_U_071 57 88 Letharia_vulpina_Pv1a_It_043 Letharia_cf_barbata_U_120 Letharia_vulpina_CC1b_Sw_001 Letharia_lupina_Ab3b_U_008 Letharia_vulpina_Pv1a_U_114 Letharia_lupina_Ab3b_It_016 64 Letharia_vulpina_CC1b_Sw_005 Letharia_lupina_Ab3b_U111 Letharia_lupina_Ab3b_It_016 Letharia_lupina_Ab3b_U_100 69 Letharia_vulpina_Fp1a_U_026 Letharia_cf_gracilis_BH12_U_130 Letharia_vulpina_Fp1a_U_019 Letharia_cf_gracilis_U_146 61 Letharia_vulpina_Fp1a_U_050 Letharia_vulpina_CC1b_Sw_001 71 90 Letharia_cf_rugosa_MG Letharia_lupina_Ab5a_U_027 Letharia_cf_gracilis_BH12_U_130 Letharia_cf_rugosa_U124 66 61 Letharia_rugosa_CL1_U_046 Letharia_lupina_Ab2b_U013 66 Letharia_lupina_BH1_U_086 Letharia_rugosa_CL1_U_127 Letharia_lupina_Ab2b_U_010 Letharia_vulpina_CC1b_MG Letharia_lupina_Ab2b_MG Letharia_vulpina_CC1b_Sw_005 Letharia_lupina_BH1_U_009 Letharia_vulpina_CC1b_It_002 Letharia_lupina_BS1a_pure_culture Letharia_vulpina_Pv1a_U_114 Letharia_lupina_Ab5a_U_006 Letharia_cf_gracilis_Pv1a_U_147 Letharia_vulpina_Pv1a_It_008 Letharia_lupina_Ab5a_U_045 Letharia_lupina_Ab5a_U_105 Letharia_vulpina_Pv1a_It_043 65 Letharia_vulpina_Pv1a_It_046 Letharia_lupina_BH1_U_004 Letharia_lupina_BS1a_U_068 Letharia_vulpina_Fp1a_U_048 Letharia_vulpina_Fp1a_U_050 Letharia_lupina_BH1_U_029 Letharia_lupina_Ab3b_U111 Letharia_vulpina_Fp1a_U_026 Letharia_cf_gracilis_Fp1a_U_148 Letharia_lupina_BS1a_U_097 Letharia_vulpina_Fp1a_U_019 75 Letharia_lupina_Ab3b_U_100 Letharia_lupina_Ab3b_U_008 Letharia_cf_gracilis_Fp1a_U_144 Letharia_lupina_BS1a_U_078

895 0.002 7.0E-4 0.003

896 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

897 Supplementary Figure 12. (previous page) Maximum likelihood trees of the alignment of (A)

898 MAT1-1 (encompassing the MAT1-1-7 and MAT1-1-1 genes along with their intergenic region),

899 (B) MAT1-2 (MAT1-2-1 and MAT1-2-14 with their intergenic region), and the fungal barcode

900 ITS (C). Bootstrap values are shown above the branches. Branches with bootstrap values

901 above 70% are thickened. Colours highlight the four main taxa under study: L. columbiana in

902 purple, L. ‘rugosa’ in orange, L. vulpina in blue and L. lupina in green. Samples with genomic or

903 metagenomic data are in bold. Trees were rooted arbitrarily with L. columbiana. Branch lengths

904 are proportional to the scale bar (nucleotide substitutions per site).

905

906

907

908 Supplementary Figure 13. Depth of coverage of the L. lupina metagenome reads mapped

909 along the MAT idiomorph as displayed by the Integrative Genomics Viewer (IGV) program. The

910 pure culture assembly (MAT1-1) was used as reference. The coverage within the idiomorph

911 drops to approximately two thirds of the normal coverage, as expected for a 2:1 proportion of

912 the MAT1-1 and MAT1-2 subgenomes within the L. lupina metagenome data. Colored columns

913 along the coverage track represent the polymorphic sites and their frequencies.

914 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

915

916 Tables 917 918 Table S1. Letharia specimens used in this study and detailed information on their collection

919 sites, obtained MAT idiomorphs and data repository. The taxon identification is based on the

920 ITS variants and thallus morphology. The ITS variants are named with the codes used by

921 Kroken and Taylor (2001a) and Altermann et al. (2014).

922

923 Table S2. Assembly statistics of the draft assemblies used in this study.

L. lupina L. lupina L. vulpina L. columbiana L. rugosa Assembly Statistics † pure culture metagenome metagenome metagenome metagenome # contigs 4558 36455 10783 17173 41548 Largest contig 1806572 495389 1135240 823758 462828 Total length 47203922 135054511 115181923 107637060 143272164 Total length 42781331 20122947 47648189 43573134 40552224 (>= 50000 bp) N50 457871 6882 26134 15301 6357 N75 183749 2915 10044 5563 2543 L50 30 3904 513 717 3080 L75 75 11611 2375 3804 12228 GC (%) 40.93 45.67 45.66 45.08 45.41 # N's per 100 kbp 0 333.62 34.6 36.75 60.87 924 925 † All metrics are computed for contigs > 500 bp. 926

927 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

928 Table S3. Primers used for the PCR amplification and sequencing of the MAT locus in

929 Letharia. The primers with no annealing temperature were used only for sequencing. The ITS

930 primers were used to amplify the ITS region for all studied specimens (Tuovinen et al., 2019).

Primer Sequence 5’-3’ Position/target Annealing Reference LMATF1 GACACTGAGTTCCGGTGT lorf this study LMATF2 TCGAATTGGCGTTGGCCG lorf 57°C this study LMATF3 TTGCRATTGACYGACCGGAA lorf this study LMATR1 AACTTGCGTCGCTTGTTGCAA flanking 3’ 57°C this study LMATR2 GCCTAGWGACAAGGTTAGCCT Plateau before this study SLA2 APN2LF GCGACTGGAACCCTAATG APN2 60°C this study SLA2LR ATCTTGCACAGGCTAGGC SLA2 60°C this study MATL_8146F AAGAGCTTGCGCAAGATG flanking 3’ this study MATL_8243R GAAGACTGGCATCGTCTC flanking 3’ this study lorf11R TTCCACTACGCTCAGGAG lorf of MAT1-1 this study †MAT11L6F CGAAGTCATACTCAGTGATCC MAT1-1 62°C this study †MAT11L6kR GTCGCCATAGCATAGTTACTC MAT1-1 62°C this study MAT11L8R AGCATAGTTACTCCCATCGTC MAT1-1 this study MAT11L_5243F GTTCGTGGCATGGAGTAG MAT1-1 this study MAT11L_5346R CAGCTGCAGATTACTGGAC MAT1-1 this study lorf12R TTCCGGTCAGTCAATTGC lorf of MAT1-2 this study †MAT12LF1 CGACGCTTCTTYTCVSYDGGCTT MAT1-2-L 57°C this study †MAT12LR1 CGBCCBCCBAAYGCBTTCAT MAT1-2-L 57°C this study MAT12_L_5137F GCAACGAGGCTTGTCAAG MAT1-2-L this study MAT12L_6577R ATCACATCAGCAGCAGTG MAT1-2 this study ITS1F CTTGGTCATTTAGAGGAAGTAA Letharia ITS 57.5 °C Gardes and Bruns (1993)

ITS4A ATTTGAGCTGTTGCCGCTTCA Letharia ITS 57.5 °C Kroken and Taylor (2001a) 931 †used for PCR screen of the mating type ratios in the populations 932

933 bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

934 Acknowledgments

935 This project was supported by a stipend to LB by Leksand municipality, grants from Stiftelsen

936 Lars Hiertas minne and Stiftelsen Extensus to VT, and by a grant (DO2011-0022) from

937 Stiftelsen Oscar och Lili Lamms minne to GT. We acknowledge that the sequencing of

938 metagenomes and pure culture was performed by the SNP&SEQ Technology Platform,

939 Science for Life Laboratory at Uppsala University, a national infrastructure supported by the

940 Swedish Research Council (VR-RFI) and the Knut and Alice Wallenberg Foundation. A

941 collecting permit for Letharia vulpina in Sweden was issued by the county administration board

942 of Dalarna (Dnr: 522-2513-2014). Helmut Mayrhofer and John McCutcheon are thanked for

943 supporting metatranscriptome sequencing at University of Graz and University of Montana,

944 respectively. We thank Bruce McCune and Steve Leavitt for providing us with Letharia

945 specimens, Markus Hiltunen for laboratory assistance, and Gulnara Tagirdzhanova, Sergio

946 Tusso, and Aaron A. Vogan for fruitful discussions about the metagenomes and analyses, and

947 Dr. Markus Wilken about the MAT nomenclature. We also thank Jessica L. Allen and Sean

948 McKenzie for access to their Letharia repeat library and genome annotations.

949

950 Data accessibility

951 The raw genome and transcriptome data are available through NCBI’s Short Read Archive with

952 accession PRJNA523679. The annotated MAT loci are available in GenBank with accessions

953 MK521629-MK521632. The alignments used for the phylogenetic analyses are available in

954 Dryad: XXXX. The custom scripts written to analyse the data are publicly available on

955 https://github.com/johannessonlab/Letharia. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

956 Author contributions

957 VT, LB and HJ designed the study. VT and LB collected the Swedish specimens, TS the

958 American specimens, JN the Italian specimens, HJ the Swiss specimens and YY provided the

959 pure culture of Letharia. VT, LB, DV and TS conducted the laboratory work. SLAV, VT, LB, and

960 DV conducted the data-analyses. GT and HJ provided the research resources. SLAV, VT and

961 HJ wrote the paper. TS, DV, JN and GT commented and contributed to the final version of the

962 manuscript.

963

964 Appendix

965 Identification of Letharia specimens 966 We used the ITS marker as a barcode, together with thallus morphology (sorediate vs.

967 apotheciate) to assign Letharia thalli to the previously described species (Kroken and Taylor

968 2001a; Altermann 2009; Altermann et al. 2016). In addition, we found 18 unique ITS variants

969 (Table S1). The specimens with unique ITS variants were putatively assigned to previously

970 known lineages based on thallus morphology and ML analyses of the ITS and marked with “cf.”

971 in fig. S1. The ITS alignment, including variants unique to this study, and previously published

972 variants, was 555 bp long with 69 variable sites. The ITS trees are in concordance with

973 previously published results (Kroken and Taylor 2001a; Altermann et al. 2014; Altermann et al.

974 2016). Two Italian specimens had a 100% identical match to previously published ITS variants

975 extracted from L. gracilis, L. vulpina and L. lupina (Altermann 2009). These specimens were

976 putatively assigned to L. vulpina based on the identity of the symbiotic alga from the thallus

977 (ITS sequence, data not shown), following Altermann (2009) and Altermann et al. (2016). We

978 had eight specimens putatively assigned to L. gracilis in our dataset (morphology-based

979 identification), but the ITS variants for these did not match the previously published L. gracilis bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

980 variants but instead those of L. lupina or L. vulpina (Table S1). These specimens were

981 designated L. cf. gracilis and their species identity cannot be resolved with current data.

982

983 Training of ab initio gene predictors for Letharia genomes 984 The program SNAP was trained a priori with the L. lupina transcripts produced by Trinity,

985 following the instructions in the SNAP’s README and Campbell et al. (2014). The MAKER

986 pipeline v. 2.31.8 (Holt and Yandell 2011) was run to infer gene predictions directly from the

987 Trinity transcripts (est2genome=1). The resulting GFF files were transformed into the non-

988 standard format ZFF using the script maker2zff. All models with warnings or errors reported by

989 the program fathom where excluded. The program forge was used to estimate parameters for

990 the script hmm-assembler.pl. The resulting HMM training files were used to run MAKER once

991 more, this time with est2genome=0. SNAP was retrained as above using this new MAKER

992 annotation. The HMM training files of the retraining process were used for annotation in

993 downstream analyses. Augustus was trained using the protein sequences of the gene models

994 produced by running SNAP on the SPAdes assembly of L. lupina. We used the script

995 autoAug.pl (part of Augustus distribution) to do the training with --maxIntronLen=1000. Finally,

996 GeneMark was self-trained using the script gmes_petap.pl and the flags: --fungal --max_intron

997 3000 --min_gene_prediction 120. All scripts used and the training files are available at

998 https://github.com/johannessonlab/Letharia.

999

1000 Additional results of the MAT region annotation 1001 The metatranscriptomic data supported the presence of a small ORF (herein referred to as lorf,

1002 from Letharia open reading frame) between APN2 and MAT1-2-1 in L. columbiana, flanking the

1003 beginning of the idiomorphic region (fig. 2 and fig. S8). In the other species, however, lorf

1004 displays relatively low levels of expression (see fig. S8-S11), which may be due to differences bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1005 in biological conditions during the time of collection. In the L. lupina metatranscriptome,

1006 TransDecoder detected lorf within the transcripts, suggesting activity in this species.

1007 Nevertheless, an alignment of all samples (including sequences acquired from PCR) revealed

1008 a potentially on-going process of pseudogenization in the entire genus. All the lorf sequences

1009 present in samples of the MAT1-1 idiomorph have two potential start codons (AUG), while

1010 some lorf sequences next to a MAT1-2 idiomorph have either one start codon, or none. Some

1011 samples also have frame-shift mutations. Noteworthy, lorf is located at the border between the

1012 idiomorph and the flanking region (fig. 2) and shows a slight divergence (~96% of similarity)

1013 between MAT1-2 (L. columbiana) and MAT1-1 (L. lupina, L. vulpina, and L. ‘rugosa’).

1014

1015 In addition, the Letharia transcript models for the MAT1-1-1 gene were consistently predicted in

1016 the opposite sense, based on homology with other fungal species and the ab initio gene

1017 predictors. This antisense transcript (fig. 2A) was present in all Letharia species, including the

1018 transcriptome of the pure culture of L. lupina. We recovered the transcript in the canonical

1019 sense for MAT1-1-1 along with antisense transcripts only for L. ‘rugosa’ (fig. 2; see also fig.

1020 S8-S11).

1021 Nomenclature of auxiliary MAT genes 1022 The nomenclature for mating-type genes as suggested by Turgeon and Yoder (2000) is used

1023 for most fungal groups. Under this system, whenever a new gene is discovered, with no

1024 detectable homology to any other, an incremental number is assigned to it following certain

1025 guidelines (Turgeon and Yoder 2000). Unfortunately, there is no centralized database

1026 associated with all published auxiliary MAT genes, which has led to a number of

1027 inconsistencies and overlaps with gene names (Wilken et al. 2017). For example, genes

1028 homologous to the auxiliary gene in Letharia have previously been named MAT1-1-4, MAT1-1-

1029 7, or MAT1-1-9 (fig 3, left), depending on the author (Wilken et al. 2017). As discussed by bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1030 Wilken et al., this gene is not related to the MAT1-1-4 present in the Leotiomycete

1031 Pyrenopeziza brassicae, so this name is not appropriate. However, we disagree with Wilken et

1032 al. (2017) in giving a separate name for MAT1-1-7 and MAT1-1-9 (and perhaps even the

1033 MAT1-1-8 gene present in the Dothidiomycete Shaeropsis sapinea) on the basis of low

1034 pairwise identity, as the high divergence between these genes is coherent with the large

1035 phylogenetic distance between these fungi. Hence, we refer to the Letharia gene as MAT1-1-7,

1036 which is the smallest number available. Incidentally, Armaleo et al. (2019) also used this name

1037 for the ortholog in Cladonia.

1038

1039 Since the auxiliary gene in the MAT1-2 idiomorph of Letharia has no detected homologs

1040 outside Lecanoromycetes, a new number should be assigned. The highest number in use that

1041 we could identify is MAT1-2-12. However, two independent studies used the name MAT1-2-12

1042 for the genes in unrelated taxa: Teratosphaeria (Aylward et al. 2019) and Calonectria (Li et al.

1043 2020). We confirmed that these genes are not homologous to each other, or to the gene in

1044 Letharia, using BLASTp searches. Hence, the correct name for the Letharia auxiliary gene is

1045 MAT1-2-14 and the genes in Calonectria (Li et al. 2020) should be called MAT1-2-13 based on

1046 the date of publication.

1047 1048

1049 References

1050 Ahmad S, Martins C. 2019. The Modern View of B Chromosomes Under the Impact of High Scale Omics

1051 Analyses. Cells 8:156.

1052 Allen JL, McKenzie SK, Sleith RS, Alter SE. 2018. First genome-wide analysis of the endangered,

1053 endemic lichen Cetradonia linearis reveals isolation by distance and strong population structure.

1054 American Journal of Botany 105:1556–1567.

1055 Allman ES, Degnan JH, Rhodes JA. 2011. Identifying the rooted species tree from the distribution of bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1056 unrooted gene trees under the coalescent. Journal of Mathematical Biology 62:833–862.

1057 Altermann S. 2004. A second look at Letharia (Th. Fr.) Zahlbr. Bulletin of the California Lichen Society

1058 11(2): 33–36.

1059 Altermann S. 2009. Geographic structure in a symbiotic mutualism. PhD Thesis.

1060 Altermann S, Leavitt SD, Goward T. 2016. Tidying up the genus Letharia: Introducing L. lupina sp. nov.

1061 and a new circumscription for L. columbiana. Lichenologist 48:423–439.

1062 Altermann S, Leavitt SD, Goward T, Nelsen MP, Lumbsch HT. 2014. How do you solve a problem like

1063 Letharia? A new look at cryptic species in lichen-forming fungi using Bayesian clustering and SNPs

1064 from multilocus sequence data. PLoS One 9:e97556.

1065 Ament-Velásquez SL, Figuet E, Ballenghien M, Zattara EE, Norenburg JL, Fernández-Álvarez FA,

1066 Bierne J, Bierne N, Galtier N. 2016. Population genomics of sexual and asexual lineages in

1067 fissiparous ribbon worms (Lineus, Nemertea): hybridization, polyploidy and the Meselson effect.

1068 Molecular Ecology 25:3356–3369.

1069 Ames LM. 1934. Hermaphroditism Involving Self-Sterility and Cross-Fertility in the Ascomycete Pleurage

1070 anserina. Mycologia 26:392–414.

1071 Armaleo D, Müller O, Lutzoni F, Andrésson ÓS, Blanc G, Bode HB, Collart FR, Dal Grande F, Dietrich F,

1072 Grigoriev I V., et al. 2019. The lichen symbiosis re-viewed through the genomes of Cladonia grayi

1073 and its algal partner Asterochloris glomerata. BMC Genomics 20.

1074 Arnerup J, Högberg N, Thor G. 2004. Phylogenetic analysis of multiple loci reveal the population

1075 structure within Letharia in the Caucasus and Morocco. Mycological Research 108:311–316.

1076 SLU ArtDatabanken. 2020. Rödlistade arter i Sverige 2020. Uppsala.

1077 Aylward J, Havenga M, Dreyer LL, Roets F, Wingfield BD, Wingfield MJ. 2019. Genomic characterization

1078 of mating type loci and mating type distribution in two apparently asexual plantation tree

1079 pathogens. Plant Pathology. 69:28–37.

1080 Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S,

1081 Prjibelski AD, et al. 2012. SPAdes: a new genome assembly algorithm and its applications to

1082 single-cell sequencing. Journal of Computational Biology 19:455–477.

1083 Bell G. 1982. The masterpiece of nature: the evolution and genetics of sexuality. Berkeley, CA: bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1084 University of California Press

1085 Bertazzoni S, Williams AH, Jones DA, Syme RA, Tan K-C, Hane JK. 2018. Accessories Make the Outfit:

1086 Accessory Chromosomes and Other Dispensable DNA Regions in Plant-Pathogenic Fungi.

1087 Molecular Plant-Microbe Interactions. 31:779–788.

1088 Bicknell RA, Koltunow AM 2004. Understanding apomixis: recent advances and remaining conundrums.

1089 The Plant Cell 16:S228–S245

1090 Billiard S, López-Villavicencio M, Hood ME, Giraud T. 2012. Sex, outcrossing and mating types:

1091 Unsolved questions in fungi and beyond. Journal of Evolutionary Biology 25:1020–1038.

1092 Billingham MR, Reusch TBHH, Alberto F, Serrão EA. 2003. Is asexual reproduction more important at

1093 geographical limits? A genetic study of the seagrass Zostera marina in the Ria Formosa, Portugal.

1094 Marine Ecology Progress Series 265:77–83.

1095 Boratyn GM, Schäffer AA, Agarwala R, Altschul SF, Lipman DJ, Madden TL. 2012. Domain enhanced

1096 lookup time accelerated BLAST. Biology Direct 7:1–14.

1097 Bryant D, Moulton V. 2004. Neighbor-net: an agglomerative method for the construction of phylogenetic

1098 networks. Molecular Biology and Evolution 21:255–265.

1099 Bubrick P, Galunj M. 1986. Spore to spore resynthesis of Xanthoria parietina. Lichenologist 18:47–49.

1100 Butler G. 2007. The Evolution of MAT: The Ascomycetes. In: Heitman J, Kronstad JW, Taylor JW,

1101 Casselton LA, editors. Sex in fungi: Molecular determination and evolutionary implication.

1102 Washington, D. C.: American Society for Microbiology Press. p. 3–18.

1103 Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. 2009. BLAST+:

1104 architecture and applications. BMC Bioinformatics 10:421.

1105 Campbell MS, Holt C, Moore B, Yandell M. 2014. Genome Annotation and Curation Using MAKER and

1106 MAKER-P. In: Current Protocols in Bioinformatics. Vol. 2014. Hoboken, NJ, USA: John Wiley &

1107 Sons, Inc. p. 4.11.1-4.11.39.

1108 Chen F, Mackey AJ, Stoeckert CJ, Roos DS. 2006. OrthoMCL-DB: querying a comprehensive multi-

1109 species collection of ortholog groups. Nucleic Acids Research 34:D363-8.

1110 Consolo VF, Cordo CA, Salerno GL. 2005. Mating-type distribution and fertility status in Magnaporthe

1111 grisea populations from Argentina. Mycopathologia 160:285–290. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1112 Crittenden PD, David JC, Hawksworth DL, Campbell FS. 1995. Attempted isolation and success in the

1113 culturing of a broad spectrum of lichen-forming and lichenicolous fungi. New Phytologist 130:267–

1114 297.

1115 Dal Grande F, Meiser A, Greshake Tzovaras B, Otte J, Ebersberger I, Schmitt I. 2018. The draft genome

1116 of the lichen-forming Lasallia hispanica (Frey) Sancho & A. Crespo. Lichenologist 50:329–

1117 340.

1118 Depotter JRL, Seidl MF, Wood TA, Thomma BPHJ. 2016. Interspecific hybridization impacts host range

1119 and pathogenicity of filamentous microbes. Current Opinion in Microbiology 32:7–13.

1120 Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR.

1121 2013. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29:15–21.

1122 Dyer PS, Inderbitzin P, Debuchy R (2016) 14 Mating-Type Structure, Function, Regulation and Evolution

1123 in the Pezizomycotina. In: Wendland J. (ed) Growth, Differentiation and Sexuality. The Mycota (A

1124 Comprehensive Treatise on Fungi as Experimental Systems for Basic and Applied Research), vol

1125 1. Springer, Cham.

1126 Dyer PS, Murtagh GJ, Crittenden PD. 2001. Use of RAPD-PCR DNA fingerprinting and vegetative

1127 incompatibility tests to investigate genetic variation within lichen-forming fungi. Symbiosis 31:213–

1128 229.

1129 Eddy SR. 2011. Accelerated profile HMM searches. PLoS Computational Biology 7(10): e1002195.

1130 Espagne E, Lespinet O, Malagnac F, Da Silva C, Jaillon O, Porcel BM, Couloux A, Aury J-M, Ségurens

1131 B, Poulain J, et al. 2008. The genome sequence of the model ascomycete fungus Podospora

1132 anserina. Genome Biology 9(5):R77.

1133 Ferreira AVB, An Z, Metzenberg RL, Glass NL. 1998. Characterization of mat A-2, mat A-3 and ΔmatA

1134 mating-type mutants of Neurospora crassa. Genetics 148:1069–1079.

1135 Gardes, M., Bruns, T. D. 1993. ITS primers with enhanced specificity for basidiomycetes - application to

1136 the identification of mycorrhizae and rusts. Molecular Ecology 2: 113-118.

1137 Gioti A, Mushegian AA, Strandberg R, Stajich JE, Johannesson H. 2012. Unidirectional evolutionary

1138 transitions in fungal mating systems and the role of transposable elements. Molecular Biology and

1139 Evolution 29:3215–3226. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1140 Glass NL, Vollmer SJ, Staben C, Grotelueschen J, Metzenberg RL, Yanofsky C. 1988. DNAs of the two

1141 mating-type alleles of Neurospora crassa are highly dissimilar. Science 241:570–573.

1142 Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury

1143 R, Zeng Q, et al. 2011. Full-length transcriptome assembly from RNA-Seq data without a reference

1144 genome. Nature Biotechnology 29:644–652.

1145 Gurevich A, Saveliev V, Vyahhi N, Tesler G. 2013. QUAST: Quality assessment tool for genome

1146 assemblies. Bioinformatics 29:1072–1075.

1147 Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, Couger MB, Eccles D, Li B,

1148 Lieber M, et al. 2013. De novo transcript sequence reconstruction from RNA-seq using the Trinity

1149 platform for reference generation and analysis. Nature Protocols 8:1494–1512.

1150 Haas BJ, Salzberg SL, Zhu W, Pertea M, Allen JE, Orvis J, White O, Buell CR, Wortman JR. 2008.

1151 Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to

1152 Assemble Spliced Alignments. Genome Biology 9:R7.

1153 Heinrich V, Stange J, Dickhaus T, Imkeller P, Krüger U, Bauer S, Mundlos S, Robinson PN, Hecht J,

1154 Krawitz PM. 2011. The allele distribution in next-generation sequencing data sets is accurately

1155 described as the result of a stochastic branching process. Nucleic Acids Research 40:2426–2431.

1156 Heitman J, Kronstad JW, Taylor JW, Casselton LA. 2007. Sex in Fungi. 1st edition. Washington, D. C.:

1157 American Society for Microbiology Press

1158 Henriksen S, Hilmo OR. 2015. Norsk rødliste for arter. Norge: Artsdatabanken.

1159 Högberg N, Kroken S, Thor G, Taylor JW. 2002. Reproductive mode and genetic variation suggest a

1160 North American origin of European Letharia vulpina. Molecular Ecology 11:1191–1196.

1161 Holt C, Yandell M. 2011. MAKER2: an annotation pipeline and genome-database management tool for

1162 second-generation genome projects. BMC Bioinformatics 12:491.

1163 Honegger R, Scherrer S. 2008. Sexual reproduction in lichen-forming ascomycete. In: Nash TH (ed).

1164 Lichen Biology. 2nd Edition. New York: Cambridge University Press. p. 94–103.

1165 Honegger R, Zippler U. 2007. Mating systems in representatives of Parmeliaceae, Ramalinaceae and

1166 Physciaceae (Lecanoromycetes, lichen-forming ascomycetes). Mycological Research 111:424–

1167 432. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1168 Honegger R, Zippler U, Gansner H, Scherrer S. 2004. Mating systems in the genus Xanthoria (lichen-

1169 forming ascomycetes). Mycological Research 108:480–488.

1170 Huson DH, Bryant D. 2006. Application of phylogenetic networks in evolutionary studies. Molecular

1171 Biology and Evolution 23:254–267.

1172 Hutchinson MI, Powell AJ, Tsang A, O’Toole N, Berka RM, Barry K, Grigoriev I V., Natvig DO. 2016.

1173 Genetics of mating in members of the Chaetomiaceae as revealed by experimental and genomic

1174 characterization of reproduction in Myceliophthora heterothallica. Fungal Genetics and Biology

1175 86:9–19.

1176 Jacobson DJ. 1995. Sexual dysfunction associated with outcrossing in Neurospora tetrasperma, a

1177 pseudohomothallic ascomycete. Mycologia 87:604–617.

1178 Jahns HM. 1993. Culture experiments with lichens. Plant Systematics and Evolution 187:145–174.

1179 Jahns HM, Ott S. 1997. Life strategies in lichens - some general considerations. Bibliotheca

1180 Lichenologica 67:49–67.

1181 Kamvar ZN, Brooks JC, Grünwald NJ. 2015. Novel R tools for analysis of genome-wide population

1182 genetic data with emphasis on clonality. Frontiers in Genetics 6:1–10.

1183 Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: Improvements in

1184 performance and usability. Molecular Biology and Evolution 30:772–780.

1185 Kawecki TJ. 2008. Adaptation to marginal habitats. Annual Review of Ecology, Evolution, and

1186 Systematics 39:321–342.

1187 Keuler R, Garretson A, Saunders T, Erickson RJ, St. Andre N, Grewe F, Smith H, Lumbsch HT, Huang

1188 JP, St. Clair LL, et al. 2020. Genome-scale data reveal the role of hybridization in lichen-forming

1189 fungi. Scientific Reports 10:1–14.

1190 Knaus BJ, Grünwald NJ. 2017. vcfR: a package to manipulate and visualize variant call format data in R.

1191 Molecular Ecology Resources 17:44–53.

1192 Koboldt DC, Zhang Q, Larson DE, Shen D, McLellan MD, Lin L, Miller CA, Mardis ER, Ding L, Wilson

1193 RK. 2012. VarScan 2: Somatic mutation and copy number alteration discovery in cancer by exome

1194 sequencing. Genome Research 22:568–576.

1195 Korf I. 2004. Gene finding in novel genomes. BMC Bioinformatics 5:59. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1196 Köhler C, Scheid OM, Erilova A 2009. The impact of the triploid block on the origin and evolution of

1197 polyploid plants. Trends in Genetics 26(3):142–148

1198 Köster J, Rahmann S. 2012. Snakemake-a scalable bioinformatics workflow engine. Bioinformatics

1199 28:2520–2522.

1200 Kroken S, Taylor JW. 2001a. A gene genealogical approach to recognize phylogenetic species

1201 boundaries in the lichenized fungus Letharia. Mycologia 93:38–53.

1202 Kroken S, Taylor JW. 2001b. Outcrossing and recombination in the lichenized fungus Letharia. Fungal

1203 Genetics and Biology 34:83–92.

1204 Kronstad JW, Staben C. 1997. Mating type in filamentous fungi. Annu. Rev. Genet. 31:245–276.

1205 Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, Salzberg SL. 2004. Versatile and

1206 open software for comparing large genomes. Genome Biololgy 5:R12.

1207 Larsson A. 2014. AliView: a fast and lightweight alignment viewer and editor for large data sets.

1208 Bioinformatics 30:3276-3278.

1209 Levin DA 1975. Minority cytotype exclusion in local plant populations. Taxon 24(1):35–43

1210 Li H, Durbin R. 2010. Fast and accurate long-read alignment with Burrows-Wheeler transform.

1211 Bioinformatics 26:589–595.

1212 Li JQ, Wingfield BD, Wingfield MJ, Barnes I, Fourie A, Crous PW, Chen SF. 2020. Mating genes in

1213 Calonectria and evidence for a heterothallic ancestral state. Persoonia - Molecular phylogeny and

1214 evolution of fungi 163–176.

1215 Lomsadze A, Ter-Hovhannisyan V, Chernoff YO, Borodovsky M. 2005. Gene identification in novel

1216 eukaryotic genomes by self-training algorithm. Nucleic Acids Research 33:6494–6506.

1217 Ludwig LR, Summerfield TC, Lord JM, Singh G. 2017. Characterization of the mating-type locus (MAT)

1218 reveals a heterothallic mating system in Knightiella splachnirima. Lichenologist 49:373–385.

1219 Mable BK 2003. Breaking down taxonomic barriers in polyploid research. Trends in Plant Science

1220 8(12):582–590

1221 Mandel MA, Barker BM, Kroken S, Rounsley SD, Orbach MJ. 2007. Genomic and Population Analyses

1222 of the Mating Type Loci in Coccidioides Species Reveal Evidence for Sexual Reproduction and

1223 Gene Acquisition. Eukaryotic Cell 6:1189–1199. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1224 Mansournia MR, Wu B, Matsushita N, Hogetsu T. 2012. Genotypic analysis of the foliose lichen

1225 Parmotrema tinctorum using microsatellite markers: Association of mycobiont and photobiont, and

1226 their reproductive modes. Lichenologist 44:419–440.

1227 Marcet-Houben M, Gabaldón T. 2015. Beyond the whole-genome duplication: Phylogenetic evidence for

1228 an ancient interspecies hybridization in the baker’s yeast lineage. PLOS Biology 13:1–26.

1229 Martin T, Lu S, van Tilbeurgh H, Ripoll DR, Dixelius C, Turgeon BG, Debuchy R. 2010. Tracing the origin

1230 of the fungal α1 domain places its ancestor in the HMG-box superfamily: implication for fungal

1231 mating-type evolution. PLoS One 5:e15199.

1232 McCune B, Altermann S. 2009. Letharia gracilis (Parmeliaceae), a new species from California and

1233 Oregon. Bryologist 112:375–378.

1234 McKenzie SK, Walston RF, Allen JL. 2020. Complete, high-quality genomes from long-read

1235 metagenomic sequencing of two wolf lichen thalli reveals enigmatic genome architecture.

1236 Genomics 112:3150–3156.

1237 Meloni M, Reid A, Caujapé-Castells J, Marrero Á, Fernández-Palacios JM, Mesa-Coelo RA, Conti E.

1238 2013. Effects of clonality on the genetic variability of rare, insular species: The case of Ruta

1239 microcarpa from the Canary Islands. Ecology and Evolution 3:1569–1579.

1240 Murtagh GJ, Dyer PS, Crittenden PD. 2000. Sex and the single lichen. Nature 404:564.

1241 Nguyen LT, Schmidt HA, Von Haeseler A, Minh BQ. 2015. IQ-TREE: A fast and effective stochastic

1242 algorithm for estimating maximum-likelihood phylogenies. Molecular Biology and Evolution 32:268–

1243 274.

1244 Olarte RA, Horn BW, Dorner JW, Monacell JT, Singh R, Stone EA, Carbone I. 2012. Effect of sexual

1245 recombination on population diversity in aflatoxin production by Aspergillus flavus and evidence for

1246 cryptic heterokaryosis. Molecular Ecology 21:1453–1476.

1247 Page WM. 1936. Note on abnormal spores in Podospora minuta. Trans. Br. Mycol. Soc. 20:186–189.

1248 Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data.

1249 PLoS One 7:e30619.

1250 Perkins DD. 1994. How should the infertility of interspecies crosses be designated? Mycologia 86:758–

1251 761. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1252 Perkins DD, Turner BC, Barry EG. 1976. Strains of Neurospora collected from nature. Evolution 30:281–

1253 313.

1254 Pizarro D, Dal Grande F, Leavitt SD, Dyer PS, Schmitt I, Crespo A, Thorsten Lumbsch H, Divakar PK,

1255 Barluenga M. 2019. Whole-Genome Sequence Data Uncover Widespread Heterothallism in the

1256 Largest Group of Lichen-Forming Fungi. Genome Biology and Evolution 11:721–730.

1257 Raju NB. 1992. Functional heterothallism resulting from homokaryotic conidia and ascospores in

1258 Neurospora tetrasperma. Mycological Research 96:103–116.

1259 Raju NB, Newmeyer D. 1977. Giant ascospores and abnormal croziers in a mutant of Neurospora

1260 crassa. Experimental Mycology 1:152–165.

1261 Ramsey J, Schemske DW 1998. Pathways, mechanisms, and rates of polyploid formation in flowering

1262 plants. Annual Review of Ecology, Evolution, and Systematics 29:467–501

1263 Rutherford K, Parkhill J, Crook J, Horsnell T, Rice P, Rajandream M-A, Barrell B. 2000. Artemis:

1264 sequence visualization and annotation. Bioinformatics 16:944–945.

1265 Saleh D, Xu P, Shen Y, Li C, Adreit H, Milazzo J, Ravigné V, Bazin E, Nottéghem JL, Fournier E, et al.

1266 2012. Sex at the origin: An Asian population of the rice blast fungus Magnaporthe oryzae

1267 reproduces sexually. Molecular Ecology 21:1330–1344.

1268 Sanders WB. 2014. Complete life cycle of the lichen fungus Calopadia puiggarii (Pilocarpaceae,

1269 Ascomycetes) documented in situ: Propagule dispersal, establishment of symbiosis, thallus

1270 development, and formation of sexual and asexual reproductive structures. American Journal of

1271 Botany 101:1836–1848.

1272 Scherrer S, Zippler U, Honegger R. 2005. Characterisation of the mating-type locus in the genus

1273 Xanthoria (lichen-forming ascomycetes, Lecanoromycetes). Fungal Genetics and Biology 42:976–

1274 988.

1275 Singh G, Dal Grande F, Cornejo C, Schmitt I, Scheidegger C. 2012. Genetic Basis of Self-Incompatibility

1276 in the Lichen-Forming Fungus Lobaria pulmonaria and Skewed Frequency Distribution of Mating-

1277 Type Idiomorphs: Implications for Conservation. PLoS One 7:e51402.

1278 Singh G, Dal Grande F, Werth S, Scheidegger C. 2015. Long-term consequences of disturbances on

1279 reproductive strategies of the rare epiphytic lichen Lobaria pulmonaria: Clonality a gift and a curse. bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1280 FEMS Microbiology Ecology 91:1–11.

1281 Slater GSC, Birney E. 2005. Automated generation of heuristics for biological sequence comparison.

1282 BMC Bioinformatics 6:31.

1283 Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large

1284 phylogenies. Bioinformatics:1–2.

1285 Stanke M, Waack S. 2003. Gene prediction with a hidden Markov model and a new intron submodel.

1286 Bioinformatics 19:ii215–ii225.

1287 Steenwyk JL, Lind AL, Ries LNA, dos Reis TF, Silva LP, Almeida F, Bastos RW, Fraga da Silva TF de

1288 C, Bonato VLD, Pessoni AM, et al. 2020. Pathogenic Allodiploid Hybrids of Aspergillus Fungi.

1289 Current Biology 30:2495-2507.e7.

1290 Stocker-Wörgötter E. 2001. Experimental and microbiology of lichens: culture experiments,

1291 secondary chemistry of cultured mycobionts, resynthesis, and thallus. Bryologist 104:576–581.

1292 Ter-Hovhannisyan V, Lomsadze A, Chernoff YO, Borodovsky M. 2008. Gene prediction in novel fungal

1293 genomes using an ab initio algorithm with unsupervised training. Genome Research 18:1979–

1294 1990.

1295 Tiwary BK, Kirubagaran R, Ray AK 2004. The biology of triploid fish. Reviews in Fish Biology and

1296 Fisheries 14: 391–402

1297 Trapnell C, Williams B a, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter

1298 L. 2010. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and

1299 isoform switching during cell differentiation. Nature Biotechnology 28:511–515.

1300 Tripp EA, Lendemer JC. 2018. Twenty-seven modes of reproduction in the obligate lichen symbiosis.

1301 Brittonia 70:1–14.

1302 Tripp EA, Zhuang Y, Lendemer JC. 2017. A review of existing whole genome data suggests lichen

1303 mycelia may be haploid or diploid. Bryologist 120:302–310.

1304 Tsui CK-M, DiGuistini S, Wang Y, Feau N, Dhillon B, Bohlmann J, Hamelin RC. 2013. Unequal

1305 Recombination and Evolution of the Mating-Type (MAT) Loci in the Pathogenic Fungus

1306 Grosmannia clavigera and Relatives. G3 Genes|Genomes|Genetics 3:465–480.

1307 Tuovinen V, Ekman S, Thor G, Vanderpool D, Spribille T, Johannesson H. 2019. Two Basidiomycete bioRxiv preprint doi: https://doi.org/10.1101/2020.12.18.423428; this version posted December 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1308 Fungi in the Cortex of Wolf Lichens. Current Biology 29:476-483.e5.

1309 Turgeon BG, Yoder OC. 2000. Proposed nomenclature for mating type genes of filamentous

1310 ascomycetes. Fungal Genetics and Biology 31:1–5.

1311 Whittle CA, Nygren K, Johannesson H. 2011. Consequences of reproductive mode on genome evolution

1312 in fungi. Fungal Genetics and Biology 48:661–667.

1313 Wickham H. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York

1314 Wickham H, François R, Henry L, Müller K. 2018. dplyr: A Grammar of Data Manipulation. R package

1315 version 0.8.0.1

1316 Wilke CO. 2019. cowplot: Streamlined Plot Theme and Plot Annotations for “ggplot2”. R package version

1317 1.0.0.

1318 Wilken PM, Steenkamp ET, Wingfield MJ, de Beer ZW, Wingfield BD. 2017. Which MAT gene?

1319 Pezizomycotina (Ascomycota) mating-type gene nomenclature reconsidered. Fungal Biology

1320 Reviews 31:199–211.

1321 Yamamoto Y, Kinoshita Y, Yoshimura I. 1998. Difference of cell-viability and spore discharge capability

1322 between two Lichen species, Letharia columbiana (Nutt.) Thoms. and L. vulpina (L.) Hue. Plant

1323 Biotechnology 15:131–133.

1324 Yoshida K, Schuenemann VJ, Cano LM, Pais M, Mishra B, Sharma R, Lanz C, Martin FN, Kamoun S,

1325 Krause J, et al. 2013. The rise and fall of the Phytophthora infestans lineage that triggered the Irish

1326 potato famine. eLife 2:e00731.

1327